This post walks through two complementary Petri-net formulations of tic-tac-toe, progressing from a minimal turn-based grid to a memory-augmented model capable of reasoning about win conditions. These examples demonstrate how Petri nets naturally encode available actions, enforce alternation, and accumulate structured history for higher-level reasoning.
The final sections outline how these models can be converted into ODE systems via go-pflow and how to apply parameter sweeps to explore strategy, reachability, and path likelihoods.
In the first stage, we represent the tic-tac-toe grid as nine places P00–P22, each initialized with a token indicating an empty cell.
For every square we define:
- Xij transitions: claim the square for X
- Oij transitions: claim the square for O

A dedicated Next place enforces turn-taking:

- X moves consume the token in Next
- O moves restore it, so Next is marked exactly when it is X's turn, preserving alternation

This structure cleanly separates board state, move legality, and turn order, all purely through token movement and local composability: no global logic required.
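The stage-1 net can be sketched as plain data. This is a hypothetical illustration in Python, not the go-pflow API: places are dictionary keys holding token counts, and each move transition lists its input and output arcs. The helper names (`transitions`, `enabled`, `fire`) are invented for this sketch.

```python
# Nine cell places, each starting with one "empty" token, plus Next.
marking = {f"P{r}{c}": 1 for r in range(3) for c in range(3)}
marking["Next"] = 1  # Next marked => it is X's turn

def transitions():
    """Yield (name, input places, output places) for every move."""
    for r in range(3):
        for c in range(3):
            cell = f"P{r}{c}"
            yield (f"X{r}{c}", [cell, "Next"], [])      # X consumes Next
            yield (f"O{r}{c}", [cell], ["Next"])         # O restores Next

def find(name):
    return next(t for t in transitions() if t[0] == name)

def enabled(name, m):
    _, inputs, _ = find(name)
    return all(m[p] >= 1 for p in inputs)

def fire(name, m):
    _, inputs, outputs = find(name)
    assert enabled(name, m), f"{name} is not enabled"
    for p in inputs:
        m[p] -= 1
    for p in outputs:
        m[p] += 1

fire("X11", marking)                 # X takes the center, consuming Next
assert not enabled("X02", marking)   # Next empty: X cannot move again
fire("O00", marking)                 # O replies, restoring Next
assert enabled("X02", marking)       # X's turn again
assert not enabled("X00", marking)   # square 00 is already taken
```

Turn alternation falls out of the arc structure alone: no transition ever inspects whose turn it is.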
The second version keeps the same grid & turn logic but introduces a memory layer.
Whenever a move fires (e.g. X00):
- the empty-cell token in P00 is consumed
- a token is deposited in the history place _X00

Similarly, all O moves populate _Oij history places.
This upgrades the model from state of the board to sequence of moves. Now we can reason about why the board reached its state—not just its current configuration.
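The history layer is a small change to the move transitions. In this hypothetical sketch (again plain Python, with an invented `fire_move` helper), each move consumes the cell's empty token and marks a per-player history place that is never consumed by later moves:

```python
# Board places start with one empty token; history places start empty.
marking = {f"P{r}{c}": 1 for r in range(3) for c in range(3)}
for r in range(3):
    for c in range(3):
        marking[f"_X{r}{c}"] = 0
        marking[f"_O{r}{c}"] = 0

def fire_move(player, r, c, m):
    cell = f"P{r}{c}"
    assert m[cell] == 1, "square already taken"
    m[cell] -= 1                      # board state: square no longer empty
    m[f"_{player}{r}{c}"] += 1        # history: record *who* took it

fire_move("X", 0, 0, marking)
fire_move("O", 1, 1, marking)
# The board alone says "two squares taken"; the history says by whom.
assert marking["_X00"] == 1 and marking["_O11"] == 1
```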
Once we have history places, we can build pattern collectors—transitions that listen for combinations like:
- _X00 + _X01 + _X02 (top row)
- _X00 + _X10 + _X20 (left column)
- _X00 + _X11 + _X22 (main diagonal)

A pattern collector consumes history tokens from the three required squares, producing a structural token such as:
- X_has_top_row
- O_has_diag

Finally, we add win detection transitions:
- win_x fires when any X pattern place is marked
- win_o fires for O patterns

This creates a fully compositional win-detection system without any global conditional logic.
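The two layers can be sketched together. This hypothetical Python version enumerates the eight winning patterns and fires every enabled collector; for readability it reads history places rather than consuming them, whereas the net version consumes the three history tokens as described above. The names `WIN_PATTERNS`, `collect_patterns`, and `win_detect` are inventions of this sketch, but the place names match the text.

```python
WIN_PATTERNS = {
    "top_row":   ["00", "01", "02"],
    "mid_row":   ["10", "11", "12"],
    "bot_row":   ["20", "21", "22"],
    "left_col":  ["00", "10", "20"],
    "mid_col":   ["01", "11", "21"],
    "right_col": ["02", "12", "22"],
    "diag":      ["00", "11", "22"],
    "anti_diag": ["02", "11", "20"],
}

def collect_patterns(player, m):
    """Fire every enabled collector, marking places like X_has_top_row."""
    for name, cells in WIN_PATTERNS.items():
        if all(m.get(f"_{player}{c}", 0) >= 1 for c in cells):
            m[f"{player}_has_{name}"] = 1

def win_detect(player, m):
    """win_x / win_o fire when any pattern place for that player is marked."""
    if any(m.get(f"{player}_has_{name}", 0) for name in WIN_PATTERNS):
        m[f"win_{player.lower()}"] = 1

marking = {"_X00": 1, "_X01": 1, "_X02": 1}  # X has played the top row
collect_patterns("X", marking)
win_detect("X", marking)
assert marking["X_has_top_row"] == 1 and marking["win_x"] == 1
```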
Notice how a winning set of moves activates a win detection transition that populates the win_x place.
Using go-pflow, we can convert this Petri net into an ODE system and simulate forward to predict outcomes. The key insight: win place token accumulation predicts game outcomes.
For each possible move, we:
- apply the move to a copy of the net
- integrate the resulting ODE system forward
- read the final token mass in the win_x and win_o places

X evaluates all 9 positions. The ODE reveals strategic value through model topology alone:
Board: Score Heatmap (X's view):
╔═══╦═══╦═══╗ ╔═══════╦═══════╦═══════╗
║ 0 ║ 1 ║ 2 ║ ║ 0.316 ║ 0.218 ║ 0.316 ║
╠═══╬═══╬═══╣ ╠═══════╬═══════╬═══════╣
║ 3 ║ 4 ║ 5 ║ => ║ 0.218 ║ 0.430 ║ 0.218 ║
╠═══╬═══╬═══╣ ╠═══════╬═══════╬═══════╣
║ 6 ║ 7 ║ 8 ║ ║ 0.316 ║ 0.218 ║ 0.316 ║
╚═══╩═══╩═══╝ ╚═══════╩═══════╩═══════╝
Best: center (4)
The center position scores highest (0.430) because it participates in 4 winning patterns (2 diagonals + row + column). Corners score next (0.316, 3 patterns each). Edges score lowest (0.218, 2 patterns each).
No game heuristics are coded. Strategic value emerges purely from the Petri net's structure.
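As a sanity check on that claim, here is a hypothetical toy scorer (plain Python, not go-pflow) that ranks opening squares purely by how many win patterns pass through them. It reproduces the same center > corners > edges ordering the ODE recovers, though not the exact scores:

```python
WIN_PATTERNS = [
    [(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)], [(2, 0), (2, 1), (2, 2)],  # rows
    [(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)], [(0, 2), (1, 2), (2, 2)],  # columns
    [(0, 0), (1, 1), (2, 2)], [(0, 2), (1, 1), (2, 0)],                            # diagonals
]

def pattern_count(cell):
    """How many winning patterns pass through this square."""
    return sum(cell in pat for pat in WIN_PATTERNS)

scores = {(r, c): pattern_count((r, c)) for r in range(3) for c in range(3)}
assert scores[(1, 1)] == 4                    # center: row + column + 2 diagonals
assert scores[(0, 0)] == 3                    # each corner: row + column + 1 diagonal
assert scores[(0, 1)] == 2                    # each edge: row + column only
assert max(scores, key=scores.get) == (1, 1)  # best opening: center
```

The ODE scores track this count because each pattern a square participates in is one more collector feeding flow toward win_x.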
After X takes center, O evaluates defensive options:
Board: Score Heatmap (O's view):
╔═══╦═══╦═══╗ ╔════════╦════════╦════════╗
║ 0 ║ 1 ║ 2 ║ ║ -0.375 ║ -0.469 ║ -0.375 ║
╠═══╬═══╬═══╣ ╠════════╬════════╬════════╣
║ 3 ║ X ║ 5 ║ => ║ -0.469 ║ X ║ -0.469 ║
╠═══╬═══╬═══╣ ╠════════╬════════╬════════╣
║ 6 ║ 7 ║ 8 ║ ║ -0.375 ║ -0.469 ║ -0.375 ║
╚═══╩═══╩═══╝ ╚════════╩════════╩════════╝
Best: any corner
All scores are negative (X has the advantage), but corners (-0.375) minimize X's lead better than edges (-0.469). The ODE naturally discovers the defensive principle: corners against center.
X has the center and O has a corner. The ODE identifies the diagonal threat:
Board: Score Heatmap (X's view):
╔═══╦═══╦═══╗ ╔═══════╦═══════╦═══════╗
║ O ║ 1 ║ 2 ║ ║ O ║ 0.345 ║ 0.522 ║ <- 02 threatens diagonal
╠═══╬═══╬═══╣ ╠═══════╬═══════╬═══════╣
║ 3 ║ X ║ 5 ║ => ║ 0.345 ║ X ║ 0.339 ║
╠═══╬═══╬═══╣ ╠═══════╬═══════╬═══════╣
║ 6 ║ 7 ║ 8 ║ ║ 0.522 ║ 0.339 ║ 0.361 ║ <- 20 also high
╚═══╩═══╩═══╝ ╚═══════╩═══════╩═══════╝
Best: 02 or 20 (0.522)
Position 02 and 20 score highest (0.522) because they create two-way threats: each completes one diagonal while opening another winning path. The model's pattern collectors naturally amplify these fork positions.
O faces X's imminent win threat (X has two in a row):
Board: Score Heatmap (O's view):
╔═══╦═══╦═══╗ ╔════════╦════════╦════════╗
║ O ║ 1 ║ X ║ ║ O ║ -0.617 ║ X ║
╠═══╬═══╬═══╣ ╠════════╬════════╬════════╣
║ X ║ X ║ 5 ║ => ║ X ║ X ║ -0.191 ║ <- must block!
╠═══╬═══╬═══╣ ╠════════╬════════╬════════╣
║ O ║ 7 ║ 8 ║ ║ O ║ -0.494 ║ -0.501 ║
╚═══╩═══╩═══╝ ╚════════╩════════╩════════╝
Best: 12 (block row)
Position 12 scores best (-0.191) because it blocks X's winning move. Other positions score much worse (-0.5 to -0.6). The ODE simulation naturally discovers blocking as the dominant strategy—no explicit "if opponent has two in a row, block" logic required.
The magic is in the pattern collector transitions. When X occupies two squares of a winning pattern, the ODE flow toward win_x increases. Blocking that pattern cuts off the flow.
Score = win_x_final − win_o_final
This formula encodes minimax-like behavior: maximize your win potential while minimizing your opponent's. The continuous dynamics approximate discrete game tree search.
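The continuization itself can be sketched in a few lines. This is a hypothetical Euler integrator (plain Python, not go-pflow's solver) applying the standard mass-action convention: each transition fires at a rate proportional to the product of its input markings. Run on a single collector chain, it shows the mechanism behind blocking: emptying one input place drives the collector's rate, and hence all downstream flow into win_x, to zero.

```python
def simulate(marking, transitions, dt=0.01, steps=1000):
    """Forward-Euler integration of a mass-action Petri net ODE."""
    m = dict(marking)
    for _ in range(steps):
        for inputs, outputs in transitions:
            rate = 1.0
            for p in inputs:
                rate *= max(m[p], 0.0)   # mass-action: rate ∝ input markings
            for p in inputs:
                m[p] -= rate * dt
            for p in outputs:
                m[p] += rate * dt
    return m

# A single collector chain: top-row history places feed X_has_top_row,
# which in turn feeds win_x.
transitions = [
    (["_X00", "_X01", "_X02"], ["X_has_top_row"]),
    (["X_has_top_row"], ["win_x"]),
]

open_pattern = {"_X00": 1.0, "_X01": 1.0, "_X02": 1.0,
                "X_has_top_row": 0.0, "win_x": 0.0}
blocked = dict(open_pattern, _X02=0.0)   # O has blocked square 02

assert simulate(open_pattern, transitions)["win_x"] > 0.1   # flow accumulates
assert simulate(blocked, transitions)["win_x"] == 0.0       # flow cut off
```

With the pattern intact, token mass drains steadily into win_x; with one square blocked, the product rate is zero and win_x never moves, which is exactly the gradient the move scorer reads.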
Data generated with go-pflow examples/tictactoe
This tic-tac-toe model demonstrates how a simple game becomes deeply expressive when framed through Petri-net structure. By building the system in layers—grid and turn alternation, history tracking, pattern collectors, and win detection—we create a model capable of representing not just what happened, but why.
Every rule arises locally from token flow. There's no central controller, no global conditional logic—just emergent behavior from places, transitions, and structured history. The ODE simulation reveals strategic value without any game-specific heuristics.
For more on this approach, see Declarative Differential Models.