This post walks through two complementary Petri-net formulations of tic-tac-toe, progressing from a minimal turn-based grid to a memory-augmented model capable of reasoning about win conditions. These examples demonstrate how Petri nets naturally encode available actions, enforce alternation, and accumulate structured history for higher-level reasoning.
The final sections outline how these models can be converted into ODE systems via go-pflow and how to apply parameter sweeps to explore strategy, reachability, and path likelihoods.
In the first stage, we represent the tic-tac-toe grid as nine places P00–P22, each initialized with a token indicating an empty cell.
For every square we define:
- Xij transitions — claim the square for X
- Oij transitions — claim the square for O

A dedicated Next place enforces turn-taking: each move consumes the turn token and re-marks Next for the opponent, so after every move exactly one player can act, preserving alternation. This structure cleanly separates board state from turn order, all purely through token movement and local composability—no global logic required.
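To make the wiring concrete, here is a minimal self-contained sketch in Go. The Net type, the signed-delta arc encoding, and the split of the turn marker into x_turn/o_turn places (the names this post uses later) are illustrative assumptions, not the go-pflow API.

```go
package main

import "fmt"

// Net is a toy Petri net: a marking plus, for each transition, the
// signed token delta it applies to each place (negative = input arc).
type Net struct {
	Marking map[string]int
	Arcs    map[string]map[string]int
}

// Fire checks every input place locally, then applies the deltas.
// There is no global rule: enabling is decided arc by arc.
func (n *Net) Fire(t string) bool {
	for p, d := range n.Arcs[t] {
		if d < 0 && n.Marking[p]+d < 0 {
			return false // an input place lacks tokens: disabled
		}
	}
	for p, d := range n.Arcs[t] {
		n.Marking[p] += d
	}
	return true
}

// newBoard builds the stage-1 net: nine empty-cell places and a turn
// token that shuttles between x_turn and o_turn.
func newBoard() *Net {
	net := &Net{Marking: map[string]int{}, Arcs: map[string]map[string]int{}}
	net.Marking["x_turn"] = 1 // X moves first
	for r := 0; r < 3; r++ {
		for c := 0; c < 3; c++ {
			cell := fmt.Sprintf("P%d%d", r, c)
			net.Marking[cell] = 1 // one token = this cell is empty
			// Xij consumes the empty cell and X's turn, yielding O's turn.
			net.Arcs[fmt.Sprintf("X%d%d", r, c)] = map[string]int{cell: -1, "x_turn": -1, "o_turn": +1}
			net.Arcs[fmt.Sprintf("O%d%d", r, c)] = map[string]int{cell: -1, "o_turn": -1, "x_turn": +1}
		}
	}
	return net
}

func main() {
	net := newBoard()
	fmt.Println(net.Fire("X11")) // true: legal opening move
	fmt.Println(net.Fire("X00")) // false: the turn token now sits in o_turn
}
```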
The second version keeps the same grid & turn logic but introduces a memory layer.
Whenever a move fires (e.g. X00):
- the empty-cell token in P00 is consumed
- a history token is deposited in _X00

Similarly, all O moves populate _Oij history places.
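Extending the same sketch, the history layer is one extra output arc per move transition; the wiring below is my reading of the description above.

```go
// addHistory gives every move transition an extra output arc into its
// history place: firing X00 still consumes P00, but now also leaves a
// permanent token in _X00 recording that the move happened.
func addHistory(net *Net) {
	for r := 0; r < 3; r++ {
		for c := 0; c < 3; c++ {
			net.Arcs[fmt.Sprintf("X%d%d", r, c)][fmt.Sprintf("_X%d%d", r, c)] = +1
			net.Arcs[fmt.Sprintf("O%d%d", r, c)][fmt.Sprintf("_O%d%d", r, c)] = +1
		}
	}
}
```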
This upgrades the model from state of the board to sequence of moves. Now we can reason about why the board reached its state—not just its current configuration.
Once we have history places, we can build pattern collectors—transitions that listen for combinations like:
- X00_X01_X02 (top row)
- X00_X10_X20 (left column)
- X00_X11_X22 (main diagonal)

A pattern collector consumes history tokens from the three required squares, producing a structural token such as:

- X_has_top_row
- O_has_diag

Finally, we add win detection transitions:

- win_x fires when any X pattern place is marked
- win_o fires for O patterns

This creates a fully compositional win-detection system without any global conditional logic.
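Continuing the sketch, the collectors and win transitions could be wired as follows. The eight lines are fixed by the game itself; the collect_* transition and pattern place names are illustrative.

```go
// The 8 winning lines as (row, col) triples: 3 rows, 3 columns, 2 diagonals.
var winningLines = [8][3][2]int{
	{{0, 0}, {0, 1}, {0, 2}}, {{1, 0}, {1, 1}, {1, 2}}, {{2, 0}, {2, 1}, {2, 2}},
	{{0, 0}, {1, 0}, {2, 0}}, {{0, 1}, {1, 1}, {2, 1}}, {{0, 2}, {1, 2}, {2, 2}},
	{{0, 0}, {1, 1}, {2, 2}}, {{0, 2}, {1, 1}, {2, 0}},
}

// addWinDetection wires, per line, a collector that consumes the three
// history tokens and produces a pattern token, plus a win transition
// that moves the pattern token into the shared win place.
func addWinDetection(net *Net, player, winPlace string) { // e.g. ("X", "win_x")
	for i, line := range winningLines {
		collect := fmt.Sprintf("collect_%s_%d", player, i)
		pattern := fmt.Sprintf("%s_pattern_%d", player, i)
		net.Arcs[collect] = map[string]int{pattern: +1}
		for _, sq := range line {
			net.Arcs[collect][fmt.Sprintf("_%s%d%d", player, sq[0], sq[1])] = -1
		}
		net.Arcs[fmt.Sprintf("win_%s_%d", player, i)] = map[string]int{pattern: -1, winPlace: +1}
	}
}
```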
The board uses position IDs 00–22 (row, column). There are exactly 8 winning patterns: 3 rows, 3 columns, and 2 diagonals.
Each position participates in a different number of winning patterns. The center (11) participates in 4 patterns—both diagonals plus its row and column. Corners participate in 3 patterns (one diagonal plus row and column). Edges only participate in 2 (row and column).
Pattern collectors consume 3 history tokens from a complete line and produce a single win token. Higher pattern participation means more paths to victory, which directly translates to higher ODE scores.
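These counts are easy to verify from the winningLines table in the sketch above:

```go
// participation counts how many winning lines pass through each square.
// The result matches the text: center (1,1) -> 4, corners -> 3, edges -> 2.
func participation() map[[2]int]int {
	counts := map[[2]int]int{}
	for _, line := range winningLines {
		for _, sq := range line {
			counts[sq]++
		}
	}
	return counts
}
```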
The Petri net above shows the full model with pattern collectors and game-halting win detection. When a win is detected, the transition consumes the current turn token without returning it—this prevents further play after the game ends.
Without game-halting, the ODE simulation continues exploring the full state space even after one player has won. This distorts strategic values because we're averaging over impossible continuations—games that would never happen in practice.
The key insight: win transitions must consume turn tokens. When X wins:
- the x_win_* transition consumes o_turn (it was O's turn when X completed the line)

This makes win states into absorbing states in the ODE dynamics—once reached, no further flow is possible. The simulation now correctly weights paths based on reachable game outcomes rather than mathematical completeness.
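In the toy encoding, game-halting is one extra input arc per win transition. This sketch assumes addWinDetection was called for both players.

```go
// haltOnWin adds the opponent's turn token as an input to every win
// transition. X completes a line on X's move, so it is O's turn when
// win_X_i fires; consuming o_turn without ever re-producing a turn
// token permanently disables all move transitions: win places absorb.
func haltOnWin(net *Net) {
	for i := range winningLines {
		net.Arcs[fmt.Sprintf("win_X_%d", i)]["o_turn"] = -1
		net.Arcs[fmt.Sprintf("win_O_%d", i)]["x_turn"] = -1
	}
}
```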
Using go-pflow, we can convert this Petri net into an ODE system and simulate it forward: token accumulation in the win places predicts game outcomes.
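go-pflow handles this conversion internally; the sketch below shows one standard way to do it by hand for the toy net, under a mass-action assumption: each transition flows at a rate equal to the product of its input places' concentrations, and the marking is integrated with forward Euler. (This fragment also needs "math" added to the sketch's imports.)

```go
// simulate turns the net into ODEs and integrates with forward Euler.
// Each transition contributes a flow of rate = product over its input
// places of x[p]^weight (mass-action kinetics); the flow drains inputs
// and feeds outputs in proportion to the arc deltas.
func simulate(net *Net, steps int, dt float64) map[string]float64 {
	x := map[string]float64{}
	for p, m := range net.Marking {
		x[p] = float64(m)
	}
	for s := 0; s < steps; s++ {
		dx := map[string]float64{}
		for _, arcs := range net.Arcs {
			rate := 1.0
			for p, d := range arcs {
				if d < 0 {
					rate *= math.Pow(x[p], float64(-d)) // inputs gate the flow
				}
			}
			for p, d := range arcs {
				dx[p] += rate * float64(d)
			}
		}
		for p, d := range dx {
			x[p] += dt * d
			if x[p] < 0 {
				x[p] = 0 // crude clamp against Euler overshoot
			}
		}
	}
	return x
}
```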
For each possible move, we:

- apply the candidate move to the net
- simulate the ODE system forward
- read the accumulated tokens in the win_x and win_o places

X evaluates all 9 positions (see the sketch after this example). The ODE reveals strategic value through model topology alone:
The center position scores highest (1.27) because it participates in 4 winning patterns (2 diagonals + row + column). Corners score next (0.95, 3 patterns each). Edges score lowest (0.63, 2 patterns each). The dotted box indicates the recommended move.
No game heuristics are coded. Strategic value emerges purely from the Petri net's structure.
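Here is a sketch of the evaluation loop referenced above, using simulate from the previous listing; the horizon and step size are arbitrary illustration choices.

```go
// cloneNet copies the marking so trial moves don't disturb the real
// game; the arc table is shared because Fire only mutates Marking.
func cloneNet(n *Net) *Net {
	c := &Net{Marking: map[string]int{}, Arcs: n.Arcs}
	for p, m := range n.Marking {
		c.Marking[p] = m
	}
	return c
}

// bestMove tries every square, simulates the ODE after each legal
// move, and ranks moves by my_win_final - opponent_win_final.
func bestMove(net *Net, player string) (string, float64) {
	myWin, oppWin := "win_x", "win_o"
	if player == "O" {
		myWin, oppWin = "win_o", "win_x"
	}
	bestT, bestScore := "", math.Inf(-1)
	for r := 0; r < 3; r++ {
		for c := 0; c < 3; c++ {
			trial := cloneNet(net)
			t := fmt.Sprintf("%s%d%d", player, r, c)
			if !trial.Fire(t) {
				continue // occupied square or not this player's turn
			}
			x := simulate(trial, 2000, 0.01) // arbitrary horizon and step
			if score := x[myWin] - x[oppWin]; score > bestScore {
				bestT, bestScore = t, score
			}
		}
	}
	return bestT, bestScore
}
```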
After X takes center, O evaluates defensive options:
All scores are negative (X has the advantage), but corners (-1.04) minimize X's lead better than edges (-1.38). The ODE naturally discovers the defensive principle: corners against center.
X has center, O has corner. The ODE identifies the diagonal threat:
Positions (0,2) and (2,0) score highest (1.39) because they create two-way threats: each completes one diagonal while opening another winning path. The model's pattern collectors naturally amplify these fork positions.
O faces X's imminent win threat (X has two in a row):
Position (1,2) scores best (0.80) because it blocks X's winning move. Other positions score lower (0.74–0.77). The ODE simulation naturally discovers blocking as the dominant strategy—no explicit "if opponent has two in a row, block" logic required.
A critical enhancement: the model tracks move tokens to detect draws. Each play transition deposits a token into a move_tokens counter. A draw transition fires when:
- move_tokens has accumulated nine tokens (the board is full)
- game_active is still marked (no winner yet)

The draw transition awards a point to win_o, making "not losing" valuable. This has a profound effect on the ODE dynamics:
Without draw detection: Games that end in draws "leak" out of the system without affecting win counts. The ODE only sees win paths, causing it to favor positions with more winning lines (corners) even when blocking is essential.
With draw detection: All outcomes flow to either win_x or win_o. Blocking a threat now has measurable value because it preserves the possibility of a draw—and draws count for O.
This is why scores are now positive for O (0.74–0.80 instead of negative). O's expected outcome includes draws, making blocking the clearly dominant move.
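In the toy encoding, the draw logic could be wired like this. The game_active place and the weight-9 arc follow the description above, and I assume the win transitions would also consume game_active so a decided game never reaches the draw transition.

```go
// addDrawDetection: every move also increments move_tokens; the draw
// transition needs all nine of them plus the still-marked game_active
// place, and pays its point into win_o.
func addDrawDetection(net *Net) {
	for r := 0; r < 3; r++ {
		for c := 0; c < 3; c++ {
			net.Arcs[fmt.Sprintf("X%d%d", r, c)]["move_tokens"] = +1
			net.Arcs[fmt.Sprintf("O%d%d", r, c)]["move_tokens"] = +1
		}
	}
	net.Marking["game_active"] = 1
	net.Arcs["draw"] = map[string]int{
		"move_tokens": -9, // board full: nine moves recorded
		"game_active": -1, // no winner yet (wins should consume this too)
		"win_o":       +1, // "not losing" is credited to O
	}
}
```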
The magic is in the pattern collector transitions. When X occupies two squares of a winning pattern, the ODE flow toward win_x increases. Blocking that pattern cuts off the flow.
Score = my_win_final − opponent_win_final
From each player's perspective, higher is better. The formula encodes minimax-like behavior: maximize your win potential while minimizing your opponent's. With draw detection included, the continuous dynamics correctly approximate discrete game tree search.
Note: Examples 1–3 show scores from X's perspective (before draw detection was added), while Example 4 shows O's perspective with the full model. The relative rankings within each example demonstrate how the ODE naturally discovers strategic principles.
Data generated with go-pflow/examples/tictactoe
This tic-tac-toe model demonstrates how a simple game becomes deeply expressive when framed through Petri-net structure. By building the system in layers—grid and turn alternation, history tracking, pattern collectors, and win detection—we create a model capable of representing not just what happened, but why.
Every rule arises locally from token flow. There's no central controller, no global conditional logic—just emergent behavior from places, transitions, and structured history. The ODE simulation reveals strategic value without any game-specific heuristics.
For more on this approach, see Declarative Differential Models.