Prisoner's Dilemma Simulator: Axelrod Tournament Strategies

TFT ≈ 2.68 per round — Tit-for-Tat wins the tournament

With default parameters (200 rounds, 5% noise, 50% TFT, 25% Defect, 25% Cooperate), Tit-for-Tat achieves the highest average score per round, confirming Axelrod's finding that reciprocal cooperation is the most robust strategy.

Formula

Payoff matrix: T=5, R=3, P=1, S=0 with T > R > P > S and 2R > T + S
TFT strategy: cooperate on round 1, then copy opponent's previous move
Expected score (TFT vs TFT): R = 3 per round
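The payoff values and dilemma conditions above can be checked in a few lines. A minimal sketch; the variable names are ours:

```python
# Standard Axelrod payoffs from the text above.
T, R, P, S = 5, 3, 1, 0

# Both ordering conditions must hold for the dilemma structure.
assert T > R > P > S          # temptation > reward > punishment > sucker
assert 2 * R > T + S          # mutual cooperation beats alternating exploitation

# When two TFT players meet without noise, both cooperate every round,
# so each earns the reward payoff per round.
tft_vs_tft = R
print(tft_vs_tft)  # 3
```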

The Prisoner's Dilemma

The Prisoner's Dilemma is one of the most studied models in game theory. Two players simultaneously choose to cooperate (C) or defect (D). If both cooperate, they each receive a reward R=3. If both defect, they each get a punishment P=1. But if one defects while the other cooperates, the defector gets the temptation payoff T=5 while the cooperator gets the sucker's payoff S=0. The dilemma: rational self-interest drives both to defect, even though mutual cooperation is better for both.
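The one-shot logic can be made concrete with a small payoff lookup. A sketch with an illustrative helper of our own, showing why defection strictly dominates:

```python
T, R, P, S = 5, 3, 1, 0

def payoff(my_move, their_move):
    """Player 1's payoff for a single round ('C' or 'D' moves)."""
    table = {('C', 'C'): R, ('C', 'D'): S,
             ('D', 'C'): T, ('D', 'D'): P}
    return table[(my_move, their_move)]

# Whatever the opponent does, defecting pays more than cooperating:
assert payoff('D', 'C') > payoff('C', 'C')  # 5 > 3
assert payoff('D', 'D') > payoff('C', 'D')  # 1 > 0
# Yet mutual defection (1 each) is worse than mutual cooperation (3 each).
```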

Axelrod's Tournament

In 1980, political scientist Robert Axelrod invited game theorists to submit strategies for a computer tournament of the iterated Prisoner's Dilemma. The winner was Tit-for-Tat (TFT), submitted by Anatol Rapoport — the simplest strategy entered. TFT cooperates on the first move, then copies whatever the opponent did on the previous move. It won not by beating any single opponent, but by accumulating high scores against a wide range of strategies.
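TFT's rule is short enough to write out directly. A minimal sketch (function names are ours, not tournament code), paired against an always-defect opponent:

```python
def tit_for_tat(opponent_history):
    """Cooperate on round 1, then copy the opponent's previous move."""
    return 'C' if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return 'D'

def play_match(strategy_a, strategy_b, rounds=10):
    """Play two strategies against each other; return both move sequences."""
    hist_a, hist_b = [], []
    for _ in range(rounds):
        move_a = strategy_a(hist_b)  # each side sees the other's history
        move_b = strategy_b(hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return hist_a, hist_b

moves_tft, moves_defect = play_match(tit_for_tat, always_defect, rounds=4)
print(moves_tft)  # ['C', 'D', 'D', 'D']: cooperates once, then retaliates
```

Against a defector, TFT loses only the first round and matches defection thereafter; against a cooperator it cooperates throughout, which is why it scores well across a whole field of strategies.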

Why Reciprocity Works

TFT embodies four principles that Axelrod identified as keys to success: niceness (never be the first to defect), retaliation (respond to defection immediately), forgiveness (return to cooperation if the opponent does), and clarity (be predictable enough that opponents can learn to cooperate with you). These principles have profound implications far beyond game theory — they illuminate the evolution of cooperation in biology, diplomacy, and everyday social interaction.

The Role of Noise

In real-world interactions, signals are imperfect. The 'noise' parameter models accidental defections or misunderstood cooperations. Under noise, strict TFT can get trapped in cycles of mutual retaliation triggered by a single error. This led researchers like Nowak and Sigmund to discover strategies like Win-Stay Lose-Shift (Pavlov) that are more robust to noise. Increase the noise slider to see how error degrades cooperative strategies.
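The echo effect can be reproduced in a short simulation. A sketch under our own naming, where each intended move is flipped with the given noise probability:

```python
import random

def tit_for_tat(opponent_history):
    return 'C' if not opponent_history else opponent_history[-1]

def noisy(move, noise, rng):
    """With probability `noise`, the intended move is flipped."""
    if rng.random() < noise:
        return 'D' if move == 'C' else 'C'
    return move

def match(noise, rounds=200, seed=0):
    """Average per-round score for TFT vs TFT under execution noise."""
    rng = random.Random(seed)
    T, R, P, S = 5, 3, 1, 0
    pay = {('C', 'C'): R, ('C', 'D'): S, ('D', 'C'): T, ('D', 'D'): P}
    hist_a, hist_b, score = [], [], 0
    for _ in range(rounds):
        a = noisy(tit_for_tat(hist_b), noise, rng)
        b = noisy(tit_for_tat(hist_a), noise, rng)
        hist_a.append(a)
        hist_b.append(b)
        score += pay[(a, b)]
    return score / rounds

# Without noise, TFT vs TFT earns R = 3 per round; with 5% noise, single
# errors trigger retaliation cycles and the average falls below 3.
print(match(noise=0.0), match(noise=0.05))
```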

Interpreting the Simulation

Adjust the population shares to see how different ecological compositions change outcomes. In a world dominated by defectors, even TFT struggles. But when enough cooperative or reciprocal strategies are present, they form clusters of mutual cooperation that outperform pure defection. This is the essence of Axelrod's insight: cooperation can evolve and sustain itself even among self-interested agents, given sufficient repetition and the possibility of reciprocity.
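The ecological averaging can be sketched analytically. Here we use long-match per-round scores with no noise (a simplification of what the simulator computes; the 2.68 headline figure includes noise and finite-match effects), with the default population mix from the text:

```python
T, R, P, S = 5, 3, 1, 0

# Approximate long-run per-round score of the row strategy vs the column
# strategy, ignoring noise and first-round transients.
score = {
    ('TFT',  'TFT'):  R, ('TFT',  'ALLD'): P, ('TFT',  'ALLC'): R,
    ('ALLD', 'TFT'):  P, ('ALLD', 'ALLD'): P, ('ALLD', 'ALLC'): T,
    ('ALLC', 'TFT'):  R, ('ALLC', 'ALLD'): S, ('ALLC', 'ALLC'): R,
}
shares = {'TFT': 0.50, 'ALLD': 0.25, 'ALLC': 0.25}  # default mix from the text

avg = {s: sum(shares[o] * score[(s, o)] for o in shares) for s in shares}
print(avg)  # TFT comes out on top: {'TFT': 2.5, 'ALLD': 2.0, 'ALLC': 2.25}
```

Note how the defector's score depends heavily on the supply of unconditional cooperators to exploit; shrink the ALLC share and ALLD's average collapses toward P = 1.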

FAQ

What is the Prisoner's Dilemma?

The Prisoner's Dilemma is a canonical game in game theory where two rational agents each choose to cooperate or defect. Mutual cooperation yields a good outcome for both (R=3,3), but each has an individual incentive to defect (T=5 vs S=0), leading to mutual defection (P=1,1) — a worse outcome for both.

Why does Tit-for-Tat win in iterated games?

In Robert Axelrod's famous computer tournaments, first run around 1980 and analyzed in his 1984 book The Evolution of Cooperation, Tit-for-Tat won because it combines four key properties: it is nice (never defects first), retaliatory (punishes defection immediately), forgiving (returns to cooperation after one punishment), and clear (its strategy is easy to recognize).

What is the payoff matrix T=5, R=3, P=1, S=0?

These are the standard Prisoner's Dilemma payoffs: T (Temptation to defect) = 5, R (Reward for mutual cooperation) = 3, P (Punishment for mutual defection) = 1, S (Sucker's payoff) = 0. The condition T > R > P > S ensures the dilemma structure.

How does noise affect the Prisoner's Dilemma?

Noise — random errors in executing the intended action — can trigger 'echo effects' in Tit-for-Tat, where a single error cascades into alternating defections. This is why more forgiving strategies like Generous TFT or Win-Stay Lose-Shift can outperform strict TFT in noisy environments.
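A forgiving variant can be sketched in a few lines. The function name and the 10% forgiveness rate below are illustrative choices of ours (Nowak and Sigmund derive the optimal rate from the payoffs):

```python
import random

def generous_tft(opponent_history, rng, forgiveness=0.1):
    """Copy the opponent's last move, but after a defection
    cooperate anyway with probability `forgiveness`."""
    if not opponent_history:
        return 'C'
    if opponent_history[-1] == 'D' and rng.random() < forgiveness:
        return 'C'  # occasional forgiveness breaks the echo cycle
    return opponent_history[-1]

rng = random.Random(42)
# Strict TFT facing a defection would retaliate every time; Generous TFT
# eventually offers cooperation again and can restart mutual cooperation.
moves = [generous_tft(['D'], rng) for _ in range(50)]
print('C' in moves)  # True: forgiveness fires at least once in 50 tries
```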


Embed

<iframe src="https://homo-deus.com/lab/game-theory/prisoners-dilemma/embed" width="100%" height="400" frameborder="0"></iframe>
View source on GitHub