Stag Hunt

A page for ideas about modeling signalling in the stag-hunt game.

(Anna - can you edit this okay? Use the "raw edit" rather than "edit" as the latter actually wrecks some formatting)

I was trying to think of ways that agents might pay attention to arbitrary cues, which was proving hard to imagine. But I think Joseph wants to take it for granted that some activities enhance the chance that others will cooperate in a subsequent stag hunt game.

Then we might expect to see agents preferentially choosing to take part in such activities, rather than other activities that result in less or no such enhancement. They should do so even if these other activities are more immediately lucrative (at the time they're carried out). Likewise they might take part in the enhancing activities despite these being costly at the time they're carried out.

An illustrative model might go something like...

The causal model is like this:

drawing1.png

Agents have a binary current state: $s_i \in \{0,1\}$. If $s_i = 1$ they are (currently) inclined to hunt stags. A stag hunt is a game in which the participants get a payoff of 3 (or something...), but only provided there are enough other participants, say $\geq \Gamma$ of them. If there aren't enough, the $s=1$ participants get zero. And if $s_i = 0$ they get a bankable payoff of 1, regardless of what others choose.
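The payoff rule above might be sketched as follows (a minimal illustration; the payoff value `b = 3` and the threshold `gamma` are just the placeholders from the text):

```python
def stag_hunt_payoffs(s, b=3.0, gamma=3):
    """Payoffs for one stag hunt.

    s: list of binary states, one per participant.
    Agents with s_i = 1 hunt together and each get b, but only if at
    least gamma of them hunt; otherwise the hunters get zero.
    Agents with s_i = 0 bank a payoff of 1 regardless.
    """
    hunters = sum(s)
    return [(b if hunters >= gamma else 0.0) if si == 1 else 1.0 for si in s]
```

For example, `stag_hunt_payoffs([1, 1, 1, 0])` gives `[3.0, 3.0, 3.0, 1.0]`, while `stag_hunt_payoffs([1, 1, 0, 0])` gives `[0.0, 0.0, 1.0, 1.0]` because only two agents hunted.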

Suppose there are lots of activities (indexed by $a$), with costs $c_a$ and abilities to enhance cooperation $\alpha_a$. We could just say that $\alpha_a$ is the probability that any given agent undertaking the activity will play $s=1$ in a subsequent stag hunt game. That is, the activity totally determines the subsequent behaviour, and overrides any natural inclination or previous experience. For now!
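In code, this "activity overrides inclination" assumption is just a coin flip weighted by $\alpha_a$ (a sketch; the function name is made up here):

```python
import random

def state_after_activity(alpha_a, rng=random.random):
    # The activity fully determines the agent's next stag-hunt state,
    # overriding whatever the agent was previously inclined to do.
    return 1 if rng() < alpha_a else 0
```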

The $i$-th agent might have an underlying nature consisting of preferences for engaging in different activities. We could denote the preference of this agent for activity $a$ by, say, $P^i_a$.

Dynamics might then be idealised as an iterative process in which, each iteration,
  1. agents choose what activity they do, by sampling from the distribution given by $P^i_a$, where $\sum_a P^i_a = 1$
  2. for each group $G_a$, composed of those agents that chose activity $a$:
    • each agent starts off with an initial payoff of $1 - c_a$, reflecting the cost of taking part in the activity.
    • each agent in the group sets $s_i = 1$ with probability $\alpha_a$, and $s_i = 0$ otherwise.
    • those that have $s_i = 1$ engage in a stag hunt together and receive the ensuing payoff, if there is one, while those with $s_i = 0$ each get a payoff of 1.
  3. some form of replication occurs: agents whose preferences lead to high payoffs should be over-represented in the next iteration. But some variation is also introduced: the copying of $P^i_a$ is imperfect.
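One iteration of the dynamics above could be sketched like this (a toy implementation; fitness-proportional copying with Gaussian noise is an assumption standing in for "imperfect replication", and all parameter values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

N = 200      # population size
A = 10       # number of activities
GAMMA = 3    # minimum number of hunters for a successful stag hunt
B = 3.0      # stag-hunt payoff
MUT = 0.05   # noise on copied preferences

# random activities: costs c_a and enhancements alpha_a, both in [0, 1]
c = rng.uniform(0.0, 1.0, size=A)
alpha = rng.uniform(0.0, 1.0, size=A)

# each agent's preferences P^i_a: a distribution over activities
P = rng.dirichlet(np.ones(A), size=N)

def one_iteration(P):
    # 1. each agent samples an activity from its preference distribution
    acts = np.array([rng.choice(A, p=P[i]) for i in range(N)])
    payoff = np.empty(N)
    # 2. each group pays the cost and then plays
    for a in range(A):
        members = np.flatnonzero(acts == a)
        s = rng.random(members.size) < alpha[a]  # hunt with prob alpha_a
        hunters = s.sum()
        pay = np.where(s, B if hunters >= GAMMA else 0.0, 1.0)
        payoff[members] = pay - c[a]
    # 3. fitness-proportional replication with noisy copying of P^i_a
    fitness = payoff - payoff.min() + 1e-9  # shift so all weights > 0
    parents = rng.choice(N, size=N, p=fitness / fitness.sum())
    newP = np.clip(P[parents] + rng.normal(0.0, MUT, size=(N, A)), 1e-9, None)
    newP /= newP.sum(axis=1, keepdims=True)  # renormalise to the simplex
    return newP, payoff
```

Running `one_iteration` repeatedly should let preferences drift toward cheap, high-$\alpha_a$ activities, which is the prediction below.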

We'd perhaps have lots of activities, with costs between zero and one, and cooperation-induction abilities between zero and one. A simulation could assign these at random. Very high $\alpha_a$ are implausible, however...
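Random activities for a simulation might be drawn like this (the cap on $\alpha_a$ is an arbitrary choice reflecting the point that very high enhancement is implausible):

```python
import numpy as np

rng = np.random.default_rng(1)
A = 20  # number of activities

costs = rng.uniform(0.0, 1.0, size=A)
# Cap the enhancement below 1.0 -- the 0.8 is a made-up ceiling,
# since near-certain cooperation induction is implausible.
ALPHA_MAX = 0.8
alphas = rng.uniform(0.0, ALPHA_MAX, size=A)
```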

We expect to see everyone "learning" (or "evolving") over time to prefer activities that have low cost $c_a$ and high enhancement $\alpha_a$. If the choice is limited and there are no zero-cost activities that enhance, we'd expect to see agents choosing to do activities that enhance subsequent stag hunting, despite the associated costs. They ought to prefer these over activities that are cheaper but don't do enough enhancing.

Open problem (maths): write the expected payoff an agent receives (on average) as a function of its preferences, and look at the gradient. Hence derive the trade-off between cost and enhancement, as a function of the stag-hunt benefit (e.g. 3, but this is arbitrary), population size, $\Gamma$, etc.
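A first step, under one reading of the payoff rules above (the baseline 1 is the $s=0$ payoff, everyone pays $c_a$, and the stag-hunt benefit is $b$): if activity $a$ attracts a group of size $n_a$, then from a focal hunter's point of view the number of *other* hunters is $\mathrm{Bin}(n_a - 1, \alpha_a)$, so

$$ \mathbb{E}[\pi \mid a] = -c_a + (1 - \alpha_a) + \alpha_a \, b \, \Pr\!\big[\, 1 + \mathrm{Bin}(n_a - 1, \alpha_a) \geq \Gamma \,\big] $$

and an agent's overall expected payoff is $\sum_a P^i_a \, \mathbb{E}[\pi \mid a]$, which is linear in the preferences, so on the simplex the selection gradient is driven by the differences $\mathbb{E}[\pi \mid a] - \mathbb{E}[\pi \mid a']$. Note that $n_a$ itself depends on everyone else's preferences, so this is only a mean-field-style sketch, not the full derivation asked for above.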