\begin{abstract}
    Altruistic cooperation is costly yet socially desirable.
    As a result, agents struggle to learn cooperative policies through independent reinforcement learning (RL).
    Indirect reciprocity, where agents consider their interaction partner's reputation, has been shown to stabilise cooperation in homogeneous, idealised populations.
    However, more realistic settings comprise heterogeneous agents with different characteristics and group-based social identities.
    We study cooperation when agents are stratified into two such groups, and allow reputation updates and actions to depend on group information.
    We consider two modelling approaches: evolutionary game theory, where we comprehensively search for social norms (i.e., rules to assign reputations) leading to cooperation and fairness; and RL, where we consider how the stochastic dynamics of policy learning affect the analytically identified equilibria.
    We observe that a defecting majority leads the minority group to defect, but not the converse.
    Moreover, changing the norms that judge in- and out-group interactions can steer a system towards either fair or unfair cooperation.
    This effect becomes clearer when moving beyond equilibrium analysis to independent RL agents, where convergence to fair cooperation occurs under a narrower set of norms.
    Our results highlight that, in heterogeneous populations with reputations, carefully defining interaction norms is fundamental to tackling the dilemmas of both cooperation and fairness.
\end{abstract}