
1: \documentclass[10pt]{article}

2: \usepackage{latexsym}

3: \usepackage{epsfig}

4: \usepackage{amsmath}

5: \usepackage{graphics}

6: \usepackage{amssymb}

7: \usepackage{amsthm}

8: \usepackage{amsopn}

9: \usepackage{amscd}

10: \usepackage{fullpage}

11: %\usepackage{cec2003,multicol,times}

12:

13: \newtheorem{thm}{Theorem}

14: \newtheorem{cor}[thm]{Corollary}

15: \newtheorem{lem}[thm]{Lemma}

16: \newtheorem{prop}[thm]{Proposition}

17: \newtheorem{defn}[thm]{Definition}

18: \newtheorem{rem}[thm]{Remark}

19: \numberwithin{equation}{section}

20: \numberwithin{figure}{section}

21: \numberwithin{thm}{section}

22: \newtheorem{exm}[thm]{Example}

23:

24: \begin{document}

25:

26: %\pagestyle{empty}

27: %\sloppy

28:

29: %\twocolumn[

30:

31: \title{Using Simulated Annealing to Calculate the Trembles of Trembling Hand

32: Perfection}

33: %\vspace{0.1in}

34: %\begin{multicols}{2}

35: \begin{center}

36:

37: \textbf{Stuart McDonald}  \\

38: School of Economics \\

39: The University of Queensland \\

40: Queensland 4072, \\

41: Australia \\

42: s.mcdonald@mailbox.uq.edu.au\\

43: \end{center}

44:

45: \begin{center}

46: \textbf{Liam Wagner} \\

47: Department of Mathematics and \\

48: St John's College, within \\

49: The University of Queensland \\

50: Queensland 4072, Australia \\

51: LDW@maths.uq.edu.au

52: \end{center}

53: %\end{multicols}

54: %\vspace{0.25in}

55: %]

56:

57:

58: \begin{abstract}

\noindent Within the literature on non-cooperative game theory, a number of
algorithms have been proposed for computing Nash equilibria.
This paper shows that the family of algorithms known as Markov chain Monte Carlo (MCMC) can be used to calculate Nash equilibria. MCMC is a type of Monte Carlo simulation that relies on Markov chains to ensure its regularity conditions. MCMC has been widely used
throughout the statistics and optimization literature, where variants of
this algorithm are known as simulated annealing. This paper shows that there
is an interesting connection between the trembles that underlie the functioning
of this algorithm and the type of Nash refinement known as trembling hand
perfection, and that it is possible to use simulated annealing to compute this refinement.

67: \end{abstract}

68:

\noindent \textit{Keywords:} Trembling Hand Perfection, Equilibrium Selection and Computation, Simulated
Annealing, Markov Chain Monte Carlo

71:

72: \section{Introduction}

This paper develops an algorithm to compute a desired type of Nash equilibrium.
Furthermore, we use this algorithm to show the existence and uniqueness of a sensible Nash equilibrium. Our approach to this
problem has been motivated by the number of existing algorithms. The general
approach of the literature has been to rely on the geometric properties of the equilibrium.

77:

This paper is interested in computing Nash
equilibria that satisfy the type of Nash refinement referred to as ``trembling hand''
perfection \cite{Selt75} \cite{Selt78}. This paper shows that simulated annealing can be used to compute this refinement. Simulated annealing is a type of Monte Carlo sampling procedure that relies on Markov chains to ensure its regularity conditions.
Most applications of simulated annealing have concentrated on problems of
combinatorial optimization, such as routing and packing problems, or problems
from statistical pattern recognition, like image processing.

84:

Another well known group of algorithms for calculating perfect Nash equilibria are the
trace algorithms of Harsanyi and Selten
\cite{H+S}, where an outcome for the game is selected by ``tracing'' a
feasible path through a family of auxiliary games. The progress of the solution
along the feasible path is intended to represent the way in which players
adjust their expectations and predictions about the play of the game.

91:

92:

A major limitation of the tracing procedure is that the logarithmic version of
this method does not always provide a path that traces to a perfect
equilibrium. Harsanyi \cite[p.69]{harsanyi} has argued that this problem can be
resolved by eliminating all dominated pure strategies before applying the tracing
procedure. However, van Damme \cite[p.77]{vDam91} constructs examples which do
not require dominated pure strategies and in which the tracing procedure yields a
non-perfect equilibrium. Furthermore, van Damme suggested that the
inconsistency lies in the logarithmic control costs. Games with control costs
are normal form games in which players, in addition to choosing strategies,
incur costs that depend on how well they choose to control their actions.

103:

Another limitation of the tracing procedure is that it relies on the algebro-geometric
properties of the equilibrium. This approach has been commonly used throughout
the literature for computing the equilibria of non-cooperative games. For
example, the focus of Lemke and Howson \cite{L+H} for bimatrix
games, and of the Wilson \cite{Wils71} and Scarf \cite{Scar73} algorithms for
$N$-person games, has also been to utilise the fundamental geometry of games to
calculate equilibria. In general these approaches to equilibrium calculation
are computationally expensive.

112:

However, within game theory there is a history of Monte Carlo methods being
applied to solve non-cooperative games, starting with Ulam \cite{Ulam50}
in 1954. From the viewpoint of applying global optimization techniques to
infinite games, Monte Carlo simulation has been used by Georgobiani and
Torondzadze as a means of providing Nash equilibria for rectangular games
\cite{GT80}. This is the approach that we will be developing in this paper.

119:

120: This paper is organised as follows.

121: The second section of this paper introduces the MCMC algorithm and provides

122: some discussion of its convergence properties in terms of Markov chain

123: theory. As a starting point for this discussion the connection between MCMC

124: sampling techniques and Monte Carlo sampling techniques is explored. The

125: MCMC algorithms include the Gibbs sampler and the Metropolis algorithm and

126: are often called simulated annealing. The third section of this paper will

127: provide a characterization of these algorithms in terms of the trembling

128: hand of trembling hand perfection. With this in mind, we provide an example

129: of the use of simulated annealing applied to calculating Nash equilibrium.

130: In this example the solution leads to equilibria that result from trembling

131: hand perfection.

132:

133: \section{A Review of Simulated Annealing}

134:

Monte Carlo simulation has been used extensively for solving complicated
problems that defy an analytic formulation. The main idea behind Monte Carlo
simulation is either to construct a stochastic model that is in agreement
with the actual problem analytically, or to simulate the problem directly.
One problem with Monte Carlo methods is that if the underlying probability
distribution is non-standard, then the convergence of the sampled stochastic
process cannot be assured by the strong law of large numbers (SLLN). One way
around this is to realize that a stochastic process can be generated from any
process that draws its samples from the support of the underlying distribution.
Markov chain Monte Carlo (MCMC) does this by constructing a Markov chain that
has the underlying distribution as its stationary distribution. This enables the
simulation of the stochastic process for non-standard distributions, while
ensuring that the SLLN will hold.
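
To fix ideas, the property being relied upon can be written out as follows. The
statement is given in standard notation and is only a sketch: it holds under
suitable ergodicity conditions on the chain, which are not spelled out here.
Writing $\pi $ for the underlying distribution and $f$ for a function that is
integrable with respect to $\pi $, the sample average along the simulated chain
$X_{1},X_{2},...$ satisfies, with probability one,
\begin{equation*}
\frac{1}{n}\sum_{t=1}^{n}f\left( X_{t}\right) \rightarrow \int f\left(
x\right) \pi \left( x\right) dx\quad \text{as }n\rightarrow \infty .
\end{equation*}
The draws $X_{t}$ need not be independent for this to hold; it is enough
that the chain has $\pi $ as its stationary distribution and visits the
support of $\pi $ in the right proportions in the long run.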

148:

As an illustration of MCMC we will discuss the \emph{Metropolis algorithm} \cite
{MRRTT53}. In this algorithm, each iteration comprises $h$ updating
steps. Let $X_{t,i}$ denote the state of $X_{i}$ at the end of the $t$th
iteration. For step $i$ of iteration $t+1$, $X_{i}$ is updated using the
Metropolis algorithm. The candidate $Y_{i}$ is generated from a
\emph{proposal distribution} $q_{i}\left( Y_{i}|X_{t,i},X_{t,-i}\right) $,
where $X_{t,-i}$ denotes the value of
\begin{equation*}
X_{-i}=\left\{ X_{1},...,X_{i-1},X_{i+1},...,X_{h}\right\}
\end{equation*}
after completing step $i-1$ of iteration $t+1$, i.e.
\begin{equation*}
X_{t,-i}=\left\{ X_{t+1,1},...,X_{t+1,i-1},X_{t,i+1},...,X_{t,h}\right\} ,
\end{equation*}
where the components $X_{t,i+1},...,X_{t,h}$ have yet to be updated and
components $X_{t+1,1},...,X_{t+1,i-1}$ have already been updated. Thus the
proposal distribution of the $i$th component, $q_{i}\left( \cdot |\cdot
,\cdot \right) $, generates a candidate for only the $i$th component of $X$.
The candidate is accepted with probability
\begin{equation*}
\alpha \left( X_{-i},X_{i},Y_{i}\right) =\min \left( 1,\frac{\pi \left(
Y_{i}|X_{-i}\right) q_{i}\left( X_{i}|Y_{i},X_{-i}\right) }{\pi \left(
X_{i}|X_{-i}\right) q_{i}\left( Y_{i}|X_{i},X_{-i}\right) }\right) ,
\end{equation*}
where
\begin{equation*}
\pi \left( X_{i}|X_{-i}\right) =\frac{\pi \left( X\right) }{\int \pi \left(
X\right) dX_{i}}
\end{equation*}
is the full conditional distribution for $X_{i}$ under $\pi \left( \cdot
\right) $. If $Y_{i}$ is accepted, then $X_{t+1,i}=Y_{i}$; otherwise
$X_{t+1,i}=X_{t,i}$. The acceptance probability $\alpha \left(
X_{-i},X_{i},Y_{i}\right) $ is known as the \emph{Metropolis criterion}.

182:

One of the disadvantages of this algorithm is the complexity of the
Metropolis criterion $\alpha \left( X_{-i},X_{i},Y_{i}\right) $.
In practice $\alpha \left( X_{-i},X_{i},Y_{i}\right) $ often simplifies
considerably, particularly when $\pi \left( \cdot \right) $ derives from a
conditional independence model \cite{Gilks96} \cite{Rob96}. However, the
single component Metropolis algorithm has the advantage of employing the
full conditional distributions for $\pi \left( \cdot \right) $, and Besag
\cite{Besag74} has shown that $\pi \left( \cdot \right) $ will be uniquely
determined by its full conditional distributions. As a result, sampling under
the criterion $\alpha \left( X_{-i},X_{i},Y_{i}\right) $ will generate samples
from a unique target distribution $\pi \left( \cdot \right) $.

194:

195: An alternative approach for constructing a Markov chain with a stationary

196: distribution $\pi \left( \cdot \right) ,$ that provides a generalization of

197: the approach suggested by Metropolis et al. \cite{MRRTT53}, has been

198: suggested by Hastings \cite{Hast70}. At each point in time $t$, the next

199: state $X_{t+1}$ is chosen by first sampling a candidate point $Y$ from a

200: proposal distribution $q\left( \cdot |X_{t}\right) $. The candidate point $Y$

201: is then accepted in accordance with the criterion

202: \begin{equation*}

203: \alpha \left( X,Y\right) =\min \left( 1,\frac{\pi \left( Y\right) }{\pi

204: \left( X\right) }\right) .

205: \end{equation*}

Under this criterion, if the candidate point is accepted, then $X_{t+1}=Y$,
otherwise $X_{t+1}=X_{t}$. The main difference between this criterion and
the one given above for the single component algorithm is that, in the form
written here, the \emph{Metropolis-Hastings algorithm}, as it is named,
assumes that the proposal distributions are symmetric, i.e.
$q\left( Y|X\right) =q\left( X|Y\right) $. When the proposal is not
symmetric, which is often the case in higher dimensional problems, the ratio
in the criterion must be multiplied by $q\left( X|Y\right) /q\left(
Y|X\right) $. The main advantage of the symmetric form is that the proposal
distribution has no impact on the decision criterion, and therefore does not
alter the stationary distribution $\pi \left( \cdot \right) $ towards which
the algorithm converges.
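
As a small numerical illustration (the values are chosen here purely for
concreteness), suppose that the chain is at a state $X$ with $\pi \left(
X\right) =0.2$ and that a candidate $Y$ with $\pi \left( Y\right) =0.1$ is
drawn from a symmetric proposal. Then
\begin{equation*}
\alpha \left( X,Y\right) =\min \left( 1,\frac{0.1}{0.2}\right) =0.5,\qquad
\alpha \left( Y,X\right) =\min \left( 1,\frac{0.2}{0.1}\right) =1,
\end{equation*}
so a move to the less probable state is accepted half of the time, while the
reverse move is always accepted. Because $q\left( Y|X\right) =q\left(
X|Y\right) $, the probability flows in the two directions coincide:
\begin{equation*}
\pi \left( X\right) q\left( Y|X\right) \alpha \left( X,Y\right) =0.1\,q\left(
Y|X\right) =0.1\,q\left( X|Y\right) =\pi \left( Y\right) q\left( X|Y\right)
\alpha \left( Y,X\right) .
\end{equation*}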

217:

218: To provide a fuller explanation, the transition kernel of the

219: Metropolis-Hastings algorithm is given by

220: \begin{equation}

221: \begin{split}

222: &P\left( X_{t+1}|X_{t}\right) =q\left( X_{t+1}|X_{t}\right) \alpha \left(

223: X_{t},X_{t+1}\right) \\

224: &+I\left( X_{t+1}=X_{t}\right) \left[ 1-\int q\left( Y|X_{t}\right) \alpha

225: \left( X_{t},Y\right) dY\right] ,

226: \end{split}

227: \end{equation}

228:

229:

230: where $I\left( \cdot \right) $ is the indicator function. From $\alpha

231: \left( X_{t},X_{t+1}\right) $, we can see that

232: \begin{equation*}

233: \begin{split}

234: &\pi \left( X_{t}\right) q\left( X_{t+1}|X_{t}\right) \alpha \left(

235: X_{t},X_{t+1}\right) =\\

236: &\pi \left( X_{t+1}\right) q\left( X_{t}|X_{t+1}\right)

237: \alpha \left( X_{t+1},X_{t}\right) .

238: \end{split}

239: \end{equation*}

240: This implies that

241: \begin{equation*}

242: \pi \left( X_{t}\right) P\left( X_{t+1}|X_{t}\right) =\pi \left(

243: X_{t+1}\right) P\left( X_{t}|X_{t+1}\right) .

244: \end{equation*}

Integrating both sides of this equation with respect to $X_{t}$, we get

246: \begin{equation*}

247: \int \pi \left( X_{t}\right) P\left( X_{t+1}|X_{t}\right) dX_{t}=\pi \left(

248: X_{t+1}\right) .

249: \end{equation*}

This equation states that if $X_{t}$ is drawn from $\pi $, then so is
$X_{t+1}$. In other words, once one sample value has been obtained from the
stationary distribution, all subsequent samples are drawn from the
same distribution.
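
For completeness, the integration step can be spelled out as follows. This is
a routine verification that uses only the reversibility relation above and the
fact that $P\left( \cdot |X_{t+1}\right) $ is a probability distribution, so
that it integrates to one:
\begin{equation*}
\begin{split}
\int \pi \left( X_{t}\right) P\left( X_{t+1}|X_{t}\right) dX_{t} &=\int \pi
\left( X_{t+1}\right) P\left( X_{t}|X_{t+1}\right) dX_{t} \\
&=\pi \left( X_{t+1}\right) \int P\left( X_{t}|X_{t+1}\right) dX_{t}=\pi
\left( X_{t+1}\right) .
\end{split}
\end{equation*}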

254:

This is only a partial justification of the Metropolis-Hastings algorithm. A
full proof requires that $P^{\left( t\right) }\left( X_{t}|X_{0}\right) $
converges to the stationary distribution. For a heuristic justification of
this result, it can be noted that this distribution depends on the
starting value $X_{0}$; the proof must therefore show that the Markov chain
gradually forgets its starting point and converges to a unique stationary
distribution. Thus, after a sufficiently long \emph{burn-in} of $m$
iterations, the points $\left\{ X_{t};t=m+1,\,...,n\right\} $ will be dependent
samples drawn approximately from the stationary distribution. Hence the
\emph{burn-in sample} is usually discarded when calculating the ergodic mean
for $f\left( X\right) $
\begin{equation*}
\bar{f}=\frac{1}{n-m}\sum_{t=m+1}^{n}f\left( X_{t}\right) .
\end{equation*}

269:

270: %The most widely used variant of the MCMC is the \emph{Gibbs sampler} \cite

271: %{G+G84}. The Gibbs sampler draws its name from the Gibbs distribution of

272: %statistical physics. The algorithm uses the Gibbs distribution as itsstationary distribution and combines stochastic %relaxation and annealing to

273: %compute estimates of posterior probabilities. Sampling occurs from a \emph{%

274: %local} conditional probability distribution. The local conditional

275: %distribution is dependent on the global control parameter $T$ (the

276: %``temperature''), that varies between $0$ and $\infty $, depending

277: %respectively on whether the algorithm is directed or undirected.

278:

279: %The Gibbs sampler is similar to the single component Metropolis algorithm in

280: %its construction, and can be considered an extension of this algorithm. The

281: %algorithm exploits the equivalence between the Gibbs distribution and Markov

282: %random fields. In the Gibbs sampler, its $i$th component proposal

283: %distribution for $X_{t+1.i}$

284: %\begin{equation*}

285: %q_{i}\left( Y_{.i}|X_{.i},X_{.-i}\right) =\pi \left( Y_{.i}|X_{.-i}\right) ,

286: %\end{equation*}

287: %where $\pi \left( Y_{.i}|X_{.-i}\right) $ is the full conditional

288: %distribution. Thus the Gibbs sampler cuts through the intermediate step of

289: %satisfying the proposal distribution by sampling purely from the full

290: %conditional distribution -- a consequence of the Gibbs stationary

291: %distribution. The Gibbs sampler therefore has the advantage of the

292: %Metropolis-Hastings algorithm without requiring the symmetry of its Markov

293: %chain.

294:

295: %The transition kernel of the Gibbs sampler is then expressed by the product

296: %of the conditional densities of the individual steps required for each

297: %iteration:

298: %\begin{equation*}

299: %K\left( X,Y\right) =\prod_{i=1}^{d}\pi \left( Y_{.i}|X_{.-i}\right) .

300: %\end{equation*}

301: %The transition probabilities can then be expressed as follows:

302: %\begin{equation*}

303: %P\left[ X\left( t\right) =\omega |X\left( 0\right) =\eta ,\right]

304: %=\int_{A}K\left( X,Y\right) dY.

305: %\end{equation*}

306: %Given that $\pi \left( \cdot \right) $ will also depend on the temperature

307: %parameter $T$ (i.e. $\pi _{T}\left( \cdot \right) $), it can be shown that

308: %for any decreasing sequence $T\left( t\right) $, if $T\left( t\right) \geq

309: %N\Delta /\log t$, for all $t\geq t_{0}$ for some $t_{0}>2$ (where $\Delta $

310: %is the step size and $N$ is the number of iterations taken), \noindent then

311: %\begin{equation*}

312: %\lim_{t\rightarrow \infty }P\left[ X\left( t\right) =\omega |X\left(

313: %0\right) =\eta \right] =\pi _{0}\left( \omega \right) ,

314: %\end{equation*}

315: %for any starting configuration $\eta \in \Omega $ \cite[p.731]{G+G84}. Geman

316: %and Geman \cite[p.732]{G+G84} show that for any random function $f\left(

317: %X\right) $, for fixed $T$

318: %\begin{equation*}

319: %\lim_{n\rightarrow \infty }\frac{1}{n}\sum_{t=1}^{n}f\,\left( X\left(

320: %t\right) \right) =\int_{\Omega }f\,\left( \omega \right) d\pi \left( \omega

321: %\right) .

322: %\end{equation*}

323: %If we assume that there exists a $\tau $ such that

324: %\begin{equation*}

325: %S\subset \left\{ n_{t+1},...,n_{t+\tau }\right\}

326: %\end{equation*}

327: %for all $t$, then the above relationship will hold with probability one.

328: %Gelfand and Smith \cite{Gel+S90} indicate that these results can be

329: %generalized to enable sampling from arbitrary distributions.

330:

331: \section{Trembling Hand Algorithm}

332:

333: \subsection{A MCMC Algorithm for Computing Perfect Equilibria in

334: Strategic Games}

335:

In this sub-section we provide an algorithm for computing a perfect
equilibrium for a strategic game and show that this algorithm
provides a sequence of perturbed mixed strategies that will
eventually converge on perfection. The basic idea is to
select a Markov chain and then use this Markov chain to deliver a Nash
equilibrium via Markov chain approximation. The trick is to
nominate the appropriate Markov chain, with the most suitable
convergence properties, to deliver convergence of the sequence of
completely mixed Nash equilibria of perturbed games, or
$\varepsilon $-perfect equilibria, to a perfect equilibrium. This is the
objective that is undertaken in this section.

347:

348: Consider an $n$-person game in strategic form $G=\left( N,\left(

349: S_{i}\right) _{i\in N},\left( u_{i}\right) _{i\in N}\right) $ in which $%

350: N=\left\{ 1,...,n\right\} $ is the player set, each player $i\in N$ has a

351: finite set of pure strategies $S_{i}=\left\{ s_{i1},...,s_{ik_{i}}\right\} $

352: and a pay-off function $u_{i}:\times _{i\in N}S_{i}\rightarrow \mathbb{R}$

353: mapping the set of pure strategy profiles $\times _{i\in N}S_{i}$ into the

354: real number line.

355:

In the strategic game $G$, for each player $i\in N$ there is a set of
probability measures $\Delta _{i}$ that can be defined over the pure
strategy set $S_{i}$; this is player $i$'s mixed strategy set. The elements
of the set $\Delta _{i}$ are of the form $p_{i}:S_{i}\rightarrow \left[
0,1\right] $ where $\sum_{j=1}^{k_{i}}p_{ij}=1,$ with $p_{ij}=p_{i}\left(
s_{ij}\right) ,$ i.e. $\Delta _{i}$ is isomorphic to the unit simplex.

362:

363: We denote the elements of the space of mixed strategy profiles $\times

364: _{i\in N}\Delta _{i}$ by $p=\left( p_{1},...,p_{n}\right) ,$ where $%

365: p_{i}=\left( p_{i1},...,p_{ik_{i}}\right) \in \Delta _{i}$. As is the

366: convention we use the following short-hand notation $p=\left(

367: p_{i},p_{-i}\right) $, where $p_{-i}$ denotes the other components of $p$.

368:

For each player $i$, the pay-off function can be extended to the domain of
mixed strategy profiles, $u_{i}:\times _{i\in N}\Delta _{i}\rightarrow
\mathbb{R}$. The pay-off function for each player
$i\in N$ is defined as $u_{i}\left( p_{i},p_{-i}\right)
=\sum_{j=1}^{k_{i}}p_{ij}u_{i}\left( s_{ij},p_{-i}\right) $. A mixed
strategy profile $p\in \times _{i\in N}\Delta _{i}$ is a \textbf{Nash equilibrium}
of the strategic game $G$ if, for all players $i\in N$ and all
$p_{i}^{\prime }\in \Delta _{i}$,

377: \begin{equation}

378: u_{i}\left( p_{i},p_{-i}\right) \geq u_{i}\left( p_{i}^{\prime

379: },p_{-i}\right) .

380: \end{equation}
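
As an illustration of these definitions, consider the familiar game of
matching pennies; the game and the symbols $a$ and $b$ are introduced here
only for this example. Each of two players chooses $H$ or $T$, with
$u_{1}\left( H,H\right) =u_{1}\left( T,T\right) =1$, $u_{1}\left( H,T\right)
=u_{1}\left( T,H\right) =-1$ and $u_{2}=-u_{1}$. Writing $p_{1}=\left(
a,1-a\right) $ and $p_{2}=\left( b,1-b\right) $ for the probabilities placed
on $\left( H,T\right) $, the extended pay-off of player $1$ is
\begin{equation*}
\begin{split}
u_{1}\left( H,p_{2}\right) &=2b-1,\qquad u_{1}\left( T,p_{2}\right) =1-2b, \\
u_{1}\left( p_{1},p_{2}\right) &=a\left( 2b-1\right) +\left( 1-a\right)
\left( 1-2b\right) =\left( 2a-1\right) \left( 2b-1\right) .
\end{split}
\end{equation*}
At $b=1/2$ player $1$ receives $0$ whatever $a$ is, and at $a=1/2$ player $2$
receives $-\left( 2a-1\right) \left( 2b-1\right) =0$ whatever $b$ is. The
Nash inequality therefore holds, with equality, for both players at
$a=b=1/2$, so this profile is a Nash equilibrium.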

381:

Suppose that, as well as there being a positive probability $p_{ij}$ of a player
$i$ selecting a pure strategy $s_{ij}\in S_{i}$, there is a small
probability $\varepsilon _{ij}$ that the pure strategy $s_{ij}$ will be
chosen by $i$ out of error. In the case where player $i$ selects his $j$th
pure strategy $s_{ij}$ by mistake, the probability of doing so is given by
$q_{ij}$. The total probability of player $i$ selecting a pure strategy
$s_{ij}\in S_{i}$ is then given by

389: \begin{equation}

390: \hat{p}_{ij}=\left( 1-\varepsilon _{ij}\right) p_{ij}+\varepsilon

391: _{ij}q_{ij}.

392: \end{equation}

393:

It can be seen that in this case, the total probability of player $i$
selecting a pure strategy $s_{ij}\in S_{i}$ will be bounded below by

396: \begin{equation}

397: \hat{p}_{ij}\geq \varepsilon _{ij}q_{ij}.

398: \end{equation}

Setting $\eta _{ij}=\varepsilon _{ij}q_{ij}$, we can see that this condition
can be rewritten as

401: \begin{equation}

402: \hat{p}_{ij}\geq \eta _{ij}\quad \forall \,s_{ij}\in S_{i}\text{ and }i\in N,

403: \end{equation}

404: with

405: \begin{equation}

406: \sum_{j=1}^{k_{i}}\eta _{ij}<1\quad \forall \,i\in N.

407: \end{equation}
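
As a numerical illustration, consider a player $i$ with two pure strategies
who intends to play $p_{i}=\left( 1,0\right) $. The numbers, and the
simplifying assumption of a common error probability $\varepsilon
_{i1}=\varepsilon _{i2}=0.1$, are chosen here only for the example. If a
trembling player picks each pure strategy with probability
$q_{i1}=q_{i2}=1/2$, then
\begin{equation*}
\hat{p}_{i1}=0.9\times 1+0.1\times 0.5=0.95,\qquad \hat{p}_{i2}=0.9\times
0+0.1\times 0.5=0.05.
\end{equation*}
Here $\eta _{i1}=\eta _{i2}=0.05$, so that $\eta _{i1}+\eta _{i2}=0.1<1$, and
even the strategy that the player never intends to use is played with
positive probability.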

408:

409: This leads to the definition of a perturbed game $\left( G,\eta \right) $ as

410: a finite strategic game derived from the strategic game $G$, in which each

411: player $i$'s mixed strategy set is the set of completely mixed strategies

412: for player $i$ constrained by the probability of making an error

\begin{equation}
\Delta _{i}\left( \eta _{i}\right) =\left\{ p_{i}=\left(
p_{i1},...,p_{ik_{i}}\right) \in \Delta _{i};p_{ij}\geq \eta _{ij}\,\text{ and }
\sum\nolimits_{j=1}^{k_{i}}\eta _{ij}<1\right\} .
\end{equation}

418: A mixed strategy combination $p\in \times _{i\in N}\Delta _{i}\left( \eta

419: _{i}\right) $ is a Nash equilibrium of the perturbed game $\left( G,\eta

420: \right) $ iff the following condition is satisfied

\begin{equation}
\text{if }u_{i}\left( s_{ij},p_{-i}\right) <u_{i}\left( s_{il},p_{-i}\right) \text{
then }p_{ij}=\eta _{ij},\quad \forall \,s_{ij}\text{,\thinspace }s_{il}\in
S_{i}.
\end{equation}

426:

427: A mixed strategy $p\in $ $\times _{i\in N}\Delta _{i}$ is a \textbf{perfect

428: equilibrium} in the strategic game $G$ if there exists a sequence of

429: completely mixed strategy profiles $\left\{ p^{k}\right\} _{k=1}^{\infty }$

430: where $\lim_{k\rightarrow \infty }p^{k}=p$, and for every player $i\in N$

431: and for every $p_{i}^{\prime }\in \Delta _{i}$%

432: \begin{equation}

433: u_{i}\left( p_{i},p_{-i}^{k}\right) \geq u_{i}\left( p_{i}^{\prime

434: },p_{-i}^{k}\right) \quad \forall \,k=1,2,....

435: \end{equation}

436: In terms of our definition of a perturbed game, a mixed strategy is a

437: perfect equilibrium iff there exist some sequences $\left\{ \eta ^{k}=\left(

438: \eta _{1}^{k},...\eta _{n}^{k}\right) \right\} _{k=1}^{\infty }$ and $%

439: \left\{ p^{k}=\left( p_{1}^{k},...p_{n}^{k}\right) \right\} _{k=1}^{\infty }$

440: such that

441:

442: \begin{enumerate}

\item  each $\eta ^{k}>0$ and $\lim_{k\rightarrow \infty }\eta ^{k}=0$,

\item  each $p^{k}$ is a Nash equilibrium of the perturbed game
$\left( G,\eta ^{k}\right) $, and

447:

448: \item  $\lim_{k\rightarrow \infty }p^{k}=p$ where for every player $i\in N$

449: and for every $p_{i}^{\prime }\in \Delta _{i}$%

450: \begin{equation}

451: u_{i}\left( p_{i},p_{-i}^{k}\right) \geq u_{i}\left( p_{i}^{\prime

452: },p_{-i}^{k}\right) \quad \forall \,k=1,2,....

453: \end{equation}

454: \end{enumerate}
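
A simple example shows how this definition discriminates between Nash
equilibria. The game below is a standard textbook-style illustration and is
not taken from the references cited here. Let each of two players have two
pure strategies, labelled $T,B$ for player $1$ and $L,R$ for player $2$, and
let both players receive $1$ at $\left( T,L\right) $ and $0$ at every other
pure strategy profile. Both $\left( T,L\right) $ and $\left( B,R\right) $ are
Nash equilibria. In any perturbed game $\left( G,\eta \right) $, however,
player $2$ plays $L$ with probability $p_{2L}\geq \eta _{2L}>0$, so that
\begin{equation*}
u_{1}\left( T,p_{2}\right) =p_{2L}>0=u_{1}\left( B,p_{2}\right) ,
\end{equation*}
and the equilibrium condition for perturbed games forces $p_{1B}=\eta _{1B}$.
By the same argument $p_{2R}=\eta _{2R}$. The unique Nash equilibrium of
$\left( G,\eta ^{k}\right) $ is therefore $p_{1}^{k}=\left( 1-\eta
_{1B}^{k},\eta _{1B}^{k}\right) $, $p_{2}^{k}=\left( 1-\eta _{2R}^{k},\eta
_{2R}^{k}\right) $, which converges to $\left( T,L\right) $ as $\eta
^{k}\rightarrow 0$. Hence $\left( T,L\right) $ is perfect, whereas $\left(
B,R\right) $ is a Nash equilibrium that is not perfect.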

455:

An alternative definition of perfection has been given by Myerson
\cite[pp 75--76]{Myers78} and is based on the idea that every pure strategy
in a player's set of pure strategies has associated with it a small positive
probability, but only strategies that are best
responses may have associated probabilities greater than $\varepsilon >0$. More
formally, a mixed strategy profile $p\in \times _{i\in N}\Delta _{i}$ is
an $\varepsilon $\textbf{-perfect equilibrium} iff, for each player $i\in N$,
$p_{i}\in \Delta _{i}$ is completely mixed and
\begin{equation}
\text{if }u_{i}\left( s_{ij},p_{-i}\right) <u_{i}\left( s_{il},p_{-i}\right) \text{
then }p_{ij}\leq \varepsilon ,\text{\quad }\forall \,s_{ij}\text{,\thinspace
}s_{il}\in S_{i}.
\end{equation}

Unlike Nash equilibria of perturbed games, an $\varepsilon $-perfect
equilibrium of a game $G$ will not necessarily be one of its Nash equilibria.
However, Myerson does show that $p=\left( p_{1},...,p_{n}\right) \in \times
_{i\in N}\Delta _{i}$ will be a perfect equilibrium iff there exist sequences
$\left\{ \varepsilon ^{k}\right\} _{k=1}^{\infty }$ and $\left\{
p^{k}\right\} _{k=1}^{\infty }$ such that

473:

474: \begin{enumerate}

475: \item  each $\varepsilon ^{k}>0$ and $\lim_{k\rightarrow \infty }\varepsilon

476: ^{k}=0$,

477:

478: \item  each $p^{k}$ is an $\varepsilon ^{k}$-perfect equilibrium of the game

479: $G$, and

480:

481: \item  $\lim_{k\rightarrow \infty }p_{i}^{k}=p_{i}$ for every player $i\in

482: N. $

483: \end{enumerate}
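
To see these conditions at work, consider the following two-player game,
which is introduced here only as an illustration. Player $1$ chooses $T$ or
$B$, player $2$ chooses $L$ or $R$, and both players receive $1$ at $\left(
T,L\right) $ and $0$ otherwise. For $0<\varepsilon <1$ let
\begin{equation*}
p_{1}^{\varepsilon }=\left( 1-\varepsilon ,\varepsilon \right) ,\qquad
p_{2}^{\varepsilon }=\left( 1-\varepsilon ,\varepsilon \right) ,
\end{equation*}
where the first coordinate is the probability of $T$, respectively $L$. Both
strategies are completely mixed, and since $u_{1}\left( B,p_{2}^{\varepsilon
}\right) =0<1-\varepsilon =u_{1}\left( T,p_{2}^{\varepsilon }\right) $ the
inferior strategy $B$ carries probability $\varepsilon $, as required; the
same holds for $R$. Thus $p^{\varepsilon }$ is an $\varepsilon $-perfect
equilibrium. It is not a Nash equilibrium, because player $1$ obtains
$\left( 1-\varepsilon \right) ^{2}$ from $p_{1}^{\varepsilon }$ but
$1-\varepsilon $ from playing $T$ with certainty. Taking $\varepsilon
^{k}=1/\left( k+1\right) $ gives a sequence of $\varepsilon ^{k}$-perfect
equilibria converging to $\left( T,L\right) $, which is therefore perfect.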

484:

485: The starting basis for the MCMC algorithm for calculating

486: perfection will be to follow Myerson by constructing a sequence of

487: $\varepsilon $-perfect equilibria for the strategic game $G$. As

488: stated above, we know that for the

489: strategic game $G$, $p\in \times _{i\in N}\Delta _{i}$ is an $\varepsilon $%

490: -perfect equilibrium iff for each player $i\in N$, $p_{i}\in \Delta

491: _{i}$ is a completely mixed strategy and

492: \begin{equation}

493: \begin{split}

&\text{if }u_{i}\left( s_{ij},p_{-i}\right) <u_{i}\left( s_{il},p_{-i}\right)
\text{ then }p_{ij}\leq \varepsilon,\\
&\text{\quad }\forall
\,s_{ij}\text{,\thinspace }s_{il}\in S_{i}.

498: \end{split}

499: \end{equation}

500:

501: Following Myerson \cite[p 79]{Myers78} we define the following set

502: of mixed strategies for each player $i\in N$

503: \begin{equation}

504: \Delta _{i}^{*}=\left\{ p_{i}\in \Delta _{i};p_{ij}\geq \delta

505: \;\,\forall \,s_{ij}\in S_{i}\right\} ,

506: \end{equation}

507: where

508: \begin{equation}

509: \delta =\frac{1}{m}\varepsilon ^{m},\quad 0<\varepsilon <1

510: \end{equation}

511: with $m=\max_{i\in N}\left| S_{i}\right| $. We then define a

512: point-to-set mapping $F_{i}:\times _{i\in N}\Delta

513: _{i}^{*}\rightarrow \Delta _{i}^{*}$ to be a family of completely

514: mixed distributions contained in $\Delta _{i}^{*}$

515: \begin{equation}

516: \begin{split}

&F_{i}\left( p_{1},...,p_{n}\right) =\left\{ p_{i}^{*}\in \Delta
_{i}^{*};\text{ if }u_{i}\left( s_{ij},p_{-i}\right) <u_{i}\left(
s_{il},p_{-i}\right)\right.\\
&\left.\text{ then }p_{ij}^{*}\leq \varepsilon ,\text{\quad }\forall \,s_{ij}\text{%
,\thinspace }s_{il}\in S_{i}\right\}

522: \end{split}

523: \end{equation}

524:

525: If we then define, for each player $i\in N$, a mixed strategy

526: \begin{equation}

p_{ij}^{*}=\frac{e^{\rho \left( s_{ij}\right)
}}{\sum_{l=1}^{k_{i}}e^{\rho \left( s_{il}\right) }},

529: \end{equation}

530: where

531: \begin{equation}

532: \rho \left( s_{ij}\right) =\left| \left\{ s_{il}\in

533: S_{i};u_{i}\left( s_{ij},p_{-i}\right) <u_{i}\left(

534: s_{il},p_{-i}\right) \text{ and }p\in \times _{i\in N}\Delta

535: _{i}^{*}\right\} \right|

536: \end{equation}

then it can be seen that $p_{i}^{*}\in F_{i}\left(
p_{1},...,p_{n}\right) $, so that $F_{i}\left( p_{1},...,p_{n}\right) $ is
non-empty. As each $F_{i}\left( p_{1},...,p_{n}\right) $ is defined by a
finite collection of linear inequalities, it is also a closed convex set.
In addition, by the continuity of the pay-off function
$u_{i}\left( s_{ij},\cdot \right) $, each $F_{i}$ is
upper semi-continuous.

544:

As a consequence the mapping $F:\times _{i\in N}\Delta
_{i}^{*}\rightarrow \times _{i\in N}\Delta _{i}^{*}$ satisfies all
the conditions of the Kakutani Fixed Point Theorem. It follows that $F$ has a
fixed point, i.e. there exists some completely mixed strategy
$p_{\varepsilon }\in \times _{i\in N}\Delta _{i}^{*}$ such
that $p_{\varepsilon }$ is an $\varepsilon $-perfect equilibrium of $G$. As
$\times _{i\in N}\Delta _{i}$ is compact, a sequence of $\varepsilon $-perfect
equilibria $p_{\varepsilon }$ has a subsequence that converges to some $p$ as
$\varepsilon \rightarrow 0$, where $p$ is a perfect equilibrium of $G$.

555:

An alternative route to the same result uses an argument based on the
convergence properties of Markov chains.

559:

560: \begin{thm}

561: For any normal form game $G=\left( N,\left( S_{i}\right) _{i\in

562: N},\left( u_{i}\right) _{i\in N}\right) $, it is possible to define

563: a MCMC algorithm such that its transition probabilities will

564: converge to a perfect equilibrium as long as the following

565: conditions hold:

566:

567: \begin{enumerate}

\item  if $u_{i}\left( s_{ij},p_{-i}^{k}\right) -u_{i}\left(
s_{il},p_{-i}^{k}\right) \geq 0$ then accept, where $p_{-i}^{k}$ is
the tuple of mixed strategies selected on the $k$th iteration;

\item  otherwise, accept if $\exp \left( \frac{u_{i}\left(
s_{ij},p_{-i}^{k}\right) -u_{i}\left( s_{il},p_{-i}^{k}\right)
}{T}\right)
>\varepsilon ,$ where $\varepsilon \sim U\left[ 0,1\right] ;$ and

576:

577: \item  in addition it can be seen that for all $s_{ij}$ and $s_{il}\in S_{i}$

578: such that $u_{i}\left( s_{ij},p_{-i}^{k}\right) <u_{i}\left(

579: s_{il},p_{-i}^{k}\right) $, $\alpha _{jl}^{i}\left( T\right)

580: \rightarrow 0$ as $T\rightarrow \infty $.

581: \end{enumerate}

582: \end{thm}

583:

584: \noindent

585: %TCIMACRO{\TeXButton{Proof}{\proof}}

586: %BeginExpansion

587: \proof%

588: %EndExpansion

For each player $i\in N$, there will be a collection of subsets

590: \begin{equation}

591: N_{ij}=\left\{ s_{il}\in S_{i};u_{i}\left( s_{ij},p_{-i}\right)

592: <u_{i}\left( s_{il},p_{-i}\right) \text{ and }p\in \times _{i\in N}\Delta _{i}^{*}\right\}

593: \end{equation}

of $i$'s pure strategy space $S_{i}$. The collection of these sets
will be referred to as player $i$'s local neighborhood structure. What
we would like to do is, for any two pure strategies
$s_{ij}$,$\,s_{il}\in S_{i}$, define a path from $s_{ij}$ to
$s_{il}$ such that

599: \begin{equation}

600: s_{ij_{1}}\in N_{ij},s_{ij_{2}}\in N_{ij_{1}},...,s_{il}\in

601: N_{ij_{m}}.

602: \end{equation}

603:

604: In order to do this, we observe that the point-set mapping defined

605: by the set

606: \begin{equation}

607: F_{i}\left( p_{1},...,p_{n}\right) =\left\{ p_{i}^{*}\in \Delta

608: _{i}^{*};u_{i}\left( s_{ij},p_{-i}\right) <u_{i}\left(

609: s_{il},p_{-i}\right)\text{ then }p_{ij}\leq \varepsilon ,\text{\quad }\forall \,s_{ij}\text{%

610: ,\thinspace }s_{il}\in S_{i}\right\}

611: \end{equation}

is a collection of homogeneous transition probabilities on $S_{i}$

613: \begin{equation}

614: p_{jl}^{i}\left( k\right) =\Pr \left\{ s_{i}\left( k\right)

615: =s_{il}|s_{i}\left( k-1\right) =s_{ij}\right\} =\Pr \left\{

616: s_{il}|s_{ij}\right\} .

617: \end{equation}

Furthermore, we can see that these transition probabilities have
the Markov property, i.e. given the path from $s_{ij}$ to $s_{il}$
such that
\begin{equation}
s_{ij_{1}}\in N_{ij},s_{ij_{2}}\in N_{ij_{1}},...,s_{il}\in
N_{ij_{m}},
\end{equation}
the conditional probability

626: \begin{equation}

627: \begin{split}

&\Pr \left\{s_{il}|s_{ij_{1}},s_{ij_{2}},...,s_{ij_{m}},s_{ij}\right\} \\
&=\Pr
\left\{ s_{il}|s_{ij_{m}}\right\} \Pr \left\{
s_{ij_{m}}|s_{ij_{m-1}}\right\} \cdots \Pr \left\{
s_{ij_{2}}|s_{ij_{1}}\right\}

633: \end{split}

634: \end{equation}

635:

636: We define the following generating probability for the Markov chain

637: for each

638: player $i\in N$%

639: \begin{equation}

640: g_{jl}^{i}=\left\{

641: \begin{array}{l}

642: \frac{1}{\rho \left( s_{ij}\right) }\text{,\quad if }s_{il}\in

643: N_{ij} \\ 0,\quad \quad \;\;\text{otherwise},

644: \end{array}

645: \right.

646: \end{equation}

647: where

648: \begin{equation}

649: \rho \left( s_{ij}\right) =\left| \left\{ s_{il}\in

650: S_{i};u_{i}\left( s_{ij},p_{-i}\right) <u_{i}\left(

651: s_{il},p_{-i}\right)

652: \text{ and }p\in \times _{i\in N}\Delta

653: _{i}^{*}\right\} \right| .

654: \end{equation}
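The generating probability can be made concrete with a small sketch. It assumes a hypothetical vector of payoffs $u_{i}\left( s_{ij},p_{-i}\right)$ for a fixed $p_{-i}$; the function names are illustrative and not part of the original construction. Since $\rho \left( s_{ij}\right) =0$ at a best reply, the sketch further assumes that no move is proposed from such a strategy.

```python
from fractions import Fraction

def neighbourhood(payoffs, j):
    """Indices l whose payoff strictly exceeds that of strategy j (the set N_ij)."""
    return [l for l, u in enumerate(payoffs) if u > payoffs[j]]

def generating_probabilities(payoffs, j):
    """Row j of the generating matrix: 1/rho(s_ij) on N_ij, zero elsewhere."""
    better = neighbourhood(payoffs, j)
    row = [Fraction(0)] * len(payoffs)
    if not better:
        # Assumption: rho(s_ij) = 0 at a best reply, so nothing is proposed.
        return row
    for l in better:
        row[l] = Fraction(1, len(better))
    return row

# Hypothetical payoffs u_i(s_i1, p_-i), u_i(s_i2, p_-i), u_i(s_i3, p_-i).
payoffs = [1.0, 3.0, 2.0]
row_from_first = generating_probabilities(payoffs, 0)  # two better strategies
row_from_best = generating_probabilities(payoffs, 1)   # already a best reply
```

From the first strategy both alternatives are proposed with probability $1/2$, while the row for the best reply is identically zero, which is why some convention for $\rho \left( s_{ij}\right) =0$ is needed in any implementation.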

655: We now introduce the following acceptance probability

656: \begin{equation}

657: \begin{split}

\alpha _{jl}^{i}\left( T\right) &=\min \left\{ 1,\exp \left(
\frac{u_{i}\left(
s_{ij},p_{-i}^{k-1}\right) -u_{i}\left( s_{il},p_{-i}^{k-1}\right) }{T}%
\right) \right\} ,\\
&T>0

663: \end{split}

664: \end{equation}

665: where $T$ is a control parameter. This last condition implies that

666:

667: \begin{enumerate}

\item  if $u_{i}\left( s_{ij},p_{-i}^{k}\right) -u_{i}\left(
s_{il},p_{-i}^{k}\right) \geq 0$ then accept, where $p_{-i}^{k}$ is
the tuple of mixed strategies selected on the $k$th iteration;

\item  otherwise, accept if $\exp \left( \frac{u_{i}\left(
s_{ij},p_{-i}^{k}\right) -u_{i}\left( s_{il},p_{-i}^{k}\right)
}{T}\right)
>\varepsilon ,$ where $\varepsilon \sim U\left[ 0,1\right] ;$ and

\item  in addition, for all $s_{ij}$ and $s_{il}\in S_{i}$
such that $u_{i}\left( s_{ij},p_{-i}^{k}\right) <u_{i}\left(
s_{il},p_{-i}^{k}\right) $, $\alpha _{jl}^{i}\left( T\right)
\rightarrow 0$ as $T\rightarrow 0$.

681: \end{enumerate}
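
\noindent As a numerical illustration (the payoff values are hypothetical and
chosen only for arithmetic convenience), suppose that
$u_{i}\left( s_{ij},p_{-i}^{k}\right) =1$ and
$u_{i}\left( s_{il},p_{-i}^{k}\right) =2$. Taking the acceptance probability
to be the minimum of $1$ and the exponential term, the payoff difference is
$-1$ and
\begin{equation*}
\alpha _{jl}^{i}\left( T\right) =\exp \left( -\frac{1}{T}\right) ,\qquad
\alpha _{jl}^{i}\left( 10\right) \approx 0.905,\quad
\alpha _{jl}^{i}\left( 1\right) \approx 0.368,\quad
\alpha _{jl}^{i}\left( 0.1\right) \approx 4.5\times 10^{-5}.
\end{equation*}
The move is therefore accepted with high probability while the control
parameter is large, and only rarely once it is small.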

682:

Given these three conditions, we can now see that the following
will hold:

685:

686: \begin{itemize}

\item  We know that under this acceptance criterion
the transition probability matrix $p_{i}^{k}$ of the homogeneous
Markov chain generated by the game $G$ will converge on a
stationary distribution $\pi \left( T\right) $ as $k\rightarrow
\infty $:

692: \begin{equation}

693: p_{i}^{k}\rightarrow \pi _{i}\left( T\right) =\frac{e^{-C\left( i\right) /T}%

694: }{\sum_{k\in E}e^{-C\left( k\right) /T}}

695: \end{equation}

and as $T\rightarrow 0$

697: \begin{equation}

698: \pi _{i}\left( T\right) =\left\{

699: \begin{array}{l}

700: \frac{1}{\left| N_{i}\right| }\quad \text{ if }i\in H \\ 0\quad

701: \quad \;\text{otherwise}

702: \end{array}

703: \right.

704: \end{equation}

705: where

706: \begin{equation}

707: N_{i}=\left\{ s_{il}\in S_{i};u_{i}\left( s_{ij},p_{-i}\right)

708: <u_{i}\left( s_{il},p_{-i}\right) ,p_{i}=0\right\} .

709: \end{equation}

710: (See van Laarhoven and Aarts \cite[p.22--25]{LA} for the proof of

711: this last statement.)

712:

\item  The transition probability matrix $p_{i}^{k}$ satisfies Myerson's
definition of an $\varepsilon $-perfect equilibrium and, as Myerson
has shown, the fixed point that this sequence converges on is also
a perfect
equilibrium.%

718: %TCIMACRO{\TeXButton{End Proof}{\endproof}}

719: %BeginExpansion

720: \endproof%

721: %EndExpansion

722: \end{itemize}
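
The concentration of the stationary distribution used in the proof can be checked numerically. The sketch below assumes a hypothetical cost vector $C\left( i\right)$ over four states; the function name and the numbers are illustrative only. It evaluates $\pi _{i}\left( T\right) \propto e^{-C\left( i\right) /T}$ at a high and at a low value of the control parameter.

```python
import math

def boltzmann(costs, temperature):
    """Stationary distribution pi_i(T) proportional to exp(-C(i)/T)."""
    low = min(costs)  # subtracting the minimum leaves the ratios unchanged
    weights = [math.exp(-(c - low) / temperature) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical costs C(i); the first two states are the optimal ones.
costs = [0.0, 0.0, 1.0, 2.0]
warm = boltzmann(costs, 10.0)   # close to uniform over all four states
cold = boltzmann(costs, 0.01)   # close to uniform over the two optimal states
```

At the high value the four states receive comparable mass, whereas at the low value the mass is split almost equally between the two optimal states and the remaining states receive almost none.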

723:

724:

725: \section{An Application to Extensive Form Games}

726:

727:

There are problems with viewing the existence of Nash equilibria as an end
in itself. The most immediate problem is the possibly large
number of Nash equilibria that can be found for any game, together with the
likelihood that not all of these Nash equilibria will be reasonable in some
sense. One way around this is to view the decision process of each agent
participating in the game from a decision theoretic perspective. From this
viewpoint, only those equilibria that can be found by backwards induction
will be self-enforcing. This leads to a technique for strategy space
reduction by iteratively removing strategies that are
\emph{strongly dominated}. As shown by Kuhn \cite[Corollary 1]{Kuhn53},
under the assumption of perfect information, this leads to a recursion that
is equivalent to the Bellman equation of dynamic programming.

740:

An alternative to this is to construct a recursion that iteratively
eliminates \emph{weakly dominated strategies}. However, the removal of
weakly dominated strategies can lead to the elimination of strategy profiles
that would otherwise provide suitable outcomes if only strongly dominated
strategies had been removed. From the viewpoint of this paper, these
recursive strategy space reduction techniques can be considered to be an
algorithm that reduces the size of a game, making equilibrium selection
easier. However, these iterative reduction techniques become unwieldy once
the assumption of perfect information is relaxed and information sets
contain more than one node of the game tree.

751:

This has led to a number of refinements to the definition of Nash
equilibrium. Among the first of these was the notion of \emph{subgame
perfection} \cite{Selt75}, which removes strategies that are not optimal for
every subgame of an extensive game's game tree. However, Selten \cite{Selt75}
has shown that subgame perfection can also prescribe non-optimizing
behaviour at information sets that are not reached when the equilibrium is
played. This is because the expected payoff for the player whose information
set is not reached will not depend on their own strategy. As a result every
strategy will maximize their payoff. As van Damme \cite[p. 8--9]{vDam91}
states, this can be removed if the equilibrium prescribes a choice, at
each information set that is a singleton, that maximizes the expected payoff
after the information set. The problem is that not all subgame perfect
equilibria satisfying this criterion are sensible.

765:

766: %\textbf{

767: %Nash refinement has also extended for example to the behaviour of animals in the

768: %wild. The development of the \emph{Evolutionary Stable Strategy} (ESS) by Maynard

769: %Smith and Price \cite{maysmith} introduced the reduction of the stratgey space

770: %into a new form. The idea of a ESS is to model situations in which an agents

771: %actions can be determined by the forces of evolution. However the main

772: %restriction on the ESS is that a strategy is stable if a whole population using

773: %this strategy can not be subject to invasion by a small group with a mutant

774: %genotype \cite{gin}. This condition in a nutshell means that the equilibrium is

775: %of the basic Nash form with a specific stability condition attached.

776: %}

777: %\\\\

778: %\textbf{

779: %Having such a condition tacted onto the end of the Nash equilibrium, means

780: %that the ESS is essentially static. The ESS like all equilibrium has its

781: %own limitations for both discrete and continuous systems

782: %\cite{gin}. If we were to apply a genetic algorithm approach to

783: %finding the ESS we could encounter a whole group of equilibrium which are

784: %inappropriate. Furthermore these equilibrium may not eliminate weakly dominated

785: %strategies and present a dis-equlibrium \cite{osb}.

786: %}

787:

Another approach, suggested by Selten \cite{Selt75}, was to eliminate
``unreasonable'' subgame perfect equilibria by allowing the possibility of
``mistakes'' or ``trembles'' on the part of decision makers. In this way,
isolated information sets are removed, as every information set can now be
reached with positive probability. The other advantage of trembling hand
perfection is that, unlike subgame perfection, it can be applied directly to
the normal form of any game.

802:

As was shown by Selten \cite{Selt75}, the perfect equilibria of a game's
strategic and extensive forms need not coincide. However, he showed that an
equivalence relationship holds between the perfect equilibria of any
extensive game and those of its associated \emph{agent normal form}
\cite{Selt75}. This is because the agent normal form treats each
information set of the extensive form of the game as a separate player, or
agent. As a consequence each agent represents an information set held by one
of the original players and has the same pay-off function as that player.

811:

We let $\Gamma ^{e}$ denote an extensive game consisting of a set of $n$
players and a game tree $K=\left( T,R\right) $, where $T$ is a set of nodes
and $R$ is a binary relation which is a partial ordering on the set of nodes.
The nodes of the game tree are classified as either non-terminal or terminal
according to whether or not there are succeeding nodes in the game tree. The
partial ordering is used to define a path of successive nodes. The
non-terminal nodes of the game tree are partitioned into the sets $%
P_{0},P_{1},...,P_{n}$ that specify the moves associated with each player,
with $P_{0}$ being the set of random moves that are not
associated with any player. The players' non-terminal nodes are further
divided by the information
partition $U=$ $\left( U_{1},...,U_{n}\right) $, where each set $U_{i}$ is
a partition of $P_{i}$ into information sets, such that all nodes within an
information set $u\in U_{i}$ have the same number of immediate successors
and every path intersects an information set at most once. Under the
assumption of
perfect information each information set $u\in U_{i}$ will be a singleton.
This paper will assume \emph{imperfect information}: if
the information set $u\in U_{i}$ contains a node $x\in P_{i}$, player $i$
will not be able to distinguish $x$ from the other nodes contained in this
information set based on the information possessed when moving at $x$.
Throughout this paper
it will also be assumed that \emph{complete information} is present and that
each player has \emph{perfect recall}, i.e. remembers everything from
earlier in the game, including their own moves.

834:

Associated with each random move is a probability distribution $p$. The
payoffs associated with the set of terminal points $Z$ of the game tree are
denoted by the $n$-tuple $r=\left( r_{1},...,r_{n}\right) $, where each
player's payoff is a function of the terminal points $r_{i}\left( z\right) $%
, $z\in Z$. With the information partition $U$ a choice set $C=\left\{
C_{u}:u\in \cup _{i=1}^{n}U_{i}\right\} $ can be defined, where each $C_{u}$
is a partition of the union $\cup _{x\in u}S\left( x\right) $ of the sets of
successors $S\left( x\right) =\left\{
y;x\in P\left( y\right) \right\} $ of the nodes $x\in u$. The interpretation
is that if player $i$ takes the choice $c\in
C_{u}$ at information set $u\in U_{i}$, then if $i$ is at $x\in u$, the
next node reached is the element of $S\left( x\right) $ contained in $c$.
Under the assumption of imperfect information and perfect recall, a
probability distribution $b_{i}$ is assigned on $C_{u}$ to each information
set $u\in U_{i}.$ This distribution $b_{i}$ is a behavioural strategy, with
the set of all these strategies for player $i$ denoted by $B_{i}$. The
profile of all players' behavioural strategies is denoted by $b\in B:=\times
_{i=1}^{n}B_{i}$, where $B$ is the set of all behavioural strategy
combinations. The probability that the game $%
\Gamma ^{e}$ ends at the terminal point $z\in Z$ when $b$ is played is
denoted by $\mathbb{P}_{b}\left( z\right) $.

854:

The definition of perfect equilibrium we will use is based on Selten \cite
{Selt75} and Friedman \cite{Fried91}. Kuhn \cite{Kuhn53} has shown that
behavioural and mixed strategies are realization equivalent.
Therefore, for an extensive form game $\Gamma ^{e}$ we let $\Gamma =\left(
S,R\right) $ define its strategic form representation, with $S$ denoting the
set of all mixed strategy profiles. The payoff profile $R$ is an $n$-tuple,
where the $i$th element is defined as
\begin{equation*}
R_{i}=\sum_{z\in Z}\mathbb{P}_{b}\left( z\right) r_{i}\left( z\right) .
\end{equation*}
A perturbed game of $\Gamma $ is defined by $\left( \Gamma ,\eta \right) $,
where $\eta $ is a mapping that assigns to every choice in $\Gamma $ a
positive number $\eta _{c}$ such that
\begin{equation*}
\sum_{c\in C_{u}}\eta _{c}<1
\end{equation*}
for every information set $u$. An equilibrium point $b$ of the strategic
game $\Gamma $ is a perfect equilibrium if $b$ is a limit point of a
sequence $\left\{ b\left( \eta \right) \right\} $ as $\eta \rightarrow 0$,
where each $b\left( \eta \right) $ is an equilibrium point of the
associated perturbed game $\left( \Gamma ,\eta \right) $.
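
\noindent To illustrate the perturbation (the numbers are hypothetical),
consider an information set $u$ with two choices, $C_{u}=\left\{
c_{1},c_{2}\right\} $, and $\eta _{c_{1}}=\eta _{c_{2}}=0.05$. Then
\begin{equation*}
\sum_{c\in C_{u}}\eta _{c}=0.1<1,
\end{equation*}
so the perturbation is admissible. Assuming, as in Selten \cite{Selt75},
that $\eta _{c}$ is the minimum probability with which the choice $c$ must be
taken in the perturbed game, the behavioural strategies available at $u$
satisfy
\begin{equation*}
0.05\leq b\left( c_{1}\right) \leq 0.95,\qquad b\left( c_{2}\right)
=1-b\left( c_{1}\right) ,
\end{equation*}
and this interval expands to $\left[ 0,1\right] $ as $\eta \rightarrow 0$.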

876:

877: The algorithm is constructed using a simulated annealing algorithm found in

878: van Laarhoven and Aarts \cite[p. 10]{LA}. The pseudo-code for this algorithm

879: is given below:

880:

881: \begin{itemize}

882: \item[ ]  begin

883:

\item[ ]  \textbf{Initialize};

885:

886: \item[ ]  $M:=0$;

887:

888: \item[ ]  repeat

889:

890: \begin{itemize}

891: \item[ ]  repeat

892:

893: \begin{itemize}

\item[ ]  \textbf{Perturb}(config. $i\rightarrow j$, $\Delta R_{ij}()$) for
player 1;

\item[ ]  if $\left( \Delta R_{ij}\geq 0\right) $ then accept

\begin{itemize}
\item[ ]  elseif $\left( \exp \left( \frac{\Delta R_{ij}}{c}\right)
>rand\left[ 0,1\right) \right) $ then accept;
\end{itemize}

\item[ ]  if accept then \textbf{Update}(config. $j$);

\item[ ]  $\vdots$

\item[ ]  \textbf{Perturb}(config. $i\rightarrow j$, $\Delta R_{ij}()$) for
player $n$;

\item[ ]  if $\left( \Delta R_{ij}\geq 0\right) $ then accept

\begin{itemize}
\item[ ]  elseif $\left( \exp \left( \frac{\Delta R_{ij}}{c}\right)
>rand\left[ 0,1\right) \right) $ then accept;
\end{itemize}

\item[ ]  if accept then \textbf{Update}(config. $j$);

917: \end{itemize}

918:

919: \item[ ]  until \textbf{equilibrium is approached sufficiently closely};

920:

921: \item[ ]  $c_{M+1}:=f\left( c_{M}\right) $;

922:

923: \item[ ]  $M:=M+1;$

924: \end{itemize}

925:

926: \item[ ]  until \textbf{stop criterion = true;}

927:

928: \item[ ]  end

929: \end{itemize}

930:

931: \noindent The energy function differential for this algorithm is defined as

932: follows:

933: \begin{equation*}

\Delta R_{ij}=R_{j}-R_{i},\quad i<j

935: \end{equation*}

936: where the $R_{i}$ are the expected pay-off functions for each player

937: participating in the perturbed game. The temperature function $c$ controls

938: the trembles and is updated by the decrement rule

939: \begin{equation*}

940: c_{M+1}=\alpha \cdot c_{M},\quad 0<\alpha <1,\,M=1,2,...\,\text{.}

941: \end{equation*}
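
The loop structure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the simulation reported in this paper: the $2\times 2$ payoff table \texttt{PAYOFF}, the function name \texttt{anneal}, the fixed number of sweeps standing in for the ``equilibrium is approached sufficiently closely'' test, and the fixed number of temperature stages standing in for the stop criterion are all hypothetical. Payoff-reducing moves are accepted when $\exp \left( \Delta R_{ij}/c\right)$ exceeds a uniform draw, the Metropolis form appropriate when pay-offs are being maximized.

```python
import math
import random

# Hypothetical 2x2 game: PAYOFF[i][s0][s1] is player i's payoff.
# Strategy 1 is strictly dominant for both players.
PAYOFF = [
    [[1.0, 0.0], [3.0, 2.0]],   # player 0
    [[1.0, 3.0], [0.0, 2.0]],   # player 1
]

def anneal(c0=1.0, alpha=0.5, stages=30, sweeps=20, seed=0):
    """Annealing over pure-strategy configurations with geometric cooling."""
    rng = random.Random(seed)
    config = [0, 0]                        # Initialize
    c = c0
    for _ in range(stages):                # outer loop over M
        for _ in range(sweeps):            # inner loop at fixed c
            for player in range(2):
                trial = list(config)       # Perturb: switch this player's strategy
                trial[player] = 1 - config[player]
                delta = (PAYOFF[player][trial[0]][trial[1]]
                         - PAYOFF[player][config[0]][config[1]])
                # Improvements are always accepted; worsening moves are
                # accepted with probability exp(delta / c) < 1.
                if delta >= 0 or math.exp(delta / c) > rng.random():
                    config = trial         # Update
        c = alpha * c                      # c_{M+1} = alpha * c_M
    return tuple(config), c

final_config, final_c = anneal()
```

With these settings the control parameter falls below $10^{-8}$, at which point payoff-reducing switches are effectively never accepted and the configuration settles on the dominant strategy pair.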

942:

We apply the algorithm to the following example taken from Friedman \cite[p. 51]
{Fried91}. This example is based on the three player extensive form game
used by Selten \cite{Selt75} to illustrate the existence of perfect
equilibrium. The game tree is shown in Figure \ref{tree}
\cite[p. 50]{Fried91}.

948:

949: \begin{figure}[!ht]

\begin{center}
\scalebox{0.5}{\epsfig{file=tree1.eps
,angle=0,width=\linewidth}}
\caption{Selten's Horse Game Tree\label{tree}}

954: \end{center}

955: \end{figure}

956:

957:

This game possesses both a perfect equilibrium and ``nonsensical''
subgame perfect equilibria. The perfect equilibrium for this extensive form
game is defined via the perturbed pay-off functions:

\begin{eqnarray*}
R_{1} &=&\alpha _{1}(1-\varepsilon _{2}-3\varepsilon _{3}+4\varepsilon
_{2}\varepsilon _{3})+3\varepsilon _{3} \\
R_{2} &=&2\varepsilon _{3}(2-\varepsilon _{1})+\alpha _{2}(1-\varepsilon
_{1}-4\varepsilon _{3}+4\varepsilon _{1}\varepsilon _{3}) \\
R_{3} &=&1-\varepsilon _{1}+\alpha _{3}(2\varepsilon _{1}-\varepsilon
_{2}+\varepsilon _{1}\varepsilon _{2}),
\end{eqnarray*}
where the $\alpha _{i}$ are the mixed strategies and the $\varepsilon _{i}$ are
the errors defined for $i=1,2,3$. Letting the errors approach zero, it can be
seen that the perfect equilibrium is defined by $\left( 1,1,0\right) $.
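
\noindent The limit can be checked term by term from the three pay-off
functions. Setting $\varepsilon _{1}=\varepsilon _{2}=\varepsilon _{3}=0$
gives
\begin{equation*}
R_{1}=\alpha _{1},\qquad R_{2}=\alpha _{2},\qquad R_{3}=1,
\end{equation*}
so $R_{1}$ and $R_{2}$ are maximized at $\alpha _{1}=\alpha _{2}=1$. For
player 3 the coefficient on $\alpha _{3}$ vanishes in the limit, and for
positive errors its sign is given by
\begin{equation*}
2\varepsilon _{1}-\varepsilon _{2}+\varepsilon _{1}\varepsilon
_{2}<0\quad \Longleftrightarrow \quad \varepsilon _{2}>\frac{2\varepsilon
_{1}}{1-\varepsilon _{1}},
\end{equation*}
in which case $\alpha _{3}=0$ is the best reply. Which error sequences are
admissible is not specified here, so this last condition should be read as
an illustration of how $\alpha _{3}=0$ can arise rather than as part of the
original argument.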

973:

974: The results of the simulation are shown below in Figure \ref{payoffs}

975: and indicate convergence to the trembling hand perfect

976: equilibrium.

977: \begin{figure}[!ht]

978: \begin{center}

979: \scalebox{0.5}{\epsfig{file=payoff1.eps

980: ,angle=0,width=\linewidth}}

981: \caption{Three-person game with imperfect competition and payoff solutions\label{payoffs}}

982: \end{center}

983: \end{figure}

984:

985: \section{Conclusion}

986:

This paper has concentrated on some of the underlying theoretical mechanics
of simulated annealing and how they relate to the trembling hand perfect
refinement of Nash equilibrium. It has been argued that the trembles that
underlie global optimization by simulated annealing are analogous to the
``mistakes'' of trembling hand perfection, in that they present a means of
moving away from local equilibria. The main contribution of this paper has been
to apply simulated annealing to solve a game that is known to possess both a
perfect equilibrium and ``nonsensical'' subgame perfect equilibria.
Preliminary results indicate convergence to the perfect equilibrium, with
mixing occurring for two of the three players.

997:

998:

999:

1000:

1001:

1002:

1003: \begin{thebibliography}{199}

1004: \bibitem{Besag74}  Besag, J. (1974) Spatial interaction and the statistical

1005: analysis of lattice systems (with discussion). \emph{Journal of the Royal

1006: Statistical Society Series B} 36, 192--236.

1007:

1008: %\bibitem{Chan89}  Chan, K.S. (1989) A note on the geometric ergodicity of a

1009: %Markov chain. \emph{Advances in Applied Probability} 21, 702--704.

1010:

1011: %\bibitem{Chan93}  Chan, K.S. (1993) Asymptotic behaviour of the Gibbs

1012: %sampler. \emph{Journal of the American Statistical Association} 88, 320--326.

1013:

1014: \bibitem{Fried91}  Friedman, J.W. (1991) \emph{Game Theory with Applications

1015: to Economics}. Oxford University Press, Oxford.

1016:

1017: %\bibitem{Fro}  Fr\"{o}berg, C. (1979) \emph{Introduction to Numerical

1018: %Analysis} (2nd ed.) Addison-Wesley, Reading, Ma.

1019:

1020: %\bibitem{G+G84}  Geman, S. and Geman, D. (1984) Stochastic relaxation, Gibbs

1021: %distributions and Bayesian restoration of images. \emph{IEEE Transactions on

1022: %Pattern Recognition and Machine Intelligence} PAMI 6(6), 721--741.

1023:

1024: %\bibitem{Gel+S90}  Gelfand and Smith (1990), Sampling based approaches to

1025: %calculating marginal densities. \emph{Journal of the American Statistical

1026: %Society} 85, 398--409.

1027:

\bibitem{GT80}  Georgobiani, D.A. and Torondzadze, A.F. (1980) Solution of
rectangular games by the Monte Carlo method. \emph{Trudy Vychisl. Tsentra Akad. Nauk Gruzin. SSR}
20(2), 5--10.

1031:

\bibitem{GRS96}  Gilks, W.R., Richardson, S. and Spiegelhalter, D.J. (1996)
Introducing Markov Chain Monte Carlo. In Gilks, W.R., Richardson, S. and
Spiegelhalter, D.J. (Eds.) \emph{Markov Chain Monte Carlo in Practice},
1--19. Chapman and Hall, London.

\bibitem{Gilks96}  Gilks, W.R. (1996) Full conditional distributions. In
Gilks, W.R., Richardson, S. and Spiegelhalter, D.J. (Eds.) \emph{Markov Chain
Monte Carlo in Practice}, 75--88. Chapman and Hall, London.

1040:

1041:

1042: %\bibitem{gin} Gintis, H., \textit{Game Theory Evolving} Princeton University

1043: %Press 2000

1044:

1045: %\bibitem{G+S92}  Grimmet, G.R. and Stirzaker, D.R. (1992) \emph{Probability

1046: %and Random Processes}. Oxford University Press, Oxford.

1047:

1048: \bibitem{harsanyi} Harsanyi, J.C., (1975) The tracing procedure: a

1049: Bayesian approach to defining a solution for $n$-person non-cooperative games.

1050: \emph{International Journal of Game Theory} 4, 1-22.

1051:

1052: \bibitem{H+S}  Harsanyi, J.C. and Selten, R. (1988) \emph{A General Theory

1053: of Equilibrium Selection in Games}. MIT Press, Cambridge, MA.

1054:

\bibitem{Hast70}  Hastings, W.K. (1970) Monte Carlo sampling methods using
Markov chains and their applications. \emph{Biometrika} 57, 97--109.

1057:

1058: %\bibitem{Ko+M86}  Kohlberg, E. and Mertons, J-F. (1986) On the strategic

1059: %stability of equilibria. \emph{Econometrica} 54, 1003--1039.

1060:

1061: %\bibitem{K+W82}  Kreps, D.M. and Wilson, R. (1982) Sequential equilibrium.

1062: %\emph{Econometrica} 50, 863--894.

1063:

1064: \bibitem{Kuhn53}  Kuhn, H.W. (1953) Extensive games and the problem of

1065: information. In Kuhn, H.W. and Tucker, A.W. \emph{Contributions to the

1066: Theory of Games Vol I}, 193--216. Princeton University Press, Princeton N.J.

1067:

\bibitem{L+H}  Lemke, C.E. and Howson, J.T. (1964) Equilibrium points of
bimatrix games. \emph{SIAM Journal on Applied Mathematics} 12, 413--423.

1070:

1071: %\bibitem{maysmith} Maynard Smith, J. and Price G.,R., (1973) \textit{The

1072: %Logic of Animal Conflict}, Nature, 246, 15-18

1073:

1074: %\bibitem{McKMcL96}  McKelvey, R.D. and McLennan A. (1996) Computation of

1075: %Equilibria in Finite Games. \emph{Handbook of Computational Economics Vol. 1}

1076: %. Elsevier Science B.V., Amersterdam.

1077:

1078: %\bibitem{McKP95}  McKelvey, R.D. and Palfrey, T.R. (1995) Quantal Response

1079: %Equilibria for Normal Form Games. \emph{Games and Economic Behavior} 10,

1080: %6-38.

1081:

1082: %\bibitem{McKP98}  McKelvey, R.D. and Palfrey, T.R. (1998) Quantal Response

1083: %Equilibria for Extensive Form Games. \emph{Experimental Economics} 1, 9-41.

1084:

\bibitem{MRRTT53}  Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N.,
Teller, A.H. and Teller, E. (1953) Equation of state calculations by fast
computing machines. \emph{Journal of Chemical Physics} 21, 1087--1091.

1088:

1089: \bibitem{Myers78}  Myerson, R.B. (1978) Refinements of the concept of Nash

1090: equilibrium. \emph{International Journal of Game Theory} 7, 73--80.

1091:

1092: \bibitem{Myers91}  Myerson, R.B. (1991) \emph{Game Theory: Analysis of

1093: Conflict.} Harvard University Press, Cambridge, MA.

1094:

1095: %\bibitem{Okada81}  Okada, A. (1981) On stability of perfect equilibrium

1096: %points. \emph{International Journal of Game Theory} 10, 67-73.

1097:

1098:

1099: %\bibitem{osb} Osbourne, M.J., and Rubinstein, A., (1994)

1100: %\emph{A Course in Game Theory} MIT Press

1101:

1102: %\bibitem{Rob94}  Robert, C.P. (1994) Discussion. In Tierney, L. Markov

1103: %chains for exploring posterior distributions. \emph{Annals of Statistics}

1104: %22(4), 1742--1747.

1105:

\bibitem{Rob96}  Roberts, G.O. (1996) Markov chain concepts related to
sampling algorithms. In Gilks, W.R., Richardson, S. and Spiegelhalter, D.J.
(Eds.) \emph{Markov Chain Monte Carlo in Practice}, 45--57. Chapman and
Hall, London.

1110:

1111: \bibitem{Scar73}  Scarf, H.E. (1973) \emph{Computation of Economic

1112: Equilibria.} Yale University Press, New Haven, Conn.

1113:

\bibitem{Selt75}  Selten, R. (1975) Reexamination of the Perfectness Concept
for Equilibrium Points in Extensive Games. \emph{International
Journal of Game Theory} 4, 25--55.

1117:

1118: \bibitem{Selt78}  Selten, R. (1978) The Chain Store Paradox. \emph{Theory

1119: and Decision} 9, 127--159.

1120:

1121: %\bibitem{SBGI96}  Spiegelhalter, D.J., Best, N.G., Gilks, W.R. and Inskip,

1122: %H. (1996) Hepatitis B: a case study in MCMC methods. In Gilks, W.R.,

1123: %Richardson, S. Spiegelhalter, D.J. (Eds.) \emph{Markov Chain Monte Carlo in

1124: %Practice}, 20--43. Chapman and Hall, London.

1125:

1126: %\bibitem{S+C91}  Schwervish, M.J. and Carlin, B.P. (1992) On the convergence

1127: %of successive substitution sampling. \emph{Journal of Computational and

1128: %Graphical Statistic}s 1 111--127.

1129:

1130: %\bibitem{Tiern94}  Tierney, L. (1994) Markov chains for exploring posterior

1131: %distributions (with discussion). \emph{Annals of Statistics} 22(4),

1132: %1701--1762

1133:

1134: %\bibitem{Tiern96}  Tierney, L. (1996) Introduction to general state-space

1135: %Markov chain theory. In Gilks, W.R., Richardson, S. Spiegelhalter, D.J.

1136: %(Eds..) \emph{Markov Chain Monte Carlo in Practice}, 59--74. Chapman and

1137: %Hall, London.

1138:

1139: \bibitem{Ulam50}  Ulam, S. (1954) Applications of Monte Carlo methods to

1140: tactical games. In Meyer, H.A. (Ed.)\emph{\ Symposium on Monte Carlo

1141: Methods, University of Florida 1954}, p. 63. John Wiley and Sons, New York.

1142:

1143: \bibitem{vDam91}  van Damme, E. (1991) \emph{Stability and Perfection of

1144: Nash Equilibria (2nd ed. rev. enl.)}. Springer-Verlag, Berlin.

1145:

1146: \bibitem{LA}  van Laarhoven, P.J.M. and Aarts, E.H.L. (1987) \emph{Simulated

1147: Annealing: Theory and Applications}. D. Reidel Publishing, Dordrecht,

1148: Holland.

1149:

1150: \bibitem{Wils71}  Wilson, R. (1971) Computing Equilibria of $N$-Person

1151: Games. \emph{SIAM Journal on Applied Mathematics} 21, 80--87.

1152: \end{thebibliography}

1153:

1154: \end{document}

1155: