0306:cs0306106/lex.tex

1: %joe5: corrections: restated theorem

2: %joe4: discuss representation of beliefs in conclusion a la

3: %Asheim/Slovik; reference Stal98.

4: %joe2: check what Popper called Popper algebras: Q175.P83 (Uris

5: %and Carpenter)

6: %joe2: What's new?  More motivation in intro, Proposition 4.3, major

7: %rewriting in Section~\ref{indep}:

8: %joe9: corrections in response to GEB reviews

9:

10: %\documentstyle[chicagob,times,uai97]{article}

11: %\documentstyle[chicagob,times,11pt]{article}

12: \documentstyle[chicagob,12pt]{article}

13: \input{defn}

14: \input{spage}

15: \input{bghkmac}

16: \newcommand{\vecmu}{\vec{\mu}}

17: \newcommand{\Supp}{\mbox{\it Supp}}

18: \newcommand{\FCP}{F_{S \rightarrow P}}

19: \newcommand{\FOP}{F_{O \rightarrow C}}

20: \newcommand{\FDC}{F_{D \rightarrow C}}

21: \newcommand{\FLN}{F_{L \rightarrow N}}

22: \newcommand{\FSN}{F_{S \rightarrow N}}

23: \newcommand{\FNP}{F_{N \rightarrow P}}

24: \newcommand{\LPS}{\mbox{\em LPS\/}}

25: \newcommand{\SLPS}{\mbox{\em SLPS\/}}

26: \newcommand{\NPS}{\mbox{\em NPS\/}}

27: \newcommand{\Bas}{{\it Basic}}

28: \newcommand{\Popper}{\mbox{\em Pop\/}}

29: \renewcommand{\T}{T}

30: \newcommand{\lab}{\mbox{\em label\/}}

31: \renewcommand{\aeq}{\approx}

32: \renewcommand{\naeq}{\!\!\approx}

33: \newcommand{\nsim}{\!\!\sim}

34: \newcommand{\stand}[1]{\mbox{\em st}\left (#1 \right )}

35: \renewcommand{\mid}{\, | \,}

36:

37: \begin{document}

38: %UAI

39: %\begin{titlepage}

40:

41: \title{Lexicographic probability, conditional probability, and

42: nonstandard probability%

43: \thanks{The work was supported in part by NSF under

44: grants IRI-96-25901, IIS-0090145, and CTC-0208535, by

45: ONR under grant

46: N00014-02-1-0455, and by the DoD Multidisciplinary University Research

47: Initiative (MURI) program administered by the ONR under

48: grant N00014-01-1-0795.

49: %joe2

50: A preliminary version appeared in the Proceedings of the Eighth

51: Conference on Theoretical Aspects of Rationality and Knowledge, 2001

52: [Halpern 2001].

53: %\cite{Hal26}.

54: This version includes detailed proofs and more discussion

55: and more examples; in addition, the material in Section~\ref{sec:indep}

56: (on independence) is new.

57: }}

58: \author{

59: Joseph Y.\ Halpern\\

60: Dept. Computer Science\\

61: Cornell University\\

62: Ithaca, NY 14853\\

63: halpern@cs.cornell.edu\\

64: http:/$\!$/www.cs.cornell.edu/home/halpern

65: }

66: \date{\today}

67: \maketitle

68: %joe:UAI

69: %\thispagestyle{empty}

70: \begin{abstract}

71: The relationship between {\em Popper spaces\/} (conditional probability

72: spaces that satisfy some regularity conditions),

73: lexicographic probability systems (LPS's) \cite{BBD1,BBD2}, and

74: nonstandard probability spaces (NPS's) is considered.  If countable

75: additivity is assumed, Popper spaces and a subclass of

76: LPS's are equivalent; without the assumption of countable additivity,

77: the equivalence no longer holds.  If the state space is finite, LPS's are

78: equivalent to NPS's.  However, if the state space is infinite, NPS's are

79: shown to be more general than LPS's.

80: \end{abstract}

81:

82: %joe:UAI

83: %\end{titlepage}

84:

85: \nocite{Hal26}

86: \section{Introduction}

87: Probability  is certainly the most commonly-used approach for

88: representing uncertainty and conditioning the standard way of updating

89: probabilities in the light of new information.  Unfortunately, there is a

90: well-known problem with conditioning:  Conditioning on events of measure

91: 0 is not defined.  That makes it unclear how to proceed if an agent

92: learns something to which she initially assigned probability 0.

93: Although consideration of events of measure 0 may seem to be of little

94: practical interest, it turns out to play a critical role in game theory,

95: particularly in the analysis of strategic reasoning in extensive-form

96: %joe3

97: %games and in the analysis of various solution concepts in games

98: games and in the analysis of weak dominance in normal-form games

99: (see, for example,

100: %joe2: added BK00, Hammond99

101: %joe3

102: %\cite{BBD1,BBD2,BK00,Hammond94,Hammond99,KR97,KW82,Myerson86,Selten65}).

103: %joe4:

104: \cite{Bat96,BS02,BBD1,BBD2,BFK04,FT91a,Hammond94,Hammond99,KR97,KW82,Myerson86,Selten65,Selten75}).

105: It also arises in the analysis of conditional

106: statements by philosophers (see \cite{Adams66,McGee94}), and in dealing

107: with nonmonotonicity in Artificial Intelligence (see, for example,

108: \cite{LehmannMagidor}).

109:

110: There have been various attempts to deal with the problem of

111: conditioning on events of measure 0.  Perhaps the

112: %joe2: added Popper34

113: best known

114: %joe9

115: involves \emph{conditional probability spaces} (CPS's).  The idea,

116: which goes back to Popper \citeyear{Popper34,Popper68} and

117: de Finetti

118: \citeyear{Finetti36}, is to take as

119: primitive not probability, but conditional probability.  If $\mu$ is a

120: %joe9

121: %conditional probability measure, then $\mu(V \mid U)$ may still be

122: %undefined

123: conditional probability measure on a space $W$, then $\mu(V \mid U)$ may

124: still be undefined

125: for some pairs $V$ and $U$, but it is also possible that $\mu(V \mid U)$ is

126: %joe9

127: %defined even if $\mu(U) = 0$.  A second approach, which goes back to at

128: defined even if $\mu(U \mid W) = 0$.  A second approach, which goes back to at

129: least Robinson~\citeyear{Robinson73} and has been explored in

130: the economics literature \cite{Hammond94,Hammond99}, the AI literature

131: \cite{LehmannMagidor,Wilson95}, and the philosophy literature (see

132: \cite{McGee94} and the references therein) is to consider {\em

133: nonstandard probability spaces\/} (NPS's), where there are

134: infinitesimals that can be used to model

135: events that, intuitively, have infinitesimally small probability  yet

136: may still be learned or observed.

137:

138: There is a third approach to this problem, which  uses

139: sequences of probability measures to represent uncertainty.  The most

140: recent exemplar of this approach, which I focus on here,

141: is the framework of {\em lexicographic probability systems\/}

142: %joe9

143: (LPS's)

144: of Blume, Brandenburger, and

145: Dekel \citeyear{BBD1,BBD2} (BBD from now on). However, the idea of using

146: a system of measures to represent uncertainty actually

147: was explored as far back as the 1950s by R\'{e}nyi

148: \citeyear{Renyi56}

149: (see Section~\ref{related}).

150: A {\em lexicographic

151: probability system\/} is a sequence

152: $\<\mu_0,\mu_1, \ldots\>$ of probability measures.  Intuitively, the

153: first measure in the sequence, $\mu_0$, is the most important one,

154: followed by $\mu_1$, $\mu_2$, and so on.

155: %joe9

156: One way to understand LPS's is in terms of NPS's.

157: Roughly speaking, the

158: probability assigned to an event $U$ by a sequence such as

159: $\<\mu_0,\mu_1\>$ can be taken to be $\mu_0(U) + \epsilon\mu_1(U)$,

160: where $\epsilon$ is an infinitesimal.

161: Thus, even if the probability of $U$ according to $\mu_0$ is 0, $U$

162: still has a positive (although infinitesimal) probability if $\mu_1(U) > 0$.

163:

164: %joe9

165: %How are all these approaches related?

166: What is the precise relationship between these approaches?

167: %That question is the focus of this paper.

168: %joe9

169: %This question, which is the focus of the paper, has been considered

170: %before.

171: The relationship between LPS's and CPS's has been considered before.

172: For example, Hammond \citeyear{Hammond94} shows that

173: conditional probability spaces are equivalent to a subclass of LPS's

174: %joe9

175: %called lexicographic conditional probability spaces if the state space

176: called \emph{lexicographic conditional probability spaces} (LCPS's) if

177: the state space

178: is finite and it is possible

179: to condition on any nonempty set.%

180: %joe4:

181: \footnote{Despite this isomorphism, it is not clear that conditional

182: probability spaces are \emph{equivalent} to LPS's.   It depends on

183: exactly what we mean by equivalence.  The same comment applies below

184: where the word ``equivalent'' is used.  See Section~\ref{discussion}

185: for further discussion.  I thank Geir Asheim for bringing this point to

186: my attention.}

187: As shown by Spohn \citeyear{Spohn86},

188: Hammond's result can be extended to arbitrary countably additive {\em

189: Popper spaces}, where a Popper space is a conditional probability space

190: %joe9

191: %that satisfies certain regularity conditions.

192: where the events on which conditioning is allowed satisfy certain

193: regularity conditions.

194: %joe

195: As I show, this result depends critically on a number of assumptions.

196: In particular, it does not work without the assumption of countable

197: additivity, it requires that we extend LCPS's appropriately to the

198: infinite case, and it is sensitive to the choice of conditioning events.

199: %The extension is nontrivial  and, indeed, does not work without the

200: %assumption of countable additivity.

201: %joe9

202: For example, if we consider CPS's where

203: the conditioning events can be viewed as information sets, and so are

204: not closed under supersets (this is

205: essentially the case considered by Battigalli and Siniscalchi

206: \citeyear{BS02}), then the result no longer holds.

207: %joe9

208: %R\'{e}nyi \citeyear{Renyi56} and van Fraassen \citeyear{vF76} provide

209: %other representations of conditional probability spaces as sequences of

210: %measures, although not LPS's.  Their results apply even if the

211: %underlying state space is infinite, but countable additivity does not play a

212: %role in their representations.

213: %(See Section~\ref{FCP} for further discussion of this issue.)

214:

215: Turning to the relationship between LPS's and NPS's,

216: I show that if the state space is finite, then LPS's are in a sense

217: equivalent to NPS's.  More precisely, say that two measures of

218: uncertainty $\nu_1$ and $\nu_2$ (each of which can be either an LPS or

219: an NPS) are equivalent, denoted $\nu_1 \aeq \nu_2$, if they cannot be

220: distinguished by (real-valued)

221: random variables; that is, for all random variables $X$ and $Y$,

222: $E_{\nu_1}(X) \le E_{\nu_1}(Y)$ iff $E_{\nu_2}(X) \le E_{\nu_2}(Y)$

223: (where $E_\nu(X)$ denotes the expected value of $X$ with respect to

224: $\nu$).

225: To the extent that we are interested in these representations

226: of uncertainty for decision making, then we should not try to

227: distinguish two representations that are equivalent.

228: I show that, in finite spaces, there is a straightforward bijection

229: between $\aeq$-equivalence classes of LPS's and NPS's.    This

230: equivalence breaks down if the state space is infinite; in this case,

231: NPS's are strictly more general than LPS's

232: (whether or not countable additivity is assumed).

233:

234: Finally, I consider the relationship between Popper spaces and NPS's,

235: and show that NPS's are more general.

236: (The theorem I prove is a generalization of one proved

237: by McGee \citeyear{McGee94}, but my interpretation of it is quite

238: different; see Section~\ref{PopperNPS}.)

239:

240: These results give some useful insight into independence of random

241: variables.  There have been a number of alternative notions of

242: independence considered in the literature of extended probability spaces

243: (i.e., approaches that deal with the problem of conditioning on sets of

244: measure 0):  BBD considered three; Kohlberg and Reny \citeyear{KR97}

245: considered two others.   It turns out that these notions are perhaps

246: %joe9

247: %best understood in the context of NPS's. I describe and compare them

248: best understood in the context of NPS's; I describe and compare them

249: here.

250:

251: %joe3

252: %The most significant new results in this paper involve infinite spaces.

253: Many of the new results in this paper involve infinite spaces.

254: Given that most games studied by game theorists are finite, it is fair

255: to ask whether these results have any significance for game theory.

256: I believe they do.  Even if the underlying game is finite, the set of

257: types is infinite.  Epistemic characterizations of solution concepts

258: %joe3

259: %often make use of {\em universal\/} type spaces, which

260: %joe6

261: %joe9

262: %often make use of infinite type spaces,

263: often make use of \emph{complete} type spaces,

264: which include every possible type of every

265: player, where a type determines an (extended) probability over the strategies

266: and types of the other players; this must be an infinite space.

267: %joe9

268: %There are a number of closely related notions of such type spaces,

269: %which have been variously called \emph{terminal}, \emph{universal}, and

270: %\emph{complete} (see \cite{Sin07} for an overview).

271: %joe3

272: %When dealing with extensive-form games, the universal type spaces have

273: %to deal with the problem of conditioning on events of measure 0.

274: %Typically this has been done by using conditional probability spaces or

275: %LPS's.

276: For example, Battigalli and Siniscalchi \citeyear{BS02} use a

277: complete type space where the uncertainty is represented by cps's to

278: give an epistemic characterization of extensive-form rationalizability and

279: backward induction, while Brandenburger,  Friedenberg, and Keisler

280: \citeyear{BFK04} use a complete type space where the uncertainty is

281: represented by LPS's to get a characterization of weak dominance in

282: normal-form games.

283: As the results of this paper show, the set of types depends

284: %joe3

285: to some extent

286: on the notion of extended probability used.

287: Similarly, a number of characterizations of solution concepts depend on

288: %joe4: added Bat96

289: independence (see, for example, \cite{Bat96,KR97,BS99a}).  Again, the results

290: of this paper show that these notions can be somewhat sensitive to

291: exactly how uncertainty is represented,

292: %joe3

293: even with a finite state space.

294: While I do not present any new

295: game-theoretic results here, I believe that the characterizations I have

296: provided may be useful both in terms of defending particular choices of

297: representation used and suggesting new solution concepts.

298:

299:

300: The remainder of the paper is organized as follows.  In the next

301: section, I review all the relevant definitions for the three

302: representations of uncertainty considered here.  Section~\ref{FCP}

303: considers the relationship between Popper spaces and

304: LPS's.  Section~\ref{LPSNPS} considers the relationship between

305: LPS's and NPS's.  Finally,

306: Section~\ref{PopperNPS} considers the relationship between Popper spaces

307: and NPS's.  In Section~\ref{sec:indep} I consider what these results

308: have to say about independence.  I conclude with

309: some discussion in Section~\ref{discussion}.

310:

311: \section{Conditional, lexicographic, and nonstandard probability spaces}

312: In this section I briefly review the three approaches to representing

313: likelihood discussed in the introduction.

314:

315: \subsection{Popper spaces}\label{cpsdef}

316:

317: A {\em conditional probability measure\/} takes {\em pairs\/} $U, V$ of subsets

318: as arguments; $\mu(V,U)$ is generally written $\mu(V \mid U)$ to stress the

319: conditioning aspects. The first argument comes from some

320: algebra $\F$ of

321: subsets of a space $W$; if $W$ is infinite, $\F$ is often taken to be a

322: $\sigma$-algebra.  (Recall that an algebra of subsets of $W$ is a set of

323: subsets containing $W$ and closed under union and complementation.  A

324: $\sigma$-algebra is an algebra that is closed under countable union.)

325: %joe2:

326: The second argument comes from a set $\F'$ of conditioning

327: %joe9

328: %events, i.e., that is, events on which conditioning is allowed.

329: events, that is, events on which conditioning is allowed.

330: %joe9

331: One natural choice is to take $\F'$ to be $\F - \emptyset$.  But it may

332: be reasonable to consider other restrictions on $\F'$.  For example,

333: Battigalli and Siniscalchi \citeyear{BS02} take $\F'$ to consist of the

334: information sets in a game, since they are interested only in agents who

335: update their beliefs conditional on getting some information.

336: The question is what

337: %joe2

338: %constraints, if any, should be placed on the second argument.   I start

339: constraints, if any, should be placed on $\F'$.

340: %joe10:

341: %I start with three minimal requirements, and later add a fourth.

342: For most of this paper, I focus on \emph{Popper spaces} (named after

343: Karl Popper), defined next, where the set $\F'$ satisfies four arguably

344: reasonable requirements, but I occasionally consider other requirements

345: (see Section~\ref{sec:BS}).

346:

347: \commentout{

348: \dfn A {\em Popper

349: algebra\/} over $W$ is a set $\F \times \F'$ of subsets of $W \times W$

350: such that (a) $\F$ is an algebra over $W$, (b) $\F'$ is a nonempty

351: subset of $\F$ (not necessarily an algebra over $W$) that does not

352: contain $\emptyset$, and

353: (c) $\F'$ is closed under supersets in $\F$, in that if

354: $V \in \F'$, $V \subseteq V'$, and $V' \in \F$, then $V' \in \F'$.

355: (Popper algebras are named after Karl Popper.)

356: \edfn

357:

358:

359: %joe6

360: %joe9

361: %The role of the requirement that $\F'$ be closed under supersets is

362: %elucidated in Example~\ref{xam:noPopper}.

363: While I have called these requirements ``minimal'', note that if $\F'$

364: is taken to consist of information sets, then it is not closed under

365: supersets in $\F$.  I return to this issue in Section~\ref{sec:BS}.

366: %Notice that the set $\F'$ in a Popper algebra $\F \times \F'$ is not

367: %itself required to be an algebra over $W$ (and, indeed,

368: %typically is not).

369: }

370:

371: \dfn\label{dfn.condprob} A {\em conditional probability space (cps) over

372: $(W,\F)$\/} is a tuple

373: $(W,\F,\F',\mu)$ such that

374: %$\F \times \F'$ is a Popper algebra over $W$ and

375: $\F$ is an algebra over $W$, $\F'$ is a set of subsets of $W$

376: (not necessarily an algebra over $W$) that does not

377: contain $\emptyset$, and

378: $\mu: \F \times \F' \rightarrow [0,1]$ satisfies the

379: following conditions:

380: \begin{itemize}

381: \item[CP1.] $\mu(U \mid U) = 1$ if $U \in\F'$.

382: \item[CP2.] $\mu(V_1 \union V_2 \mid U) = \mu(V_1 \mid U) + \mu(V_2 \mid U)$ if $V_1

383: \inter V_2 = \emptyset$, $U \in \F'$, and $V_1, V_2 \in \F$.

384: %if the

385: %$V_i$'s are pairwise disjoint sets in $\F$ and $U \in \F'$.

386: \item[CP3.] $\mu(V \mid U) = \mu(V \mid X) \times \mu(X \mid U)$ if $V \subseteq X

387: \subseteq U$, $U, X \in \F'$, $V \in \F$.

388: \end{itemize}

389: %joe6

390: Note that it follows from CP1 and CP2 that $\mu(\cdot \mid U)$ is a

391: probability measure on $(W,\F)$ (and, in particular, that $\mu(\emptyset

392: \mid U) = 0$) for each $U \in \F'$.

393: %joe10

394: A {\em Popper space over $(W,\F)$\/} is a conditional probability space

395: $(W,\F,\F',\mu)$

396: that satisfies

397: %an additional condition: if

398: three additional conditions: (a) $\F' \subseteq \F$, (b)

399: $\F'$ is closed under supersets in $\F$, in that if

400: $V \in \F'$, $V \subseteq V'$, and $V' \in \F$, then $V' \in \F'$, and

401: (c) if $U \in \F'$ and $\mu(V \mid U) \ne 0$ then $V \inter U \in \F'$.

402: If $\F$ is a $\sigma$-algebra and $\mu$ is countably additive

403: (that is, if $\mu(\union V_i \mid U) = \sum_{i = 1}^\infty \mu(V_i \mid U)$ if the

404: $V_i$'s are pairwise disjoint elements of $\F$ and $U \in \F'$), then the

405: Popper space is said to be {\em countably additive}.

406: Let $\Popper(W,\F)$ denote the set of Popper spaces over $(W,\F)$.

407: If $\F$ is a $\sigma$-algebra, I use a superscript $c$ to

408: denote the restriction to countably additive Popper spaces, so

409: $\Popper^c(W,\F)$ denotes the set of

410: countably additive Popper spaces over $(W,\F)$.

411: The probability measure $\mu$ in a Popper space is

412: called a {\em Popper measure}.

413: \edfn

414: %joe10

415: %The additional regularity condition on $\F'$ required in a Popper space

416: The last regularity condition on $\F'$ required in a Popper space

417: corresponds to the observation that for an unconditional

418: probability measure $\mu$, if $\mu(V \mid U) \ne 0$ then $\mu(V \inter U)

419: \ne 0$, so conditioning on $V \inter U$ should be defined.

420: %joe9

421: Note that, since this regularity condition depends on the Popper

422: measure, it may well be the case that $(W,\F,\F',\mu)$ and

423: $(W,\F,\F',\nu)$ are both cps's over $(W,\F)$, but only the former is a

424: Popper space over $(W,\F)$.
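As a sketch of how the last condition can separate cps's from Popper spaces, consider the following two-point example.  The space, the set $\F'$, and the measures $\mu$ and $\nu$ are illustrative choices made here for concreteness; they are assumptions of this sketch rather than an example from the literature cited above.

```latex
% Illustrative example (hypothetical values; uses the paper's macros \F, \mid, \inter).
Let $W = \{w_1, w_2\}$, $\F = 2^W$, and $\F' = \{\{w_1\}, W\}$, so that
$\F' \subseteq \F$ and $\F'$ is closed under supersets in $\F$.  Define
\[
\mu(\{w_1\} \mid W) = 1, \qquad
\nu(\{w_1\} \mid W) = \nu(\{w_2\} \mid W) = 1/2,
\]
and let both $\mu(\cdot \mid \{w_1\})$ and $\nu(\cdot \mid \{w_1\})$ put
probability 1 on $w_1$.
% CP3 is nontrivial only for V = X = \{w_1\} and U = W.
CP1 and CP2 hold by construction, and CP3 holds since
\[
\mu(\{w_1\} \mid W) = 1 = 1 \times 1, \qquad
\nu(\{w_1\} \mid W) = 1/2 = 1 \times 1/2 .
\]
Thus both $(W,\F,\F',\mu)$ and $(W,\F,\F',\nu)$ are cps's.
% Condition (c) of the Popper-space definition:
For $\mu$, $\mu(V \mid W) \ne 0$ only if $w_1 \in V$, in which case
$V \inter W = V \in \F'$, so (c) holds.  For $\nu$,
$\nu(\{w_2\} \mid W) = 1/2 \ne 0$ but $\{w_2\} \inter W = \{w_2\} \notin \F'$,
so (c) fails.
```

In this sketch the same pair $(\F,\F')$ supports one Popper space and one cps that is not a Popper space, which is the situation described in the preceding remark.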

425:

426: %joe2: rewrote

427: Popper \citeyear{Popper34,Popper68}\index{Popper, K.~R.}

428: and de Finetti \citeyear{Finetti36} were the first to

429: formally consider conditional probability as the basic notion, although

430: as R\'{e}nyi \citeyear{Renyi64}\index{R\'{e}nyi, A.} points out, the

431: idea of taking conditional probability as primitive seems to go back as

432: far as Keynes \citeyear{Keynes}.

433: CP1--3 are essentially due to R\'{e}nyi \citeyear{Renyi55}.

434: Van Fraassen \citeyear{vF76} defined what I have called Popper measures;

435: he called them Popper functions, reserving the name Popper measure for

436: what I am calling a countably additive Popper measure.

437: Starting from the work of de Finetti, there has been a general study of

438: \emph{coherent conditional probabilities}.  A coherent conditional

439: probability is essentially a

440: %joe10

441: %generalization of a cps, since it is defined

442: cps that is not necessarily a Popper space, since it is

443: defined

444: on a set $\F \times \F'$

445: %more general than a Popper algebra (for example,

446: where $\F'$ does not have to be a subset of $\F$); see, for example,

447: \cite{CS02} and the references therein.

448: Hammond \citeyear{Hammond94} discusses the use of conditional

449: probability spaces in philosophy and game theory, and provides an

450: extensive list of references.

451:

452:

453:

454: %Van Fraassen \cite{vF76} first Popper measures; he showed that Popper

455: %measures could be represented as sequences of probability measures that

456: %satisfied certain constraints, but these sequences seem to be quite

457: %different in spirit from lexicographic probability spaces, which I

458: %consider next.

459:

460: \subsection{Lexicographic probability spaces}\label{LPSdef}

461:

462: \dfn A {\em lexicographic probability space (LPS) (of length

463: $\alpha$) over $(W,\F)$\/} is a tuple

464: $(W,\F,\vecmu)$ where, as before, $W$ is a set of possible worlds and

465: $\F$ is an algebra over $W$, and $\vecmu$ is a sequence of finitely additive

466: probability measures on $(W,\F)$ indexed by ordinals $< \alpha$.

467: %\footnote{All probability measures are assumed to be only finitely

468: %additive, unless I explicitly say that they are countably additive.}

469: (Technically, $\vecmu$ is a function from the ordinals less

470: than $\alpha$ to probability measures on $(W,\F)$.)

471: I typically write $\vecmu$ as $(\mu_0, \mu_1, \ldots)$ or as

472: $(\mu_\beta: \beta < \alpha)$.

473: If $\F$ is a $\sigma$-algebra and each of the probability measures in

474: $\vecmu$ is countably additive, then $\vecmu$ is a {\em countably

475: additive LPS}.

476: Let $\LPS(W,\F)$ denote the set of LPS's over $(W,\F)$.

477: %joe9

478: %$\LPS(W,\F,\F')$ denote the set of LPS's $(W,\F,\vecmu)$ such that

479: %$\vecmu(U) > 0$ (i.e., $\mu_\beta(U) > 0$ for some

480: %$\beta$) iff $U \in \F'$.

481: Again, if $\F$ is a $\sigma$-algebra, a superscript $c$ is used to

482: denote countable additivity, so $\LPS^c(W,\F)$ denotes the set of

483: countably additive LPS's over $(W,\F)$.

484: %joe10

485: % and $\LPS^c(W,\F,\F')$ consists

486: %of the countably additive LPS's $(W,\F,\vecmu)$ in $(W,\F,\F')$.

487: When $(W,\F)$ is understood, I often refer to $\vecmu$ as

488: the LPS.

489: %joe10

490: I write $\vecmu(U) > 0$ if $\mu_\beta(U) > 0$ for some $\beta$.

491: \edfn

492:

493: %joe9*

494: %joe10

495: %$\LPS(W,\F)$ is richer than $\Popper(W,\F)$, even if we restrict to

496: %finite %spaces $W$ (so that countable additivity is not an issue).

497: There is a sense in which $\LPS(W,\F)$ can capture a richer set of

498: preferences than $\Popper(W,\F)$,

499: even if we restrict to finite

500: spaces $W$ (so that countable additivity is not an issue).

501: For example, suppose that $W = \{w_1,w_2\}$, $\mu_0(w_1) = \mu_0(w_2) =

502: 1/2$, and $\mu_1(w_1) = 1$.  The LPS $\vecmu = (\mu_0,\mu_1)$  can be

503: thought of as describing the situation where $w_1$ is very slightly more

504: likely than $w_2$.  Thus, for example, if $X_i$ is a bet that pays off 1

505: in state $w_i$ and 0 in state $w_{3-i}$, then according to $\vecmu$,

506: $X_1$ should be (slightly) preferred to $X_2$, but for all $r > 1$,

507: $rX_2$ is preferred to $X_1$.  There is no CPS on $\{w_1,w_2\}$ that

508: leads to these preferences.
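The preferences in this example can be checked by a short calculation.  The following is a sketch that assumes bets are compared by the lexicographic order on their vectors of expected values, one component per measure in the LPS; the notation $E_{\vecmu}$ for this vector is introduced only for the illustration.

```latex
% Sketch: expected-value vectors under \vecmu = (\mu_0, \mu_1), with
% \mu_0(w_1) = \mu_0(w_2) = 1/2 and \mu_1(w_1) = 1.
\[
E_{\vecmu}(X_1) = (E_{\mu_0}(X_1), E_{\mu_1}(X_1)) = (1/2, 1), \qquad
E_{\vecmu}(X_2) = (1/2, 0), \qquad
E_{\vecmu}(rX_2) = (r/2, 0).
\]
% X_1 and X_2 tie at the first component; the second component breaks the tie.
Since $(1/2, 0)$ precedes $(1/2, 1)$ lexicographically, $X_1$ is preferred
to $X_2$.  If $r > 1$, then $r/2 > 1/2$, so $rX_2$ is preferred to $X_1$
already at the first component.
```

The second measure matters only when the first one ties, which is the sense in which $w_1$ is only ``very slightly'' more likely than $w_2$.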

509:

510: Note that, in this example, the support of $\mu_1$ is a subset of that

511: of $\mu_0$.  To obtain a bijection between LPS's and CPS's, we cannot

512: allow much overlap between the supports of the measures that make an

513: LPS.  What counts as ``much overlap'' turns out to be somewhat subtle.

514: One way to formalize it was proposed by BBD.  They defined a {\em

515: lexicographic conditional probability space (LCPS)\/} to be an LPS such

516: that,

517: %joe6

518: roughly speaking,

519: the probability measures in the sequence have disjoint supports;

520: more precisely, there exist sets $U_\beta \in \F$ such that $\mu_\beta(U_\beta)

521: = 1$ and the sets $U_\beta$ are pairwise disjoint for $\beta < \alpha$.

522: One motivation for considering disjoint sets is to consider an agent who

523: has a sequence of hypotheses $(h_0, h_1, \ldots)$ regarding how the

524: world works.  If the primary hypothesis $h_0$ is discarded, then the

525: agent judges events according to $h_1$; if $h_1$ is discarded, then the

526: agent uses $h_2$, and so on.  Associated with hypothesis $h_\beta$ is

527: the probability measure $\mu_\beta$.  What would cause $h_\beta$ to be

528: discarded is observing an event $U$ such that $\mu_\beta(U) = 0$.

529: The set $U_\beta$ is the support of the hypothesis $h_\beta$.  In some

530: cases, it seems reasonable to think of the supports of these hypotheses

531: as disjoint.  This leads to LCPS's.

532:

533: BBD considered only finite spaces.  When we move to infinite spaces,

534: requiring disjointness of the supports of hypotheses may be too strong.

535: Brandenburger, Friedenberg, and Keisler \citeyear{BFK04} consider

536: finite-length LPS's $\vecmu$ that satisfy the property that

537: there exist sets $U_\beta$ (not necessarily disjoint) such that

538: $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ for $\gamma \ne

539: \beta$.  Call such an LPS an MSLPS (for \emph{mutually singular LPS}).

540: Let a {\em structured LPS (SLPS)\/} be an LPS $\vecmu$ such that there

541: exist sets $U_\beta \in \F$ such that $\mu_\beta(U_\beta) = 1$ and

542: $\mu_\beta(U_\gamma) = 0$ for $\gamma > \beta$.

543: Thus, in an SLPS, later hypotheses are given probability 0 according to

544: the probability measure induced by earlier hypotheses, but earlier

545: hypotheses do not necessarily get probability 0 according to the later

546: hypotheses.  (Spohn~\citeyear{Spohn86} also considered SLPS's; he called

547: them {\em dimensionally well-ordered families of probability measures}.)

548: Clearly every LCPS is an MSLPS, and every MSLPS is an SLPS.  If $\alpha$

549: is countable and we require countable additivity (or if $\alpha$ is

550: finite) then the notions are easily seen to coincide.  Given an SLPS

551: $\vecmu$ with associated sets $U_\beta, \beta <

552: \alpha$, define $U_\beta' = U_\beta - (\union_{\gamma > \beta} U_\gamma)$.

553: The sets $U_\beta'$ are

554: clearly pairwise disjoint elements of $\F$, and

555: %joe6

556: %$U_i'$ is a support for $\mu_i$.

557: $\mu_\beta(U_\beta') = 1$.

558: %Of course, the same argument holds even without the assumption

559: %of countable additivity if $\alpha$ is finite.

560: However, in general, LCPS's are a strict subset of MSLPS's, and MSLPS's

561: are a strict subset of SLPS's, as the following two examples show.
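The step $\mu_\beta(U_\beta') = 1$ in the argument above can be spelled out as follows.  This is a sketch under the same assumptions as that argument: either $\alpha$ is finite, or $\alpha$ is countable and each $\mu_\beta$ is countably additive, so that subadditivity applies to the union over $\gamma > \beta$.

```latex
% Uses the paper's macro \union and the SLPS property \mu_\beta(U_\gamma) = 0 for \gamma > \beta.
\[
\mu_\beta\Bigl(\union_{\gamma > \beta} U_\gamma\Bigr)
\le \sum_{\gamma > \beta} \mu_\beta(U_\gamma) = 0,
\qquad \mbox{so} \qquad
\mu_\beta(U_\beta') \ge \mu_\beta(U_\beta) -
\mu_\beta\Bigl(\union_{\gamma > \beta} U_\gamma\Bigr) = 1 .
\]
% Disjointness: if \beta < \gamma, then U_\gamma' \subseteq U_\gamma,
% and U_\gamma is removed from U_\beta in forming U_\beta'.
```

Without countable additivity, the inequality on the left may fail for an infinite union, which is consistent with the separation shown by the examples that follow.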

562:

563:

564: %joe6: new paragraph

565: \commentout{

566: To understand the motivation for SLPS's, consider an agent with a

567: sequence of hypotheses (modeled as probability distributions).  The first

568: hypothesis, modeled by $\mu_0$, is used as long as it is not

569: controverted by  evidence.  If an event $E$ is discovered that shows

570: that the first hypothesis must be wrong (i.e., $\mu_0(E) = 0)$, then

571: next hypothesis that gives $E$ positive measure is used.  With this

572: intuition, it seems reasonable that if $i > j$, the set of states where

573: $\mu_j$ is used, namely, $U_j$, should be a set that is given measure 0

574: by all $\mu_i$ with $i < j$; hypothesis $j$ should not be used unless

575: all higher-ranking hypotheses have been discarded.

576: }

577:

578: % However, in general, countable additivity is required, as a

579: %simple modification of Example~\ref{SLPSxam} below shows.)

580:

581:

582: \xam\label{SLPSxam}  Consider a well-ordering of the interval $[0,1]$,

583: that is,

584: %joe2

585: %an isomorphism

586: a bijection

587: from $[0,1]$ to an initial segment of the ordinals.

588: %a well-ordering of the reals.  The existence of such a well-ordering is

589: %known to be equivalent to the Axiom of Choice.  In any case,

590: Suppose that this initial segment of the ordinals has length $\alpha$.

591: Let $([0,1],\F,\vecmu)$ be an LPS of length $\alpha$ where $\F$ consists

592: of the Borel subsets of $[0,1]$.

593: Let $\mu_0$ be the standard Borel measure on $[0,1]$,

594: and let $\mu_\beta$ be the measure that gives probability 1 to

595: $r_\beta$, the $\beta$th real in the well-ordering.  This clearly gives

596: an SLPS, since

597: %joe6

598: %the support of $\mu_0$ is $[0,1]$ and the support of $\mu_\beta$ for $0

599: we can take $U_0 = [0,1]$ and $U_\beta = \{r_\beta\}$ for $0

600: < \beta < \alpha$; note that $\mu_\beta(U_\gamma) = 0$ for $\gamma > \beta$.

601: %joe9

602: %However, this SLPS is not equivalent to any LCPS; there

603: However, this SLPS is not equivalent to any MSLPS (and hence not to any

604: LCPS); there

605: %joe6

606: %is no support of $\mu_0$ which is disjoint from the supports of

607: %joe9

608: is no

609: set $U_0'$ such that $\mu_0(U_0') = 1$ and $U_0'$ is disjoint from

610: $\{r_\beta\}$ for all $\beta$ with $0 < \beta < \alpha$.

611: %The intuition given above for SLPS's is also given by Brandenburger,

612: %Friedenberg, and Keisler for MSLPS's.}

613: %joe8

614: %I would argue that this example

615: %illustrates that the mutual singularity requirement is too strong to capture

616: %that intuition; it suffices that $\mu_\alpha(U_\beta) = 0$ for $\beta >

617: %\alpha$ (as is required for SLPS's).  If $r_\beta$ is observed, then

618: %the agent should discard hypothesis $\mu_0$ and use hypothesis

619: %$\mu_\beta$, even though $\mu_\beta([0,1]) = 1$.}

620: \exam
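To spell out why no such set $U_0'$ exists in Example~\ref{SLPSxam} (a
short verification sketch, using only the definitions in the example):
since the well-ordering enumerates all of $[0,1]$,
$$\{r_\beta: 0 < \beta < \alpha\} = [0,1] - \{r_0\}.$$
Thus, a set $U_0'$ that contains no $r_\beta$ with $0 < \beta < \alpha$
must satisfy $U_0' \subseteq \{r_0\}$, so that $\mu_0(U_0') \le
\mu_0(\{r_0\}) = 0$, which is incompatible with $\mu_0(U_0') = 1$.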

621:

622: %joe9

623: \xam\label{MSLPSxam}  Suppose that $W = [0,1] \times [0,1]$.  Again,

624: consider a well-ordering on $[0,1]$.  Using the notation of

Example~\ref{SLPSxam}, define $U_{0,\beta} = \{r_{\beta}\} \times [0,1]$ and

626: $U_{1,\beta} = [0,1] \times \{r_\beta\}$.  Define $\mu_{i,\beta}$ to be

627: the Borel measure on $U_{i,\beta}$.  Consider the LPS $(\mu_{0,0},

628: \mu_{0,1}, \ldots, \mu_{1,0}, \mu_{1,1}, \ldots)$.  Clearly this is an

629: MSLPS, but not an LCPS.  \exam

630:

631: The difference between LCPS's, MSLPS's, and SLPS's does not arise in the work

632: of BBD, since they consider only finite

633: sequences of measures.  The restriction to finite sequences, in turn, is

634: due to their restriction to finite sets $W$ of possible worlds.

635: Clearly, if $W$ is finite, then all LCPS's over $W$ must have length $\le

|W|$, since the measures in an LCPS have disjoint supports.  Here the
difference will play a more significant role.

638:

639: We can put an obvious lexicographic order $<_L$ on sequences $(x_0, x_1,

640: \ldots)$ of numbers in $[0,1]$ of length $\alpha$: $(x_0, x_1, \ldots)

641: <_L (y_0, y_1, \ldots)$ if there exists $\beta < \alpha$ such that

642: $x_\beta < y_\beta$ and $x_\gamma

643: = y_\gamma$ for all $\gamma < \beta$.  That is, we

644: compare two sequences by comparing their components at the first place

645: they differ.  (Even if $\alpha$ is infinite, because we are dealing with

646: ordinals, there will be a least ordinal at which the sequences differ if

647: they differ at all.)  This lexicographic order will be used

648: to define decision rules.
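For instance (an illustrative choice of numbers, not taken from BBD),
with $\alpha = 3$,
$$(1/2, 0, 1) <_L (1/2, 1/3, 0),$$
since the sequences agree at index 0 and first differ at index 1, where
$0 < 1/3$; the later components (here $1$ versus $0$ at index 2) play no
role in the comparison.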

649: %I return to this issue in Section~\ref{??}.

650:

651: BBD define conditioning in LPS's as follows.  Given $\vecmu$ and $U \in

652: \F$ such that $\vecmu(U) > 0$, let $\vecmu|U =

653: (\mu_{k_0}(\cdot \mid U), \mu_{k_1}(\cdot \mid U), \ldots )$, where $(k_0,

654: k_1, \ldots)$ is the subsequence of all indices for which the

655: probability of $U$ is positive.

656: Formally, $k_0 = \min\{k: \mu_k(U) > 0\}$

and for an arbitrary ordinal $\beta > 0$, if $k_\gamma$ has been

658: defined for all $\gamma < \beta$ and there exists a measure $\mu_{\delta}$ in

659: $\vecmu$ such that $\mu_{\delta}(U) > 0$ and $\delta > k_\gamma$ for all

660: $\gamma < \beta$, then $k_\beta = \min\{\delta: \mu_{\delta}(U) > 0, \,

661: \delta > k_\gamma \mbox{ for all } \gamma < \beta\}$.

662: Note that

663: $\vecmu|U$ is undefined if $\vecmu(U) = 0$.
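As a small worked instance of this definition (the space and the numbers
are assumptions made here for illustration), let $W = \{w_1, w_2,
w_3\}$, $\F = 2^W$, and $\vecmu = (\mu_0, \mu_1)$, where
$$
\begin{array}{lll}
\mu_0(\{w_1\}) = 1/2, & \mu_0(\{w_2\}) = 1/2, & \mu_0(\{w_3\}) = 0,\\
\mu_1(\{w_1\}) = 0, & \mu_1(\{w_2\}) = 1/2, & \mu_1(\{w_3\}) = 1/2.
\end{array}
$$
If $U = \{w_2,w_3\}$, then $\mu_0(U) = 1/2$ and $\mu_1(U) = 1$, so $k_0 =
0$, $k_1 = 1$, and $\vecmu|U = (\mu_0(\cdot \mid U), \mu_1(\cdot \mid
U))$, with $\mu_0(\{w_2\} \mid U) = 1$ and $\mu_1(\{w_2\} \mid U) =
\mu_1(\{w_3\} \mid U) = 1/2$.  If instead $U' = \{w_3\}$, then
$\mu_0(U') = 0$ and $\mu_1(U') = 1/2$, so $k_0 = 1$ and $\vecmu|U' =
(\mu_1(\cdot \mid U'))$ has length 1, with $\mu_1(\{w_3\} \mid U') = 1$.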

664:

665: \subsection{Nonstandard probability spaces}\label{NPSdef}

666:

667: It is well known that there exist {\em non-Archimedean fields}---fields

668: that include the real  numbers as a subfield but also have

669: {\em infinitesimals\/}, numbers that are positive but still less than

670: any positive real number.   The smallest such non-Archimedean field,

commonly denoted $\IR(\epsilon)$, is the field generated by

672: adding to the reals a single infinitesimal $\epsilon$.%

673: \footnote{The construction of $\IR(\epsilon)$ apparently goes back to

674: Robinson \citeyear{Robinson73}.  It is reviewed by

675: Hammond \citeyear{Hammond94,Hammond99} and Wilson

676: \citeyear{Wilson95} (who calls $\IR(\epsilon)$ the {\em extended

677: reals\/}).}

678: %joe3

679: %The {\em hyperreals}, nonstandard models of the reals that satisfy

680: %all the first-order properties that hold of the real numbers (see

681: %\cite{(Davis77}), are also instances of non-Archimedean fields.

682: We can further restrict to non-Archimedean fields that are

683: \emph{elementary extensions} of the standard reals: they

684: agree with the

685: standard reals on all properties that can be expressed in a first-order

686: language with a predicate $N$ representing the natural numbers.

687: For most of this paper, I use only the following properties

688: of non-Archimedean fields:

689: %in the language of arithmetic (i.e., first-order formulas involving 0,

690: %1, $+$, and $\times$).  That is, there are nonstandard models of the

691: %reals that include infinitesimals that satisfy a first order formula

692: %$\phi$ in the language of arithmetic iff $\phi$ is satisfied in the

693: %%nstandard reals.  The existence of such nonstandard models follows easily

694: %from the compactness theorem for first-order logic.  (See \cite{Davis}

695: %for an introduction to both nondstandard reals and the relevant logic.)

696: %The logical details do not matter.  All that I need for the purposes of

697: %this paper are the following basic facts, which I shall use without

698: %further comment in the remainder of the paper.

699: %\begin{itemize}

700: %\item There exist nonstandard models of the reals.  (By nonstandard

701: %model here I mean any model that satisfies all the first-order

702: %properties of the reals but is not isomorphic to the reals.)

703: %Each of these nonstandard models embeds a nonstandard model of the

704: %natural numbers.

705: %\item

706: %Moreover, for any ordinal alpha, there exists a nonstandard model with

707: %%nnonstandard natural number $N_\beta$ for all ordinals $\beta \le \alpha$ such

708: %that if $\beta < \beta'$ then $N_\beta < N_{\beta'}$.

709: %(The existence of such models follows immediately from the compactness

710: %theorem too.)

711: \begin{enumerate}

712: \item If $\IR^*$ is a non-Archimedean field, then for all $b \in \IR^*$

713: such that $-r < b < r$ for some standard real $r > 0$,

714: there is a unique closest real number $a$ such that $|a - b|$ is an

715: infinitesimal.  (Formally, $a$ is the inf of the set of real numbers

716: that are at least as large as $b$.)  Let $\stand{b}$ denote the closest

717: standard real to $b$; $\stand{b}$ is sometimes read ``the standard

718: part of $b$''.

719: \item If $\stand{\epsilon/\epsilon'} =0$, then $a \epsilon < \epsilon'$

720: for all positive standard real numbers $a$.  (If $a \epsilon$ were

at least $\epsilon'$, then $\epsilon/\epsilon'$ would be at least
$1/a$,

723: contradicting the assumption that $\stand{\epsilon/\epsilon'} = 0$.)

724: \end{enumerate}
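As a quick check of both properties (a sketch in $\IR(\epsilon)$; the
particular elements are chosen here for illustration), take $b = 3 +
2\epsilon - \epsilon^2$.  Then $-4 < b < 4$ and $|3 - b| = 2\epsilon -
\epsilon^2$ is an infinitesimal, so $\stand{b} = 3$.  For the second
property, take $\epsilon^2$ and $\epsilon$ in the roles of $\epsilon$ and
$\epsilon'$: since
$$\stand{\epsilon^2/\epsilon} = \stand{\epsilon} = 0,$$
it follows that $a \epsilon^2 < \epsilon$ for all positive standard reals
$a$; that is, $\epsilon^2$ is infinitesimal even relative to $\epsilon$.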

725:

726: Given a non-Archimedean field $\IR^*$, a  {\em

727: nonstandard probability space (NPS) over $(W,\F)$ (with range $\IR^*$)\/} is

728: a tuple $(W,\F,\mu)$, where $W$ is a set of possible worlds,  $\F$ is an

729: algebra

730: of subsets of $W$, and $\mu$ assigns to sets in $\F$

731: %joe2

732: %an element of

733: a nonnegative element of

734: $\IR^*$ such that $\mu(W) = 1$ and $\mu(U \union V) = \mu(U) + \mu(V)$ if

735: $U$ and $V$ are disjoint.%

736: \footnote{Note that, unlike Hammond \citeyear{Hammond94,Hammond99},

737: I do not restrict the range of probability measures to consist of

738: ratios of polynomials in $\epsilon$ with nonnegative coefficients.}
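For example (a minimal illustration with range $\IR(\epsilon)$; the
specific values are assumptions chosen here), let $W =
\{w_1,w_2,w_3\}$, $\F = 2^W$, and
$$\mu(\{w_1\}) = 1 - \epsilon - \epsilon^2, \ \ \ \mu(\{w_2\}) =
\epsilon, \ \ \ \mu(\{w_3\}) = \epsilon^2,$$
extended to all of $\F$ by additivity.  Each value is nonnegative and
$\mu(W) = 1$, so $(W,\F,\mu)$ is an NPS; moreover,
$\stand{\mu(\{w_1\})} = 1$, while $w_2$ and $w_3$ get infinitesimal
probability, with $w_3$ infinitely less likely than $w_2$.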

739:

740: If $W$ is infinite, we may also require that

741: $\F$ be a $\sigma$-algebra and that $\mu$ be countably additive.

742: (There are some subtleties involved with countable additivity in

743: nonstandard probability spaces; see Section~\ref{countableadditivity}.)

744:

745:

746:

747: \section{Relating Popper Spaces to (S)LPS's}\label{FCP}

748:

749: In this section, I consider a mapping $\FCP$ from SLPS's over

750: $(W,\F)$ to Popper spaces over $(W,\F)$, for each fixed $W$ and $\F$,

751: and show that, in many cases of interest, $\FCP$ is a bijection.

752: Given an SLPS $(W,\F,\vecmu)$ of length $\alpha$,

753: consider the cps $(W,\F,\F',\mu)$ such that $\F' = \union_{\beta <

754: \alpha} \{V \in \F: \mu_\beta(V)

755: > 0 \}$.  For $V \in \F'$, let $\beta_V$

be the smallest index such that $\mu_{\beta_V}(V) > 0$.  Define $\mu(U \mid V) =

757: \mu_{\beta_V}(U \mid V)$.  I leave it to the reader to check that

758: $(W,\F,\F',\mu)$ is a Popper space.
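As an illustration of this definition (a sketch on a three-element space
chosen here for concreteness), let $W = \{w_1,w_2,w_3\}$, $\F = 2^W$,
and $\vecmu = (\mu_0,\mu_1)$, where $\mu_0(\{w_1\}) = 1$ and
$\mu_1(\{w_2\}) = \mu_1(\{w_3\}) = 1/2$; this is an SLPS, as witnessed
by $U_0 = \{w_1\}$ and $U_1 = \{w_2,w_3\}$.  Every nonempty set gets
positive probability under $\mu_0$ or $\mu_1$, so $\F' = 2^W -
\{\emptyset\}$.  For $V = \{w_1,w_2\}$, $\beta_V = 0$ and
$$\mu(\{w_1\} \mid V) = \mu_0(\{w_1\} \mid V) = 1,$$
while for $V = \{w_2,w_3\}$, $\mu_0(V) = 0$, so $\beta_V = 1$ and
$$\mu(\{w_2\} \mid V) = \mu_1(\{w_2\} \mid V) = 1/2.$$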

759:

760: %joe2

761: %There are many isomorphisms between two spaces.

762: There are many bijections between two spaces.

763: Why is $\FCP$ of interest?

764: Suppose that $\FCP(W,\F,\vecmu) = (W,\F,\F',\mu)$.  It is easy

765: to check that the following two important properties hold:

766: \begin{enumerate}

767: \item $\F'$ consists precisely of those events for which conditioning in

768: the LPS is defined; that is, $\F' = \{U:

769: %joe9

770: %\mu_\beta(U) \ne 0 \mbox{ for some } \mu_\beta \in \vecmu\}$.

771: \vecmu(U) > 0\}$.

772: \item For $U \in \F'$, $\mu(\cdot \mid U) = \mu'(\cdot \mid U)$, where

773: $\mu'$ is the first probability measure in the sequence $\vecmu|U$.

774: That is, the

775: Popper measure agrees with the most significant probability measure

776: in the conditional LPS given $U$.  Given that an LPS assigns to an event

777: $U$ a sequence of numbers and a Popper measure assigns to $U$ just a

778: single number, this is clearly the best single number to take.

779: \end{enumerate}

780: %It seems that these are minimal properties that an

781: %isomorphism should satisfy.  Moreover,

782: It is clear that these two properties in fact characterize $\FCP$.

783: Thus, $\FCP$ preserves the events on which conditioning is possible and

784: the most significant term in the lexicographic probability.

785:

786: \subsection{The finite case}

787: It is useful to separate the analysis of $\FCP$ into two cases, depending

788: on whether or not the state space is finite.  I consider the finite case

789: first.

790:

791: BBD claim without proof that $\FCP$ is a bijection

792: from LCPS's to

793: conditional probability spaces.  They work in finite spaces $W$ (so that

794: LCPS's are equivalent to SLPS's) and restrict

795: attention to LPS's where $\F

796: %joe4

797: %= 2^W$ and $\F' = 2^W - \emptyset$ (so that conditioning is defined for

798: = 2^W$ and $\F' = 2^W - \{\emptyset\}$ (so that conditioning is defined for

799: all nonempty sets).  Since $\F' = 2^W - \{\emptyset\}$, the cps's they

800: consider are all Popper spaces.

801: Hammond \citeyear{Hammond94} provides a careful proof of this result,

802: under the restrictions considered by BBD.

803: I generalize Hammond's result by considering

804: %joe9

805: %arbitrary finite Popper spaces

806: finite Popper spaces

807: with arbitrary conditioning events.

808: No new conceptual issues arise in doing this extension; I

809: include it here only to be able to contrast it with the other

810: results.

811: %Hammond's result holds for arbitrary finite Popper spaces, with

812: %essentially no change in proof.

813:

Let $\SLPS(W,\F)$ denote the set of SLPS's over $(W,\F)$; let
$\SLPS(W,\F,\F')$ denote the set of SLPS's $(W,\F,\vecmu)$ such that

816: $\vecmu(U) > 0$ for all $U \in \F'$ (i.e., $\mu_\beta(U) > 0$ for some

817: $\beta$); as usual, I use a superscript $c$ to denote countable

818: additivity, so, for example, $\SLPS^c(W,\F)$ denotes the set of

819: countably additive SLPS's over $(W,\F)$.

820: Let $\Popper(W,\F,\F')$ denote the set of Popper spaces of the form

$(W,\F,\F',\mu)$ and let  $\Popper^c(W,\F,\F')$

822: denote the set of Popper spaces of the form

823: $(W,\F,\F',\mu)$ where $\mu$ is countably additive.

824:

825:

826: \thm\label{FCPfin}

827: %joe9

828: %If $W$ is finite, then $\FCP$ is a bijection

829: %from $\SLPS(W,\F)$ to $\Popper(W,\F)$.  \ethm

830: If $W$ is finite, then

831: $\FCP$ is a bijection from $\SLPS(W,\F,\F')$ to $\Popper(W,\F,\F')$.  \ethm

832:

833: \prf It is immediate from the definition that if $(W,\F,\vecmu) \in

834: \SLPS(W,\F,\F')$, then $\FCP(W,\F,\vecmu) \in \Popper(W,\F,\F')$.  It is

835: also straightforward to show that $\FCP$ is an injection (see the

836: appendix for details).  The work comes in showing that $\FCP$ is a

837: surjection (or, equivalently, in constructing an inverse to $\FCP$).

838: I sketch the main ideas of the argument here, leaving details to the

839: appendix.

840:

Given $\mu \in \Popper(W,\F,\F')$, the idea is to choose $k < |W|$ and $k+1$

842: disjoint sets $U_0, \ldots, U_k \in \F'$ appropriately such that $\mu_j

843: = \mu \mid U_j$ for $j= 0, \ldots, k$ (i.e., $\mu_j(V) = \mu(V \mid

U_j)$) and $\FCP(W,\F,\vecmu) = \mu$.  Since the sets $U_0, \ldots, U_k$

845: are disjoint, $\vecmu$ must be an SLPS.  The difficulty lies in choosing

846: $U_0, \ldots, U_k$ so that $\vecmu(U) > 0$ iff $U \in \F'$.  This is

847: done as follows.  Let $U_0$ be the smallest set $U \in \F$ such that

$\mu(U \mid W) = 1$.

849: Since $W$ is finite, there is such a smallest set; it is simply the

850: intersection of all sets $U$ such that $\mu(U \mid W) = 1$.  Since $\mu(U_0

851: \mid W) > 0$, it follows that $U_0 \in \F'$.  If $\overline{U}_0 \notin

\F'$, then (because $\F'$ is closed under supersets in $\F$), no

853: subset of $\overline{U}_0$ is in $\F'$.  If $\overline{U}_0 \in

854: \F'$, let $U_1$ be the smallest set in

855: $\F$ such that $\mu(U_1 \mid \overline{U}_0) = 1$.  Note that $U_1

856: \subseteq \overline{U}_0$ and that $U_1 \in \F'$.

857: Continuing in this way, it is

858: clear that there exists a $k \ge 0$ and a sequence of pairwise disjoint

859: sets $U_0, U_1, \ldots, U_k$ such that (1) $U_i \in \F'$ for $i = 0,

860: \ldots, k$,  (2) for $i < k$, $\overline{U_0 \union \ldots \union U_i}

\in \F'$ and $U_{i+1}$ is the smallest set in $\F$ such that

862: $\mu(U_{i+1} \mid \overline{U_0 \union \ldots \union U_i}) = 1$, and (3)

863: $\overline{U_0 \union \ldots \union U_k} \notin \F'$.

864: Condition (2) guarantees that $U_{i+1}$ is a subset of

865: $\overline{U_0 \union \ldots \union U_i}$, so the $U_i$'s are pairwise

disjoint.  Define the LPS $\vecmu = (\mu_0, \ldots, \mu_k)$ by taking

867: $\mu_i(V) = \mu(V \mid U_i)$.  Clearly the support of $\mu_i$ is $U_i$, so

868: this is an LCPS (and hence an SLPS).  \eprf
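To see the construction at work on a small case (an illustrative Popper
space chosen here, not one from the original argument), let $W =
\{w_1,w_2,w_3\}$, $\F = 2^W$, $\F' = 2^W - \{\emptyset\}$, and define
$\mu(V \mid U)$ to be 1 if $w_1 \in U \inter V$, 0 if $w_1 \in U - V$,
and $|V \inter U|/|U|$ if $w_1 \notin U$.  (One can check that this is a
Popper space.)  The smallest set $U$ with $\mu(U \mid W) = 1$ is $U_0 =
\{w_1\}$.  Since $\overline{U}_0 = \{w_2,w_3\} \in \F'$ and $\mu(\cdot
\mid \overline{U}_0)$ is uniform on $\{w_2,w_3\}$, the smallest set $U$
with $\mu(U \mid \overline{U}_0) = 1$ is $U_1 = \{w_2,w_3\}$.  Now
$$\overline{U_0 \union U_1} = \emptyset \notin \F',$$
so the construction stops with $k = 1$ and yields $\vecmu = (\mu(\cdot
\mid U_0), \mu(\cdot \mid U_1))$, where the first measure puts
probability 1 on $w_1$ and the second is uniform on $\{w_2,w_3\}$.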

869:

870: %%joetark

871: %\thm\label{FCPfin} {\rm \cite{Hammond94}} If $W$ is finite, then $\FCP$

872: %is an isomorphism from $\SLPS(W,\F)$

873: %to $\Popper(W,\F)$.  \ethm

874:

875: \cor\label{FCPfin1}

876: If $W$ is finite, then $\FCP$ is a bijection

877: from $\SLPS(W,\F)$ to $\Popper(W,\F)$.  \ecor

878:

879:

880: %That is, Popper spaces are strictly more general than SLPS's in the case

881: %of infinite spaces where the probability measures are not necessarily

882: %countably additive.  On the other hand, I show that $\FCP$ is an

883: %isomorphism from countably additive SLPS's to countable

884: %additive Popper spaces.

885:

886:

887: %\subsection{Technical results}

888:

889:

890:

891: %\subsection{Infinite State Spaces without Countability}

892: \subsection{The infinite case}

893:

894: The case where the state space $W$ is infinite is not considered

895: by either BBD or Hammond.  It presents some interesting subtleties.

896:

897: It is easy to see that $\FCP$ is an injection from

898: %joe9

899: %from SLPS's to Popper spaces.  However,

900: SLPS's to Popper spaces.  However,

901: %joetark: added next line

902: as the following two examples show, if we do not require countable

903: additivity,  then it is not a bijection.

904:

905:

906:

907: \xam\label{counter1}  (This example is essentially due to Robert

908: Stalnaker [private communication, 2000].)  Let $W = \IN$, the natural

909: numbers, let $\F$ consist of the finite and cofinite subsets of $\IN$

910: %joe6

911: (recall that a cofinite set is the complement of a finite set),

912: and let $\F' = \F - \{\emptyset\}$.  If $U$ is cofinite,

913: take $\mu^1(V \mid U)$  to be 1 if $V$ is cofinite and 0 if $V$ is finite.

914: If $U$ is finite, define $\mu^1(V \mid U) = |V \inter

915: U|/|U|$.  I leave it to the reader to check that $(\IN,\F,\F',\mu^1)$ is a

916: %joe9

917: %Popper space.  Suppose there were some LPS $(\IN,\F,\vecmu)$ which was

918: Popper space.  Note that $\mu^1$ is not countably additive (since

919: $\mu^1(\{i\} \mid \IN) = 0$ for all $i$, although $\mu^1(\IN \mid \IN) =

920: 1$).

921: Suppose that there were some LPS $(\IN,\F,\vecmu)$ which was

922: mapped by $\FCP$ to this Popper space.  Then it is easy to check that

923: if $\mu_i$ is the first measure in $\vecmu$ such that $\mu_i(U) > 0$ for

924: some finite set $U$, then $\mu_i(U') > 0$ for all nonempty finite sets $U'$.

925: To see this, note that for any nonempty finite set $U'$, since

926: $\mu_i(U) > 0$, it follows that $\mu_i(U \union U') > 0$.  Since $U

927: \union U'$ is finite, it must be the case that $\mu_i$ is the first

928: measure in $\vecmu$ such that $\mu_i(U \union U') > 0$.  Thus, by

929: definition, $\mu^1(U' \mid U \union U') = \mu_i(U' \mid U \union U')$.  Since

930: $\mu^1(U' \mid U \union U') > 0$, it follows that $\mu_i(U') > 0$.

931: Thus, $\mu_i(U') > 0$ for all nonempty finite sets $U'$.

932:

933: It is also easy to see that $\mu_i(U)$ must be proportional

934: to $|U|$ for all finite sets $U$.  To show this, it clearly suffices to show

935: that $\mu_i(n) = \mu_i(0)$ for all $n \in \IN$.  But this is immediate

936: from the observation that

937: $$\mu_i(\{0\} \mid \{0, n \}) = \mu^1(\{0\} \mid \{0, n \}) =

938: |\{0\}|/|\{0,n\}| = \frac{1}{2}.$$

939: But there is no probability measure $\mu_i$ on the natural

940: numbers such that $\mu_i(n) = \mu_i(0) > 0$ for all $n \ge 0$.

941: %For, by countable additivity, if

942: %$\mu_i(0) = 0$ then $\mu_i(\IN) = 0$ and if $\mu_i(0) > 0$, then

943: %$\mu_i(\IN) = \infty$.

944: For if $\mu_i(0) > 1/N$, then $\mu_i(\{0, \ldots, N-1\}) > 1$, a

945: contradiction.

946: (See Example~\ref{counter3} for further discussion of this setup.)

947: \exam

948:

949: %joetark

950: %\commentout{

951: \xam\label{counter2}  Again, let $W = \IN$,

952: let $\F$ consist of the finite and cofinite subsets of $\IN$,

953: and let $\F' = \F - \{\emptyset\}$.  As with $\mu^1$,

954: if $U$ is cofinite,

955: take $\mu^2(V \mid U)$  to be 1 if $V$ is cofinite and 0 if $V$ is finite.

956: However, now, if $U$ is finite, define $\mu^2(V \mid U) =

957: 1$ if $\max(V \inter U) = \max U$, and $\mu^2(V \mid U) = 0$ otherwise.

958: Intuitively, if $n > n'$, then $n$ is infinitely more probable than $n'$

959: according to $\mu^2$.

960: Again, I leave it to the reader to check that $(\IN,\F,\F',\mu^2)$ is a

961: Popper space.  Suppose there were some LPS $(\IN,\F,\vecmu)$ which was

962: mapped by $\FCP$ to this Popper space.  Then it is easy to check that

963: if $\mu_n$ is the first measure in $\vecmu$ such that $\mu_n(\{n\}) > 0$, then

964: $\mu_n$ comes before $\mu_{n'}$ in $\vecmu$ if $n > n'$.  However, since

965: $\vecmu$ is well-founded, this is impossible.

966: \exam
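The step left to the reader in Example~\ref{counter2} can be sketched as
follows (a verification under the definitions of the example).  Suppose
that $n > n'$ and let $\mu_j$ be the first measure in $\vecmu$ such that
$\mu_j(\{n,n'\}) > 0$.  Since $\max \{n,n'\} = n$,
$$\mu_j(\{n\} \mid \{n,n'\}) = \mu^2(\{n\} \mid \{n,n'\}) = 1 \quad
\mbox{and} \quad
\mu_j(\{n'\} \mid \{n,n'\}) = \mu^2(\{n'\} \mid \{n,n'\}) = 0,$$
so $\mu_j(\{n\}) > 0$ and $\mu_j(\{n'\}) = 0$.  No earlier measure gives
$\{n\}$ positive probability (it would then give $\{n,n'\}$ positive
probability), so $\mu_j$ is the first measure that gives $\{n\}$ positive
probability, and the first measure that gives $\{n'\}$ positive
probability comes strictly later.  Applying this to $0 < 1 < 2 < \ldots$
gives an infinite strictly decreasing sequence of indices, which cannot
exist in a well-ordered index set.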

967:

968: %joetark:

969: %As the following theorem shows, there are no such counterexamples if we

970: As the following theorem,

971: %joe6

972: originally

973: proved by Spohn \citeyear{Spohn86}, shows,

974: there are no such counterexamples if we

975: restrict to countably additive SLPS's and countably additive Popper spaces.

976:

977:

978:

979: \thm\label{infiso} {\rm \cite{Spohn86}}

980: For all $W$, the map $\FCP$ is a bijection from

981: $\SLPS^c(W,\F,\F')$

982: to $\Popper^c(W,\F,\F')$.  \ethm

983:

984: \prf Again, the difficulty comes in showing that $\FCP$ is onto.  Given

985: a Popper space $(W,\F,\F',\mu)$, I again construct sets $U_0, U_1,

986: \ldots$ and an LPS $\vecmu$ such that $\mu_\beta(V)=\mu(V \mid

987: U_\beta)$, and show that $\FCP(W,\F,\vecmu) = (W,\F,\F',\mu)$.

988: However, now a completely different construction is required; the

989: earlier inductive construction of the

990: sequence $U_0, \ldots, U_k$ no longer works.  The problem already arises

991: in the construction of $U_0$.  There may no longer be a smallest set

992: $U_0$ such that $\mu(U_0) = 1$.  Consider, for example, the interval

993: $[0,1]$ with Borel measure.  There is clearly no smallest subset $U$ of

994: $[0,1]$ such that $\mu(U) = 1$.  The details can be found in the appendix.

995: \eprf

996:

997:

998: \cor\label{infiso1} For all $W$, the map $\FCP$ is a bijection from

999: $\SLPS^c(W,\F)$

1000: to $\Popper^c(W,\F)$.

1001: \ecor

1002:

1003: It is important in Corollary~\ref{infiso1} that we consider SLPS's and not

1004: %joe9

1005: %LCPS's.  $\FCP$ is in fact not a bijection from LCPS's to Popper

1006: MSLPS's or LCPS's.  $\FCP$ is in fact not a bijection from MSLPS's or

1007: LCPS's to Popper

1008: spaces.

1009:

1010: \xam\label{SLPSxam2} Consider the Popper space $([0,1],\F,\F',\mu)$

1011: which is the image under $\FCP$ of the SLPS constructed in

1012: Example~\ref{SLPSxam}.  It is easy to see that this Popper space cannot

1013: %joe9

1014: %be the image under $\FCP$ of some LCPS. \exam

be the image under $\FCP$ of some MSLPS (and hence not of some LCPS

1016: either). \exam

1017:

1018: %joe9:  moved here

1019: \subsection{Treelike CPS's}\label{sec:BS}

1020:

1021: %joe10

1022: %One of the ``minimal'' requirements for $\F \times \F'$ to be a Popper

1023: %algebra over $W$

1024: One of the requirements in a Popper space is that

1025: $\F'$ be closed under supersets in $\F$.  If we

1026: think of $\F'$ as consisting of all sets on which conditioning is

1027: possible, this makes sense; if we can condition on a set $U$, we should

be able to condition on a superset $V$ of $U$.  But if we think of $\F'$

1029: as representing all the possible evidence that can be obtained (and

thus, the set of events on which an agent must be able to

1031: condition, so as to update her beliefs), there is no reason that $\F'$

1032: should be closed under supersets; nor, for that matter, is it

1033: necessarily the case that if $U \in \F'$ and $\mu(V \mid U) \ne 0$, then

1034: $V \inter U \in \F'$.

1035: In general, a cps where $\F'$ does not have these properties

1036: cannot be represented by an LPS, as the following

1037: example shows.

1038:

1039: \xam\label{xam:noPopper}  Let $W = \{w_1, w_2, w_3, w_4\}$, let $\F$ consist

1040: of all subsets of $W$, and let $\F'$ consist of all the 2-element

1041: subsets of $W$.

1042: %joe10

1043: %Clearly $\F \times \F'$ is not a Popper algebra, since

1044: Clearly

1045: $\F'$ is not closed under supersets.  Define $\mu$ on $\F \times \F'$

1046: such that $\mu(w_1 \mid \{w_1,w_3\}) = \mu(w_4 \mid

1047: \{w_2,w_4\}) = 1/3$, and  $\mu(w_1 \mid \{w_1,w_2\}) = \mu(w_4 \mid

1048: \{w_3,w_4\}) =

1049: 1/2$, and CP1 and CP2 hold.  This is easily seen to determine $\mu$.

Moreover, $\mu$ vacuously satisfies CP3, since there do not exist

1051: distinct sets $U$ and $X$ in $\F'$ such that $U \subseteq X$.  It is

1052: easy to show that there is no unconditional probability $\mu^*$ on $W$

1053: such that $\mu^*(U \mid V) = \mu(U \mid V)$ for all pairs $(U,V) \in \F

1054: \times \F'$ such that $\mu^*(V) > 0$ (where, for $\mu^*$, the

1055: conditional probability is defined in the standard way).%

1056: \footnote{This example is closely related to examples of conditional

1057: probabilities for which there is no common prior; see, for example,

1058: \cite[Example 2.2]{Hal21}.}  It easily follows that there is no LPS

1059: $\vecmu$ such that $\vecmu(U \mid V) = \mu(U \mid V)$ for all $(U,V) \in

1060: \F \times \F'$ (since otherwise $\mu_0$ would agree with $\mu$ on all

pairs $(U,V) \in  \F \times \F'$ such that $\mu_0(V) > 0$).

1062: Had $\F'$ been closed under supersets, it would have included $W$.  It

1063: is easy to see that it is impossible to extend $\mu$ to $\F \times (\F'

1064: \union \{W\})$ so that CP3 holds.

1065: \exam
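The claim in Example~\ref{xam:noPopper} that no such $\mu^*$ exists can
be verified directly (a short derivation, writing $p_j = \mu^*(w_j)$, a
notation introduced here).  Some $p_j$ is positive, and each of the four
constraints defining $\mu$, when it applies to a set of positive
probability, forces both elements of that set to have positive
probability; since the sets $\{w_1,w_3\}$, $\{w_1,w_2\}$,
$\{w_2,w_4\}$, and $\{w_3,w_4\}$ link all four worlds, all four sets
have positive probability.  The constraints then become
$$\frac{p_1}{p_1 + p_3} = \frac{1}{3}, \ \ \ \frac{p_4}{p_2 + p_4} =
\frac{1}{3}, \ \ \ \frac{p_1}{p_1 + p_2} = \frac{1}{2}, \ \ \
\frac{p_4}{p_3 + p_4} = \frac{1}{2},$$
that is, $p_3 = 2p_1$, $p_2 = 2p_4$, $p_1 = p_2$, and $p_3 = p_4$.
Combining these gives $p_1 = p_2 = 2p_4 = 2p_3 = 4p_1$, so $p_1 = 0$ and
hence $p_2 = p_3 = p_4 = 0$, contradicting $\mu^*(W) = 1$.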

1066:

1067:

1068:

1069: In the game-theory literature, Battigalli and Siniscalchi

1070: \citeyear{BS02} use conditional probability measures to model players'

1071: beliefs about other players' strategies in

1072: extensive-form games where agents have perfect recall.  The conditioning

1073: events are essentially

information sets,

1075: %joe9

1076: %Thus, the cps's they consider are not necessarily Popper spaces.  The set

1077: %of conditioning events (i.e., the set $\F'$) may not be closed under

1078: %supersets nor does it necessarily satisfy the condition that if

1079: %$U \in \F'$ and $\mu(V \mid U) \ne 0$ then $V \inter U \in \F'$.

1080: which can be thought of as representing the possible evidence that an

agent can obtain in a game.

1082: Thus, the cps's they consider are not necessarily Popper spaces, for the

1083: reasons described above.

1084: Nevertheless, the conditioning events considered by Battigalli and

Siniscalchi satisfy certain properties that

1086: prevent an analogue of Example~\ref{xam:noPopper} from holding.

1087: I now make this precise.

1088:

1089: %joe7:

1090: Formally, I assume that there is a one-to-one

1091: correspondence between the sets in $\F'$ and the information sets of

1092: some fixed player $i$.  For each set $U \in \F'$, there is a unique

1093: information set $I_U$ for player $i$ such that $U$ consists of all the

1094: strategy profiles

1095: that reach $I_U$.  With this identification, it is immediate that we can

1096: organize the sets in $\F'$ into a forest (i.e., a collection of trees),

1097: with the same ``reachability'' structure as that of the information sets

1098: in the game tree.  The topmost sets in the forest are the ones

1099: corresponding to the topmost information

1100: sets for player $i$ in the game tree.  There may be several such topmost

1101: information sets if nature or some player $j$ other than $i$ makes the

1102: first move in the game.  (That is why we have a forest, rather than a

1103: tree.)  The immediate successors of a set $U$ are the sets of strategy

1104: profiles corresponding to information sets for player $i$ reached

1105: immediately after $I_U$.

1106: Because agents have perfect recall, the conditioning events $\F'$ have

1107: the following properties:

1108: \begin{itemize}

1109: \item[T1.] $\F'$ is countable.

1110: \item[T2.] The elements of $\F'$ can be organized as a forest (i.e., a collection of

1111: trees) where, for each $U \in \F'$, if there is an edge from $U$ to

1112: some $U' \in \F'$, then $U' \subseteq U$, all the immediate successors

1113: of  $U$ are

1114: disjoint, and $U$ is the union of its immediate successors.

1115: \item[T3.] The topmost nodes in each tree of the forest form a

1116: partition of $W$.

1117: \end{itemize}
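For instance (a toy case constructed here for illustration, not drawn
from a particular game), if $W = \{w_1,w_2,w_3,w_4\}$, then
$$\F' = \{\{w_1,w_2\}, \{w_3,w_4\}, \{w_1\}, \{w_2\}\}$$
satisfies T1--3: it is finite, hence countable; it forms a forest with
roots $\{w_1,w_2\}$ and $\{w_3,w_4\}$, where $\{w_1,w_2\}$ has the
disjoint immediate successors $\{w_1\}$ and $\{w_2\}$, whose union is
$\{w_1,w_2\}$; and the two roots partition $W$.  (Here the union
requirement in T2 is read as applying only to sets that have
successors.)  By contrast, the collection of all 2-element subsets of
$W$ used in Example~\ref{xam:noPopper} cannot be organized in this way,
since $\{w_1,w_2\}$ and $\{w_1,w_3\}$ overlap although neither contains
the other.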

1118:

1119: Say that a set $\F'$ is \emph{treelike} if it satisfies T1--3.

1120: It follows from T2 and T3 that, for any sets $U$ and $U'$ in a treelike

1121: set $\F'$, either $U \subseteq U'$ (if $U$ is a descendant of $U'$

1122: in some tree), $U' \subseteq

1123: U$ (if $U'$ is a descendant of $U$), or $U$ and $U'$ are disjoint (if

1124: neither is a descendant of the other).

1125: If $\F'$ is treelike, let $\T^c(W,\F,\F')$ consist of all countably

1126: additive cps's defined on

1127: $\F \times \F'$.

1128: %Let $\SLPS^c(W,\F,\F')$ consist of all SLPS's $\vecmu$

1129: %in $\SLPS^c(W,\F)$

1130: %such that $\vecmu(U) > \vec{0}$ for all $U \in \F'$

1131: %(i.e., $\mu_j(U) > 0$ for some $j$).

1132: I abuse notation

1133: in the next result, viewing $\FCP$ as a mapping from

1134: $\SLPS^c(W,\F,\F')$ to $\T^c(W,\F,\F')$.

1135:

1136:

1137: \pro\label{prop:BS} The map $\FCP$ is a surjection from

1138: $\SLPS^c(W,\F,\F')$ onto $\T^c(W,\F,\F')$.  \epro

1139:

1140: Since $\F'$ is countable, every SLPS in $\SLPS^c(W,\F,\F')$ must have at

1141: most countable length.  Thus, there is no distinction between SLPS's,

LCPS's, and MSLPS's in this case.  (Indeed, in the proof of

1143: Proposition~\ref{prop:BS}, the LPS constructed to demonstrate the

1144: surjection is an LCPS.)  Note that we cannot hope to get a bijection

1145: here, even if $W$ is finite.  For example, suppose that $W = \{w_1,

1146: w_2\}$, $\F = 2^W$, and

1147: %joe9

1148: %$\F' = \{\{w_1\}, \{w_2\}\}$.  $\F'$ is clearly treelike.  There

1149: $\F' = \{\{w_1\}, \{w_2\}\}$.  $\F'$ is clearly treelike, and there

1150: is a unique cps $\mu$ on $(W,\F,\F')$.  $\FCP$ maps

every SLPS in $\SLPS(W,\F,\F')$ to $\mu$, but is clearly not a

1152: bijection.  (This example also shows that we do not get a bijection by

1153: considering MSLPS's or LCPS's either.)
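Concretely (spelling out the example with measures named here for
illustration), let $\nu_1$ and $\nu_2$ be the measures that give
probability 1 to $w_1$ and $w_2$, respectively.  Both $(\nu_1,\nu_2)$
and $(\nu_2,\nu_1)$ are SLPS's (indeed LCPS's) that give positive
probability to each set in $\F'$, and they are distinct, yet both are
mapped by $\FCP$ to the cps $\mu$ with
$$\mu(\{w_1\} \mid \{w_1\}) = 1 \ \ \mbox{ and } \ \ \mu(\{w_2\} \mid
\{w_2\}) = 1,$$
since conditioning any measure on a singleton of positive probability
gives that singleton probability 1.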

1154:

1155: %joetark:

1156: \subsection{Related Work}\label{related}

1157:

1158: It is interesting to contrast these results to those of R\'{e}nyi

1159: \citeyear{Renyi56} and van Fraassen \citeyear{vF76}.  R\'{e}nyi considers

1160: what he calls {\em dimensionally ordered\/} systems.

1161: A dimensionally ordered system over $(W,\F)$ has the form

1162: $(W,\F,\F',\{\mu_i: i \in I\})$, where $\F$ is an algebra of

1163: subsets of $W$, $\F'$ is a subset of $\F$ closed under finite unions

1164: %joe9

1165: (but not necessarily closed under supersets in $\F$),

1166: $I$ is a totally ordered set (but not necessarily well-founded, so it

1167: may not, for example, have a first element) and $\mu_i$ is a measure on

1168: $(W,\F)$ (not necessarily a probability measure) such that

1169: \begin{itemize}

1170: \item for each $U \in \F'$, there is some $i \in I$ such that $0 <

1171: \mu_i(U) < \infty$ (note that the measure of a set may, in general, be

1172: $\infty$),

1173: \item if $\mu_i(U) < \infty$ and $j < i$, then $\mu_j(U) = 0$.

1174: \end{itemize}

1175: Note that it follows from these conditions that for each $U \in \F'$,

1176: there is exactly one $i \in I$ such that $0 < \mu_i(U) < \infty$.

1177:

1178:

1179: There is an obvious analogue of the map $\FCP$ mapping dimensionally

1180: ordered systems to cps's.  Namely, let $\FDC$ map the dimensionally

1181: ordered system $(W,\F,\F',\{\mu_i: i \in I\})$ to the cps

$(W,\F,\F',\mu)$, where $\mu(V \mid U) = \mu_i(V \mid U)$ and $i$ is the unique

1183: element of $I$ such that $0 < \mu_i(U) < \infty$.  R\'{e}nyi shows that

1184: $\FDC$ is a bijection from dimensionally ordered systems to cps's

1185: where the set $\F'$ is closed under finite unions.  (Cs\'{a}sz\'{a}r

1186: \citeyear{Csaszar55} extends this result to cases where the set $\F'$ is

1187: not necessarily closed under finite unions.)

1188: R\'{e}nyi assumes that all measures involved are countably additive and

1189: that $\F$ is a $\sigma$-algebra, but these are inessential assumptions.

1190: That is, his proof goes through without change if $\F$ is an algebra and

1191: the measures are additive; all that happens is that the resulting

1192: conditional probability measure is additive rather than

1193: $\sigma$-additive.

1194:

1195: It is critical in R\'{e}nyi's framework that the $\mu_i$'s are arbitrary

1196: measures, and not just probability measures.  His result does not hold

1197: if the $\mu_i$'s are required to be probability measures. In the case of

1198: finitely additive measures, the Popper space constructed in

1199: Example~\ref{counter1} already shows why.  It corresponds to a

1200: dimensionally ordered space $(\mu_1,\mu_2)$ where $\mu_1(U)$ is 1 if $U$ is

1201: cofinite and 0 if $U$ is finite and $\mu_2(U) = |U|$ (\ie

1202: the measure of a set is its cardinality).  It cannot be captured by a

1203: dimensionally ordered space where all the elements are probability

1204: measures, for the same reason that it is not the image of an SLPS under

1205: $\FCP$.  (R\'{e}nyi \citeyear{Renyi56} actually provides a

1206: general characterization of when the $\mu_i$'s can be taken to be

1207: (countably additive) probability measures.)

1208: %joe4: removed paragraph break

1209: Another example is provided by the Popper space considered in

1210: Example~\ref{counter2}.  This corresponds to the dimensionally ordered

1211: system $\{\mu_\beta: \beta \in \IN \union \{\infty\}\}$, where

1212: $$

1213: \mu_n(U) =

1214: \left \{ \begin{array}{ll}

1215: 0 &\mbox{if $\max(U) < n$}\\

1216: 1 &\mbox{if $\max(U) = n$}\\

1217: \infty &\mbox{if $\max(U) > n$},

1218: \end{array} \right.

1219: $$

1220: where $\max(U)$ is taken to be  $\infty$ if $U$ is cofinite.

1221:

1222: %joe4

1223: Krauss \citeyear{Kr68} restricts to Popper algebras of the form $\F

1224: %joe9

1225: %\times \F'-\{\emptyset\}$; this allows him to simplify and generalize

1226: \times (\F-\{\emptyset\})$; this allows him to simplify and generalize

1227: R\'{e}nyi's analysis.  Interestingly, he also proves a representation

1228: theorem in the spirit of R\'{e}nyi's that involves  nonstandard

1229: probability.

1230:

1231: Van Fraassen \citeyear{vF76} proves a

1232: result whose assumptions are somewhat closer to Theorem~\ref{infiso}.

1233: Van Fraassen considers what he calls  {\em ordinal families of

1234: probability measures}.  An ordinal family over $(W,\F)$ is a sequence of

1235: the form $\{(W_\beta,\F_\beta,\mu_\beta): \beta < \alpha\}$ such that

1236: \begin{itemize}

1237: \item $\union_{\beta < \alpha} W_\beta =  W$;

1238: \item $\F_\beta$ is an algebra over $W_\beta$;

1239: \item $\mu_\beta$ is a probability measure with domain $\F_\beta$;

1240: \item $\union_{\beta < \alpha} \F_\beta = \F$;

1241: \item if $U \in \F$ and $V \in \F_\beta$, then $U \inter V \in \F_\beta$;

1242: \item if $U \in \F$, $U \inter V \in \F_\beta$, and $\mu_\beta(U \inter V) >

1243: 0$, then there exists $\gamma$ such that $U \in \F_\gamma$ and

1244: $\mu_\gamma(U) > 0$.

1245: \end{itemize}

1246:

1247: Given an ordinal family $\{(W_\beta,\F_\beta,\mu_\beta): \beta < \alpha\}$ over

1248: $(W,\F)$, consider the map $\FOP$ which associates with it the cps

1249: $(W,\F,\F',\mu)$, where $\F' = \{U \in \F:

1250: \mu_\gamma(U) > 0 \mbox{ for some } \gamma < \alpha\}$ and $\mu(V \mid U) =

1251: \mu_\beta(V \mid U)$, where $\beta$ is the smallest ordinal such that $U \in

1252: \F_\beta$ and $\mu_\beta(U) > 0$.

1253: Van Fraassen shows that $\FOP$ is a bijection from ordinal families

1254: over $(W,\F)$ to Popper spaces over $(W,\F)$.  Again, for van Fraassen,

1255: countable additivity does not play a significant role.  If $\F$ is a

1256: $\sigma$-algebra, a {\em countably additive\/} ordinal family over

1257: $(W,\F)$ is defined just as an ordinal family, except that now

1258: $\F_\beta$ is a $\sigma$-algebra over $W_\beta$ for all

1259: $\beta < \alpha$, $\mu_\beta$ is a countably additive probability

1260: measure, and $\F$ is

1261: the least $\sigma$-algebra containing $\union_{\beta <

1262: \alpha} \F_\beta$ (since $\union_{\beta < \alpha} \F_\beta$ is not in

1263: general a $\sigma$-algebra).

1264: The same map

1265: $\FOP$ is also a bijection from countably additive ordinal families

1266: to countably additive  Popper spaces.

1267:

1268: Spohn's result, Theorem~\ref{infiso}, can be viewed as a

1269: strengthening of van Fraassen's result in the countably additive case,

1270: since for Theorem~\ref{infiso} all the $\F_\beta$'s are

1271: required to be identical.  This is a nontrivial requirement.  The fact

1272: that it cannot be met in the case that $W$ is infinite and the measures

1273: are not countably additive is an indication of this.

1274:

1275: It is worth seeing  how van Fraassen's approach handles the finitely

1276: additive examples which do not correspond to SLPS's.

1277: The Popper space in Example~\ref{counter1} corresponds to the ordinal

1278: family $\{(W_n,\F_n,\mu_n): n \le \omega\}$ where, for $n < \omega$,

1279: $W_n = \{1, \ldots, n\}$, $\F_n$ consists of all subsets of $W_n$, and

1280: $\mu_n$ is the uniform measure, while $W_\omega = \IN$, $\F_\omega$

1281: consists of the finite and cofinite subsets of $\IN$, and $\mu_\omega(U)$ is 1

1282: if $U$ is cofinite and 0 if $U$ is finite.  It is easy to check that

1283: this ordinal family has the desired properties.

1284: The Popper space in

1285: Example~\ref{counter2} is represented in a similar way, using the

1286: ordinal family $\{(W_n,\F_n,\mu_n'): n \le \omega\}$, where $\mu_n'(U)$

1287: is 1 if $n \in U$ and 0 otherwise, while $\mu_\omega' = \mu_\omega$.  I

1288: leave it to the reader to see that this family has the desired

1289: properties.

1290: The key point to observe here is the leverage obtained by

1291: allowing each probability measure to have a different domain.

1292:
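To illustrate how the map $\FOP$ operates on the first of these
families (the particular sets below are chosen only for concreteness),
take $U = \{2,5\}$ and $V = \{2\}$.  The set $U$ belongs to $\F_n$
exactly when $5 \le n \le \omega$, and $\mu_5(U) = 2/5 > 0$, so the
smallest index $\beta$ with $U \in \F_\beta$ and $\mu_\beta(U) > 0$ is
$\beta = 5$.  Hence
$$
\mu(V \mid U) = \mu_5(V \mid U) = \frac{\mu_5(V \inter U)}{\mu_5(U)} =
\frac{1/5}{2/5} = \frac{1}{2}.
$$
If instead $U$ is cofinite, then $U \in \F_\beta$ only for $\beta =
\omega$, and $\mu(V \mid U) = \mu_\omega(V \inter U)/\mu_\omega(U) =
\mu_\omega(V \inter U)$, which is 1 if $V$ is cofinite and 0 if $V$ is
finite.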

1293:

1294:

1295: \section{Relating LPS's to NPS's}\label{LPSNPS}

1296:

1297: In this section, I show that LPS's and NPS's are

1298: isomorphic in a strong sense.

1299: Again, I separate the results for the finite case and the infinite case.

1300:

1301: \subsection{The finite case}

1302: Consider an LPS of the form $(\mu_1, \mu_2,\mu_3)$.  Roughly speaking,

1303: the corresponding NPS should be $(1 - \epsilon - \epsilon^2) \mu_1 +

1304: \epsilon \mu_2 + \epsilon^2 \mu_3$, where $\epsilon$ is some

1305: infinitesimal.  That means that $\mu_2$ gets infinitesimal weight

1306: relative to $\mu_1$ and $\mu_3$ gets infinitesimal weight relative to

1307: $\mu_2$.  But which infinitesimal $\epsilon$ should be

1308: chosen?  Intuitively, it shouldn't matter.  No matter which

1309: infinitesimal is chosen, the resulting NPS should be equivalent to the

1310: %joe2

1311: %original LPS.  How can we make this intuition precise?

1312: %joe6

1313: %original LPS.  I now make this intuition precise?

1314: original LPS.  I now make this intuition precise.

1315:

1316:

1317: Suppose that we want to use an LPS or an NPS to compute which of two

1318: bounded, {\em real-valued\/} random variables has higher expected value.%

1319: %{\em real-valued\/} random variables with finite range has higher

1320: %expected value.

1321: %joe2

1322: The intended

1323: application here is decision making, where the random variables can be

1324: thought of as the utilities corresponding to two actions; the one with

1325: higher expected utility is preferred.

1326: The idea is that two measures of

1327: uncertainty (each of which can be an LPS or an NPS) are equivalent if

1328: the preference order they place on (real-valued) random variables

1329: (according to their expected value) is the same.

1330: %joe6

1331: I consider only random variables with countable range.  This restriction

1332: both makes the

1333: exposition simpler and avoids having to define, for example, integration

1334: with respect to an NPS.  Note that, given an LPS $\vecmu$, the

1335: expected value of a random variable $X$ is $\sum_x x \vecmu(X=x)$, where

1336: $\vecmu(X=x)$ is a sequence of probability values and the multiplication

1337: and addition are pointwise.  Thus, the expected value is a sequence;

1338: these sequences can be compared using the lexicographic order $<_L$

1339: defined in Section~\ref{LPSdef}.  If $\nu$ is either an LPS or NPS,

1340: then let $E_\nu(X)$ denote the expected value of random variable $X$

1341: according to $\nu$.

1342:
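As a small worked instance of this definition (the two-world space and
the particular numbers are assumptions made only for illustration), let
$W = \{w_1,w_2\}$ with all sets measurable, and let $\vecmu =
(\mu_0,\mu_1)$, where $\mu_0(w_1) = 1$ and $\mu_1(w_2) = 1$.  Suppose
that $X(w_1) = 1$, $X(w_2) = 0$, $Y(w_1) = 1$, and $Y(w_2) = 2$.  Then
$$
E_{\vecmu}(X) = 1 \cdot (1,0) + 0 \cdot (0,1) = (1,0), \quad
E_{\vecmu}(Y) = 1 \cdot (1,0) + 2 \cdot (0,1) = (1,2).
$$
The two sequences agree in the first component and $0 < 2$ in the
second, so $E_{\vecmu}(X) <_L E_{\vecmu}(Y)$; the measure $\mu_1$ is
consulted only to break the tie left by $\mu_0$.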

1343: \dfn\label{aeq} If each of $\nu_1$ and $\nu_2$ is either an NPS over

1344: $(W,\F)$ or an

1345: LPS over $(W,\F)$, then $\nu_1$ is {\em equivalent to\/} $\nu_2$,

1346: denoted $\nu_1 \aeq \nu_2$, if, for all real-valued random variables $X$

1347: and $Y$

1348: measurable with respect to $\F$,  $E_{\nu_1}(X) \le E_{\nu_1}(Y)$ iff

1349: $E_{\nu_2}(X) \le E_{\nu_2}(Y)$.

1350: %joe2

1351: %(As usual, $X$ is said to be measurable with respect to $\F$ if

1352: (If $X$ has countable range, which is the only case I consider here, then

1353: $X$ is measurable with respect to $\F$ iff $\{w: X(w) = x\} \in \F$ for

1354: all $x$ in the range of $X$.)%

1355: %joe3

1356: \footnote{As pointed out by Adam Brandenburger and Eddie Dekel,

1357: this notion of equivalence is essentially the same as one

1358: implicitly used by BBD.  They work with preference orders on

1359: Anscombe-Aumann acts \cite{AA63}, that is, functions from states to

1360: probability measures on prizes.  Fix a utility function $u$ on prizes.  Then

1361: take $\nu_1 \sim_u \nu_2$ if the preference order on acts generated by

1362: $\nu_1$ and $u$ is the same as that generated by $\nu_2$ and $u$.

1363: It is not hard to show that this notion of equivalence is independent of

1364: the choice of utility function; if $u$ and $u'$ are two utility

1365: functions on prizes, then $\nu_1 \sim_u \nu_2$ iff $\nu_1 \sim_{u'}

1366: \nu_2$.  Moreover, $\nu_1 \sim_u \nu_2$ iff $\nu_1 \aeq \nu_2$.

1367: The advantage of the notion of equivalence used here is that it is

1368: defined without the overhead of preference orders on acts.}

1369: %if $X^{-1}(A) \in \F$ for all Borel sets $A$.

1370: \edfn

1371:

1372: %joe: make this precise (changed ``lemma'' to ``proposition'' already)

1373: %In a precise sense, this notion of equivalence is stronger than that

1374: %provided by the map $\FCP$ of Section~\ref{FCP}, as the following

1375: %proposition shows.

1376: This notion of equivalence satisfies analogues of the two key

1377: properties of the map $\FCP$ considered at the beginning of Section~\ref{FCP}.

1378: \pro\label{FCPaeq}

1379: If $\nu \in \NPS(W,\F)$, $\vecmu \in \LPS(W,\F)$, and

1380: $\nu \aeq \vecmu$, then $\nu(U) > 0$ iff $\vecmu(U) > \vec{0}$.

1381: Moreover, if $\nu(U) > 0$, then $\stand{\nu(V \mid U)} = \mu_j(V \mid U)$, where

1382: $\mu_j$ is the first probability measure in $\vecmu$ such that $\mu_j(U)

1383: > 0$. \epro

1384:
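For a concrete check of Proposition~\ref{FCPaeq} (the space and measures
are illustrative assumptions), let $W = \{w_1,w_2,w_3\}$ with all sets
measurable, let $\vecmu = (\mu_0,\mu_1)$ where $\mu_0(w_1) = 1$ and
$\mu_1(w_2) = \mu_1(w_3) = 1/2$, and let $\nu = (1-\epsilon)\mu_0 +
\epsilon \mu_1$ for an infinitesimal $\epsilon$; that $\nu \aeq \vecmu$
for an NPS of this form is shown in the proof of
Theorem~\ref{lpsnps}.  Take $U = \{w_2,w_3\}$ and $V = \{w_2\}$.  Then
$\vecmu(U) = (0,1) > \vec{0}$ and $\nu(U) = \epsilon > 0$, as the first
part requires.  Moreover, the first measure in $\vecmu$ that gives $U$
positive probability is $\mu_1$, and
$$
\stand{\nu(V \mid U)} = \stand{\frac{\epsilon/2}{\epsilon}} = \frac{1}{2}
= \mu_1(V \mid U).
$$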

1385:

1386: As the next result shows,

1387: %joe9

1388: %for structured LPS's, the $\aeq$-equivalence classes

1389: for SLPS's, the $\aeq$-equivalence classes

1390: %joe9

1391: %are singletons. (This is not true for LPS's in general.

1392: are singletons, even if the set of worlds is infinite. (This is not true

1393: for LPS's in general.

1394: For example, $(\mu,\mu) \aeq (\mu)$.)  This can be viewed as providing

1395: more motivation for the use of SLPS's.

1396:

1397:

1398: \pro\label{motivation} If $\vecmu, \vecmu' \in \SLPS(W,\F)$, then

1399: $\vecmu \aeq \vecmu'$

1400: iff $\vecmu = \vecmu'$.

1401: \epro

1402:

1403:

1404:

1405: The next result justifies restricting to finite LPS's if the state

1406: space is finite.

1407: Given an algebra $\F$, let $\Bas(\F)$ consist of the

1408: {\em basic sets\/} in $\F$, that is, the nonempty sets in $\F$ that themselves

1409: contain no nonempty proper subsets in $\F$.  Clearly the sets in $\Bas(\F)$ are

1410: disjoint, so that $|\Bas(\F)| \le |W|$.  If all sets are measurable, then

1411: $\Bas(\F)$ consists of the singleton subsets of $W$.  If $W$ is finite,

1412: it is easy to see that all sets in $\F$ are finite unions of the sets in

1413: $\Bas(\F)$.

1414:
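For instance (an illustrative choice of algebra), if $W = \{1,2,3,4\}$
and $\F$ is the algebra generated by $\{1,2\}$, $\{3\}$, and $\{4\}$,
then
$$
\Bas(\F) = \{\{1,2\}, \{3\}, \{4\}\},
$$
so $|\Bas(\F)| = 3 < |W| = 4$.  Every set in $\F$ is a union of some of
these three sets, so $\F$ has $2^3 = 8$ elements.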

1415: \pro\label{finiteeq} If $W$ is finite, then every LPS over $(W,\F)$ is

1416: equivalent to an LPS of length at most $|\Bas(\F)|$. \epro

1417:

1418:

1419: %joe2

1420: %I can now define the isomorphism that relates NPS's

1421: I can now define the bijection that relates NPS's

1422: and LPS's.  Given $(W,\F)$, let $\LPS(W,\F)/\naeq$ be the equivalence

1423: classes of $\aeq$-equivalent LPS's over $(W,\F)$; similarly, let

1424: $\NPS(W,\F)/\naeq$ be the equivalence classes of $\aeq$-equivalent NPS's

1425: over $(W,\F)$.   Note that in $\NPS(W,\F)/\naeq$, it is possible that

1426: different nonstandard probability measures could have different ranges.

1427: For this section, without loss of generality, I could also fix the range

1428: of all NPS's to be

1429: %joe4

1430: %fixed nonstandard model

1431: the nonstandard model

1432: $\IR(\epsilon)$ discussed in Section~\ref{NPSdef}.  However, in the infinite

1433: case, it is not possible to restrict to a single nonstandard model, so I

1434: do not do so here either, for uniformity.

1435:

1436: Now define the mapping $\FLN$ from $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$

1437: pretty much as suggested at the beginning of this subsection:

1438: If $[\vecmu]$ is an equivalence class of LPS's, then choose a

1439: representative $\vecmu' \in [\vecmu]$ with finite length.

1440: Fix an infinitesimal $\epsilon$.

1441: Suppose that

1442: $\vecmu' = (\mu_0, \ldots, \mu_k)$.

1443: Let $\FLN([\vecmu]) = [(1 - \epsilon -

1444: \cdots - \epsilon^{k}) \mu_0 + \epsilon \mu_1 + \cdots + \epsilon^k \mu_k]$.

1445:
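To see the definition at work in a case with $k = 2$ (the worlds and
payoffs are assumptions chosen for illustration), let $W =
\{w_0,w_1,w_2\}$ with all sets measurable, and let $\vecmu =
(\mu_0,\mu_1,\mu_2)$, where $\mu_i(w_i) = 1$ for $i = 0, 1, 2$.  Taking
$\vecmu$ itself as the representative, the NPS used to define
$\FLN([\vecmu])$ is $\nu = (1 - \epsilon - \epsilon^2)\mu_0 +
\epsilon\mu_1 + \epsilon^2 \mu_2$, so $\nu(w_0) = 1 - \epsilon -
\epsilon^2$, $\nu(w_1) = \epsilon$, and $\nu(w_2) = \epsilon^2$.  Let $X$
pay 1 on $w_1$ and 0 elsewhere, and let $Y$ pay 5 on $w_2$ and 0
elsewhere.  Then
$$
E_{\vecmu}(X) = (0,1,0), \quad E_{\vecmu}(Y) = (0,0,5), \quad
E_\nu(X) = \epsilon, \quad E_\nu(Y) = 5\epsilon^2.
$$
Since $\epsilon$ is infinitesimal, $5\epsilon^2 < \epsilon$, so
$E_\nu(Y) < E_\nu(X)$, in agreement with $E_{\vecmu}(Y) <_L
E_{\vecmu}(X)$.  The same comparison results whichever infinitesimal
$\epsilon$ is fixed.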

1446: \thm\label{lpsnps} If $W$ is finite, then

1447: $\FLN$ is a bijection from $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$

1448: that preserves equivalence (that is, each NPS in $\FLN([\vecmu])$ is

1449: equivalent to $\vecmu$).

1450: \ethm

1451:

1452: %joe9*: added

1453: \prf It is easy to check that if $\vecmu = (\mu_0, \ldots, \mu_k)$, then

1454: $\vecmu \aeq (1 - \epsilon -

1455: \cdots - \epsilon^{k}) \mu_0 + \epsilon \mu_1 + \cdots + \epsilon^k

1456: \mu_k$ (see Lemma~\ref{aeqchar} in the appendix for a formal proof).

1457: It follows that $\FLN$ is an injection from

1458: $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$.    To show that $\FLN$ is a

1459: surjection, we must essentially construct an inverse map; that is, given

1460: an NPS $(W,\F,\nu)$ where $W$ is finite, we must find an LPS $\vecmu$

1461: such that $\vecmu \aeq \nu$.  The idea is to find

1462: a finite collection $\mu_0, \ldots, \mu_k$ of (standard)

1463: probability measures, where $k \le |W|$, and nonnegative nonstandard reals

1464: $\epsilon_0, \ldots, \epsilon_k$ such that

1465: $\stand{\epsilon_{i+1}/\epsilon_i} = 0$ and $\nu = \epsilon_0 \mu_0 +

1466: \cdots + \epsilon_k\mu_k$.  A straightforward argument then shows that

1467: $\nu \aeq \vecmu$ and $\FLN([\vecmu]) = [\nu]$.  I leave details to

1468: the appendix. \eprf

1469:
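To illustrate the decomposition described in the proof (this is one
decomposition of the required form for a hypothetical two-world NPS, not
a description of the general construction in the appendix), let $W =
\{w_1,w_2\}$ with all sets measurable, and let $\nu(w_1) = 1/2 +
\epsilon$ and $\nu(w_2) = 1/2 - \epsilon$ for an infinitesimal
$\epsilon$.  Let $\mu_0$ be the uniform measure on $W$ and let
$\mu_1(w_1) = 1$.  Then
$$
\nu = (1 - 2\epsilon)\mu_0 + 2\epsilon\mu_1,
$$
since $(1-2\epsilon)/2 + 2\epsilon = 1/2 + \epsilon$ and
$(1-2\epsilon)/2 = 1/2 - \epsilon$.  Here $\epsilon_0 = 1 - 2\epsilon$
and $\epsilon_1 = 2\epsilon$ are nonnegative, and
$\stand{\epsilon_1/\epsilon_0} = 0$.  For a random variable $X$ with
$X(w_i) = x_i$,
$$
E_\nu(X) = \frac{x_1 + x_2}{2} + \epsilon(x_1 - x_2), \quad
E_{(\mu_0,\mu_1)}(X) = \left(\frac{x_1+x_2}{2},\, x_1\right).
$$
Both orders compare $(x_1 + x_2)/2$ first; when these averages tie,
comparing $x_1 - x_2$ is the same as comparing $x_1$, so $\nu \aeq
(\mu_0,\mu_1)$.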

1470:

1471: BBD \citeyear{BBD1} also relate nonstandard probability measures and

1472: LPS's under the assumption that the state space is finite,

1473: %joe6

1474: %However, the way they relate them is somewhat different in spirit from

1475: %is different in spirit from the notion of equivalence introduced here.

1476: but there are some significant technical differences between the way

1477: they relate them and the approach taken here.

1478: BBD prove representation theorems

1479: %joe4

1480: %essentially showing that a preference orders on lotteries

1481: essentially showing that a preference order on lotteries

1482: can be represented by a standard utility function on lotteries and an

1483: LPS iff it

1484: can be represented by a standard utility function on lotteries and an NPS.

1485: Thus, they show that NPS's and LPS's are equiexpressive in terms of

1486: representing preference orders on lotteries.

1487: The difference between

1488: BBD's result and Theorem~\ref{lpsnps} is essentially a matter of

1489: quantification.  BBD's  result can be viewed as showing that, given an

1490: LPS, for each utility function on lotteries, there is an NPS that

1491: generates the same preference order on lotteries for that particular

1492: utility function.  In principle, the NPS might depend on the utility

1493: function.  More precisely, for a fixed LPS $\vecmu$, all

1494: that follows from their result is that for each utility function $u$, there

1495: is an NPS $\nu$ such that $(\vecmu,u)$ and $(\nu,u)$ generate the same

1496: preference order on lotteries.   Theorem~\ref{lpsnps} says that, given

1497: $\vecmu$, there is an NPS $\nu$ such that $(\vecmu,u)$ and $(\nu,u)$

1498: generate the same preference on lotteries for {\em all\/} utility

1499: functions $u$.

1500:

1501: \subsection{The infinite case}

1502:

1503:

1504: An LPS over an infinite state space $W$ may not be equivalent to any

1505: finite LPS.   However, ideas analogous to those used to prove

1506: Proposition~\ref{finiteeq} can be used to provide a bound on the length

1507: of the minimal-length LPS's in an equivalence class.

1508:

1509: \pro\label{infiniteeq} Every LPS over $(W,\F)$ is

1510: equivalent to an LPS over $(W,\F)$ of length at most $|\F|$. \epro

1511:

1512: The first step in relating LPS's to NPS's is to show that, just as in

1513: the finite case, for every LPS $(\mu_\beta: \beta < \alpha)$ of length

1514: $\alpha$, there is an equivalent NPS $\nu$.  The idea will

1515: be to

1516: %joe9

1517: set

1518: $\nu = (1 - \sum_{0 < \beta < \alpha} \epsilon^{n_\beta})\mu_0 +

1519: \sum_{0 < \beta < \alpha} \epsilon^{n_\beta} \mu_\beta$.  In the finite

1520: case, we could take $n_\beta = \beta$.  This worked because

1521: each $\beta$ was finite, and the field $\IR(\epsilon)$ includes

1522: $\epsilon^j$ for each integer $j$.   But now, since $\alpha$ may be

1523: greater than $\omega$, we cannot just take $n_\beta = \beta$.

1524: To get this idea to work in the infinite setting, I consider a

1525: \emph{nonstandard} model of the integers, which includes an ``integer''

1526: corresponding to each ordinal less than $\alpha$.  I then

1527: %joe9

1528: %construct a field that include $\epsilon^{n_\alpha}$ even for

1529: construct a field that includes $\epsilon^{n_\beta}$ even for

1530: these nonstandard integers $n_\beta$.

1531:

1532: A {\em nonstandard model of the integers\/} is a

1533: model that contains the integers and satisfies every property of the

1534: integers expressible in first-order logic.

1535: It follows easily from the compactness theorem of first-order

1536: logic \cite{Enderton} that, given an ordinal $\alpha$, there exists a

1537: nonstandard model

1538: %joe6

1539: $I^\alpha$

1540: of the integers that includes elements

1541: $n_\beta$, $\beta <

1542: \alpha$, such that $n_j = j$ for $j <\omega$ and $n_\beta < n_{\beta'}$

1543: if $\beta < \beta'$.  (Note that since $I^\alpha$ satisfies all the

1544: properties of the integers, it follows that if $n_\beta < n_{\beta'}$,

1545: then $n_{\beta'} - n_\beta \ge 1$, a fact that will be useful later.)

1546: The compactness theorem says that, given a collection of

1547: formulas, if each finite subset has a model, then so does the whole set.

1548: Consider a language with a function $+$ and constant symbols for each

1549: integer, together with constants ${\bf n}_\beta$, $\beta < \alpha$.

1550: Consider the collection of first-order formulas in this language

1551: consisting of all the formulas true of the integers, together with the

1552: formulas ${\bf n}_i = i$ for $i < \omega$ and ${\bf n}_\beta < {\bf

1553: n}_{\beta'}$, for all $\beta < \beta' < \alpha$.

1554: Clearly any finite subset of this set has a model---namely, the

1555: integers.  Thus, by compactness, so does the full set.  Thus, for each

1556: ordinal $\alpha$, there is a model $I^\alpha$

1557: with the required properties.

1558:

1559:

1560: %There are two issues that must be dealt with in order to get this to

1561: %work.  First, we must ensure that there is a non-Archimedean field where

1562: %there are infinitesimals $\epsilon_\beta$, $\beta < \alpha$, such that

1563: %$\stand{\epsilon_{\beta'}/\epsilon_{\beta}} = 0$ if $\beta < \beta' <

1564: %\alpha$.  Note, for example, that

1565: %this cannot be done in $\IR(\epsilon)$ if $\alpha > \omega$.

1566: %Another problem is making sense of the infinite sum.  Fields are closed

1567: %under finite sums; in general, infinite sums may not be defined.

1568:

1569: Given $\alpha$, I now construct a field $\IR(I^\alpha)$

1570: that includes $\epsilon^n$ for each ``integer'' $n \in I^\alpha$.

1571: %These fields are all similar in spirit to $\IR(\epsilon)$.  To

1572: To explain the construction,

1573: it is best to first consider $\IR(\epsilon)$ in a

1574: little more detail.  Since $\IR(\epsilon)$ is a field, once it includes

1575: $\epsilon$, it must include $p(\epsilon)$,

1576: where $p$ is a polynomial with real coefficients.  To ensure that every

1577: nonzero element of $\IR(\epsilon)$ has an inverse, we need not just

1578: finite polynomials in $\epsilon$, but \emph{infinite} polynomials in

1579: $\epsilon$.  The inverse of a polynomial in $\epsilon$ can then be

1580: computed using standard ``formal'' division of polynomials.

1581: Moreover, the leading exponent of the polynomial can be negative.

1582: Thus, the inverse of $\epsilon^3$ is, not surprisingly, $\epsilon^{-3}$;

1583: the inverse of $1-\epsilon$ is $1 + \epsilon + \epsilon^2 + \ldots$.

1584:
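As a further illustration of formal division (a worked instance added
for concreteness), consider $\epsilon - \epsilon^2 = \epsilon(1 -
\epsilon)$.  Its inverse is
$$
\epsilon^{-1}(1 + \epsilon + \epsilon^2 + \cdots) = \epsilon^{-1} + 1 +
\epsilon + \epsilon^2 + \cdots,
$$
as can be checked by multiplying out:
$$
(\epsilon - \epsilon^2)(\epsilon^{-1} + 1 + \epsilon + \cdots) =
(1 + \epsilon + \epsilon^2 + \cdots) - (\epsilon + \epsilon^2 + \cdots)
= 1.
$$
The inverse begins with the negative exponent $-1$, and its exponents
contain no infinite descending sequence.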

1585: The field $\IR(I^\alpha)$ also includes polynomials in $\epsilon$, but

1586: now the exponents are not just integers, but elements of

1587: $I^\alpha$.  Since a field is closed under multiplication, if it

1588: contains $\epsilon^{n_1}$ and $\epsilon^{n_2}$, it must also include

1589: their product.  Since $I^\alpha$ satisfies all the properties of the

1590: integers, if it includes $n_1$ and $n_2$, it also includes an element

1591: $n_1 + n_2$, and we can take $\epsilon^{n_1} \times \epsilon^{n_2} =

1592: \epsilon^{n_1 + n_2}$. Formally, let $\IR(I^\alpha)$ be the

1593: non-Archimedean model defined as follows:

1594: $\IR(I^\alpha)$ consists of all polynomials of the form

1595: $\sum_{n \in J} r_n \epsilon^{n}$, where $r_n$ is a standard real,

1596: $\epsilon$ is an infinitesimal, and $J$ is a \emph{well-founded} subset

1597: of $I^\alpha$.  (Recall that a set is well founded if it has no

1598: infinite descending sequence; thus, the set of integers is not well

1599: founded, since $\ldots < -3 < -2 < -1$ is an infinite descending

1600: sequence.  The reason I require well foundedness will be clear shortly.)

1601: We can identify the standard real $r$ with the polynomial

1602: $r \epsilon^0$.

1603:

1604: The polynomials in $\IR(I^\alpha)$ can be added and

1605: multiplied using the standard rules for addition and multiplication of

1606: polynomials.

1607: It is easy to check that

1608: %joe3

1609: %since $\alpha$ is a limit ordinal,

1610: the result of adding or multiplying two

1611: polynomials is another polynomial in $\IR(I^\alpha)$.  In particular, if

1612: $p_1$ and $p_2$ are

1613: %joe4:

1614: %two polynomials, $N_1$ is the set of coefficients of $p_1$, and

1615: %$N_2$ is the set of coefficients of $p_2$, then the

1616: %coefficients of $p_1 + p_2$ lie in $N_1 \union N_2$, while the

1617: %coefficients of $p_1p_2$ lie in the set $N_3 = \{n_1 + n_2:

1618: two polynomials, $N_1$ is the set of exponents of $p_1$, and

1619: $N_2$ is the set of exponents of $p_2$, then the

1620: exponents of $p_1 + p_2$ lie in $N_1 \union N_2$, while the

1621: exponents of $p_1p_2$ lie in the set $N_3 = \{n_1 + n_2:

1622: %joe4: typo

1623: %n \in N_1, n_2 \in N_2\}$.  Both $N_1 \union N_2$ and $N_3$ are easily

1624: n_1 \in N_1, n_2 \in N_2\}$.  Both $N_1 \union N_2$ and $N_3$ are easily

1625: seen to be well founded if $N_1$ and $N_2$ are.  Moreover, for each

1626: expression $n_1 + n_2 \in N_3$, it

1627: follows from the well-foundedness of $N_1$ and $N_2$ that there are only

1628: finitely many pairs $(n,n') \in N_1 \times N_2$ such that $n+n' = n_1 +

1629: n_2$,

1630: %joe6

1631: so the coefficient of $\epsilon^{n_1 + n_2}$ in $p_1p_2$ is well defined.

1632: Finally, each polynomial (other than 0) has an

1633: inverse that can be computed using standard ``formal'' division of

1634: polynomials;  I leave the details to the reader.

1635: %joe6:

1636: This step is where the well foundedness comes in.  The formal division

1637: process cannot be applied to a polynomial with exponents that are not

1638: well founded, such as $\cdots + \epsilon^{-3} + \epsilon^{-2} +

1639: \epsilon^{-1}$.  An element

1640: of $\IR(I^\alpha)$ is {\em positive\/} if its leading coefficient is

1641: positive.  Define an order $\le$ on  $\IR(I^\alpha)$ by taking $a \le b$ if

1642: $b-a$ is positive.

1643: With these definitions, $\IR(I^\alpha)$ is a non-Archimedean field.

1644: %Moreover, $\stand{\epsilon^{n_2}/\epsilon^{n_1}} = 0$ if $n_1 < n_2$.

1645:
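For a small example of arithmetic in $\IR(I^\alpha)$ when $\alpha >
\omega$ (the two polynomials are chosen only for illustration), let $p_1
= 1 + \epsilon^{n_\omega}$ and $p_2 = \epsilon - \epsilon^{n_\omega}$.
Using $\epsilon^{n} \times \epsilon^{n'} = \epsilon^{n + n'}$,
$$
p_1 p_2 = \epsilon - \epsilon^{n_\omega} + \epsilon^{n_\omega + 1} -
\epsilon^{2n_\omega}.
$$
Since $n_\omega > j$ for every standard integer $j$, the exponents
satisfy $1 < n_\omega < n_\omega + 1 < 2n_\omega$.  The term of $p_1
p_2$ with the smallest exponent is therefore $\epsilon$, whose
coefficient is 1, so $p_1 p_2$ is positive.  By the same reasoning,
$\epsilon^j - \epsilon^{n_\omega}$ is positive, that is,
$\epsilon^{n_\omega} \le \epsilon^j$, for every standard integer $j$.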

1646: Given $(W,\F)$, let $\alpha$ be

1647: the minimal ordinal whose cardinality is greater than

1648: %joe3

1649: or equal to

1650: $|\F|$.

1651: %Let $I^*_{(W,\F)}$ be a nonstandard model of the

1652: %integers such that there exist

1653: %elements $n_\beta$ in $I^*_{(W,\F)}$ for all $\beta < \alpha$ such

1654: By construction, $I^\alpha$ has elements $n_\beta$ for all $\beta <

1655: \alpha$ such that

1656: $n_i = i$ for $i < \omega$ and $n_\beta < n_{\beta'}$ if $\beta <

1657: \beta' < \alpha$.

1658: I now define a map $\FLN$ from

1659: $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$ just as suggested earlier.

1660: In more detail, given an equivalence class $[\vecmu] \in \LPS(W,\F)/\naeq$, by

1661: Proposition~\ref{infiniteeq}, there exists $\vecmu' \in

1662: [\vecmu]$ such that $\vecmu'$ has length $\alpha' \le \alpha$.

1663: %joe9

1664: %Let $\nu = (1 - \sum_{0 < \beta < \alpha} \epsilon^{n_\beta}) +

1665: Let $\nu = (1 - \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta})\mu_0' +

1666: \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta} \mu_\beta'$.

1667: By definition, $\sum_{0 < \beta < \alpha'} \epsilon^{n_\beta} \in

1668: \IR(I^\alpha)$  (the set

1669: of exponents is well ordered since the ordinals are well ordered), hence

1670: so is $(1 - \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta})$.

1671: The elements $\epsilon^{n_\beta}$ for $\beta < \alpha$ are also all

1672: in $\IR(I^\alpha)$.  It easily follows that $\nu$ is a nonstandard

1673: probability measure over the field $\IR(I^\alpha)$.  As observed

1674: earlier, if $\beta' < \beta$, then $n_\beta - n_{\beta'} \ge 1$, so

1675: $\epsilon^{n_\beta}$ is infinitesimally smaller than

1676: $\epsilon^{n_{\beta'}}$.  Arguments essentially identical to those of

1677: Lemma~\ref{aeqchar} in the appendix can be

1678: used to show that $\nu \aeq \vecmu'$.

1679: Define $\FLN([\vecmu]) = [\nu]$.

1680: %joetark

1681: The following result is immediate.

1682:

1683: \thm\label{injection} $\FLN$ is an injection from $\LPS(W,\F)/\naeq$ to

1684: $\NPS(W,\F)/\naeq$ that preserves equivalence.  \ethm

1685:

1686: What about the converse?  Is it the case that for every NPS there is an

1687: equivalent LPS?

1688: %joe9

1689: The technique for finding an equivalent LPS used in the finite case

1690: fails.  There is no obvious way to find a well-ordered sequence

1691: of standard probability measures $\mu_0, \mu_1, \ldots$ and a sequence

1692: of nonnegative nonstandard reals $\epsilon_0, \epsilon_1, \ldots$ such

1693: that $\stand{\epsilon_{\beta+1}/\epsilon_\beta} = 0$ and

1694: $\nu = \epsilon_0 \mu_0 + \epsilon_1 \mu_1 + \cdots$.  As the following

1695: example shows, this is not an accident.

1696: %joe9

1697: %As the following example shows, the answer is no.

1698: There exist NPS's that are not equivalent to any LPS.

1699:

1700: \xam\label{counter3} As in Example~\ref{counter1}, let $W = \IN$, the

1701: natural numbers, let $\F$ consist of the finite and cofinite subsets of

1702: $\IN$,

1703: and let $\F' = \F - \{\emptyset\}$.  Let $\nu^1$ be an NPS with

1704: range $\IR(\epsilon)$, where $\nu^1(U) = |U|\epsilon$ if $U$ is finite and

1705: %joe6

1706: %$\nu(U) = 1 - |\overline{U}|\epsilon$ if $U$ is cofinite

1707: $\nu^1(U) = 1 - |\overline{U}|\epsilon$ if $U$ is cofinite

1708: %joe6

1709: (as usual, $\overline{U}$ denotes the complement of $U$, which in this

1710: case is finite).

1711: This is

1712: clearly an NPS, and it corresponds to the cps $\mu^1$ of

1713: Example~\ref{counter1}, in the sense that $\stand{\nu^1(V \mid U)}

1714:  = \mu^1(V \mid U)$ for all $V \in \F$, $U\in \F'$.  Just as in

1715: Example~\ref{counter1}, it can be shown that there is

1716: no LPS $\vecmu$ such that $\nu^1 \aeq \vecmu$.

1717:

1718:

1719:

1720: %joe6

1721: To see the potential relevance of this setup, suppose that

1722: %joe9

1723: %there is a lottery with countably many where a natural number can be

1724: %chosen and,

1725: a natural number is chosen at random and,

1726: intuitively, all numbers are equally likely to be chosen.  An agent may

1727: place a bet

1728: on the number being in a finite or cofinite set.  Intuitively, the agent

1729: should prefer a bet on a set with larger cardinality.  More precisely,

1730: if $U_1$ and $U_2$ are two sets in the algebra, the agent should prefer

1731: a bet on $U_1$ over a bet on $U_2$ iff

1732: (a) $U_1$ and $U_2$ are both cofinite and the complement of $U_1$ has

1733: smaller cardinality than that of $U_2$, (b) $U_1$ is cofinite and $U_2$

1734: is finite, or (c) $U_1$ and $U_2$ are both finite, and $U_1$ has larger

1735: cardinality than $U_2$.  These preferences on acts or bets

1736: should translate to statements of likelihood.

1737: The NPS captures these preferences directly; they cannot

1738: be captured in an LPS.  The cps of Example~\ref{counter1} captures (b)

1739: directly, and (c) indirectly: when conditioning on any finite set that

1740: contains $U_1 \union U_2$, the probability of $U_1$ will be higher than

1741: that of $U_2$.

1742: \exam

1743:
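For a concrete instance of case (c) (the sets are an illustrative
choice), take $U_1 = \{1,2,3\}$ and $U_2 = \{4,5\}$.  Then $\nu^1(U_1) =
3\epsilon > 2\epsilon = \nu^1(U_2)$, so the NPS ranks a bet on $U_1$
above a bet on $U_2$ directly.  In the cps, the comparison shows up only
after conditioning: with $U = U_1 \union U_2$,
$$
\mu^1(U_1 \mid U) = \stand{\frac{3\epsilon}{5\epsilon}} = \frac{3}{5} >
\frac{2}{5} = \stand{\frac{2\epsilon}{5\epsilon}} = \mu^1(U_2 \mid U),
$$
while $\mu^1(U_1 \mid W) = \stand{3\epsilon} = 0$ and $\mu^1(U_2 \mid W)
= \stand{2\epsilon} = 0$.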

1744:

1745: \subsection{Countably additive nonstandard probability

1746: measures}\label{countableadditivity}

1747:

1748: Do things get any better if countable additivity is required?

1749: To answer this question, I must first make precise what countable

1750: additivity means in the context of non-Archimedean fields.

1751: To understand the issue here, recall that for the standard real numbers,

1752: every bounded nondecreasing sequence has a unique least upper bound, which

1753: can be taken to be its limit.  Given a countable sum each of whose terms

1754: is nonnegative, the partial sums form a nondecreasing sequence.

1755: If the partial sums are bounded (which they are if the terms in the sums

1756: represent the probabilities of a pairwise

1757: disjoint collection of sets), then the limit is well defined.

1758:

1759: None of the above is true in the case of non-Archimedean fields.  For a

1760: trivial counterexample,

1761: consider the sequence $\epsilon, 2 \epsilon, 3 \epsilon, \ldots$.

1762: Clearly this sequence is bounded (by any positive real number), but it

1763: does not have a least upper bound.  For a more subtle example, consider

1764: the sequence $1/2, 3/4, 7/8, \ldots$ in the field $\IR(\epsilon)$.  Should

1765: its limit be 1?  While this does not seem to be an unreasonable choice,

1766: note that 1 is not the least upper bound of the sequence.  For example,

1767: $1-\epsilon$ is greater than every term in the sequence, and is less

1768: than 1.  So are $1-3\epsilon$ and $1 - \epsilon^2$.  Indeed, this

1769: sequence has no least upper bound in $\IR(\epsilon)$.

1770:

1771: Despite these concerns, I define limits in

1772: $\IR(I^*)$ pointwise.  That is,

1773: %the elements of $\IR(I^*)$ are (infinite) polynomials over $\epsilon$

1774: %where the power of $\epsilon$ are in $I^*$.  Convergence is taken

1775: %pointwise.  That is,

1776: a sequence $a_1, a_2, a_3, \ldots$ in $\IR(I^*)$

1777: converges to $b \in \IR(I^*)$ if, for every $n \in I^*$, the

1778: coefficients of $\epsilon^n$ in $a_1, a_2, a_3, \ldots$ converge to the

1779: coefficient of $\epsilon^n$ in $b$. (Since the coefficients are standard

1780: reals, the notion of convergence for the

1781: coefficients is just the standard definition of convergence in the reals.

1782: Of course, if $\epsilon^n$ does not appear explicitly, its coefficient

1783: is taken to be 0.)

1784: %joe6

1785: Note that here and elsewhere I use the letters $a$ and $b$ (possibly with

1786: subscripts)  to denote (standard) reals, and $\epsilon$ to denote an

1787: infinitesimal.

1788: As usual, $\sum_{i=1}^\infinity a_i$ is taken to be $b$ if

1789: the sequence of partial sums $\sum_{i=1}^n a_i$ converges to $b$.

1790: Note that, with this notion of convergence, $1/2, 3/4, 7/8, \ldots$

1791: converges to 1 even though 1 is not the least upper bound of the

1792: sequence.%

1793: \footnote{For those used to thinking of convergence in topological

1794: terms, what is going on here is that the topology corresponding to this

1795: notion of convergence is not Hausdorff.}

1796: %joetark

1797: I discuss the consequences of this choice further in

1798: Section~\ref{discussion}.

1799:

1800: With this notion of countable sum, it makes perfect sense to consider

1801: countably additive nonstandard probability measures.  If $\F$ is a

1802: $\sigma$-algebra and $\LPS^c(W,\F)$ and $\NPS^c(W,\F)$ denote the

1803: countably additive LPS's and NPS's on $(W,\F)$, respectively, then

1804: Theorem~\ref{injection} can be applied with no change in proof to

1805: show the following.

1806:

1807: \thm\label{injection1}  $\FLN$ is an injection from $\LPS^c(W,\F)/\naeq$

1808: to $\NPS^c(W,\F)/\naeq$.

1809: \ethm

1810:

1811: However, as the following example shows, even with the requirement of

1812: countable additivity, there are nonstandard probability measures that

1813: are not equivalent to any LPS.

1814:

1815: \xam\label{counter4} Let $W = \{w_1, w_2, w_3, \ldots\}$, and let $\F =

1816: 2^W$.  Choose any nonstandard $I^*$ and fix an infinitesimal $\epsilon$

1817: in $\IR(I^*)$.

1818: Define an NPS $(W,\F,\nu)$ with range $\IR(I^*)$

1819: by taking $\nu(w_j) = a_j + b_j \epsilon$, where $a_j = 1/2^j$, $b_{2j-1} =

1820: %joe6

1821: %\epsilon/2^{j-1}$, and $b_{2j} = -\epsilon/2^{j-1}$, for $j = 1, 2, 3,

1822: %\ldots$.

1823: 1/2^{j-1}$, and $b_{2j} = -1/2^{j-1}$, for $j = 1, 2, 3, \ldots$.

1824: Thus, the probabilities of $w_1, w_2, \ldots$ are characterized by the

1825: sequence $1/2 + \epsilon, 1/4 - \epsilon, 1/8 + \epsilon/2, 1/16 -

1826: \epsilon/2, 1/32 + \epsilon/4, \ldots$.  For $U \subseteq W$, define

1827: $\nu(U) = \sum_{\{j: w_j \in U\}} a_j + \epsilon \sum_{\{j: w_j \in U\}}

1828: b_j$.  It is easy to see that these sums are well-defined.

1829: %joe6

1830: These likelihoods correspond to preferences.  For example, an agent

1831: should prefer a bet that gives a payoff of 1 if $w_2$ occurs and 0 otherwise

1832: to a bet that gives a payoff of 4 if $w_4$ occurs and 0 otherwise.
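To spell out the comparison (assuming, as is standard, that the agent's utility is linear in the payoff, so that bets are ranked by expected payoff): the first bet has expected payoff $\nu(w_2) = 1/4 - \epsilon$, while the second has expected payoff
$$4\nu(w_4) = 4(1/16 - \epsilon/2) = 1/4 - 2\epsilon.$$
The two agree in their standard part, but the first exceeds the second by the positive infinitesimal $\epsilon$.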

1833: %joetark

1834: As I show in the appendix (see Proposition~\ref{counter}), there

1835: %As I show in the full paper, there

1836: is no LPS $\vecmu$ over $(W,\F)$ such that $\nu \aeq

1837: \vecmu$.

1838: \exam

1839:

1840: Roughly speaking, the reason that $\nu$ is not equivalent to

1841: any LPS in Example~\ref{counter4} is that the ratio between $a_j$ and

1842: $b_j$ in the definition of $\nu$ (i.e., the ratio

1843: %joe6

1844: between the

1845: ``standard part'' of

1846: $\nu(w_j)$ and the ``infinitesimal part'' of $\nu(w_j)$)

1847: %joe9

1848: %grows unboundedly large.  This can be generalized so as to give

1849: goes to zero.  This can be generalized so as to give

1850: a condition on nonstandard probability measures that

1851: is necessary and sufficient to guarantee that they can be represented by

1852: an LPS.

1853: % however, I do not pursue this issue here.

1854: However, the condition is rather technical and I have not found an

1855: interesting interpretation of it, so I do not pursue it here.

1856:

1857:

1858: \section{Relating Popper Spaces to NPS's}\label{PopperNPS}

1859: Consider the map $\FNP$ from nonstandard probability spaces to Popper

1860: spaces such that $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$, where

1861: $\F' = \{U: \nu(U) \ne 0\}$ and $\mu(V \mid U) = \stand{\nu(V \mid U)}$ for $V \in

1862: \F$, $U \in \F'$.  I leave it to the reader to check that

1863: $(W,\F,\F',\mu)$ is indeed a Popper space.

1864: %joe9

1865: This is arguably the most natural map; for example, it is easy to check

1866: that $\FNP \circ \FSN = \FCP$, where $\FSN$ is the restriction of $\FLN$

1867: to SLPSs.  (Note that $\FLN$ is well-defined on SLPS's, since if

1868: $\vecmu$ is an SLPS, by Proposition~\ref{motivation}, $[\vecmu] =

1869: \{\vecmu\}$.)

1870:

1871: %joetark: this is new

1872: %joe9

1873: We might hope that $\FNP$ is a bijection from $\NPS(W,\F)/\naeq$ to

1874: $\Popper(W,\F)$.  As I show shortly, it is not.  To understand $\FNP$

1875: better, define an equivalence relation $\simeq$ on $\NPS(W,\F)$ (and

1876: $\NPS^c(W,\F)$) by taking $\nu_1 \simeq \nu_2$ if $\{U: \nu_1(U) = 0\} =

1877: \{U: \nu_2(U) = 0\}$ and $\stand{\nu_1(V \mid U)} = \stand{\nu_2(V \mid

1878: U)}$ for

1879: all $V, U$ such that $\nu_1(U) \ne 0$.

1880: %joe9

1881: Thus, $\simeq$ essentially says that infinitesimal differences between

1882: conditional probabilities do not count.

1883: Let $\NPS/\!\simeq$

1884: (\respc $\NPS^c/\!\simeq$) consist of the

1885: $\simeq$ equivalence classes in $\NPS$ (\respc $\NPS^c$).  Clearly

1886: $\FNP$ is well defined as a map from $\NPS/\!\simeq$ to $\Popper(W,\F)$

1887: and

1888: from $\NPS^c/\!\simeq$ to $\Popper^c(W,\F)$.  As the following result

1889: shows, $\FNP$ is actually a bijection from $\NPS^c/\!\simeq$ to

1890: $\Popper^c(W,\F)$.

1891:

1892:

1893: \thm\label{FNP} $\FNP$ is a bijection from $\NPS(W,\F)/\!\simeq$ to

1894: $\Popper(W,\F)$ and from $\NPS^c(W,\F)/\!\simeq$ to $\Popper^c(W,\F)$.

1895: \ethm

1896:

1897: \prf

1898: It is easy to see that $\FNP$ is an injection.

1899: In the countably additive case, the inverse map can be defined using earlier results.

1900: If $(W,\F,\F',\mu) \in \Popper^c(W,\F)$,  by

1901: Theorem~\ref{infiso},

1902: there is a countably additive SLPS $\vec{\mu}'$ such that

1903: $\FCP((W,\F,\vec{\mu}')) = (W, \F,\F', \mu)$.  By

1904: Theorem~\ref{injection1}, there is some

1905: $(W,\F,\nu) \in \NPS^c(W,\F)$ such that $\nu \aeq \vecmu'$.  It is not

1906: hard to show that $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$; see the appendix

1907: for details.  Showing that $\FNP$ is a surjection in the finitely

1908: additive case requires more work; again, see the appendix for details.

1909: \eprf

1910:

1911: McGee \citeyear{McGee94} proves essentially the same result as

1912: Theorem~\ref{FNP} in the case that $\F$ is an algebra (and the measures

1913: involved are not necessarily countably additive).  McGee

1914: \citeyear[p.~181]{McGee94} says that his

1915: result shows that ``these two approaches amount to the same thing''.

1916: However, this is far from clear.   The $\simeq$ relation is rather

1917: coarse.  In particular, it is coarser than~$\aeq$.

1918:

1919: %joe6

1920: %\opro{simeqvsaeq} If $\nu_1 \aeq \nu_2$ than $\nu_1 \simeq \nu_2$.

1921: \pro\label{simeqvsaeq} If $\nu_1 \aeq \nu_2$ then $\nu_1 \simeq \nu_2$.

1922: \epro

1923:

1924:

1925: The converse of Proposition~\ref{simeqvsaeq} does not hold in general.

1926: As a result,

1927: the $\simeq$ relation identifies nonstandard measures that behave quite

1928: differently in decision contexts.

1929: %Indeed, the results in

1930: %Sections~\ref{FCP} and~\ref{LPSNPS} already suggest the nature of the

1931: %gap between Popper spaces and NPS's.  To simplify the discussion,

1932: This difference already arises in finite spaces, as the following example

1933: shows.

1934:

1935: \xam\label{McGee}

1936: Suppose $W =

1937: \{w_1,w_2\}$.  Consider the nonstandard probability measure $\nu_1$ such that

1938: $\nu_1(w_1) = 1/2 + \epsilon$ and $\nu_1(w_2) = 1/2 - \epsilon$.  (This is

1939: equivalent to the LPS $(\mu_1,\mu_2)$ where $\mu_1(w_1) = \mu_1(w_2) =

1940: 1/2$, $\mu_2(w_1) = 1$, and $\mu_2(w_2) = 0$.)

1941: Let $\nu_2$ be the nonstandard probability measure such that $\nu_2(w_1)

1942: = \nu_2(w_2) = 1/2$.  Clearly $\nu_1 \simeq \nu_2$.  However, it is not

1943: the case that $\nu_1 \aeq \nu_2$.

1944: Consider the two

1945: random variables $\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$.

1946: (I use the notation $\chi_U$ to denote the indicator function for $U$;

1947: that is, $\chi_U(w) = 1$ if $w \in U$ and $\chi_U(w) = 0$ otherwise.)

1948: According to $\nu_1$, the

1949: expected value of $\chi_{\{w_1\}}$ is (very slightly) higher than that of

1950: $\chi_{\{w_2\}}$.

1951: According to $\nu_2$, $\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$ have the

1952: same expected value.  Thus, $\nu_1 \not\aeq \nu_2$.

1953: Moreover, it is easy to see that there

1954: is no Popper measure $\mu$ on $\{w_1,w_2\}$ that can make the same

1955: distinctions with respect to $\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$ as

1956: $\nu_1$, no matter how we define expected value with respect to a

1957: Popper measure.   According to $\nu_1$, although the expected value of

1958: $\chi_{\{w_1\}}$ is higher than that of $\chi_{\{w_2\}}$, the expected

1959: value of $\chi_{\{w_1\}}$ is less than

1960: that of $\alpha \chi_{\{w_2\}}$ for any (standard) real $\alpha > 1$.

1961: There is no Popper measure with this behavior.
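Concretely, $E_{\nu_1}(\chi_{\{w_1\}}) = 1/2 + \epsilon$ and $E_{\nu_1}(\alpha \chi_{\{w_2\}}) = \alpha(1/2 - \epsilon)$, so that
$$E_{\nu_1}(\alpha\chi_{\{w_2\}}) - E_{\nu_1}(\chi_{\{w_1\}}) = \frac{\alpha - 1}{2} - (\alpha + 1)\epsilon ,$$
which is positive for every standard real $\alpha > 1$, because $(\alpha-1)/2$ is a positive standard real and $(\alpha+1)\epsilon$ is infinitesimal; for $\alpha = 1$ the difference is $-2\epsilon < 0$.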

1962: \exam

1963:

1964: %suppose that $W$ is finite.  Then

1965: More generally, in finite spaces, Theorem~\ref{FCPfin} shows that

1966: Popper spaces are equivalent to SLPS's, while

1967: Theorem~\ref{lpsnps} shows that $\LPS(W,\F)/\naeq$ is equivalent to

1968: $\NPS(W,\F)/\naeq$.  By Proposition~\ref{motivation},

1969: $\SLPS(W,\F)/\naeq$ is essentially identical to $\SLPS(W,\F)$ (all the

1970: equivalence classes in $\SLPS(W,\F)/\naeq$ are singletons),

1971: so in finite spaces, the gap in expressive power between Popper spaces

1972: and NPS's essentially amounts to the gap between $\SLPS(W,\F)$ and

1973: $\LPS(W,\F)/\naeq$.  This gap is nontrivial.  For example, there is no

1974: SLPS equivalent to the LPS $(\mu_1,\mu_2)$ that represents the NPS in

1975: Example~\ref{McGee}.

1976:

1977: %I do not know any way of making this gap completely precise. There

1978: %does not seem to be an analogue of the equivalence $\aeq$ from

1979: %Definition~\ref{aeq} for Popper spaces, since it is not clear how to

1980: %define expected value with respect to a Popper measure.

1981: %The following example may help to clarify the issue.

1982:

1983: \section{Independence}\label{sec:indep}

1984: %joe3

1985: %BBD \citeyear{BBD1} and Hammond \citeyear{Hammond94} discuss

1986: %independence, but they consider

1987: %only when a (standard or nonstandard) probability measure can be viewed

1988: %as a {\em product measure\/} (that is, a product of other measures).

1989: %%joe2

1990: %%Interestingly, their discussion does {\em not\/}

1991: %%consider independence directly for LPS's; indeed, it is far from clear

1992: %%what it would mean that an LPS can be written as a product measure.

1993: %Rather than considering product measures,

1994: %I consider more standard notions of independence:

1995: %joe6

1996: %In this section, I consider

1997: The notion of independence is fundamental.  As I show in this section, the

1998: results of the previous sections shed light on various notions of

1999: independence considered in the literature for LPS's and (variants of)

2000: cps's.  I first consider independence for events and then independence

2001: for random variables.  I then relate my definitions to those of BBD,

2002: Hammond, and Kohlberg and Reny \citeyear{KR97}.

2003:

2004: Intuitively, event $U$ is independent of $V$ if learning $U$ gives no

2005: information about $V$.  Certainly if learning $U$ gives no information

2006: about $V$, then if $\mu$ is an arbitrary probability measure, we would

2007: expect that $\mu(V \mid U) = \mu(V)$.  Indeed, this is often taken as the

2008: definition of $U$ being independent of $V$ with respect to $\mu$.

2009: If standard probability measures are used, conditioning is not

2010: defined if $\mu(U) = 0$.  In this case, $U$ is still considered

2011: independent of $V$.  As is well known, if $U$ is independent of $V$,

2012: then $\mu(U \inter V) = \mu(V) \times \mu(U)$ and $V$ is independent of

2013: $U$, that is, $\mu(U \mid V) = \mu(U)$.  Thus, independence of events with

2014: respect to a probability measure can be

2015: defined in any of three equivalent ways.  Unfortunately, these

2016: definitions are not equivalent for other representations of uncertainty

2017: (see \cite[Chapter 4]{Hal31} for a general discussion of this issue).

2018:

2019:

2020: The situation is perhaps simplest for nonstandard probability measures.%

2021: \footnote{Although I talk about $U$ being independent of $V$ with

2022: respect to a nonstandard measure $\nu$, technically I should talk about

2023: $U$ being independent of $V$ with respect to an NPS $(W,\F,\nu)$, for

2024: $U, V \in \F$.  I continue to be sloppy at times, reverting to more

2025: careful notation when necessary.}

2026: In this case, the three notions coincide, for exactly the same reasons

2027: as they do for standard probability measures.  However, independence is

2028: perhaps too strong a notion in some ways.  In particular, nonstandard

2029: measures that are equivalent do not in general agree on independence, as

2030: the following example shows.

2031: \xam\label{xam:approximatelyindep}

2032: Suppose that $W = \{w_1, w_2, w_3, w_4\}$.  Let

2033: $\nu_i(w_1 ) = 1 - 2 \epsilon + \epsilon_i$, $\nu_i(w_2) = \nu_i(w_3) =

2034: \epsilon - \epsilon_i$, and $\nu_i(w_4) = \epsilon_i$,  for $i = 1,

2035: 2$, where $\epsilon_1 = \epsilon^2$ and $\epsilon_2 = \epsilon^3$.  If

2036: $U = \{w_2, w_4\}$ and $V = \{w_3, w_4\}$, then $\nu_i(U) =

2037: \nu_i(V) = \epsilon$ and $\nu_i(U \inter V) = \epsilon_i$.  It

2038: follows that $U$ and $V$ are independent with

2039: respect to $\nu_1$, but not with respect to $\nu_2$.   However, it is

2040: easy to check that $\nu_1 \aeq \nu_2$.
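In more detail, $\nu_1(U)\times\nu_1(V) = \epsilon^2 = \nu_1(U\inter V)$, while $\nu_2(U)\times\nu_2(V) = \epsilon^2 \ne \epsilon^3 = \nu_2(U \inter V)$.  For the equivalence, here is a sketch, under the assumption that $\aeq$ is determined by how a measure orders the expected values of real-valued random variables.  If $X$ is a real-valued random variable and $x_j = X(w_j)$, then
$$E_{\nu_i}(X) = x_1 + (x_2 + x_3 - 2x_1)\epsilon + (x_1 - x_2 - x_3 + x_4)\epsilon_i .$$
Since the coefficients are standard reals, $\epsilon$ is a positive infinitesimal, and $\epsilon_i/\epsilon$ is a positive infinitesimal for $i = 1, 2$, the sign of $E_{\nu_i}(X) - E_{\nu_i}(Y)$ is the sign of the first nonzero coefficient of $E_{\nu_i}(X - Y)$ in this expansion, and so does not depend on $i$.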

2041: \exam

2042:

2043: Example~\ref{xam:approximatelyindep} shows that independence of events in the

2044: context of nonstandard

2045: measures is very sensitive to the choice of $\epsilon$, even if this

2046: choice does not affect decision making at all.  This suggests the

2047: following definition: $U$ is {\em approximately independent\/} of $V$ with

2048: respect to $\nu$ if $\nu(U) \ne 0$ implies that

2049: $\nu(V \mid U) - \nu(V)$ is infinitesimal, that is, if

2050: $\stand{\nu(V \mid U)} = \stand{\nu(V)}$.

2051: Note that $U$ can be approximately independent of $V$ without

2052: $V$ being approximately independent of $U$.  For example, consider the

2053: nonstandard probability measure $\nu_1$ from

2054: Example~\ref{xam:approximatelyindep}.  Let

2055: %joe2

2056: %$V' = \{w_1, w_2\}$;

2057: $V' = \{w_4\}$;

2058: as before, let $U = \{w_2, w_4\}$.  It is easy to check that

2059: $\stand{\nu_1(V' \mid U)} = \stand{\nu_1(V')} = 0$, but

2060: $\stand{\nu_1(U \mid V')} = 1$, while $\stand{\nu_1(U)} = 0$.  Thus,

2061: $U$ is approximately independent of $V'$ with respect to $\nu_1$, but

2062: $V'$ is not

2063: approximately independent of $U$.   Similarly, $U$ can be approximately

2064: independent of $V$ without $\overline{U}$ being approximately

2065: independent of $V$.  For example, it is easy to check that

2066: $\overline{V}'$ is approximately independent of $U$ with respect to

2067: $\nu_1$, although $V'$ is not.
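The computations behind these claims are short.  Using the values of $\nu_1$ from Example~\ref{xam:approximatelyindep},
$$\nu_1(V' \mid U) = \frac{\epsilon^2}{\epsilon} = \epsilon, \qquad \nu_1(U \mid V') = \frac{\epsilon^2}{\epsilon^2} = 1, \qquad \nu_1(U \mid \overline{V}') = \frac{\epsilon - \epsilon^2}{1 - \epsilon^2},$$
while $\nu_1(V') = \epsilon^2$ and $\nu_1(U) = \epsilon$.  Taking standard parts gives $0$, $1$, and $0$ for the three conditional probabilities and $0$ for both unconditional ones.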

2068:

2069:

2070:

2071: A straightforward argument shows that $U$

2072: is approximately independent of $V$ with respect to $\nu$ iff

2073: $\nu(U) \ne 0$ implies $\stand{(\nu(V

2074: \inter U) - \nu(V) \times \nu(U))/ \nu(U)} = 0$, while $V$ is

2075: approximately independent of

2076: $U$ with respect to $\nu$ iff the same statement holds with the roles of

2077: $V$ and $U$ reversed.

2078: Note for future reference that each of these requirements

2079: is stronger than just

2080: requiring that $\stand{\nu(V \inter U) - \nu(V) \times \nu(U)} = 0$.

2081: The latter requirement is automatically met, for example, if the

2082: probability of either $U$ or $V$ is infinitesimal.
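The argument is a one-line identity: if $\nu(U) \ne 0$, then
$$\nu(V \mid U) - \nu(V) = \frac{\nu(V \inter U) - \nu(V) \times \nu(U)}{\nu(U)} .$$
To illustrate why dividing matters, take $\nu_1$ and $U = \{w_2,w_4\}$ from Example~\ref{xam:approximatelyindep}, and let $V' = \{w_4\}$: the difference $\nu_1(U \inter V') - \nu_1(U) \times \nu_1(V') = \epsilon^2 - \epsilon^3$ is infinitesimal, but dividing by $\nu_1(V') = \epsilon^2$ gives $1 - \epsilon$, whose standard part is 1.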

2083:

2084: The definition of (approximate) independence extends in a straightforward

2085: way to (approximate) conditional independence.  $U$ is

2086: conditionally independent of $V$ given $V'$ with respect to a (standard

2087: or nonstandard) probability measure $\nu$ if

2088: $\nu(U \inter V') \ne 0$ implies $\nu(V \mid U \inter V') = \nu(V \mid V')$.

2089: %Conditional independence is typically taken to hold by convention

2090: %if $\nu(U \inter V') = 0$.

2091: %joe2

2092: %Intuitively, this is because $\nu(V \inter U

2093: %\inter V')$ is indeterminate in this case.

2094: %However, it may seem

2095: %reasonable to say that conditional independence does {\em not\/} hold if

2096: %$U \inter V' = \emptyset$ but $V \inter V' \ne \emptyset$.  Intuitively,

2097: %in this case, it is clear that, given $V'$, finding out $

2098: Again, for probability, $U$ is

2099: conditionally independent of $V$ given $V'$ iff $V$ is conditionally

2100: independent of $U$ given $V'$ iff $\nu(V \inter U \mid V') = \nu(V

2101: \mid V') \times \nu(U \mid V')$.

2102: $U$ is approximately

2103: conditionally independent of $V$ given $V'$ with respect to $\nu$ if

2104: $\nu(U \inter V') \ne 0$ implies $\stand{\nu(V \mid U \inter V')} = \stand{\nu(V

2105: \mid V')}$.   If $V'$ is taken to be $W$, the whole space, then (approximate)

2106: conditional independence reduces to (approximate) independence.

2107:

2108: The following proposition shows that, although independence is not

2109: preserved by equivalence, approximate independence is.

2110:

2111: \pro\label{indaeq} If $U$ is approximately conditionally independent of

2112: $V$ given

2113: $V'$ with respect to $\nu$, and $\nu \aeq \nu'$, then

2114: $U$ is approximately conditionally independent of $V$ given

2115: $V'$ with respect to $\nu'$.

2116: \epro

2117:

2118: \prf Suppose that $\nu \aeq \nu'$.  I claim that for all events $U_1$

2119: and $U_2$ such that $\nu(U_2) \ne 0$, $\stand{\nu(U_1)/\nu(U_2)} =

2120: \stand{\nu'(U_1)/\nu'(U_2)}$.  For suppose that

2121: $\stand{\nu(U_1)/\nu(U_2)} = \alpha$.  Then it easily follows that

2122: $E_\nu(\chi_{U_1}) < E_\nu(\alpha'\chi_{U_2})$ for all $\alpha' > \alpha$,

2123: and $E_\nu(\chi_{U_1}) > E_\nu(\alpha''\chi_{U_2})$ for all $\alpha'' <

2124: \alpha$.  Thus, the same must be true for $E_{\nu'}$, and hence

2125: $\stand{\nu'(U_1)/\nu'(U_2)} = \alpha$.   It thus follows

2126: that $\stand{\nu (V \mid  U \inter V')} = \stand{\nu' (V \mid  U \inter

2127: V')}$ and $\stand{\nu(V \mid V')} = \stand{\nu'(V \mid V')}$, from which

2128: the result is immediate. \eprf

2129:

2130:

2131: \commentout{

2132: There is also an interesting connection between approximate independence

2133: and independence, which will prove useful in understanding issues

2134: involving independence between random variables.

2135:

2136:

2137: \pro\label{indaeq1} There exists a measure $\nu'$ such

2138: that $\nu \aeq \nu'$ and $U$ is conditionally independent of $V$ given

2139: $V'$ with respect to $\nu'$ iff (a) both $U$ and $\overline{U}$ are

2140: approximately conditionally independent of $V$ given $V'$ with respect

2141: to $\nu$ and (b) both $V$

2142: and $\overline{V}$ are approximately conditionally independent of $U$

2143: given $V'$ with respect to $\nu$.

2144: \epro

2145:

2146:

2147: \prf First suppose that $\nu \aeq \nu'$ and  and that $U$ is

2148: conditionally independent of $U$ given $V$ with respect to $\nu'$.

2149: Then, by standard properties of independence, both $U$ and

2150: $\overline{U}$ are conditionally independent of $V$ given $V'$

2151: and both $V$ and $\overline{V}$ are conditionally independent of $U$

2152: given $V'$.  Since conditional independence certainly implies

2153: conditional approximate independence, the forward implication follows

2154: from Proposition~\ref{indaeq}.

2155:

2156: For the reverse implication, suppose that (a) both $U$ and

2157: $\overline{U}$ are approximately conditionally independent of $V$ given

2158: $V'$ with respect to $\nu$ and (b) both $V$

2159: and $\overline{V}$ are approximately conditionally independent of $U$

2160: given $V'$ with respect to $\nu$.  Suppose that $\stand{U \mid V'} =

2161: r_1$, $\stand{V \mid V'} = r_2$

2162: We now need to consider a number of

2163: cases.  First, suppose that both $0 < r_1, r_2 < 1$.

2164: }

2165:

2166:

2167: There is an obvious definition of independence for events for Popper spaces:

2168: $U$ is independent of $V$ given $V'$ with respect to the Popper space

2169: $(W,\F,\F',\mu)$ if $U \inter V' \in\F'$ implies that $\mu(V \mid U

2170: \inter V') = \mu(V \mid V')$; if $U \inter V' \notin \F'$, then $U$ is also

2171: taken to be independent of $V$ given $V'$.  If

2172: $U$ is independent of $V$ given $V'$ and $V' \in \F'$, then $\mu(U

2173: \inter V \mid V') = \mu(U \mid V') \times \mu(V \mid V')$.  However, the

2174: converse does not necessarily hold.  Nor is it the case that if $U$ is

2175: independent of $V$ given $V'$ then $V$ is independent of $U$ given

2176: $V'$.  A counterexample can be obtained by taking the Popper space

2177: arising from the NPS in Example~\ref{xam:approximatelyindep}.  Consider the

2178: Popper space $(W,2^W,\F',\mu)$ corresponding to the NPS $(W,2^W,\nu_1)$

2179: %joe2

2180: %via the isomorphism $\FNP$.  It is easy to check that $U$ is

2181: %independent

2182: via the bijection $\FNP$.  It is easy to check that $U$ is independent

2183: of $V'$ but $V'$ is not independent of $U$ with respect to this Popper

2184: space, although $\mu(V' \inter U) = \mu(U) \times \mu(V') \ (=

2185: 0)$.  This observation is an instance of the following more general

2186: result, which is almost immediate from the definitions:

2187:

2188: \pro\label{pro:approximatelyindep} $U$ is approximately independent of

2189: $V$ given $V'$

2190: with respect to the NPS $(W,\F,\nu)$ iff $U$ is independent of

2191: $V$ given $V'$ with respect to the Popper space $\FNP(W,\F,\nu)$.

2192: \epro

2193:

2194: How should independence be defined in LPS's?

2195: %joe2

2196: Interestingly, neither BBD nor Hammond defines independence

2197: %joe3

2198: directly

2199: for LPS's.

2200: %joe3

2201: \commentout{

2202: BBD \citeyear{BBD1} give three definitions of independence: two of them

2203: are given in terms of NPS's; the third is an indirect definition in

2204: terms of preference orders.  Hammond also works in NPS's.  Note that

2205: requiring that $\vecmu(V

2206: \mid U) = \vecmu(V)$ will not work since $\vecmu \mid

2207: U$ and $\vecmu$ are, in general, LPS's of different lengths.  Nor is

2208: there any obvious way to define multiplication of two LPS's.

2209: %joe3:

2210: %It seems

2211: %to me that the most natural way

2212: One way

2213: to define independence in LPS's is to

2214: essentially reduce the definition to that for Popper spaces.  That is,

2215: $U$ is independent of $V$ given $V'$ with respect to the LPS

2216: $(W,\F,\vecmu)$ if the leftmost number in the sequence $\vecmu(V \mid U

2217: \inter V')$ is the same as the leftmost number in $\vecmu(V \mid V')$;

2218: as usual, independence is taken to hold trivially if $\vecmu(U \inter

2219: V') = \vec{0}$.  The following result is almost immediate from the

2220: definitions.

2221:

2222: \pro\label{pro:approximatelyindep1} $U$ is independent of $V$ given $V'$

2223: with respect to the LPS $\vecmu$ iff $U$ is approximately independent of

2224: $V$ given $V'$ with respect to each NPS in the equivalence class

2225: $\FLN([\vecmu])$.

2226: \epro

2227:

2228: \noindent Propositions~\ref{pro:approximatelyindep}

2229: and~\ref{pro:approximatelyindep1} emphasize

2230: the naturalness of approximate independence in this context.

2231: }

2232: %joe3: \end{commentout}

2233: However, they do give definitions in terms of NPS's that can be

2234: applied to equivalent LPS's; indeed, BBD \citeyear{BBD2} do just this

2235: %joe4: typo

2236: %(see the discussion of BBD strong equivalence below).

2237: (see the discussion of BBD strong independence below).

2238:

2239: I now consider independence for random variables.  If $X$ is a random

2240: variable on $W$, let $\V(X)$ denote

2241: the range (set of possible values) of $X$; that is, $\V(X) =

2242: \{X(w): w \in W\}$.

2243: %joe2

2244: %For simplicity here, assume that the range of all random variables is

2245: Recall that I am assuming that all random variables have countable range.

2246: Random variable $X$ is

2247: independent of $Y$ with respect to a standard probability measure $\mu$

2248: if the event $X=x$ is independent of the

2249: event $Y=y$ with respect to $\mu$, for all $x \in \V(X)$ and $y \in \V(Y)$.

2250: %joe2

2251: By analogy, for nonstandard probability measures, following Kohlberg and

2252: Reny \citeyear{KR97},

2253: define $X$ and $Y$ to

2254: be {\em weakly independent\/} with respect to $\nu$ if  $X=x$ is

2255: approximately independent of $Y=y$ and $Y=y$ is approximately

2256: independent of $X=x$ with respect to $\nu$ for all $x\in

2257: \V(X)$ and $y \in \V(Y)$.%

2258: \footnote{Kohlberg and Reny's definition of weak independence also

2259: requires that the joint

2260: range of $X$ and $Y$ be the product of the individual ranges.  That is,

2261: for $X$ and $Y$ to be weakly independent, it must be the case that for

2262: all $x \in \V(X)$ and $y \in \V(Y)$, there exists some $w \in W$ such

2263: that $X(w) = x$ and $Y(w) = y$.

2264: %joe3

2265: %Of course, this requirement could also be added to the definitions of

2266: %weak and strong independence I have proposed here; adding it does not

2267: %seem to make a significant difference.}

2268: Of course, this requirement could also be added to the definition

2269: I am proposing here; adding it would not affect any of the results

2270: of this paper.}

2271:

2272:

2273:

2274:

2275: For standard probability measures, it easily follows

2276: that if $X$ is independent of $Y$, then $X \in U_1$ is independent of $Y

2277: \in V_1$ conditional on $Y \in V_2$ and $Y \in V_1$ is independent of $X

2278: \in U_1$ conditional on $X \in U_2$, for all $U_1, U_2 \subseteq \V(X)$

2279: and $V_1, V_2 \subseteq \V(Y)$.   The same arguments show that this is

2280: also true for nonstandard probability measures.  However, the

2281: argument breaks down for approximate independence.

2282:

2283: \xam\label{xam:needapproximate} Suppose that $W = \{1,2,3\} \times

2284: \{1,2\}$. Let $X$ and $Y$ be the random variables that project onto the

2285: first and second components of a world, respectively, so that $X(i,j) =

2286: i$ and $Y(i,j) = j$.  Let $\nu$ be the nonstandard probability measure

2287: on $W$ given by the following table:

2288:

2289:

2290: %\begin{table}[h]

2291: \begin{center}

2292: \begin{tabular}{| c | c | c |}

2293: \hline

2294: %joe2: added X=, Y=

2295: & $Y=1$ & $Y=2$ \\

2296: \hline

2297: $X=1$ & $1 - 3 \epsilon - 3\epsilon^2$ & $\epsilon$\\

2298: \hline

2299: $X=2$ & $\epsilon$ & $\epsilon^2$\\

2300: \hline

2301: $X=3$ & $\epsilon$ & $2\epsilon^2$\\

2302: \hline

2303: \end{tabular}

2304: \end{center}

2305: %\end{table}

2306: It is easy to check that

2307: %$X = i$ is approximately independent of $Y=j$ and that $Y=j$

2308: %is approximately independent of $X=i$

2309: $X$ and $Y$ are weakly independent

2310: with respect to $\nu$; that is, $X=i$ is approximately independent of $Y=j$, and vice versa, for all $i \in

2311: \{1,2,3\}$, $j \in \{1,2\}$.  However, $\stand{\nu(X = 2 \mid X \in

2312: \{2,3\} \inter Y=2)} = 1/3$, while $\stand{\nu(X=2 \mid X \in \{2,3\})}

2313: = 1/2$.
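To verify these claims, note that the marginals are $\nu(X=1) = 1 - 2\epsilon - 3\epsilon^2$, $\nu(X=2) = \epsilon + \epsilon^2$, $\nu(X=3) = \epsilon + 2\epsilon^2$, $\nu(Y=1) = 1 - \epsilon - 3\epsilon^2$, and $\nu(Y=2) = \epsilon + 3\epsilon^2$.  For instance, $\nu(Y=2 \mid X=2) = \epsilon^2/(\epsilon + \epsilon^2)$ and $\nu(Y=2)$ both have standard part 0; the other cases are similar.  On the other hand,
$$\nu(X=2 \mid X \in \{2,3\} \inter Y=2) = \frac{\epsilon^2}{3\epsilon^2} = \frac{1}{3}, \qquad \nu(X=2 \mid X \in \{2,3\}) = \frac{\epsilon + \epsilon^2}{2\epsilon + 3\epsilon^2} = \frac{1 + \epsilon}{2 + 3\epsilon},$$
and the latter has standard part $1/2$.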

2314: \exam

2315:

2316: In light of this example, I define $X$ to be {\em approximately independent of

2317: %joe2

2318: $\{Y_1, \ldots, Y_n\}$ with respect to $\nu$\/} if $X \in U_1$ is

2319: %joe3

2320: %approximately independent of $Y_1 \in V_1 \inter \ldots \inter Y_n \in V_n$

2321: %conditional on $Y_1 \in V_1' \inter \ldots \inter Y_n \in V_n'$ with

2322: approximately independent of $(Y_1 \in V_1) \inter \ldots \inter (Y_n

2323: \in V_n)$ conditional on $(Y_1 \in V_1') \inter \ldots \inter (Y_n \in

2324: V_n')$ with

2325: respect to $\nu$ for all

2326: $U_1 \subseteq \V(X)$, $V_i, V_i' \subseteq \V(Y_i)$, and $i = 1, \ldots,

2327: n$.   $X_1, \ldots, X_n$ are {\em approximately independent with respect

2328: to $\nu$\/} if $X_i$ is approximately independent of $\{X_1, \ldots,

2329: X_n\} - \{X_i\}$ with respect to $\nu$ for $i = 1, \ldots, n$.  I leave

2330: to the reader the obvious extensions to

2331: conditional independence and the

2332: analogues of this definition for Popper spaces and LPS's.

2333: %joe2

2334: %Note that the events $U$ and $V$ are approximately independent iff the

2335: %random variables $\chi_U$ and $\chi_V$ are approximately independent (or

2336: %weakly independent---approximate independence and weak independence

2337: %coincide for binary random variables).

2338:

2339: %joe3

2340: %I consider one last notion of independence for random variables,

2341: %\emph{strong independence}, first considered by Kohlberg and Reny

2342: %\citeyear{KR97}.

2343:

2344: %joe3

2345: %We can, of course, define strong independence for LPS's and NPS's by

2346: %translating the definition from Popper spaces using the mapping $\FNP$

2347: %and $\FLN$.  But there is as more direct, and much more natural,

2348: %definition in the case of NPS's, as the following the following theorem

2349: %shows.

2350:

2351: As I said, BBD consider three notions of independence for random variables.

2352: One is a decision-theoretic notion of stochastic independence on preference

2353: relations on acts over $W$.  Under appropriate assumptions, it can be

2354: shown that a preference relation is stochastically independent

2355: iff it can be

2356: represented by some (real-valued) utility function $u$ and a nonstandard

2357: probability measure $\nu$ such that $X_1, \ldots, X_n$ are approximately

2358: independent with respect to $\nu$ \cite{BV96}.

2359: A second notion they consider is a weak notion of

2360: product measure that requires only that there exist measures $\nu_1,

2361: \ldots, \nu_n$ such that $\stand{\nu(w_1, \ldots, w_n)} =

2362: \stand{\nu_1(w_1) \times \cdots \times \nu_n(w_n)}$.  As we have already

2363: observed, this notion of independence is rather weak.  Indeed, an

2364: example in BBD shows that it misses out on some interesting

2365: decision-theoretic behavior.

2366:

2367: \commentout{

2368: Approximate independence and strong independence differ in the order of

2369: universal and existential quantification.  $X$ and $Y$ are

2370: approximately independent with respect to $\nu$ if, for all values $x$

2371: and $y$ in the range of $X$ and $Y$, respectively, there is an NPS

2372: $\nu_{xy}$ such that $\nu_{xy} \aeq \nu$ and $X=x$ and $Y=y$ are

2373: independent with respect to $\nu_{xy}$.  On the other hand, $X$ and $Y$

2374: are strongly independent if there exists an NPS $\nu'$ such that $\nu'

2375: \aeq \nu$ and for all $x$ and $y$ in the range of $X$ and $Y$,

2376: respectively, $X = x$ is independent of $Y=y$.  Clearly KR-strong

2377: independence implies approximate independence.  As the following

2378: example (due to Kohlberg and Reny \citeyear{KR97}) shows, in general, it

2379: is strictly stronger.

2380:

2381: \xam Suppose that $W = \{1,2,3\} \times \{1,2,3\}$.

2382: Let $X$ and $Y$ be the random variables that project onto the first and second

2383: components of a world, respectively, so that $X(i,j) = i$ and $Y(i,j) =

2384: j$.  Let $\nu$ be the nonstandard probability measure on $W$ given by the

2385: following table:

2386:

2387: %\begin{table}[h]

2388: \begin{center}

2389: \begin{tabular}{| c | c | c | c |}

2390: \hline

2391: %joe2: added X=, Y=

2392: & $Y=1$ & $Y=2$ & Y=3\\

2393: \hline

2394: $X=1$ & $1 - 3 \epsilon - 4\epsilon^2 - 3 \epsilon^3 - \epsilon^4$ &

2395: $2\epsilon$ & $\epsilon^2$\\

2396: \hline

2397: $X=2$ & $\epsilon$ & $\epsilon^2$ & $2\epsilon^3$\\

2398: \hline

2399: $X=3$ & $2\epsilon^2$ & $\epsilon^3$ & $\epsilon^4$\\

2400: \hline

2401: \end{tabular}

2402: \end{center}

2403: %\end{table}

2404: It is easy to check that $X$ and $Y$ are approximately independent.

2405: However, they are not strongly independent.   Suppose, by way of

2406: contradiction, that there exists some probability measure $\nu' \aeq

2407: \nu$ such that $X$ and $Y$ are independent with respect to $\nu'$.

2408: Note that

2409: $$\stand{\frac{\nu(X=1 \inter Y=2)}{\nu(X=2 \inter Y=1)}} =

2410: \stand{\frac{\nu(X=3 \inter Y=1)}{\nu(X=1 \inter Y=3)}} =

2411: \stand{\frac{\nu(X=2 \inter Y=3)}{\nu(X=3 \inter Y=2)}} = 2.$$

2412: Since $\nu' \aeq \nu$, it is easy to check that

2413: $$\stand{\frac{\nu'(X=1 \inter Y=2)}{\nu'(X=2 \inter Y=1)}} =

2414: \stand{\frac{\nu'(X=3 \inter Y=1)}{\nu'(X=1 \inter Y=3)}} =

2415: \stand{\frac{\nu'(X=2 \inter Y=3)}{\nu'(X=3 \inter Y=2)}} = 2.$$

2416: Thus, it follows that

2417: $$\stand{\frac{\nu'(X=1 \inter Y=2) \times \nu'(X=3 \inter Y=1) \times \nu'(X=2

2418: \inter Y=3)}{ \nu'(X=2 \inter Y=1) \times \nu'(X=1 \inter Y=3) \times

2419: \nu'(X=3 \inter Y=2)}} = 8.$$

2420: However, since $X$ and $Y$ are independent with respect to $\nu'$, we

2421: must have

2422: $$\stand{\frac{\nu'(X=1 \inter Y=2) \times \nu'(X=3 \inter Y=1) \times

2423: \nu'(X=2

2424: \inter Y=3)}{ \nu'(X=2 \inter Y=1) \times \nu'(X=1 \inter Y=3) \times

2425: \nu'(X=3 \inter Y=2) }} = 1.$$

2426: This gives the desired contradiction.

2427: \exam

2428: \commentout{

2429: We can define two events $U$ and $V$ to be strongly independent if the

2430: random variables $\chi_U$ and $\chi_V$ are strongly independent.

2431: However, it is not hard to check (using techniques much like those used

2432: to prove Proposition~\ref{indaeq}) that $\chi_U$ and $\chi_V$ are

2433: strongly independent iff they are weakly (or approximately) independent.

2434: Distinctions that are significant when considering independence of

2435: random variables disappear at the level of independence

2436: of events.

2437: }%\end{commsentout}

2438: }%\end{commentout}

2439:

2440: %joe3: all new

2441: The third notion of independence that BBD consider is the strongest.

2442: BBD \citeyear{BBD2} define $X_1, \ldots, X_n$ to be

2443: strongly independent with respect to an LPS

2444: $\vecmu$ if they are independent (in the usual sense) with respect to an NPS

2445: $\nu$ such that $\vecmu \aeq \nu$.%

2446: \footnote{In \cite{BBD2}, BBD say that this definition of strong

2447: independence is given in \cite{BBD1}.  However, the definition appears

2448: to be given only in terms of NPS's in \cite{BBD1}.}

2449: Moreover, they give a characterization

2450: of this notion of strong independence, which I henceforth call \emph{BBD

2451: strong independence}, to distinguish it from the KR notion of strong

2452: independence that I discuss shortly.

2453: Given a vector $\vec{r} = (r^0, \ldots, r^{k-1})$ of reals in

2454: $(0,1)^k$ and a finite LPS

2455: $\vecmu = (\mu^0, \ldots, \mu^k)$, let $\vecmu \, \Box \, \vec{r}$

2456: be the (standard) probability measure

2457: $$(1 - r^0) \mu^0 + r^0[(1-r^1) \mu^1 + r^1[(1-r^2)\mu^2 + r^2[\cdots +

2458: r^{k-2}[(1-r^{k-1})\mu^{k-1} + r^{k-1}\mu^k]\ldots ]]].$$
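For concreteness, here is how the definition unwinds in the smallest
nontrivial case; the instance $k=2$ is chosen purely for illustration
and is not taken from BBD.  If $\vecmu = (\mu^0, \mu^1, \mu^2)$ and
$\vec{r} = (r^0, r^1)$, then
$$\vecmu \, \Box \, \vec{r} = (1-r^0)\mu^0 + r^0(1-r^1)\mu^1 + r^0 r^1 \mu^2.$$
The three weights are positive and sum to 1.  Moreover, the ratio of the
weight of $\mu^1$ to that of $\mu^0$ is $r^0(1-r^1)/(1-r^0)$, and the
ratio of the weight of $\mu^2$ to that of $\mu^1$ is $r^1/(1-r^1)$; both
ratios go to 0 as $\vec{r} \rightarrow (0,0)$.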

2459: Note that $\vecmu \, \Box \, \vec{r}$ is defined only if $\vecmu$ is

2460: finite.  Thus, in discussing BBD strong independence, I restrict  to

2461: finite LPS's.

2462: %joe6

2463: In addition, for technical reasons that will become clear in the proof

2464: of Theorem~\ref{BBDstrongindependence}, I consider only random variables

2465: with finite range, which is what BBD do as well.

2466: BBD \citeyear[p.~90]{BBD2} claim without proof that ``it is

2467: straightforward to show'' that $X_1, \ldots, X_n$ are BBD strongly

2468: independent with respect to $\vecmu$ iff there is a

2469: sequence $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in $(0,1)^k$

2470: such that $\vec{r}^j \rightarrow (0,\ldots, 0)$

2471: as $j\rightarrow\infty$,

2472: and $X_1, \ldots, X_n$ are

2473: independent with respect to $\vecmu \, \Box \, \vec{r}^j$ for $j = 1, 2, 3,

2474: \ldots$.  I can prove this result only if the NPS $\nu$ such that

2475: $\vecmu \aeq \nu$ and $X_1, \ldots, X_n$ are independent with respect to

2476: $\nu$ has a range that is an elementary extension of the reals (and thus

2477: has the same first-order properties as the reals).

2478:

2479: \thm\label{BBDstrongindependence}

2480: There exists an NPS $\nu$ whose range

2481: is an

2482: elementary extension of the reals such that $\vecmu \aeq \nu$ and $X_1,

2483: \ldots, X_n$ are

2484: %joe5

2485: %strongly

2486: independent with respect to $\nu$ iff there

2487: exists a sequence

2488: $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in $(0,1)^k$

2489: such that $\vec{r}^j \rightarrow (0,\ldots, 0)$

2490: as $j\rightarrow\infty$,

2491: and $X_1, \ldots, X_n$ are

2492: independent with respect to $\vecmu \, \Box \, \vec{r}^j$ for $j = 1, 2, 3,

2493: \ldots$.

2494: \ethm

2495: I do not know if this result holds without requiring that the range of

2496: $\nu$ be an elementary extension of the reals.

2497:

2498: Kohlberg and Reny \citeyear{KR97} define a notion of strong independence with

2499: respect to what they call {\em relative probability spaces}, which are

2500: closely related to Popper spaces of the form

2501: $(W,2^W,2^W-\{\emptyset\},\mu)$, where all subsets of $W$ are measurable and

2502: it is possible to condition on all nonempty sets.

2503: %joe3

2504: Their definition is similar in spirit to the characterization of BBD

2505: strong independence given in Theorem~\ref{BBDstrongindependence}.

2506: For ease of exposition, I recast their definition in terms of Popper spaces.

2507: $X_1, \ldots, X_n$ are {\em KR-strongly independent\/} with respect to the

2508: Popper space $(W,\F,\F', \mu)$, where $\F'$ includes all events of the

2509: form $X_i = x$ for $x \in \V(X_i)$, if there exists a sequence of

2510: standard probability measures $\mu_1, \mu_2, \ldots$ such that $\mu_j

2511: \rightarrow \mu$, and for all $j = 1, 2, 3, \ldots$,

2512: $\mu_j(U) > 0$ for $U \in \F'$ and $X_1,

2513: \ldots, X_n$ are independent with respect to $\mu_j$.

2514: As Kohlberg and Reny show,

2515: KR-strong independence implies approximate independence%

2516: \footnote{They actually show only that it implies weak independence, but

2517: the same argument shows that it implies approximate independence.}

2518: and is, in general, strictly stronger.

2519:

2520: The following theorem characterizes KR-strong independence in terms of

2521: NPS's.

2522:

2523: \thm\label{KRindependence}

2524: %joe3

2525: %$X_1, \ldots, X_n$ are strongly independent with respect to the Popper

2526: $X_1, \ldots, X_n$ are KR-strongly independent with respect to the Popper

2527: space $(W,\F,\F',\mu)$ iff there

2528: exists an NPS $(W,\F,\nu)$ such that

2529: %joe4

2530: %$\FNP(W,\F,\nu) = \mu$ and $X_1, \ldots,

2531: $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$ and $X_1, \ldots,

2532: X_n$ are independent with respect to $(W,\F,\nu)$.

2533: \ethm

2534: It follows from the proof that we can require the range of $\nu$ to be a

2535: nonelementary extension of the reals, but this is not necessary.

2536:

2537: %joe3:

2538: \commentout{

2539: There is a sense in which KR-strong independence is weaker than

2540: BBD-strong independence.

2541: Define $X_1, \ldots, X_n$ to be KR-strongly independent (resp.,

2542: BBD-strongly independent) with respect to

2543: NPS $\nu$ if there exists an NPS $\nu'$ such that $\nu \simeq \nu'$

2544: (resp., $\nu \aeq \nu'$) and $X_1, \ldots, X_n$ are independent with

2545: respect to  $\nu'$.  As we have seen, $\simeq$ is a coarser notion of

2546: equivalence than $\sim

2547: }

2548: %joe3: \end{commentout}

2549:

2550: %We must be a little careful regarding the interepretation of

2551: %Theorem~\ref{KRindependence} since, as observed after Theorem~\ref{FNP},

2552: %the map $\FNP$ acts the same on all $\simeq$ equivalence classes, and

2553: %$\simeq$ is a rather coarse equivalence relation.  I discuss this issue

2554: %in more detail after the proof of Theorem~\ref{KRindependence} in the

2555: %appendix.

2556:

2557: %joe3:

2558: \commentout{

2559: Now I can compare the definitions given here to those discussed by BBD,

2560: Hammond, and Kohlberg and Reny.  BBD define a (standard or nonstandard)

2561: probability measure

2562: $\nu$ on $W= W_1 \times \cdots \times W_n$ to be a product measure if

2563: there exist measures $\nu_i$ on $W_i$ for $i = 1, \ldots, n$, such that

2564: such that $\nu((w_1, \ldots, w_n)) = \nu_1(w_1) \times \cdots \times

2565: \nu_n(w_n)$.  If $X_i$ is the random variable that projects on to the

2566: $i$th component, then it is easy to see that $\nu$ is a product measure

2567: iff $X_1, \ldots, X_n$ are independent.

2568:

2569:

2570: Hammond mainly focuses on Popper spaces, and follows BBD's lead

2571: in considering when a Popper space can be, in a sense, viewed as a

2572: product measure.   He defines a notion of conditional independence of a

2573: Popper space defined on $W = W_1 \times \cdots \times W_n$ which is

2574: similar in spirit to the notion of independence of random variables in

2575: Popper spaces as defined here.  In fact, it is straightforward to show

2576: that the Popper space $(W_1 \times \cdots \times W_n, \F,\F',\mu)$ is

2577: conditionally independent in Hammond's sense iff the projections

2578: $X_1, \ldots, X_n$ are independent with respect to the Popper space, in

2579: the sense defined here. }

2580: %\end{commentout}

2581:

2582: %joe3

2583: %Finally, I compare these definitions to those discussed by Kohlberg and

2584: %Reny \citeyear{KR97}.

2585: %As I said,

2586: %Kohlberg and Reny defined weak independence and strong independence of

2587: %random variables with respect to relative probability spaces.%

2588: Kohlberg and Reny show that their notions of weak and strong independence

2589: can be used to characterize

2590: Kreps  and Wilson's

2591: \citeyear{KW82} notion of sequential equilibrium.

2592: BBD \citeyear{BBD2} use their notion of strong independence in their

2593: characterization of perfect equilibrium and proper equilibrium for games

2594: with more than two players.

2595: %joe4

2596: Finally, Battigalli \citeyear{Bat96} uses approximate independence (or,

2597: equivalently, independence in cps's) to characterize sequential

2598: equilibrium.

2599: %joe3

2600: %Thus, all these notions

2601: %play a significant role in characterizing concepts of great

2602: %relevance to game theory.

2603: \commentout{

2604: Sequential equilibrium uses the notion of an {\em assessment}.

2605: Given a game $\Gamma$, an assessment is

2606: a pair $(\rho,\pi)$, where $\rho$ is a

2607: function that assigns to each information set $I$ in $\Gamma$ a probability

2608: measure $\rho(I)$ on the set of histories in that information set, and $\pi$

2609: assigns to each a node $x$ a probability $\pi(x)$ on the possible next

2610: moves at that node so that $\pi(x) = \pi(x')$ for two nodes $x$ and $x'$

2611: in the same information set.   Roughly speaking, an assessment is {\em

2612: consistent\/} if,  whenever information set $I$ immediately follows

2613: information set $I'$, if $I$ can be reached from $I'$ with positive

2614: probability, then  $\rho(I)$ is obtained from

2615: $\rho(I')$ by the obvious computation; if $I$ is not reachable from

2616: $I'$ with positive probability, then $\rho(I)$ must be the

2617: limit of the probabilities on $I$ induced by imposing small trembles on the

2618: moves (so that all of them have some small positive probability, which

2619: goes to 0).

2620: Kohlberg and Reny \cite{KR97} show that $(\rho,\pi)$ is an

2621: assessment (i.e., $\pi(x) = \pi(x')$ for all $x$ and $x'$ in the same

2622: information set) iff $S_1, \ldots, S_n$ are weakly independent and

2623: $(\rho, \pi)$ is a consistent iff $S_1, \ldots, S_n$ are strongly

2624: independent.

2625: }

2626: %Thus, all of weak independence, approximate independence, and strong

2627: %independence

2628:

2629:

2630: \section{Discussion}\label{discussion}

2631: As the preceding discussion shows, there is a sense in which NPS's

2632: are more general than both Popper spaces and LPS's.

2633: %joe6

2634: It would be of interest to get a natural characterization of those NPS's

2635: that are equivalent to Popper spaces and LPS's; this remains an open

2636: problem.

2637: LPS's are more expressive than Popper measures in finite spaces and in

2638: infinite spaces where we assume countable additivity (in the sense

2639: discussed at the end of Section~\ref{PopperNPS}), but without assuming

2640: countable additivity, they are incomparable,

2641: %joetark:

2642: as Examples~\ref{counter1} and~\ref{counter2} show.

2643: %as Example~\ref{counter1} shows.

2644: %Although NPS's are equivalent to LPS's in

2645: %finite state spaces, NPS's have other advantages.

2646: %For example, as

2647: %pointed out by Hammond \citeyear{Hammond94} and BBD, it is also easier

2648: %to define

2649: %independence in NPS's.

2650: %joe2:

2651: Since all of these approaches to representing uncertainty have been

2652: used in characterizing solution concepts in extensive-form games and

2653: notions of admissibility, the results here suggest that it is worth

2654: considering the extent to which these results depend on the particular

2655: representation used.

2656:

2657: It is worth stressing here that this notion of equivalence depends on

2658: the fact that I have been viewing cps's, LPS's, and NPS's as

2659: representations of uncertainty.  But, as Asheim \citeyear{Asheim06}

2660: emphasizes, they can also be viewed as representations of conditional

2661: preferences.  Example~\ref{McGee} shows that, even in finite spaces,

2662: NPS's and LPS's can express preferences that cps's cannot.  However, as

2663: %joe9

2664: %Asheim and Pereira \citeyear{AP05} point out, in finite spaces, cps's

2665: Asheim and Perea \citeyear{AP05} point out, in finite spaces, cps's

2666: can also represent conditional  preferences that cannot be represented by

2667: LPS's and NPS's.  See \cite{Asheim06} for a detailed discussion of the

2668: expressive power of these representations with respect to

2669: conditional preferences.

2670:

2671: Although NPS's are the most expressive of the three approaches I have

2672: considered,  they have some disadvantages.  In particular,

2673: working with a nonstandard probability measure requires defining and

2674: working with a non-Archimedean field.

2675: LPS's have the advantage of using just standard probability measures.

2676: Moreover, their lexicographic structure may give useful insights.

2677: It seems to be worth considering the

2678: extent to which LPS's can be generalized so as to increase their

2679: expressive power.

2680: %joe8

2681: %I am currently exploring LPS's ordered by an arbitrary (not necessarily

2682: %well-founded) index set.  It seems that such LPS's

2683: %may be useful in understanding

2684: %%characterizing  iterated deletion of weakly dominated strategies.

2685: In particular, it may be of interest to consider LPS's indexed by

2686: partially ordered and not necessarily well-founded sets, rather than

2687: just LPS's indexed by the ordinals.

2688: For example,

2689: Brandenburger, Friedenberg, and Keisler~\citeyear{BFK04} characterize

2690: $n$ rounds of

2691: iterated deletion using finite LPS's, for any $n$.

2692: %joe8

2693: %it seems that that these results are more cleanly stated using infinite

2694: %LPS's ordered by the

2695: Rather than using a sequence of (finite) LPS's of different lengths to

2696: characterize (unbounded) iterated deletion,

2697: it seems that a result similar in spirit can be obtained using a single LPS

2698: indexed by the (positive and negative) integers.

2699: %I hope to report on this in future work.

2700:

2701: %One final point: defining belief.

2702: I conclude with a brief discussion of a few other issues raised by this

2703: paper.

2704: \begin{itemize}

2705: \item Belief:

2706: The connections between LPS's, NPS's, and cps's are relevant to the

2707: notion of belief.

2708: %joe3

2709: There are two standard notions of belief that can be defined in LPS's.

2710: %joe6

2711: %Say that $U$ is {\em strongly believed\/} in LPS $\vecmu$ of length

2712: Say that $U$ is a {\em certain belief\/} in LPS $\vecmu$ of length

2713: $\alpha$ if $\mu_\beta(U) = 1$ for all $\beta < \alpha$; $U$ is {\em

2714: weakly believed\/} if $\mu_0(U) = 1$.

2715: Brandenburger, Friedenberg, and Keisler \citeyear{BFK04} defined a

2716: %joe3

2717: third

2718: notion of belief,

2719: intermediate between weak belief and certain belief,

2720: % using LPS's

2721: and provided an elegant decision-theoretic justification of it.

2722: According to their definition, an agent {\em assumes $U$

2723: %joe3

2724: %in LPS

2725: in

2726: %$\vecmu$\/} if there is some $j \le m$ such that (a) $\mu_i(U) = 1$ for all

2727: $\vecmu$\/} if there is some $\beta < \alpha$ such that (a) $\mu_{\beta'}(U) =

2728: 1$ for all

2729: $\beta' \le \beta$, (b) $\mu_{\beta''}(U) = 0$ for all $\beta'' > \beta$, and

2730: (c) $U \subseteq \union_{\beta' \le \beta} \Supp(\mu_{\beta'})$, where

2731: $\Supp(\mu_{\beta'})$

2732: denotes the support of

2733: the probability measure $\mu_{\beta'}$.  (Condition (c) is unnecessary if $W$

2734: is finite, given Brandenburger, Friedenberg, and Keisler's assumption that

2735: $W = \union_{\beta'} \Supp(\mu_{\beta'})$.)

2736: %joe3

2737: %The usual notion of belief in probability

2738: %spaces is that $U$ is believed with respect to probability measure $\mu$

2739: %if $\mu(U) = 1$.

2740: %Assumption can be viewed as a strong notion of belief.

2741: %joe3:

2742: There are straightforward analogues of certain belief and weak belief in

2743: Popper spaces.  $U$ is a certain belief in a Popper space

2744: $(W,\F,\F',\mu)$ if $\mu(U \mid V) = 1$ for all $V \in \F'$; $U$ is

2745: weakly believed if $\mu(U \mid V) = 1$ for all $V \in \F'$ such that

2746: $\mu(V) > 0$.

2747: %joe6

2748: Analogues of this notion of assumption have been considered elsewhere in

2749: the literature.

2750: Van Fraassen

2751: \citeyear{vF95} independently defined a

2752: %joe6

2753: %strong

2754: notion of belief using Popper spaces; in a finite state space, an event

2755: is what van Fraassen calls a \emph{belief core} iff it is assumed in

2756: the sense of Brandenburger, Friedenberg,

2757: and Keisler.  Battigalli and Siniscalchi's \citeyear{BS02} notion of

2758: \emph{strong belief} is also essentially equivalent.

2759: Assumption also corresponds to Stalnaker's \citeyear{Stal98} notion of

2760: \emph{absolutely robust belief} and Asheim and S{\o}vik's \citeyear{AS05}

2761: notion of \emph{robust belief}.

2762: Asheim and S{\o}vik \citeyear{AS05} do a careful comparison of all these

2763: notions (and others).

2764: %joe3

2765: %in a finite state space,

2766: %an event is what van Fraassen calls a {\em belief core\/}

2767: %iff it is assumed in the sense of Brandenburger and Keisler.

2768: %%joe4

2769: %}

2770:

2771: %joe3

2772: %That there should be equivalent notions of strong belief in the

2773: %That there should be equivalent notions of belief in the

2774: %context of LPS's and Popper spaces is perhaps not that surprising, in

2775: %light of the close connection between them.

2776: It is easy to define analogues of certain and weak belief in NPS's:

2777: $U$ is a certain belief if $\nu(U) = 1$; $U$ is weakly believed if

2778: $\stand{\nu(U)} = 1$.

2779: The results of this paper

2780: %joe3

2781: %suggest that it may also be worth considering such strong notions of

2782: suggest that it may also be worth

2783: %considering what the analogue of these notions is

2784: %in the context of NPS's.

2785: investigating an analogue of assumption in NPS's.

2786:

2787: \item Nonstandard utility:

2788: In this paper, while I have allowed probabilities to be

2789: lexicographically ordered or nonstandard, I have implicitly assumed that

2790: utilities are standard real numbers (since I have restricted to

2791: real-valued random variables).

2792:   There is a tradition in decision theory going back to Hausner

2793: \citeyear{Hausner54} and continued recently in a sequence of papers by

2794: Fishburn and LaValle (see \cite{FL99} and the references therein)

2795: %joe2

2796: and Hammond \citeyear{Hammond99} of

2797: considering nonstandard or lexicographically-ordered utilities.  I have

2798: not considered the relationship between these ideas and the ones

2799: considered here, but there may be some fruitful connections.

2800:

2801: \item Countable additivity for NPS's:

2802: Countable additivity for standard

2803: probability measures is essentially a continuity condition.  The

2804: fact that $\sum_{i=1}^\infty a_i$ may not be the least upper bound of

2805: the partial sums $\sum_{i=1}^n a_i$ in an NPS leads to a certain lack of

2806: continuity in decision-making.  For example, let $W = \{w_1, w_2, \ldots\}$.

2807: Consider a nonstandard probability measure $\nu$ such that $\nu(w_1) =

2808: 1/3 -\epsilon$, $\nu(w_2) = 1/3 + \epsilon$, and $\nu(w_{k+2}) = 1/(3

2809: \times 2^k)$, for $k = 1, 2, \ldots$.  Let $U_n = \{w_3, \ldots, w_n\}$

2810: and let $U_\infty = \{w_3, w_4, \ldots \}$.  Clearly $\nu(U_n) \tendsto

2811: \nu(U_\infty) = 1/3$.  However, $\nu(U_n) < \nu(w_1)$ for all $n$.

2812: Thus, $E_\nu(\chi_{\{w_1\}}) > E_\nu(\chi_{U_n})$ for all $n \ge 3$ although

2813: $E_\nu(\chi_{\{w_1\}}) < E_\nu(\chi_{U_\infty})$.

2814:

2815: Not surprisingly, the same situations can be modeled with LPS's.

2816: Consider the LPS $(\mu_1, \mu_2)$, where

2817: %joe9

2818: %$\mu_1 = \stand{\nu_1}$, $\mu(w_1) = 0$,

2819: $\mu_1 = \stand{\nu}$, $\mu_2(w_1) = 0$,

2820: $\mu_2(w_2) = 2/3$, and $\mu_2(w_{k+2}) = 1/(3\times 2^k)$ for $k = 1,

2821: 2, \ldots$.  It is easy to see

2822: that again $E_{\vecmu}(\chi_{\{w_1\}}) > E_{\vecmu}(\chi_{U_n})$ for all $n

2823: \ge 3$ although  $E_{\vecmu}(\chi_{\{w_1\}}) < E_{\vecmu}(\chi_{U_\infty})$.

2824: (A similar example can be obtained using SLPS's, by replacing each world

2825: $w_i$ by a pair of worlds $w_i', w_i''$, where $w_i'$ is in the support

2826: of $\mu_1$ and $w_i''$ is in the support of $\mu_2$.)

2827:

2828: An analogous continuity problem arises even in finite domains.  Let $W = \{w_1, w_2, w_3\}$ and consider a sequence of

2829: probability measures $\nu_n$ such that $\nu_n(w_1) = 1/3

2830: -1/n$, $\nu_n(w_2) = 1/3 - \epsilon$, and $\nu_n(w_3) = 1/3 + 1/n +

2831: \epsilon$.  Clearly $\nu_n

2832: \tendsto \nu$, where $\nu(w_1) = 1/3$, $\nu(w_2) = 1/3 - \epsilon$, and

2833: $\nu(w_3) = 1/3 + \epsilon$.  However, $E_{\nu_n}(\chi_{\{w_1\}}) <

2834: E_{\nu_n}(\chi_{\{w_2\}})$ for all $n$, while $E_\nu(\chi_{\{w_1\}}) >

2835: E_\nu(\chi_{\{w_2\}})$.  Again, the same situation can be modeled using LPS's

2836: (and even SLPS's).

2837:

2838:

2839: %joe2

2840: %Is this lack of continuity a problem?

2841: Of course, continuity plays a significant role in standard

2842: axiomatizations of SEU, and is vital in proving the existence of a Nash

2843: equilibrium.  None of the uses of continuity that I am familiar with

2844: have the specific form of this example, but I believe it is worth

2845: considering further the impact of this lack of continuity.

2846: %I am not sure, but I believe it deserves further thought.

2847: \end{itemize}
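
The computation behind the first continuity example above can be spelled
out as follows; this is a routine check that uses only the values of
$\nu$ given there and assumes, as in that example, that $\epsilon$ is a
positive infinitesimal.  For $n \ge 3$,
$$\nu(U_n) = \sum_{k=1}^{n-2} \frac{1}{3 \times 2^k} =
\frac{1}{3}\left(1 - \frac{1}{2^{n-2}}\right).$$
This is a standard real strictly less than $1/3$, so it is also less
than $\nu(w_1) = 1/3 - \epsilon$.  On the other hand, as stated in the
example, $\nu(U_\infty) = 1/3 > 1/3 - \epsilon = \nu(w_1)$.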

2848:

2849: \paragraph{Acknowledgments:}  I'd like to thank Adam Brandenburger and

2850: Peter Hammond for a number of very enlightening discussions, Bob

2851: Stalnaker for pointing out Example~\ref{counter1}, Brian Skyrms for

2852: pointing me to Hammond's work, Bas van Fraassen for pointing

2853: me to Spohn's work, Amanda Friedenberg for her careful reading of an

2854: earlier draft, her many useful comments, and for encouraging me to try

2855: to understand what my results had to say about Battigalli and

2856: Siniscalchi's work,

2857: and Horacio Arlo-Costa, Geir Asheim, Larry Blume, Adam Brandenburger,

2858: Eddie Dekel, and the anonymous reviewers for a number of

2859: useful comments on earlier drafts of this paper.

2860:

2861:

2862: %joetark:

2863:

2864: \appendix

2865:

2866: \section{Appendix: Proofs}

2867: In this section, I prove all the results claimed in the main part of the

2868: paper.  For the convenience of the reader, I repeat the statements of

2869: the results.

2870:

2871: \medskip

2872:

2873: \othm{FCPfin}

2874: %joe9

2875: %If $W$ is finite, the map $\FCP$ is a bijection from $\SLPS(W,\F)$ to

2876: %$\Popper(W,\F)$.

2877: %joe10

2878: %If $W$ is finite and $(\F,\F')$ is a Popper algebra over $W$, then

2879: If $W$ is finite, then

2880: $\FCP$ is a bijection from $\SLPS(W,\F,\F')$ to $\Popper(W,\F,\F')$.

2881: \eothm

2882:

2883: \medskip

2884:

2885: \prf The first step is to show that $\FCP$ is an injection.

2886: If $\vecmu, \vecmu' \in \SLPS(W,\F,\F')$ and $\vecmu \ne \vecmu'$, let

2887: $\mu = \FCP(W,\F,\vecmu)$, and let $\mu' = \FCP(W,\F,\vecmu')$.    Let

2888: $i$ be the least index such that $\mu_i \ne \mu'_i$.

2889: There is some set $U$ such that $\mu_i(U) \ne \mu'_i(U)$.

2890: Let $U_i$ be a set such that $\mu_i(U_i) = 1$ and $\mu_j(U_i) = 0$ for $j <

2891: i$; since $\vecmu$ is an SLPS, such a set $U_i$ exists.  Similarly, let

2892: $U_i'$ be such that $\mu_i'(U_i') = 1$ and $\mu_j'(U_i') = 0$ for $j <

2893: i$.  Since $\mu_j = \mu_j'$ for all $j < i$, we must have $\mu_j(U_i \union

2894: U_i') = \mu_j'(U_i \union U_i') = 0$ for all $j <  i$.

2895: Clearly $\vecmu(U_i \union U_i') > 0$, so $U_i \union U_i' \in \F'$.

2896: Moreover,

2897: $\mu(U \mid U_i \union U_i') = \mu_i(U \mid U_i \union U_i') =

2898: \mu_i(U)$.  Similarly, $\mu'(U \mid U_i \union U_i') = \mu_i'(U)$.

2899: Hence, $\mu \ne \mu'$.

2900:

2901: To show that $\FCP$ is a surjection, given a cps $\mu$, let $\vecmu =

2902: (\mu_0, \ldots, \mu_k)$ be the LPS constructed in the main text.  We

2903: must show that

2904: $\FCP(\vecmu) = (W,\F,\F',\mu)$.  Suppose that  $\FCP(\vecmu) =

2905: (W,\F,\F'',\mu')$.

2906: I first show that $\F' = \F''$.  Suppose that $V \in \F''$.  Then

2907: $\mu_i(V) > 0$ for some $i$.  Thus, $\mu(V \mid U_i) > 0$.  Since

2908: $U_i \in \F'$, it follows that $V \in \F'$.  Thus, $\F'' \subseteq \F'$.

2909:

2910: To show that $\F' \subseteq \F''$, first note that, by

2911: construction, $\mu(U_j \mid \overline{U_0 \union \ldots \union U_{j-1}}

2912: ) = 1$.

2913: %Since $U_j \union \ldots \union U_k \subseteq \overline{U_0 \union

2914: %\ldots \union U_{j-1}}$, it follows from CP3 that

2915: %$$1= \mu(U_j \mid \overline{U_0 \union \ldots \union U_{j-1}} ) = \mu(U_j \mid

2916: %U_j \union \ldots \union U_k) \times \mu(U_j \union \ldots \union U_k \mid

2917: %\overline{U_0 \union \ldots \union U_{j-1}}).$$

2918: %Thus, $\mu(U_j \mid \overline{U_0 \union \ldots \union U_{j-1}} ) = 1$ and

2919: %$\mu(U_{j'} \mid \overline{U_0 \union \ldots \union U_{j-1}} ) = 0$ if $j' >

2920: %j$.

2921: It easily follows that if $V \subseteq \overline{U_0 \union \ldots

2922: \union U_{j-1}}$

2923: then $$\mu(V \mid \overline{U_0 \union \ldots \union U_{j-1}}) = \mu(V

2924: \inter U_j \mid \overline{U_0 \union \ldots \union U_{j-1}}).$$

2925: Thus, by CP3,

2926: $$\mu(V \mid \overline{U_0 \union \ldots \union U_{j-1}}) =

2927: \mu(V \inter U_j \mid \overline{U_0 \union \ldots \union U_{j-1}}) = \mu(V \mid

2928: U_j) \times

2929: \mu(U_j \mid \overline{U_0 \union \ldots \union U_{j-1}}),$$ so

2930: %if $V \subseteq \overline{U_0 \union \ldots \union U_{j-1}}$, then

2931: \begin{equation}\label{eq1}

2932: \mu(V \mid U_j) = \mu(V \mid \overline{U_0 \union \ldots \union U_{j-1}}).

2933: \end{equation}

2934:

2935: Now suppose that $V \in \F'$.

2936: Clearly $V \inter (U_0 \union \ldots \union U_k) \ne

2937: \emptyset$, for otherwise $V \subseteq \overline{U_0 \union \ldots

2938: \union U_k}$, contradicting the fact that $\overline{U_0 \union \ldots

2939: \union U_k} \notin \F'$.  Let $j_V$ be the smallest index $j$ such that $V

2940: \inter U_j \ne \emptyset$.

2941: I claim that $\mu(V \mid \overline{U_0 \union \ldots \union U_{j_V - 1}}) \ne

2942: 0$.  For if $\mu(V \mid \overline{U_0 \union \ldots \union U_{j_V - 1}}) =

2943: 0$, then $\mu(U_{j_V} - V \mid \overline{U_0 \union \ldots \union U_{j_V -

2944: 1}}) = 1$, contradicting the definition of $U_{j_V}$

2945: as the smallest set $U'$ such that $\mu(U' \mid \overline{U_0 \union

2946: \ldots \union U_{j_V - 1}}) = 1$.    Moreover, since

2947: $V \subseteq \overline{U_0 \union \ldots \union U_{j_V-1}}$, it follows

2948: from (\ref{eq1}) that

2949: $\mu(V \mid U_{j_V}) = \mu(V \mid \overline{U_0 \union

2950: \ldots \union U_{j_V - 1}}) > 0$.  Thus, $\mu_{j_V}(V) > 0$, so $V \in

2951: \F''$.

2952:

2953: This argument can be extended to show that $\mu(V' \mid V) = \mu'(V' \mid

2954: V)$ for all $V' \in \F$.

2955: Since $V \inter U_i = \emptyset$ for $i < j_V$, it follows that

2956: $\mu'(V' \mid V) = \mu_{j_V}(V' \mid V)$.

2957: By CP3, $\mu(V' \mid V) \times \mu(V \mid \overline{U_0 \union \ldots

2958: \union U_{j_V

2959: - 1}}) = \mu(V'\inter V \mid \overline{U_0 \union \ldots \union U_{j_V - 1}})$.

2960: By (\ref{eq1}) and the fact that $\mu(V \mid U_{j_V}) > 0$, it follows

2961: that $\mu(V' \mid V) = \mu(V'\inter V \mid U_{j_V})/\mu(V \mid

2962: U_{j_V})$,

2963: %joe9

2964: %i.e.,  that $\mu(V' \mid V) = \mu_{j_V}(V' \mid V)$.

2965: that is,  that $\mu(V' \mid V) = \mu_{j_V}(V' \mid V)$.

2966: \eprf

2967:

2968:

2969: \bigskip

2970:

2971: Although Theorem~\ref{infiso} was proved by Spohn \citeyear{Spohn86}, I

2972: include a proof here as well, to make the paper self-contained.

2973:

2974:

2975: \othm{infiso} For all $W$, the map $\FCP$ is a bijection from

2976: $\SLPS^c(W,\F,\F')$

2977: to $\Popper^c(W,\F,\F')$.  \eothm

2978:

2979: \medskip

2980:

2981: \prf Again, the difficulty comes in showing that $\FCP$ is onto.

2982: As stated in the main text, given a Popper space $(W,\F,\F',\mu)$, the

2983: idea is to

2984: construct sets $U_0, U_1, \ldots$ and an LPS $\vecmu$ such that

2985: $\mu_\beta(V)=\mu(V \mid U_\beta)$, and show that $\FCP(W,\F,\vecmu) =

2986: (W,\F,\F',\mu)$. The construction is somewhat involved.

2987:

2988: As a first step, put an order $\le$ on sets in

2989: $\F'$ by defining $U \le V$ if

2990: $\mu(U \mid U \union V) > 0$.

2991: %joe1

2992: (Essentially, the same order is considered by van Fraassen \citeyear{vF76}.)
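
To get some intuition for this order, consider the special case (assumed
here only for illustration) where $\mu$ is the cps in
$\FCP(W,\F,\vecmu)$ for some SLPS $\vecmu$.  For $U \in \F'$, let
$\ell(U)$ be the least index $\beta$ such that $\mu_\beta(U) > 0$; the
notation $\ell$ is introduced just for this remark.  Then
$\ell(U \union V) = \min(\ell(U),\ell(V))$ and
$\mu(U \mid U \union V) = \mu_{\ell(U \union V)}(U)/\mu_{\ell(U \union
V)}(U \union V)$, so
$$U \le V \mbox{ iff } \mu_{\min(\ell(U),\ell(V))}(U) > 0 \mbox{ iff }
\ell(U) \le \ell(V).$$
Thus, in this special case, $\le$ orders the sets in $\F'$ by the first
level of the LPS at which they get positive probability.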

2993:

2994: \lem\label{lem0}  $\le$ is transitive.  \elem

2995:

2996: \prf

2997: By definition, if $U \le V$ and $V \le V'$, then $\mu(U \mid U \union V) >0$

2998: and $\mu(V \mid V \union V') > 0$.  To see that $\mu(U \mid U \union

2999: V') > 0$, note that

3000: $\mu(U \mid U \union V \union V') + \mu(V \mid U \union V \union V') + \mu(V' \mid U

3001: \union V \union V') \ge 1$, so at least one of $\mu(U \mid U \union V \union

3002: V')$, $\mu(V \mid U \union V \union V')$, or $\mu(V' \mid U \union V \union V')$

3003: is positive.  I consider each of the cases separately.

3004:

3005: \paragraph{Case 1:} Suppose that $\mu(U \mid U \union V \union V') > 0$.  By CP3,

3006: $$\mu(U \mid U \union V \union V') = \mu(U \mid  U \union V') \times \mu(U \union

3007: V' \mid U \union V \union V').$$

3008: Thus, $\mu(U \mid U \union V') > 0$, as desired.

3009:

3010: \paragraph{Case 2:} Suppose that $\mu(V \mid U \union V \union V') > 0$.

3011: By assumption, $\mu(U \mid U \union V) > 0$; since $\mu(V \mid U \union V \union

3012: V') > 0$, it follows that $\mu(U \union V \mid U \union V \union V') > 0$.

3013: Thus, by CP3,

3014: $$\mu(U \mid U \union V \union V') = \mu(U \mid  U \union V) \times \mu(U \union

3015: V \mid U \union V \union V') > 0.$$

3016: Thus,  case 2 can be reduced to case 1.

3017:

3018: \paragraph{Case 3:} Suppose that $\mu(V' \mid U \union V \union V') > 0$.

3019: By assumption, $\mu(V \mid V \union V') > 0$; since $\mu(V' \mid U \union V \union

3020: V') > 0$, it follows that $\mu(V \union V' \mid U \union V \union V') > 0$.

3021: Thus, by CP3,

3022: $$\mu(V \mid U \union V \union V') = \mu(V \mid  V \union V') \times \mu(V \union

3023: V' \mid U \union V \union V') > 0.$$

3024: Thus, case 3 can be reduced to case 2.

3025:

3026:

3027: This completes the proof, showing that $\le$ is transitive.

3028: \eprf

3029:

3030: Define  $U \sim V$ if $U \le V$ and $V \le U$.

3031:

3032: \lem\label{lem1} $\sim$ is an equivalence relation on $\F'$. \elem

3033:

3034: \prf It is immediate from the

3035: definition that $\sim$ is reflexive and symmetric; transitivity follows

3036: from the transitivity of $\le$.   \eprf

3037:

3038: R\'{e}nyi \citeyear{Renyi56}

3039: and van Fraassen \citeyear{vF76} also considered the $\sim$ relation in

3040: their papers, and the argument that $\le$ is transitive is similar in

3041: spirit to R\'{e}nyi's argument that $\sim$ is transitive.

3042: However, the rest of this proof diverges from those of R\'{e}nyi and van

3043: Fraassen.

3044:

3045: Let $[U]$ denote the $\sim$-equivalence class of $U$, and

3046: let $\F'/\nsim = \{[U]: U \in \F'\}$.

3047:

3048:

3049: \lem\label{lem2} Each equivalence class $[V] \in \F'/\nsim$ is closed under

3050: countable unions.  \elem

3051:

3052: \prf Suppose that $V_1, V_2, \ldots \in [V]$.  I must show that

3053: $\union_{i=1}^\infty V_i \in [V]$.  Clearly $\union_{i=1}^\infty V_i \le V_j$

3054: for all $j$.  Suppose, by way of contradiction, that

3055: $V_j \not\le \union_{i=1}^\infty V_i$ for some $j$.  Since $\le$ is

3056: transitive and the sets $V_j$ are all equivalent, it follows that $\union_{i=1}^\infty V_i < V_j$ for all $j$.

3057: Thus, $\mu(V_j \mid \union_{i=1}^\infty V_i) = 0$ for all $j$.

3058: But then, by countable additivity,

3059: %joe9

3060: % $$1 = \mu(\union_{i=1}^\infty V_i) \mid \union_{i=1}^\infty V_i) \le

3061:  $$1 = \mu(\union_{i=1}^\infty V_i \mid \union_{i=1}^\infty V_i) \le

3062: \sum_{j=1}^\infty \mu(V_j \mid \union_{i=1}^\infty V_i) = 0,$$

3063: a contradiction.  Thus, $[V]$ is closed under countable unions.

3064: \eprf

3065:

3066:

3067: \commentout{

3068: Next, observe that we can define a total preorder $\preceq$ (i.e., a

3069: reflexive and transitive relation) on $[V]$ using

3070: the same techniques as used to define $\le$.  Namely,

3071: $V_1 \preceq V_2$ if $\mu(V_1 \mid V_1 \union V_2) \le \mu(V_2 \mid V_1 \union

3072: V_2)$.  To see that $\preceq$ is transitive, suppose that $V_1, V_2, V_3

3073: \in [V]$ and $V_1 \preceq V_2$ and $V_2 \preceq V_3$.

3074: By CP3,

3075: \begin{equation}\label{eq2}

3076: \mu(V_i \mid V_1 \union V_2 \union V_3) = \mu(V_i \mid V_i \union V_j)

3077: \times \mu(V_i \union V_j \mid V_1 \union V_2 \union V_3),

3078: \end{equation}

3079: for all $i, j$.

3080: Since $V_1 \le V_2, applying (\ref{eq2}) first with $i = 1$ and $j=2$

3081: and then with $i=2$ and $j=1$,

3082: it follows that $\mu(V_1 \mid V_1 \union V_2 \union V_3)

3083: \le \mu(V_2 \mid V_1 \union V_2 \union V_3)$.  Similarly, since $V_2 \le

3084: V_3$, it follows that $\mu(V_2 \mid V_1 \union V_2 \union V_3)

3085: \le \mu(V_3 \mid V_1 \union V_2 \union V_3)$.

3086: Thus,

3087: \begin{equation}\label{eq3}

3088: \mu(V_1 \mid V_1 \union V_2 \union V_3) \le \mu(V_3 \mid V_1 \union V_2 \union

3089: V_3).

3090: \end{equation}

3091: Since $[V]$ is

3092: closed under unions, it follows that $V_1 \union V_3 \in [V]$ and

3093: $V_1 \union V_2 \union V_3 \in [V]$.  Thus, $\mu(V_1 \union V_3 \mid V_1

3094: \union V_2 \union V_3) > 0$.  Now it immediately follows from

3095: (\ref{eq2}) and (\ref{eq3}) that $\preceq$ is transitive.}

3096:

3097: Fix an element $V_0 \in [V]$.

3098: \lem\label{lem3} $\inf \{\mu(V_0 \mid V_0 \union V'): V' \in [V]\} > 0$. \elem

3099:

3100: \prf Suppose that $\inf \{\mu(V_0 \mid V_0 \union V'): V' \in [V]\} = 0$.

3101: Then there exist sets $V_1, V_2, \ldots \in [V]$ such that $\mu(V_0 \mid V_0 \union

3102: V_n) < 1/n$.  Since $[V]$ is closed under countable unions,

3103: $\union_{i=1}^\infty V_i \in [V]$.  Since $V_0 \sim \union_{i=1}^\infty V_i$, it

3104: follows that $\mu(V_0 \mid \union_{i=0}^\infty V_i) > 0$.

3105: But, by CP3, $$\mu(V_0 \mid \union_{i=0}^\infty V_i) = \mu(V_0 \mid V_0 \union V_n)

3106: \times \mu(V_0 \union V_n \mid \union_{i = 0}^\infty V_i) \le \mu(V_0 \mid V_0

3107: \union V_n) \le 1/n.$$

3108: Since this is true for all $n > 0$, it follows that

3109: $\mu(V_0 \mid \union_{i=0}^\infty V_i) = 0$, a contradiction.

3110: \eprf

3111:

3112: The next lemma shows that each equivalence class in $\F'/\nsim$ has a

3113: ``maximal element''.

3114:

3115: \lem\label{lem4} In each equivalence class $[V]$, there is an element

3116: $V^*\in [V]$ such that $\mu(V^* \mid V' \union V^*) = 1$ for all $V' \in

3117: [V]$. \elem

3118:

3119: \prf Again, fix an element $V_0 \in [V]$.  By Lemma~\ref{lem3}, there

3120: exists some $\alpha_V > 0$ such that $\inf \{\mu(V_0 \mid V_0 \union V'): V'

3121: \in [V]\} = \alpha_V$. Thus, there exist sets $V_1, V_2, V_3, \ldots \in

3122: [V]$ such that $\mu(V_0 \mid V_0 \union V_n) < \alpha_V + 1/n$.  By

3123: Lemma~\ref{lem2}, $V^* = \union_{i=0}^\infty V_i \in [V]$.   By CP3,

3124: %\begin{equation}

3125: $$\mu(V_0 \mid V^*) = \mu(V_0 \mid V_0 \union V_n) \times \mu(V_0 \union V_n \mid V^*)

3126: \le \mu(V_0 \mid V_0 \union V_n) < \alpha_V + 1/n.$$

3127: %\end{equation}

3128: Thus, $\mu(V_0 \mid V^*) \le \alpha_V$.  By choice of $\alpha_V$, it follows

3129: that $\mu(V_0 \mid V^*) = \alpha_V$.

3130:

3131: Suppose that

3132: $\mu(V^* \mid V' \union V^*) < 1$ for some $V' \in [V]$.  But then, by CP3,

3133: $$\mu(V_0 \mid V' \union V^*) = \mu(V_0 \mid V^*) \times \mu(V^* \mid V' \union V^*) <

3134: \alpha_V,$$ contradicting the choice of $\alpha_V$.  Thus,

3135: $\mu(V^* \mid V' \union V^*) = 1$ for all $V' \in [V]$.  \eprf

3136:

3137: Define a

3138: total order on these equivalence classes by taking $[U] \le [V]$ if

3139: $U' \le V'$ for some $U' \in [U]$ and $V' \in [V]$.  It is easy to check

3140: (using the transitivity of $\le$) that if $U' \le V'$ for some $U' \in

3141: [U]$ and some $V' \in [V]$, then $U'' \le V''$ for all $U'' \in [U]$ and

3142: all $V'' \in [V]$.

3143:

3144: \lem $\le$ is a well-founded relation on $\F'/\nsim$. \elem

3145:

3146: \prf

3147: Note that if $[U] < [V]$, then $\mu(V \mid U \union V) = 0$.  It now

3148: follows from countable additivity that $<$ is a well-founded order on

3149: these equivalence classes.  For suppose that there exists an infinite

3150: decreasing sequence $[U_0] > [U_1] > [U_2] > \ldots$.

3151: Since $\F$ is a $\sigma$-algebra, $\union_{i=0}^\infty U_i \in \F$; since

3152: $\F'$ is closed under supersets, $\union_{i=0}^\infty U_i \in \F'$.

3153: By CP3,

3154: $$\mu(U_j \mid \union_{i=0}^\infty U_i) = \mu(U_j \mid U_{j} \union U_{j+1}) \times

3155: \mu(U_j \union U_{j+1} \mid \union_{i=0}^\infty U_i) = 0.$$  Let $V_0 = U_0$

3156: and, for $j > 0$, let

3157: $V_j = U_j - (\union_{i =0}^{j-1} U_i)$.  Clearly the $V_j$'s are

3158: pairwise disjoint, $\union_i U_i = \union_i V_i$, and

3159: $\mu(V_j \mid \union_{i=0}^\infty U_i) \le \mu(U_j \mid \union_{i=0}^\infty U_i) = 0$.

3160: It now follows, using countable additivity, that

3161: %joe9

3162: %$$1 = \mu(\union_{i=0}^\infty U_i \mid \union_{i=0}^\infty U) =

3163: $$1 = \mu(\union_{i=0}^\infty U_i \mid \union_{i=0}^\infty U_i) =

3164: \sum_{i=0}^\infty \mu(V_i \mid  \union_{i=0}^\infty U_i) = 0.$$

3165: This is a contradiction, so the equivalence classes are well-founded.

3166: \eprf

3167:

3168: Because $\le$ is well-founded, there is an order-preserving bijection

3169: $O$ from $\F'/\nsim$ to an initial segment of the ordinals (i.e., $[U]

3170: \le [V]$ iff $O([U]) \le O([V])$).

3171: Thus, the equivalence classes can be enumerated using all the ordinals

3172: less than some ordinal $\alpha$.  By Lemma~\ref{lem4}, there are

3173: sets $U_\beta$, $\beta < \alpha$, in $\F'$ such that if $O([U]) =

3174: \beta$, then $U_\beta \in [U]$ and $\mu(U_\beta \mid U' \union U_\beta) = 1$

3175: for all $U' \in [U]$.  Define an LPS $\vecmu = (\mu_0, \mu_1, \ldots )$ of

3176: length $\alpha$ by taking $\mu_\beta(V) = \mu(V \mid U_\beta)$.  The choice

3177: of the $U_\beta$'s guarantees that this is actually an SLPS.

3178:

3179: It remains to show that $(W,\F,\F',\mu)$ is the result of applying

3180: %joe9

3181: %$F_{C \rightarrow P}$

3182: $\FCP$

3183: to $(W,\F,\vecmu)$.  Suppose that instead $(W,\F,\F'',\mu')$ is the

3184: result.  The argument that $\F'' \subseteq \F'$ is identical to that in

3185: the finite case: If $V \in \F''$, then

3186: $\mu_\beta(V) > 0$ for some $\beta$.  Thus, $\mu(V \mid U_\beta) > 0$.  Since

3187: $U_\beta \in \F'$, it follows that $V \in \F'$.  Thus, $\F'' \subseteq \F'$.

3188:

3189: Now suppose that $V \in \F'$.  Thus, $V \sim U_\beta$ for some $\beta <

3190: \alpha$.  In the rest of the proof, I write $V_\gamma$ for the set $U_\gamma$ chosen above.  It follows that $\mu(V \mid V_\beta) > 0$, so $V \in \F''$.

3191:

3192: Finally, to show that $\mu(U \mid V) = \mu'(U \mid V)$, suppose that $\beta$ is

3193: such that $V \sim V_\beta$.  It follows that $\mu(V \mid V_{\beta'}) = 0$ for

3194: $\beta' < \beta$ and $\mu(V \mid V_{\beta}) > 0$.  Thus, by definition,

3195: $\mu'(U \mid V) = \mu_\beta(U \mid V)$.  Without loss of generality, assume

3196: that $U \subseteq V$ (otherwise replace $U$ by $U \inter V$).  Thus, by

3197: CP3,

3198: \begin{equation}\label{eq4}

3199: \mu(U \mid V) \times \mu(V \mid V \union V_\beta) = \mu(U \mid V \union V_\beta).

3200: \end{equation}

3201: Suppose $V' \subseteq V$.

3202: Clearly $$\mu(V' \mid V \union V_\beta) = \mu(V' \inter V_\beta \mid V \union

3203: V_\beta) + \mu(V' \inter \overline{V_\beta} \mid V \union V_\beta).$$

3204: Now by CP3 and the fact that  $\mu(V_\beta \mid V \union V_\beta) = 1$,

3205: $$\mu(V' \inter V_\beta \mid V \union V_\beta) = \mu(V' \mid V_\beta) \times

3206: \mu(V_\beta \mid V \union V_\beta) = \mu(V' \mid V_\beta)$$

3207: and

3208: $$\mu(V' \inter \overline{V_\beta} \mid V \union V_\beta) \le

3209: \mu(\overline{V_\beta} \mid V \union V_\beta) = 0.$$

3210: Thus, $\mu(V' \mid V \union V_\beta) = \mu(V' \mid V_\beta)$.

3211: Applying this observation to both $U$ and $V$ shows that

3212: $\mu(V \mid V \union V_\beta) = \mu(V \mid V_\beta)$ and $\mu(U \mid V

3213: \union V_\beta)

3214: =\mu(U \mid V_\beta)$.  Plugging this into (\ref{eq4}), it follows that

3215: $$\mu(U \mid V) = \mu(U \mid V_\beta)/\mu(V \mid V_\beta) = \mu_\beta(U)/\mu_\beta(V) =

3216: \mu_\beta(U \mid V) = \mu'(U \mid V).$$

3217: This completes the proof of the theorem.

3218: \eprf
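
As an informal illustration of the construction above (a sketch added for concreteness; the particular space is an assumption, not taken from the main text), let $W = \{w_1, w_2\}$, let $\F = 2^W$, and let $\F'$ consist of all nonempty subsets of $W$.  Define $\mu$ by taking $\mu(\{w_1\} \mid W) = \mu(\{w_1\} \mid \{w_1\}) = 1$ and $\mu(\{w_2\} \mid \{w_2\}) = 1$.  Then
$$\mu(\{w_1\} \mid \{w_1\} \union \{w_2\}) = 1 > 0 \quad \mbox{and} \quad \mu(\{w_2\} \mid \{w_1\} \union \{w_2\}) = 0,$$
so $\{w_1\} \le \{w_2\}$ but $\{w_2\} \not\le \{w_1\}$; that is, $\{w_1\} < \{w_2\}$.  Similarly, $W \sim \{w_1\}$, so there are two equivalence classes, $[\{w_1\}] = \{\{w_1\}, W\}$ and $[\{w_2\}] = \{\{w_2\}\}$, with $[\{w_1\}] < [\{w_2\}]$.  Taking $U_0 = W$ and $U_1 = \{w_2\}$ as the maximal elements given by Lemma~\ref{lem4} yields
$$\mu_0(V) = \mu(V \mid W), \quad \mu_1(V) = \mu(V \mid \{w_2\}),$$
so $\mu_0$ assigns probability 1 to $w_1$ and $\mu_1$ assigns probability 1 to $w_2$.  It is easy to check that, in this case, applying $\FCP$ to $(W,\F,(\mu_0,\mu_1))$ gives back $(W,\F,\F',\mu)$.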

3219:

3220:

3221: \bigskip

3222: \opro{prop:BS} The map $\FCP$ is a surjection from

3223: $\SLPS^c(W,\F,\F')$ onto $\T^c(W,\F,\F')$.  \eopro

3224:

3225: \medskip

3226:

3227: \prf Suppose that $\mu \in \T^c(W,\F,\F')$.  I want to construct an

3228: SLPS $\vecmu \in \SLPS^c(W,\F,\F')$ such that $\FCP(\vecmu) = \mu$.

3229: I first label each element of $\F'$ with a natural

3230: number.  Intuitively, if $U \in \F'$ is labeled $k$, then $k$ will be

3231: the least index such that $\mu_k(U) > 0$.  The labeling is done by

3232: induction on $k$.  Each topmost set in the forest

3233: (i.e., the root of some tree in the forest) is labeled 0, as are all

3234: sets $U'$ such that $\mu(U' \mid U) > 0$, where $U$ is a topmost node.

3235: These are all the nodes labeled by 0.  Label all the maximal unlabeled

3236: sets by 1 (that is, label $U \in \F'$ by 1 if it is not labeled 0, and

3237: is not a subset of another unlabeled set); in addition, label a set $U'$

3238: by 1 if $\mu(U' \mid U) > 0$ and $U$ is labeled by 1.  Note that every

3239: set at depth 0 or 1 in the forest is labeled by either 0 or 1.

3240:

3241: Suppose that the labeling process has been completed for labels $0,

3242: \ldots, k$ such that the following properties hold, where $\lab(U)$

3243: denotes the label of the event $U$:

3244: \begin{itemize}

3245: \item all sets up to depth $k$ in the forest have been labeled;

3246: \item if $\lab(U) = k'$, $U' \in \F'$, and $\mu(U' \mid U) > 0$, then

3247: $\lab(U') \le \lab(U)$.

3248: \end{itemize}

3249: Label all the maximal unlabeled sets with $k+1$; in addition, if $U'$

3250: is unlabeled and $\mu(U' \mid U) > 0$ for some $U$ such that $\lab(U)

3251: = k+1$, then assign label $k+1$ to $U'$.  Clearly the two properties

3252: above continue to hold.  This completes the labeling process.

3253:

3254: Let $\C_k$ be the set of maximal sets in $\F'$ labeled $k$.

3255: T2 and T3 guarantee that, for all $k$, the sets in $\C_k$ are

3256: disjoint.  Let $\mu_k'$ be

3257: an arbitrary probability on $W$ such that $\mu_k'(U) > 0$ for all $U \in

3258: \C_k$ and $\sum_{U \in \C_k} \mu_k'(U) = 1$.  Define an LPS

3259: $\vecmu = (\mu_0, \mu_1, \ldots)$ as follows (where the length of

3260: $\vecmu$ is $\omega$ if $\C_k \ne \emptyset$ for all $k$, and

3261: is $k+1$ if $k$ is the largest integer such that $\C_k \ne \emptyset$).

3262: For $V \in \F$, let $\mu_j(V) = \sum_{U \in \C_j} \mu(V \mid U)

3263: \mu_j'(U)$.  I now show that $\vecmu(V \mid U) = \mu(V \mid U)$ for all $V \in

3264: \F$ and $U \in \F'$.  Suppose that $U \in \C_k$.  Then $\mu_j(U) = 0$ for

3265: all $j < k$, and $\mu_k(U) > 0$.  Thus, $\vecmu(V \mid U) = \mu_k(V \mid

3266: U)$.  But it is immediate from the definition that $\mu_k(V \mid U) =

3267: \mu(V \mid U)$.  Thus, $\FCP(\vecmu) = \mu$.  Moreover, if $U \in \F'$

3268: and $\lab(U) = k$, let $U'$ be the maximal set containing $U$ such

3269: that $\lab(U') = k$.  (The labeling guarantees that such a set

3270: exists.)  Then $\mu_k(U') = \mu(U' \mid U) > 0$.  It follows that

3271: $\vecmu(U) > 0$ for all $U \in \F'$.  Finally, note that

3272: $\vecmu$ is an SLPS (in fact, an LCPS). If $U_k = \union \C_k -

3273: \union_{k' > k} (\union \C_{k'})$, then the sets $U_k$ are disjoint, and

3274: $\mu_k(U_k) = 1$.  \eprf

3275:

3276:

3277:

3278: \bigskip

3279:

3280: \opro{FCPaeq}

3281: If $\nu \aeq \vecmu$, then $\nu(U) > 0$ iff $\vecmu(U) > \vec{0}$.

3282: Moreover, if $\nu(U) > 0$, then $\stand{\nu(V \mid U)} = \mu_j(V \mid U)$, where

3283: $\mu_j$ is the first probability measure in $\vecmu$ such that $\mu_j(U)

3284: > 0$. \eopro

3285:

3286: \medskip

3287:

3288: \prf  Recall that for  $U \subseteq W$, $\chi_U$ is the indicator

3289: function for $U$;

3290: that is, $\chi_U(w) = 1$ if $w \in U$ and $\chi_U(w) = 0$ otherwise.

3291: Notice that $E_\nu(\chi_U) > E_\nu(\chi_{\emptyset})$ iff $\nu(U) > 0$

3292: and $E_{\vecmu}(\chi_U) > E_{\vecmu}(\chi_{\emptyset})$ iff $\vecmu(U) >

3293: \vec{0}$.  Since $\nu \aeq \vecmu$, it follows that

3294: $\nu(U) > 0$ iff $\vecmu(U) > \vec{0}$.  If $\nu(U) > 0$,

3295: %joe6

3296: %note that $E_\nu(\chi_{U\inter V} - \alpha \chi_U) >

3297: %E_\nu(\chi_{\emptyset})$ iff $\alpha < \stand{\nu(V \mid U)}$.  Similarly,

3298: %$E_{\vecmu}(\chi_{U\inter V} - \alpha \chi_U) >

3299: %E_{\vecmu}(\chi_{\emptyset})$ iff $\alpha < \mu_j(U)$, where $j$ is the

3300: note that $E_\nu(\chi_{U\inter V} - r \chi_U) >

3301: E_\nu(\chi_{\emptyset})$ iff $r < \stand{\nu(V \mid U)}$.  Similarly,

3302: $E_{\vecmu}(\chi_{U\inter V} - r \chi_U) >

3303: E_{\vecmu}(\chi_{\emptyset})$ iff $r < \mu_j(V \mid U)$, where $j$ is the

3304: least index such that $\mu_j(U) > 0$.  It follows that $\stand{\nu(V \mid U)}

3305: = \mu_j(V \mid U)$.  \eprf
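
For a concrete instance (an illustrative sketch; the space and the infinitesimal $\epsilon$ are assumptions chosen for this example), let $W = \{w_1,w_2,w_3\}$, let $\mu_0$ assign probability 1 to $w_1$, let $\mu_1$ assign probability $1/2$ to each of $w_2$ and $w_3$, and let $\nu = (1-\epsilon)\mu_0 + \epsilon \mu_1$ for a positive infinitesimal $\epsilon$, so that $\nu \aeq (\mu_0,\mu_1)$ by Lemma~\ref{aeqchar}.  If $U = \{w_2,w_3\}$ and $V = \{w_2\}$, then the first measure that gives $U$ positive probability is $\mu_1$, and
$$\nu(V \mid U) = \frac{\epsilon/2}{\epsilon} = \frac{1}{2} = \mu_1(V \mid U).$$
If instead $U = \{w_1,w_2\}$, then the first such measure is $\mu_0$, and
$$\stand{\nu(V \mid U)} = \stand{\frac{\epsilon/2}{1 - \epsilon/2}} = 0 = \mu_0(V \mid U).$$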

3306:

3307: \bigskip

3308:

3309: \opro{motivation} If $\vecmu, \vecmu' \in \SLPS(W,\F)$, then

3310: $\vecmu \aeq \vecmu'$

3311: iff $\vecmu = \vecmu'$.

3312: %Moreover, if $\vecmu \in \LPS^c(W,\F)$, then

3313: %there exists a unique $\vecmu' \in \SLPS^c(W,\F)$ such that $\vecmu \aeq

3314: %\vecmu'$.

3315: \eopro

3316:

3317: \medskip

3318:

3319: \prf Clearly $\vecmu = \vecmu'$ implies that $\vecmu \aeq \vecmu'$.

3320: For the converse, suppose that $\vecmu \aeq \vecmu'$ for $\vecmu,

3321: \vecmu' \in \SLPS(W,\F)$.  If $\vecmu \ne \vecmu'$, let $\alpha$ be the

3322: least ordinal such that $\mu_\alpha \ne \mu'_\alpha$, and let $U$ be

3323: such that $\mu_\alpha(U) \ne \mu'_\alpha(U)$.  Without loss of

3324: generality, suppose that $\mu_\alpha(U) >  \mu'_\alpha(U)$.

3325: Let the sets $U_\beta$

3326: be such that $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ if

3327: $\gamma > \beta$; similarly choose the sets $U_\beta'$.  Since

3328: $\mu_\beta = \mu'_\beta$ for $\beta < \alpha$, it follows that

3329: %joe9

3330: %$\mu_\beta(U_\alpha \union U'_\alpha) = \mu_\beta(U_\alpha \union

3331: %U'_\alpha) = 0$ for $\beta < \alpha$; moreover

3332: %$\mu_\alpha(U_\alpha \union U'_\alpha) = \mu_\alpha(U_\alpha \union

3333: $\mu_\beta(U_\alpha \union U'_\alpha) = \mu'_\beta(U_\alpha \union

3334: U'_\alpha) = 0$ for $\beta < \alpha$; moreover,

3335: $\mu_\alpha(U_\alpha \union U'_\alpha) = \mu_\alpha'(U_\alpha \union

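3336: U'_\alpha) = 1$.  Replacing $U$ by $U \inter (U_\alpha \union U'_\alpha)$ if necessary (which changes neither $\mu_\alpha(U)$ nor $\mu'_\alpha(U)$), assume that $\mu_\beta(U) = \mu'_\beta(U) = 0$ for $\beta < \alpha$.  Choose $r$ such that $\mu_\alpha(U) > r >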

3337: \mu'_\alpha(U)$.  Let $X$ be the random variable $\chi_U -

3338: r\chi_{U_\alpha \union U'_\alpha}$ and let $Y = \chi_\emptyset$.

3339: Then $E_{\vecmu}(X) > E_{\vecmu}(Y)$, while

3340: $E_{\vecmu'}(X) < E_{\vecmu'}(Y)$, so $\vecmu \not\aeq \vecmu'$.

3341: \eprf

3342:

3343: \bigskip

3344:

3345: \opro{finiteeq} If $W$ is finite, then every LPS over $(W,\F)$ is

3346: equivalent to an LPS of length at most $|\Bas(\F)|$. \eopro

3347:

3348: \medskip

3349:

3350: \prf Suppose that $W$ is finite and $\Bas(\F) = \{U_1, \ldots, U_k\}$.

3351: Given an LPS $\vecmu$, define a finite subsequence $\vecmu' =

3352: %joe9

3353: %(\mu_{m_0}, \ldots, \mu_{m_h})$ of

3354: (\mu_{k_0}, \ldots, \mu_{k_h})$ of

3355: $\vecmu$  as follows.  Let $\mu_{k_0} = \mu_0$.  Suppose that

3356: $\mu_{k_0}, \ldots, \mu_{k_j}$ have been defined.  If all probability

3357: measures in $\vecmu$

3358: with index greater than $k_j$ are linear combinations of the probability

3359: measures $\mu_{k_0}, \ldots, \mu_{k_j}$, then take $\vecmu'

3360: = (\mu_{k_0}, \ldots, \mu_{k_j})$.  Otherwise, let $\mu_{k_{j+1}}$ be

3361: the probability measure in $\vecmu$ with least index that is not a

3362: linear combination of $\mu_{k_0}, \ldots, \mu_{k_j}$.

3363: Since a probability measure over $(W,\F)$ is determined by its value on

3364: the sets in $\Bas(\F)$,

3365: a probability measure over $(W,\F)$ can be identified with a vector in

3366: $\IR^{|\Bas(\F)|}$: the vector defining the probabilities of the

3367: elements in $\Bas(\F)$.  There can be at most $|\Bas(\F)|$ linearly

3368: independent such vectors; thus, $\vecmu'$ has length at most

3369: $|\Bas(\F)|$.

3370:

3371: It remains to show that $\vecmu'$ is equivalent to $\vecmu$.  Given

3372: random variables $X$ and $Y$, suppose that $E_{\vecmu}(X) <

3373: E_{\vecmu}(Y)$.  Then there is some minimal index $\beta$ such that

3374: $E_{\mu_\gamma}(X) = E_{\mu_\gamma}(Y)$ for all $\gamma < \beta$ and

3375: $E_{\mu_\beta}(X) < E_{\mu_\beta}(Y)$.  It follows that

3376: $\mu_\beta$ cannot be a linear combination of $\mu_\gamma$ for $\gamma <

3377: \beta$.  Thus, $\mu_\beta$ is one of the  probability measures in

3378: $\vecmu'$.  Moreover, the expected values of $X$ and $Y$ agree for all

3379: probability measures in $\vecmu'$ with lower index (since they do in

3380: $\vecmu$).  Thus, $E_{\vecmu'}(X) < E_{\vecmu'}(Y)$.

3381:

3382: The argument in the other direction is similar in spirit and left to the

3383: reader.  \eprf
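
To see the pruning step at work (an illustrative example, not from the main text), suppose that $W = \{w_1,w_2\}$ and $\F = 2^W$, so that $\Bas(\F) = \{\{w_1\},\{w_2\}\}$, and identify each probability measure with the vector of probabilities that it assigns to $w_1$ and $w_2$.  Consider the LPS $\vecmu = (\mu_0,\mu_1,\mu_2,\mu_3)$, where
$$\mu_0 = \mu_1 = (1/2,1/2), \quad \mu_2 = (1,0), \quad \mu_3 = (0,1).$$
The construction takes $\mu_{k_0} = \mu_0$, skips $\mu_1$ (since $\mu_1 = \mu_0$), takes $\mu_{k_1} = \mu_2$, and skips $\mu_3$, since
$$\mu_3 = 2\mu_0 - \mu_2.$$
Thus, $\vecmu' = (\mu_0,\mu_2)$, which has length $2 = |\Bas(\F)|$.  If $E_{\mu_0}(X) = E_{\mu_0}(Y)$ and $E_{\mu_2}(X) = E_{\mu_2}(Y)$, then $E_{\mu_3}(X) = 2E_{\mu_0}(X) - E_{\mu_2}(X) = E_{\mu_3}(Y)$, so dropping $\mu_3$ cannot change how $X$ and $Y$ are ordered.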

3384:

3385:

3386: \othm{lpsnps} If $W$ is finite, then

3387: %joe2

3388: %$\FLN$ is an isomorphism from $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$.

3389: $\FLN$ is a bijection from $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$

3390: that preserves equivalence (that is, each NPS in $\FLN([\vecmu])$ is

3391: equivalent to $\vecmu$).

3392: \eothm

3393:

3394: %joe2

3395: \prf   I first provide a sufficient condition for an NPS to be equivalent

3396: to an LPS in a finite space.

3397:

3398: \lem\label{aeqchar} Suppose that $\vecmu = (\mu_0,\ldots, \mu_k)$, and

3399: $\epsilon_0, \ldots, \epsilon_k$ are

3400: such that $\stand{\epsilon_{i+1}/\epsilon_{i}} = 0$ for $i = 0, \ldots,

3401: k-1$ and $\sum_{i=0}^k \epsilon_i = 1$.  Then $\vecmu \aeq \epsilon_0

3402: \mu_0 + \cdots + \epsilon_k

3403: \mu_k$.%

3404: \footnote{Although I do not need this fact here, it is easy to see that

3405: if $W$ is finite and $\vecmu = (\mu_0, \ldots, \mu_k)$ is

3406: an SLPS in $\LPS(W,\F)$, then the converse of Lemma~\ref{aeqchar} holds

3407: as well: if $\nu \aeq \vecmu$, then  $\nu = \epsilon_0 \mu_0 + \cdots +

3408: \epsilon_k \mu_k$ for some $\epsilon_0, \ldots, \epsilon_k$ such that

3409: $\stand{\epsilon_{i+1}/\epsilon_{i}} = 0$ for $i = 0, \ldots,

3410: k-1$ and $\sum_{i=0}^k \epsilon_i = 1$.  (I conjecture that this fact is true

3411: in general, not just if $\vecmu$ is an SLPS, but I have not checked this.)}

3412: \elem

3413:

3414: \prf Suppose that there exist $\epsilon_0, \ldots, \epsilon_k$ as in the

3415: statement of the lemma and

3416: %Let $\vecmu = (\mu_0, \ldots, \mu_k)$ and let

3417: $\nu = \epsilon_0 \mu_0 + \cdots + \epsilon_k \mu_k$.  I want to show

3418: that $\vecmu \aeq \nu$.

3419:

3420: If $E_{\vecmu}(X) < E_{\vecmu}(Y)$,

3421: then there exists some $j \le k$ such

3422: that $E_{\mu_j}(X) < E_{\mu_j}(Y)$ and $E_{\mu_{j'}}(X) =

3423: E_{\mu_{j'}}(Y)$ for all $j' < j$.

3424: Since $E_\nu(X) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(X)$ and

3425: $E_\nu(Y) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(Y)$,

3426: to show that $E_\nu(X) < E_\nu(Y)$, it suffices to show

3427: %joe9

3428: %that $\epsilon_j(E_{\mu_j}(X) - E_{\mu_j}(Y)) >

3429: that $\epsilon_j(E_{\mu_j}(Y) - E_{\mu_j}(X)) >

3430: \sum_{i=j+1}^k \epsilon_i (E_{\mu_i}(X) - E_{\mu_i}(Y))$.  Since

3431: $\epsilon_{j'+1} \le \epsilon_{j'}$ for $j' \ge j$

3432: %joe6

3433: (this follows from the fact that $\stand{\epsilon_{j'+1}/\epsilon_{j'}} =

3434: 0$), it follows that

3435: $\sum_{i=j+1}^k \epsilon_i (E_{\mu_i}(X) - E_{\mu_i}(Y)) \le

3436: \epsilon_{j+1} \sum_{i=j+1}^k |E_{\mu_i}(X) - E_{\mu_i}(Y)|$.

3437: %joe9

3438: Thus, it suffices to show that $\epsilon_{j+1} \sum_{i=j+1}^k

3439: |E_{\mu_i}(X) - E_{\mu_i}(Y)| <  \epsilon_j(E_{\mu_j}(Y) -

3440: E_{\mu_j}(X))$.

3441: %joe6

3442: This is trivially the case if $E_{\mu_i}(X) = E_{\mu_i}(Y)$ for all

3443: $i$ such that $j+1 \le i \le k$.  Thus, assume without loss of

3444: generality that $\sum_{i=j+1}^k |E_{\mu_i}(X) - E_{\mu_i}(Y)| > 0$.

3445: In this case, it suffices to show that $\epsilon_{j+1}/\epsilon_{j} <

3446: %joe9

3447: %(E_{\mu_j}(X) - E_{\mu_j}(Y))/\sum_{i=j+1}^k |E_{\mu_i}(X) -

3448: (E_{\mu_j}(Y) - E_{\mu_j}(X))/\sum_{i=j+1}^k |E_{\mu_i}(X) -

3449: E_{\mu_i}(Y)|$. Since the right-hand side of the inequality is  a

3450: positive real and $\stand{\epsilon_{j+1}/\epsilon_{j}} = 0$, the result

3451: follows.

3452:

3453: The argument in the opposite direction is similar.  Suppose that

3454: $E_\nu(X) < E_\nu(Y)$.

3455: %joe6

3456: %Again, since $E_\mu(X) = \sum_{i=0}^k \epsilon_i

3457: %E_{\mu_i}(X)$ and  $E_\mu(Y) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(Y)$,

3458: Again, since $E_\nu(X) = \sum_{i=0}^k \epsilon_i

3459: E_{\mu_i}(X)$ and  $E_\nu(Y) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(Y)$,

3460: it must be the case that if  $j$ is the least index such that

3461: $E_{\mu_j}(X) \ne E_{\mu_j}(Y)$, then  $E_{\mu_j}(X) < E_{\mu_j}(Y)$.

3462: Thus, $E_{\vecmu}(X) < E_{\vecmu}(Y)$.  It follows that $\vecmu \aeq \nu$.

3463: \eprf
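
As a small sanity check of Lemma~\ref{aeqchar} (the numbers are assumptions chosen for illustration), let $W = \{w_1,w_2\}$, let $\mu_0$ and $\mu_1$ assign probability 1 to $w_1$ and $w_2$, respectively, and let $\nu = (1-\epsilon)\mu_0 + \epsilon\mu_1$ for a positive infinitesimal $\epsilon$; here $\epsilon_0 = 1-\epsilon$, $\epsilon_1 = \epsilon$, and $\stand{\epsilon_1/\epsilon_0} = 0$.  Take $X = 1000\chi_{\{w_2\}}$ and $Y = \chi_{\{w_1\}}$.  Then
$$E_{\vecmu}(X) = (0,1000) < (1,0) = E_{\vecmu}(Y)$$
in the lexicographic order, and
$$E_\nu(X) = 1000\epsilon < 1 - \epsilon = E_\nu(Y),$$
since $1001\epsilon < 1$ for every infinitesimal $\epsilon$.  The large payoff on $w_2$ cannot compensate for the infinitesimal weight $\epsilon_1$, just as it cannot affect the comparison at index 0 in $\vecmu$.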

3464:

3465:

3466:

3467: It remains to show that, given an NPS $(W,\F,\nu)$, there is an equivalence

3468: class $[\vecmu]$ such that $\FLN([\vecmu]) = [\nu]$.

3469: %joe9

3470: %My goal is to find (standard) probability measures

3471: As I said in the main text, the goal now is to find (standard) probability

3472: measures

3473: $\mu_0, \ldots, \mu_{k}$ and $\epsilon_0, \ldots, \epsilon_k$ such that

3474: $\stand{\epsilon_{i+1}/\epsilon_i} = 0$ and $\nu = \epsilon_0 \mu_0 +

3475: \cdots + \epsilon_k\mu_k$.  If this can be done then, by

3476: Lemma~\ref{aeqchar}, $\nu \aeq (\mu_0, \ldots, \mu_k)$, and we are done.

3477:

3478: Suppose that $\Bas(\F) = \{U_1, \ldots, U_k\}$

3479: and that $\nu$ has range $\IR^*$.  Note that

3480: a probability measure $\nu'$ on $\F$ can be identified with a

3481: vector $(a_1, \ldots, a_k)$ over $\IR^*$, where $\nu'(U_i) = a_i$, so

3482: that $a_1 + \cdots + a_k = 1$.  In the rest of this proof, I frequently

3483: identify $\nu$ with such a vector.

3484:

3485:

3486: \lem\label{newlem1} There exist $k' \le k$, $\epsilon_0, \ldots,

3487: \epsilon_{k'}$ where $\epsilon_0 = 1$,

3488: $\stand{\epsilon_{i+1}/\epsilon_{i}} = 0$ for $i =

3489: 0, \ldots, k'-1$, and standard real-valued vectors

3490: $\vec{b}_j$, $j = 0, \ldots, k'$, in $\IR^k$ such that

3491: $$\nu = \sum_{j=0}^{k'} \epsilon_j \vec{b}_j.$$

3492: \elem

3493:

3494: \prf I show by induction on $m \le k$ that there exist $\epsilon_0,\ldots,

3495: \epsilon_m$ and $m' \le m$ such that $\epsilon_j = 0$ for $j > m'$,

3496: $\stand{\epsilon_{i+1}/\epsilon_{i}} = 0$ for $i =

3497: 0, \ldots, m'-1$, and standard vectors

3498: $\vec{b}_j$, $j = 0, \ldots, m-1$,

3499: and a possibly nonstandard vector $\vec{b}'_m =

3500: (b'_{m1}, \ldots, b'_{mk})$ such that

3501: (a) $\nu = \sum_{j=0}^{m-1} \epsilon_j \vec{b}_j + \epsilon_m \vec{b}'_m$,

3502: (b) $|b'_{mi}| \le 1$, and (c) at least $m$ of $b'_{m1}, \ldots, b'_{mk}$

3503: are standard.

3504:

3505: For the base case (where $m=0$), just take $\vec{b}'_0 =

3506: \nu$ and $\epsilon_0 = 1$.  For the inductive step, suppose that $0 \le m

3507: < k$. If  $\vec{b}'_m$ is standard, then take $\vec{b}_m = \vec{b}'_m$,

3508: $\vec{b}_{m+1} = \vec{0}$, and $\epsilon_{m+1}

3509: = 0$.  Otherwise, let $\vec{b}_m = \stand{\vec{b}'_m}$ and let

3510: $\vec{b}''_{m+1} =

3511: \vec{b}'_m - \vec{b}_m$.  Let $\epsilon' = \max\{|b''_{(m+1)i}|: i = 1,

3512: \ldots, k\}$.    Since not all components of $\vec{b}'_m$ are standard,

3513: $\epsilon' > 0$.  Note that, by construction, $\stand{\epsilon'/ b_{mi}} =

3514: 0$ if $b_{mi} \ne 0$, for $i = 1, \ldots, k$.  Let $\vec{b}'_{m+1} =

3515: \vec{b}''_{m+1}/\epsilon'$ and let $\epsilon_{m+1} = \epsilon'

3516: \epsilon_m$.

3517: By construction, $|b'_{(m+1)i}| \le 1$ and at least one

3518: component of $\vec{b}'_{m+1}$ is either 1 or $-1$.  Moreover, if

3519: $b_{mi}'$ is standard, then $b''_{(m+1)i} = b'_{(m+1)i} = 0$.  Thus,

3520: $\vec{b}'_{m+1}$ has at least one more standard component than

3521: $\vec{b}'_m$.  Since clearly $\nu = \sum_{j=0}^m\epsilon_j \vec{b}_j +

3522: \epsilon_{m+1} \vec{b}_{m+1}'$, this completes the inductive step.

3523: The lemma follows immediately.

3524: \eprf

3525:

3526: Returning to the proof of Theorem~\ref{lpsnps},

3527: I next prove by induction on $m$ that for all $m \le k'$ (where $k' \le

3528: k$ is as in Lemma~\ref{newlem1}), there exist standard probability measures

3529: $\mu_0, \ldots, \mu_m$,  (standard) vectors $\vec{b}_{m+1},

3530: \ldots, \vec{b}_{k'} \in \IR^k$, and $\epsilon_0, \ldots,

3531: \epsilon_{k'}$ such that $\nu = \sum_{j=0}^m \epsilon_j \mu_j +

3532: \sum_{j = m+1}^{k'} \epsilon_j \vec{b}_j$.

3533:

3534: The base case is immediate from Lemma~\ref{newlem1}: taking $\vec{b}_j$,

3535: $j = 1, \ldots, k'$ as in Lemma~\ref{newlem1},

3536: $\vec{b}_0$ is in fact a probability measure since $\vec{b}_0 = \stand{\nu}$.

3537: Suppose that the result holds for $m$. Consider $\vec{b}_{m+1}$.

3538: If $b_{(m+1)i} < 0$ for some $i$ then, since $\nu(U_i) \ge 0$,

3539: there must exist $j' \in \{0, \ldots, m\}$ such that $\mu_{j'}(U_i) >

3540: 0$.  Thus, there exists some $N > 0$ such that $N(\mu_{j'}(U_i)) +

3541: b_{(m+1)i} > 0$.

3542: Since there are only finitely many basic elements

3543: and every element in the vector $\mu_j$ is nonnegative, for $j = 0,

3544: \ldots, m$, there must exist some

3545: $N'$ such that

3546: $\vec{b}'_{m+1} = N'( \mu_0 + \cdots

3547: +  \mu_m) + \vec{b}_{m+1} \ge 0$.  Let $c = \sum_{i = 1}^k b_{(m+1)i}'$, and

3548: let $\mu_{m+1} = \vec{b}'_{m+1}/c$.  Clearly,

3549: $\nu = (\epsilon_0 -N' \epsilon_{m+1}) \mu_0 + \cdots + (\epsilon_m - N'

3550: \epsilon_{m+1}) \mu_m + c \epsilon_{m+1} \mu_{m+1} + \sum_{j=m+2}^{k'}

3551: \epsilon_j \vec{b}_j$.  This completes the proof of the inductive step.

3552:

3553: The theorem now immediately follows. \eprf
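
The two-stage construction can be traced on a small example (an illustrative sketch; the NPS below is an assumption, not one considered in the main text).  Suppose that $k = 2$ and $\nu = (1/2 + \epsilon, 1/2 - \epsilon)$ for a positive infinitesimal $\epsilon$.  In the proof of Lemma~\ref{newlem1}, $\vec{b}'_0 = \nu$ and $\epsilon_0 = 1$; since $\vec{b}'_0$ is not standard, $\vec{b}_0 = \stand{\vec{b}'_0} = (1/2,1/2)$, $\vec{b}''_1 = (\epsilon,-\epsilon)$, $\epsilon' = \epsilon$, $\vec{b}'_1 = (1,-1)$, and $\epsilon_1 = \epsilon$.  Since $\vec{b}'_1$ is standard, the process stops with $\vec{b}_1 = \vec{b}'_1$ and
$$\nu = \vec{b}_0 + \epsilon \vec{b}_1 = (1/2,1/2) + \epsilon (1,-1).$$
Here $\mu_0 = \vec{b}_0$ is a probability measure, but $\vec{b}_1$ has a negative component.  Taking $N' = 2$ in the inductive step gives $N'\mu_0 + \vec{b}_1 = (2,0)$, so $c = 2$ and $\mu_1 = (1,0)$, and
$$\nu = (1 - 2\epsilon)\mu_0 + 2\epsilon \mu_1 = (1/2 + \epsilon, 1/2 - \epsilon).$$
Since $\stand{2\epsilon/(1-2\epsilon)} = 0$, Lemma~\ref{aeqchar} gives $\nu \aeq (\mu_0,\mu_1)$.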

3554:

3555: \bigskip

3556:

3557:

3558: \opro{infiniteeq} Every LPS over $(W,\F)$ is

3559: equivalent to an LPS over $(W,\F)$ of length at most $|\F|$. \eopro

3560:

3561: \medskip

3562:

3563: \prf The argument is essentially the same as that for

3564: Proposition~\ref{finiteeq}, using the observation that

3565: a probability measure over $(W,\F)$ can be identified with an element of

3566: $\IR^{|\F|}$: the vector defining the probabilities of the elements in

3567: $\F$.  I leave the details to the reader. \eprf

3568:

3569:

3570: \pro\label{counter} For the NPS $(W,\F,\nu)$ constructed in

3571: Example~\ref{counter4},

3572: there is no LPS $\vecmu$ over $(W,\F)$ such that $\nu \aeq

3573: \vecmu$. \epro

3574:

3575:

3576: \prf I start with a straightforward lemma.

3577:

3578: \lem\label{distinct} Given an LPS $\vecmu$, there is an LPS $\vecmu'$

3579: such that $\vecmu

3580: \aeq \vecmu'$ and all the probability measures in $\vecmu'$ are

3581: distinct.

3582: \elem

3583:

3584: \prf Define $\vecmu'$ to be the subsequence consisting of all the

3585: distinct probability measures in $\vecmu$.  That is, suppose that $\vecmu =

3586: %joe9

3587: %(\mu_0, \mu_1, \ldots )$.  Then $\vecmu = (\mu_{k_0}, \mu_{k_1},

3588: (\mu_0, \mu_1, \ldots )$.  Then $\vecmu' = (\mu_{k_0}, \mu_{k_1},

3589: \ldots )$, where $k_0 = 0$, and, if $k_\alpha$ has been defined for all

3590: $\alpha < \beta$ and

3591: there exists an index $\gamma$ such that $\mu_{k_\alpha} \ne \mu_\gamma$ for

3592: all $\alpha < \beta$, then $k_\beta$ is the least index $\delta$ such that

3593: $\mu_{k_\alpha} \ne \mu_\delta$ for all $\alpha < \beta$.  If there is no

3594: index $\gamma$ such that $\mu_\gamma \notin \{\mu_{k_\alpha}: \alpha <

3595: \beta\}$, then $\vecmu' = (\mu_{k_\alpha}: \alpha < \beta)$.  I leave

3596: it to the reader to check that $\vecmu \aeq \vecmu'$. \eprf

3597:

3598: Returning to the proof of Proposition~\ref{counter}, suppose by way of

3599: contradiction that $\nu \aeq \vecmu$.  Without loss of generality, by

3600: Lemma~\ref{distinct}, assume that all the probability measures

3601: in $\vecmu$ are distinct.

3602: Clearly

3603: $E_\nu(\chi_W) < E_\nu(\alpha \chi_{\{w_1\}})$ if $\alpha \ge 2$ and

3604: $E_\nu(\chi_W) >

3605: E_\nu(\alpha \chi_{\{w_1\}})$ if $\alpha < 2$.  Since $\nu \aeq \vecmu$,

3606: it must be

3607: the case that $E_{\vecmu}(\chi_W) < E_{\vecmu}(\alpha \chi_{\{w_1\}})$ if

3608: $\alpha \ge 2$

3609: and $E_{\vecmu}(\chi_W) >

3610: E_{\vecmu}(\alpha \chi_{\{w_1\}})$ if $\alpha < 2$.  Since $E_{\vecmu}(\chi_W)

3611: = (1, 1, \ldots)$, it follows that if $\vecmu = (\mu_0, \mu_1, \ldots)$,

3612: it must

3613: be the case that $\mu_0(w_1) = 1/2$ and

3614: \begin{equation}\label{eq:mu1}

3615: \mu_1(w_1) \ge 1/2.

3616: \end{equation}

3617: Similar

3618: arguments (comparing $\chi_W$ to $\chi_{\{w_{j}\}}$) can be used to show that

3619: $\mu_0(w_j) = 1/2^j$ and $\mu_1(w_{2j-1}) \ge 1/2^{2j-1}$ for $j = 1, 2,

3620: \ldots$.

3621: %Next observe that $E_{\nu}(\chi_{\{w_1\}} - 2 \chi_{\{w_2\}}) =

3622: %E_{\nu}(3\chi_{\{w_1\}} - 1.5 \chi_W) (= 3\epsilon)$.

3623: Next, observe that $E_{\nu}(\chi_{\{w_1\}} - 2^{2k-1}\chi_{\{w_{2k}\}}) =

3624: (2^{k} + 1)\epsilon$.  Thus, $$E_{\nu}(\chi_{\{w_1\}} -

3625: 2^{2k-1}\chi_{\{w_{2k}\}}) = E_{\nu}((2^{k}+1)(\chi_{\{w_1\}} - (\chi_W/2))).$$

3626: %E_{\nu}(\frac{2^k-1}{2^{k+1}-1}(\chi_{\{w_1\}} -

3627: %2^{2k+1}\chi_{\{w_{2k+2}\}}) >

3628: %> E_{\nu}(\chi_{\emptyset}) = 0.$$

3629: It follows that the same relationship must hold if $\nu$ is replaced by

3630: $\vecmu$.  That is,

3631: $$\mu_1(w_1) - 2^{2k-1}\mu_1(w_{2k}) =

3632: (2^{k}+1)(\mu_1(w_1) - (1/2)).$$

3633: Rearranging terms, this gives

3634: %joe9

3635: %$$2^{k}\mu_1(w_1) + 2^{2k-1}\mu(w_{2k}) = 2^{k-1} + 1/2,$$

3636: $$2^{k}\mu_1(w_1) + 2^{2k-1}\mu_1(w_{2k}) = 2^{k-1} + 1/2,$$

3637: or

3638: \begin{equation}\label{eq1.5}

3639: %joe9

3640: %\mu_1(w_1) + 2^{k-1} \mu(w_{2k}) = 1/2 + 1/2^{k+1}.

3641: \mu_1(w_1) + 2^{k-1} \mu_1(w_{2k}) = 1/2 + 1/2^{k+1}.

3642: \end{equation}

3643: Thus, $\mu_1(w_1) \le 1/2 + 1/2^{k+1}$ for all $k \ge 1$.

3644: Putting this together with (\ref{eq:mu1}), it

3645: follows that $\mu_1(w_1) = 1/2$.  Plugging this into (\ref{eq1.5}) gives

3646: $\mu_1(w_{2k}) = 1/2^{2k}$.  It now follows that $\mu_1 =

3647: \mu_0$, contradicting the choice of $\vecmu$.  \eprf
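
\medskip

For the record, the two algebraic steps used in the proof above can be spelled out as follows; this is only a restatement of the computation, using nothing beyond (\ref{eq:mu1}) and (\ref{eq1.5}).  Moving all the terms involving $\mu_1$ to the left-hand side of the displayed equation for $\mu_1$ gives
$$\mu_1(w_1) - (2^{k}+1)\mu_1(w_1) - 2^{2k-1}\mu_1(w_{2k}) = -(2^{k}+1)/2,$$
that is, $2^{k}\mu_1(w_1) + 2^{2k-1}\mu_1(w_{2k}) = 2^{k-1} + 1/2$; dividing both sides by $2^k$ gives (\ref{eq1.5}).  Once $\mu_1(w_1) = 1/2$ is known, (\ref{eq1.5}) reduces to
$$2^{k-1}\mu_1(w_{2k}) = 1/2^{k+1}, \mbox{ so that } \mu_1(w_{2k}) = 1/2^{2k}.$$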

3648:

3649: \bigskip

3650:

3651: \othm{FNP} $\FNP$ is a bijection from $\NPS(W,\F)/\!\simeq$ to

3652: $\Popper(W,\F)$ and from $\NPS^c(W,\F)/\!\simeq$ to $\Popper^c(W,\F)$.

3653: \eothm

3654:

3655: \medskip

3656:

3657: \prf

3658: %joe9

3659: As I said in the main text, the proof that $\FNP$ is an injection is

3660: straightforward, and to prove that it is a surjection in the countably

3661: additive case, it suffices to show that $\FNP(W,\F,\nu) =

3662: (W,\F,\F',\mu)$, where $\nu \aeq \vecmu'$ and $\vecmu'$ is the

3663: countably additive SLPS such that $\FCP((W,\F,\vec{\mu}'))

3664: = (W, \F,\F', \mu)$.   I now do this.

3665:

3666: Suppose that $\FNP(W,\F,\nu) = (W,\F,\F_1',\mu_1)$.

3667: First I show that $\nu(U) = 0$ iff $\vecmu'(U) = \vec{0}$.

3668: Let $X = \chi_U$ and $Y = \chi_{\emptyset}$.  Note that $\nu(U) = 0$ iff

3669: $E_\nu(X) = E_\nu(Y)$ iff $E_{\vecmu'}(X) = E_{\vecmu'}(Y)$ iff

3670: $\vecmu'(U) = \vec{0}$.  Thus, $\F_1' = \{U: \nu(U) \ne 0\} =

3671: \{U: \vecmu'(U) \ne \vec{0}\} = \F'$.

3672:

3673: Now suppose by way of contradiction that $\mu \ne \mu_1$.  Thus, there

3674: must exist some $V \in \F$, $U \in \F'$ such that $\mu(V \mid U) \ne

3675: \mu_1(V \mid U)$.  Let $\beta$ be the smallest ordinal such that

3676: %joe9

3677: %$\mu_\beta'(U) \ne 0$.  It follows that $\mu_\beta(V \mid U) \ne

3678: %\stand{\nu(V \mid U)}$.  We can assume without loss of generality that

3679: %$\mu_\beta(V

3680: $\mu_\beta'(U) \ne 0$.  It follows that $\mu'_\beta(V \mid U) \ne \stand{\nu(V

3681: \mid U)}$.  We can assume without loss of generality that $\mu'_\beta(V

3682: \mid U) > \stand{\nu(V \mid U)}$.  Choose a real number $r$ such that

3683: %joe9

3684: %$\mu_\beta(V \mid U) > r >  \st(V \mid U)$.  Then

3685: $\mu'_\beta(V \mid U) > r >  \stand{\nu(V \mid U)}$.  Then

3686: $E_{\vecmu'}(\chi_{V \inter U}) > E_{\vecmu'}(r \chi_U)$ but

3687: $E_{\nu}(\chi_{V \inter U}) < E_{\nu}(r \chi_U)$.  This contradicts the

3688: %joe2

3689: %assumption that $\vecmu' \aeq \nu$.  It follows that $\FNP(\nu) =

3690: assumption that $\vecmu' \aeq \nu$.  It follows that $\FNP(W,\F,\nu) =

3691: (W,\F,\F',\mu)$, as desired.

3692:

3693:

3694: %Thus, it remains to prove the result in the

3695: %case that $W$ is infinite and $\F$ is an algebra (but not necessarily a

3696: %$\sigma$-algebra).

3697: It remains to show that if $(W,\F,\F',\mu) \in \Popper(W,\F) -

3698: \Popper^c(W,\F)$, then there is some $(W,\F,\nu) \in \NPS(W,\F)$ such that

3699: $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$.  My proof in this case follows closely

3700: the lines of

3701: an analogous result proved by

3702: McGee \citeyear{McGee94}.  I provide the details here mainly for

3703: completeness.

3704:

3705: The proof relies on the following ultrafilter construction of

3706: non-Archimedean fields.  Given a set $S$, a {\em filter\/} $\G$ on $S$ is a

3707: nonempty set of subsets of $S$ that is closed under supersets (so that

3708: if $U \in \G$ and $U \subseteq U'$, then $U' \in \G$), is closed under

3709: finite intersections (so that if $U_1, U_2 \in \G$, then $U_1

3710: \inter U_2 \in \G$), and does not contain $\emptyset$.  An {\em

3711: ultrafilter\/} is a maximal filter, that is, a filter that is not a

3712: strict subset of any other filter.  It is not hard to show that if $\U$

3713: is an ultrafilter on $S$, then for all $U \subseteq S$, either $U \in

3714: \U$ or $\overline{U} \in \U$ \cite{BellSlomson}.

3715:

3716: Suppose $F$ is either $\IR$ or a

3717: non-Archimedean field, $J$ is an arbitrary set, and $\U$ is an

3718: ultrafilter on $J$.  Define an equivalence relation $\sim_{\U}$ on

3719: $F^J$ by taking $(a_j: j \in J) \sim_{\U} (b_j: j \in J)$ if $\{j: a_j =

3720: b_j\} \in \U$.  Similarly, define a total order $\preceq_\U$ by taking

3721: $(a_j: j \in J) \preceq_{\U} (b_j: j \in J)$ if $\{j: a_j \le b_j\} \in

3722: \U$.  (The fact that $\preceq_{\U}$ is total uses the fact that for all $U

3723: \subseteq

3724: J$, either $U \in \U$ or $\overline{U} \in \U$.  Note that the pointwise

3725: ordering on $F^J$ is not total.)  Let $F^J/\nsim_{\U}$ consist of these

3726: equivalence classes.  Note that $F$ can be viewed as a subset of

3727: $F^J/\nsim_{\U}$ by identifying $a \in F$ with the sequence of all $a$'s.

3728:

3729: Define addition and multiplication on $F^J$ pointwise,

3730: so that, for example, $(a_j: j \in J) + (b_j: j \in J) = (a_j + b_j: j

3731: \in J)$.  It is easy to check that if $(a_j: j \in J) \sim_{\U} (a_j': j

3732: \in J)$, then $(a_j: j \in J) + (b_j: j \in J) \sim_{\U} (a_j': j \in J) +

3733: (b_j: j \in J)$, and similarly for multiplication.  Thus, the

3734: definitions of $+$ and $\times$ can be extended in the obvious way to

3735: $F^J/\nsim_{\U}$.  With these definitions, it is easy to check that

3736: $F^J/\nsim_{\U}$ is a field that contains $F$.
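
\medskip

As an illustration (the choices here are assumptions made only for this example, and are not the ones used in the construction below), take $F = \IR$, let $J$ be the set of positive integers, and suppose that $\U$ is a nonprincipal ultrafilter on $J$, so that $\U$ contains every cofinite subset of $J$.  Let $e$ be the equivalence class of the sequence $(1/j: j \in J)$.  Then $e$ is not equal to 0, since $\{j: 1/j = 0\} = \emptyset \notin \U$, and for every real number $r > 0$,
$$\{j: 0 \le 1/j \le r\} = \{j: j \ge 1/r\} \in \U.$$
Thus $0 \preceq_{\U} e \preceq_{\U} r$ for every real $r > 0$; that is, $e$ is a positive infinitesimal, and $\IR^J/\nsim_{\U}$ is non-Archimedean in this case.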

3737:

3738: Now given a Popper space $(W,\F,\F',\mu)$ and a finite subset $\A = \{U_1,

3739: \ldots, U_k\} \subseteq \F$, let $\F_{\A}$ be the (finite) algebra

3740: generated by $\A$ (that is, the smallest set containing $\{U_1, \ldots,

3741: U_k, W\}$ that is closed under unions and complement).  Let

3742: $\F'_{\A} = \F_{\A} \inter \F'$.  It follows from Theorem~\ref{FCPfin} that

3743: there is a finite SLPS $\vecmu_\A$ over $(W,\F_{\A})$ that is mapped to

3744: $(W,\F_{\A},\F'_{\A}, \mu_{\A})$ by $\FCP$, where $\mu_{\A}$ is the restriction of $\mu$ to $\F_{\A} \times \F'_{\A}$.  (Although

3745: Theorem~\ref{FCPfin} is stated for finite state spaces $W$, the proof

3746: relies on only the fact that the algebra is finite, so it applies without

3747: change here.)  It now follows from

3748: Theorem~\ref{lpsnps} that, for each $\A$, there is a nonstandard

3749: probability space $(W,\F_{\A},\nu_\A)$ with range $\IR(\epsilon)$ that is

3750: equivalent to $\vecmu_{\A}$.  By Proposition~\ref{FCPaeq}, it follows

3751: that $U \in \F'_{\A}$ iff $\nu_{\A}(U) \ne 0$.

3752: Moreover, $\stand{\nu_{\A}(V \mid U)} = \mu_{\A}(V \mid U)$ for $U \in

3753: \F'_{\A}$ and $V \in \F_{\A}$.

3754:

3755: Let $J$ consist of all finite subsets of $\F$.  For a finite subset $\A$ of

3756: $\F$, let $G_{\A}$ be the subset of $J$ consisting of all sets in $J$

3757: containing $\A$.  Let $\G = \{G \subseteq J: G \supseteq G_{\A} \mbox{ for

3758: some } \A \in J\}$.  It is easy to check that $\G$ is a filter on

3759: $J$.  It is a standard result that every filter can be extended to an

3760: ultrafilter \cite{BellSlomson}.  Let $\U$ be an ultrafilter containing

3761: $\G$.  By the construction above, $\IR(\epsilon)^J/\nsim_{\U}$ is a

3762: non-Archimedean field.
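
\medskip

The filter properties of $\G$ can be checked directly; the following sketch records the two facts that do the work, for finite subsets $\A_1$ and $\A_2$ of $\F$ (names introduced here only for the illustration):
$$G_{\A_1} \inter G_{\A_2} = G_{\A_1 \union \A_2} \mbox{ and } \A_1 \in G_{\A_1}.$$
The first identity holds because a finite subset of $\F$ contains both $\A_1$ and $\A_2$ iff it contains $\A_1 \union \A_2$; since $\A_1 \union \A_2$ is again finite, it gives closure of $\G$ under finite intersections.  The second fact shows that each $G_{\A}$ is nonempty, so $\emptyset \notin \G$.  Closure under supersets is immediate from the definition of $\G$.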

3763:

3764: Define $\nu$ on $(W,\F)$ by taking

3765: $\nu(U) = (\nu_{\A}(U): \A \in J)$, where $\nu_\A(U)$ is taken to be 0

3766: if $U \notin \F_{\A}$.  To see that $\nu$ is indeed a nonstandard

3767: probability measure with the required properties, note that clearly

3768: $\nu(W) = 1$ (where 1 is identified with the sequence of all 1's).

3769: Moreover, to see that $\nu(U) + \nu(V) = \nu(U \union V)$ for disjoint $U, V \in \F$, let

3770: $\A_{U,V}$ be the smallest subalgebra containing $U$ and $V$.

3771: Note that if $\F_{\A} \supseteq \A_{U,V}$, then

3772: $\nu_{\A}(U) + \nu_{\A}(V) = \nu_{\A}(U \union V)$.  Since the set of

3773: indices $\A \in J$ such that $\F_{\A} \supseteq \A_{U,V}$ is an element of the ultrafilter, the

3774: result follows.  Similar arguments show that $\nu(U) \ne 0$ iff $U \in

3775: \F'$ and that $\stand{\nu(V \mid U)} = \mu(V \mid U)$ if $U \in \F'$ and $V \in

3776: \F$. Clearly $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$.   \eprf

3777:

3778: \bigskip

3779:

3780:

3781: %joe6

3782: %\opro{simeqvsaeq} If $\nu_1 \aeq \nu_2$ than $\nu_1 \simeq \nu_2$.

3783: \opro{simeqvsaeq} If $\nu_1 \aeq \nu_2$ then $\nu_1 \simeq \nu_2$.

3784: \eopro

3785:

3786: \medskip

3787:

3788: \prf Suppose that $\nu_1 \aeq \nu_2$.  To show that $\nu_1 \simeq

3789: \nu_2$, first suppose that $\nu_1(U) \ne 0$ for some $U \subseteq W$.  Then

3790: $E_{\nu_1}(\chi_\emptyset) < E_{\nu_1}(\chi_U)$.  Since $\nu_1 \aeq

3791: \nu_2$, it must be the case that $E_{\nu_2}(\chi_\emptyset) <

3792: E_{\nu_2}(\chi_U)$.  Thus, $\nu_2(U) \ne 0$.  A symmetric argument shows

3793: that if $\nu_2(U) \ne 0$ then $\nu_1(U) \ne 0$.  Next, suppose that

3794: $\nu_1(U) \ne 0$ and $\nu_1(V \mid U) = \alpha$.  Thus,

3795: $E_{\nu_1}(\alpha \chi_U) = E_{\nu_1}(\chi_{U \inter V})$.  Since

3796: $\nu_1 \aeq \nu_2$, it follows that

3797: $E_{\nu_2}(\alpha \chi_U) = E_{\nu_2}(\chi_{U \inter V})$, and so

3798: $\nu_2(V \mid U) = \alpha$.  Thus, $\stand{\nu_1(V \mid U)} =

3799: \stand{\nu_2(V \mid  U)}$.

3800: Hence, $\nu_1 \simeq \nu_2$, as desired. \eprf

3801: \commentout{

3802: \bigskip

3803:

3804: \opro{indaeq}

3805: $U$ is approximately conditionally independent of $V$

3806: given $V'$ with respect to $\nu$ iff there exists a measure $\nu'$ such

3807: that $\nu \aeq \nu'$ and $U$ is conditionally independent of $V$ given

3808: $V'$ with respect to $\nu'$.

3809: \eopro

3810:

3811: \medskip

3812:

3813: \prf Suppose that $U$ is approximately conditionally independent of $V$

3814: given $V'$ with respect to $\nu$.   If $\nu(U \inter V') = 0$, then $U$

3815: is conditionally independent of $V$ given $V'$ with respect to $\nu$.

3816: If $\nu(U \inter V') \ne 0$, $\stand{\nu(V \mid U \inter V')}

3817: = \stand{\nu(V \mid V')}$.

3818: }

3819:

3820: \othm{BBDstrongindependence}  There exists an NPS $\nu$ whose

3821: range is an

3822: elementary extension of the reals such that $\vecmu \aeq \nu$ and $X_1,

3823: %joe5

3824: %\ldots, X_n$ are strongly independent with respect to $\nu$ iff there

3825: \ldots, X_n$ are  independent with respect to $\nu$ iff there

3826: exists a sequence $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in $(0,1)^k$

3827: such that $\vec{r}^j \rightarrow (0,\ldots, 0)$ as $j\rightarrow\infty$,

3828: and $X_1, \ldots, X_n$ are independent with respect to $\vecmu \, \Box

3829: \, \vec{r}^j$ for $j = 1, 2, 3, \ldots$.

3830: \eothm

3831:

3832: \prf Suppose that  there exists an NPS

3833: $\nu$ whose range is an elementary extension of the reals, $\vecmu

3834: \aeq \nu$,  and $X_1, \ldots, X_n$ are

3835: independent with respect to $\nu$.  Using arguments similar in spirit to

3836: the

3837: arguments of BBD \citeyear[Proposition 2]{BBD2}, it follows that there exist

3838: positive infinitesimals $\epsilon_1, \ldots, \epsilon_k$ such that

3839: $\vecmu \, \Box \, (\epsilon_1, \ldots, \epsilon_k) = \nu$.  It is not

3840: hard to show that there exists a finite set of real-valued polynomials

3841: $p_1,\ldots, p_N$ such that $p_j(\epsilon_1, \ldots, \epsilon_k) = 0$

3842: for $j = 1, \ldots, N$ and if $\vec{r}$ is a vector of positive reals

3843: such that $p_j(\vec{r}) = 0$ for $j = 1, \ldots, N$, then $X_1, \ldots,

3844: X_n$ are independent with respect to $\vecmu \, \Box \, \vec{r}$.

3845: Thus, for all natural numbers $m \ge 1$, the range of

3846: $\nu$ satisfies the first-order property $$\exists x_1 \ldots \exists x_k

3847: (p_1(x_1, \ldots, x_k) = 0 \land \ldots \land p_N(x_1, \ldots, x_k) = 0

3848: \land 0 < x_1 < 1/m \land \ldots \land 0 < x_k < 1/m).$$

3849: Since the range of $\nu$ is an elementary extension of the reals, this

3850: first-order

3851: property holds of the reals as well.

3852: Thus, there exists a sequence

3853: $\vec{r}^j$ of vectors of positive reals converging to $\vec{0}$ such that

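3854: $p_i(\vec{r}^j) = 0$ for $i = 1, \ldots, N$ and $j = 1, 2, \ldots$, so that $X_1, \ldots, X_n$ are independent with respect to $\vecmu \, \Box \, \vec{r}^j$ for all $j$.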

3855:

3856: The converse follows by a straightforward application

3857: of compactness in first-order logic \cite{Enderton}.

3858: Suppose that there exists a sequence

3859: $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in $(0,1)^k$

3860: such that $\vec{r}^j \rightarrow (0,\ldots, 0)$

3861: as $j\rightarrow\infty$, and $X_1, \ldots, X_n$ are

3862: independent with respect to $\vecmu \, \Box \, \vec{r}^j$ for $j = 1, 2, 3,

3863: \ldots$.   We now apply the compactness theorem.

3864: As I mentioned in the proof of Proposition~\ref{infiniteeq}, the

3865: compactness theorem says that,

3866: given a collection of formulas, if each finite subset has a model, then

3867: so does the whole set.

3868: Consider a language with the function symbols $+$ and $\times$,

3869: the binary relation $\le$, a constant

3870: symbol $\mathbf{r}$ for each

3871: real number $r$, a unary predicate $N$ (representing the natural numbers),

3872: and constant symbols $p_{U}$ for each set $U \in

3873: \F$.  Intuitively, $p_U$ represents $\nu(U)$.

3874: Consider the following (uncountable) collection of formulas:

3875: \begin{itemize}

3876: \item[(a)]  All first-order formulas in this language true of the reals.

3877: %joe9

3878: %(This includes, for example, a formula such as $\forall x\forall y(x= y

3879: (This includes, for example, a formula such as $\forall x\forall y(x+ y

3880: =  y+x)$, which says that addition is commutative, as well as formulas

3881: such as $\mathbf{2} + \mathbf{3} = \mathbf{5}$ and

3882: $\mathbf{\sqrt{2}} \times \mathbf{\sqrt{3}} = \mathbf{\sqrt{6}}$.)

3883: \item[(b)] Formulas $p_U > 0$ for $U \in \F'$ and $p_U = 0$ for $U \in \F -

3884: \F'$.

3885: \item[(c)] Formulas $p_U + p_V = p_{U \union V}$ if $U \inter V = \emptyset$.

3886: \item[(d)] The formula $p_W = 1$.

3887: \item[(e)] Formulas of the form $p_{X_1 = x_1} \times \cdots \times

3888: p_{X_n = x_n} =

3889: p_{X_1 = x_1 \inter \ldots \inter X_n = x_n}$, for all values $x_i \in

3890: \V(X_i)$, $i = 1, \ldots, n$; these formulas say that $X_1,

3891: \ldots, X_n$ are independent with respect to $\nu$.

3892: \item[(f)] For every pair $Y$, $Y'$ of random variables such that

3893: $E_{\vecmu}(Y) \ge E_{\vecmu}(Y')$, a formula that says

3894: $E_{\nu}(Y) \ge E_{\nu}(Y')$, where $E_{\nu}(Y)$ and $E_{\nu}(Y')$ are

3895: expressed using the constant symbols $p_U$ (where the events $U$ are

3896: those of the form $Y=y$ and $Y'=y'$).

3897: %joe6

3898: Note that this formula is finite, since $Y$ and $Y'$ are assumed to have

3899: finite range.  The formula would not be expressible in first-order logic

3900: if $Y$ or $Y'$ had infinite range.

3901: \end{itemize}
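
\medskip

To make part (f) concrete, here is how one such formula might be written out; the particular random variables below are chosen only for illustration.  If $Y$ takes the values $y_1, \ldots, y_a$ and $Y'$ takes the values $y'_1, \ldots, y'_b$, then the formula saying that $E_{\nu}(Y) \ge E_{\nu}(Y')$ is
$$\mathbf{y_1} \times p_{Y = y_1} + \cdots + \mathbf{y_a} \times p_{Y = y_a} \ge
\mathbf{y'_1} \times p_{Y' = y'_1} + \cdots + \mathbf{y'_b} \times p_{Y' = y'_b}.$$
For instance, with $Y = \chi_U$ and $Y' = \frac{1}{2}\chi_V$ for $U, V \in \F$, and dropping the terms multiplied by $\mathbf{0}$, this is the formula $\mathbf{1} \times p_U \ge \mathbf{1/2} \times p_V$.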

3902:

3903: It is not hard to show that every finite subset of these formulas is

3904: satisfiable.  Indeed, given a finite subset of formulas, there must

3905: exist some $m$ such that taking $p_U = (\vecmu \, \Box \, \vec{r}^m)(U)$

3906: will work (and interpreting $\mathbf{r}$ as the real number $r$, of

3907: course).  The only nonobvious part is showing that we can deal with the

3908: formulas in part (f); that we can do so follows from the proof of

3909: Proposition 1 in  \cite{BBD2}, which shows that

3910: $E_{\vecmu}(Y') > E_{\vecmu}(Y)$ iff there exists some $M$ such that $E_{\vecmu \, \Box \,

3911: \vec{r}^m}(Y') >

3912: E_{\vecmu \, \Box \, \vec{r}^m}(Y)$ for all $m \ge M$.

3914:

3915: Since every finite set of formulas is satisfiable,

3916: by compactness, the infinite set is satisfiable.  Let $\nu(U)$

3917: be the interpretation of $p_U$ in a model satisfying these formulas.

3918: Then it is easy to check that the range of $\nu$ is an elementary extension of the

3919: reals, $\nu \aeq \vecmu$, and

3920: that $X_1, \ldots, X_n$ are independent with respect to $\nu$.

3921: \eprf

3922:

3923:

3924:

3925:

3926:

3927: \othm{KRindependence}

3928: $X_1, \ldots, X_n$ are strongly independent with respect to the Popper

3929: space $(W,\F,\F',\mu)$ iff there

3930: exists an NPS $(W,\F,\nu)$ such that

3931: %joe4

3932: %$\FNP(W,\F,\nu) = \mu$ and $X_1, \ldots,

3933: $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$ and $X_1, \ldots,

3934: X_n$ are independent with respect to $(W,\F,\nu)$.

3935: \eothm

3936:

3937: \prf  It easily follows from Kohlberg and Reny's \citeyear[Theorem

3938: 2.10]{KR97} characterization of strong independence that if

3939: $X_1, \ldots, X_n$ are independent with respect to the NPS

3940: %joe3: 7/28/05

3941: %$(W,\F,\nu$ then $X_1, \ldots, X_n$ are strongly independent with respect to

3942: $(W,\F,\nu)$ then $X_1, \ldots, X_n$ are strongly independent with respect to

3943: $\FNP(W,\F,\nu)$.

3944: \commentout{

3945: The converse follows by a straightforward application

3946: of compactness in first-order logic \cite{Enderton}.

3947:

3948: Suppose that $(W,\F,\F',\mu)$ is a Popper space and

3949: $\mu_j \rightarrow \mu$ are as required for $X_1, \ldots, X_n$ to be

3950: strongly independent with respect to $\mu$.

3951: As I mentioned in the proof of Proposition~\ref{infiniteeq}, the

3952: compactness theorem says that,

3953: given a collection for formulas, if each finite subset has a model, then

3954: so does the whole set.

3955: Consider a language with the function symbols $+$ and $\times$,

3956: the binary relation $\le$, a constant

3957: symbol $\mathbf{r}$ for each

3958: real number $r$, and constant symbols $p_{U}$ for each set $U \in

3959: \F$.  Intuitively, $p_U$ represents $\nu(U)$.

3960: Consider the following (uncountable) collection of formulas:

3961: \begin{itemize}

3962: \item All formulas true in fields (for example, $\forall x, y (x+ y =

3963: y+x)$, which says that addition is commutative).

3964: \item All true statements of the form $\mathbf{r_1} + \mathbf{r_2} =

3965: \mathbf{r_3}$ and $\mathbf{r_1} \times \mathbf{r_2} = \mathbf{r_3}$

3966: involving real constants $\mathbf{r_1}$, $\mathbf{r_2}$, $\mathbf{r_3}$

3967: (for example $\mathbf{2} + \mathbf{3} = \mathbf{5}$ and

3968: $\mathbf{\sqrt{2}} \times \mathbf{\sqrt{3}} = \mathbf{\sqrt{6}}$).

3969: \item Formulas $p_U > 0$ for $U \in \F'$ and $p_U = 0$ for $U \in \F -

3970: \F'$.

3971: \item Formulas $p_U + p_V = p_{U \union V}$ if $U \inter V = \emptyset$.

3972: \item The formula $p_W = 1$.

3973: \item Formulas of the form $p_{X_1 = x_1} \times \cdots \times p_{X_n = x_n} =

3974: p_{X_1 = x_1 \inter \ldots \inter X_n = x_n}$, for all values $x_i \in

3975: \V(X_i)$, $i = 1, \ldots, n$; these formulas say that $X_1,

3976: \ldots, X_n$ are independent with respect to $\nu$.

3977: \item Formulas of the form $(\mathbf{r - \frac{1}{n}})p_V  \le p_{U \inter V}

3978: \le (\mathbf{r + \frac{1}{n}})p_V$ for all $U$, $V$, $\mathbf{r}$, and

3979: $\mathbf{n} > 0$ such that $\mu(U \mid V) = r$.

3980: \end{itemize}

3981:

3982: It is easy to see that every finite subset of these formulas is

3983: satisfiable.  Indeed, given a finite subset of formulas, there must

3984: exist some $m$ such that taking $p_U = \mu_m(U)$ satisfies all the

3985: formulas (and interpreting $\mathbf{r}$ as the real number $r$, of

3986: course).  By compactness, the infinite set is satisfiable.  Let $\nu(U)$

3987: be the interpretation of $p_U$ in a model satisfying these formulas.

3988: Then it is easy to check that $\FLN(W,\F,\nu) = (W,\F,\F',\mu)$,

3989: and that $X_1, \ldots, X_n$ are independent with respect to $\nu$.

3990: }

3991: %\end{commentout}

3992:

3993: The converse follows using compactness, much as in the proof of

3994: Theorem~\ref{BBDstrongindependence}.

3995: Suppose that $(W,\F,\F',\mu)$ is a Popper space and

3996: $\mu_j \rightarrow \mu$ are as required for $X_1, \ldots, X_n$ to be

3997: strongly independent with respect to $\mu$.

3998: Consider the same language as in the proof of

3999: Theorem~\ref{BBDstrongindependence}, and essentially the same

4000: collection of formulas, except that the formulas of part (f) are

4001: replaced by

4002: \begin{itemize}

4003: \item[(f$'$)] Formulas of the form $(\mathbf{r - \frac{1}{n}})p_V  \le p_{U \inter V}

4004: \le (\mathbf{r + \frac{1}{n}})p_V$ for all $U$, $V$, $\mathbf{r}$, and

4005: $\mathbf{n} > 0$ such that $\mu(U \mid V) = r$.

4006: \end{itemize}
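
\medskip

The role of the formulas in (f$'$) can be seen from the following short computation, which is only a sketch and assumes that $V \in \F'$, so that $p_V > 0$ by part (b).  Writing $\nu(U)$ for the interpretation of $p_U$ in a model of the formulas and dividing by $\nu(V)$ gives, for $r = \mu(U \mid V)$ and every natural number $n > 0$,
$$r - \frac{1}{n} \le \frac{\nu(U \inter V)}{\nu(V)} = \nu(U \mid V) \le r + \frac{1}{n}.$$
Thus $\nu(U \mid V)$ differs from $r$ by at most an infinitesimal, so that $\stand{\nu(U \mid V)} = r = \mu(U \mid V)$.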

4007:

4008: Again, it is easy to see that every finite subset of these formulas is

4009: satisfiable.  Indeed, given a finite subset of formulas, there must

4010: exist some $m$ such that taking $p_U = \mu_m(U)$ satisfies all the

4011: formulas (and interpreting $\mathbf{r}$ as the real number $r$, of

4012: course).  By compactness, the infinite set is satisfiable.  Let $\nu(U)$

4013: be the interpretation of $p_U$ in a model satisfying these formulas.

4014: Then it is easy to check that $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$,

4015: and that $X_1, \ldots, X_n$ are independent with respect to $\nu$.

4016: \eprf

4017:

4018:

4019: \bibliographystyle{chicago}

4020: %\bibliographystyle{alpha}

4021: \bibliography{z,joe}

4022: \end{document}

4023:

4024: