1: %joe5: corrections: restated theorem
2: %joe4: discsuss representation of beliefs in conclusion a la
3: %Asheim/Slovik; reference Stal98.
4: %joe2: check what Popper called Popper algebras: Q175.P83 (Uris
5: %and Carpenter)
6: %joe2: What's new? More motivation in intro, Proposition 4.3, major
7: %rewriting in Section~\ref{indep}:
8: %joe9: corrections in response to GEB reviews
9:
10: %\documentstyle[chicagob,times,uai97]{article}
11: %\documentstyle[chicagob,times,11pt]{article}
12: \documentstyle[chicagob,12pt]{article}
13: \input{defn}
14: \input{spage}
15: \input{bghkmac}
16: \newcommand{\vecmu}{\vec{\mu}}
17: \newcommand{\Supp}{\mbox{\it Supp}}
18: \newcommand{\FCP}{F_{S \rightarrow P}}
19: \newcommand{\FOP}{F_{O \rightarrow C}}
20: \newcommand{\FDC}{F_{D \rightarrow C}}
21: \newcommand{\FLN}{F_{L \rightarrow N}}
22: \newcommand{\FSN}{F_{S \rightarrow N}}
23: \newcommand{\FNP}{F_{N \rightarrow P}}
24: \newcommand{\LPS}{\mbox{\em LPS\/}}
25: \newcommand{\SLPS}{\mbox{\em SLPS\/}}
26: \newcommand{\NPS}{\mbox{\em NPS\/}}
27: \newcommand{\Bas}{{\it Basic}}
28: \newcommand{\Popper}{\mbox{\em Pop\/}}
29: \renewcommand{\T}{T}
30: \newcommand{\lab}{\mbox{\em label\/}}
31: \renewcommand{\aeq}{\approx}
32: \renewcommand{\naeq}{\!\!\approx}
33: \newcommand{\nsim}{\!\!\sim}
34: \newcommand{\stand}[1]{\mbox{\em st}\left (#1 \right )}
35: \renewcommand{\mid}{\, | \,}
36:
37: \begin{document}
38: %UAI
39: %\begin{titlepage}
40:
41: \title{Lexicographic probability, conditional probability, and
42: nonstandard probability%
43: \thanks{The work was supported in part by NSF under
44: grants IRI-96-25901, IIS-0090145, and CTC-0208535, by
45: ONR under grant
46: N00014-02-1-0455, and by the DoD Multidisciplinary University Research
47: Initiative (MURI) program administered by the ONR under
48: grant N00014-01-1-0795.
49: %joe2
50: A preliminary version appeared in the Proceedings of the Eighth
51: Conference on Theoretical Aspects of Rationality and Knowledge, 2001
52: [Halpern 2001].
53: %\cite{Hal26}.
54: This version includes detailed proofs and more discussion
55: and more examples; in addition, the material in Section~\ref{sec:indep}
56: (on independence) is new.
57: }}
58: \author{
59: Joseph Y.\ Halpern\\
60: Dept. Computer Science\\
61: Cornell University\\
62: Ithaca, NY 14853\\
63: halpern@cs.cornell.edu\\
64: http:/$\!$/www.cs.cornell.edu/home/halpern
65: }
66: \date{\today}
67: \maketitle
68: %joe:UAI
69: %\thispagestyle{empty}
70: \begin{abstract}
71: The relationship between {\em Popper spaces\/} (conditional probability
72: spaces that satisfy some regularity conditions),
73: lexicographic probability systems (LPS's) \cite{BBD1,BBD2}, and
74: nonstandard probability spaces (NPS's) is considered. If countable
75: additivity is assumed, Popper spaces and a subclass of
76: LPS's are equivalent; without the assumption of countable additivity,
77: the equivalence no longer holds. If the state space is finite, LPS's are
78: equivalent to NPS's. However, if the state space is infinite, NPS's are
79: shown to be more general than LPS's.
80: \end{abstract}
81:
82: %joe:UAI
83: %\end{titlepage}
84:
85: \nocite{Hal26}
86: \section{Introduction}
87: Probability is certainly the most commonly-used approach for
88: representing uncertainty and conditioning the standard way of updating
89: probabilities in the light of new information. Unfortunately, there is a
90: well-known problem with conditioning: Conditioning on events of measure
91: 0 is not defined. That makes it unclear how to proceed if an agent
92: learns something to which she initially assigned probability 0.
93: Although consideration of events of measure 0 may seem to be of little
94: practical interest, it turns out to play a critical role in game theory,
95: particularly in the analysis of strategic reasoning in extensive-form
96: %joe3
97: %games and in the analysis of various solution concepts in games
98: games and in the analysis of weak dominance in normal-form games
99: (see, for example,
100: %joe2: added BK00, Hammond99
101: %joe3
102: %\cite{BBD1,BBD2,BK00,Hammond94,Hammond99,KR97,KW82,Myerson86,Selten65}).
103: %joe4:
104: \cite{Bat96,BS02,BBD1,BBD2,BFK04,FT91a,Hammond94,Hammond99,KR97,KW82,Myerson86,Selten65,Selten75}).
105: It also arises in the analysis of conditional
106: statements by philosophers (see \cite{Adams66,McGee94}), and in dealing
107: with nonmonotonicity in Artificial Intelligence (see, for example,
108: \cite{LehmannMagidor}).
109:
110: There have been various attempts to deal with the problem of
111: conditioning on events of measure 0. Perhaps the
112: %joe2: added Popper34
113: best known
114: %joe9
115: involves \emph{conditional probability spaces} (CPS's). The idea,
116: which goes back to Popper \citeyear{Popper34,Popper68} and
117: de Finetti
118: \citeyear{Finetti36}, is to take as
119: primitive not probability, but conditional probability. If $\mu$ is a
120: %joe9
121: %conditional probability measure, then $\mu(V \mid U)$ may still be
122: %undefined
123: conditional probability measure on a space $W$, then $\mu(V \mid U)$ may
124: still be undefined
125: for some pairs $V$ and $U$, but it is also possible that $\mu(V \mid U)$ is
126: %joe9
127: %defined even if $\mu(U) = 0$. A second approach, which goes back to at
128: defined even if $\mu(U \mid W) = 0$. A second approach, which goes back to at
129: least Robinson~\citeyear{Robinson73} and has been explored in
130: the economics literature \cite{Hammond94,Hammond99}, the AI literature
131: \cite{LehmannMagidor,Wilson95}, and the philosophy literature (see
132: \cite{McGee94} and the references therein) is to consider {\em
133: nonstandard probability spaces\/} (NPS's), where there are
134: infinitesimals that can be used to model
135: events that, intuitively, have infinitesimally small probability yet
136: may still be learned or observed.
137:
138: There is a third approach to this problem, which uses
139: sequences of probability measures to represent uncertainty. The most
recent exemplars of this approach, which I focus on here,
141: are the {\em lexicographic probability systems\/}
142: %joe9
143: (LPS's)
144: of Blume, Brandenburger, and
145: Dekel \citeyear{BBD1,BBD2} (BBD from now on). However, the idea of using
146: a system of measures to represent uncertainty actually
147: was explored as far back as the 1950s by R\'{e}nyi
148: \citeyear{Renyi56}
149: (see Section~\ref{related}).
150: A {\em lexicographic
151: probability system\/} is a sequence
152: $\<\mu_0,\mu_1, \ldots\>$ of probability measures. Intuitively, the
153: first measure in the sequence, $\mu_0$, is the most important one,
154: followed by $\mu_1$, $\mu_2$, and so on.
155: %joe9
156: One way to understand LPS's is in terms of NPS's.
157: Roughly speaking, the
158: probability assigned to an event $U$ by a sequence such as
159: $\<\mu_0,\mu_1\>$ can be taken to be $\mu_0(U) + \epsilon\mu_1(U)$,
160: where $\epsilon$ is an infinitesimal.
161: Thus, even if the probability of $U$ according to $\mu_0$ is 0, $U$
162: still has a positive (although infinitesimal) probability if $\mu_1(U) > 0$.
163:
164: %joe9
165: %How are all these approaches related?
166: What is the precise relationship between these approaches?
167: %That question is the focus of this paper.
168: %joe9
169: %This question, which is the focus of the paper, has been considered
170: %before.
171: The relationship between LPS's and CPS's has been considered before.
172: For example, Hammond \citeyear{Hammond94} shows that
173: conditional probability spaces are equivalent to a subclass of LPS's
174: %joe9
175: %called lexicographic conditional probability spaces if the state space
176: called \emph{lexicographic conditional probability spaces} (LCPS's) if
177: the state space
178: is finite and it is possible
179: to condition on any nonempty set.%
180: %joe4:
\footnote{Despite this isomorphism, it is not clear that conditional
182: probability spaces are \emph{equivalent} to LPS's. It depends on
183: exactly what we mean by equivalence. The same comment applies below
184: where the word ``equivalent'' is used. See Section~\ref{discussion}
185: for further discussion. I thank Geir Asheim for bringing this point to
186: my attention.}
187: As shown by Spohn \citeyear{Spohn86},
188: Hammond's result can be extended to arbitrary countably additive {\em
189: Popper spaces}, where a Popper space is a conditional probability space
190: %joe9
191: %that satisfies certain regularity conditions.
192: where the events on which conditioning is allowed satisfy certain
193: regularity conditions.
194: %joe
As I show, this result depends critically on a number of assumptions.
196: In particular, it does not work without the assumption of countable
197: additivity, it requires that we extend LCPS's appropriately to the
198: infinite case, and it is sensitive to the choice of conditioning events.
199: %The extension is nontrivial and, indeed, does not work without the
200: %assumption of countable additivity.
201: %joe9
202: For example, if we consider CPS's where
the conditioning events can be viewed as information sets, and so
are not closed under supersets (this is
essentially the case considered by Battigalli and Siniscalchi
206: \citeyear{BS02}), then the result no longer holds.
207: %joe9
208: %R\'{e}nyi \citeyear{Renyi56} and van Fraassen \citeyear{vF76} provide
209: %other representations of conditional probability spaces as sequences of
210: %measures, although not LPS's. Their results apply even if the
211: %underlying state space is infinite, but countable additivity does not play a
212: %role in their representations.
213: %(See Section~\ref{FCP} for further discussion of this issue.)
214:
215: Turning to the relationship between LPS's and NPS's,
216: I show that if the state space is finite, then LPS's are in a sense
217: equivalent to NPS's. More precisely, say that two measures of
218: uncertainty $\nu_1$ and $\nu_2$ (each of which can be either an LPS or
219: an NPS) are equivalent, denoted $\nu_1 \aeq \nu_2$, if they cannot be
220: distinguished by (real-valued)
221: random variables; that is, for all random variables $X$ and $Y$,
222: $E_{\nu_1}(X) \le E_{\nu_1}(Y)$ iff $E_{\nu_2}(X) \le E_{\nu_2}(Y)$
223: (where $E_\nu(X)$ denotes the expected value of $X$ with respect to
224: $\nu$).
To the extent that we are interested in these representations
of uncertainty for decision making, we should not try to
distinguish two representations that are equivalent.
228: I show that, in finite spaces, there is a straightforward bijection
229: between $\aeq$-equivalence classes of LPS's and NPS's. This
230: equivalence breaks down if the state space is infinite; in this case,
231: NPS's are strictly more general than LPS's
232: (whether or not countable additivity is assumed).
233:
234: Finally, I consider the relationship between Popper spaces and NPS's,
235: and show that NPS's are more general.
236: (The theorem I prove is a generalization of one proved
237: by McGee \citeyear{McGee94}, but my interpretation of it is quite
238: different; see Section~\ref{PopperNPS}.)
239:
240: These results give some useful insight into independence of random
241: variables. There have been a number of alternative notions of
242: independence considered in the literature of extended probability spaces
243: (i.e., approaches that deal with the problem of conditioning on sets of
244: measure 0): BBD considered three; Kohlberg and Reny \citeyear{KR97}
245: considered two others. It turns out that these notions are perhaps
246: %joe9
247: %best understood in the context of NPS's. I describe and compare them
248: best understood in the context of NPS's; I describe and compare them
249: here.
250:
251: %joe3
252: %The most significant new results in this paper involve infinite spaces.
253: Many of the new results in this paper involve infinite spaces.
254: Given that most games studied by game theorists are finite, it is fair
255: to ask whether these results have any significance for game theory.
256: I believe they do. Even if the underlying game is finite, the set of
257: types is infinite. Epistemic characterizations of solution concepts
258: %joe3
259: %often make use of {\em universal\/} type spaces, which
260: %joe6
261: %joe9
262: %often make use of infinite type spaces,
263: often make use of \emph{complete} type spaces,
264: which include every possible type of every
265: player, where a type determines an (extended) probability over the strategies
266: and types of the other players; this must be an infinite space.
267: %joe9
268: %There are a number of closely related notions of such type spaces,
269: %which have been variously called \emph{terminal}, \emph{universal}, and
270: %\emph{complete} (see \cite{Sin07} for an overview).
271: %joe3
272: %When dealing with extensive-form games, the universal type spaces have
273: %to deal with the problem of conditioning on events of measure 0.
274: %Typically this has been done by using conditional probability spaces or
275: %LPS's.
276: For example, Battigalli and Siniscalchi \citeyear{BS02} use a
277: complete type space where the uncertainty is represented by cps's to
278: give an epistemic characterization of extensive-form rationalizability and
279: backward induction, while Brandenburger, Friedenberg, and Keisler
280: \citeyear{BFK04} use a complete type space where the uncertainty is
281: represented by LPS's to get a characterization of weak dominance in
282: normal-form games.
283: As the results of this paper show, the set of types depends
284: %joe3
285: to some extent
286: on the notion of extended probability used.
287: Similarly, a number of characterizations of solution concepts depend on
288: %joe4: added Bat96
289: independence (see, for example, \cite{Bat96,KR97,BS99a}). Again, the results
290: of this paper show that these notions can be somewhat sensitive to
291: exactly how uncertainty is represented,
292: %joe3
293: even with a finite state space.
294: While I do not present any new
295: game-theoretic results here, I believe that the characterizations I have
296: provided may be useful both in terms of defending particular choices of
297: representation used and suggesting new solution concepts.
298:
299:
300: The remainder of the paper is organized as follows. In the next
301: section, I review all the relevant definitions for the three
302: representations of uncertainty considered here. Section~\ref{FCP}
303: considers the relationship between Popper spaces and
304: LPS's. Section~\ref{LPSNPS} considers the relationship between
305: LPS's and NPS's. Finally,
306: Section~\ref{PopperNPS} considers the relationship between Popper spaces
307: and NPS's. In Section~\ref{sec:indep} I consider what these results
308: have to say about independence. I conclude with
309: some discussion in Section~\ref{discussion}.
310:
311: \section{Conditional, lexicographic, and nonstandard probability spaces}
312: In this section I briefly review the three approaches to representing
313: likelihood discussed in the introduction.
314:
315: \subsection{Popper spaces}\label{cpsdef}
316:
317: A {\em conditional probability measure\/} takes {\em pairs\/} $U, V$ of subsets
318: as arguments; $\mu(V,U)$ is generally written $\mu(V \mid U)$ to stress the
319: conditioning aspects. The first argument comes from some
320: algebra $\F$ of
321: subsets of a space $W$; if $W$ is infinite, $\F$ is often taken to be a
322: $\sigma$-algebra. (Recall that an algebra of subsets of $W$ is a set of
323: subsets containing $W$ and closed under union and complementation. A
$\sigma$-algebra is an algebra that is closed under countable union.)
325: %joe2:
326: The second argument comes from a set $\F'$ of conditioning
327: %joe9
328: %events, i.e., that is, events on which conditioning is allowed.
events, that is, events on which conditioning is allowed.
330: %joe9
One natural choice is to take $\F'$ to be $\F - \{\emptyset\}$. But it may
be reasonable to consider other restrictions on $\F'$. For example,
Battigalli and Siniscalchi \citeyear{BS02} take $\F'$ to consist of the
334: information sets in a game, since they are interested only in agents who
335: update their beliefs conditional on getting some information.
336: The question is what
337: %joe2
338: %constraints, if any, should be placed on the second argument. I start
339: constraints, if any, should be placed on $\F'$.
340: %joe10:
341: %I start with three minimal requirements, and later add a fourth.
342: For most of this paper, I focus on \emph{Popper spaces} (named after
343: Karl Popper), defined next, where the set $\F'$ satisfies four arguably
344: reasonable requirements, but I occasionally consider other requirements
345: (see Section~\ref{sec:BS}).
346:
347: \commentout{
348: \dfn A {\em Popper
349: algebra\/} over $W$ is a set $\F \times \F'$ of subsets of $W \times W$
350: such that (a) $\F$ is an algebra over $W$, (b) $\F'$ is a nonempty
351: subset of $\F$ (not necessarily an algebra over $W$) that does not
352: contain $\emptyset$, and
353: (c) $\F'$ is closed under supersets in $\F$, in that if
354: $V \in \F'$, $V \subseteq V'$, and $V' \in \F$, then $V' \in \F'$.
355: (Popper algebras are named after Karl Popper.)
356: \edfn
357:
358:
359: %joe6
360: %joe9
361: %The role of the requirement that $\F'$ be closed under supersets is
362: %elucidated in Example~\ref{xam:noPopper}.
363: While I have called these requirements ``minimal'', note that if $\F'$
364: is taken to consist of information sets, then it is not closed under
365: supersets in $\F$. I return to this issue in Section~\ref{sec:BS}.
366: %Notice that the set $\F'$ in a Popper algebra $\F \times \F'$ is not
367: %itself required to be an algebra over $W$ (and, indeed,
368: %typically is not).
369: }
370:
371: \dfn\label{dfn.condprob} A {\em conditional probability space (cps) over
372: $(W,\F)$\/} is a tuple
373: $(W,\F,\F',\mu)$ such that
374: %$\F \times \F'$ is a Popper algebra over $W$ and
375: $\F$ is an algebra over $W$, $\F'$ is a set of subsets of $W$
376: (not necessarily an algebra over $W$) that does not
377: contain $\emptyset$, and
378: $\mu: \F \times \F' \rightarrow [0,1]$ satisfies the
379: following conditions:
380: \begin{itemize}
381: \item[CP1.] $\mu(U \mid U) = 1$ if $U \in\F'$.
382: \item[CP2.] $\mu(V_1 \union V_2 \mid U) = \mu(V_1 \mid U) + \mu(V_2 \mid U)$ if $V_1
383: \inter V_2 = \emptyset$, $U \in \F'$, and $V_1, V_2 \in \F$.
384: %if the
385: %$V_i$'s are pairwise disjoint sets in $\F$ and $U \in \F'$.
386: \item[CP3.] $\mu(V \mid U) = \mu(V \mid X) \times \mu(X \mid U)$ if $V \subseteq X
387: \subseteq U$, $U, X \in \F'$, $V \in \F$.
388: \end{itemize}
389: %joe6
390: Note that it follows from CP1 and CP2 that $\mu(\cdot \mid U)$ is a
391: probability measure on $(W,\F)$ (and, in particular, that $\mu(\emptyset
392: \mid U) = 0$) for each $U \in \F'$.
393: %joe10
394: A {\em Popper space over $(W,\F)$\/} is a conditional probability space
395: $(W,\F,\F',\mu)$
396: that satisfies
397: %an additional condition: if
398: three additional conditions: (a) $\F' \subseteq \F$, (b)
399: $\F'$ is closed under supersets in $\F$, in that if
400: $V \in \F'$, $V \subseteq V'$, and $V' \in \F$, then $V' \in \F'$, and
401: (c) if $U \in \F'$ and $\mu(V \mid U) \ne 0$ then $V \inter U \in \F'$.
402: If $\F$ is a $\sigma$-algebra and $\mu$ is countably additive
(that is, if $\mu(\union_{i=1}^\infty V_i \mid U) = \sum_{i = 1}^\infty \mu(V_i \mid U)$ if the
404: $V_i$'s are pairwise disjoint elements of $\F$ and $U \in \F'$), then the
405: Popper space is said to be {\em countably additive}.
406: Let $\Popper(W,\F)$ denote the set of Popper spaces over $(W,\F)$.
407: If $\F$ is a $\sigma$-algebra, I use a superscript $c$ to
408: denote the restriction to countably additive Popper spaces, so
409: $\Popper^c(W,\F)$ denotes the set of
410: countably additive Popper spaces over $(W,\F)$.
411: The probability measure $\mu$ in a Popper space is
412: called a {\em Popper measure}.
413: \edfn
414: %joe10
415: %The additional regularity condition on $\F'$ required in a Popper space
416: The last regularity condition on $\F'$ required in a Popper space
417: corresponds to the observation that for an unconditional
418: probability measure $\mu$, if $\mu(V \mid U) \ne 0$ then $\mu(V \inter U)
419: \ne 0$, so conditioning on $V \inter U$ should be defined.
420: %joe9
421: Note that, since this regularity condition depends on the Popper
422: measure, it may well be the case that $(W,\F,\F',\mu)$ and
423: $(W,\F,\F',\nu)$ are both cps's over $(W,\F)$, but only the former is a
424: Popper space over $(W,\F)$.
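
As a quick sanity check on these definitions, here is a small
illustration (the particular space is chosen only for concreteness and
plays no role in the rest of the paper). Take $W = \{w_1,w_2\}$, let
$\F$ consist of all subsets of $W$, let $\F' = \{\{w_1\}, \{w_2\}, W\}$,
and define
\[
\mu(\{w_1\} \mid W) = 1, \quad \mu(\{w_1\} \mid \{w_1\}) = 1, \quad
\mu(\{w_2\} \mid \{w_2\}) = 1,
\]
with the remaining values determined by additivity. CP1 and CP2 are
immediate. CP3 holds trivially when $X = U$ or $V = \emptyset$; the two
remaining instances, with $U = W$ and $V = X = \{w_i\}$, read
$1 = 1 \times 1$ for $i = 1$ and $0 = 1 \times 0$ for $i = 2$.
Since $\F'$ consists of all the nonempty sets in $\F$, conditions (a) and
(b) hold; condition (c) holds because $\mu(V \mid U) \ne 0$ implies that
$V \inter U \ne \emptyset$. Thus, $(W,\F,\F',\mu)$ is a Popper space.
Note that $\mu(\{w_2\} \mid W) = 0$, yet conditioning on $\{w_2\}$ is
defined.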
425:
426: %joe2: rewrote
427: Popper \citeyear{Popper34,Popper68}\index{Popper, K.~R.}
428: and de Finetti \citeyear{Finetti36} were the first to
429: formally consider conditional probability as the basic notion, although
430: as R\'{e}nyi \citeyear{Renyi64}\index{R\'{e}nyi, A.} points out, the
431: idea of taking conditional probability as primitive seems to go back as
432: far as Keynes \citeyear{Keynes}.
433: CP1--3 are essentially due to R\'{e}nyi \citeyear{Renyi55}.
434: Van Fraassen \citeyear{vF76} defined what I have called Popper measures;
435: he called them Popper functions, reserving the name Popper measure for
436: what I am calling a countably additive Popper measure.
437: Starting from the work of de Finetti, there has been a general study of
438: \emph{coherent conditional probabilities}. A coherent conditional
439: probability is essentially a
440: %joe10
441: %generalization of a cps, since it is defined
442: cps that is not necessarily a Popper space, since it is
443: defined
444: on a set $\F \times \F'$
445: %more general than a Popper algebra (for example,
446: where $\F'$ does not have to be a subset of $\F$); see, for example,
447: \cite{CS02} and the references therein.
448: Hammond \citeyear{Hammond94} discusses the use of conditional
449: probability spaces in philosophy and game theory, and provides an
450: extensive list of references.
451:
452:
453:
454: %Van Fraassen \cite{vF76} first Popper measures; he showed that Popper
455: %measures could be represented as sequences of probability measures that
456: %satisfied certain constraints, but these sequences seem to be quite
457: %different in spirit from lexicographic probability spaces, which I
458: %consider next.
459:
460: \subsection{Lexicographic probability spaces}\label{LPSdef}
461:
462: \dfn A {\em lexicographic probability space (LPS) (of length
463: $\alpha$) over $(W,\F)$\/} is a tuple
464: $(W,\F,\vecmu)$ where, as before, $W$ is a set of possible worlds and
465: $\F$ is an algebra over $W$, and $\vecmu$ is a sequence of finitely additive
466: probability measures on $(W,\F)$ indexed by ordinals $< \alpha$.
467: %\footnote{All probability measures are assumed to be only finitely
468: %additive, unless I explicitly say that they are countably additive.}
469: (Technically, $\vecmu$ is a function from the ordinals less
470: than $\alpha$ to probability measures on $(W,\F)$.)
471: I typically write $\vecmu$ as $(\mu_0, \mu_1, \ldots)$ or as
472: $(\mu_\beta: \beta < \alpha)$.
473: If $\F$ is a $\sigma$-algebra and each of the probability measures in
474: $\vecmu$ is countably additive, then $\vecmu$ is a {\em countably
475: additive LPS}.
476: Let $\LPS(W,\F)$ denote the set of LPS's over $(W,\F)$.
477: %joe9
478: %$\LPS(W,\F,\F')$ denote the set of LPS's $(W,\F,\vecmu)$ such that
479: %$\vecmu(U) > 0$ (i.e., $\mu_\beta(U) > 0$ for some
480: %$\beta$) iff $U \in \F'$.
481: Again, if $\F$ is a $\sigma$-algebra, a superscript $c$ is used to
denote countable additivity, so $\LPS^c(W,\F)$ denotes the set of
483: countably additive LPS's over $(W,\F)$.
484: %joe10
485: % and $\LPS^c(W,\F,\F')$ consists
486: %of the countably additive LPS's $(W,\F,\vecmu)$ in $(W,\F,\F')$.
When $(W,\F)$ is understood, I often refer to $\vecmu$ as
488: the LPS.
489: %joe10
490: I write $\vecmu(U) > 0$ if $\mu_\beta(U) > 0$ for some $\beta$.
491: \edfn
492:
493: %joe9*
494: %joe10
495: %$\LPS(W,\F)$ is richer than $\Popper(W,\F)$, even if we restrict to
496: %finite %spaces $W$ (so that countable additivity is not an issue).
497: There is a sense in which $\LPS(W,\F)$ can capture a richer set of
498: preferences than $\Popper(W,\F)$,
499: even if we restrict to finite
500: spaces $W$ (so that countable additivity is not an issue).
501: For example, suppose that $W = \{w_1,w_2\}$, $\mu_0(w_1) = \mu_0(w_2) =
1/2$, and $\mu_1(w_1) = 1$. The LPS $\vecmu = (\mu_0,\mu_1)$ can be
thought of as describing the situation where $w_1$ is very slightly more
likely than $w_2$. Thus, for example, if $X_i$ is a bet that pays off 1
in state $w_i$ and 0 in state $w_{3-i}$, then according to $\vecmu$,
$X_1$ should be (slightly) preferred to $X_2$, but for all $r > 1$,
$rX_2$ is preferred to $X_1$. There is no CPS on $\{w_1,w_2\}$ that
leads to these preferences.
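
To see where these preferences come from, here is a sketch of the
computation, under the assumption that bets are ranked by comparing
their sequences of expected values, one for each measure in $\vecmu$, in
the lexicographic order defined below. We have
\[
\begin{array}{lcl}
(E_{\mu_0}(X_1), E_{\mu_1}(X_1)) &=& (1/2, 1),\\
(E_{\mu_0}(X_2), E_{\mu_1}(X_2)) &=& (1/2, 0),\\
(E_{\mu_0}(rX_2), E_{\mu_1}(rX_2)) &=& (r/2, 0).
\end{array}
\]
The first two sequences agree in the first component and $0 < 1$ in the
second, so $X_1$ is preferred to $X_2$. If $r > 1$, then $r/2 > 1/2$,
so $rX_2$ is preferred to $X_1$, no matter how close $r$ is to 1.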
509:
Note that, in this example, the support of $\mu_1$ is a subset of that
of $\mu_0$. To obtain a bijection between LPS's and CPS's, we cannot
allow much overlap between the supports of the measures that make up an
LPS. What counts as ``much overlap'' turns out to be somewhat subtle.
514: One way to formalize it was proposed by BBD. They defined a {\em
515: lexicographic conditional probability space (LCPS)\/} to be an LPS such
516: that,
517: %joe6
518: roughly speaking,
519: the probability measures in the sequence have disjoint supports;
520: more precisely, there exist sets $U_\beta \in \F$ such that $\mu_\beta(U_\beta)
521: = 1$ and the sets $U_\beta$ are pairwise disjoint for $\beta < \alpha$.
522: One motivation for considering disjoint sets is to consider an agent who
523: has a sequence of hypotheses $(h_0, h_1, \ldots)$ regarding how the
world works. If the primary hypothesis $h_0$ is discarded, then the
525: agent judges events according to $h_1$; if $h_1$ is discarded, then the
526: agent uses $h_2$, and so on. Associated with hypothesis $h_\beta$ is
527: the probability measure $\mu_\beta$. What would cause $h_\beta$ to be
528: discarded is observing an event $U$ such that $\mu_\beta(U) = 0$.
529: The set $U_\beta$ is the support of the hypothesis $h_\beta$. In some
530: cases, it seems reasonable to think of the supports of these hypotheses
531: as disjoint. This leads to LCPS's.
532:
533: BBD considered only finite spaces. When we move to infinite spaces,
534: requiring disjointness of the supports of hypotheses may be too strong.
535: Brandenburger, Friedenberg, and Keisler \citeyear{BFK04} consider
536: finite-length LPS's $\vecmu$ that satisfy the property that
537: there exist sets $U_\beta$ (not necessarily disjoint) such that
538: $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ for $\gamma \ne
539: \beta$. Call such an LPS an MSLPS (for \emph{mutually singular LPS}).
540: Let a {\em structured LPS (SLPS)\/} be an LPS $\vecmu$ such that there
541: exist sets $U_\beta \in \F$ such that $\mu_\beta(U_\beta) = 1$ and
542: $\mu_\beta(U_\gamma) = 0$ for $\gamma > \beta$.
543: Thus, in an SLPS, later hypotheses are given probability 0 according to
544: the probability measure induced by earlier hypotheses, but earlier
hypotheses do not necessarily get probability 0 according to the later
546: hypotheses. (Spohn~\citeyear{Spohn86} also considered SLPS's; he called
547: them {\em dimensionally well-ordered families of probability measures}.)
548: Clearly every LCPS is an MSLPS, and every MSLPS is an SLPS. If $\alpha$
549: is countable and we require countable additivity (or if $\alpha$ is
550: finite) then the notions are easily seen to coincide. Given an SLPS
551: $\vecmu$ with associated sets $U_\beta, \beta <
552: \alpha$, define $U_\beta' = U_\beta - (\union_{\gamma > \beta} U_\gamma)$.
553: The sets $U_\beta'$ are
554: clearly pairwise disjoint elements of $\F$, and
555: %joe6
556: %$U_i'$ is a support for $\mu_i$.
557: $\mu_\beta(U_\beta') = 1$.
558: %Of course, the same argument holds even without the assumption
559: %of countable additivity if $\alpha$ is finite.
However, in general, LCPS's are a strict subset of MSLPS's, and MSLPS's
are a strict subset of SLPS's, as the following two examples show.
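
Before turning to the examples, it may help to see the construction of
the sets $U_\beta'$ in a small finite case (the particular space is an
assumption made only for illustration). Let $W = \{w_1,w_2,w_3\}$, let
$\F$ consist of all subsets of $W$, and let $\vecmu = (\mu_0,\mu_1)$,
where $\mu_0(w_1) = \mu_0(w_2) = 1/2$ and $\mu_1(w_3) = 1$. Taking
$U_0 = W$ and $U_1 = \{w_3\}$ shows that $\vecmu$ is an SLPS, since
$\mu_0(U_0) = \mu_1(U_1) = 1$ and $\mu_0(U_1) = 0$. These particular
sets are not disjoint, and $\mu_1(U_0) = 1$. However, the sets
\[
U_0' = U_0 - U_1 = \{w_1,w_2\} \quad \mbox{and} \quad U_1' = U_1 = \{w_3\}
\]
are disjoint, and $\mu_0(U_0') = \mu_1(U_1') = 1$, so $\vecmu$ is also
an LCPS.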
562:
563:
564: %joe6: new paragraph
565: \commentout{
566: To understand the motivation for SLPS's, consider an agent with a
567: sequence of hypotheses (modeled as probability distributions). The first
568: hypothesis, modeled by $\mu_0$, is used as long as it is not
569: controverted by evidence. If an event $E$ is discovered that shows
570: that the first hypothesis must be wrong (i.e., $\mu_0(E) = 0)$, then
571: next hypothesis that gives $E$ positive measure is used. With this
572: intuition, it seems reasonable that if $i > j$, the set of states where
573: $\mu_j$ is used, namely, $U_j$, should be a set that is given measure 0
574: by all $\mu_i$ with $i < j$; hypothesis $j$ should not be used unless
575: all higher-ranking hypotheses have been discarded.
576: }
577:
578: % However, in general, countable additivity is required, as a
579: %simple modification of Example~\ref{SLPSxam} below shows.)
580:
581:
582: \xam\label{SLPSxam} Consider a well-ordering of the interval $[0,1]$,
583: that is,
584: %joe2
585: %an isomorphism
586: a bijection
587: from $[0,1]$ to an initial segment of the ordinals.
588: %a well-ordering of the reals. The existence of such a well-ordering is
589: %known to be equivalent to the Axiom of Choice. In any case,
590: Suppose that this initial segment of the ordinals has length $\alpha$.
591: Let $([0,1],\F,\vecmu)$ be an LPS of length $\alpha$ where $\F$ consists
592: of the Borel subsets of $[0,1]$.
593: Let $\mu_0$ be the standard Borel measure on $[0,1]$,
594: and let $\mu_\beta$ be the measure that gives probability 1 to
595: $r_\beta$, the $\beta$th real in the well-ordering. This clearly gives
596: an SLPS, since
597: %joe6
598: %the support of $\mu_0$ is $[0,1]$ and the support of $\mu_\beta$ for $0
we can take $U_0 = [0,1]$ and $U_\beta = \{r_\beta\}$ for $0
< \beta < \alpha$; note that $\mu_\beta(U_\gamma) = 0$ for $\gamma > \beta$.
601: %joe9
602: %However, this SLPS is not equivalent to any LCPS; there
603: However, this SLPS is not equivalent to any MSLPS (and hence not to any
604: LCPS); there
605: %joe6
606: %is no support of $\mu_0$ which is disjoint from the supports of
607: %joe9
608: is no
609: set $U_0'$ such that $\mu_0(U_0') = 1$ and $U_0'$ is disjoint from
$\{r_\beta\}$ for all $\beta$ with $0 < \beta < \alpha$.
611: %The intuition given above for SLPS's is also given by Brandenburger,
612: %Friedenberg, and Keisler for MSLPS's.}
613: %joe8
614: %I would argue that this example
615: %illustrates that the mutual singularity requirement is too strong to capture
616: %that intuition; it suffices that $\mu_\alpha(U_\beta) = 0$ for $\beta >
617: %\alpha$ (as is required for SLPS's). If $r_\beta$ is observed, then
618: %the agent should discard hypothesis $\mu_0$ and use hypothesis
619: %$\mu_\beta$, even though $\mu_\beta([0,1]) = 1$.}
620: \exam
621:
622: %joe9
623: \xam\label{MSLPSxam} Suppose that $W = [0,1] \times [0,1]$. Again,
624: consider a well-ordering on $[0,1]$. Using the notation of
Example~\ref{SLPSxam}, define $U_{0,\beta} = \{r_{\beta}\} \times [0,1]$ and
626: $U_{1,\beta} = [0,1] \times \{r_\beta\}$. Define $\mu_{i,\beta}$ to be
627: the Borel measure on $U_{i,\beta}$. Consider the LPS $(\mu_{0,0},
628: \mu_{0,1}, \ldots, \mu_{1,0}, \mu_{1,1}, \ldots)$. Clearly this is an
629: MSLPS, but not an LCPS. \exam
630:
631: The difference between LCPS's, MSLPS's, and SLPS's does not arise in the work
632: of BBD, since they consider only finite
633: sequences of measures. The restriction to finite sequences, in turn, is
634: due to their restriction to finite sets $W$ of possible worlds.
635: Clearly, if $W$ is finite, then all LCPS's over $W$ must have length $\le
|W|$, since the measures in an LCPS have disjoint supports. In this
paper, where $W$ may be infinite, the difference plays a more significant role.
638:
639: We can put an obvious lexicographic order $<_L$ on sequences $(x_0, x_1,
640: \ldots)$ of numbers in $[0,1]$ of length $\alpha$: $(x_0, x_1, \ldots)
641: <_L (y_0, y_1, \ldots)$ if there exists $\beta < \alpha$ such that
642: $x_\beta < y_\beta$ and $x_\gamma
643: = y_\gamma$ for all $\gamma < \beta$. That is, we
644: compare two sequences by comparing their components at the first place
645: they differ. (Even if $\alpha$ is infinite, because we are dealing with
646: ordinals, there will be a least ordinal at which the sequences differ if
647: they differ at all.) This lexicographic order will be used
648: to define decision rules.
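
For instance, with $\alpha = 3$, we have $(1/2, 0, 1) <_L (1/2, 1/3, 0)$:
the two sequences agree at position 0 and first differ at position 1,
where $0 < 1/3$; the components at position 2 play no role. (The
particular numbers are chosen only to illustrate the definition.)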
649: %I return to this issue in Section~\ref{??}.
650:
651: BBD define conditioning in LPS's as follows. Given $\vecmu$ and $U \in
652: \F$ such that $\vecmu(U) > 0$, let $\vecmu|U =
653: (\mu_{k_0}(\cdot \mid U), \mu_{k_1}(\cdot \mid U), \ldots )$, where $(k_0,
654: k_1, \ldots)$ is the subsequence of all indices for which the
655: probability of $U$ is positive.
656: Formally, $k_0 = \min\{k: \mu_k(U) > 0\}$
and for an arbitrary ordinal $\beta > 0$, if $k_\gamma$ has been
658: defined for all $\gamma < \beta$ and there exists a measure $\mu_{\delta}$ in
659: $\vecmu$ such that $\mu_{\delta}(U) > 0$ and $\delta > k_\gamma$ for all
660: $\gamma < \beta$, then $k_\beta = \min\{\delta: \mu_{\delta}(U) > 0, \,
661: \delta > k_\gamma \mbox{ for all } \gamma < \beta\}$.
662: Note that
663: $\vecmu|U$ is undefined if $\vecmu(U) = 0$.
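
As a small illustration (the space and measures here are chosen only
for concreteness), let $W = \{w_1, w_2, w_3\}$, $\F = 2^W$, and
$\vecmu = (\mu_0, \mu_1)$, where $\mu_0(w_1) = \mu_0(w_2) = 1/2$ and
$\mu_1(w_3) = 1$. If $U = \{w_2, w_3\}$, then $\mu_0(U) = 1/2$ and
$\mu_1(U) = 1$, so $k_0 = 0$, $k_1 = 1$, and
$\vecmu|U = (\mu_0(\cdot \mid U), \mu_1(\cdot \mid U))$, with
$\mu_0(w_2 \mid U) = 1$ and $\mu_1(w_3 \mid U) = 1$. If instead
$U' = \{w_3\}$, then $\mu_0(U') = 0$ and $\mu_1(U') = 1$, so $k_0 = 1$
and $\vecmu|U' = (\mu_1(\cdot \mid U'))$ has length 1.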
664:
665: \subsection{Nonstandard probability spaces}\label{NPSdef}
666:
667: It is well known that there exist {\em non-Archimedean fields}---fields
668: that include the real numbers as a subfield but also have
669: {\em infinitesimals\/}, numbers that are positive but still less than
670: any positive real number. The smallest such non-Archimedean field,
671: commonly denoted $\IR(\epsilon)$, is the smallest field generated by
672: adding to the reals a single infinitesimal $\epsilon$.%
673: \footnote{The construction of $\IR(\epsilon)$ apparently goes back to
674: Robinson \citeyear{Robinson73}. It is reviewed by
675: Hammond \citeyear{Hammond94,Hammond99} and Wilson
676: \citeyear{Wilson95} (who calls $\IR(\epsilon)$ the {\em extended
677: reals\/}).}
678: %joe3
679: %The {\em hyperreals}, nonstandard models of the reals that satisfy
680: %all the first-order properties that hold of the real numbers (see
681: %\cite{(Davis77}), are also instances of non-Archimedean fields.
682: We can further restrict to non-Archimedean fields that are
683: \emph{elementary extensions} of the standard reals: they
684: agree with the
685: standard reals on all properties that can be expressed in a first-order
686: language with a predicate $N$ representing the natural numbers.
687: For most of this paper, I use only the following properties
688: of non-Archimedean fields:
689: %in the language of arithmetic (i.e., first-order formulas involving 0,
690: %1, $+$, and $\times$). That is, there are nonstandard models of the
691: %reals that include infinitesimals that satisfy a first order formula
692: %$\phi$ in the language of arithmetic iff $\phi$ is satisfied in the
693: %%nstandard reals. The existence of such nonstandard models follows easily
694: %from the compactness theorem for first-order logic. (See \cite{Davis}
695: %for an introduction to both nondstandard reals and the relevant logic.)
696: %The logical details do not matter. All that I need for the purposes of
697: %this paper are the following basic facts, which I shall use without
698: %further comment in the remainder of the paper.
699: %\begin{itemize}
700: %\item There exist nonstandard models of the reals. (By nonstandard
701: %model here I mean any model that satisfies all the first-order
702: %properties of the reals but is not isomorphic to the reals.)
703: %Each of these nonstandard models embeds a nonstandard model of the
704: %natural numbers.
705: %\item
706: %Moreover, for any ordinal alpha, there exists a nonstandard model with
707: %%nnonstandard natural number $N_\beta$ for all ordinals $\beta \le \alpha$ such
708: %that if $\beta < \beta'$ then $N_\beta < N_{\beta'}$.
709: %(The existence of such models follows immediately from the compactness
710: %theorem too.)
711: \begin{enumerate}
712: \item If $\IR^*$ is a non-Archimedean field, then for all $b \in \IR^*$
713: such that $-r < b < r$ for some standard real $r > 0$,
714: there is a unique closest real number $a$ such that $|a - b|$ is an
715: infinitesimal. (Formally, $a$ is the inf of the set of real numbers
716: that are at least as large as $b$.) Let $\stand{b}$ denote the closest
717: standard real to $b$; $\stand{b}$ is sometimes read ``the standard
718: part of $b$''.
719: \item If $\stand{\epsilon/\epsilon'} =0$, then $a \epsilon < \epsilon'$
720: for all positive standard real numbers $a$. (If $a \epsilon$ were
721: greater than $\epsilon'$, then $\epsilon/\epsilon'$ would be greater
722: than $1/a$,
723: contradicting the assumption that $\stand{\epsilon/\epsilon'} = 0$.)
724: \end{enumerate}
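To illustrate both properties in $\IR(\epsilon)$ (the particular
elements are chosen only as an example): if $b = 3 + 2\epsilon -
\epsilon^2$, then $-4 < b < 4$ and $|3 - b| = 2\epsilon - \epsilon^2$ is
an infinitesimal, so $\stand{b} = 3$. Similarly, taking $\epsilon^2$ and
$\epsilon$ in the roles of $\epsilon$ and $\epsilon'$ in the second
property, $\stand{\epsilon^2/\epsilon} = \stand{\epsilon} = 0$, so
$a \epsilon^2 < \epsilon$ for all positive standard real numbers $a$.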
725:
726: Given a non-Archimedean field $\IR^*$, a {\em
727: nonstandard probability space (NPS) over $(W,\F)$ (with range $\IR^*$)\/} is
728: a tuple $(W,\F,\mu)$, where $W$ is a set of possible worlds, $\F$ is an
729: algebra
730: of subsets of $W$, and $\mu$ assigns to sets in $\F$
731: %joe2
732: %an element of
733: a nonnegative element of
734: $\IR^*$ such that $\mu(W) = 1$ and $\mu(U \union V) = \mu(U) + \mu(V)$ if
735: $U$ and $V$ are disjoint.%
736: \footnote{Note that, unlike Hammond \citeyear{Hammond94,Hammond99},
737: I do not restrict the range of probability measures to consist of
738: ratios of polynomials in $\epsilon$ with nonnegative coefficients.}
739:
740: If $W$ is infinite, we may also require that
741: $\F$ be a $\sigma$-algebra and that $\mu$ be countably additive.
742: (There are some subtleties involved with countable additivity in
743: nonstandard probability spaces; see Section~\ref{countableadditivity}.)
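
For example (an illustration only), take $W = \{w_1, w_2\}$, $\F = 2^W$,
and range $\IR(\epsilon)$, and define $\mu(\emptyset) = 0$,
$\mu(\{w_1\}) = 1 - \epsilon$, $\mu(\{w_2\}) = \epsilon$, and $\mu(W) =
1$. All four values are nonnegative elements of $\IR(\epsilon)$, and
$\mu(\{w_1\}) + \mu(\{w_2\}) = 1 = \mu(W)$, so $(W,\F,\mu)$ is an NPS.
Note that $\mu(\{w_2\}) > 0$ although $\stand{\mu(\{w_2\})} = 0$.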
744:
745:
746:
747: \section{Relating Popper Spaces to (S)LPS's}\label{FCP}
748:
749: In this section, I consider a mapping $\FCP$ from SLPS's over
750: $(W,\F)$ to Popper spaces over $(W,\F)$, for each fixed $W$ and $\F$,
751: and show that, in many cases of interest, $\FCP$ is a bijection.
752: Given an SLPS $(W,\F,\vecmu)$ of length $\alpha$,
753: consider the cps $(W,\F,\F',\mu)$ such that $\F' = \union_{\beta <
754: \alpha} \{V \in \F: \mu_\beta(V)
755: > 0 \}$. For $V \in \F'$, let $\beta_V$
be the smallest index such that $\mu_{\beta_V}(V) > 0$. Define $\mu(U \mid V) =
757: \mu_{\beta_V}(U \mid V)$. I leave it to the reader to check that
758: $(W,\F,\F',\mu)$ is a Popper space.
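
To see how the definition works in a simple case (the space and
measures are chosen only for illustration), let $W = \{w_1,w_2,w_3\}$,
$\F = 2^W$, and $\vecmu = (\mu_0,\mu_1)$, where $\mu_0(w_1) = \mu_0(w_2)
= 1/2$ and $\mu_1(w_3) = 1$. This is an SLPS, as witnessed by $U_0 =
\{w_1,w_2\}$ and $U_1 = \{w_3\}$. Every nonempty set gets positive
probability from $\mu_0$ or $\mu_1$, so $\F' = 2^W - \{\emptyset\}$.
For $V = \{w_2,w_3\}$, we have $\beta_V = 0$, so $\mu(w_2 \mid V) =
\mu_0(w_2 \mid V) = 1$ and $\mu(w_3 \mid V) = 0$; for $V = \{w_3\}$, we
have $\beta_V = 1$, so $\mu(w_3 \mid V) = \mu_1(w_3 \mid V) = 1$.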
759:
760: %joe2
761: %There are many isomorphisms between two spaces.
762: There are many bijections between two spaces.
763: Why is $\FCP$ of interest?
764: Suppose that $\FCP(W,\F,\vecmu) = (W,\F,\F',\mu)$. It is easy
765: to check that the following two important properties hold:
766: \begin{enumerate}
767: \item $\F'$ consists precisely of those events for which conditioning in
768: the LPS is defined; that is, $\F' = \{U:
769: %joe9
770: %\mu_\beta(U) \ne 0 \mbox{ for some } \mu_\beta \in \vecmu\}$.
771: \vecmu(U) > 0\}$.
772: \item For $U \in \F'$, $\mu(\cdot \mid U) = \mu'(\cdot \mid U)$, where
773: $\mu'$ is the first probability measure in the sequence $\vecmu|U$.
774: That is, the
775: Popper measure agrees with the most significant probability measure
776: in the conditional LPS given $U$. Given that an LPS assigns to an event
777: $U$ a sequence of numbers and a Popper measure assigns to $U$ just a
778: single number, this is clearly the best single number to take.
779: \end{enumerate}
780: %It seems that these are minimal properties that an
781: %isomorphism should satisfy. Moreover,
782: It is clear that these two properties in fact characterize $\FCP$.
783: Thus, $\FCP$ preserves the events on which conditioning is possible and
784: the most significant term in the lexicographic probability.
785:
786: \subsection{The finite case}
787: It is useful to separate the analysis of $\FCP$ into two cases, depending
788: on whether or not the state space is finite. I consider the finite case
789: first.
790:
791: BBD claim without proof that $\FCP$ is a bijection
792: from LCPS's to
793: conditional probability spaces. They work in finite spaces $W$ (so that
794: LCPS's are equivalent to SLPS's) and restrict
795: attention to LPS's where $\F
796: %joe4
797: %= 2^W$ and $\F' = 2^W - \emptyset$ (so that conditioning is defined for
798: = 2^W$ and $\F' = 2^W - \{\emptyset\}$ (so that conditioning is defined for
799: all nonempty sets). Since $\F' = 2^W - \{\emptyset\}$, the cps's they
800: consider are all Popper spaces.
801: Hammond \citeyear{Hammond94} provides a careful proof of this result,
802: under the restrictions considered by BBD.
803: I generalize Hammond's result by considering
804: %joe9
805: %arbitrary finite Popper spaces
806: finite Popper spaces
807: with arbitrary conditioning events.
808: No new conceptual issues arise in doing this extension; I
809: include it here only to be able to contrast it with the other
810: results.
811: %Hammond's result holds for arbitrary finite Popper spaces, with
812: %essentially no change in proof.
813:
Let $\SLPS(W,\F)$ denote the set of SLPS's over $(W,\F)$; let
$\SLPS(W,\F,\F')$ denote the set of SLPS's $(W,\F,\vecmu)$ such that
816: $\vecmu(U) > 0$ for all $U \in \F'$ (i.e., $\mu_\beta(U) > 0$ for some
817: $\beta$); as usual, I use a superscript $c$ to denote countable
818: additivity, so, for example, $\SLPS^c(W,\F)$ denotes the set of
819: countably additive SLPS's over $(W,\F)$.
Let $\Popper(W,\F,\F')$ denote the set of Popper spaces of the form
$(W,\F,\F',\mu)$ and let $\Popper^c(W,\F,\F')$
822: denote the set of Popper spaces of the form
823: $(W,\F,\F',\mu)$ where $\mu$ is countably additive.
824:
825:
826: \thm\label{FCPfin}
827: %joe9
828: %If $W$ is finite, then $\FCP$ is a bijection
829: %from $\SLPS(W,\F)$ to $\Popper(W,\F)$. \ethm
830: If $W$ is finite, then
831: $\FCP$ is a bijection from $\SLPS(W,\F,\F')$ to $\Popper(W,\F,\F')$. \ethm
832:
833: \prf It is immediate from the definition that if $(W,\F,\vecmu) \in
834: \SLPS(W,\F,\F')$, then $\FCP(W,\F,\vecmu) \in \Popper(W,\F,\F')$. It is
835: also straightforward to show that $\FCP$ is an injection (see the
836: appendix for details). The work comes in showing that $\FCP$ is a
837: surjection (or, equivalently, in constructing an inverse to $\FCP$).
838: I sketch the main ideas of the argument here, leaving details to the
839: appendix.
840:
Given $\mu \in \Popper(W,\F,\F')$, the idea is to choose $k < |W|$ and $k+1$
disjoint sets $U_0, \ldots, U_k \in \F'$ appropriately such that $\mu_j
= \mu \mid U_j$ for $j= 0, \ldots, k$ (i.e., $\mu_j(V) = \mu(V \mid
U_j)$) and $\FCP(W,\F,\vecmu) = \mu$. Since the sets $U_0, \ldots, U_k$
845: are disjoint, $\vecmu$ must be an SLPS. The difficulty lies in choosing
846: $U_0, \ldots, U_k$ so that $\vecmu(U) > 0$ iff $U \in \F'$. This is
done as follows. Let $U_0$ be the smallest set $U \in \F$ such that
$\mu(U \mid W) = 1$.
849: Since $W$ is finite, there is such a smallest set; it is simply the
850: intersection of all sets $U$ such that $\mu(U \mid W) = 1$. Since $\mu(U_0
851: \mid W) > 0$, it follows that $U_0 \in \F'$. If $\overline{U}_0 \notin
\F'$, then (because $\F'$ is closed under supersets in $\F$) no
853: subset of $\overline{U}_0$ is in $\F'$. If $\overline{U}_0 \in
854: \F'$, let $U_1$ be the smallest set in
855: $\F$ such that $\mu(U_1 \mid \overline{U}_0) = 1$. Note that $U_1
856: \subseteq \overline{U}_0$ and that $U_1 \in \F'$.
857: Continuing in this way, it is
858: clear that there exists a $k \ge 0$ and a sequence of pairwise disjoint
859: sets $U_0, U_1, \ldots, U_k$ such that (1) $U_i \in \F'$ for $i = 0,
860: \ldots, k$, (2) for $i < k$, $\overline{U_0 \union \ldots \union U_i}
\in \F'$ and $U_{i+1}$ is the smallest set in $\F$ such that
862: $\mu(U_{i+1} \mid \overline{U_0 \union \ldots \union U_i}) = 1$, and (3)
863: $\overline{U_0 \union \ldots \union U_k} \notin \F'$.
864: Condition (2) guarantees that $U_{i+1}$ is a subset of
865: $\overline{U_0 \union \ldots \union U_i}$, so the $U_i$'s are pairwise
disjoint. Define the LPS $\vecmu = (\mu_0, \ldots, \mu_k)$ by taking
867: $\mu_i(V) = \mu(V \mid U_i)$. Clearly the support of $\mu_i$ is $U_i$, so
868: this is an LCPS (and hence an SLPS). \eprf
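
As an illustration of this construction (with a Popper space chosen
only for concreteness), let $W = \{w_1,w_2,w_3\}$, $\F = 2^W$, and $\F' =
2^W - \{\emptyset\}$, and suppose that $\mu(\cdot \mid V)$ is uniform on
$V \inter \{w_1,w_2\}$ if $V \inter \{w_1,w_2\} \ne \emptyset$, while
$\mu(w_3 \mid \{w_3\}) = 1$. The smallest set $U$ such that $\mu(U \mid
W) = 1$ is $U_0 = \{w_1,w_2\}$. Since $\overline{U}_0 = \{w_3\} \in
\F'$, the next set is the smallest $U_1$ such that $\mu(U_1 \mid
\{w_3\}) = 1$, namely $U_1 = \{w_3\}$. Finally, $\overline{U_0 \union
U_1} = \emptyset \notin \F'$, so the construction stops with $k = 1$.
The resulting LPS is $(\mu(\cdot \mid U_0), \mu(\cdot \mid U_1))$: its
first measure is uniform on $\{w_1,w_2\}$ and its second gives
probability 1 to $w_3$.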
869:
870: %%joetark
871: %\thm\label{FCPfin} {\rm \cite{Hammond94}} If $W$ is finite, then $\FCP$
872: %is an isomorphism from $\SLPS(W,\F)$
873: %to $\Popper(W,\F)$. \ethm
874:
875: \cor\label{FCPfin1}
876: If $W$ is finite, then $\FCP$ is a bijection
877: from $\SLPS(W,\F)$ to $\Popper(W,\F)$. \ecor
878:
879:
880: %That is, Popper spaces are strictly more general than SLPS's in the case
881: %of infinite spaces where the probability measures are not necessarily
882: %countably additive. On the other hand, I show that $\FCP$ is an
883: %isomorphism from countably additive SLPS's to countable
884: %additive Popper spaces.
885:
886:
887: %\subsection{Technical results}
888:
889:
890:
891: %\subsection{Infinite State Spaces without Countability}
892: \subsection{The infinite case}
893:
894: The case where the state space $W$ is infinite is not considered
895: by either BBD or Hammond. It presents some interesting subtleties.
896:
897: It is easy to see that $\FCP$ is an injection from
898: %joe9
899: %from SLPS's to Popper spaces. However,
900: SLPS's to Popper spaces. However,
901: %joetark: added next line
902: as the following two examples show, if we do not require countable
903: additivity, then it is not a bijection.
904:
905:
906:
907: \xam\label{counter1} (This example is essentially due to Robert
908: Stalnaker [private communication, 2000].) Let $W = \IN$, the natural
909: numbers, let $\F$ consist of the finite and cofinite subsets of $\IN$
910: %joe6
911: (recall that a cofinite set is the complement of a finite set),
912: and let $\F' = \F - \{\emptyset\}$. If $U$ is cofinite,
913: take $\mu^1(V \mid U)$ to be 1 if $V$ is cofinite and 0 if $V$ is finite.
914: If $U$ is finite, define $\mu^1(V \mid U) = |V \inter
915: U|/|U|$. I leave it to the reader to check that $(\IN,\F,\F',\mu^1)$ is a
916: %joe9
917: %Popper space. Suppose there were some LPS $(\IN,\F,\vecmu)$ which was
918: Popper space. Note that $\mu^1$ is not countably additive (since
919: $\mu^1(\{i\} \mid \IN) = 0$ for all $i$, although $\mu^1(\IN \mid \IN) =
920: 1$).
921: Suppose that there were some LPS $(\IN,\F,\vecmu)$ which was
922: mapped by $\FCP$ to this Popper space. Then it is easy to check that
923: if $\mu_i$ is the first measure in $\vecmu$ such that $\mu_i(U) > 0$ for
924: some finite set $U$, then $\mu_i(U') > 0$ for all nonempty finite sets $U'$.
925: To see this, note that for any nonempty finite set $U'$, since
926: $\mu_i(U) > 0$, it follows that $\mu_i(U \union U') > 0$. Since $U
927: \union U'$ is finite, it must be the case that $\mu_i$ is the first
928: measure in $\vecmu$ such that $\mu_i(U \union U') > 0$. Thus, by
929: definition, $\mu^1(U' \mid U \union U') = \mu_i(U' \mid U \union U')$. Since
930: $\mu^1(U' \mid U \union U') > 0$, it follows that $\mu_i(U') > 0$.
931: Thus, $\mu_i(U') > 0$ for all nonempty finite sets $U'$.
932:
933: It is also easy to see that $\mu_i(U)$ must be proportional
934: to $|U|$ for all finite sets $U$. To show this, it clearly suffices to show
935: that $\mu_i(n) = \mu_i(0)$ for all $n \in \IN$. But this is immediate
936: from the observation that
937: $$\mu_i(\{0\} \mid \{0, n \}) = \mu^1(\{0\} \mid \{0, n \}) =
938: |\{0\}|/|\{0,n\}| = \frac{1}{2}.$$
939: But there is no probability measure $\mu_i$ on the natural
940: numbers such that $\mu_i(n) = \mu_i(0) > 0$ for all $n \ge 0$.
941: %For, by countable additivity, if
942: %$\mu_i(0) = 0$ then $\mu_i(\IN) = 0$ and if $\mu_i(0) > 0$, then
943: %$\mu_i(\IN) = \infty$.
944: For if $\mu_i(0) > 1/N$, then $\mu_i(\{0, \ldots, N-1\}) > 1$, a
945: contradiction.
946: (See Example~\ref{counter3} for further discussion of this setup.)
947: \exam
948:
949: %joetark
950: %\commentout{
951: \xam\label{counter2} Again, let $W = \IN$,
952: let $\F$ consist of the finite and cofinite subsets of $\IN$,
953: and let $\F' = \F - \{\emptyset\}$. As with $\mu^1$,
954: if $U$ is cofinite,
955: take $\mu^2(V \mid U)$ to be 1 if $V$ is cofinite and 0 if $V$ is finite.
956: However, now, if $U$ is finite, define $\mu^2(V \mid U) =
957: 1$ if $\max(V \inter U) = \max U$, and $\mu^2(V \mid U) = 0$ otherwise.
958: Intuitively, if $n > n'$, then $n$ is infinitely more probable than $n'$
959: according to $\mu^2$.
960: Again, I leave it to the reader to check that $(\IN,\F,\F',\mu^2)$ is a
961: Popper space. Suppose there were some LPS $(\IN,\F,\vecmu)$ which was
962: mapped by $\FCP$ to this Popper space. Then it is easy to check that
963: if $\mu_n$ is the first measure in $\vecmu$ such that $\mu_n(\{n\}) > 0$, then
964: $\mu_n$ comes before $\mu_{n'}$ in $\vecmu$ if $n > n'$. However, since
965: $\vecmu$ is well-founded, this is impossible.
966: \exam
967:
968: %joetark:
969: %As the following theorem shows, there are no such counterexamples if we
970: As the following theorem,
971: %joe6
972: originally
973: proved by Spohn \citeyear{Spohn86}, shows,
974: there are no such counterexamples if we
975: restrict to countably additive SLPS's and countably additive Popper spaces.
976:
977:
978:
979: \thm\label{infiso} {\rm \cite{Spohn86}}
980: For all $W$, the map $\FCP$ is a bijection from
981: $\SLPS^c(W,\F,\F')$
982: to $\Popper^c(W,\F,\F')$. \ethm
983:
984: \prf Again, the difficulty comes in showing that $\FCP$ is onto. Given
985: a Popper space $(W,\F,\F',\mu)$, I again construct sets $U_0, U_1,
986: \ldots$ and an LPS $\vecmu$ such that $\mu_\beta(V)=\mu(V \mid
987: U_\beta)$, and show that $\FCP(W,\F,\vecmu) = (W,\F,\F',\mu)$.
988: However, now a completely different construction is required; the
989: earlier inductive construction of the
990: sequence $U_0, \ldots, U_k$ no longer works. The problem already arises
991: in the construction of $U_0$. There may no longer be a smallest set
992: $U_0$ such that $\mu(U_0) = 1$. Consider, for example, the interval
993: $[0,1]$ with Borel measure. There is clearly no smallest subset $U$ of
994: $[0,1]$ such that $\mu(U) = 1$. The details can be found in the appendix.
995: \eprf
996:
997:
998: \cor\label{infiso1} For all $W$, the map $\FCP$ is a bijection from
999: $\SLPS^c(W,\F)$
1000: to $\Popper^c(W,\F)$.
1001: \ecor
1002:
1003: It is important in Corollary~\ref{infiso1} that we consider SLPS's and not
1004: %joe9
1005: %LCPS's. $\FCP$ is in fact not a bijection from LCPS's to Popper
1006: MSLPS's or LCPS's. $\FCP$ is in fact not a bijection from MSLPS's or
1007: LCPS's to Popper
1008: spaces.
1009:
1010: \xam\label{SLPSxam2} Consider the Popper space $([0,1],\F,\F',\mu)$
1011: which is the image under $\FCP$ of the SLPS constructed in
1012: Example~\ref{SLPSxam}. It is easy to see that this Popper space cannot
1013: %joe9
1014: %be the image under $\FCP$ of some LCPS. \exam
be the image under $\FCP$ of any MSLPS (and hence not of any LCPS
either). \exam
1017:
1018: %joe9: moved here
1019: \subsection{Treelike CPS's}\label{sec:BS}
1020:
1021: %joe10
1022: %One of the ``minimal'' requirements for $\F \times \F'$ to be a Popper
1023: %algebra over $W$
1024: One of the requirements in a Popper space is that
1025: $\F'$ be closed under supersets in $\F$. If we
1026: think of $\F'$ as consisting of all sets on which conditioning is
1027: possible, this makes sense; if we can condition on a set $U$, we should
be able to condition on a superset $V$ of $U$. But if we think of $\F'$
as representing all the possible evidence that can be obtained (and
thus, the set of events on which an agent must be able to
1031: condition, so as to update her beliefs), there is no reason that $\F'$
1032: should be closed under supersets; nor, for that matter, is it
1033: necessarily the case that if $U \in \F'$ and $\mu(V \mid U) \ne 0$, then
1034: $V \inter U \in \F'$.
1035: In general, a cps where $\F'$ does not have these properties
1036: cannot be represented by an LPS, as the following
1037: example shows.
1038:
1039: \xam\label{xam:noPopper} Let $W = \{w_1, w_2, w_3, w_4\}$, let $\F$ consist
1040: of all subsets of $W$, and let $\F'$ consist of all the 2-element
1041: subsets of $W$.
1042: %joe10
1043: %Clearly $\F \times \F'$ is not a Popper algebra, since
1044: Clearly
1045: $\F'$ is not closed under supersets. Define $\mu$ on $\F \times \F'$
1046: such that $\mu(w_1 \mid \{w_1,w_3\}) = \mu(w_4 \mid
1047: \{w_2,w_4\}) = 1/3$, and $\mu(w_1 \mid \{w_1,w_2\}) = \mu(w_4 \mid
1048: \{w_3,w_4\}) =
1049: 1/2$, and CP1 and CP2 hold. This is easily seen to determine $\mu$.
Moreover, $\mu$ vacuously satisfies CP3, since there do not exist
1051: distinct sets $U$ and $X$ in $\F'$ such that $U \subseteq X$. It is
1052: easy to show that there is no unconditional probability $\mu^*$ on $W$
1053: such that $\mu^*(U \mid V) = \mu(U \mid V)$ for all pairs $(U,V) \in \F
1054: \times \F'$ such that $\mu^*(V) > 0$ (where, for $\mu^*$, the
1055: conditional probability is defined in the standard way).%
1056: \footnote{This example is closely related to examples of conditional
1057: probabilities for which there is no common prior; see, for example,
1058: \cite[Example 2.2]{Hal21}.} It easily follows that there is no LPS
1059: $\vecmu$ such that $\vecmu(U \mid V) = \mu(U \mid V)$ for all $(U,V) \in
1060: \F \times \F'$ (since otherwise $\mu_0$ would agree with $\mu$ on all
pairs $(U,V) \in \F \times \F'$ such that $\mu_0(V) > 0$).
1062: Had $\F'$ been closed under supersets, it would have included $W$. It
1063: is easy to see that it is impossible to extend $\mu$ to $\F \times (\F'
1064: \union \{W\})$ so that CP3 holds.
1065: \exam
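
The claim in Example~\ref{xam:noPopper} that no such $\mu^*$ exists can
be checked directly; the following is a sketch of one way to do so,
writing $a_j = \mu^*(w_j)$ (notation introduced only for this
calculation). If $\mu^*(\{w_1,w_3\}) > 0$, then the requirement
$\mu^*(w_1 \mid \{w_1,w_3\}) = 1/3$ says that $a_1 = (a_1 + a_3)/3$,
that is, $a_3 = 2a_1$; if $\mu^*(\{w_1,w_3\}) = 0$, then $a_1 = a_3 = 0$
and $a_3 = 2a_1$ holds trivially. Arguing in the same way for the other
three conditioning events gives
$$a_3 = 2a_1, \quad a_2 = 2a_4, \quad a_1 = a_2, \quad a_3 = a_4.$$
Thus $a_3 = 2a_1 = 2a_2 = 4a_4 = 4a_3$, so $a_3 = 0$, and hence
$a_1 = a_2 = a_4 = 0$, contradicting $a_1 + a_2 + a_3 + a_4 = 1$.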
1066:
1067:
1068:
1069: In the game-theory literature, Battigalli and Siniscalchi
1070: \citeyear{BS02} use conditional probability measures to model players'
1071: beliefs about other players' strategies in
1072: extensive-form games where agents have perfect recall. The conditioning
1073: events are essentially
information sets,
%joe9
%Thus, the cps's they consider are not necessarily Popper spaces. The set
%of conditioning events (i.e., the set $\F'$) may not be closed under
%supersets nor does it necessarily satisfy the condition that if
%$U \in \F'$ and $\mu(V \mid U) \ne 0$ then $V \inter U \in \F'$.
which can be thought of as representing the possible evidence that an
agent can obtain in a game.
1082: Thus, the cps's they consider are not necessarily Popper spaces, for the
1083: reasons described above.
1084: Nevertheless, the conditioning events considered by Battigalli and
Siniscalchi satisfy certain properties that
1086: prevent an analogue of Example~\ref{xam:noPopper} from holding.
1087: I now make this precise.
1088:
1089: %joe7:
1090: Formally, I assume that there is a one-to-one
1091: correspondence between the sets in $\F'$ and the information sets of
1092: some fixed player $i$. For each set $U \in \F'$, there is a unique
1093: information set $I_U$ for player $i$ such that $U$ consists of all the
1094: strategy profiles
1095: that reach $I_U$. With this identification, it is immediate that we can
1096: organize the sets in $\F'$ into a forest (i.e., a collection of trees),
1097: with the same ``reachability'' structure as that of the information sets
1098: in the game tree. The topmost sets in the forest are the ones
1099: corresponding to the topmost information
1100: sets for player $i$ in the game tree. There may be several such topmost
1101: information sets if nature or some player $j$ other than $i$ makes the
1102: first move in the game. (That is why we have a forest, rather than a
1103: tree.) The immediate successors of a set $U$ are the sets of strategy
1104: profiles corresponding to information sets for player $i$ reached
1105: immediately after $I_U$.
1106: Because agents have perfect recall, the conditioning events $\F'$ have
1107: the following properties:
1108: \begin{itemize}
1109: \item[T1.] $\F'$ is countable.
1110: \item[T2.] The elements of $\F'$ can be organized as a forest (i.e., a collection of
1111: trees) where, for each $U \in \F'$, if there is an edge from $U$ to
1112: some $U' \in \F'$, then $U' \subseteq U$, all the immediate successors
1113: of $U$ are
1114: disjoint, and $U$ is the union of its immediate successors.
1115: \item[T3.] The topmost nodes in each tree of the forest form a
1116: partition of $W$.
1117: \end{itemize}
1118:
1119: Say that a set $\F'$ is \emph{treelike} if it satisfies T1--3.
1120: It follows from T2 and T3 that, for any sets $U$ and $U'$ in a treelike
1121: set $\F'$, either $U \subseteq U'$ (if $U$ is a descendant of $U'$
1122: in some tree), $U' \subseteq
1123: U$ (if $U'$ is a descendant of $U$), or $U$ and $U'$ are disjoint (if
1124: neither is a descendant of the other).
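For a simple illustration of these conditions (the example is chosen
only for concreteness), let $W = \{w_1,w_2,w_3,w_4\}$ and $\F' = \{W,
\{w_1,w_2\}, \{w_3,w_4\}\}$. Since $\F'$ is finite, T1 holds. The sets
form a single tree with root $W$ whose immediate successors
$\{w_1,w_2\}$ and $\{w_3,w_4\}$ are disjoint subsets of $W$ whose union
is $W$, so T2 holds (reading T2 as constraining only nodes that have
successors). The only topmost node is $W$, which trivially partitions
$W$, so T3 holds. By way of contrast, the set $\F'$ of
Example~\ref{xam:noPopper}, which consists of all the 2-element subsets
of $W$, is not treelike: $\{w_1,w_2\}$ and $\{w_1,w_3\}$ are neither
disjoint nor nested.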
1125: If $\F'$ is treelike, let $\T^c(W,\F,\F')$ consist of all countably
1126: additive cps's defined on
1127: $\F \times \F'$.
1128: %Let $\SLPS^c(W,\F,\F')$ consist of all SLPS's $\vecmu$
1129: %in $\SLPS^c(W,\F)$
1130: %such that $\vecmu(U) > \vec{0}$ for all $U \in \F'$
1131: %(i.e., $\mu_j(U) > 0$ for some $j$).
1132: I abuse notation
1133: in the next result, viewing $\FCP$ as a mapping from
1134: $\SLPS^c(W,\F,\F')$ to $\T^c(W,\F,\F')$.
1135:
1136:
1137: \pro\label{prop:BS} The map $\FCP$ is a surjection from
1138: $\SLPS^c(W,\F,\F')$ onto $\T^c(W,\F,\F')$. \epro
1139:
1140: Since $\F'$ is countable, every SLPS in $\SLPS^c(W,\F,\F')$ must have at
1141: most countable length. Thus, there is no distinction between SLPS's,
LCPS's, and MSLPS's in this case. (Indeed, in the proof of
1143: Proposition~\ref{prop:BS}, the LPS constructed to demonstrate the
1144: surjection is an LCPS.) Note that we cannot hope to get a bijection
1145: here, even if $W$ is finite. For example, suppose that $W = \{w_1,
1146: w_2\}$, $\F = 2^W$, and
1147: %joe9
1148: %$\F' = \{\{w_1\}, \{w_2\}\}$. $\F'$ is clearly treelike. There
1149: $\F' = \{\{w_1\}, \{w_2\}\}$. $\F'$ is clearly treelike, and there
1150: is a unique cps $\mu$ on $(W,\F,\F')$. $\FCP$ maps
every SLPS in $\SLPS(W,\F,\F')$ to $\mu$, but is clearly not a
1152: bijection. (This example also shows that we do not get a bijection by
1153: considering MSLPS's or LCPS's either.)
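
To make this concrete (the particular measures are chosen only for
illustration), note that the unique cps satisfies $\mu(w_1 \mid
\{w_1\}) = \mu(w_2 \mid \{w_2\}) = 1$. Consider the two SLPS's of length
1 given by $\mu_0(w_1) = \mu_0(w_2) = 1/2$ and by $\mu_0'(w_1) = 1/3$,
$\mu_0'(w_2) = 2/3$. Both assign positive probability to $\{w_1\}$ and
to $\{w_2\}$, and conditioning either one on $\{w_j\}$ gives the measure
that assigns probability 1 to $w_j$, so $\FCP$ maps both to $\mu$.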
1154:
1155: %joetark:
1156: \subsection{Related Work}\label{related}
1157:
1158: It is interesting to contrast these results to those of R\'{e}nyi
1159: \citeyear{Renyi56} and van Fraassen \citeyear{vF76}. R\'{e}nyi considers
1160: what he calls {\em dimensionally ordered\/} systems.
1161: A dimensionally ordered system over $(W,\F)$ has the form
1162: $(W,\F,\F',\{\mu_i: i \in I\})$, where $\F$ is an algebra of
1163: subsets of $W$, $\F'$ is a subset of $\F$ closed under finite unions
1164: %joe9
1165: (but not necessarily closed under supersets in $\F$),
1166: $I$ is a totally ordered set (but not necessarily well-founded, so it
1167: may not, for example, have a first element) and $\mu_i$ is a measure on
1168: $(W,\F)$ (not necessarily a probability measure) such that
1169: \begin{itemize}
1170: \item for each $U \in \F'$, there is some $i \in I$ such that $0 <
1171: \mu_i(U) < \infty$ (note that the measure of a set may, in general, be
1172: $\infty$),
1173: \item if $\mu_i(U) < \infty$ and $j < i$, then $\mu_j(U) = 0$.
1174: \end{itemize}
1175: Note that it follows from these conditions that for each $U \in \F'$,
1176: there is exactly one $i \in I$ such that $0 < \mu_i(U) < \infty$.
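To spell out the uniqueness (a short sketch of the argument): existence
is just the first condition. If there were indices $j < i$ such that
$0 < \mu_j(U) < \infty$ and $0 < \mu_i(U) < \infty$, then, since
$\mu_i(U) < \infty$ and $j < i$, the second condition would give
$\mu_j(U) = 0$, a contradiction.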
1177:
1178:
1179: There is an obvious analogue of the map $\FCP$ mapping dimensionally
1180: ordered systems to cps's. Namely, let $\FDC$ map the dimensionally
1181: ordered system $(W,\F,\F',\{\mu_i: i \in I\})$ to the cps
$(W,\F,\F',\mu)$, where $\mu(V \mid U) = \mu_i(V \mid U)$ and $i$ is the unique
1183: element of $I$ such that $0 < \mu_i(U) < \infty$. R\'{e}nyi shows that
1184: $\FDC$ is a bijection from dimensionally ordered systems to cps's
1185: where the set $\F'$ is closed under finite unions. (Cs\'{a}sz\'{a}r
1186: \citeyear{Csaszar55} extends this result to cases where the set $\F'$ is
1187: not necessarily closed under finite unions.)
1188: R\'{e}nyi assumes that all measures involved are countably additive and
1189: that $\F$ is a $\sigma$-algebra, but these are inessential assumptions.
1190: That is, his proof goes through without change if $\F$ is an algebra and
1191: the measures are additive; all that happens is that the resulting
1192: conditional probability measure is additive rather than
1193: $\sigma$-additive.
1194:
1195: It is critical in R\'{e}nyi's framework that the $\mu_i$'s are arbitrary
1196: measures, and not just probability measures. His result does not hold
1197: if the $\mu_i$'s are required to be probability measures. In the case of
1198: finitely additive measures, the Popper space constructed in
1199: Example~\ref{counter1} already shows why. It corresponds to a
dimensionally ordered system $(\mu_1,\mu_2)$ where $\mu_1(U)$ is 1 if $U$ is
cofinite and 0 if $U$ is finite and $\mu_2(U) = |U|$ (\ie
the measure of a set is its cardinality). It cannot be captured by a
dimensionally ordered system where all the elements are probability
1204: measures, for the same reason that it is not the image of an SLPS under
1205: $\FCP$. (R\'{e}nyi \citeyear{Renyi56} actually provides a
1206: general characterization of when the $\mu_i$'s can be taken to be
1207: (countably additive) probability measures.)
1208: %joe4: removed paragraph break
1209: Another example is provided by the Popper space considered in
1210: Example~\ref{counter2}. This corresponds to the dimensionally ordered
1211: system $\{\mu_\beta: \beta \in \IN \union \{\infty\}\}$, where
1212: $$
1213: \mu_n(U) =
1214: \left \{ \begin{array}{ll}
1215: 0 &\mbox{if $\max(U) < n$}\\
1216: 1 &\mbox{if $\max(U) = n$}\\
1217: \infty &\mbox{if $\max(U) > n$},
1218: \end{array} \right.
1219: $$
1220: where $\max(U)$ is taken to be $\infty$ if $U$ is cofinite.
1221:
1222: %joe4
1223: Krauss \citeyear{Kr68} restricts to Popper algebras of the form $\F
1224: %joe9
1225: %\times \F'-\{\emptyset\}$; this allows him to simplify and generalize
1226: \times (\F-\{\emptyset\})$; this allows him to simplify and generalize
1227: R\'{e}nyi's analysis. Interestingly, he also proves a representation
1228: theorem in the spirit of R\'{e}nyi's that involves nonstandard
1229: probability.
1230:
1231: Van Fraassen \citeyear{vF76} proves a
1232: result whose assumptions are somewhat closer to Theorem~\ref{infiso}.
1233: Van Fraassen considers what he calls {\em ordinal families of
1234: probability measures}. An ordinal family over $(W,\F)$ is a sequence of
1235: the form $\{(W_\beta,\F_\beta,\mu_\beta): \beta < \alpha\}$ such that
1236: \begin{itemize}
1237: \item $\union_{\beta < \alpha} W_\beta = W$;
1238: \item $\F_\beta$ is an algebra over $W_\beta$;
1239: \item $\mu_\beta$ is a probability measure with domain $\F_\beta$;
1240: \item $\union_{\beta < \alpha} \F_\beta = \F$;
1241: \item if $U \in \F$ and $V \in \F_\beta$, then $U \inter V \in \F_\beta$;
1242: \item if $U \in \F$, $U \inter V \in \F_\beta$, and $\mu_\beta(U \inter V) >
1243: 0$, then there exists $\gamma$ such that $U \in \F_\gamma$ and
1244: $\mu_\gamma(U) > 0$.
1245: \end{itemize}
1246:
1247: Given an ordinal family $\{(W_\beta,\F_\beta,\mu_\beta): \beta < \alpha\}$ over
1248: $(W,\F)$, consider the map $\FOP$ which associates with it the cps
1249: $(W,\F,\F',\mu)$, where $\F' = \{U \in \F:
1250: \mu_\gamma(U) > 0 \mbox{ for some } \gamma < \alpha\}$ and $\mu(V \mid U) =
1251: \mu_\beta(V \mid U)$, where $\beta$ is the smallest ordinal such that $U \in
1252: \F_\beta$ and $\mu_\beta(U) > 0$.
1253: Van Fraassen shows that $\FOP$ is a bijection from ordinal families
1254: over $(W,\F)$ to Popper spaces over $(W,\F)$. Again, for van Fraassen,
1255: countable additivity does not play a significant role. If $\F$ is a
1256: $\sigma$-algebra, a {\em countably additive\/} ordinal family over
1257: $(W,\F)$ is defined just as an ordinal family, except that now
1258: $\F_\beta$ is a $\sigma$-algebra over $W_\beta$ for all
$\beta < \alpha$, each $\mu_\beta$ is a countably additive probability
1260: measure, and $\F$ is
1261: the least $\sigma$-algebra containing $\union_{\beta <
1262: \alpha} \F_\beta$ (since $\union_{\beta < \alpha} \F_\beta$ is not in
1263: general a $\sigma$-algebra).
1264: The same map
1265: $\FOP$ is also a bijection from countably additive ordinal families
1266: to countably additive Popper spaces.
1267:
1268: Spohn's result, Theorem~\ref{infiso}, can be viewed as a
1269: strengthening of van Fraassen's result in the countably additive case,
1270: since for Theorem~\ref{infiso} all the $\F_\beta$'s are
1271: required to be identical. This is a nontrivial requirement. The fact
1272: that it cannot be met in the case that $W$ is infinite and the measures
1273: are not countably additive is an indication of this.
1274:
1275: It is worth seeing how van Fraassen's approach handles the finitely
1276: additive examples which do not correspond to SLPS's.
1277: The Popper space in Example~\ref{counter1} corresponds to the ordinal
1278: family $\{(W_n,\F_n,\mu_n): n \le \omega\}$ where, for $n < \omega$,
1279: $W_n = \{1, \ldots, n\}$, $\F_n$ consists of all subsets of $W_n$, and
1280: $\mu_n$ is the uniform measure, while $W_\omega = \IN$, $\F_\omega$
1281: consists of the finite and cofinite subsets of $\IN$, and $\mu_\omega(U)$ is 1
1282: if $U$ is cofinite and 0 if $U$ is finite. It is easy to check that
1283: this ordinal family has the desired properties.
1284: The Popper space in
1285: Example~\ref{counter2} is represented in a similar way, using the
1286: ordinal family $\{(W_n,\F_n,\mu_n'): n \le \omega\}$, where $\mu_n'(U)$
1287: is 1 if $n \in U$ and 0 otherwise, while $\mu_\omega' = \mu_\omega$. I
1288: leave it to the reader to see that this family has the desired
1289: properties.
1290: The key point to observe here is the leverage obtained by
1291: allowing each probability measure to have a different domain.
1292:
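To illustrate the map $\FOP$ on these two families, take $U = \{2,5\}$
and $V = \{2\}$. (This is only a sanity check on the definitions above;
it assumes that $\IN$ starts at 1, as the choice $W_n = \{1, \ldots, n\}$
suggests.) The smallest $n$ such that $U \in \F_n$ is $n = 5$, and both
$\mu_5(U) = 2/5$ and $\mu_5'(U) = 1$ are positive, so
$$
\mu_5(V \mid U) = \frac{\mu_5(V \inter U)}{\mu_5(U)} = \frac{1/5}{2/5} =
\frac{1}{2},
\qquad
\mu_5'(V \mid U) = \frac{\mu_5'(V \inter U)}{\mu_5'(U)} = \frac{0}{1} = 0.
$$
Thus the first family conditions uniformly on finite sets, while the
second puts all the weight on the largest element of $U$. For a
cofinite set $U$, no $\F_n$ with $n < \omega$ contains $U$, so
$\mu_\omega$ is used in both cases.
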
1293:
1294:
1295: \section{Relating LPS's to NPS's}\label{LPSNPS}
1296:
1297: In this section, I show that LPS's and NPS's are
1298: isomorphic in a strong sense.
1299: Again, I separate the results for the finite case and the infinite case.
1300:
1301: \subsection{The finite case}
1302: Consider an LPS of the form $(\mu_1, \mu_2,\mu_3)$. Roughly speaking,
1303: the corresponding NPS should be $(1 - \epsilon - \epsilon^2) \mu_1 +
1304: \epsilon \mu_2 + \epsilon^2 \mu_3$, where $\epsilon$ is some
1305: infinitesimal. That means that $\mu_2$ gets infinitesimal weight
1306: relative to $\mu_1$ and $\mu_3$ gets infinitesimal weight relative to
1307: $\mu_2$. But which infinitesimal $\epsilon$ should be
1308: chosen? Intuitively, it shouldn't matter. No matter which
1309: infinitesimal is chosen, the resulting NPS should be equivalent to the
1310: %joe2
1311: %original LPS. How can we make this intuition precise?
1312: %joe6
1313: %original LPS. I now make this intuition precise?
1314: original LPS. I now make this intuition precise.
1315:
1316:
1317: Suppose that we want to use an LPS or an NPS to compute which of two
1318: bounded, {\em real-valued\/} random variables has higher expected value.%
1319: %{\em real-valued\/} random variables with finite range has higher
1320: %expected value.
1321: %joe2
1322: The intended
1323: application here is decision making, where the random variables can be
1324: thought of as the utilities corresponding to two actions; the one with
1325: higher expected utility is preferred.
1326: The idea is that two measures of
1327: uncertainty (each of which can be an LPS or an NPS) are equivalent if
the preference order they place on (real-valued) random variables
1329: (according to their expected value) is the same.
1330: %joe6
1331: I consider only random variables with countable range. This restriction
1332: both makes the
1333: exposition simpler and avoids having to define, for example, integration
1334: with respect to an NPS. Note that, given an LPS $\vecmu$, the
1335: expected value of a random variable $X$ is $\sum_x x \vecmu(X=x)$, where
1336: $\vecmu(X=x)$ is a sequence of probability values and the multiplication
1337: and addition are pointwise. Thus, the expected value is a sequence;
1338: these sequences can be compared using the lexicographic order $<_L$
1339: defined in Section~\ref{LPSdef}. If $\nu$ is either an LPS or NPS,
1340: then let $E_\nu(X)$ denote the expected value of random variable $X$
1341: according to $\nu$.
1342:
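As a small worked instance of these definitions (with all names below
chosen only for illustration), let $W = \{w_1, w_2\}$, let $\F$ consist
of all subsets of $W$, and let $\vecmu = (\mu_0,\mu_1)$, where
$\mu_0(w_1) = 1$ and $\mu_1(w_2) = 1$. Let $X(w_1) = 0$, $X(w_2) = 1$,
and let $Y$ be identically 0. Then
$$
E_{\vecmu}(X) = 1 \cdot (\mu_0(w_2),\mu_1(w_2)) + 0 \cdot
(\mu_0(w_1),\mu_1(w_1)) = (0,1),
\qquad
E_{\vecmu}(Y) = (0,0),
$$
so $E_{\vecmu}(Y) <_L E_{\vecmu}(X)$. For the nonstandard measure
$\nu = (1-\epsilon)\mu_0 + \epsilon\mu_1$, we get
$E_\nu(X) = \epsilon > 0 = E_\nu(Y)$, so $\nu$ orders $X$ and $Y$ the
same way, whichever infinitesimal $\epsilon$ is used.
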
1343: \dfn\label{aeq} If each of $\nu_1$ and $\nu_2$ is either an NPS over
1344: $(W,\F)$ or an
1345: LPS over $(W,\F)$, then $\nu_1$ is {\em equivalent to\/} $\nu_2$,
1346: denoted $\nu_1 \aeq \nu_2$, if, for all real-valued random variables $X$
1347: and $Y$
1348: measurable with respect to $\F$, $E_{\nu_1}(X) \le E_{\nu_1}(Y)$ iff
1349: $E_{\nu_2}(X) \le E_{\nu_2}(Y)$.
1350: %joe2
1351: %(As usual, $X$ is said to be measurable with respect to $\F$ if
1352: (If $X$ has countable range, which is the only case I consider here, then
$X$ is measurable with respect to $\F$ iff $\{w: X(w) = x\} \in \F$ for
1354: all $x$ in the range of $X$.)%
1355: %joe3
1356: \footnote{As pointed out by Adam Brandenburger and Eddie Dekel,
1357: this notion of equivalence is essentially the same as one
1358: implicitly used by BBD. They work with preference orders on
1359: Anscombe-Aumann acts \cite{AA63}, that is, functions from states to
1360: probability measures on prizes. Fix a utility function $u$ on prizes. Then
1361: take $\nu_1 \sim_u \nu_2$ if the preference order on acts generated by
1362: $\nu_1$ and $u$ is the same as that generated by $\nu_2$ and $u$.
1363: It is not hard to show that this notion of equivalence is independent of
1364: the choice of utility function; if $u$ and $u'$ are two utility
1365: functions on prizes, then $\nu_1 \sim_u \nu_2$ iff $\nu_1 \sim_{u'}
1366: \nu_2$. Moreover, $\nu_1 \sim_u \nu_2$ iff $\nu_1 \aeq \nu_2$.
1367: The advantage of the notion of equivalence used here is that it is
1368: defined without the overhead of preference orders on acts.}
1369: %if $X^{-1}(A) \in \F$ for all Borel sets $A$.
1370: \edfn
1371:
1372: %joe: make this precise (changed ``lemma'' to ``proposition'' already)
1373: %In a precise sense, this notion of equivalence is stronger than that
1374: %provided by the map $\FCP$ of Section~\ref{FCP}, as the following
1375: %proposition shows.
1376: This notion of equivalence satisfies analogues of the two key
1377: properties of the map $\FCP$ considered at the beginning of Section~\ref{FCP}.
1378: \pro\label{FCPaeq}
1379: If $\nu \in \NPS(W,\F)$, $\vecmu \in \LPS(W,\F)$, and
$\nu \aeq \vecmu$, then $\nu(U) > 0$ iff $\vecmu(U) > \vec{0}$.
1381: Moreover, if $\nu(U) > 0$, then $\stand{\nu(V \mid U)} = \mu_j(V \mid U)$, where
1382: $\mu_j$ is the first probability measure in $\vecmu$ such that $\mu_j(U)
1383: > 0$. \epro
1384:
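For a concrete check of Proposition~\ref{FCPaeq}, consider the following
sketch; the particular measures are hypothetical, and the equivalence
$\nu \aeq \vecmu$ is taken from the observation made in the proof of
Theorem~\ref{lpsnps}. Let $W = \{w_1,w_2,w_3\}$ with all sets
measurable, and let $\vecmu = (\mu_0,\mu_1,\mu_2)$, where
$\mu_0(w_1) = 1$, $\mu_1(w_1) = \mu_1(w_2) = 1/2$, and $\mu_2$ is
uniform. Take $\nu = (1 - \epsilon - \epsilon^2)\mu_0 + \epsilon \mu_1 +
\epsilon^2 \mu_2$ and $U = \{w_2,w_3\}$. Then $\mu_0(U) = 0$ and
$\mu_1(U) = 1/2$, so $j = 1$, and
$$
\nu(w_2 \mid U) = \frac{\epsilon/2 + \epsilon^2/3}{\epsilon/2 +
2\epsilon^2/3} = \frac{1/2 + \epsilon/3}{1/2 + 2\epsilon/3},
$$
whose standard part is $1 = \mu_1(w_2 \mid U)$, as the proposition requires.
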
1385:
1386: As the next result shows,
1387: %joe9
1388: %for structured LPS's, the $\aeq$-equivalence classes
1389: for SLPS's, the $\aeq$-equivalence classes
1390: %joe9
1391: %are singletons. (This is not true for LPS's in general.
1392: are singletons, even if the set of worlds is infinite. (This is not true
1393: for LPS's in general.
1394: For example, $(\mu,\mu) \aeq (\mu)$.) This can be viewed as providing
1395: more motivation for the use of SLPS's.
1396:
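The equivalence $(\mu,\mu) \aeq (\mu)$ can be verified directly from
Definition~\ref{aeq}. For random variables $X$ and $Y$, write
$a = E_\mu(X)$ and $b = E_\mu(Y)$. Then
$$
E_{(\mu,\mu)}(X) = (a,a) \quad \mbox{and} \quad E_{(\mu,\mu)}(Y) = (b,b),
$$
and $(a,a) <_L (b,b)$ iff $a < b$, while $(a,a) = (b,b)$ iff $a = b$.
The second copy of $\mu$ can only repeat the comparison already made by
the first, so it never changes the order.
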
1397:
1398: \pro\label{motivation} If $\vecmu, \vecmu' \in \SLPS(W,\F)$, then
1399: $\vecmu \aeq \vecmu'$
1400: iff $\vecmu = \vecmu'$.
1401: \epro
1402:
1403:
1404:
1405: The next result justifies restricting to finite LPS's if the state
1406: space is finite.
1407: Given an algebra $\F$, let $\Bas(\F)$ consist of the
{\em basic sets\/} in $\F$, that is, the nonempty sets in $\F$ that themselves
contain no nonempty proper subsets in $\F$. Clearly the sets in $\Bas(\F)$ are
1410: disjoint, so that $|\Bas(\F)| \le |W|$. If all sets are measurable, then
1411: $\Bas(\F)$ consists of the singleton subsets of $W$. If $W$ is finite,
1412: it is easy to see that all sets in $\F$ are finite unions of the sets in
1413: $\Bas(\F)$.
1414:
1415: \pro\label{finiteeq} If $W$ is finite, then every LPS over $(W,\F)$ is
1416: equivalent to an LPS of length at most $|\Bas(\F)|$. \epro
1417:
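For instance (a hypothetical example, not taken from the development
above), let $W = \{1,2,3,4\}$ and $\F = \{\emptyset, \{1,2\}, \{3,4\},
W\}$. Then
$$
\Bas(\F) = \{\{1,2\},\{3,4\}\}, \qquad |\Bas(\F)| = 2 \le 4 = |W|,
$$
and every set in $\F$ is a union of sets in $\Bas(\F)$ (the empty set
being the empty union). Proposition~\ref{finiteeq} then says that every
LPS over $(W,\F)$ is equivalent to one of length at most 2, however long
the original LPS is.
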
1418:
1419: %joe2
1420: %I can now define the isomorphism that relates NPS's
1421: I can now define the bijection that relates NPS's
1422: and LPS's. Given $(W,\F)$, let $\LPS(W,\F)/\naeq$ be the equivalence
1423: classes of $\aeq$-equivalent LPS's over $(W,\F)$; similarly, let
1424: $\NPS(W,\F)/\naeq$ be the equivalence classes of $\aeq$-equivalent NPS's
1425: over $(W,\F)$. Note that in $\NPS(W,\F)/\naeq$, it is possible that
1426: different nonstandard probability measures could have different ranges.
1427: For this section, without loss of generality, I could also fix the range
1428: of all NPS's to be
1429: %joe4
1430: %fixed nonstandard model
1431: the nonstandard model
1432: $\IR(\epsilon)$ discussed in Section~\ref{NPSdef}. However, in the infinite
1433: case, it is not possible to restrict to a single nonstandard model, so I
1434: do not do so here either, for uniformity.
1435:
1436: Now define the mapping $\FLN$ from $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$
1437: pretty much as suggested at the beginning of this subsection:
1438: If $[\vecmu]$ is an equivalence class of LPS's, then choose a
1439: representative $\vecmu' \in [\vecmu]$ with finite length.
1440: Fix an infinitesimal $\epsilon$.
1441: Suppose that
1442: $\vecmu' = (\mu_0, \ldots, \mu_k)$.
1443: Let $\FLN([\vecmu]) = [(1 - \epsilon -
1444: \cdots - \epsilon^{k}) \mu_0 + \epsilon \mu_1 + \cdots + \epsilon^k \mu_k]$.
1445:
1446: \thm\label{lpsnps} If $W$ is finite, then
1447: $\FLN$ is a bijection from $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$
1448: that preserves equivalence (that is, each NPS in $\FLN([\vecmu])$ is
1449: equivalent to $\vecmu$).
1450: \ethm
1451:
1452: %joe9*: added
1453: \prf It is easy to check that if $\vecmu = (\mu_0, \ldots, \mu_k)$, then
1454: $\vecmu \aeq (1 - \epsilon -
1455: \cdots - \epsilon^{k}) \mu_0 + \epsilon \mu_1 + \cdots + \epsilon^k
1456: \mu_k$ (see Lemma~\ref{aeqchar} in the appendix for a formal proof).
1457: It follows that $\FLN$ is an injection from
1458: $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$. To show that $\FLN$ is a
1459: surjection, we must essentially construct an inverse map; that is, given
1460: an NPS $(W,\F,\nu)$ where $W$ is finite, we must find an LPS $\vecmu$
1461: such that $\vecmu \aeq \nu$. The idea is to find
1462: a finite collection $\mu_0, \ldots, \mu_k$ of (standard)
1463: probability measures, where $k \le |W|$, and nonnegative nonstandard reals
1464: $\epsilon_0, \ldots, \epsilon_k$ such that
1465: $\stand{\epsilon_{i+1}/\epsilon_i} = 0$ and $\nu = \epsilon_0 \mu_0 +
1466: \cdots + \epsilon_k\mu_k$. A straightforward argument then shows that
1467: $\nu \aeq \vecmu$ and $\FLN([\vecmu]) = [\nu]$. I leave details to
1468: the appendix. \eprf
1469:
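To see what such a decomposition looks like in a simple case, consider
the following sketch (the example is illustrative, and the appendix may
organize the construction differently). Let $W = \{w_1,w_2\}$ with all
sets measurable, and let $\nu(w_1) = 1/2 + \epsilon$ and $\nu(w_2) = 1/2
- \epsilon$. Taking $\mu_0$ to be the uniform measure, $\mu_1$ to be the
measure that assigns probability 1 to $w_1$, $\epsilon_0 = 1 -
2\epsilon$, and $\epsilon_1 = 2\epsilon$ gives
$$
\begin{array}{rcl}
\epsilon_0\mu_0(w_1) + \epsilon_1\mu_1(w_1) & = & (1-2\epsilon)/2 +
2\epsilon = 1/2 + \epsilon,\\
\epsilon_0\mu_0(w_2) + \epsilon_1\mu_1(w_2) & = & (1-2\epsilon)/2 + 0 =
1/2 - \epsilon,
\end{array}
$$
so $\nu = \epsilon_0\mu_0 + \epsilon_1\mu_1$, and
$\stand{\epsilon_1/\epsilon_0} = \stand{2\epsilon/(1-2\epsilon)} = 0$.
The corresponding LPS is $(\mu_0,\mu_1)$: it first compares expectations
under the uniform measure and breaks ties in favor of the random
variable with the larger value at $w_1$.
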
1470:
1471: BBD \citeyear{BBD1} also relate nonstandard probability measures and
1472: LPS's under the assumption that the state space is finite,
1473: %joe6
1474: %However, the way they relate them is somewhat different in spirit from
1475: %is different in spirit from the notion of equivalence introduced here.
1476: but there are some significant technical differences between the way
1477: they relate them and the approach taken here.
1478: BBD prove representation theorems
1479: %joe4
1480: %essentially showing that a preference orders on lotteries
1481: essentially showing that a preference order on lotteries
1482: can be represented by a standard utility function on lotteries and an
1483: LPS iff it
1484: can be represented by a standard utility function on lotteries and an NPS.
1485: Thus, they show that NPS's and LPS's are equiexpressive in terms of
1486: representing preference orders on lotteries.
1487: The difference between
1488: BBD's result and Theorem~\ref{lpsnps} is essentially a matter of
1489: quantification. BBD's result can be viewed as showing that, given an
1490: LPS, for each utility function on lotteries, there is an NPS that
1491: generates the same preference order on lotteries for that particular
1492: utility function. In principle, the NPS might depend on the utility
1493: function. More precisely, for a fixed LPS $\vecmu$, all
1494: that follows from their result is that for each utility function $u$, there
1495: is an NPS $\nu$ such that $(\vecmu,u)$ and $(\nu,u)$ generate the same
1496: preference order on lotteries. Theorem~\ref{lpsnps} says that, given
1497: $\vecmu$, there is an NPS $\nu$ such that $(\vecmu,u)$ and $(\nu,u)$
generate the same preference order on lotteries for {\em all\/} utility
1499: functions $u$.
1500:
1501: \subsection{The infinite case}
1502:
1503:
1504: An LPS over an infinite state space $W$ may not be equivalent to any
1505: finite LPS. However, ideas analogous to those used to prove
1506: Proposition~\ref{finiteeq} can be used to provide a bound on the length
1507: of the minimal-length LPS's in an equivalence class.
1508:
1509: \pro\label{infiniteeq} Every LPS over $(W,\F)$ is
1510: equivalent to an LPS over $(W,\F)$ of length at most $|\F|$. \epro
1511:
1512: The first step in relating LPS's to NPS's is to show that, just as in
1513: the finite case, for every LPS $(\mu_\beta: \beta < \alpha)$ of length
1514: $\alpha$, there is an equivalent NPS $\nu$. The idea will
1515: be to
1516: %joe9
1517: set
$\nu = (1 - \sum_{0 < \beta < \alpha} \epsilon^{n_\beta})\mu_0 +
\sum_{0 < \beta < \alpha} \epsilon^{n_\beta} \mu_\beta$. In the finite
1520: case, we could take $n_\beta = \beta$. This worked because
1521: each $\beta$ was finite, and the field $\IR(\epsilon)$ includes
1522: $\epsilon^j$ for each integer $j$. But now, since $\alpha$ may be
1523: greater than $\omega$, we cannot just take $n_\beta = \beta$.
1524: To get this idea to work in the infinite setting, I consider a
1525: \emph{nonstandard} model of the integers, which includes an ``integer''
corresponding to each ordinal less than $\alpha$. I then
1527: %joe9
1528: %construct a field that include $\epsilon^{n_\alpha}$ even for
construct a field that includes $\epsilon^{n_\beta}$ even for
these nonstandard integers $n_\beta$.
1531:
1532: A {\em nonstandard model of the integers\/} is a
1533: model that contains the integers and satisfies every property of the
1534: integers expressible in first-order logic.
1535: It follows easily from the compactness theorem of first-order
1536: logic \cite{Enderton} that, given an ordinal $\alpha$, there exists a
1537: nonstandard model
1538: %joe6
1539: $I^\alpha$
of the integers that includes elements
1541: $n_\beta$, $\beta <
1542: \alpha$, such that $n_j = j$ for $j <\omega$ and $n_\beta < n_{\beta'}$
1543: if $\beta < \beta'$. (Note that since $I^\alpha$ satisfies all the
1544: properties of the integers, it follows that if $n_\beta < n_{\beta'}$,
1545: then $n_{\beta'} - n_\beta \ge 1$, a fact that will be useful later.)
1546: The compactness theorem says that, given a collection of
1547: formulas, if each finite subset has a model, then so does the whole set.
Consider a language with a function $+$, a relation $<$, and constant symbols for each
1549: integer, together with constants ${\bf n}_\beta$, $\beta < \alpha$.
1550: Consider the collection of first-order formulas in this language
1551: consisting of all the formulas true of the integers, together with the
1552: formulas ${\bf n}_i = i$ for $i < \omega$ and ${\bf n}_\beta < {\bf
1553: n}_{\beta'}$, for all $\beta < \beta' < \alpha$.
1554: Clearly any finite subset of this set has a model---namely, the
1555: integers. Thus, by compactness, so does the full set. Thus, for each
1556: ordinal $\alpha$, there is a model $I^\alpha$
1557: with the required properties.
1558:
1559:
1560: %There are two issues that must be dealt with in order to get this to
1561: %work. First, we must ensure that there is a non-Archimedean field where
1562: %there are infinitesimals $\epsilon_\beta$, $\beta < \alpha$, such that
1563: %$\stand{\epsilon_{\beta'}/\epsilon_{\beta}} = 0$ if $\beta < \beta' <
1564: %\alpha$. Note, for example, that
1565: %this cannot be done in $\IR(\epsilon)$ if $\alpha > \omega$.
1566: %Another problem is making sense of the infinite sum. Fields are closed
1567: %under finite sums; in general, infinite sums may not be defined.
1568:
1569: Given $\alpha$, I now construct a field $\IR(I^\alpha)$
1570: that includes $\epsilon^n$ for each ``integer'' $n \in I^\alpha$.
1571: %These fields are all similar in spirit to $\IR(\epsilon)$. To
1572: To explain the construction,
1573: it is best to first consider $\IR(\epsilon)$ in a
1574: little more detail. Since $\IR(\epsilon)$ is a field, once it includes
1575: $\epsilon$, it must include $p(\epsilon)$,
where $p$ is a polynomial with real coefficients. To ensure that every
nonzero element of $\IR(\epsilon)$ has an inverse, we need not just
finite polynomials in $\epsilon$, but \emph{infinite} polynomials in
$\epsilon$. The inverse of a polynomial in $\epsilon$ can then be
computed using standard ``formal'' division of polynomials.
Moreover, the leading exponent of the polynomial can be negative.
1582: Thus, the inverse of $\epsilon^3$ is, not surprisingly, $\epsilon^{-3}$;
1583: the inverse of $1-\epsilon$ is $1 + \epsilon + \epsilon^2 + \ldots$.
1584:
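The two inverses just mentioned can be checked by direct multiplication
(a routine verification, included only for illustration):
$$
(1-\epsilon)(1 + \epsilon + \epsilon^2 + \cdots) =
(1 + \epsilon + \epsilon^2 + \cdots) - (\epsilon + \epsilon^2 +
\epsilon^3 + \cdots) = 1.
$$
Combining this with $\epsilon^{3}\epsilon^{-3} = 1$ shows that the
inverse of $\epsilon^3 - \epsilon^4 = \epsilon^3(1-\epsilon)$ is
$\epsilon^{-3} + \epsilon^{-2} + \epsilon^{-1} + 1 + \epsilon + \cdots$,
a series whose smallest exponent is negative.
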
1585: The field $\IR(I^\alpha)$ also includes polynomials in $\epsilon$, but
1586: now the exponents are not just integers, but elements of
1587: $I^\alpha$. Since a field is closed under multiplication, if it
1588: contains $\epsilon^{n_1}$ and $\epsilon^{n_2}$, it must also include
1589: their product. Since $I^\alpha$ satisfies all the properties of the
1590: integers, if it includes $n_1$ and $n_2$, it also includes an element
1591: $n_1 + n_2$, and we can take $\epsilon^{n_1} \times \epsilon^{n_2} =
1592: \epsilon^{n_1 + n_2}$. Formally, let $\IR(I^\alpha)$ be the
1593: non-Archimedean model defined as follows:
1594: $\IR(I^\alpha)$ consists of all polynomials of the form
1595: $\sum_{n \in J} r_n \epsilon^{n}$, where $r_n$ is a standard real,
1596: $\epsilon$ is an infinitesimal, and $J$ is a \emph{well-founded} subset
1597: of $I^\alpha$. (Recall that a set is well founded if it has no
1598: infinite descending sequence; thus, the set of integers is not well
founded, since $\cdots < -3 < -2 < -1$ is an infinite descending
1600: sequence. The reason I require well foundedness will be clear shortly.)
1601: We can identify the standard real $r$ with the polynomial
1602: $r \epsilon^0$.
1603:
1604: The polynomials in $\IR(I^\alpha)$ can be added and
1605: multiplied using the standard rules for addition and multiplication of
1606: polynomials.
1607: It is easy to check that
1608: %joe3
1609: %since $\alpha$ is a limit ordinal,
1610: the result of adding or multiplying two
1611: polynomials is another polynomial in $\IR(I^\alpha)$. In particular, if
1612: $p_1$ and $p_2$ are
1613: %joe4:
1614: %two polynomials, $N_1$ is the set of coefficients of $p_1$, and
1615: %$N_2$ is the set of coefficients of $p_2$, then the
1616: %coefficients of $p_1 + p_2$ lie in $N_1 \union N_2$, while the
1617: %coefficients of $p_1p_2$ lie in the set $N_3 = \{n_1 + n_2:
1618: two polynomials, $N_1$ is the set of exponents of $p_1$, and
1619: $N_2$ is the set of exponents of $p_2$, then the
1620: exponents of $p_1 + p_2$ lie in $N_1 \union N_2$, while the
1621: exponents of $p_1p_2$ lie in the set $N_3 = \{n_1 + n_2:
1622: %joe4: typo
1623: %n \in N_1, n_2 \in N_2\}$. Both $N_1 \union N_2$ and $N_3$ are easily
1624: n_1 \in N_1, n_2 \in N_2\}$. Both $N_1 \union N_2$ and $N_3$ are easily
1625: seen to be well founded if $N_1$ and $N_2$ are. Moreover, for each
1626: expression $n_1 + n_2 \in N_3$, it
1627: follows from the well-foundedness of $N_1$ and $N_2$ that there are only
1628: finitely many pairs $(n,n') \in N_1 \times N_2$ such that $n+n' = n_1 +
1629: n_2$,
1630: %joe6
1631: so the coefficient of $\epsilon^{n_1 + n_2}$ in $p_1p_2$ is well defined.
1632: Finally, each polynomial (other than 0) has an
1633: inverse that can be computed using standard ``formal'' division of
1634: polynomials; I leave the details to the reader.
1635: %joe6:
1636: This step is where the well foundedness comes in. The formal division
process cannot be applied to a polynomial whose set of exponents is not
well founded, such as $\cdots + \epsilon^{-3} + \epsilon^{-2} +
1639: \epsilon^{-1}$. An element
of $\IR(I^\alpha)$ is {\em positive\/} if its leading coefficient (the
coefficient of its smallest exponent) is
positive. Define an order $\le$ on $\IR(I^\alpha)$ by taking $a \le b$ if
$b-a$ is positive or $a = b$.
1643: With these definitions, $\IR(I^\alpha)$ is a non-Archimedean field.
1644: %Moreover, $\stand{\epsilon^{n_2}/\epsilon^{n_1}} = 0$ if $n_1 < n_2$.
1645:
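As an illustration of this order (assuming $\alpha > \omega$, so that
$n_\omega$ exists), note that $n_k = k < n_\omega$ for every standard
integer $k \ge 0$. Hence the leading term of $\epsilon^{k} -
\epsilon^{n_\omega}$ is $\epsilon^k$, with coefficient 1, so
$$
0 < \epsilon^{n_\omega} < \epsilon^{k} \quad \mbox{for all $k < \omega$}.
$$
Thus $\epsilon^{n_\omega}$ is positive but smaller than every finite
power of $\epsilon$.
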
1646: Given $(W,\F)$, let $\alpha$ be
1647: the minimal ordinal whose cardinality is greater than
1648: %joe3
1649: or equal to
1650: $|\F|$.
1651: %Let $I^*_{(W,\F)}$ be a nonstandard model of the
1652: %integers such that there exist
1653: %elements $n_\beta$ in $I^*_{(W,\F)}$ for all $\beta < \alpha$ such
1654: By construction, $I^\alpha$ has elements $n_\beta$ for all $\beta <
1655: \alpha$ such that
1656: $n_i = i$ for $i < \omega$ and $n_\beta < n_{\beta'}$ if $\beta <
1657: \beta' < \alpha$.
1658: I now define a map $\FLN$ from
1659: $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$ just as suggested earlier.
In more detail, given an equivalence class $[\vecmu] \in \LPS(W,\F)/\naeq$, by
1661: Proposition~\ref{infiniteeq}, there exists $\vecmu' \in
1662: [\vecmu]$ such that $\vecmu'$ has length $\alpha' \le \alpha$.
1663: %joe9
1664: %Let $\nu = (1 - \sum_{0 < \beta < \alpha} \epsilon^{n_\beta}) +
Let $\nu = (1 - \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta})\mu_0' +
\sum_{0 < \beta < \alpha'} \epsilon^{n_\beta} \mu_\beta'$.
By definition, $\sum_{0 < \beta < \alpha'} \epsilon^{n_\beta} \in
\IR(I^\alpha)$ (the set
of exponents is well ordered since the ordinals are well ordered), hence
so is $(1 - \sum_{0 < \beta < \alpha'} \epsilon^{n_\beta})$.
The elements $\epsilon^{n_\beta}$ for $\beta < \alpha$ are also all
in $\IR(I^\alpha)$. It easily follows that $\nu$ is a nonstandard
probability measure over the field $\IR(I^\alpha)$. As observed
earlier, if $\beta' < \beta$, then $n_\beta - n_{\beta'} \ge 1$, so
$\epsilon^{n_\beta}$ is infinitesimally smaller than
$\epsilon^{n_{\beta'}}$. Arguments essentially identical to those of
1677: Lemma~\ref{aeqchar} in the appendix can be
1678: used to show that $\nu \aeq \vecmu'$.
Define $\FLN([\vecmu]) = [\nu]$.
1680: %joetark
1681: The following result is immediate.
1682:
1683: \thm\label{injection} $\FLN$ is an injection from $\LPS(W,\F)/\naeq$ to
1684: $\NPS(W,\F)/\naeq$ that preserves equivalence. \ethm
1685:
1686: What about the converse? Is it the case that for every NPS there is an
1687: equivalent LPS?
1688: %joe9
1689: The technique for finding an equivalent LPS used in the finite case
1690: fails. There is no obvious way to find a well-ordered sequence
1691: of standard probability measures $\mu_0, \mu_1, \ldots$ and a sequence
1692: of nonnegative nonstandard reals $\epsilon_0, \epsilon_1, \ldots$ such
1693: that $\stand{\epsilon_{\beta+1}/\epsilon_\beta} = 0$ and
1694: $\nu = \epsilon_0 \mu_0 + \epsilon_1 \mu_1 + \cdots$. As the following
1695: example shows, this is not an accident.
1696: %joe9
1697: %As the following example shows, the answer is no.
There exist NPS's that are not equivalent to any LPS.
1699:
1700: \xam\label{counter3} As in Example~\ref{counter1}, let $W = \IN$, the
1701: natural numbers, let $\F$ consist of the finite and cofinite subsets of
1702: $\IN$,
1703: and let $\F' = \F - \{\emptyset\}$. Let $\nu^1$ be an NPS with
1704: range $\IR(\epsilon)$, where $\nu^1(U) = |U|\epsilon$ if $U$ is finite and
1705: %joe6
1706: %$\nu(U) = 1 - |\overline{U}|\epsilon$ if $U$ is cofinite
1707: $\nu^1(U) = 1 - |\overline{U}|\epsilon$ if $U$ is cofinite
1708: %joe6
1709: (as usual, $\overline{U}$ denotes the complement of $U$, which in this
1710: case is finite).
1711: This is
1712: clearly an NPS, and it corresponds to the cps $\mu^1$ of
1713: Example~\ref{counter1}, in the sense that $\stand{\nu^1(V \mid U)}
1714: = \mu^1(V \mid U)$ for all $V \in \F$, $U\in \F'$. Just as in
1715: Example~\ref{counter1}, it can be shown that there is
1716: no LPS $\vecmu$ such that $\nu^1 \aeq \vecmu$.
1717:
1718:
1719:
1720: %joe6
1721: To see the potential relevance of this setup, suppose that
1722: %joe9
1723: %there is a lottery with countably many where a natural number can be
1724: %chosen and,
1725: a natural number is chosen at random and,
1726: intuitively, all numbers are equally likely to be chosen. An agent may
1727: place a bet
1728: on the number being in a finite or cofinite set. Intuitively, the agent
1729: should prefer a bet on a set with larger cardinality. More precisely,
1730: if $U_1$ and $U_2$ are two sets in the algebra, the agent should prefer
1731: a bet on $U_1$ over a bet on $U_2$ iff
1732: (a) $U_1$ and $U_2$ are both cofinite and the complement of $U_1$ has
1733: smaller cardinality than that of $U_2$, (b) $U_1$ is cofinite and $U_2$
1734: is finite, or (c) $U_1$ and $U_2$ are both finite, and $U_1$ has larger
1735: cardinality than $U_2$. These preferences on acts or bets
1736: should translate to statements of likelihood.
1737: The NPS captures these preferences directly; they cannot
1738: be captured in an LPS. The cps of Example~\ref{counter1} captures (b)
1739: directly, and (c) indirectly: when conditioning on any finite set that
1740: contains $U_1 \union U_2$, the probability of $U_1$ will be higher than
1741: that of $U_2$.
1742: \exam
1743:
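The preferences described in Example~\ref{counter3} can be read off
$\nu^1$ by direct computation. As a sample (the particular sets are
chosen only for illustration, and $\IN$ is assumed to start at 1):
$$
\begin{array}{ll}
\nu^1(\{1,2,3\}) = 3\epsilon > 2\epsilon = \nu^1(\{4,5\}) &\mbox{(case
(c))}\\
\nu^1(\IN - \{1\}) = 1 - \epsilon > 1 - 2\epsilon = \nu^1(\IN -
\{1,2\}) &\mbox{(case (a))}\\
\nu^1(\IN - \{1,2\}) = 1 - 2\epsilon > 3\epsilon = \nu^1(\{1,2,3\})
&\mbox{(case (b)).}
\end{array}
$$
Conditioning on the finite set $U = \{1,2,3\}$ gives $\nu^1(\{1\} \mid
U) = \epsilon/(3\epsilon) = 1/3$, in agreement with the behavior that
the example attributes to the cps of Example~\ref{counter1} on finite
sets.
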
1744:
1745: \subsection{Countably additive nonstandard probability
1746: measures}\label{countableadditivity}
1747:
1748: Do things get any better if countable additivity is required?
1749: To answer this question, I must first make precise what countable
1750: additivity means in the context of non-Archimedean fields.
1751: To understand the issue here, recall that for the standard real numbers,
1752: every bounded nondecreasing sequence has a unique least upper bound, which
1753: can be taken to be its limit. Given a countable sum each of whose terms
1754: is nonnegative, the partial sums form a nondecreasing sequence.
1755: If the partial sums are bounded (which they are if the terms in the sums
1756: represent the probabilities of a pairwise
1757: disjoint collection of sets), then the limit is well defined.
1758:
1759: None of the above is true in the case of non-Archimedean fields. For a
1760: trivial counterexample,
1761: consider the sequence $\epsilon, 2 \epsilon, 3 \epsilon, \ldots$.
1762: Clearly this sequence is bounded (by any positive real number), but it
1763: does not have a least upper bound. For a more subtle example, consider
1764: the sequence $1/2, 3/4, 7/8, \ldots$ in the field $\IR(\epsilon)$. Should
1765: its limit be 1? While this does not seem to be an unreasonable choice,
1766: note that 1 is not the least upper bound of the sequence. For example,
1767: $1-\epsilon$ is greater than every term in the sequence, and is less
1768: than 1. So are $1-3\epsilon$ and $1 - \epsilon^2$. Indeed, this
1769: sequence has no least upper bound in $\IR(\epsilon)$.
1770:
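Both claims can be verified in a line or two (a sketch, using only the
order on $\IR(\epsilon)$). If $c$ is an upper bound of $\epsilon, 2
\epsilon, 3 \epsilon, \ldots$, then so is $c - \epsilon$, since $c \ge
(n+1)\epsilon$ gives $c - \epsilon \ge n\epsilon$ for all $n$.
Similarly, if $c$ is an upper bound of $1/2, 3/4, 7/8, \ldots$, then,
for all $n$,
$$
c - \epsilon - (1 - 2^{-n}) \ge (1 - 2^{-(n+1)}) - (1 - 2^{-n}) -
\epsilon = 2^{-(n+1)} - \epsilon > 0,
$$
so $c - \epsilon$ is a strictly smaller upper bound. In both cases, no
upper bound can be least.
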
1771: Despite these concerns, I define limits in
1772: $\IR(I^*)$ pointwise. That is,
1773: %the elements of $\IR(I^*)$ are (infinite) polynomials over $\epsilon$
1774: %where the power of $\epsilon$ are in $I^*$. Convergence is taken
1775: %pointwise. That is,
1776: a sequence $a_1, a_2, a_3, \ldots$ in $\IR(I^*)$
1777: converges to $b \in \IR(I^*)$ if, for every $n \in I^*$, the
1778: coefficients of $\epsilon^n$ in $a_1, a_2, a_3, \ldots$ converge to the
1779: coefficient of $\epsilon^n$ in $b$. (Since the coefficients are standard
1780: reals, the notion of convergence for the
1781: coefficients is just the standard definition of convergence in the reals.
1782: Of course, if $\epsilon^n$ does not appear explicitly, its coefficient
1783: is taken to be 0.)
1784: %joe6
1785: Note that here and elsewhere I use the letters $a$ and $b$ (possibly with
1786: subscripts) to denote (standard) reals, and $\epsilon$ to denote an
1787: infinitesimal.
As usual, $\sum_{i=1}^\infty a_i$ is taken to be $b$ if
1789: the sequence of partial sums $\sum_{i=1}^n a_i$ converges to $b$.
1790: Note that, with this notion of convergence, $1/2, 3/4, 7/8, \ldots$
1791: converges to 1 even though 1 is not the least upper bound of the
1792: sequence.%
1793: \footnote{For those used to thinking of convergence in topological
1794: terms, what is going on here is that the topology corresponding to this
1795: notion of convergence is not Hausdorff.}
1796: %joetark
1797: I discuss the consequences of this choice further in
1798: Section~\ref{discussion}.
1799:
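For example (a sketch of how the definition applies), the partial sums
$s_n = 1 - 2^{-n}$ of $1/2 + 1/4 + 1/8 + \cdots$ have coefficient $1 -
2^{-n}$ at $\epsilon^0$ and coefficient 0 at every other power of
$\epsilon$, so
$$
s_n = (1 - 2^{-n})\epsilon^0 \longrightarrow 1 \cdot \epsilon^0 = 1.
$$
Likewise, the sequence $\epsilon, \epsilon^2, \epsilon^3, \ldots$
converges to 0, since for each fixed $m \in I^*$ the coefficient of
$\epsilon^m$ is 0 from some point on. By contrast, $\epsilon,
2\epsilon, 3\epsilon, \ldots$ does not converge, because its
coefficients at $\epsilon^1$ are $1, 2, 3, \ldots$.
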
1800: With this notion of countable sum, it makes perfect sense to consider
countably additive nonstandard probability measures. If $\F$ is a
1802: $\sigma$-algebra and $\LPS^c(W,\F)$ and $\NPS^c(W,\F)$ denote the
1803: countably additive LPS's and NPS's on $(W,\F)$, respectively, then
1804: Theorem~\ref{injection} can be applied with no change in proof to
1805: show the following.
1806:
1807: \thm\label{injection1} $\FLN$ is an injection from $\LPS^c(W,\F)/\naeq$
1808: to $\NPS^c(W,\F)/\naeq$.
1809: \ethm
1810:
1811: However, as the following example shows, even with the requirement of
1812: countable additivity, there are nonstandard probability measures that
1813: are not equivalent to any LPS.
1814:
1815: \xam\label{counter4} Let $W = \{w_1, w_2, w_3, \ldots\}$, and let $\F =
2^W$. Choose any nonstandard model $I^*$ of the integers and fix an infinitesimal $\epsilon$
1817: in $\IR(I^*)$.
1818: Define an NPS $(W,\F,\nu)$ with range $\IR(I^*)$
1819: by taking $\nu(w_j) = a_j + b_j \epsilon$, where $a_j = 1/2^j$, $b_{2j-1} =
1820: %joe6
1821: %\epsilon/2^{j-1}$, and $b_{2j} = -\epsilon/2^{j-1}$, for $j = 1, 2, 3,
1822: %\ldots$.
1823: 1/2^{j-1}$, and $b_{2j} = -1/2^{j-1}$, for $j = 1, 2, 3, \ldots$.
1824: Thus, the probabilities of $w_1, w_2, \ldots$ are characterized by the
1825: sequence $1/2 + \epsilon, 1/4 - \epsilon, 1/8 + \epsilon/2, 1/16 -
1826: \epsilon/2, 1/32 + \epsilon/4, \ldots$. For $U \subseteq W$, define
1827: $\nu(U) = \sum_{\{j: w_j \in U\}} a_j + \epsilon \sum_{\{j: w_j \in U\}}
1828: b_j$. It is easy to see that these sums are well-defined.
1829: %joe6
1830: These likelihoods correspond to preferences. For example, an agent
1831: should prefer a bet that gives a payoff of 1 if $w_2$ occurs and 0 otherwise
1832: to a bet that gives a payoff of 4 if $w_4$ occurs and 0 otherwise.
1833: %joetark
1834: As I show in the appendix (see Proposition~\ref{counter}), there
1835: %As I show in the full paper, there
1836: is no LPS $\vecmu$ over $(W,\F)$ such that $\nu \aeq
1837: \vecmu$.
1838: \exam
1839:
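A few routine checks on Example~\ref{counter4} may be helpful (they are
included only as illustration). Since $b_{2j-1} + b_{2j} = 0$ for all
$j$, and both series converge absolutely,
$$
\sum_{j=1}^\infty a_j = \sum_{j=1}^\infty 2^{-j} = 1, \qquad
\sum_{j=1}^\infty b_j = 0,
$$
so $\nu(W) = 1$. For the two bets mentioned in the example, the
expected payoffs are
$$
1 \cdot \nu(w_2) = 1/4 - \epsilon \quad \mbox{and} \quad
4 \cdot \nu(w_4) = 4(1/16 - \epsilon/2) = 1/4 - 2\epsilon,
$$
so the bet on $w_2$ is better by exactly $\epsilon$: the standard parts
tie, and the infinitesimal parts decide.
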
1840: Roughly speaking, the reason that $\nu$ is not equivalent to
1841: any LPS in Example~\ref{counter4} is that the ratio between $a_j$ and
1842: $b_j$ in the definition of $\nu$ (i.e., the ratio
1843: %joe6
1844: between the
1845: ``standard part'' of
1846: $\nu(w_j)$ and the ``infinitesimal part'' of $\nu(w_j)$)
1847: %joe9
1848: %grows unboundedly large. This can be generalized so as to give
1849: goes to zero. This can be generalized so as to give
1850: a condition on nonstandard probability measures that
1851: is necessary and sufficient to guarantee that they can be represented by
1852: an LPS.
1853: % however, I do not pursue this issue here.
1854: However, the condition is rather technical and I have not found an
1855: interesting interpretation of it, so I do not pursue it here.
1856:
1857:
1858: \section{Relating Popper Spaces to NPS's}\label{PopperNPS}
1859: Consider the map $\FNP$ from nonstandard probability spaces to Popper
1860: spaces such that $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$, where
1861: $\F' = \{U: \nu(U) \ne 0\}$ and $\mu(V \mid U) = \stand{\nu(V \mid U)}$ for $V \in
1862: \F$, $U \in \F'$. I leave it to the reader to check that
1863: $(W,\F,\F',\mu)$ is indeed a Popper space.
1864: %joe9
1865: This is arguably the most natural map; for example, it is easy to check
1866: that $\FNP \circ \FSN = \FCP$, where $\FSN$ is the restriction of $\FLN$
to SLPS's. (Note that $\FLN$ is well-defined on SLPS's, since if
1868: $\vecmu$ is an SLPS, by Proposition~\ref{motivation}, $[\vecmu] =
1869: \{\vecmu\}$.)
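
As a small illustration of $\FNP$ (the measure here is chosen only for
the sake of the example), let $W = \{w_1,w_2\}$, $\F = 2^W$,
$\nu(w_1) = 1 - \epsilon$, and $\nu(w_2) = \epsilon$ for some positive
infinitesimal $\epsilon$. Then $\F' = \{\{w_1\}, \{w_2\}, W\}$ and
$$\mu(\{w_2\} \mid W) = \stand{\epsilon} = 0, \qquad
\mu(\{w_2\} \mid \{w_2\}) = \stand{\epsilon/\epsilon} = 1.$$
Thus $\{w_2\}$ gets probability 0 given $W$, yet conditioning on it is
still defined in the resulting Popper space.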
1870:
1871: %joetark: this is new
1872: %joe9
1873: We might hope that $\FNP$ is a bijection from $\NPS(W,\F)/\naeq$ to
$\Popper(W,\F)$. As I show shortly, it is not. To understand $\FNP$
1875: better, define an equivalence relation $\simeq$ on $\NPS(W,\F)$ (and
1876: $\NPS^c(W,\F)$) by taking $\nu_1 \simeq \nu_2$ if $\{U: \nu_1(U) = 0\} =
1877: \{U: \nu_2(U) = 0\}$ and $\stand{\nu_1(V \mid U)} = \stand{\nu_2(V \mid
1878: U)}$ for
1879: all $V, U$ such that $\nu_1(U) \ne 0$.
1880: %joe9
1881: Thus, $\simeq$ essentially says that infinitesimal differences between
1882: conditional probabilities do not count.
1883: Let $\NPS/\!\simeq$
1884: (\respc $\NPS^c/\!\simeq$) consist of the
1885: $\simeq$ equivalence classes in $\NPS$ (\respc $\NPS^c$). Clearly
1886: $\FNP$ is well defined as a map from $\NPS/\!\simeq$ to $\Popper(W,\F)$
1887: and
1888: from $\NPS^c/\!\simeq$ to $\Popper^c(W,\F)$. As the following result
1889: shows, $\FNP$ is actually a bijection from $\NPS^c/\!\simeq$ to
1890: $\Popper^c(W,\F)$.
1891:
1892:
1893: \thm\label{FNP} $\FNP$ is a bijection from $\NPS(W,\F)/\!\simeq$ to
1894: $\Popper(W,\F)$ and from $\NPS^c(W,\F)/\!\simeq$ to $\Popper^c(W,\F)$.
1895: \ethm
1896:
1897: \prf
1898: It is easy to see that $\FNP$ is an injection.
In the countably additive case, the inverse map can be defined using earlier results.
1900: If $(W,\F,\F',\mu) \in \Popper^c(W,\F)$, by
1901: Theorem~\ref{infiso},
1902: there is a countably additive SLPS $\vec{\mu}'$ such that
1903: $\FCP((W,\F,\vec{\mu}')) = (W, \F,\F', \mu)$. By
1904: Theorem~\ref{injection}, there is some
1905: $(W,\F,\nu) \in \NPS^c(W,\F)$ such that $\nu \aeq \vecmu'$. It is not
1906: hard to show that $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$; see the appendix
1907: for details. Showing that $\FNP$ is a surjection in the finitely
1908: additive case requires more work; again, see the appendix for details.
1909: \eprf
1910:
1911: McGee \citeyear{McGee94} proves essentially the same result as
1912: Theorem~\ref{FNP} in the case that $\F$ is an algebra (and the measures
1913: involved are not necessarily countably additive). McGee
1914: \citeyear[p.~181]{McGee94} says that his
1915: result shows that ``these two approaches amount to the same thing''.
1916: However, this is far from clear. The $\simeq$ relation is rather
1917: coarse. In particular, it is coarser than~$\aeq$.
1918:
1919: %joe6
1920: %\opro{simeqvsaeq} If $\nu_1 \aeq \nu_2$ than $\nu_1 \simeq \nu_2$.
1921: \pro\label{simeqvsaeq} If $\nu_1 \aeq \nu_2$ then $\nu_1 \simeq \nu_2$.
1922: \epro
1923:
1924:
1925: The converse of Proposition~\ref{simeqvsaeq} does not hold in general.
1926: As a result,
1927: the $\simeq$ relation identifies nonstandard measures that behave quite
1928: differently in decision contexts.
1929: %Indeed, the results in
1930: %Sections~\ref{FCP} and~\ref{LPSNPS} already suggest the nature of the
1931: %gap between Popper spaces and NPS's. To simplify the discussion,
1932: This difference already arises in finite spaces, as the following example
1933: shows.
1934:
1935: \xam\label{McGee}
1936: Suppose $W =
1937: \{w_1,w_2\}$. Consider the nonstandard probability measure $\nu_1$ such that
1938: $\nu_1(w_1) = 1/2 + \epsilon$ and $\nu_1(w_2) = 1/2 - \epsilon$. (This is
equivalent to the LPS $(\mu_1,\mu_2)$ where $\mu_1(w_1) = \mu_1(w_2) =
1/2$, $\mu_2(w_1) = 1$, and $\mu_2(w_2) = 0$.)
1941: Let $\nu_2$ be the nonstandard probability measure such that $\nu_2(w_1)
1942: = \nu_2(w_2) = 1/2$. Clearly $\nu_1 \simeq \nu_2$. However, it is not
1943: the case that $\nu_1 \aeq \nu_2$.
1944: Consider the two
1945: random variables $\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$.
1946: (I use the notation $\chi_U$ to denote the indicator function for $U$;
1947: that is, $\chi_U(w) = 1$ if $w \in U$ and $\chi_U(w) = 0$ otherwise.)
1948: According to $\nu_1$, the
1949: expected value of $\chi_{\{w_1\}}$ is (very slightly) higher than that of
1950: $\chi_{\{w_2\}}$.
1951: According to $\nu_2$, $\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$ have the
1952: same expected value. Thus, $\nu_1 \not\aeq \nu_2$.
1953: Moreover, it is easy to see that there
1954: is no Popper measure $\mu$ on $\{w_1,w_2\}$ that can make the same
1955: distinctions with respect to $\chi_{\{w_1\}}$ and $\chi_{\{w_2\}}$ as
1956: $\nu_1$, no matter how we define expected value with respect to a
1957: Popper measure. According to $\nu_1$, although the expected value of
1958: $\chi_{\{w_1\}}$ is higher than that of $\chi_{\{w_2\}}$, the expected
1959: value of $\chi_{\{w_1\}}$ is less than
1960: that of $\alpha \chi_{\{w_2\}}$ for any (standard) real $\alpha > 1$.
1961: There is no Popper measure with this behavior.
1962: \exam
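
The comparison in Example~\ref{McGee} can be made explicit. Writing
$E_{\nu_1}$ for expected value with respect to $\nu_1$, and assuming
that $\epsilon$ is a positive infinitesimal, for every standard real
$\alpha > 1$,
$$E_{\nu_1}(\alpha \chi_{\{w_2\}}) - E_{\nu_1}(\chi_{\{w_1\}}) =
\alpha (1/2 - \epsilon) - (1/2 + \epsilon) =
\frac{\alpha - 1}{2} - (\alpha + 1)\epsilon > 0,$$
since $(\alpha - 1)/2$ is a positive standard real and $(\alpha +
1)\epsilon$ is infinitesimal. For $\alpha = 1$, the same difference is
$-2\epsilon < 0$.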
1963:
1964: %suppose that $W$ is finite. Then
1965: More generally, in finite spaces, Theorem~\ref{FCPfin} shows that
1966: Popper spaces are equivalent to SLPS's, while
1967: Theorem~\ref{lpsnps} shows that $\LPS(W,\F)/\naeq$ is equivalent to
1968: $\NPS(W,\F)/\naeq$. By Proposition~\ref{motivation},
1969: $\SLPS(W,\F)/\naeq$ is essentially identical to $\SLPS(W,\F)$ (all the
1970: equivalence classes in $\SLPS(W,\F)/\naeq$ are singletons),
1971: so in finite spaces, the gap in expressive power between Popper spaces
1972: and NPS's essentially amounts to the gap between $\SLPS(W,\F)$ and
1973: $\LPS(W,\F)/\naeq$. This gap is nontrivial. For example, there is no
1974: SLPS equivalent to the LPS $(\mu_1,\mu_2)$ that represents the NPS in
1975: Example~\ref{McGee}.
1976:
1977: %I do not know any way of making this gap completely precise. There
1978: %does not seem to be an analogue of the equivalence $\aeq$ from
1979: %Definition~\ref{aeq} for Popper spaces, since it is not clear how to
1980: %define expected value with respect to a Popper measure.
1981: %The following example may help to clarify the issue.
1982:
1983: \section{Independence}\label{sec:indep}
1984: %joe3
1985: %BBD \citeyear{BBD1} and Hammond \citeyear{Hammond94} discuss
1986: %independence, but they consider
1987: %only when a (standard or nonstandard) probability measure can be viewed
1988: %as a {\em product measure\/} (that is, a product of other measures).
1989: %%joe2
1990: %%Interestingly, their discussion does {\em not\/}
1991: %%consider independence directly for LPS's; indeed, it is far from clear
1992: %%what it would mean that an LPS can be written as a product measure.
1993: %Rather than considering product measures,
1994: %I consider more standard notions of independence:
1995: %joe6
1996: %In this section, I consider
1997: The notion of independence is fundamental. As I show in this section, the
results of the previous sections shed light on various notions of
1999: independence considered in the literature for LPS's and (variants of)
2000: cps's. I first consider independence for events and then independence
2001: for random variables. I then relate my definitions to those of BBD,
2002: Hammond, and Kohlberg and Reny \citeyear{KR97}.
2003:
2004: Intuitively, event $U$ is independent of $V$ if learning $U$ gives no
2005: information about $V$. Certainly if learning $U$ gives no information
2006: about $V$, then if $\mu$ is an arbitrary probability measure, we would
2007: expect that $\mu(V \mid U) = \mu(V)$. Indeed, this is often taken as the
definition of $U$ being independent of $V$ with respect to $\mu$.
2009: If standard probability measures are used, conditioning is not
2010: defined if $\mu(U) = 0$. In this case, $U$ is still considered
2011: independent of $V$. As is well known, if $U$ is independent of $V$,
2012: then $\mu(U \inter V) = \mu(V) \times \mu(U)$ and $V$ is independent of
2013: $U$, that is, $\mu(U \mid V) = \mu(U)$. Thus, independence of events with
2014: respect to a probability measure can be
2015: defined in any of three equivalent ways. Unfortunately, these
2016: definitions are not equivalent for other representations of uncertainty
2017: (see \cite[Chapter 4]{Hal31} for a general discussion of this issue).
2018:
2019:
2020: The situation is perhaps simplest for nonstandard probability measures.%
2021: \footnote{Although I talk about $U$ being independent of $V$ with
2022: respect to a nonstandard measure $\nu$, technically I should talk about
2023: $U$ being independent of $V$ with respect to an NPS $(W,\F,\nu)$, for
2024: $U, V \in \F$. I continue to be sloppy at times, reverting to more
2025: careful notation when necessary.}
2026: In this case, the three notions coincide, for exactly the same reasons
2027: as they do for standard probability measures. However, independence is
2028: perhaps too strong a notion in some ways. In particular, nonstandard
2029: measures that are equivalent do not in general agree on independence, as
2030: the following example shows.
2031: \xam\label{xam:approximatelyindep}
2032: Suppose that $W = \{w_1, w_2, w_3, w_4\}$. Let
2033: $\nu_i(w_1 ) = 1 - 2 \epsilon + \epsilon_i$, $\nu_i(w_2) = \nu_i(w_3) =
2034: \epsilon - \epsilon_i$, and $\nu_i(w_4) = \epsilon_i$, for $i = 1,
2035: 2$, where $\epsilon_1 = \epsilon^2$ and $\epsilon_2 = \epsilon^3$. If
2036: $U = \{w_2, w_4\}$ and $V = \{w_3, w_4\}$, then $\nu_i(U) =
2037: \nu_i(V) = \epsilon$ and $\nu_i(U \inter V) = \epsilon_i$. It
follows that $U$ and $V$ are independent with
2039: respect to $\nu_1$, but not with respect to $\nu_2$. However, it is
2040: easy to check that $\nu_1 \aeq \nu_2$.
2041: \exam
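
Explicitly, $\nu_i(U) \times \nu_i(V) = \epsilon^2$ for $i = 1, 2$,
while
$$\nu_1(U \inter V) = \epsilon^2 \quad \mbox{and} \quad
\nu_2(U \inter V) = \epsilon^3 \ne \epsilon^2.$$
On the other hand, $\nu_i(V \mid U) = \epsilon_i/\epsilon$, which is
$\epsilon$ for $i=1$ and $\epsilon^2$ for $i=2$, so that
$\stand{\nu_i(V \mid U)} = 0 = \stand{\nu_i(V)}$ in both cases. That is,
the two measures disagree only by an infinitesimal amount.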
2042:
2043: Example~\ref{xam:approximatelyindep} shows that independence of events in the
2044: context of nonstandard
2045: measures is very sensitive to the choice of $\epsilon$, even if this
2046: choice does not affect decision making at all. This suggests the
2047: following definition: $U$ is {\em approximately independent\/} of $V$ with
2048: respect to $\nu$ if $\nu(U) \ne 0$ implies that
2049: $\nu(V \mid U) - \nu(V)$ is infinitesimal, that is, if
2050: $\stand{\nu(V \mid U)} = \stand{\nu(V)}$.
2051: Note that $U$ can be approximately independent of $V$ without
2052: $V$ being approximately independent of $U$. For example, consider the
2053: nonstandard probability measure $\nu_1$ from
2054: Example~\ref{xam:approximatelyindep}. Let
2055: %joe2
2056: %$V' = \{w_1, w_2\}$;
2057: $V' = \{w_4\}$;
2058: as before, let $U = \{w_2, w_4\}$. It is easy to check that
2059: $\stand{\nu_1(V' \mid U)} = \stand{\nu_1(V')} = 0$, but
2060: $\stand{\nu_1(U \mid V')} = 1$, while $\stand{\nu_1(U)} = 0$. Thus,
2061: $U$ is approximately independent of $V'$ with respect to $\nu_1$, but
2062: $V'$ is not
2063: approximately independent of $U$. Similarly, $U$ can be approximately
2064: independent of $V$ without $\overline{U}$ being approximately
2065: independent of $V$. For example, it is easy to check that
2066: $\overline{V}'$ is approximately independent of $U$ with respect to
2067: $\nu_1$, although $V'$ is not.
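
The computations behind these claims are short. With $U = \{w_2,w_4\}$
and $V' = \{w_4\}$,
$$\nu_1(V' \mid U) = \frac{\epsilon^2}{\epsilon} = \epsilon, \qquad
\nu_1(V') = \epsilon^2, \qquad
\nu_1(U \mid V') = \frac{\epsilon^2}{\epsilon^2} = 1, \qquad
\nu_1(U) = \epsilon,$$
and
$$\nu_1(U \mid \overline{V}') = \frac{\nu_1(w_2)}{1 - \nu_1(w_4)} =
\frac{\epsilon - \epsilon^2}{1 - \epsilon^2} =
\frac{\epsilon}{1+\epsilon},$$
which has standard part $0 = \stand{\nu_1(U)}$.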
2068:
2069:
2070:
2071: A straightforward argument shows that $U$
2072: is approximately independent of $V$ with respect to $\nu$ iff
2073: $\nu(U) \ne 0$ implies $\stand{(\nu(V
2074: \inter U) - \nu(V) \times \nu(U))/ \nu(U)} = 0$, while $V$ is
2075: approximately independent of
2076: $U$ with respect to $\nu$ iff the same statement holds with the roles of
2077: $V$ and $U$ reversed.
2078: Note for future reference that each of these requirements
2079: is stronger than just
2080: requiring that $\stand{\nu(V \inter U) - \nu(V) \times \nu(U)} = 0$.
2081: The latter requirement is automatically met, for example, if the
2082: probability of either $U$ or $V$ is infinitesimal.
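
The argument amounts to a one-line computation: if $\nu(U) \ne 0$, then
$$\nu(V \mid U) - \nu(V) = \frac{\nu(V \inter U)}{\nu(U)} - \nu(V) =
\frac{\nu(V \inter U) - \nu(V) \times \nu(U)}{\nu(U)},$$
so the left-hand side is infinitesimal iff the right-hand side is. To
see that the division matters, take $\nu_1$ and $U$ from
Example~\ref{xam:approximatelyindep} and $V' = \{w_4\}$: the difference
$\nu_1(U \inter V') - \nu_1(U) \times \nu_1(V') = \epsilon^2 -
\epsilon^3$ is infinitesimal, but dividing it by $\nu_1(V') =
\epsilon^2$ gives $1 - \epsilon$, whose standard part is 1.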
2083:
2084: The definition of (approximate) independence extends in a straightforward
2085: way to (approximate) conditional independence. $U$ is
2086: conditionally independent of $V$ given $V'$ with respect to a (standard
2087: or nonstandard) probability measure $\nu$ if
2088: $\nu(U \inter V') \ne 0$ implies $\nu(V \mid U \inter V') = \nu(V \mid V')$.
2089: %Conditional independence is typically taken to hold by convention
2090: %if $\nu(U \inter V') = 0$.
2091: %joe2
2092: %Intuitively, this is because $\nu(V \inter U
2093: %\inter V')$ is indeterminate in this case.
2094: %However, it may seem
2095: %reasonable to say that conditional independence does {\em not\/} hold if
2096: %$U \inter V' = \emptyset$ but $V \inter V' \ne \emptyset$. Intuitively,
2097: %in this case, it is clear that, given $V'$, finding out $
2098: Again, for probability, $U$ is
2099: conditionally independent of $V$ given $V'$ iff $V$ is conditionally
2100: independent of $U$ given $V'$ iff $\nu(V \inter U \mid V') = \nu(V
2101: \mid V') \times \nu(U \mid V')$.
$U$ is approximately
conditionally independent of $V$ given $V'$ with respect to $\nu$ if
$\nu(U \inter V') \ne 0$ implies
$\stand{\nu(V \mid U \inter V')} = \stand{\nu(V
\mid V')}$. If $V'$ is taken to be $W$, the whole space, then (approximate)
2106: conditional independence reduces to (approximate) independence.
2107:
2108: The following proposition shows that, although independence is not
2109: preserved by equivalence, approximate independence is.
2110:
2111: \pro\label{indaeq} If $U$ is approximately conditionally independent of
2112: $V$ given
2113: $V'$ with respect to $\nu$, and $\nu \aeq \nu'$, then
2114: $U$ is approximately conditionally independent of $V$ given
2115: $V'$ with respect to $\nu'$.
2116: \epro
2117:
2118: \prf Suppose that $\nu \aeq \nu'$. I claim that for all events $U_1$
and $U_2$ such that $\nu(U_2) \ne 0$, $\stand{\nu(U_1)/\nu(U_2)} =
2120: \stand{\nu'(U_1)/\nu'(U_2)}$. For suppose that
2121: $\stand{\nu(U_1)/\nu(U_2)} = \alpha$. Then it easily follows that
2122: $E_\nu(\chi_{U_1}) < E_\nu(\alpha'\chi_{U_2})$ for all $\alpha' > \alpha$,
2123: and $E_\nu(\chi_{U_1}) > E_\nu(\alpha''\chi_{U_2})$ for all $\alpha'' <
2124: \alpha$. Thus, the same must be true for $E_{\nu'}$, and hence
2125: $\stand{\nu'(U_1)/\nu'(U_2)} = \alpha$. It thus follows
2126: that $\stand{\nu (V \mid U \inter V')} = \stand{\nu' (V \mid U \inter
2127: V')}$ and $\stand{\nu(V \mid V')} = \stand{\nu'(V \mid V')}$, from which
2128: the result is immediate. \eprf
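
In more detail, here is a sketch of the last step, assuming that
$\nu(U \inter V') \ne 0$. By Proposition~\ref{simeqvsaeq}, $\nu$ and
$\nu'$ assign probability 0 to the same events, so $\nu'(U \inter V')
\ne 0$ as well. Since
$$\nu(V \mid U \inter V') = \frac{\nu(V \inter U \inter V')}{\nu(U
\inter V')} \quad \mbox{and} \quad
\nu(V \mid V') = \frac{\nu(V \inter V')}{\nu(V')},$$
the claim can be applied once with $U_1 = V \inter U \inter V'$ and $U_2
= U \inter V'$, and once with $U_1 = V \inter V'$ and $U_2 = V'$.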
2129:
2130:
2131: \commentout{
2132: There is also an interesting connection between approximate independence
2133: and independence, which will prove useful in understanding issues
2134: involving independence between random variables.
2135:
2136:
2137: \pro\label{indaeq1} There exists a measure $\nu'$ such
2138: that $\nu \aeq \nu'$ and $U$ is conditionally independent of $V$ given
2139: $V'$ with respect to $\nu'$ iff (a) both $U$ and $\overline{U}$ are
2140: approximately conditionally independent of $V$ given $V'$ with respect
2141: to $\nu$ and (b) both $V$
2142: and $\overline{V}$ are approximately conditionally independent of $U$
2143: given $V'$ with respect to $\nu$.
2144: \epro
2145:
2146:
\prf First suppose that $\nu \aeq \nu'$ and that $U$ is
conditionally independent of $V$ given $V'$ with respect to $\nu'$.
2149: Then, by standard properties of independence, both $U$ and
2150: $\overline{U}$ are conditionally independent of $V$ given $V'$
2151: and both $V$ and $\overline{V}$ are conditionally independent of $U$
2152: given $V'$. Since conditional independence certainly implies
2153: conditional approximate independence, the forward implication follows
2154: from Proposition~\ref{indaeq}.
2155:
2156: For the reverse implication, suppose that (a) both $U$ and
2157: $\overline{U}$ are approximately conditionally independent of $V$ given
2158: $V'$ with respect to $\nu$ and (b) both $V$
2159: and $\overline{V}$ are approximately conditionally independent of $U$
2160: given $V'$ with respect to $\nu$. Suppose that $\stand{U \mid V'} =
2161: r_1$, $\stand{V \mid V'} = r_2$
2162: We now need to consider a number of
2163: cases. First, suppose that both $0 < r_1, r_2 < 1$.
2164: }
2165:
2166:
2167: There is an obvious definition of independence for events for Popper spaces:
2168: $U$ is independent of $V$ given $V'$ with respect to the Popper space
2169: $(W,\F,\F',\mu)$ if $U \inter V' \in\F'$ implies that $\mu(V \mid U
2170: \inter V') = \mu(V \mid V')$; if $U \inter V' \notin \F'$, then $U$ is also
2171: taken to be independent of $V$ given $V'$. If
2172: $U$ is independent of $V$ given $V'$ and $V' \in \F'$, then $\mu(U
2173: \inter V \mid V') = \mu(U \mid V') \times \mu(V \mid V')$. However, the
2174: converse does not necessarily hold. Nor is it the case that if $U$ is
2175: independent of $V$ given $V'$ then $V$ is independent of $U$ given
2176: $V'$. A counterexample can be obtained by taking the Popper space
2177: arising from the NPS in Example~\ref{xam:approximatelyindep}. Consider the
2178: Popper space $(W,2^W,\F',\mu)$ corresponding to the NPS $(W,2^W,\nu_1)$
2179: %joe2
2180: %via the isomorphism $\FNP$. It is easy to check that $U$ is
2181: %independent
2182: via the bijection $\FNP$. It is easy to check that $U$ is independent
2183: of $V'$ but $V'$ is not independent of $U$ with respect to this Popper
2184: space, although $\mu(V' \inter U) = \mu(U \mid V') \times \mu(V') \ (=
2185: 0)$. This observation is an instance of the following more general
2186: result, which is almost immediate from the definitions:
2187:
2188: \pro\label{pro:approximatelyindep} $U$ is approximately independent of
2189: $V$ given $V'$
2190: with respect to the NPS $(W,\F,\nu)$ iff $U$ is independent of
2191: $V$ given $V'$ with respect to the Popper space $\FNP(W,\F,\nu)$.
2192: \epro
2193:
2194: How should independence be defined in LPS's?
2195: %joe2
Interestingly, neither BBD nor Hammond defines independence
2197: %joe3
2198: directly
2199: for LPS's.
2200: %joe3
2201: \commentout{
2202: BBD \citeyear{BBD1} give three definitions of independence: two of them
2203: are given in terms of NPS's; the third is an indirect definition in
2204: terms of preference orders. Hammond also works in NPS's. Note that
2205: requiring that $\vecmu(V
2206: \mid U) = \vecmu(V)$ will not work since $\vecmu \mid
2207: U$ and $\vecmu$ are, in general, LPS's of different lengths. Nor is
2208: there any obvious way to define multiplication of two LPS's.
2209: %joe3:
2210: %It seems
2211: %to me that the most natural way
2212: One way
2213: to define independence in LPS's is to
2214: essentially reduce the definition to that for Popper spaces. That is,
2215: $U$ is independent of $V$ given $V'$ with respect to the LPS
2216: $(W,\F,\vecmu)$ if the leftmost number in the sequence $\vecmu(V \mid U
2217: \inter V')$ is the same as the leftmost number in $\vecmu(V \mid V')$;
2218: as usual, independence is taken to hold trivially if $\vecmu(U \inter
2219: V') = \vec{0}$. The following result is almost immediate from the
2220: definitions.
2221:
2222: \pro\label{pro:approximatelyindep1} $U$ is independent of $V$ given $V'$
2223: with respect to the LPS $\vecmu$ iff $U$ is approximately independent of
2224: $V$ given $V'$ with respect to each NPS in the equivalence class
2225: $\FLN([\vecmu])$.
2226: \epro
2227:
2228: \noindent Propositions~\ref{pro:approximatelyindep}
2229: and~\ref{pro:approximatelyindep1} emphasize
2230: the naturalness of approximate independence in this context.
2231: }
2232: %joe3: \end{commentout}
2233: However, they do give definitions in terms of NPS's that can be
2234: applied to equivalent LPS's; indeed, BBD \citeyear{BBD2} do just this
2235: %joe4: typo
2236: %(see the discussion of BBD strong equivalence below).
2237: (see the discussion of BBD strong independence below).
2238:
2239: I now consider independence for random variables. If $X$ is a random
variable on $W$, let $\V(X)$ denote the
range (set of possible values) of $X$; that is, $\V(X) =
2242: \{X(w): w \in W\}$.
2243: %joe2
2244: %For simplicity here, assume that the range of all random variables is
2245: Recall that I am assuming that all random variables have countable range.
2246: Random variable $X$ is
2247: independent of $Y$ with respect to a standard probability measure $\mu$
2248: if the event $X=x$ is independent of the
2249: event $Y=y$ with respect to $\mu$, for all $x \in \V(X)$ and $y \in \V(Y)$.
2250: %joe2
2251: By analogy, for nonstandard probability measures, following Kohlberg and
2252: Reny \citeyear{KR97},
2253: define $X$ and $Y$ to
2254: be {\em weakly independent\/} with respect to $\nu$ if $X=x$ is
2255: approximately independent of $Y=y$ and $Y=y$ is approximately
2256: independent of $X=x$ with respect to $\nu$ for all $x\in
2257: \V(X)$ and $y \in \V(Y)$.%
2258: \footnote{Kohlberg and Reny's definition of weak independence also
2259: requires that the joint
2260: range of $X$ and $Y$ be the product of the individual ranges. That is,
2261: for $X$ and $Y$ to be weakly independent, it must be the case that for
2262: all $x \in \V(X)$ and $y \in \V(Y)$, there exists some $w \in W$ such
2263: that $X(w) = x$ and $Y(w) = y$.
2264: %joe3
2265: %Of course, this requirement could also be added to the definitions of
2266: %weak and strong independence I have proposed here; adding it does not
2267: %seem to make a significant difference.}
2268: Of course, this requirement could also be added to the definition
2269: I am proposing here; adding it would not affect any of the results
2270: of this paper.}
2271:
2272:
2273:
2274:
2275: For standard probability measures, it easily follows
2276: that if $X$ is independent of $Y$, then $X \in U_1$ is independent of $Y
2277: \in V_1$ conditional on $Y \in V_2$ and $Y \in V_1$ is independent of $X
2278: \in U_1$ conditional on $X \in U_2$, for all $U_1, U_2 \subseteq \V(X)$
2279: and $V_1, V_2 \subseteq \V(Y)$. The same arguments show that this is
also true for nonstandard probability measures. However, the
2281: argument breaks down for approximate independence.
2282:
2283: \xam\label{xam:needapproximate} Suppose that $W = \{1,2,3\} \times
2284: \{1,2\}$. Let $X$ and $Y$ be the random variables that project onto the
2285: first and second components of a world, respectively, so that $X(i,j) =
2286: i$ and $Y(i,j) = j$. Let $\nu$ be the nonstandard probability measure
2287: on $W$ given by the following table:
2288:
2289:
2290: %\begin{table}[h]
2291: \begin{center}
\begin{tabular}{| c | c | c |}
2293: \hline
2294: %joe2: added X=, Y=
2295: & $Y=1$ & $Y=2$ \\
2296: \hline
2297: $X=1$ & $1 - 3 \epsilon - 3\epsilon^2$ & $\epsilon$\\
2298: \hline
2299: $X=2$ & $\epsilon$ & $\epsilon^2$\\
2300: \hline
2301: $X=3$ & $\epsilon$ & $2\epsilon^2$\\
2302: \hline
2303: \end{tabular}
2304: \end{center}
2305: %\end{table}
2306: It is easy to check that
2307: %$X = i$ is approximately independent of $Y=j$ and that $Y=j$
2308: %is approximately independent of $X=i$
$X$ and $Y$ are weakly independent
with respect to $\nu$; that is, $X=i$ is approximately independent of
$Y=j$ and $Y=j$ is approximately independent of $X=i$ for all $i \in
\{1,2,3\}$, $j \in \{1,2\}$. However, $\stand{\nu(X = 2 \mid X \in
2312: \{2,3\} \inter Y=2)} = 1/3$, while $\stand{\nu(X=2 \mid X \in \{2,3\})}
2313: = 1/2$.
2314: \exam
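
For concreteness, here are the relevant computations. The marginals are
$\nu(Y=2) = \epsilon + 3\epsilon^2$, $\nu(X=2) = \epsilon + \epsilon^2$,
and $\nu(X=3) = \epsilon + 2\epsilon^2$. A typical check for weak
independence is
$$\nu(X=2 \mid Y=2) = \frac{\epsilon^2}{\epsilon + 3\epsilon^2} =
\frac{\epsilon}{1 + 3\epsilon},$$
which has standard part $0 = \stand{\nu(X=2)}$; the other cases are
similar. By way of contrast,
$$\nu(X=2 \mid X \in \{2,3\} \inter Y=2) = \frac{\epsilon^2}{\epsilon^2 +
2\epsilon^2} = \frac{1}{3}, \qquad
\nu(X=2 \mid X \in \{2,3\}) = \frac{\epsilon + \epsilon^2}{2\epsilon +
3\epsilon^2} = \frac{1 + \epsilon}{2 + 3\epsilon},$$
and the latter has standard part $1/2$.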
2315:
2316: In light of this example, I define $X$ to be {\em approximately independent of
2317: %joe2
2318: $\{Y_1, \ldots, Y_n\}$ with respect to $\nu$\/} if $X \in U_1$ is
2319: %joe3
2320: %approximately independent of $Y_1 \in V_1 \inter \ldots \inter Y_n \in V_n$
2321: %conditional on $Y_1 \in V_1' \inter \ldots \inter Y_n \in V_n'$ with
2322: approximately independent of $(Y_1 \in V_1) \inter \ldots \inter (Y_n
2323: \in V_n)$ conditional on $(Y_1 \in V_1') \inter \ldots \inter (Y_n \in
2324: V_n')$ with
2325: respect to $\nu$ for all
2326: $U_1 \subseteq \V(X)$, $V_i, V_i' \subseteq \V(Y_i)$, and $i = 1, \ldots,
2327: n$. $X_1, \ldots, X_n$ are {\em approximately independent with respect
2328: to $\nu$\/} if $X_i$ is approximately independent of $\{X_1, \ldots,
2329: X_n\} - \{X_i\}$ with respect to $\nu$ for $i = 1, \ldots, n$. I leave
2330: to the reader the obvious extensions to
2331: conditional independence and the
2332: analogues of this definition for Popper spaces and LPS's.
2333: %joe2
2334: %Note that the events $U$ and $V$ are approximately independent iff the
2335: %random variables $\chi_U$ and $\chi_V$ are approximately independent (or
2336: %weakly independent---approximate independence and weak independence
2337: %coincide for binary random variables).
2338:
2339: %joe3
2340: %I consider one last notion of independence for random variables,
2341: %\emph{strong independence}, first considered by Kohlberg and Reny
2342: %\citeyear{KR97}.
2343:
2344: %joe3
2345: %We can, of course, define strong independence for LPS's and NPS's by
2346: %translating the definition from Popper spaces using the mapping $\FNP$
2347: %and $\FLN$. But there is as more direct, and much more natural,
2348: %definition in the case of NPS's, as the following the following theorem
2349: %shows.
2350:
2351: As I said, BBD consider three notions of independence for random variables.
2352: One is a decision-theoretic notion of stochastic independence on preference
2353: relations on acts over $W$. Under appropriate assumptions, it can be
2354: shown that a preference relation is stochastically independent
2355: iff it can be
2356: represented by some (real-valued) utility function $u$ and a nonstandard
2357: probability measure $\nu$ such that $X_1, \ldots, X_n$ are approximately
2358: independent with respect to $\nu$ \cite{BV96}.
2359: A second notion they consider is a weak notion of
2360: product measure that requires only that there exist measures $\nu_1,
\ldots, \nu_n$ such that $\stand{\nu(w_1, \ldots, w_n)} =
\stand{\nu_1(w_1) \times \cdots \times \nu_n(w_n)}$. As we have already
2363: observed, this notion of independence is rather weak. Indeed, an
2364: example in BBD shows that it misses out on some interesting
2365: decision-theoretic behavior.
2366:
2367: \commentout{
2368: Approximate independence and strong independence differ in the order of
2369: universal and existential quantification. $X$ and $Y$ are
2370: approximately independent with respect to $\nu$ if, for all values $x$
2371: and $y$ in the range of $X$ and $Y$, respectively, there is an NPS
2372: $\nu_{xy}$ such that $\nu_{xy} \aeq \nu$ and $X=x$ and $Y=y$ are
2373: independent with respect to $\nu_{xy}$. On the other hand, $X$ and $Y$
2374: are strongly independent if there exists an NPS $\nu'$ such that $\nu'
2375: \aeq \nu$ and for all $x$ and $y$ in the range of $X$ and $Y$,
2376: respectively, $X = x$ is independent of $Y=y$. Clearly KR-strong
2377: independence implies approximate independence. As the following
2378: example (due to Kohlberg and Reny \citeyear{KR97}) shows, in general, it
2379: is strictly stronger.
2380:
2381: \xam Suppose that $W = \{1,2,3\} \times \{1,2,3\}$.
2382: Let $X$ and $Y$ be the random variables that project onto the first and second
2383: components of a world, respectively, so that $X(i,j) = i$ and $Y(i,j) =
2384: j$. Let $\nu$ be the nonstandard probability measure on $W$ given by the
2385: following table:
2386:
2387: %\begin{table}[h]
2388: \begin{center}
2389: \begin{tabular}{| c | c | c | c |}
2390: \hline
2391: %joe2: added X=, Y=
2392: & $Y=1$ & $Y=2$ & Y=3\\
2393: \hline
2394: $X=1$ & $1 - 3 \epsilon - 4\epsilon^2 - 3 \epsilon^3 - \epsilon^4$ &
2395: $2\epsilon$ & $\epsilon^2$\\
2396: \hline
2397: $X=2$ & $\epsilon$ & $\epsilon^2$ & $2\epsilon^3$\\
2398: \hline
2399: $X=3$ & $2\epsilon^2$ & $\epsilon^3$ & $\epsilon^4$\\
2400: \hline
2401: \end{tabular}
2402: \end{center}
2403: %\end{table}
2404: It is easy to check that $X$ and $Y$ are approximately independent.
2405: However, they are not strongly independent. Suppose, by way of
2406: contradiction, that there exists some probability measure $\nu' \aeq
2407: \nu$ such that $X$ and $Y$ are independent with respect to $\nu'$.
2408: Note that
2409: $$\stand{\frac{\nu(X=1 \inter Y=2)}{\nu(X=2 \inter Y=1)}} =
2410: \stand{\frac{\nu(X=3 \inter Y=1)}{\nu(X=1 \inter Y=3)}} =
2411: \stand{\frac{\nu(X=2 \inter Y=3)}{\nu(X=3 \inter Y=2)}} = 2.$$
2412: Since $\nu' \aeq \nu$, it is easy to check that
2413: $$\stand{\frac{\nu'(X=1 \inter Y=2)}{\nu'(X=2 \inter Y=1)}} =
2414: \stand{\frac{\nu'(X=3 \inter Y=1)}{\nu'(X=1 \inter Y=3)}} =
2415: \stand{\frac{\nu'(X=2 \inter Y=3)}{\nu'(X=3 \inter Y=2)}} = 2.$$
2416: Thus, it follows that
2417: $$\stand{\frac{\nu'(X=1 \inter Y=2) \times \nu'(X=3 \inter Y=1) \times \nu'(X=2
2418: \inter Y=3)}{ \nu'(X=2 \inter Y=1) \times \nu'(X=1 \inter Y=3) \times
2419: \nu'(X=3 \inter Y=2)}} = 8.$$
2420: However, since $X$ and $Y$ are independent with respect to $\nu'$, we
2421: must have
2422: $$\stand{\frac{\nu'(X=1 \inter Y=2) \times \nu'(X=3 \inter Y=1) \times
2423: \nu'(X=2
2424: \inter Y=3)}{ \nu'(X=2 \inter Y=1) \times \nu'(X=1 \inter Y=3) \times
2425: \nu'(X=3 \inter Y=2) }} = 1.$$
2426: This gives the desired contradiction.
2427: \exam
2428: \commentout{
2429: We can define two events $U$ and $V$ to be strongly independent if the
2430: random variables $\chi_U$ and $\chi_V$ are strongly independent.
2431: However, it is not hard to check (using techniques much like those used
2432: to prove Proposition~\ref{indaeq}) that $\chi_U$ and $\chi_V$ are
2433: strongly independent iff they are weakly (or approximately) independent.
2434: Distinctions that are significant when considering independence of
2435: random variables disappear at the level of independence of independence
2436: of events.
2437: }%\end{commsentout}
2438: }%\end{commentout}
2439:
2440: %joe3: all new
2441: The third notion of independence that BBD consider is the strongest.
2442: BBD \citeyear{BBD2} define $X_1, \ldots, X_n$ to be
2443: strongly independent with respect to an LPS
2444: $\vecmu$ if they are independent (in the usual sense) with respect to an NPS
$\nu$ such that $\vecmu \aeq \nu$.%
2446: \footnote{In \cite{BBD2}, BBD say that this definition of strong
2447: independence is given in \cite{BBD1}. However, the definition appears
2448: to be given only in terms of NPS's in \cite{BBD1}.}
2449: Moreover, they give a characterization
2450: of this notion of strong independence, which I henceforth call \emph{BBD
2451: strong independence}, to distinguish it from the KR notion of strong
2452: independence that I discuss shortly.
Given a vector $\vec{r} = (r^0, \ldots, r^{k-1})$ of reals in
$(0,1)^k$ and a finite LPS
$\vecmu = (\mu^0, \ldots, \mu^k)$, let $\vecmu \, \Box \, \vec{r}$
be the (standard) probability measure
$$(1 - r^0) \mu^0 + r^0[(1-r^1) \mu^1 + r^1[(1-r^2)\mu^2 + r^2[\cdots +
r^{k-2}[(1-r^{k-1})\mu^{k-1} + r^{k-1}\mu^k]\ldots ]]].$$
2459: Note that $\vecmu \, \Box \, \vec{r}$ is defined only if $\vecmu$ is
2460: finite. Thus, in discussing BBD strong independence, I restrict to
2461: finite LPS's.
2462: %joe6
2463: In addition, for technical reasons that will become clear in the proof
2464: of Theorem~\ref{BBDstrongindependence}, I consider only random variables
2465: with finite range, which is what BBD do as well.
2466: BBD \citeyear[p.~90]{BBD2} claim without proof that ``it is
2467: straightforward to show'' that $X_1, \ldots, X_n$ are BBD strongly
2468: independent with respect to $\vecmu$ iff there is a
2469: sequence $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in $(0,1)^k$
2470: such that $\vec{r}^j \rightarrow (0,\ldots, 0)$
2471: as $j\rightarrow\infty$,
2472: and $X_1, \ldots, X_n$ are
2473: independent with respect to $\vecmu \, \Box \, \vec{r}^j$ for $j = 1, 2, 3,
2474: \ldots$. I can prove this result only if the NPS $\nu$ such that
2475: $\vecmu \aeq \nu$ and $X_1, \ldots, X_n$ are independent with respect to
2476: $\nu$ has a range that is an elementary extension of the reals (and thus
2477: has the same first-order properties as the reals).
2478:
2479: \thm\label{BBDstrongindependence}
2480: There exists an NPS $\nu$ whose range
2481: is an
2482: elementary extension of the reals such that $\vecmu \aeq \nu$ and $X_1,
2483: \ldots, X_n$ are
2484: %joe5
2485: %strongly
2486: independent with respect to $\nu$ iff there
2487: exists a sequence
2488: $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in $(0,1)^k$
2489: such that $\vec{r}^j \rightarrow (0,\ldots, 0)$
2490: as $j\rightarrow\infty$,
2491: and $X_1, \ldots, X_n$ are
2492: independent with respect to $\vecmu \, \Box \, \vec{r}^j$ for $j = 1, 2, 3,
2493: \ldots$.
2494: \ethm
2495: I do not know if this result holds without requiring that $\nu$ be an
2496: elementary extension of the reals.
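
To see what the mixture $\vecmu \, \Box \, \vec{r}$ in
Theorem~\ref{BBDstrongindependence} looks like in the simplest nontrivial
case, take $k = 2$, so that $\vecmu = (\mu^0, \mu^1, \mu^2)$ and
$\vec{r} = (r^0, r^1)$. (This is only an illustrative special case.)
Unwinding the nested brackets in the definition gives
$$\vecmu \, \Box \, \vec{r} = (1-r^0)\mu^0 + r^0(1-r^1)\mu^1 + r^0 r^1 \mu^2.$$
The three coefficients are positive and sum to 1, since
$(1-r^0) + r^0(1-r^1) + r^0 r^1 = 1$. Moreover, the ratio of the
coefficient of $\mu^1$ to that of $\mu^0$ is $r^0(1-r^1)/(1-r^0)$, and the
ratio of the coefficient of $\mu^2$ to that of $\mu^1$ is $r^1/(1-r^1)$;
both go to 0 as $\vec{r} \rightarrow (0,0)$. Roughly speaking, then, along
a sequence $\vec{r}^j \rightarrow (0,0)$, each measure in the LPS
eventually outweighs its successors by an arbitrarily large factor, which
matches the lexicographic reading of $\vecmu$.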
2497:
2498: Kohlberg and Reny \citeyear{KR97} define a notion of strong independence with
2499: respect to what they call {\em relative probability spaces}, which are
2500: closely related to Popper spaces of the form
2501: $(W,2^W,2^W-\{\emptyset\},\mu)$, where all subsets of $W$ are measurable and
2502: it is possible to condition on all nonempty sets.
2503: %joe3
2504: Their definition is similar in spirit to the characterization of BBD
2505: strong independence given in Theorem~\ref{BBDstrongindependence}.
2506: For ease of exposition, I recast their definition in terms of Popper spaces.
2507: $X_1, \ldots, X_n$ are {\em KR-strongly independent\/} with respect to the
2508: Popper space $(W,\F,\F', \mu)$, where $\F'$ includes all events of the
form $X_i = x$ for $x \in \V(X_i)$, if there exists a sequence of
2510: standard probability measures $\mu_1, \mu_2, \ldots$ such that $\mu_j
2511: \rightarrow \mu$, and for all $j = 1, 2, 3, \ldots$,
2512: $\mu_j(U) > 0$ for $U \in \F'$ and $X_1,
2513: \ldots, X_n$ are independent with respect to $\mu_j$.
2514: As Kohlberg and Reny show,
2515: KR-strong independence implies approximate independence%
2516: \footnote{They actually show only that it implies weak independence, but
2517: the same argument shows that it implies approximate independence.}
2518: and is, in general, strictly stronger.
2519:
The following theorem characterizes KR-strong independence in terms of
2521: NPS's.
2522:
2523: \thm\label{KRindependence}
2524: %joe3
2525: %$X_1, \ldots, X_n$ are strongly independent with respect to the Popper
2526: $X_1, \ldots, X_n$ are KR-strongly independent with respect to the Popper
2527: space $(W,\F,\F',\mu)$ iff there
2528: exists an NPS $(W,\F,\nu)$ such that
2529: %joe4
2530: %$\FNP(W,\F,\nu) = \mu$ and $X_1, \ldots,
2531: $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$ and $X_1, \ldots,
2532: X_n$ are independent with respect to $(W,\F,\nu)$.
2533: \ethm
It follows from the proof that we can require the range of $\nu$ to be an
elementary extension of the reals, but this is not necessary.
2536:
2537: %joe3:
2538: \commentout{
2539: There is a sense in which KR-strong independence is weaker than
2540: BBD-strong independence.
2541: Define $X_1, \ldots, X_n$ to be KR-strongly independent (resp.,
2542: BBD-strongly independent) with respect to
2543: NPS $\nu$ if there exists an NPS $\nu'$ such that $\nu \simeq \nu'$
2544: (resp., $\nu \aeq \nu'$) and $X_1, \ldots, X_n$ are independent with
2545: respect to $\nu'$. As we have seen, $\simeq$ is a coarser notion of
2546: equivalence than $\sim
2547: }
2548: %joe3: \end{commentout}
2549:
2550: %We must be a little careful regarding the interepretation of
2551: %Theorem~\ref{KRindependence} since, as observed after Theorem~\ref{FNP},
2552: %the map $\FNP$ acts the same on all $\simeq$ equivalence classes, and
2553: %$\simeq$ is a rather coarse equivalence relation. I discuss this issue
2554: %in more detail after the proof of Theorem~\ref{KRindependence} in the
2555: %appendix.
2556:
2557: %joe3:
2558: \commentout{
2559: Now I can compare the definitions given here to those discussed by BBD,
2560: Hammond, and Kohlberg and Reny. BBD define a (standard or nonstandard)
2561: probability measure
2562: $\nu$ on $W= W_1 \times \cdots \times W_n$ to be a product measure if
2563: there exist measures $\nu_i$ on $W_i$ for $i = 1, \ldots, n$, such that
2564: such that $\nu((w_1, \ldots, w_n)) = \nu_1(w_1) \times \cdots \times
2565: \nu_n(w_n)$. If $X_i$ is the random variable that projects on to the
2566: $i$th component, then it is easy to see that $\nu$ is a product measure
2567: iff $X_1, \ldots, X_n$ are independent.
2568:
2569:
2570: Hammond mainly focuses on Popper spaces, and follows BBD's lead
2571: in considering when a Popper space can be, in a sense, viewed as a
2572: product measure. He defines a notion of conditional independence of a
2573: Popper space defined on $W = W_1 \times \cdots \times W_n$ which is
2574: similar in spirit to the notion of independence of random variables in
2575: Popper spaces as defined here. In fact, it is straightforward to show
2576: that the Popper space $(W_1 \times \cdots \times W_n, \F,\F',\mu)$ is
2577: conditionally independent in Hammond's sense iff the projections
2578: $X_1, \ldots, X_n$ are independent with respect to the Popper space, in
2579: the sense defined here. }
2580: %\end{commentout}
2581:
2582: %joe3
2583: %Finally, I compare these definitions to those discussed by Kohlberg and
2584: %Reny \citeyear{KR97}.
2585: %As I said,
2586: %Kohlberg and Reny defined weak independence and strong independence of
2587: %random variables with respect to relative probability spaces.%
2588: Kohlberg and Reny show that their notions of weak and strong independence
2589: can be used to characterize
2590: Kreps and Wilson's
2591: \citeyear{KW82} notion of sequential equilibrium.
2592: BBD \citeyear{BBD2} use their notion of strong independence in their
2593: characterization of perfect equilibrium and proper equilibrium for games
2594: with more than two players.
2595: %joe4
Finally, Battigalli \citeyear{Bat96} uses approximate independence (or,
2597: equivalently, independence in cps's) to characterize sequential
2598: equilibrium.
2599: %joe3
2600: %Thus, all these notions
2601: %play a significant role in characterizing concepts of great
2602: %relevance to game theory.
2603: \commentout{
2604: Sequential equilibrium uses the notion of an {\em assessment}.
2605: Given a game $\Gamma$, an assessment is
2606: a pair $(\rho,\pi)$, where $\rho$ is a
2607: function that assigns to each information set $I$ in $\Gamma$ a probability
2608: measure $\rho(I)$ on the set of histories in that information set, and $\pi$
2609: assigns to each a node $x$ a probability $\pi(x)$ on the possible next
2610: moves at that node so that $\pi(x) = \pi(x')$ for two nodes $x$ and $x'$
2611: in the same information set. Roughly speaking, an assessment is {\em
2612: consistent\/} if, whenever information set $I$ immediately follows
2613: information set $I'$, if $I$ can be reached from $I'$ with positive
2614: probability, then $\rho(I)$ is obtained from
2615: $\rho(I')$ by the obvious computation; if $I$ is not reachable from
2616: $I'$ with positive probability, then $\rho(I)$ must be the
2617: limit of the probabilities on $I$ induced by imposing small trembles on the
2618: moves (so that all of them have some small positive probability, which
2619: goes to 0).
2620: Kohlberg and Reny \cite{KR97} show that $(\rho,\pi)$ is an
2621: assessment (i.e., $\pi(x) = \pi(x')$ for all $x$ and $x'$ in the same
2622: information set) iff $S_1, \ldots, S_n$ are weakly independent and
2623: $(\rho, \pi)$ is a consistent iff $S_1, \ldots, S_n$ are strongly
2624: independent.
2625: }
2626: %Thus, all of weak independence, approximate independence, and strong
2627: %independence
2628:
2629:
2630: \section{Discussion}\label{discussion}
2631: As the preceding discussion shows, there is a sense in which NPS's
2632: are more general than both Popper spaces and LPS's.
2633: %joe6
2634: It would be of interest to get a natural characterization of those NPS's
2635: that are equivalent to Popper spaces and LPS's; this remains an open
2636: problem.
2637: LPS's are more expressive than Popper measures in finite spaces and in
2638: infinite spaces where we assume countable additivity (in the sense
2639: discussed at the end of Section~\ref{PopperNPS}), but without assuming
2640: countable additivity, they are incomparable,
2641: %joetark:
2642: as Examples~\ref{counter1} and~\ref{counter2} show.
2643: %as Example~\ref{counter1} shows.
2644: %Although NPS's are equivalent to LPS's in
2645: %finite state spaces, NPS's have other advantages.
2646: %For example, as
2647: %pointed out by Hammond \citeyear{Hammond94} and BBD, it is also easier
2648: %to define
2649: %independence in NPS's.
2650: %joe2:
2651: Since all of these approaches to representing uncertainty have been
used in characterizing solution concepts in extensive-form games and
2653: notions of admissibility, the results here suggest that it is worth
2654: considering the extent to which these results depend on the particular
2655: representation used.
2656:
2657: It is worth stressing here that this notion of equivalence depends on
2658: the fact that I have been viewing cps's, LPS's, and NPS's as
2659: representations of uncertainty. But, as Asheim \citeyear{Asheim06}
2660: emphasizes, they can also be viewed as representations of conditional
2661: preferences. Example~\ref{McGee} shows that, even in finite spaces,
2662: NPS's and LPS's can express preferences that cps's cannot. However, as
2663: %joe9
2664: %Asheim and Pereira \citeyear{AP05} point out, in finite spaces, cps's
2665: Asheim and Perea \citeyear{AP05} point out, in finite spaces, cps's
2666: can also represent conditional preferences that cannot be represented by
2667: LPS's and NPS's. See \cite{Asheim06} for a detailed discussion of the
2668: expressive power of these representations with respect to
2669: conditional preferences.
2670:
2671: Although NPS's are the most expressive of the three approaches I have
2672: considered, they have some disadvantages. In particular,
2673: working with a nonstandard probability measure requires defining and
2674: working with a non-Archimedean field.
2675: LPS's have the advantage of using just standard probability measures.
2676: Moreover, their lexicographic structure may give useful insights.
2677: It seems to be worth considering the
2678: extent to which LPS's can be generalized so as to increase their
2679: expressive power.
2680: %joe8
2681: %I am currently exploring LPS's ordered by an arbitrary (not necessarily
2682: %well-founded) index set. It seems that such LPS's
2683: %may be useful in understanding
2684: %%characterizing iterated deletion of weakly dominated strategies.
2685: In particular, it may be of interest to consider LPS's indexed by
2686: partially ordered and not necessarily well-founded sets, rather than
2687: just LPS's indexed by the ordinals.
2688: For example,
2689: Brandenburger, Friedenberg, and Keisler~\citeyear{BFK04} characterize
2690: $n$ rounds of
2691: iterated deletion using finite LPS's, for any $n$.
2692: %joe8
2693: %it seems that that these results are more cleanly stated using infinite
2694: %LPS's ordered by the
2695: Rather than using a sequence of (finite) LPS's of different lengths to
2696: characterize (unbounded) iterated deletion,
2697: it seems that a result similar in spirit can be obtained using a single LPS
2698: indexed by the (positive and negative) integers.
2699: %I hope to report on this in future work.
2700:
2701: %One final point: defining belief.
2702: I conclude with a brief discussion of a few other issues raised by this
2703: paper.
2704: \begin{itemize}
2705: \item Belief:
2706: The connections between LPS's, NPS's, and cps's are relevant to the
2707: notion of belief.
2708: %joe3
2709: There are two standard notions of belief that can be defined in LPS's.
2710: %joe6
2711: %Say that $U$ is {\em strongly believed\/} in LPS $\vecmu$ of length
2712: Say that $U$ is a {\em certain belief\/} in LPS $\vecmu$ of length
2713: $\alpha$ if $\mu_\beta(U) = 1$ for all $\beta < \alpha$; $U$ is {\em
2714: weakly believed\/} if $\mu_0(U) = 1$.
2715: Brandenburger, Friedenberg, and Keisler \citeyear{BFK04} defined a
2716: %joe3
2717: third
2718: notion of belief,
intermediate between weak belief and certain belief,
2720: % using LPS's
2721: and provided an elegant decision-theoretic justification of it.
2722: According to their definition, an agent {\em assumes $U$
2723: %joe3
2724: %in LPS
2725: in
2726: %$\vecmu$\/} if there is some $j \le m$ such that (a) $\mu_i(U) = 1$ for all
2727: $\vecmu$\/} if there is some $\beta < \alpha$ such that (a) $\mu_{\beta'}(U) =
2728: 1$ for all
2729: $\beta' \le \beta$, (b) $\mu_{\beta''}(U) = 0$ for all $\beta'' > \beta$, and
2730: (c) $U \subseteq \union_{\beta' \le \beta} \Supp(\mu_{\beta'})$, where
2731: $\Supp(\mu_{\beta'})$
2732: denotes the support of
2733: the probability measure $\mu_{\beta'}$. (Condition (c) is unnecessary if $W$
2734: is finite, given Brandenburger, Friedenberg, and Keisler's assumption that
2735: $W = \union_{\beta'} \Supp(\mu_{\beta'})$.)
2736: %joe3
2737: %The usual notion of belief in probability
2738: %spaces is that $U$ is believed with respect to probability measure $\mu$
2739: %if $\mu(U) = 1$.
2740: %Assumption can be viewed as a strong notion of belief.
2741: %joe3:
2742: There are straightforward analogues of certain belief and weak belief in
Popper spaces. $U$ is a certain belief in a Popper space
2744: $(W,\F,\F',\mu)$ if $\mu(U \mid V) = 1$ for all $V \in \F'$; $U$ is
2745: weakly believed if $\mu(U \mid V) = 1$ for all $V \in \F'$ such that
2746: $\mu(V) > 0$.
2747: %joe6
2748: Analogues of this notion of assumption have been considered elsewhere in
2749: the literature.
2750: Van Fraassen
2751: \citeyear{vF95} independently defined a
2752: %joe6
2753: %strong
2754: notion of belief using Popper spaces; in a finite state space, an event
2755: is what van Fraassen calls a \emph{belief core} iff it is assumed in
2756: the sense of Brandenburger, Friedenberg,
2757: and Keisler. Battigalli and Siniscalchi's \citeyear{BS02} notion of
2758: \emph{strong belief} is also essentially equivalent.
2759: Assumption also corresponds to Stalnaker's \citeyear{Stal98} notion of
\emph{absolutely robust belief} and Asheim and S{\o}vik's \citeyear{AS05}
2761: notion of \emph{robust belief}.
2762: Asheim and S{\o}vik \citeyear{AS05} do a careful comparison of all these
2763: notions (and others).
2764: %joe3
2765: %in a finite state space,
2766: %an event is what van Fraassen calls a {\em belief core\/}
2767: %iff it is assumed in the sense of Brandenburger and Keisler.
2768: %%joe4
2769: %}
2770:
2771: %joe3
2772: %That there should be equivalent notions of strong belief in the
2773: %That there should be equivalent notions of belief in the
2774: %context of LPS's and Popper spaces is perhaps not that surprising, in
2775: %light of the close connection between them.
2776: It is easy to define analogues of certain and weak belief in NPS's:
$U$ is a certain belief if $\nu(U) = 1$; $U$ is weakly believed if
2778: $\stand{\nu(U)} = 1$.
2779: The results of this paper
2780: %joe3
2781: %suggest that it may also be worth considering such strong notions of
2782: suggest that it may also be worth
2783: %considering what the analogue of these notions is
2784: %in the context of NPS's.
2785: investigating an analogue of assumption in NPS's.
2786:
2787: \item Nonstandard utility:
2788: In this paper, while I have allowed probabilities to be
2789: lexicographically ordered or nonstandard, I have implicitly assumed that
2790: utilities are standard real numbers (since I have restricted to
2791: real-valued random variables).
2792: There is a tradition in decision theory going back to Hausner
2793: \citeyear{Hausner54} and continued recently in a sequence of papers by
Fishburn and LaValle (see \cite{FL99} and the references therein)
2795: %joe2
2796: and Hammond \citeyear{Hammond99} of
2797: considering nonstandard or lexicographically-ordered utilities. I have
2798: not considered the relationship between these ideas and the ones
2799: considered here, but there may be some fruitful connections.
2800:
2801: \item Countable additivity for NPS's:
2802: Countable additivity for standard
2803: probability measures is essentially a continuity condition. The
2804: fact that $\sum_{i=1}^\infty a_i$ may not be the least upper bound of
2805: the partial sums $\sum_{i=1}^n a_i$ in an NPS leads to a certain lack of
2806: continuity in decision-making. For example, let $W = \{w_1, w_2, \ldots\}$.
2807: Consider a nonstandard probability measure $\nu$ such that $\nu(w_1) =
2808: 1/3 -\epsilon$, $\nu(w_2) = 1/3 + \epsilon$, and $\nu(w_{k+2}) = 1/(3
2809: \times 2^k)$, for $k = 1, 2, \ldots$. Let $U_n = \{w_3, \ldots, w_n\}$
2810: and let $U_\infty = \{w_3, w_4, \ldots \}$. Clearly $\nu(U_n) \tendsto
2811: \nu(U_\infty) = 1/3$. However, $\nu(U_n) < \nu(w_1)$ for all $n$.
2812: Thus, $E_\nu(\chi_{\{w_1\}}) > E_\nu(\chi_{U_n})$ for all $n \ge 3$ although
2813: $E_\nu(\chi_{\{w_1\}}) < E_\nu(\chi_{U_\infty})$.
2814:
2815: Not surprisingly, the same situations can be modeled with LPS's.
2816: Consider the LPS $(\mu_1, \mu_2)$, where
2817: %joe9
2818: %$\mu_1 = \stand{\nu_1}$, $\mu(w_1) = 0$,
2819: $\mu_1 = \stand{\nu}$, $\mu_2(w_1) = 0$,
2820: $\mu_2(w_2) = 2/3$, and $\mu_2(w_{k+2}) = 1/(3\times 2^k)$ for $k = 1,
2821: 2, \ldots$. It is easy to see
2822: that again $E_{\vecmu}(\chi_{\{w_1\}}) > E_{\vecmu}(\chi_{U_n})$ for all $n
\ge 3$ although $E_{\vecmu}(\chi_{\{w_1\}}) < E_{\vecmu}(\chi_{U_\infty})$.
2824: (A similar example can be obtained using SLPS's, by replacing each world
2825: $w_i$ by a pair of worlds $w_i', w_i''$, where $w_i'$ is in the support
2826: of $\mu_1$ and $w_i''$ is in the support of $\mu_2$.)
2827:
2828: An analogous continuity problem arises even in finite domains. Let $W = \{w_1, w_2, w_3\}$ and consider a sequence of
probability measures $\nu_n$ such that $\nu_n(w_1) = 1/3
-1/n$, $\nu_n(w_2) = 1/3 - \epsilon$, and $\nu_n(w_3) = 1/3 + 1/n +
\epsilon$. Clearly $\nu_n
\tendsto \nu$, where $\nu(w_1) = 1/3$, $\nu(w_2) = 1/3 - \epsilon$, and
$\nu(w_3) = 1/3 + \epsilon$. However, $E_{\nu_n}(\chi_{\{w_1\}}) <
E_{\nu_n}(\chi_{\{w_2\}})$ for all $n$, while $E_\nu(\chi_{\{w_1\}}) >
E_\nu(\chi_{\{w_2\}})$. Again, the same situation can be modeled using LPS's
2836: (and even SLPS's).
2837:
2838:
2839: %joe2
2840: %Is this lack of continuity a problem?
2841: Of course, continuity plays a significant role in standard
2842: axiomatizations of SEU, and is vital in proving the existence of a Nash
2843: equilibrium. None of the uses of continuity that I am familiar with
2844: have the specific form of this example, but I believe it is worth
2845: considering further the impact of this lack of continuity.
2846: %I am not sure, but I believe it deserves further thought.
2847: \end{itemize}
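
The comparisons in the first countable-additivity example above can be
checked directly; the following is a sketch under the assumptions that
$\epsilon$ is a positive infinitesimal and that $\nu(U_\infty) = 1/3$, as
stated in the example. Summing the finite geometric series gives
$$\nu(U_n) = \sum_{k=1}^{n-2} \frac{1}{3 \times 2^k} = \frac{1}{3} -
\frac{1}{3 \times 2^{n-2}} \quad \mbox{for $n \ge 3$,}$$
so that
$$\nu(w_1) - \nu(U_n) = \frac{1}{3 \times 2^{n-2}} - \epsilon > 0
\quad \mbox{and} \quad \nu(U_\infty) - \nu(w_1) = \epsilon > 0.$$
The first inequality holds because $\epsilon$ is smaller than every
positive standard real. For the LPS $\vecmu = (\mu_1, \mu_2)$, assume that
the expectation of $\chi_U$ is the pair $(\mu_1(U), \mu_2(U))$, with pairs
compared lexicographically. Then
$$E_{\vecmu}(\chi_{\{w_1\}}) = (1/3, 0), \quad
E_{\vecmu}(\chi_{U_n}) = \left(\frac{1}{3} - \frac{1}{3 \times 2^{n-2}},
\frac{1}{3} - \frac{1}{3 \times 2^{n-2}}\right), \quad
E_{\vecmu}(\chi_{U_\infty}) = (1/3, 1/3).$$
The first components already give $E_{\vecmu}(\chi_{\{w_1\}}) >
E_{\vecmu}(\chi_{U_n})$ for all $n \ge 3$, while the tie in the first
component between $\chi_{\{w_1\}}$ and $\chi_{U_\infty}$ is broken by the
second component in favor of $\chi_{U_\infty}$.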
2848:
2849: \paragraph{Acknowledgments:} I'd like to thank Adam Brandenburger and
2850: Peter Hammond for a number of very enlightening discussions, Bob
2851: Stalnaker for pointing out Example~\ref{counter1}, Brian Skyrms for
2852: pointing me to Hammond's work, Bas van Fraassen for pointing
2853: me to Spohn's work, Amanda Friedenberg for her careful reading of an
2854: earlier draft, her many useful comments, and for encouraging me to try
2855: to understand what my results had to say about Battigalli and
Siniscalchi's work,
2857: and Horacio Arlo-Costa, Geir Asheim, Larry Blume, Adam Brandenburger,
2858: Eddie Dekel, and the anonymous reviewers for a number of
2859: useful comments on earlier drafts of this paper.
2860:
2861:
2862: %joetark:
2863:
2864: \appendix
2865:
2866: \section{Appendix: Proofs}
2867: In this section, I prove all the results claimed in the main part of the
2868: paper. For the convenience of the reader, I repeat the statements of
2869: the results.
2870:
2871: \medskip
2872:
2873: \othm{FCPfin}
2874: %joe9
2875: %If $W$ is finite, the map $\FCP$ is a bijection from $\SLPS(W,\F)$ to
2876: %$\Popper(W,\F)$.
2877: %joe10
2878: %If $W$ is finite and $(\F,\F')$ is a Popper algebra over $W$, then
If $W$ is finite and $(\F,\F')$ is a Popper algebra over $W$, then
2880: $\FCP$ is a bijection from $\SLPS(W,\F,\F')$ to $\Popper(W,\F,\F')$.
2881: \eothm
2882:
2883: \medskip
2884:
2885: \prf The first step is to show that $\FCP$ is an injection.
2886: If $\vecmu, \vecmu' \in \SLPS(W,\F,\F')$ and $\vecmu \ne \vecmu'$, let
2887: $\mu = \FCP(\W,\F,\vecmu)$, and let $\mu' = \FCP(\W,\F,\vecmu')$. Let
2888: $i$ be the least index such that $\mu_i \ne \mu'_i$.
2889: There is some set $U$ such that $\mu_i(U) \ne \mu'_i(U)$.
Let $U_i$ be a set such that $\mu_i(U_i) = 1$ and $\mu_j(U_i) = 0$ for $j <
i$; since $\vecmu$ is an SLPS, such a set $U_i$ exists. Similarly, let
$U_i'$ be such that $\mu_i'(U_i') = 1$ and $\mu_j'(U_i') = 0$ for $j <
i$. Since $\mu_j = \mu_j'$ for all $j < i$, we must have $\mu_j(U_i \union
U_i') = \mu_j'(U_i \union U_i') = 0$ for all $j < i$.
Clearly $\vecmu(U_i \union U_i') > 0$, so $U_i \union U_i' \in \F'$.
2896: Moreover,
2897: $\mu(U \mid U_i \union U_i') = \mu_i(U \mid U_i \union U_i') =
2898: \mu_i(U)$. Similarly, $\mu'(U \mid U_i \union U_i') = \mu_i'(U)$.
2899: Hence, $\mu \ne \mu'$.
2900:
2901: To show that $\FCP$ is a surjection, given a cps $\mu$, let $\vecmu =
2902: (\mu_0, \ldots, \mu_k)$ be the LPS constructed in the main text. We
2903: must show that
2904: $\FCP(\vecmu) = (W,\F,\F',\mu)$. Suppose that $\FCP(\vecmu) =
2905: (W,\F,\F'',\mu')$.
2906: I first show that $\F' = \F''$. Suppose that $V \in \F''$. Then
2907: $\mu_i(V) > 0$ for some $i$. Thus, $\mu(V \mid U_i) > 0$. Since
2908: $U_i \in \F'$, it follows that $V \in \F'$. Thus, $\F'' \subseteq \F'$.
2909:
2910: To show that $\F' \subseteq \F''$, first note that, by
2911: construction, $\mu(U_j \mid \overline{U_0 \union \ldots \union U_{j-1}}
2912: ) = 1$.
2913: %Since $U_j \union \ldots \union U_k \subseteq \overline{U_0 \union
2914: %\ldots \union U_{j-1}}$, it follows from CP3 that
2915: %$$1= \mu(U_j \mid \overline{U_0 \union \ldots \union U_{j-1}} ) = \mu(U_j \mid
2916: %U_j \union \ldots \union U_k) \times \mu(U_j \union \ldots \union U_k \mid
2917: %\overline{U_0 \union \ldots \union U_{j-1}}).$$
2918: %Thus, $\mu(U_j \mid \overline{U_0 \union \ldots \union U_{j-1}} ) = 1$ and
2919: %$\mu(U_{j'} \mid \overline{U_0 \union \ldots \union U_{j-1}} ) = 0$ if $j' >
2920: %j$.
2921: It easily follows that if $V \subseteq \overline{U_0 \union \ldots
2922: \union U_{j-1}}$
2923: then $$\mu(V \mid \overline{U_0 \union \ldots \union U_{j-1}}) = \mu(V
2924: \inter U_j \mid \overline{U_0 \union \ldots \union U_{j-1}}).$$
2925: Thus, by CP3,
2926: $$\mu(V \mid \overline{U_0 \union \ldots \union U_{j-1}}) =
2927: \mu(V \inter U_j \mid \overline{U_0 \union \ldots \union U_{j-1}}) = \mu(V \mid
2928: U_j) \times
2929: \mu(U_j \mid \overline{U_0 \union \ldots \union U_{j-1}}),$$ so
2930: %if $V \subseteq \overline{U_0 \union \ldots \union U_{j-1}}$, then
2931: \begin{equation}\label{eq1}
2932: \mu(V \mid U_j) = \mu(V \mid \overline{U_0 \union \ldots \union U_{j-1}}).
2933: \end{equation}
2934:
2935: Now suppose that $V \in \F'$.
2936: Clearly $V \inter (U_0 \union \ldots \union U_k) \ne
2937: \emptyset$, for otherwise $V \subseteq \overline{U_0 \union \ldots
2938: \union U_k}$, contradicting the fact that $\overline{U_0 \union \ldots
2939: \union U_k} \notin \F'$. Let $j_V$ be the smallest index $j$ such that $V
2940: \inter U_j \ne \emptyset$.
2941: I claim that $\mu(V \mid \overline{U_0 \union \ldots \union U_{j_V - 1}}) \ne
2942: 0$. For if $\mu(V \mid \overline{U_0 \union \ldots \union U_{j_V - 1}}) =
2943: 0$, then $\mu(U_{j_V} - V \mid \overline{U_0 \union \ldots \union U_{j_V -
2944: 1}}) = 1$, contradicting the definition of $U_{j_V}$
2945: as the smallest set $U'$ such that $\mu(U' \mid \overline{U_0 \union
2946: \ldots \union U_{j_V - 1}}) = 1$. Moreover, since
$V \subseteq \overline{U_0 \union \ldots \union U_{j_V-1}}$, it follows
2948: from (\ref{eq1}) that
2949: $\mu(V \mid U_{j_V}) = \mu(V \mid \overline{U_0 \union
2950: \ldots \union U_{j_V - 1}}) > 0$. Thus, $\mu_{j_V}(V) > 0$, so $V \in
2951: \F''$.
2952:
2953: This argument can be extended to show that $\mu(V' \mid V) = \mu'(V' \mid
2954: V)$ for all $V' \in \F$.
2955: Since $V \inter U_i = \emptyset$ for $i < j_V$, it follows that
2956: $\mu'(V' \mid V) = \mu_{j_V}(V' \mid V)$.
2957: By CP3, $\mu(V' \mid V) \times \mu(V \mid \overline{U_0 \union \ldots
2958: \union U_{j_V
2959: - 1}}) = \mu(V'\inter V \mid \overline{U_0 \union \ldots \union U_{j_V - 1}})$.
2960: By (\ref{eq1}) and the fact that $\mu(V \mid U_{j_V}) > 0$, it follows
2961: that $\mu(V' \mid V) = \mu(V'\inter V \mid U_{j_V})/\mu(V \mid
2962: U_{j_V})$,
2963: %joe9
2964: %i.e., that $\mu(V' \mid V) = \mu_{j_V}(V' \mid V)$.
2965: that is, that $\mu(V' \mid V) = \mu_{j_V}(V' \mid V)$.
2966: \eprf
2967:
2968:
2969: \bigskip
2970:
2971: Although Theorem~\ref{infiso} was proved by Spohn \citeyear{Spohn86}, I
2972: include a proof here as well, to make the paper self-contained.
2973:
2974:
2975: \othm{infiso} For all $W$, the map $\FCP$ is a bijection from
2976: $\SLPS^c(W,\F,\F')$
2977: to $\Popper^c(W,\F,\F')$. \eothm
2978:
2979: \medskip
2980:
2981: \prf Again, the difficulty comes in showing that $\FCP$ is onto.
2982: As it says in the main text, given a Popper space $(W,\F,\F',\mu)$, the
2983: idea is to
2984: construct sets $U_0, U_1, \ldots$ and an LPS $\vecmu$ such that
2985: $\mu_\beta(V)=\mu(V \mid U_\beta)$, and show that $\FCP(W,\F,\vecmu) =
2986: (W,\F,\F',\mu)$. The construction is somewhat involved.
2987:
2988: As a first step, put an order $\le$ on sets in
2989: $\F'$ by defining $U \le V$ if
2990: $\mu(U \mid U \union V) > 0$.
2991: %joe1
2992: (Essentially, the same order is considered by van Fraassen \citeyear{vF76}.)
2993:
2994: \lem\label{lem0} $\le$ is transitive. \elem
2995:
2996: \prf
2997: By definition, if $U \le V$ and $V \le V'$, then $\mu(U \mid U \union V) >0$
2998: and $\mu(V \mid V \union V') > 0$. To see that $\mu(U \mid U \union
2999: V') > 0$, note that
$\mu(U \mid U \union V \union V') + \mu(V \mid U \union V \union V') + \mu(V' \mid U
\union V \union V') \ge 1$, so at least one of $\mu(U \mid U \union V \union
3002: V')$, $\mu(V \mid U \union V \union V')$, or $\mu(V' \mid U \union V \union V')$
3003: is positive. I consider each of the cases separately.
3004:
3005: \paragraph{Case 1:} Suppose that $\mu(U \mid U \union V \union V') > 0$. By CP3,
3006: $$\mu(U \mid U \union V \union V') = \mu(U \mid U \union V') \times \mu(U \union
3007: V' \mid U \union V \union V').$$
3008: Thus, $\mu(U \mid U \union V') > 0$, as desired.
3009:
3010: \paragraph{Case 2:} Suppose that $\mu(V \mid U \union V \union V') > 0$.
3011: By assumption, $\mu(U \mid U \union V) > 0$; since $\mu(V \mid U \union V \union
3012: V') > 0$, it follows that $\mu(U \union V \mid U \union V \union V') > 0$.
3013: Thus, by CP3,
3014: $$\mu(U \mid U \union V \union V') = \mu(U \mid U \union V) \times \mu(U \union
3015: V \mid U \union V \union V') > 0.$$
3016: Thus, case 2 can be reduced to case 1.
3017:
3018: \paragraph{Case 3:} Suppose that $\mu(V' \mid U \union V \union V') > 0$.
3019: By assumption, $\mu(V \mid V \union V') > 0$; since $\mu(V' \mid U \union V \union
3020: V') > 0$, it follows that $\mu(V \union V' \mid U \union V \union V') > 0$.
3021: Thus, by CP3,
3022: $$\mu(V \mid U \union V \union V') = \mu(V \mid V \union V') \times \mu(V \union
3023: V' \mid U \union V \union V') > 0.$$
3024: Thus, case 3 can be reduced to case 2.
3025:
3026:
3027: This completes the proof, showing that $\le$ is transitive.
3028: \eprf
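
All three cases in the proof above use CP3 in the same nested form. As a
sketch, assume that CP3, as stated in the main text, reads
$\mu(U_1 \inter U_2 \mid U_3) = \mu(U_1 \mid U_2 \inter U_3) \times
\mu(U_2 \mid U_3)$ whenever $U_2 \inter U_3 \in \F'$. Taking $U_1 = A$,
$U_2 = B$, and $U_3 = C$, where $A \subseteq B \subseteq C$ and
$B, C \in \F'$ (the names $A$, $B$, and $C$ are introduced only for this
illustration), gives the chain rule
$$\mu(A \mid C) = \mu(A \mid B) \times \mu(B \mid C).$$
Case 1 is the instance $A = U$, $B = U \union V'$, and
$C = U \union V \union V'$; Case 2 is the instance $A = U$,
$B = U \union V$, and $C = U \union V \union V'$; and Case 3 is the
instance $A = V$, $B = V \union V'$, and $C = U \union V \union V'$.
In particular, if $\mu(A \mid C) > 0$, then both factors on the right are
positive, and conversely.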
3029:
3030: Define $U \sim V$ if $U \le V$ and $V \le U$.
3031:
3032: \lem\label{lem1} $\sim$ is an equivalence relation on $\F'$. \elem
3033:
3034: \prf It is immediate from the
3035: definition that $\sim$ is reflexive and symmetric; transitivity follows
3036: from the transitivity of $\le$. \eprf
3037:
3038: R\'{e}nyi \citeyear{Renyi56}
3039: and van Fraassen \citeyear{vF76} also considered the $\sim$ relation in
3040: their papers, and the argument that $\le$ is transitive is similar in
3041: spirit to R\'{e}nyi's argument that $\sim$ is transitive.
3042: However, the rest of this proof diverges from those of R\'{e}nyi and van
3043: Fraassen.
3044:
3045: Let $[U]$ denote the $\sim$-equivalence class of $U$, and
3046: let $\F'/\nsim = \{[U]: U \in \F'\}$.
3047:
3048:
3049: \lem\label{lem2} Each equivalence class $[V] \in \F'/\nsim$ is closed under
3050: countable unions. \elem
3051:
3052: \prf Suppose that $V_1, V_2, \ldots \in [V]$. I must show that
$\union_{i=1}^\infty V_i \in [V]$. Since $\mu(\union_{i=1}^\infty V_i \mid
\union_{i=1}^\infty V_i) = 1$, clearly $\union_{i=1}^\infty V_i \le V_j$
for all $j$. Suppose, by way of contradiction, that
$V_j \not\le \union_{i=1}^\infty V_i$ for some $j$. Since $\le$ is
transitive and $V_j \le V_{j'}$ for all $j'$, it follows that
$V_{j'} \not\le \union_{i=1}^\infty V_i$ for all $j'$.
Thus, $\mu(V_j \mid \union_{i=1}^\infty V_i) = 0$ for all $j$.
3058: But then, by countable additivity,
3059: %joe9
3060: % $$1 = \mu(\union_{i=1}^\infty V_i) \mid \union_{i=1}^\infty V_i) \le
3061: $$1 = \mu(\union_{i=1}^\infty V_i \mid \union_{i=1}^\infty V_i) \le
3062: \sum_{j=1}^\infty \mu(V_j \mid \union_{i=1}^\infty V_i) = 0,$$
3063: a contradiction. Thus, $[V]$ is closed under countable unions.
3064: \eprf
3065:
3066:
3067: \commentout{
3068: Next, observe that we can define a total preorder $\preceq$ (i.e., a
3069: reflexive and transitive relation) on $[V]$ using
3070: the same techniques as used to define $\le$. Namely,
3071: $V_1 \preceq V_2$ if $\mu(V_1 \mid V_1 \union V_2) \le \mu(V_2 \mid V_1 \union
3072: V_2)$. To see that $\preceq$ is transitive, suppose that $V_1, V_2, V_3
3073: \in [V]$ and $V_1 \preceq V_2$ and $V_2 \preceq V_3$.
3074: By CP3,
3075: \begin{equation}\label{eq2}
3076: \mu(V_i \mid V_1 \union V_2 \union V_3) = \mu(V_i \mid V_i \union V_j)
3077: \times \mu(V_i \union V_j \mid V_1 \union V_2 \union V_3),
3078: \end{equation}
3079: for all $i, j$.
3080: Since $V_1 \le V_2, applying (\ref{eq2}) first with $i = 1$ and $j=2$
3081: and then with $i=2$ and $j=1$,
3082: it follows that $\mu(V_1 \mid V_1 \union V_2 \union V_3)
3083: \le \mu(V_2 \mid V_1 \union V_2 \union V_3)$. Similarly, since $V_2 \le
3084: V_3$, it follows that $\mu(V_2 \mid V_1 \union V_2 \union V_3)
3085: \le \mu(V_3 \mid V_1 \union V_2 \union V_3)$.
3086: Thus,
3087: \begin{equation}\label{eq3}
3088: \mu(V_1 \mid V_1 \union V_2 \union V_3) \le \mu(V_3 \mid V_1 \union V_2 \union
3089: V_3).
3090: \end{equation}
3091: Since $[V]$ is
3092: closed under unions, it follows that $V_1 \union V_3 \in [V]$ and
3093: $V_1 \union V_2 \union V_3 \in [V]$. Thus, $\mu(V_1 \union V_3 \mid V_1
3094: \union V_2 \union V_3) > 0$. Now it immediately follows from
3095: (\ref{eq2}) and (\ref{eq3}) that $\preceq$ is transitive.}
3096:
3097: Fix an element $V_0 \in [V]$.
3098: \lem\label{lem3} $\inf \{\mu(V_0 \mid V_0 \union V'): V' \in [V]\} > 0$. \elem
3099:
3100: \prf Suppose that $\inf \{\mu(V_0 \mid V_0 \union V'): V' \in [V]\} = 0$.
Then there exist sets $V_1, V_2, \ldots \in [V]$ such that $\mu(V_0 \mid V_0 \union
V_n) < 1/n$. Since $[V]$ is closed under countable unions,
$\union_{i=0}^\infty V_i \in [V]$. Since $V_0 \sim \union_{i=0}^\infty V_i$, it
follows that $\mu(V_0 \mid \union_{i=0}^\infty V_i) > 0$.
3105: But, by CP3, $$\mu(V_0 \mid \union_{i=0}^\infty V_i) = \mu(V_0 \mid V_0 \union V_n)
3106: \times \mu(V_0 \union V_n \mid \union_{i = 0}^\infty V_i) \le \mu(V_0 \mid V_0
3107: \union V_n) \le 1/n.$$
3108: Since this is true for all $n > 0$, it follows that
3109: $\mu(V_0 \mid \union_{i=0}^\infty V_i) = 0$, a contradiction.
3110: \eprf
3111:
3112: The next lemma shows that each equivalence class in $\F'/\nsim$ has a
3113: ``maximal element''.
3114:
3115: \lem\label{lem4} In each equivalence class $[V]$, there is an element
3116: $V^*\in [V]$ such that $\mu(V^* \mid V' \union V^*) = 1$ for all $V' \in
3117: [V]$. \elem
3118:
3119: \prf Again, fix an element $V_0 \in [V]$. By Lemma~\ref{lem3}, there
3120: exists some $\alpha_V > 0$ such that $\inf \{\mu(V_0 \mid V_0 \union V'): V'
3121: \in [V]\} = \alpha_V$. Thus, there exist sets $V_1, V_2, V_3, \ldots \in
[V]$ such that $\mu(V_0 \mid V_0 \union V_n) < \alpha_V + 1/n$. By
3123: Lemma~\ref{lem2}, $V^* = \union_{i=0}^\infty V_i \in [V]$. By CP3,
3124: %\begin{equation}
3125: $$\mu(V_0 \mid V^*) = \mu(V_0 \mid V_0 \union V_n) \times \mu(V_0 \union V_n \mid V^*)
3126: \le \mu(V_0 \mid V_0 \union V_n) < \alpha_V + 1/n.$$
3127: %\end{equation}
3128: Thus, $\mu(V_0 \mid V^*) \le \alpha_V$. By choice of $\alpha_V$, it follows
3129: that $\mu(V_0 \mid V^*) = \alpha_V$.
3130:
3131: Suppose that
3132: $\mu(V^* \mid V' \union V^*) < 1$ for some $V' \in [V]$. But then, by CP3,
3133: $$\mu(V_0 \mid V' \union V^*) = \mu(V_0 \mid V^*) \times \mu(V^* \mid V' \union V^*) <
3134: \alpha_V,$$ contradicting the choice of $\alpha_V$. Thus,
3135: $\mu(V^* \mid V' \union V^*) = 1$ for all $V' \in [V]$. \eprf
3136:
3137: Define a
total order on these equivalence classes by taking $[U] \le [V]$ if
3139: $U' \le V'$ for some $U' \in [U]$ and $V' \in [V]$. It is easy to check
3140: (using the transitivity of $\le$) that if $U' \le V'$ for some $U' \in
3141: [U]$ and some $V' \in [V]$, then $U'' \le V''$ for all $U'' \in [U]$ and
3142: all $V'' \in [V]$.
3143:
3144: \lem $\le$ is a well-founded relation on $\F'/\nsim$. \elem
3145:
3146: \prf
3147: Note that if $[U] < [V]$, then $\mu(V \mid U \union V) = 0$. It now
3148: follows from countable additivity that $<$ is a well-founded order on
3149: these equivalence classes. For suppose that there exists an infinite
3150: decreasing sequence $[U_0] > [U_1] > [U_2] > \ldots$.
3151: Since $\F$ is a $\sigma$-algebra, $\union_{i=0}^\infty U_i \in \F$; since
3152: $\F'$ is closed under supersets, $\union_{i=0}^\infty U_i \in \F'$.
3153: By CP3,
3154: $$\mu(U_j \mid \union_{i=0}^\infty U_i) = \mu(U_j \mid U_{j} \union U_{j+1}) \times
3155: \mu(U_j \union U_{j+1} \mid \union_{i=0}^\infty U_i) = 0.$$ Let $V_0 = U_0$
3156: and, for $j > 0$, let
$V_j = U_j - (\union_{i =0}^{j-1} U_i)$. Clearly the $V_j$'s are
3158: pairwise disjoint, $\union_i U_i = \union_i V_i$, and
3159: $\mu(V_j \mid \union_{i=0}^\infty U_i) \le \mu(U_j \mid \union_{i=0}^\infty U_i) = 0$.
It now follows using countable additivity that
3161: %joe9
3162: %$$1 = \mu(\union_{i=0}^\infty U_i \mid \union_{i=0}^\infty U) =
3163: $$1 = \mu(\union_{i=0}^\infty U_i \mid \union_{i=0}^\infty U_i) =
3164: \sum_{i=0}^\infty \mu(V_i \mid \union_{i=0}^\infty U_i) = 0.$$
This is a contradiction, so the order on the equivalence classes is well-founded.
3166: \eprf
3167:
Because $\le$ is well-founded, there is an order-preserving bijection
$O$ from $\F'/\nsim$ to an initial segment of the ordinals (i.e., $[U]
\le [V]$ iff $O([U]) \le O([V])$).
3171: Thus, the equivalence classes can be enumerated using all the ordinals
3172: less than some ordinal $\alpha$. By Lemma~\ref{lem4}, there are
3173: sets $U_\beta$, $\beta < \alpha$, in $\F'$ such that if $O([U]) =
\beta$, then $U_\beta \in [U]$ and $\mu(U_\beta \mid U' \union U_\beta) = 1$
3175: for all $U' \in [U]$. Define an LPS $\vecmu = (\mu_0, \mu_1, \ldots )$ of
3176: length $\alpha$ by taking $\mu_\beta(V) = \mu(V \mid U_\beta)$. The choice
3177: of the $U_\beta$'s guarantees that this is actually an SLPS.
3178:
3179: It remains to show that $(W,\F,\F',\mu)$ is the result of applying
3180: %joe9
3181: %$F_{C \rightarrow P}$
3182: $\FCP$
3183: to $(W,\F,\vecmu)$. Suppose that instead $(W,\F,\F'',\mu')$ is the
3184: result. The argument that $\F'' \subseteq \F'$ is identical to that in
3185: the finite case: If $V \in \F''$, then
3186: $\mu_\beta(V) > 0$ for some $\beta$. Thus, $\mu(V \mid U_\beta) > 0$. Since
3187: $U_\beta \in \F'$, it follows that $V \in \F'$. Thus, $\F'' \subseteq \F'$.
3188:
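Now suppose that $V \in \F'$. In the rest of the proof, to avoid a clash
with the set $U$ used below, I write $V_\beta$ for the set $U_\beta$ chosen
above. Thus, $V \sim V_\beta$ for some $\beta <
\alpha$. It follows that $\mu(V \mid V_\beta) > 0$, so $V \in \F''$.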
3191:
3192: Finally, to show that $\mu(U \mid V) = \mu'(U \mid V)$, suppose that $\beta$ is
3193: such that $V \sim V_\beta$. It follows that $\mu(V \mid V_{\beta'}) = 0$ for
3194: $\beta' < \beta$ and $\mu(V \mid V_{\beta}) > 0$. Thus, by definition,
3195: $\mu'(U \mid V) = \mu_\beta(U \mid V)$. Without loss of generality, assume
3196: that $U \subseteq V$ (otherwise replace $U$ by $U \inter V$). Thus, by
3197: CP3,
3198: \begin{equation}\label{eq4}
3199: \mu(U \mid V) \times \mu(V \mid V \union V_\beta) = \mu(U \mid V \union V_\beta).
3200: \end{equation}
3201: Suppose $V' \subseteq V$.
3202: Clearly $$\mu(V' \mid V \union V_\beta) = \mu(V' \inter V_\beta \mid V \union
3203: V_\beta) + \mu(V' \inter \overline{V_\beta} \mid V \union V_\beta).$$
3204: Now by CP3 and the fact that $\mu(V_\beta \mid V \union V_\beta) = 1$,
3205: $$\mu(V' \inter V_\beta \mid V \union V_\beta) = \mu(V' \mid V_\beta) \times
3206: \mu(V_\beta \mid V \union V_\beta) = \mu(V' \mid V_\beta)$$
3207: and
3208: $$\mu(V' \inter \overline{V_\beta} \mid V \union V_\beta) \le
3209: \mu(\overline{V_\beta} \mid V \union V_\beta) = 0.$$
3210: Thus, $\mu(V' \mid V \union V_\beta) = \mu(V' \mid V_\beta)$.
3211: Applying this observation to both $U$ and $V$ shows that
3212: $\mu(V \mid V \union V_\beta) = \mu(V \mid V_\beta)$ and $\mu(U \mid V
3213: \union V_\beta)
3214: =\mu(U \mid V_\beta)$. Plugging this into (\ref{eq4}), it follows that
3215: $$\mu(U \mid V) = \mu(U \mid V_\beta)/\mu(V \mid V_\beta) = \mu_\beta(U)/\mu_\beta(V) =
3216: \mu_\beta(U \mid V) = \mu'(U \mid V).$$
3217: This completes the proof of the theorem.
3218: \eprf
3219:
3220:
3221: \bigskip
3222: \opro{prop:BS} The map $\FCP$ is a surjection from
3223: $\SLPS^c(W,\F,\F')$ onto $\T^c(W,\F,\F')$. \eopro
3224:
3225: \medskip
3226:
3227: \prf Suppose that $\mu \in \T^c(W,\F,\F')$. I want to construct an
3228: SLPS $\vecmu \in \SLPS^c(W,\F,\F')$ such that $\FCP(\vecmu) = \mu$.
3229: I first label each element of $\F'$ with a natural
3230: number. Intuitively, if $U \in \F'$ is labeled $k$, then $k$ will be
3231: the least index such that $\mu_k(U) > 0$. The labeling is done by
3232: induction on $k$. Each topmost set in the forest
3233: (i.e., the root of some tree in the forest) is labeled 0, as are all
3234: sets $U'$ such that $\mu(U' \mid U) > 0$, where $U$ is a topmost node.
3235: These are all the nodes labeled by 0. Label all the maximal unlabeled
3236: sets by 1 (that is, label $U \in \F'$ by 1 if it is not labeled 0, and
3237: is not a subset of another unlabeled set); in addition, label a set $U'$
3238: by 1 if $\mu(U' \mid U) > 0$ and $U$ is labeled by 1. Note that every
3239: set at depth 0 or 1 in the forest is labeled by either 0 or 1.
3240:
3241: Suppose that the labeling process has been completed for labels $0,
3242: \ldots, k$ such that the following properties hold, where $\lab(U)$
3243: denotes the label of the event $U$:
3244: \begin{itemize}
3245: \item all sets up to depth $k$ in the forest have been labeled;
3246: \item if $\lab(U) = k'$, $U' \in \F'$, and $\mu(U' \mid U) > 0$, then
3247: $\lab(U') \le \lab(U)$.
3248: \end{itemize}
3249: Label all the maximal unlabeled sets with $k+1$; in addition, if $U'$
3250: is unlabeled and $\mu(U' \mid U) > 0$ for some $U$ such that $\lab(U)
3251: = k+1$, then assign label $k+1$ to $U'$. Clearly the two properties
3252: above continue to hold. This completes the labeling process.
3253:
3254: Let $\C_k$ be the set of maximal sets in $\F'$ labeled $k$.
3255: T2 and T3 guarantee that, for all $k$, the sets in $\C_k$ are
3256: disjoint. Let $\mu_k'$ be
3257: an arbitrary probability on $W$ such that $\mu_k'(U) > 0$ for all $U \in
\C_k$ and $\sum_{U \in \C_k} \mu_k'(U) = 1$. Define an LPS
3259: $\vecmu = (\mu_0, \mu_1, \ldots)$ as follows (where the length of
3260: $\vecmu$ is $\omega$ if $\C_k \ne \emptyset$ for all $k$, and
3261: is $k+1$ if $k$ is the largest integer such that $\C_k \ne \emptyset$).
3262: For $V \in \F$, let $\mu_j(V) = \sum_{U \in \C_j} \mu(V \mid U)
3263: \mu_j'(U)$. I now show that $\vecmu(V \mid U) = \mu(V \mid U)$ for all $V \in
3264: \F$ and $U \in \F'$. Suppose that $U \in \C_k$. Then $\mu_j(U) = 0$ for
3265: all $j < k$, and $\mu_k(U) > 0$. Thus, $\vecmu(V \mid U) = \mu_k(V \mid
3266: U)$. But it is immediate from the definition that $\mu_k(V \mid U) =
3267: \mu(V \mid U)$. Thus, $\FCP(\vecmu) = \mu$. Moreover, if $U \in \F'$
3268: and $\lab(U) = k$, let $U'$ be the maximal set containing $U$ such
3269: that $\lab(U') = k$. (The labeling guarantees that such a set
3270: exists.) Then $\mu_k(U') = \mu(U' \mid U) > 0$. It follows that
$\vecmu(U) > 0$ for all $U \in \F'$. Finally, note that
3272: $\vecmu$ is an SLPS (in fact, an LCPS). If $U_k = \union \C_k -
3273: \union_{k' > k} (\union \C_{k'})$, then the sets $U_k$ are disjoint, and
3274: $\mu_k(U_k) = 1$. \eprf
3275:
3276:
3277:
3278: \bigskip
3279:
3280: \opro{FCPaeq}
3281: If $\nu \aeq \vecmu$, then $\nu(U) > 0$ iff $\vecmu(U) > \vec{0}$.
3282: Moreover, if $\nu(U) > 0$, then $\stand{\nu(V \mid U)} = \mu_j(V \mid U)$, where
3283: $\mu_j$ is the first probability measure in $\vecmu$ such that $\mu_j(U)
3284: > 0$. \eopro
3285:
3286: \medskip
3287:
3288: \prf Recall that for $U \subseteq W$, $\chi_U$ is the indicator
3289: function for $U$;
3290: that is, $\chi_U(w) = 1$ if $w \in U$ and $\chi_U(w) = 0$ otherwise.
3291: Notice that $E_\nu(\chi_U) > E_\nu(\chi_{\emptyset})$ iff $\nu(U) > 0$
3292: and $E_{\vecmu}(\chi_U) > E_{\vecmu}(\chi_{\emptyset})$ iff $\vecmu(U) >
3293: \vec{0}$. Since $\nu \aeq \vecmu$, it follows that
3294: $\nu(U) > 0$ iff $\vecmu(U) > \vec{0}$. If $\nu(U) > 0$,
3295: %joe6
3296: %note that $E_\nu(\chi_{U\inter V} - \alpha \chi_U) >
3297: %E_\nu(\chi_{\emptyset})$ iff $\alpha < \stand{\nu(V \mid U)}$. Similarly,
3298: %$E_{\vecmu}(\chi_{U\inter V} - \alpha \chi_U) >
3299: %E_{\vecmu}(\chi_{\emptyset})$ iff $\alpha < \mu_j(U)$, where $j$ is the
3300: note that $E_\nu(\chi_{U\inter V} - r \chi_U) >
3301: E_\nu(\chi_{\emptyset})$ iff $r < \stand{\nu(V \mid U)}$. Similarly,
3302: $E_{\vecmu}(\chi_{U\inter V} - r \chi_U) >
E_{\vecmu}(\chi_{\emptyset})$ iff $r < \mu_j(V \mid U)$, where $j$ is the
3304: least index such that $\mu_j(U) > 0$. It follows that $\stand{\nu(V \mid U)}
3305: = \mu_j(V \mid U)$. \eprf
3306:
3307: \bigskip
3308:
3309: \opro{motivation} If $\vecmu, \vecmu' \in \SLPS(W,\F)$, then
3310: $\vecmu \aeq \vecmu'$
3311: iff $\vecmu = \vecmu'$.
3312: %Moreover, if $\vecmu \in \LPS^c(W,\F)$, then
3313: %there exists a unique $\vecmu' \in \SLPS^c(W,\F)$ such that $\vecmu \aeq
3314: %\vecmu'$.
3315: \eopro
3316:
3317: \medskip
3318:
3319: \prf Clearly $\vecmu = \vecmu'$ implies that $\vecmu \aeq \vecmu'$.
3320: For the converse, suppose that $\vecmu \aeq \vecmu'$ for $\vecmu,
3321: \vecmu' \in \SLPS(W,\F)$. If $\vecmu \ne \vecmu'$, let $\alpha$ be the
3322: least ordinal such that $\mu_\alpha \ne \mu'_\alpha$, and let $U$ be
3323: such that $\mu_\alpha(U) \ne \mu'_\alpha(U)$. Without loss of
3324: generality, suppose that $\mu_\alpha(U) > \mu'_\alpha(U)$.
3325: Let the sets $U_\beta$
3326: be such that $\mu_\beta(U_\beta) = 1$ and $\mu_\beta(U_\gamma) = 0$ if
3327: $\gamma > \beta$; similarly choose the sets $U_\beta'$. Since
3328: $\mu_\beta = \mu'_\beta$ for $\beta < \alpha$, it follows that
3329: %joe9
3330: %$\mu_\beta(U_\alpha \union U'_\alpha) = \mu_\beta(U_\alpha \union
3331: %U'_\alpha) = 0$ for $\beta < \alpha$; moreover
3332: %$\mu_\alpha(U_\alpha \union U'_\alpha) = \mu_\alpha(U_\alpha \union
3333: $\mu_\beta(U_\alpha \union U'_\alpha) = \mu'_\beta(U_\alpha \union
3334: U'_\alpha) = 0$ for $\beta < \alpha$; moreover,
3335: $\mu_\alpha(U_\alpha \union U'_\alpha) = \mu_\alpha'(U_\alpha \union
3336: U'_\alpha) = 1$. Choose $r$ such that $\mu_\alpha(U) > r >
3337: \mu'_\alpha(U)$. Let $X$ be the random variable $\chi_U -
3338: r\chi_{U_\alpha \union U'_\alpha}$ and let $Y = \chi_\emptyset$.
3339: Then $E_{\vecmu}(X) > E_{\vecmu}(Y)$, while
3340: $E_{\vecmu'}(X) < E_{\vecmu'}(Y)$, so $\vecmu \not\aeq \vecmu'$.
3341: \eprf
3342:
3343: \bigskip
3344:
3345: \opro{finiteeq} If $W$ is finite, then every LPS over $(W,\F)$ is
3346: equivalent to an LPS of length at most $|\Bas(\F)|$. \eopro
3347:
3348: \medskip
3349:
3350: \prf Suppose that $W$ is finite and $\Bas(\F) = \{U_1, \ldots, U_k\}$.
3351: Given an LPS $\vecmu$, define a finite subsequence $\vecmu' =
3352: %joe9
3353: %(\mu_{m_0}, \ldots, \mu_{m_h})$ of
3354: (\mu_{k_0}, \ldots, \mu_{k_h})$ of
3355: $\vecmu$ as follows. Let $\mu_{k_0} = \mu_0$. Suppose that
3356: $\mu_{k_0}, \ldots, \mu_{k_j}$ have been defined. If all probability
3357: measures in $\vecmu$
with index greater than $k_j$ are linear combinations of the probability
measures $\mu_{k_0}, \ldots, \mu_{k_j}$, then take $\vecmu'
3360: = (\mu_{k_0}, \ldots, \mu_{k_j})$. Otherwise, let $\mu_{k_{j+1}}$ be
3361: the probability measure in $\vecmu$ with least index that is not a
3362: linear combination of $\mu_{k_0}, \ldots, \mu_{k_j}$.
3363: Since a probability measure over $(W,\F)$ is determined by its value on
3364: the sets in $\Bas(\F)$,
3365: a probability measure over $(W,\F)$ can be identified with a vector in
3366: $\IR^{|\Bas(\F)|}$: the vector defining the probabilities of the
3367: elements in $\Bas(\F)$. There can be at most $|\Bas(\F)|$ linearly
independent such vectors; thus, $\vecmu'$ has length at most
3369: $|\Bas(\F)|$.
3370:
3371: It remains to show that $\vecmu'$ is equivalent to $\vecmu$. Given
3372: random variables $X$ and $Y$, suppose that $E_{\vecmu}(X) <
3373: E_{\vecmu}(Y)$. Then there is some minimal index $\beta$ such that
3374: $E_{\mu_\gamma}(X) = E_{\mu_\gamma}(Y)$ for all $\gamma < \beta$ and
3375: $E_{\mu_\beta}(X) < E_{\mu_\beta}(Y)$. It follows that
3376: $\mu_\beta$ cannot be a linear combination of $\mu_\gamma$ for $\gamma <
3377: \beta$. Thus, $\mu_\beta$ is one of the probability measures in
$\vecmu'$. Moreover, the expected values of $X$ and $Y$ agree for all
probability measures in $\vecmu'$ with lower index (since they do in
$\vecmu$). Thus, $E_{\vecmu'}(X) < E_{\vecmu'}(Y)$.
3381:
3382: The argument in the other direction is similar in spirit and left to the
3383: reader. \eprf
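
To illustrate the construction with a small instance (the particular
measures are chosen here only for concreteness), suppose that
$W = \{w_1, w_2\}$ and $\F$ consists of all subsets of $W$, so that
$\Bas(\F) = \{\{w_1\}, \{w_2\}\}$, and identify a probability measure $\mu$
with the vector $(\mu(w_1), \mu(w_2))$. Let $\vecmu = (\mu_0, \mu_1, \mu_2)$,
where
$$\mu_0 = (1/2, 1/2), \quad \mu_1 = (1, 0), \quad \mu_2 = (0,1).$$
Since $\mu_1$ is not a multiple of $\mu_0$, the construction gives
$k_0 = 0$ and $k_1 = 1$. Since $\mu_2 = 2\mu_0 - \mu_1$, the construction
then stops, so $\vecmu' = (\mu_0, \mu_1)$, which has length
$2 = |\Bas(\F)|$. Moreover, for every random variable $X$,
$$E_{\mu_2}(X) = 2 E_{\mu_0}(X) - E_{\mu_1}(X),$$
so if $X$ and $Y$ have the same expected value with respect to both $\mu_0$
and $\mu_1$, then they also have the same expected value with respect to
$\mu_2$. Thus, $\mu_2$ never breaks a tie, and $\vecmu \aeq \vecmu'$.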
3384:
3385:
3386: \othm{lpsnps} If $W$ is finite, then
3387: %joe2
3388: %$\FLN$ is an isomorphism from $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$.
3389: $\FLN$ is a bijection from $\LPS(W,\F)/\naeq$ to $\NPS(W,\F)/\naeq$
3390: that preserves equivalence (that is, each NPS in $\FLN([\vecmu])$ is
3391: equivalent to $\vecmu$).
3392: \eothm
3393:
3394: %joe2
\prf I first provide a sufficient condition for an NPS to be equivalent
to an LPS in a finite space.
3397:
3398: \lem\label{aeqchar} Suppose that $\vecmu = (\mu_0,\ldots, \mu_k)$, and
3399: $\epsilon_0, \ldots, \epsilon_k$ are
such that $\stand{\epsilon_{i+1}/\epsilon_{i}} = 0$ for $i = 0, \ldots,
3401: k-1$ and $\sum_{i=0}^k \epsilon_i = 1$. Then $\vecmu \aeq \epsilon_0
3402: \mu_0 + \cdots + \epsilon_k
3403: \mu_k$.%
3404: \footnote{Although I do not need this fact here, it is easy to see that
3405: if $W$ is finite and $\vecmu = (\mu_0, \ldots, \mu_k)$ is
3406: an SLPS in $\LPS(W,\F)$, then the converse of Lemma~\ref{aeqchar} holds
as well: if $\nu \aeq \vecmu$, then $\nu = \epsilon_0 \mu_0 + \cdots +
\epsilon_k \mu_k$ for some $\epsilon_0, \ldots, \epsilon_k$ such that
$\stand{\epsilon_{i+1}/\epsilon_{i}} = 0$ for $i = 0, \ldots,
k-1$ and $\sum_{i=0}^k \epsilon_i = 1$. (I conjecture this fact is true
in general, not just if $\vecmu$ is an SLPS, but I have not checked this.)}
3412: \elem
3413:
\prf Suppose that there exist $\epsilon_0, \ldots, \epsilon_k$ as in the
3415: statement of the lemma and
3416: %Let $\vecmu = (\mu_0, \ldots, \mu_k)$ and let
3417: $\nu = \epsilon_0 \mu_0 + \cdots + \epsilon_k \mu_k$. I want to show
3418: that $\vecmu \aeq \nu$.
3419:
3420: If $E_{\vecmu}(X) < E_{\vecmu}(Y)$,
3421: then there exists some $j \le k$ such
3422: that $E_{\mu_j}(X) < E_{\mu_j}(Y)$ and $E_{\mu_{j'}}(X) =
3423: E_{\mu_{j'}}(Y)$ for all $j' < j$.
3424: Since $E_\nu(X) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(X)$ and
3425: $E_\nu(Y) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(Y)$,
3426: to show that $E_\nu(X) < E_\nu(Y)$, it suffices to show
3427: %joe9
3428: %that $\epsilon_j(E_{\mu_j}(X) - E_{\mu_j}(Y)) >
3429: that $\epsilon_j(E_{\mu_j}(Y) - E_{\mu_j}(X)) >
3430: \sum_{i=j+1}^k \epsilon_i (E_{\mu_i}(X) - E_{\mu_i}(Y))$. Since
3431: $\epsilon_{j'+1} \le \epsilon_{j'}$ for $j' \ge j$
3432: %joe6
(this follows from the fact that $\stand{\epsilon_{j'+1}/\epsilon_{j'}} =
3434: 0$), it follows that
3435: $\sum_{i=j+1}^k \epsilon_i (E_{\mu_i}(X) - E_{\mu_i}(Y)) \le
3436: \epsilon_{j+1} \sum_{i=j+1}^k |E_{\mu_i}(X) - E_{\mu_i}(Y)|$.
3437: %joe9
3438: Thus, it suffices to show that $\epsilon_{j+1} \sum_{i=j+1}^k
3439: |E_{\mu_i}(X) - E_{\mu_i}(Y)| < \epsilon_j(E_{\mu_j}(Y) -
3440: E_{\mu_j}(X))$.
3441: %joe6
3442: This is trivially the case if $E_{\mu_i}(X) = E_{\mu_i}(Y)$ for all
3443: $i$ such that $j+1 \le i \le k$. Thus, assume without loss of
3444: generality that $\sum_{i=j+1}^k |E_{\mu_i}(X) - E_{\mu_i}(Y)| > 0$.
3445: In this case, it suffices to show that $\epsilon_{j+1}/\epsilon_{j} <
3446: %joe9
3447: %(E_{\mu_j}(X) - E_{\mu_j}(Y))/\sum_{i=j+1}^k |E_{\mu_i}(X) -
3448: (E_{\mu_j}(Y) - E_{\mu_j}(X))/\sum_{i=j+1}^k |E_{\mu_i}(X) -
3449: E_{\mu_i}(Y)|$. Since the right-hand side of the inequality is a
3450: positive real and $\stand{\epsilon_{j+1}/\epsilon_{j}} = 0$, the result
3451: follows.
3452:
3453: The argument in the opposite direction is similar. Suppose that
3454: $E_\nu(X) < E_\nu(Y)$.
3455: %joe6
3456: %Again, since $E_\mu(X) = \sum_{i=0}^k \epsilon_i
3457: %E_{\mu_i}(X)$ and $E_\mu(Y) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(Y)$,
3458: Again, since $E_\nu(X) = \sum_{i=0}^k \epsilon_i
3459: E_{\mu_i}(X)$ and $E_\nu(Y) = \sum_{i=0}^k \epsilon_i E_{\mu_i}(Y)$,
3460: it must be the case that if $j$ is the least index such that
3461: $E_{\mu_j}(X) \ne E_{\mu_j}(Y)$, then $E_{\mu_j}(X) < E_{\mu_j}(Y)$.
3462: Thus, $E_{\vecmu}(X) < E_{\vecmu}(Y)$. It follows that $\vecmu \aeq \nu$.
3463: \eprf
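
As a quick check of the lemma in the simplest nontrivial case (a sketch, with
$\epsilon$ an arbitrary positive infinitesimal introduced only for this
illustration), take $k = 1$, $\epsilon_0 = 1 - \epsilon$, and $\epsilon_1 =
\epsilon$, so that $\stand{\epsilon_1/\epsilon_0} = 0$ and
$\nu = (1-\epsilon)\mu_0 + \epsilon \mu_1$. For real-valued random variables
$X$ and $Y$, let $a_i = E_{\mu_i}(Y) - E_{\mu_i}(X)$ for $i = 0, 1$; both
$a_0$ and $a_1$ are standard reals. Then
$$E_\nu(Y) - E_\nu(X) = (1 - \epsilon) a_0 + \epsilon a_1.$$
If $a_0 \ne 0$, then $\epsilon |a_1|/(1-\epsilon)$ is infinitesimal and
hence smaller than $|a_0|$, so the sign of $E_\nu(Y) - E_\nu(X)$ is the sign
of $a_0$. If $a_0 = 0$, then $E_\nu(Y) - E_\nu(X) = \epsilon a_1$, which has
the same sign as $a_1$. Thus, $E_\nu(X) < E_\nu(Y)$ iff either $a_0 > 0$ or
$a_0 = 0$ and $a_1 > 0$, which is exactly the lexicographic comparison
$E_{\vecmu}(X) < E_{\vecmu}(Y)$ for $\vecmu = (\mu_0, \mu_1)$.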
3464:
3465:
3466:
3467: It remains to show that, given an NPS $(W,\F,\nu)$, there is an equivalence
3468: class $[\vecmu]$ such that $\FLN([\vecmu]) = [\nu]$.
3469: %joe9
3470: %My goal is to find (standard) probability measures
3471: As I said in the main text, the goal now is to find (standard) probability
3472: measures
3473: $\mu_0, \ldots, \mu_{k}$ and $\epsilon_0, \ldots, \epsilon_k$ such that
3474: $\stand{\epsilon_{i+1}/\epsilon_i} = 0$ and $\nu = \epsilon_0 \mu_0 +
3475: \cdots + \epsilon_k\mu_k$. If this can be done then, by
3476: Lemma~\ref{aeqchar}, $\nu \aeq (\mu_0, \ldots, \mu_k)$, and we are done.
3477:
3478: Suppose that $\Bas(\F) = \{U_1, \ldots, U_k\}$
3479: and that $\nu$ has range $\IR^*$. Note that
3480: a probability measure $\nu'$ on $\F$ can be identified with a
3481: vector $(a_1, \ldots, a_k)$ over $\IR^*$, where $\nu'(U_i) = a_i$, so
3482: that $a_1 + \cdots + a_k = 1$. In the rest of this proof, I frequently
3483: identify $\nu$ with such a vector.
3484:
3485:
3486: \lem\label{newlem1} There exist $k' \le k$, $\epsilon_0, \ldots,
3487: \epsilon_{k'}$ where $\epsilon_0 = 1$,
3488: $\stand{\epsilon_{i+1}/\epsilon_{i}} = 0$ for $i =
0, \ldots, k'-1$, and standard real-valued vectors
3490: $\vec{b}_j$, $j = 0, \ldots, k'$, in $\IR^k$ such that
3491: $$\nu = \sum_{j=0}^{k'} \epsilon_j \vec{b}_j.$$
3492: \elem
3493:
3494: \prf I show by induction on $m \le k$ that there exist $\epsilon_0,\ldots,
\epsilon_m$ and $m' \le m$ such that $\epsilon_j = 0$ for $j > m'$,
$\stand{\epsilon_{i+1}/\epsilon_{i}} = 0$ for $i =
0, \ldots, m'-1$, and standard vectors
$\vec{b}_j$, $j = 0, \ldots, m-1$,
3499: and a possibly nonstandard vector $\vec{b}'_m =
3500: (b'_{m1}, \ldots, b'_{mk})$ such that
3501: (a) $\nu = \sum_{j=0}^{m-1} \epsilon_j \vec{b}_j + \epsilon_m \vec{b}'_m$,
3502: (b) $|b'_{mi}| \le 1$, and (c) at least $m$ of $b'_{m1}, \ldots, b'_{mk}$
3503: are standard.
3504:
3505: For the base case (where $m=0$), just take $\vec{b}'_0 =
3506: \nu$ and $\epsilon_0 = 1$. For the inductive step, suppose that $0 \le m
3507: < k$. If $\vec{b}'_m$ is standard, then take $\vec{b}_m = \vec{b}'_m$,
3508: $\vec{b}_{m+1} = \vec{0}$, and $\epsilon_{m+1}
3509: = 0$. Otherwise, let $\vec{b}_m = \stand{\vec{b}'_m}$ and let
3510: $\vec{b}''_{m+1} =
3511: \vec{b}'_m - \vec{b}_m$. Let $\epsilon' = \max\{|b''_{(m+1)i}|: i = 1,
3512: \ldots, k\}$. Since not all components of $\vec{b}'_m$ are standard,
3513: $\epsilon' > 0$. Note that, by construction, $\stand{\epsilon'/ b_{mi}} =
3514: 0$ if $b_{mi} \ne 0$, for $i = 1, \ldots, k$. Let $\vec{b}'_{m+1} =
3515: \vec{b}''_{m+1}/\epsilon'$ and let $\epsilon_{m+1} = \epsilon'
3516: \epsilon_m$.
3517: By construction, $|b'_{(m+1)i}| \le 1$ and at least one
3518: component of $\vec{b}'_{m+1}$ is either 1 or $-1$. Moreover, if
3519: $b_{mi}'$ is standard, then $b''_{(m+1)i} = b'_{(m+1)i} = 0$. Thus,
$\vec{b}'_{m+1}$ has at least one more standard component than
3521: $\vec{b}'_m$. Since clearly $\nu = \sum_{j=0}^m\epsilon_j \vec{b}_j +
3522: \epsilon_{m+1} \vec{b}_{m+1}'$, this completes the inductive step.
3523: The lemma follows immediately.
3524: \eprf
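
For a small instance of this construction (with $\epsilon$ a positive
infinitesimal chosen only for illustration), suppose that $k = 2$ and
$\nu = (1/2 + \epsilon, 1/2 - \epsilon)$. In the base case,
$\vec{b}'_0 = \nu$ and $\epsilon_0 = 1$. Since $\vec{b}'_0$ is not standard,
the inductive step gives
$$\vec{b}_0 = \stand{\vec{b}'_0} = (1/2, 1/2), \quad
\vec{b}''_1 = \vec{b}'_0 - \vec{b}_0 = (\epsilon, -\epsilon), \quad
\epsilon' = \epsilon,$$
so that $\vec{b}'_1 = \vec{b}''_1/\epsilon' = (1, -1)$ and
$\epsilon_1 = \epsilon' \epsilon_0 = \epsilon$. Since $\vec{b}'_1$ is
standard, the next step takes $\vec{b}_1 = \vec{b}'_1$ and $\epsilon_2 = 0$.
Thus, $k' = 1$, $\stand{\epsilon_1/\epsilon_0} = 0$, and
$$\nu = \vec{b}_0 + \epsilon \vec{b}_1 = (1/2, 1/2) + \epsilon (1, -1).$$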
3525:
3526: Returning to the proof of Theorem~\ref{lpsnps},
3527: I next prove by induction on $m$ that for all $m \le k'$ (where $k' \le
3528: k$ is as in Lemma~\ref{newlem1}), there exist standard probability measures
3529: $\mu_0, \ldots, \mu_m$, (standard) vectors $\vec{b}_{m+1},
\ldots, \vec{b}_{k'} \in \IR^k$, and $\epsilon_0, \ldots,
3531: \epsilon_{k'}$ such that $\nu = \sum_{j=0}^m \epsilon_j \mu_j +
3532: \sum_{j = m+1}^{k'} \epsilon_j \vec{b}_j$.
3533:
3534: The base case is immediate from Lemma~\ref{newlem1}: taking $\vec{b}_j$,
3535: $j = 1, \ldots, k'$ as in Lemma~\ref{newlem1},
3536: $\vec{b}_0$ is in fact a probability measure since $\vec{b}_0 = \stand{\nu}$.
3537: Suppose that the result holds for $m$. Consider $\vec{b}_{m+1}$.
If $b_{(m+1)i} < 0$ for some $i$ then, since $\nu(U_i) \ge 0$,
there must exist $j' \in \{0, \ldots, m\}$ such that $\mu_{j'}(U_i) >
3540: 0$. Thus, there exists some $N > 0$ such that $N(\mu_{j'}(U_i)) +
3541: b_{(m+1)i} > 0$.
3542: Since there are only finitely many basic elements
3543: and every element in the vector $\mu_j$ is nonnegative, for $j = 0,
3544: \ldots, m$, there must exist some
3545: $N'$ such that
3546: $\vec{b}'_{m+1} = N'( \mu_0 + \cdots
3547: + \mu_m) + \vec{b}_{m+1} \ge 0$. Let $c = \sum_{i = 1}^k b_{(m+1)i}'$, and
3548: let $\mu_{m+1} = \vec{b}'_{m+1}/c$. Clearly,
$\nu = (\epsilon_0 -N' \epsilon_{m+1}) \mu_0 + \cdots + (\epsilon_m - N'
\epsilon_{m+1}) \mu_m + c \epsilon_{m+1} \mu_{m+1} + \sum_{j=m+2}^{k'}
\epsilon_j \vec{b}_j$. This completes the proof of the inductive step.
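
To see how the inductive step works in a simple case (again with
$\epsilon$ a positive infinitesimal chosen only for illustration), suppose
that $k = 2$ and $\nu = (1/2 + \epsilon, 1/2 - \epsilon)$, so that
$\nu = \vec{b}_0 + \epsilon \vec{b}_1$ with $\vec{b}_0 = (1/2, 1/2)$,
$\vec{b}_1 = (1, -1)$, $\epsilon_0 = 1$, and $\epsilon_1 = \epsilon$. In the
base case, $\mu_0 = \vec{b}_0$. Since the second component of $\vec{b}_1$ is
negative, take $N' = 2$; then
$$\vec{b}'_1 = 2 \mu_0 + \vec{b}_1 = (2, 0) \ge 0, \quad c = 2, \quad
\mu_1 = \vec{b}'_1/c = (1, 0).$$
It is easy to check that
$$\nu = (1 - 2\epsilon) \mu_0 + 2 \epsilon \mu_1 =
(1/2 + \epsilon, 1/2 - \epsilon).$$
The coefficients sum to 1 and $\stand{2\epsilon/(1 - 2\epsilon)} = 0$, so
Lemma~\ref{aeqchar} applies, and $\nu \aeq (\mu_0, \mu_1)$.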
3552:
3553: The theorem now immediately follows. \eprf
3554:
3555: \bigskip
3556:
3557:
3558: \opro{infiniteeq} Every LPS over $(W,\F)$ is
3559: equivalent to an LPS over $(W,\F)$ of length at most $|\F|$. \eopro
3560:
3561: \medskip
3562:
3563: \prf The argument is essentially the same as that for
3564: Proposition~\ref{finiteeq}, using the observation that
3565: a probability measure over $(W,\F)$ can be identified with an element of
$\IR^{|\F|}$: the vector defining the probabilities of the elements in
3567: $\F$. I leave details to the reader. \eprf
3568:
3569:
3570: \pro\label{counter} For the NPS $(W,\F,\nu)$ constructed in
3571: Example~\ref{counter4},
3572: there is no LPS $\vecmu$ over $(W,\F)$ such that $\nu \aeq
3573: \vecmu$. \epro
3574:
3575:
3576: \prf I start with a straightforward lemma.
3577:
3578: \lem\label{distinct} Given an LPS $\vecmu$, there is an LPS $\vecmu'$
3579: such that $\vecmu
3580: \aeq \vecmu'$ and all the probability measures in $\vecmu'$ are
3581: distinct.
3582: \elem
3583:
3584: \prf Define $\vecmu'$ to be the subsequence consisting of all the
3585: distinct probability measures in $\vecmu$. That is, suppose that $\vecmu =
3586: %joe9
3587: %(\mu_0, \mu_1, \ldots )$. Then $\vecmu = (\mu_{k_0}, \mu_{k_1},
3588: (\mu_0, \mu_1, \ldots )$. Then $\vecmu' = (\mu_{k_0}, \mu_{k_1},
3589: \ldots )$, where $k_0 = 0$, and, if $k_\alpha$ has been defined for all
3590: $\alpha < \beta$ and
3591: there exists an index $\gamma$ such that $\mu_{k_\alpha} \ne \mu_\gamma$ for
all $\alpha < \beta$, then $k_\beta$ is the least index $\delta$ such that
3593: $\mu_{k_\alpha} \ne \mu_\delta$ for all $\alpha < \beta$. If there is no
3594: index $\gamma$ such that $\mu_\gamma \notin \{\mu_{k_\alpha}: \alpha <
3595: \beta\}$, then $\vecmu' = (\mu_{k_\alpha}: \alpha < \beta)$. I leave
3596: it to the reader to check that $\vecmu \aeq \vecmu'$. \eprf
3597:
3598: Returning to the proof of Proposition~\ref{counter}, suppose by way of
3599: contradiction that $\nu \aeq \vecmu$. Without loss of generality, by
3600: Lemma~\ref{distinct}, assume that all the probability measures
3601: in $\vecmu$ are distinct.
3602: Clearly
3603: $E_\nu(\chi_W) < E_\nu(\alpha \chi_{\{w_1\}})$ if $\alpha \ge 2$ and
3604: $E_\nu(\chi_W) >
3605: E_\nu(\alpha \chi_{\{w_1\}})$ if $\alpha < 2$. Since $\nu \aeq \vecmu$,
3606: it must be
3607: the case that $E_{\vecmu}(\chi_W) < E_{\vecmu}(\alpha \chi_{\{w_1\}})$ if
3608: $\alpha \ge 2$
3609: and $E_{\vecmu}(\chi_W) >
3610: E_{\vecmu}(\alpha \chi_{\{w_1\}})$ if $\alpha < 2$. Since $E_{\vecmu}(\chi_W)
3611: = (1, 1, \ldots)$, it follows that if $\vecmu = (\mu_0, \mu_1, \ldots)$,
3612: it must
3613: be the case that $\mu_0(w_1) = 1/2$ and
3614: \begin{equation}\label{eq:mu1}
3615: \mu_1(w_1) \ge 1/2.
3616: \end{equation}
3617: Similar
3618: arguments (comparing $\chi_W$ to $\chi_{\{w_{j}\}}$) can be used to show that
$\mu_0(w_j) = 1/2^j$ and $\mu_1(w_{2j-1}) \ge 1/2^{2j-1}$ for $j = 1, 2,
3620: \ldots$.
3621: %Next observe that $E_{\nu}(\chi_{\{w_1\}} - 2 \chi_{\{w_2\}}) =
3622: %E_{\nu}(3\chi_{\{w_1\}} - 1.5 \chi_W) (= 3\epsilon)$.
3623: Next, observe that $E_{\nu}(\chi_{\{w_1\}} - 2^{2k-1}\chi_{\{w_{2k}\}}) =
3624: (2^{k} + 1)\epsilon$. Thus, $$E_{\nu}(\chi_{\{w_1\}} -
3625: 2^{2k-1}\chi_{\{w_{2k}\}}) = E_{\nu}((2^{k}+1)(\chi_{\{w_1\}} - (\chi_W/2))).$$
3626: %E_{\nu}(\frac{2^k-1}{2^{k+1}-1}(\chi_{\{w_1\}} -
3627: %2^{2k+1}\chi_{\{w_{2k+2}\}}) >
3628: %> E_{\nu}(\chi_{\emptyset}) = 0.$$
3629: It follows that the same relationship must hold if $\nu$ is replaced by
3630: $\vecmu$. That is,
3631: $$\mu_1(w_1) - 2^{2k-1}\mu_1(w_{2k}) =
3632: (2^{k}+1)(\mu_1(w_1) - (1/2)).$$
3633: Rearranging terms, this gives
3634: %joe9
3635: %$$2^{k}\mu_1(w_1) + 2^{2k-1}\mu(w_{2k}) = 2^{k-1} + 1/2,$$
3636: $$2^{k}\mu_1(w_1) + 2^{2k-1}\mu_1(w_{2k}) = 2^{k-1} + 1/2,$$
3637: or
3638: \begin{equation}\label{eq1.5}
3639: %joe9
3640: %\mu_1(w_1) + 2^{k-1} \mu(w_{2k}) = 1/2 + 1/2^{k+1}.
3641: \mu_1(w_1) + 2^{k-1} \mu_1(w_{2k}) = 1/2 + 1/2^{k+1}.
3642: \end{equation}
3643: Thus, $\mu_1(w_1) \le 1/2 + 1/2^{k+1}$ for all $k \ge 1$.
3644: Putting this together with (\ref{eq:mu1}), it
3645: follows that $\mu_1(w_1) = 1/2$. Plugging this into (\ref{eq1.5}) gives
3646: $\mu_1(w_{2k}) = 1/2^{2k}$. It now follows that $\mu_1 =
3647: \mu_0$, contradicting the choice of $\vecmu$. \eprf
3648:
3649: \bigskip
3650:
3651: \othm{FNP} $\FNP$ is a bijection from $\NPS(W,\F)/\!\simeq$ to
3652: $\Popper(W,\F)$ and from $\NPS^c(W,\F)/\!\simeq$ to $\Popper^c(W,\F)$.
3653: \eothm
3654:
3655: \medskip
3656:
3657: \prf
3658: %joe9
3659: As I said in the main text, the proof that $\FNP$ is an injection is
3660: straightforward, and to prove that it is a surjection in the countably
3661: additive case, it suffices to show that $\FNP(W,\F,\nu) =
3662: (W,\F,\F',\mu)$, where $\nu \aeq \vecmu'$ and $\vecmu'$ is the
3663: countably additive SLPS such that $\FCP((W,\F,\vec{\mu}'))
3664: = (W, \F,\F', \mu)$. I now do this.
3665:
3666: Suppose that $\FNP(W,\F,\nu) = (W,\F,\F_1',\mu_1)$.
3667: First I show that $\nu(U) = 0$ iff $\vecmu'(U) = \vec{0}$.
3668: Let $X = \chi_U$ and $Y = \chi_{\emptyset}$. Note that $\nu(U) = 0$ iff
3669: $E_\nu(X) = E_\nu(Y)$ iff $E_{\vecmu'}(X) = E_{\vecmu'}(Y)$ iff
3670: $\vecmu'(U) = \vec{0}$. Thus, $\F_1' = \{U: \nu(U) \ne 0\} =
3671: \{U: \vecmu'(U) \ne \vec{0}\} = \F'$.
3672:
3673: Now suppose by way of contradiction that $\mu \ne \mu_1$. Thus, there
3674: must exist some $V \in \F$, $U \in \F'$ such that $\mu(V \mid U) \ne
3675: \mu_1(V \mid U)$. Let $\beta$ be the smallest ordinal such that
3676: %joe9
3677: %$\mu_\beta'(U) \ne 0$. It follows that $\mu_\beta(V \mid U) \ne
3678: %\stand{\nu(V \mid U)}$. We can assume without loss of generality that
3679: %$\mu_\beta(V
3680: $\mu_\beta'(U) \ne 0$. It follows that $\mu'_\beta(V \mid U) \ne \stand{\nu(V
3681: \mid U)}$. We can assume without loss of generality that $\mu'_\beta(V
3682: \mid U) > \stand{\nu(V \mid U)}$. Choose a real number $r$ such that
3683: %joe9
3684: %$\mu_\beta(V \mid U) > r > \st(V \mid U)$. Then
3685: $\mu'_\beta(V \mid U) > r > \stand{\nu(V \mid U)}$. Then
3686: $E_{\vecmu'}(\chi_{V \inter U}) > E_{\vecmu'}(r \chi_U)$ but
3687: $E_{\nu}(\chi_{V \inter U}) < E_{\nu}(r \chi_U)$. This contradicts the
3688: %joe2
3689: %assumption that $\vecmu' \aeq \nu$. It follows that $\FNP(\nu) =
3690: assumption that $\vecmu' \aeq \nu$. It follows that $\FNP(W,\F,\nu) =
3691: (W,\F,\F',\mu)$, as desired.
3692:
3693:
3694: %Thus, it remains to prove the result in the
3695: %case that $W$ is infinite and $\F$ is an algebra (but not necessarily a
3696: %$\sigma$-algebra).
3697: It remains to show that if $(W,\F,\F',\mu) \in \Popper(W,\F) -
3698: \Popper^c(W,\F)$, then there is some $(W,\F,\nu) \in \NPS(W,\F)$ such that
$\FNP(W,\F,\nu) = (W,\F,\F',\mu)$. My proof in this case follows closely
3700: the lines of
3701: an analogous result proved by
3702: McGee \citeyear{McGee94}. I provide the details here mainly for
3703: completeness.
3704:
3705: The proof relies on the following ultrafilter construction of
3706: non-Archimedean fields. Given a set $S$, a {\em filter\/} $\G$ on $S$ is a
nonempty set of subsets of $S$ that is closed under supersets (so that
3708: if $U \in \G$ and $U \subseteq U'$, then $U' \in \G$), is closed under
3709: finite intersections (so that if $U_1, U_2 \in \G$, then $U_1
3710: \inter U_2 \in \G$), and does not contain $\emptyset$. An {\em
3711: ultrafilter\/} is a maximal filter, that is, a filter that is not a
3712: strict subset of any other filter. It is not hard to show that if $\U$
3713: is an ultrafilter on $S$, then for all $U \subseteq S$, either $U \in
3714: \U$ or $\overline{U} \in \U$ \cite{BellSlomson}.
3715:
3716: Suppose $F$ is either $\IR$ or a
3717: non-Archimedean field, $J$ is an arbitrary set, and $\U$ is an
3718: ultrafilter on $J$. Define an equivalence relation $\sim_{\U}$ on
3719: $F^J$ by taking $(a_j: j \in J) \sim_{\U} (b_j: j \in J)$ if $\{j: a_j =
3720: b_j\} \in \U$. Similarly, define a total order $\preceq_\U$ by taking
3721: $(a_j: j \in J) \preceq_{\U} (b_j: j \in J)$ if $\{j: a_j \le b_j\} \in
\U$. (The fact that $\preceq_{\U}$ is total uses the fact that for all $U
3723: \subseteq
3724: J$, either $U \in \U$ or $\overline{U} \in \U$. Note that the pointwise
3725: ordering on $F^J$ is not total.) Let $F^J/\nsim_{\U}$ consist of these
3726: equivalence classes. Note that $F$ can be viewed as a subset of
3727: $F^J/\nsim_{\U}$ by identifying $a \in F$ with the sequence of all $a$'s.
3728:
3729: Define addition and multiplication on $F^J$ pointwise,
3730: so that, for example, $(a_j: j \in J) + (b_j: j \in J) = (a_j + b_j: j
3731: \in J)$. It is easy to check that if $(a_j: j \in J) \sim_{\U} (a_j': j
3732: \in J)$, then $(a_j: j \in J) + (b_j: j \in J) \sim_{\U} (a_j': j \in J) +
3733: (b_j: j \in J)$, and similarly for multiplication. Thus, the
3734: definitions of $+$ and $\times$ can be extended in the obvious way to
3735: $F^J/\nsim_{\U}$. With these definitions, it is easy to check that
3736: $F^J/\nsim_{\U}$ is a field that contains $F$.
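
To illustrate the construction (the particular choices of $F$, $J$, and
$\U$ here are assumptions made only for this example), take $F = \IR$ and
$J = \{1, 2, 3, \ldots\}$, and let $\U$ be an ultrafilter on $J$ that
contains every cofinite subset of $J$; such an ultrafilter exists, since
the cofinite subsets of $J$ form a filter and every filter can be
extended to an ultrafilter \cite{BellSlomson}. Let $\epsilon$ be the
equivalence class of $(1/j: j \in J)$. For every real number $r > 0$,
$$\{j: 1/j \le 0\} = \emptyset \notin \U \mbox{ ~and~ }
\{j: 1/j \le r\} = \{j: j \ge 1/r\} \in \U.$$
Thus, $\epsilon$ is strictly greater than 0 but at most $r$ for every
positive real $r$; that is, $\epsilon$ is a positive infinitesimal, and
$\IR^J/\nsim_{\U}$ is non-Archimedean.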
3737:
3738: Now given a Popper space $(W,\F,\F',\mu)$ and a finite subset $\A = \{U_1,
3739: \ldots, U_k\} \subseteq \F$, let $\F_{\A}$ be the (finite) algebra
3740: generated by $\A$ (that is, the smallest set containing $\{U_1, \ldots,
U_k, W\}$ that is closed under unions and complementation). Let
3742: $\F'_{\A} = \F_{\A} \inter \F'$. It follows from Theorem~\ref{FCPfin} that
3743: there is a finite SLPS $\vecmu_\A$ over $(W,\F_{\A})$ that is mapped to
$(W,\F_{\A},\F'_{\A}, \mu_{\A})$ by $\FCP$, where $\mu_{\A}$ is the
restriction of $\mu$ to $\F_{\A} \times \F'_{\A}$. (Although
3745: Theorem~\ref{FCPfin} is stated for finite state spaces $W$, the proof
3746: relies on only the fact that the algebra is finite, so it applies without
3747: change here.) It now follows from
3748: Theorem~\ref{lpsnps} that, for each $\A$, there is a nonstandard
3749: probability space $(W,\F_{\A},\nu_\A)$ with range $\IR(\epsilon)$ that is
equivalent to $\vecmu_{\A}$. By Proposition~\ref{FCPaeq}, it follows
that $U \in \F'_{\A}$ iff $\nu_{\A}(U) \ne 0$.
3752: Moreover, $\stand{\nu_{\A}(V \mid U)} = \mu_{\A}(V \mid U)$ for $U \in
3753: \F'_{\A}$ and $V \in \F_{\A}$.
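
As a small illustration (with the specific events chosen only for the
example), suppose that $\A = \{U_1\}$, where $U_1$ and $\overline{U_1}$
are both in $\F'$ and $\mu(U_1 \mid W) = 0$. Then $\F_{\A} = \{\emptyset,
U_1, \overline{U_1}, W\}$ and $\F'_{\A} = \{U_1, \overline{U_1}, W\}$.
One nonstandard probability measure on $(W,\F_{\A})$ with values in
$\IR(\epsilon)$ that fits these requirements is given by
$$\nu_{\A}(\emptyset) = 0, \ \ \nu_{\A}(U_1) = \epsilon, \ \
\nu_{\A}(\overline{U_1}) = 1 - \epsilon, \ \ \nu_{\A}(W) = 1.$$
Indeed, $\nu_{\A}(U) \ne 0$ exactly if $U \in \F'_{\A}$ and, for example,
$\stand{\nu_{\A}(U_1 \mid W)} = \stand{\epsilon} = 0 = \mu(U_1 \mid W)$,
while $\nu_{\A}(U_1 \mid U_1) = 1 = \mu(U_1 \mid U_1)$.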
3754:
Let $J$ consist of all finite subsets of $\F$. For a finite subset $\A$ of
$\F$, let $G_{\A}$ be the subset of $J$ consisting of all sets in $J$
containing $\A$. Let $\G = \{G \subseteq J: G \supseteq G_{\A} \mbox{ for
some } \A \in J\}$. It is easy to check that $\G$ is a filter on
3759: $J$. It is a standard result that every filter can be extended to an
3760: ultrafilter \cite{BellSlomson}. Let $\U$ be an ultrafilter containing
$\G$. By the construction above, $\IR(\epsilon)^J/\nsim_{\U}$ is a
3762: non-Archimedean field.
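
The check that $\G$ is a filter can be sketched as follows. For finite
subsets $\A$ and $\A'$ of $\F$, a set in $J$ contains both $\A$ and $\A'$
iff it contains $\A \union \A'$, so
$$G_{\A} \inter G_{\A'} = G_{\A \union \A'}.$$
Hence, if $G \supseteq G_{\A}$ and $G' \supseteq G_{\A'}$, then $G \inter
G' \supseteq G_{\A \union \A'}$, so $\G$ is closed under finite
intersections. It is closed under supersets by definition, it is
nonempty since $J = G_{\emptyset} \in \G$, and it does not contain
$\emptyset$, since $\A \in G_{\A}$ for every finite $\A \subseteq \F$.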
3763:
3764: Define $\nu$ on $(W,\F)$ by taking
3765: $\nu(U) = (\nu_{\A}(U): \A \in J)$, where $\nu_\A(U)$ is taken to be 0
3766: if $U \notin \F_{\A}$. To see that $\nu$ is indeed a nonstandard
3767: probability measure with the required properties, note that clearly
3768: $\nu(W) = 1$ (where 1 is identified with the sequence of all 1's).
Moreover, to see that $\nu(U) + \nu(V) = \nu(U \union V)$ if $U$ and $V$
are disjoint, let
$\A_{U,V}$ be the smallest subalgebra of $\F$ containing $U$ and $V$.
Note that if $\A \supseteq \A_{U,V}$, then
$\nu_{\A}(U) + \nu_{\A}(V) = \nu_{\A}(U \union V)$. Since the set
$G_{\A_{U,V}}$ of elements of $J$ containing $\A_{U,V}$ is an element of
the ultrafilter, the
result follows. Similar arguments show that $\nu(U) \ne 0$ iff $U \in
\F'$ and that $\stand{\nu(V \mid U)} = \mu(V \mid U)$ if $U \in \F'$ and $V \in
\F$. Thus, $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$. \eprf
3777:
3778: \bigskip
3779:
3780:
3781: %joe6
3782: %\opro{simeqvsaeq} If $\nu_1 \aeq \nu_2$ than $\nu_1 \simeq \nu_2$.
3783: \opro{simeqvsaeq} If $\nu_1 \aeq \nu_2$ then $\nu_1 \simeq \nu_2$.
3784: \eopro
3785:
3786: \medskip
3787:
3788: \prf Suppose that $\nu_1 \aeq \nu_2$. To show that $\nu_1 \simeq
3789: \nu_2$, first suppose that $\nu_1(U) \ne 0$ for some $U \subseteq W$. Then
3790: $E_{\nu_1}(\chi_\emptyset) < E_{\nu_1}(\chi_U)$. Since $\nu_1 \aeq
3791: \nu_2$, it must be the case that $E_{\nu_2}(\chi_\emptyset) <
3792: E_{\nu_2}(\chi_U)$. Thus, $\nu_2(U) \ne 0$. A symmetric argument shows
3793: that if $\nu_2(U) \ne 0$ then $\nu_1(U) \ne 0$. Next, suppose that
3794: $\nu_1(U) \ne 0$ and $\nu_1(V \mid U) = \alpha$. Thus,
3795: $E_{\nu_1}(\alpha \chi_U) = E_{\nu_1}(\chi_{U \inter V})$. Since
3796: $\nu_1 \aeq \nu_2$, it follows that
3797: $E_{\nu_2}(\alpha \chi_U) = E_{\nu_2}(\chi_{U \inter V})$, and so
3798: $\nu_2(V \mid U) = \alpha$. Thus, $\stand{\nu_1(V \mid U)} =
3799: \stand{\nu_2(V \mid U)}$.
3800: Hence, $\nu_1 \simeq \nu_2$, as desired. \eprf
3801: \commentout{
3802: \bigskip
3803:
3804: \opro{indaeq}
3805: $U$ is approximately conditionally independent of $V$
3806: given $V'$ with respect to $\nu$ iff there exists a measure $\nu'$ such
3807: that $\nu \aeq \nu'$ and $U$ is conditionally independent of $V$ given
3808: $V'$ with respect to $\nu'$.
3809: \eopro
3810:
3811: \medskip
3812:
3813: \prf Suppose that $U$ is approximately conditionally independent of $V$
3814: given $V'$ with respect to $\nu$. If $\nu(U \inter V') = 0$, then $U$
3815: is conditionally independent of $V$ given $V'$ with respect to $\nu$.
3816: If $\nu(U \inter V') \ne 0$, $\stand{\nu(V \mid U \inter V')}
3817: = \stand{\nu(V \mid V')}$.
3818: }
3819:
3820: \othm{BBDstrongindependence} There exists an NPS $\nu$ whose
3821: range is an
3822: elementary extension of the reals such that $\vecmu \aeq \nu$ and $X_1,
3823: %joe5
3824: %\ldots, X_n$ are strongly independent with respect to $\nu$ iff there
3825: \ldots, X_n$ are independent with respect to $\nu$ iff there
3826: exists a sequence $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in $(0,1)^k$
3827: such that $\vec{r}^j \rightarrow (0,\ldots, 0)$ as $j\rightarrow\infty$,
3828: and $X_1, \ldots, X_n$ are independent with respect to $\vecmu \, \Box
3829: \, \vec{r}^j$ for $j = 1, 2, 3, \ldots$.
3830: \eothm
3831:
3832: \prf Suppose that there exists an NPS
3833: $\nu$ whose range is an elementary extension of the reals, $\vecmu
3834: \aeq \nu$, and $X_1, \ldots, X_n$ are
independent with respect to $\nu$. Using arguments similar in spirit to
those of BBD \citeyear[Proposition 2]{BBD2}, it follows that there exist
positive infinitesimals $\epsilon_1, \ldots, \epsilon_k$ such that
$\vecmu \, \Box \, (\epsilon_1, \ldots, \epsilon_k) = \nu$. It is not
hard to show that there exists a finite set of real-valued polynomials
3841: $p_1,\ldots, p_N$ such that $p_j(\epsilon_1, \ldots, \epsilon_k) = 0$
3842: for $j = 1, \ldots, N$ and if $\vec{r}$ is a vector of positive reals
3843: such that $p_j(\vec{r}) = 0$ for $j = 1, \ldots, N$, then $X_1, \ldots,
3844: X_n$ are independent with respect to $\vecmu \, \Box \, \vec{r}$.
3845: Thus, for all natural numbers $m \ge 1$, the range of
3846: $\nu$ satisfies the first-order property $$\exists x_1 \ldots \exists x_k
3847: (p_1(x_1, \ldots, x_k) = 0 \land \ldots \land p_N(x_1, \ldots, x_k) = 0
3848: \land 0 < x_1 < 1/m \land \ldots \land 0 < x_k < 1/m).$$
3849: Since the range of $\nu$ is an elementary extension of the reals, this
3850: first-order
3851: property holds of the reals as well.
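Thus, there exists a sequence
$\vec{r}^j$, $j = 1, 2, \ldots$, of vectors of positive reals converging to
$\vec{0}$ such that
$p_i(\vec{r}^j) = 0$ for $i = 1, \ldots, N$ and all $j$. By the choice of
$p_1, \ldots, p_N$, it follows that $X_1, \ldots, X_n$ are independent with
respect to $\vecmu \, \Box \, \vec{r}^j$ for all $j$.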
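
To see where such polynomials come from, consider the following sketch,
which assumes (for illustration only) that $n = 2$, that $\vecmu =
(\mu_0, \mu_1)$ has length two, so that $k = 1$, and that in this case
$\vecmu \, \Box \, r = (1-r)\mu_0 + r\mu_1$. Fix values $x_1 \in \V(X_1)$
and $x_2 \in \V(X_2)$, and write $a_i = \mu_i(X_1 = x_1)$, $b_i =
\mu_i(X_2 = x_2)$, and $c_i = \mu_i(X_1 = x_1 \inter X_2 = x_2)$ for $i =
0, 1$; the names $a_i$, $b_i$, and $c_i$ are introduced only for this
example. The requirement that the product rule hold for $x_1$ and $x_2$
with respect to $\vecmu \, \Box \, r$ is then $p(r) = 0$, where
$$p(x) = ((1-x)a_0 + x a_1)((1-x)b_0 + x b_1) - ((1-x)c_0 + x c_1)$$
is a polynomial of degree at most 2 with real coefficients. There is one
such polynomial for each pair $(x_1, x_2)$ of values, and hence only
finitely many in all.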
3855:
3856: The converse follows by a straightforward application
3857: of compactness in first-order logic \cite{Enderton}.
3858: Suppose that there exists a sequence
3859: $\vec{r}^j$, $j = 1, 2, \ldots$ of vectors in $(0,1)^k$
3860: such that $\vec{r}^j \rightarrow (0,\ldots, 0)$
3861: as $j\rightarrow\infty$, and $X_1, \ldots, X_n$ are
3862: independent with respect to $\vecmu \, \Box \, \vec{r}^j$ for $j = 1, 2, 3,
3863: \ldots$. We now apply the compactness theorem.
3864: As I mentioned in the proof of Proposition~\ref{infiniteeq}, the
3865: compactness theorem says that,
given a collection of formulas, if each finite subset has a model, then
3867: so does the whole set.
3868: Consider a language with the function symbols $+$ and $\times$,
3869: the binary relation $\le$, a constant
3870: symbol $\mathbf{r}$ for each
3871: real number $r$, a unary predicate $N$ (representing the natural numbers),
3872: and constant symbols $p_{U}$ for each set $U \in
3873: \F$. Intuitively, $p_U$ represents $\nu(U)$.
3874: Consider the following (uncountable) collection of formulas:
3875: \begin{itemize}
3876: \item[(a)] All first-order formulas in this language true of the reals.
3877: %joe9
3878: %(This includes, for example, a formula such as $\forall x\forall y(x= y
3879: (This includes, for example, a formula such as $\forall x\forall y(x+ y
3880: = y+x)$, which says that addition is commutative, as well as formulas
3881: such as $\mathbf{2} + \mathbf{3} = \mathbf{5}$ and
3882: $\mathbf{\sqrt{2}} \times \mathbf{\sqrt{3}} = \mathbf{\sqrt{6}}$.)
3883: \item[(b)] Formulas $p_U > 0$ for $U \in \F'$ and $p_U = 0$ for $U \in \F -
3884: \F'$.
3885: \item[(c)] Formulas $p_U + p_V = p_{U \union V}$ if $U \inter V = \emptyset$.
3886: \item[(d)] The formula $p_W = 1$.
3887: \item[(e)] Formulas of the form $p_{X_1 = x_1} \times \cdots \times
3888: p_{X_n = x_n} =
3889: p_{X_1 = x_1 \inter \ldots \inter X_n = x_n}$, for all values $x_i \in
3890: \V(X_i)$, $i = 1, \ldots, n$; these formulas say that $X_1,
3891: \ldots, X_n$ are independent with respect to $\nu$.
\item[(f)] For every pair $Y$, $Y'$ of random variables such that
3893: $E_{\vecmu}(Y) \ge E_{\vecmu}(Y')$, a formula that says
3894: $E_{\nu}(Y) \ge E_{\nu}(Y')$, where $E_{\nu}(Y)$ and $E_{\nu}(Y')$ are
3895: expressed using the constant symbols $p_U$ (where the events $U$ are
3896: those of the form $Y=y$ and $Y'=y'$).
3897: %joe6
Note that this formula is finite, since $Y$ and $Y'$ are assumed to have
finite range. The formula would not be expressible in first-order logic
if $Y$ or $Y'$ had infinite range.
3901: \end{itemize}
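
For example, in part (f), if $Y$ takes on the values $y_1, \ldots, y_a$
and $Y'$ takes on the values $y'_1, \ldots, y'_b$ (the indices $a$ and $b$
are notation introduced only for this illustration), then, since
$E_{\nu}(Y) = \sum_{i=1}^a y_i \nu(Y = y_i)$, the formula saying that
$E_{\nu}(Y) \ge E_{\nu}(Y')$ can be taken to be
$$\mathbf{y'_1} \times p_{Y' = y'_1} + \cdots + \mathbf{y'_b} \times
p_{Y' = y'_b} \le \mathbf{y_1} \times p_{Y = y_1} + \cdots + \mathbf{y_a}
\times p_{Y = y_a}.$$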
3902:
3903: It is not hard to show that every finite subset of these formulas is
3904: satisfiable. Indeed, given a finite subset of formulas, there must
exist some $m$ such that taking $p_U = (\vecmu \, \Box \, \vec{r}^m)(U)$
will work (interpreting $\mathbf{r}$ as the real number $r$, of
course). The only nonobvious part is showing that we can deal with the
formulas in part (f); that we can do so follows from the proof of
Proposition 1 in \cite{BBD2}, which shows that
$E_{\vecmu}(Y') > E_{\vecmu}(Y)$ iff there exists some $M$ such that
$E_{\vecmu \, \Box \, \vec{r}^m}(Y') >
E_{\vecmu \, \Box \, \vec{r}^m}(Y)$ for all $m \ge M$.
3914:
3915: Since every finite set of formulas is satisfiable,
3916: by compactness, the infinite set is satisfiable. Let $\nu(U)$
3917: be the interpretation of $p_U$ in a model satisfying these formulas.
Then it is easy to check that the range of $\nu$ is an elementary
extension of the reals, $\nu \aeq \vecmu$, and
that $X_1, \ldots, X_n$ are independent with respect to $\nu$.
3921: \eprf
3922:
3923:
3924:
3925:
3926:
3927: \othm{KRindependence}
3928: $X_1, \ldots, X_n$ are strongly independent with respect to the Popper
3929: space $(W,\F,\F',\mu)$ iff there
3930: exists an NPS $(W,\F,\nu)$ such that
3931: %joe4
3932: %$\FNP(W,\F,\nu) = \mu$ and $X_1, \ldots,
3933: $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$ and $X_1, \ldots,
3934: X_n$ are independent with respect to $(W,\F,\nu)$.
3935: \eothm
3936:
3937: \prf It easily follows from Kohlberg and Reny's \citeyear[Theorem
3938: 2.10]{KR97} characterization of strong independence that if
3939: $X_1, \ldots, X_n$ are independent with respect to the NPS
3940: %joe3: 7/28/05
3941: %$(W,\F,\nu$ then $X_1, \ldots, X_n$ are strongly independent with respect to
3942: $(W,\F,\nu)$ then $X_1, \ldots, X_n$ are strongly independent with respect to
3943: $\FNP(W,\F,\nu)$.
3944: \commentout{
3945: The converse follows by a straightforward application
3946: of compactness in first-order logic \cite{Enderton}.
3947:
3948: Suppose that $(W,\F,\F',\mu)$ is a Popper space and
3949: $\mu_j \rightarrow \mu$ are as required for $X_1, \ldots, X_n$ to be
3950: strongly independent with respect to $\mu$.
3951: As I mentioned in the proof of Proposition~\ref{infiniteeq}, the
3952: compactness theorem says that,
3953: given a collection for formulas, if each finite subset has a model, then
3954: so does the whole set.
3955: Consider a language with the function symbols $+$ and $\times$,
3956: the binary relation $\le$, a constant
3957: symbol $\mathbf{r}$ for each
3958: real number $r$, and constant symbols $p_{U}$ for each set $U \in
3959: \F$. Intuitively, $p_U$ represents $\nu(U)$.
3960: Consider the following (uncountable) collection of formulas:
3961: \begin{itemize}
3962: \item All formulas true in fields (for example, $\forall x, y (x+ y =
3963: y+x)$, which says that addition is commutative).
3964: \item All true statements of the form $\mathbf{r_1} + \mathbf{r_2} =
3965: \mathbf{r_3}$ and $\mathbf{r_1} \times \mathbf{r_2} = \mathbf{r_3}$
3966: involving real constants $\mathbf{r_1}$, $\mathbf{r_2}$, $\mathbf{r_3}$
3967: (for example $\mathbf{2} + \mathbf{3} = \mathbf{5}$ and
3968: $\mathbf{\sqrt{2}} \times \mathbf{\sqrt{3}} = \mathbf{\sqrt{6}}$).
3969: \item Formulas $p_U > 0$ for $U \in \F'$ and $p_U = 0$ for $U \in \F -
3970: \F'$.
3971: \item Formulas $p_U + p_V = p_{U \union V}$ if $U \inter V = \emptyset$.
3972: \item The formula $p_W = 1$.
3973: \item Formulas of the form $p_{X_1 = x_1} \times \cdots \times p_{X_n = x_n} =
3974: p_{X_1 = x_1 \inter \ldots \inter X_n = x_n}$, for all values $x_i \in
3975: \V(X_i)$, $i = 1, \ldots, n$; these formulas say that $X_1,
3976: \ldots, X_n$ are independent with respect to $\nu$.
3977: \item Formulas of the form $(\mathbf{r - \frac{1}{n}})p_V \le p_{U \inter V}
3978: \le (\mathbf{r + \frac{1}{n}})p_V$ for all $U$, $V$, $\mathbf{r}$, and
3979: $\mathbf{n} > 0$ such that $\mu(U \mid V) = r$.
3980: \end{itemize}
3981:
3982: It is easy to see that every finite subset of these formulas is
3983: satisfiable. Indeed, given a finite subset of formulas, there must
3984: exist some $m$ such that taking $p_U = \mu_m(U)$ satisfies all the
3985: formulas (and interpreting $\mathbf{r}$ as the real number $r$, of
3986: course). By compactness, the infinite set is satisfiable. Let $\nu(U)$
3987: be the interpretation of $p_U$ in a model satisfying these formulas.
3988: Then it is easy to check that $\FLN(W,\F,\nu) = (W,\F,\F',\mu)$,
3989: and that $X_1, \ldots, X_n$ are independent with respect to $\nu$.
3990: }
3991: %\end{commentout}
3992:
3993: The converse follows using compactness, much as in the proof of
3994: Theorem~\ref{BBDstrongindependence}.
3995: Suppose that $(W,\F,\F',\mu)$ is a Popper space and
3996: $\mu_j \rightarrow \mu$ are as required for $X_1, \ldots, X_n$ to be
3997: strongly independent with respect to $\mu$.
3998: Consider the same language as in the proof of
3999: Theorem~\ref{BBDstrongindependence}, and essentially the same
4000: collection of formulas, except that the formulas of part (f) are
4001: replaced by
4002: \begin{itemize}
4003: \item[(f$'$)] Formulas of the form $(\mathbf{r - \frac{1}{n}})p_V \le p_{U \inter V}
4004: \le (\mathbf{r + \frac{1}{n}})p_V$ for all $U$, $V$, $\mathbf{r}$, and
4005: $\mathbf{n} > 0$ such that $\mu(U \mid V) = r$.
4006: \end{itemize}
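
To see what the formulas in (f$'$) accomplish, note that if $V \in \F'$,
then $p_V > 0$ by part (b), so dividing by $p_V$ gives
$$\mathbf{r - \frac{1}{n}} \le \frac{p_{U \inter V}}{p_V} \le
\mathbf{r + \frac{1}{n}}$$
for every (standard) positive natural number $n$. Thus, in any model of
these formulas, if $\nu(U)$ is the interpretation of $p_U$, then
$\nu(U \mid V)$ differs from $r$ by at most an infinitesimal, so that
$\stand{\nu(U \mid V)} = r = \mu(U \mid V)$.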
4007:
4008: Again, it is easy to see that every finite subset of these formulas is
4009: satisfiable. Indeed, given a finite subset of formulas, there must
4010: exist some $m$ such that taking $p_U = \mu_m(U)$ satisfies all the
4011: formulas (and interpreting $\mathbf{r}$ as the real number $r$, of
4012: course). By compactness, the infinite set is satisfiable. Let $\nu(U)$
4013: be the interpretation of $p_U$ in a model satisfying these formulas.
Then it is easy to check that $\FNP(W,\F,\nu) = (W,\F,\F',\mu)$,
4015: and that $X_1, \ldots, X_n$ are independent with respect to $\nu$.
4016: \eprf
4017:
4018:
4019: \bibliographystyle{chicago}
4020: %\bibliographystyle{alpha}
4021: \bibliography{z,joe}
4022: \end{document}
4023:
4024: