0810.1019/E2.tex
1: 
2: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3: \chapter{Quantities, states, and statistics}\label{c.quants}
4: 
5: When considered in sufficient 
6: detail, no physical system is truly in global equilibrium; one can 
7: always find smaller or larger deviations. To describe these deviations, 
8: extra variables are needed, resulting in a more complete but also more 
9: complex model. At even higher resolution, this model is again 
10: imperfect and an approximation to an even more complex, better model. 
11: This refinement process may be repeated in several stages. 
12: At the most detailed stages, we transcend the frontier of 
13: current knowledge in physics, but even as this frontier recedes, 
14: deeper and deeper stages with unknown details are imaginable.
15: 
16: Therefore, it is desirable to have a meta-description of thermodynamics
17: that, starting with a detailed model, allows one to deduce the properties 
18: of each coarser model, in a way that all description levels are 
19: consistent with the current state of the art in physics. 
20: Moreover, the results should be as independent as possible of unknown 
21: details at the lower levels.
22: This meta-description is the subject of \bfi{statistical mechanics}. 
23: 
24: This chapter introduces the technical machinery of statistical 
25: mechanics, Gibbs states and the partition function, in a uniform 
26: way common to classical mechanics and quantum mechanics. 
27: As in the phenomenological case, the intensive variables determine 
28: the state (which now is a more abstract object), whereas the extensive 
29: variables now appear as values of other abstract objects called 
30: quantities. This change of setting allows the natural incorporation 
31: of quantum mechanics, where quantities need not commute, while 
32: values are numbers observable in principle, hence must 
33: satisfy the commutative law.
34: 
35: The operational meaning of the abstract concepts of quantities, 
36: states and values introduced in the following becomes apparent once we 
37: have recovered the phenomenological results of Chapter \ref{c.ctherm} 
38: from the abstract theory developed in this and the next chapter. 
39: Chapter \ref{c.models} discusses in more detail how the theory relates 
40: to experiment.
41: 
42: \at{adapt Section 1.5 to match the contents}
43: 
44: 
45: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
46: \section{Quantities}\label{s.quantities}
47: 
48: Any fundamental description of physical systems must give account of 
49: the numerical values of quantities observable in experiments when the 
50: system under consideration is in a specified state. Moreover, the form 
51: and meaning of states, and of what is observable in principle, must be 
52: clearly defined. We consider an axiomatic conceptual 
53: foundation on the basis of quantities\footnote{
54: We deliberately avoid the notion of observables, since it is not clear 
55: on a fundamental level what it means to `observe' something, and since 
56: many things (such as the fine structure constant, neutrino masses, 
57: decay rates, scattering cross sections) which can be observed in nature 
58: are only indirectly related to what is traditionally called an 
59: `observable' in quantum mechanics. The related problem of how to 
60: interpret measurements is discussed in Section \ref{s.measurement}.
61: } % end footnote
62: and their values, consistent with the conventions adopted by the 
63: International System of Units (SI) \cite{SI}, which declares:
64: ``{\em A quantity in the general sense 
65: is a property ascribed to phenomena, bodies, or substances that can 
66: be quantified for, or assigned to, a particular phenomenon, 
67: body, or substance. [...] 
68: The value of a physical quantity is the quantitative expression
69: of a particular physical quantity as the product of a number and a
70: unit, the number being its numerical value.}'' 
71: 
72: In different states, the quantities of a given system may have 
73: different values; the state (equivalently, the values determined by it) 
74: characterizes an individual system at a particular time.
75: Theory must therefore define what to consider as quantities, 
76: what as states, and how a state assigns values to a quantity.
77: Since quantities can be added, multiplied, compared, and integrated, 
78: the set of all quantities has an elaborate structure whose properties 
79: we formulate after the discussion of the following motivating example. 
80: 
81: \begin{example}\label{ex.Nlevel}
82: As a simple example satisfying the axioms to be 
83: introduced, the reader may think of an $N$-level quantum system.
84: The \bfi{quantities} are the elements of the algebra
85: $\Ez=\Cz^{N\times N}$ of square complex $N\times N$ matrices, the 
86: \bfi{constants} are the multiples of the identity matrix, the 
87: \bfi{conjugate} $f^*$ of $f$ is given by conjugate transposition, and 
88: the \bfi{integral} $\sint g = \tr g$ is the \bfi{trace}, the sum of the 
89: diagonal entries or, equivalently, the sum of the eigenvalues. 
90: The standard basis consisting of the $N$ \bfi{unit vectors} \idx{$e^k$} 
91: with a one in component $k$ and zeros in all other components 
92: corresponds to the $N$ levels of the quantum system. The Hamiltonian
93: $H$ is represented by a diagonal matrix $H=\Diag(E_1,\dots,E_N)$
94: whose diagonal entries $E_k$ are the \bfi{energy levels} of the system.
95: In the nondegenerate case, all $E_k$ are distinct, and the diagonal 
96: matrices comprise all functions of $H$. Quantities representing 
97: arbitrary nondiagonal matrices are less easy to interpret. However,
98: an important class of quantities are the matrices of the form
99: $P=\psi\psi^*$, where $\psi$ is a vector of norm 1; they satisfy 
100: $P^2=P=P^*$ and are the quantities observed in binary measurements
101: such as detector clicks; see Section \ref{s.qprob}.
102: The \bfi{states} of the $N$-level system are represented by a 
103: \bfi{density matrix} $\rho\in\Ez$, a positive semidefinite Hermitian 
104: matrix with trace one. The \bfi{value} of a quantity $f\in\Ez$
105: is the number $\<f\>=\tr \rho f$. The diagonal entries $p_k:=\rho_{kk}$ 
106: represent the probability for obtaining a response in a binary test for 
107: the $k$th quantum level; the off-diagonal entries $\rho_{jk}$
108: represent deviations from a classical mixture of quantum levels.
109: \end{example}
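
To make the roles of the diagonal and off-diagonal entries of $\rho$ 
concrete, we sketch the case $N=2$; the particular numbers below are 
chosen for illustration only. Let
\[
\rho=\half\left(\begin{array}{cc}1&\half\\ \half&1\end{array}\right),~~~
H=\Diag(E_1,E_2),~~~
\psi=\frac{1}{\sqrt{2}}\left(\begin{array}{c}1\\1\end{array}\right).
\]
Then $\rho$ is Hermitian with trace one and eigenvalues $\frac{3}{4}$ 
and $\frac{1}{4}$, hence a density matrix, and with $P=\psi\psi^*$ we 
find the values
\[
\<H\>=\tr\rho H=\half(E_1+E_2),~~~
\<P\>=\tr\rho\psi\psi^*=\psi^*\rho\psi=\frac{3}{4}.
\]
Replacing $\rho$ by its diagonal part $\Diag(\half,\half)$ leaves 
$\<H\>$ unchanged but gives $\<P\>=\half$. Thus the off-diagonal 
entries affect only the values of quantities that are not functions 
of $H$.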
110: 
111: \begin{dfn} ~\\
112: (i) A \bfi{$*$-algebra} is a set $\Ez$ together 
113: with operations on $\Ez$ defining for any two quantities $f,g\in\Ez$ 
114: the \bfi{sum} $f+g\in\Ez$, the 
115: \bfi{product} $fg\in\Ez$, and the \bfi{conjugate} $f^*\in\Ez$,
116: such that the following axioms (Q1)--(Q4) hold for all $\alpha\in\Cz$ 
117: and all $f,g,h\in\Ez$:
118: 
119: (Q1) 
120: ~$\Cz \subseteq \Ez$, i.e., complex numbers are special elements
121: called \bfi{constants}, for which addition, multiplication and 
122: conjugation have their traditional meaning. 
123: 
124: (Q2)
125: ~{$(fg)h=f(gh)$,~~ $\alpha f=f\alpha $,~~ $0f=0$,~~ $1f=f$.}
126: 
127: (Q3)
128: ~{$(f+g)+h=f+(g+h)$,~~ $f(g+h)=fg+fh$,~~ $f+0=f$.}
129: 
130: (Q4)
131: ~{$f^{**}=f$,~~ $(fg)^* =g^* f^* $,~~ $(f+g)^* =f^* +g^*$.}
132: 
133: (ii) A $*$-algebra $\Ez$ is called \bfi{commutative} if $fg=gf$ for 
134: all $f,g\in\Ez$, and  \bfi{noncommutative} otherwise.
135: The $*$-algebra $\Ez$ is called \bfi{nondegenerate} if
136: 
137: (Q5)
138: ~{$f^* f =0 \implies f =0$.}
139: 
140: (iii) We introduce the notation
141: \[
142: -f:=(-1)f,~~ f-g:=f+(-g), ~~~[f,g]:=fg-gf,
143: \]
144: \[
145: f^0:=1,~~ f^l:=f^{l-1}f~~~ (l=1,2,\dots ),
146: \]
147: \[
148: \re f := \half(f+f^*),~~~\im f := \frac{1}{2i}(f-f^*),
149: \]
150: for $f,g\in\Ez$.
151: $[f,g]$ is called the \bfi{commutator} of $f$ and $g$, and $\re f$, 
152: $\im f$ are referred to as the \bfi{real part} (or \bfi{Hermitian part})
153: and \bfi{imaginary part} of $f$, respectively. 
154: $f\in\Ez$ is called \bfi{Hermitian} if $f^*=f$. 
155: 
156: (iv) A \idx{$*$-homomorphism} is a mapping $\phi$ from a $*$-algebra 
157: $\Ez$ 
158: with unity to another (or the same) $*$-algebra $\Ez'$ with unity
159: such that
160: \[
161: \phi(f+g)=\phi(f)+\phi(g),~~~\phi(fg)=\phi(f)\phi(g),~~~
162: \phi(\alpha f)=\alpha\phi(f),
163: \]
164: \[
165: \phi(f^*)=\phi(f)^*,~~~\phi(1)=1
166: \]
167: for all $f,g$ in $\Ez$ and $\alpha\in\Cz$.
168: \end{dfn}
169: 
170: Note that we assume commutativity only for the product of numbers and
171: elements of $\Ez$. In general, the product of two elements of $\Ez$ 
172: is indeed noncommutative.
173: However, general commutativity of the addition is a consequence of our 
174: other assumptions. We prove this together with some other useful 
175: relations. 
176: 
177: \begin{prop}\label{p5.1.2}~\\
178: (i) For all  $f$, $g$, $h\in \Ez$,
179: \lbeq{e.p1}
180: (f+g)h=fh+gh,~~f-f=0,~~ f+g=g+f
181: \eeq
182: \lbeq{e.p2}
183: [f,f^*]=-2i[\re f,\im f].
184: \eeq
185: (ii) For all $f\in\Ez$, $\re f$ and $\im f$ are Hermitian. $f$ is 
186: Hermitian iff $f=\re f$ iff $\im f=0$. If $f,g$ are commuting 
187: Hermitian quantities then $fg$ is Hermitian, too.
188: \end{prop}
189: 
190: \bepf
191: (i) The right distributive law follows from
192: \[
193: \begin{array}{lll}
194: (f+g)h&=&((f+g)h)^{* *}=(h^* (f+g)^* )^* =(h^* (f^* +g^* ))^* \\
195: &=&(h^* f^* +h^* g^* )^* =(h^* f^* )^* +(h^* g^* )^* \\
196: &=&f^{* * }h^{* * }+g^{* * }h^{* * }=fh+gh.
197: \end{array}
198: \]
199: It implies $f-f=1f-1f=(1-1)f=0f=0$. From this, we may deduce that 
200: addition is commutative, as follows. The quantity $h:=-f+g$
201: satisfies
202: \[
203: -h=(-1)((-1)f+g)=(-1)(-1)f+(-1)g=f-g, 
204: \]
205: and we have
206: \[
207: f+g=f+(h-h)+g=(f+h)+(-h+g)=(f-f+g)+(f-g+g)=g+f. 
208: \]
209: This proves \gzit{e.p1}. If $u=\re f$, $v=\im f$ then $u^*=u,v^*=v$
210: and $f=u+iv, f^*=u-iv$. Hence 
211: \[
212: [f,f^*]=(u+iv)(u-iv)-(u-iv)(u+iv)=2i(vu-uv)=-2i[\re f,\im f],
213: \]
214: giving \gzit{e.p2}. 
215: 
216: (ii) The first two assertions are trivial, and the third holds since
217: $(fg)^*=g^*f^*=gf=fg$ if $f,g$ are Hermitian and commute.
218: \epf
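
As a quick check of \gzit{e.p2}, consider (as an illustration chosen 
here, in the setting of Example \ref{ex.Nlevel} with $N=2$) the 
non-Hermitian matrix
\[
f=\left(\begin{array}{cc}0&1\\0&0\end{array}\right),~~~
f^*=\left(\begin{array}{cc}0&0\\1&0\end{array}\right).
\]
Here $ff^*=\Diag(1,0)$ and $f^*f=\Diag(0,1)$, so that 
$[f,f^*]=\Diag(1,-1)$. On the other hand,
\[
\re f=\half\left(\begin{array}{cc}0&1\\1&0\end{array}\right),~~~
\im f=\half\left(\begin{array}{cc}0&-i\\i&0\end{array}\right),~~~
[\re f,\im f]=\frac{i}{2}\Diag(1,-1),
\]
and indeed $-2i[\re f,\im f]=\Diag(1,-1)$. Both parts are Hermitian, 
but they do not commute, in agreement with $f$ not commuting with 
$f^*$.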
219: 
220: 
221: \begin{dfn}~\\
222: (i) The $*$-algebra $\Ez$ is called \bfi{partially ordered} if there is
223: a partial order $\geq$ satisfying the following axioms (Q6)--(Q9)
224: for all $f,g,h\in\Ez$:
225: 
226: (Q6) 
227: ~$\geq$ is reflexive ($f\geq f$),
228: antisymmetric ($f\geq g \geq f \Rightarrow f=g$),
229: and transitive ($f\geq g \geq h \Rightarrow f \geq h$).
230: 
231: (Q7)
232: ~{$f\geq g \implies f+h\geq g+h$.}
233: 
234: (Q8)
235: ~{$f\geq 0 \implies f=f^*$ and $g^*fg\geq 0$.}
236: 
237: (Q9)
238: ~ $1 \geq 0$.
239: 
240: We introduce the notation
241: \[ 
242: f \leq g :\Leftrightarrow g\geq f,
243: \]
244: \[
245: \|f\|:=\inf\{\alpha\in\Rz \mid f^*f \leq \alpha^2, \alpha\geq0 \},
246: \]
247: where the infimum of the empty set is taken to be $\infty$. The number
248: $\|f\|$ is referred to as the \bfi{(spectral) norm} of $f$. 
249: An element $f\in\Ez$ is called \bfi{bounded} if $\|f\|<\infty$.
250: The \bfi{uniform topology} is the topology induced on
251: $\Ez$ by declaring as open sets arbitrary unions of finite intersections 
252: of the \bfi{open balls} $\{f\in\Ez \mid \|f-f_0\|<\eps\}$ for some 
253: $\eps>0$ and some $f_0 \in\Ez$.
254: \end{dfn}
255: 
256: \begin{prop}\label{p1.3}~\\
257: (i) For all quantities  $f$, $g$, $h\in \Ez$ and $\lambda \in\Cz$,
258: \lbeq{e.p3}
259: f^*f\geq 0,~~ ff^*\geq 0.
260: \eeq
261: \lbeq{e.p4}
262: f^*f\leq 0 \implies \|f\|=0 \implies f=0,
263: \eeq
264: \lbeq{e.p5}
265: f\leq g \implies h^*fh\leq h^*gh,~|\lambda|f\leq|\lambda|g,
266: \eeq
267: \lbeq{e.p6}
268: f^*g+g^*f\leq 2\|f\|~\|g\|,
269: \eeq
270: \lbeq{e.p7}
271: \|\lambda f\|=|\lambda| \|f\|,~~~ \|f\pm g\|\leq \|f\|+\|g\|,
272: \eeq
273: \lbeq{e.p8}
274: \|f g\|\leq \|f\|~ \|g\|.
275: \eeq
276: (ii) Among the complex numbers, precisely the nonnegative real numbers
277: $\lambda$ satisfy $\lambda\geq 0$.
278: 
279: \end{prop}
280: 
281: \bepf
282: (i) \gzit{e.p3}--\gzit{e.p5} follow directly from 
283: (Q7) -- (Q9). Now let $\alpha=\|f\|$, $\beta=\|g\|$. Then 
284: $f^*f\leq \alpha^2$ and $g^*g\leq \beta^2$. Since
285: \[
286: \begin{array}{lll}
287: 0\leq (\beta f - \alpha g)^*(\beta f - \alpha g)&=&
288: \beta^2f^*f-\alpha\beta(f^*g+g^*f)+\alpha^2 g^*g\\
289: &\leq& \beta^2\alpha^2 -\alpha\beta(f^*g+g^*f) +\alpha^2\beta^2,
290: \end{array}
291: \] 
292: $f^*g+g^*f\leq 2\alpha\beta$ if $\alpha\beta\neq 0$, and for
293: $\alpha\beta=0$, the same follows from \gzit{e.p4}. Therefore
294: \gzit{e.p6} holds. The first half of \gzit{e.p7} is trivial, and
295: the second half follows for the plus sign from 
296: \[
297: (f+g)^*(f+g)=f^*f+f^*g+g^*f+g^*g
298: \leq \alpha^2+ 2\alpha\beta+\beta^2=(\alpha+\beta)^2,
299: \]
300: and then for the minus sign from the first half.
301: Finally, by \gzit{e.p5},
302: \[
303: (fg)^*(fg)=g^*f^*fg\leq g^*\alpha^2g=\alpha^2g^*g\leq\alpha^2\beta^2.
304: \]
305: This implies \gzit{e.p8}.
306: 
307: (ii) If $\lambda$ is a nonnegative real number then $\lambda=f^*f\geq0$ 
308: with $f=\sqrt{\lambda}$. If $\lambda$ is a negative real number then 
309: $\lambda=-f^*f\leq0$ with $f=\sqrt{-\lambda}$, and by antisymmetry,
310: $\lambda\geq0$ is impossible. If $\lambda$ is a nonreal number then 
311: $\lambda\neq\lambda^*$ and $\lambda\geq0$ is impossible by (Q8).
312: \epf
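
To illustrate the norm, we sketch a small example in 
$\Ez=\Cz^{2\times 2}$, assuming (this choice of order is ours and is 
not fixed by the axioms alone) that $f\geq g$ means that $f-g$ is 
Hermitian and positive semidefinite. For
\[
f=\left(\begin{array}{cc}0&1\\0&0\end{array}\right)
\]
we have $f^*f=\Diag(0,1)$, and $\alpha^2-f^*f=\Diag(\alpha^2,\alpha^2-1)$ 
is positive semidefinite iff $\alpha^2\geq 1$. Hence $\|f\|=1$, and in 
the same way $\|f^*\|=1$. Since $f^2=0$ and $ff^*=\Diag(1,0)$, we get
\[
\|f^2\|=0<\|f\|~\|f\|,~~~\|ff^*\|=1=\|f\|~\|f^*\|,
\]
so that \gzit{e.p8} may hold with strict inequality or with equality. 
Similarly, $(f+f^*)^*(f+f^*)=1$ gives $\|f+f^*\|=1\leq \|f\|+\|f^*\|=2$, 
consistent with \gzit{e.p7}.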
313: 
314: \begin{dfn}
315: A \bfi{Euclidean $*$-algebra} is a nondegenerate, partially ordered 
316: $*$-algebra $\Ez$, whose elements are called 
317: \bfi{quantities}, together with a complex-valued 
318: \bfi{integral} $\sint$ defined on a subspace $\Sz$ of $\Ez$,
319: whose elements are called \bfi{strongly integrable}, satisfying
320: the following axioms (EA1)--(EA6):
321: 
322: (EA1) ~ 
323: $g$ bounded, $h$ strongly integrable $~~\Rightarrow~~ h^*,gh,hg$ 
324: strongly integrable,
325: 
326: (EA2) ~
327: $ \sint h^* h > 0$ ~if $h \not= 0$,
328: 
329: (EA3) ~
330: $(\sint h) ^* = \sint h^*, ~~~\sint gh = \sint hg$,
331: 
332: (EA4) ~
333: $\sint h^* g h= 0$ for all strongly integrable $h~~\Rightarrow~~ g=0$~~~
334: \bfi{(nondegeneracy)},
335: 
336: (EA5) ~
337: $ \sint h_l^* h_l \to 0 ~~\Rightarrow~~ \sint g h_l \to 0$,~
338: $\sint h_l^* g h_l \to 0$,
339: 
340: (EA6) ~
341: $h_l\downto 0~~\Rightarrow~~ \inf\sint h_l=0$~~~
342: \bfi{(Dini property)}.
343: 
344: Here, integrals extend over the longest following product or quotient
345: (while later, differential operators act on the shortest syntactically 
346: meaningful term), and the \bfi{monotonic limit} is defined by
347: $g_l \downarrow 0$ iff, for every strongly integrable $h$, the sequence 
348: (or net) $\sint h^*g_lh$ consists of real numbers converging 
349: monotonically decreasing to zero. 
350: \end{dfn}
351: 
352: Note that the integral can often be naturally extended from strongly 
353: integrable quantities to a significantly larger space of integrable 
354: quantities.
355: 
356: \begin{prop}
357: \lbeq{e.ean4}
358: g\in\Ez,~~\sint gf = 0 \Forall f \in \Ez \implies g=0.
359: \eeq
360: For strongly integrable $g,h$,
361: \lbeq{e.intcs}
362: \sint (gh)^*(gh)\le \sint g^*g~\sint h^*h.~~~
363: \mbox{\bf (\bfi{Cauchy-Schwarz inequality})}
364: \eeq
365: In particular, every strongly integrable quantity is bounded.
366: \end{prop}
367: 
368: \bepf
369: 
370: If $\sint gf = 0$ for all $f \in \Ez$ then this holds in particular 
371: for $f=hh^*$. Thus $0=\sint ghh^*=\sint h^*gh$ by (EA3), and
372: (EA4) gives the desired conclusion \gzit{e.ean4}.
373: \gzit{e.intcs} holds since by (EA2), $\sint g^*h$ defines a positive 
374: definite inner product on $\Sz$, and directly implies the final 
375: statement.
376: \epf
377: 
378: We now describe the basic Euclidean $*$-algebras relevant in 
379: nonrelativistic physics. However, the remainder is completely
380: independent of the details of how the axioms are realized; a specific 
381: realization is needed only when doing specific quantitative 
382: calculations.
383: 
384: \begin{expls}\label{e3.1}~\\
385: (i) \bfi{($N$-level quantum systems)}
386: The simplest family of Euclidean 
387: $*$-algebras is the algebra $\Ez=\Cz^{N\times N}$ of 
388: square complex $N\times N$ matrices; cf. Example \ref{ex.Nlevel}.
389: Here the quantities are square matrices, the constants are the multiples 
390: of the identity matrix, the conjugate is conjugate transposition, and 
391: the integral is the trace, the sum of the diagonal entries or, 
392: equivalently, the sum of the eigenvalues. In particular, all quantities 
393: are strongly integrable. 
394: 
395: 
396: (ii) \bfi{(Nonrelativistic classical mechanics)}
397: An atomic $N$-particle system is described in classical mechanics by
398: the phase space $\Rz^{6N}$ with six coordinates -- position 
399: $x^a\in\Rz^3$ and momentum $p^a\in\Rz^3$ -- for each particle.
400: The algebra 
401: \[
402: \Ez_N:= C^\infty(\Rz^{6N})
403: \]
404: of smooth complex-valued functions 
405: $g(x^{1:N},p^{1:N})$  of positions and momenta is a commutative 
406: Euclidean $*$-algebra with complex conjugation as conjugate
407: and the \bfi{Liouville integral}
408: \[
409: \sint g=C^{-1} \int dp^{1:N}dx^{1:N} g(x^{1:N},p^{1:N}),
410: \]
411: where $C$ is a positive constant.
412: Strongly integrable quantities are the Schwartz functions in $\Ez_N$.
413: The axioms are easily verified.
414: 
415: (iii) \bfi{(Classical fluids)}
416: A fluid is classically described by an atomic system with an
417: indefinite number of particles. The appropriate Euclidean $*$-algebra 
418: for a single species of monatomic particles is the 
419: direct sum $\Ez=\D\oplus_{N\ge 0} \Ez_N$ whose quantities are 
420: infinite sequences $g=(g_0,g_1,...)$ of $g_N\in\Ez_N$, with 
421: $\Ez_N$ as in (ii), and weighted Liouville integral
422: \[
423: \sint g=\sum_{N\ge 0} 
424: C_N^{-1}\int dp^{1:N}dx^{1:N} g_N(x^{1:N},p^{1:N}).
425: \]
426: Here $C_N$ is a symmetry factor for the symmetry group of the
427: $N$-particle system, which equals $h^{3N}N!$ for indistinguishable
428: particles; $h= 2\pi \hbar$ is Planck's constant.
429: This accounts for the Maxwell statistics and gives the correct entropy 
430: of mixing. Classical fluids with monatomic particles of several 
431: different kinds require a tensor product of several such algebras, and 
432: classical fluids composed of molecules require additional degrees
433: of freedom to account for the rotation and vibration of the molecules.
434: 
435: 
436: (iv) \bfi{(Nonrelativistic quantum mechanics)} 
437: Let $\Hz$ be a Euclidean space, a dense subspace of a Hilbert space.
438: Then the algebra $\Ez:= \Lin \Hz$ of continuous linear operators 
439: on $\Hz$ is a Euclidean $*$-algebra with the adjoint as conjugate and
440: the \bfi{quantum integral}
441: \[
442:   \sint g= \tr g,
443: \]
444: given by the trace of the quantity in the integrand.
445: Strongly integrable quantities are the operators $g\in\Ez$ which 
446: are trace class; this includes all linear operators of finite rank. 
447: Again, the axioms are easily verified. In the quantum context, 
448: Hermitian quantities $f$ are often referred to as \bfi{observables};
449: but we do not use this term here.
450: 
451: \end{expls}
452: 
453: We end this section by stating some results needed later.
454: The exposition in this and the next chapter is fully rigorous if the 
455: statements of Proposition \ref{app2a.} and Proposition \ref{app1.}
456: are assumed in addition to (EA1)--(EA6).
457: We prove these propositions only in the case that $\Ez$ 
458: is finite-dimensional\footnote{
459: We would appreciate being informed about possible proofs in general that 
460: only use the properties of Euclidean $*$-algebras (and perhaps further, 
461: elementary assumptions).
462: }. % end footnote
463: But they can also be proved if the quantities 
464: involved are smooth functions, or if they have a spectral 
465: resolution; cf., e.g., \sca{Thirring} \cite{Thi} (who works in the
466: framework of $C^*$-algebras and von Neumann algebras). 
467: 
468: \bigskip
469: \begin{prop} \label{app2a.} 
470: For arbitrary quantities $f$, $g$,
471: \[
472: e^{\alpha f}e^{\beta f}=e^{(\alpha+\beta)f}~~(\alpha,\beta\in\Rz),
473: \]
474: \[
475: (e^f)^*=e^{f^*},
476: \]
477: \[
478: e^f g = g e^f ~~~\mbox{if $f$ and $g$ commute},
479: \]
480: \[
481: f^*=f \implies \log e^f=f,
482: \]
483: \[
484: f\ge 0 \implies \sqrt{f}\ge 0,~~(\sqrt{f})^2 =f.
485: \]
486: For any quantity $f=f(s)$ depending continuously on $s\in[a,b]$,
487: \[
488: \int_a^b ds \sint f(s) = \sint \Big(\int_a^b ds f(s)\Big),
489: \]
490: and for any quantity $f=f(\lambda)$ depending continuously 
491: differentiably on a parameter vector $\lambda$, 
492: \[
493: \frac{d}{d\lambda} \sint f = \sint df/d\lambda.
494: \]
495: \end{prop}
496: 
497: \bepf 
498: In finite dimensions, the first five assertions are standard 
499: matrix calculus, and the remaining two statements hold since $\sint f$ 
500: must be a finite linear combination of the components of $f$.
501: \epf
502: 
503: \begin{prop} \label{app1.}
504: Let $f,g$ be quantities depending continuously differentiably on a
505: parameter or parameter vector $\lambda $, and suppose that
506: \[
507: [f(\lambda ),g(\lambda )]=0\mbox { for all }\lambda.
508: \]
509: Then, for any continuously differentiable function $F$ of two
510: variables,
511: \lbeq{app1}
512: \frac {d} {d\lambda }\sint F(f,g)
513: =\sint\partial _1F(f,g)\frac {df} {d\lambda }
514: +\sint\partial _2F(f,g)\frac {dg } {d\lambda }\ .
515: \eeq
516: Here $\partial _1F$ and $\partial _2F$ denote differentiation with respect to the 
517: first and second argument of $F$, respectively.
518: \end{prop}
519: 
520: \bepf
521: We prove the special case $F(x,y)=x^my^n$, where (\ref{app1}) reduces
522: to
523: \lbeq{app2}
524: \frac {d} {d\lambda }\sint f^mg^n
525: =\sint mf^{m-1}g^n\frac {df} {d\lambda }
526: +\sint nf^mg^{n-1}\frac {dg} {d\lambda}.
527: \eeq
528: The general case then follows for polynomials $F(x,y)$ by taking
529: suitable linear combinations, and for arbitrary $F$ by a limiting
530: procedure. To prove (\ref{app2}), we note that, more generally,
531: \[ 
532: \begin{array}{lll}
533: \D\frac {d} {d\lambda }\sint f_1\dots f_{m+n}
534: &=\sint\frac {d} {d\lambda }(f_1\dots f_{m+n})\\
535: &\D=\sint\sum _{j=1} ^{m+n}f_1\dots f_{j-1}\frac {df_j} {d\lambda }
536: f_{j+1}\dots f_{m+n} \\
537: &\D=\sum _{j=1} ^{m+n}\sint f_1\dots f_{j-1}\frac {df_j} {d\lambda }
538: f_{j+1}\dots f_{m+n} \\
539: &\D=\sum _{j=1} ^{m+n}\sint f_{j+1}\dots f_{m+n}f_1\dots f_{j-1}
540: \frac {df_j} {d\lambda }\ , 
541: \end{array}
542: \]
543: using the cyclic commutativity (EA3) of the integral.
544: If we specialize to $f_j=f$ if $j\le m$, $f_j=g$ if $j>m$, and note
545: that $f$ and $g$ commute, we arrive at (\ref{app2}).
546: \epf
547: 
548: Of course, the proposition generalizes to families of more than two
549: commuting quantities; but more important is the special case $g=f$:
550: 
551: \begin{cor} \label{app2.}
552: For any quantity $f$ depending continuously differentiably on a
553: parameter vector $\lambda $, and any continuously differentiable 
554: function $F$ of a single variable,
555: \lbeq{app3}
556: \frac {d} {d\lambda }\sint F(f)=\sint F'(f)\frac {df} {d\lambda }.
557: \eeq
558: \end{cor}
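
As a sketch of how \gzit{app3} is typically used, take $F(x)=e^{-x}$ 
and $f=\beta H$, where $\beta$ is a real parameter and $H$ is a 
Hermitian quantity not depending on $\beta$ (both are assumptions made 
here for illustration). Then $F'(f)=-e^{-\beta H}$ and $df/d\beta=H$, 
so that
\[
\frac{d}{d\beta}\sint e^{-\beta H}=-\sint e^{-\beta H}H.
\]
If $Z:=\sint e^{-\beta H}$ is finite and positive, division by $Z$ 
gives
\[
-\frac{d}{d\beta}\log Z = Z^{-1}\sint e^{-\beta H}H,
\]
which expresses a weighted integral of $H$ as a derivative of a single 
scalar function of $\beta$.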
559: 
560: 
561: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
562: \section{Gibbs states}\label{s.gibbs}
563: 
564: Our next task is to specify the formal properties of the value of a 
565: quantity. 
566: 
567: \begin{dfn}\label{d.state}
568: A \bfi{state} is a mapping $^-$ that assigns to all quantities $f$
569: from a subspace of $\Ez$ containing all bounded quantities
570: its \bfi{value} $\overline{f}=:\< f\> \in \Cz$ 
571: such that for all $f,g \in \Ez$, $\alpha \in \Cz$,
572: 
573: (E1)~ $\<1\> =1, ~~\<f^*\>=\<f\>^*,~~ \< f+g\> =\<f\> +\<g\> $, 
574: 
575: (E2)~ $\<\alpha f\> =\alpha\<f\>$, 
576: 
577: (E3)~ If $f \ge 0$ then $\<f\> \ge 0$,
578: 
579: (E4)~ If $f_l\in\Ez,~ f_l \downarrow 0$ then $\<f_l\> \downarrow  0$.
580: \end{dfn}
581: 
582: Note that this formal definition of a state -- always used in the 
583: remainder of the book -- differs from the phenomenological 
584: thermodynamic states defined in Section \ref{s.phen}. 
585: The connection between the two notions will be made in
586: Section \ref{s.eos}. 
587: 
588: Statistical mechanics essentially originated with Josiah Willard Gibbs,
589: whose 1902 book \sca{Gibbs} \cite{Gib} on (at that time of course 
590: only classical) statistical mechanics is still readable. See
591: \sca{Uffink} \cite{Uff} for a history of the subject. 
592: 
593: All states arising in thermodynamics have the following
594: particular form.
595: 
596: \begin{dfn} \label{2.7.}
597: A \bfi{Gibbs state} is defined by assigning to any $g\in\Ez$ the value
598: \lbeq{2-10a}
599: \<g\>:=\sint e^{-S/\kbar} g, 
600: \eeq
601: where $S$, called the \bfi{entropy} of the state, is a Hermitian 
602: quantity with strongly integrable $e^{-S/\kbar}$, satisfying the 
603: normalization condition
604: \lbeq{2-10}
605: \sint e^{-S/\kbar}=1,
606: \eeq
607: and $\kbar$ is the Boltzmann constant
608: \lbeq{e.kbar}
609: \kbar \approx 1.38065 \cdot 10^{-23} J/K.
610: \eeq
611: Theorem \ref{2.6.} below implies that a Gibbs state is indeed a state.
612: \end{dfn}
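
For orientation, we sketch what this definition means in the $N$-level 
system of Example \ref{ex.Nlevel}, where $\sint=\tr$. Comparing 
\gzit{2-10a} with $\<g\>=\tr\rho g$ shows that a Gibbs state has the 
density matrix $\rho=e^{-S/\kbar}$. Conversely, under the additional 
assumption (made here) that $\rho$ is positive definite, so that 
$\log\rho$ exists, the state defined by $\rho$ is a Gibbs state with
\[
S=-\kbar\log\rho,~~~
\<S\>=-\kbar\tr\rho\log\rho=-\kbar\sum_{k=1}^N r_k\log r_k,
\]
where $r_1,\dots,r_N$ denote the eigenvalues of $\rho$, and the 
normalization condition \gzit{2-10} is just $\tr\rho=1$. For example, 
$\rho=N^{-1}$ gives the constant entropy $S=\kbar\log N$ with value 
$\<S\>=\kbar\log N$.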
613: 
614: The Boltzmann constant defines the units in which the entropy is 
615: measured. In analogy\footnote{
616: As we shall see in \gzit{e.qmunc} and \gzit{e.thunc}, $\hbar$ and 
617: $\kbar$ play indeed analogous roles in quantum mechanical and 
618: thermodynamic uncertainty relations.
619: } % end footnote
620: with Planck's constant $\hbar$,
621: we write $\kbar$ in place of the customary $k$ or $k_B$, in order to 
622: be free to use the letter $k$ for other purposes. 
623: By a change of units one can enforce any value of $\kbar$.
624: Chemists use instead of particle number $N$ the corresponding \bfi{mole 
625: number}, which differs by a fixed numerical factor, the \bfi{Avogadro
626: constant} 
627: \[
628: N_A=R/\kbar \approx 6.02214 \cdot 10^{23}\fct{mol}^{-1}, 
629: \]
630: where $R$ is the  universal gas constant \gzit{e.R}.
631: As a result, all results from statistical mechanics may be translated 
632: to phenomenological thermodynamics by setting $\kbar = R$, 
633: corresponding to setting $1 \fct{mol} =  6.02214 \cdot 10^{23}$,
634: the number of particles in one mole of a pure substance.
635: 
636: What is here called entropy has a variety of alternative names in the 
637: literature on statistical mechanics. For example,
638: \sca{Gibbs} \cite{Gib}, who first noticed the rich thermodynamic 
639: implications of states defined by \gzit{2-10a}, called $-S$ the 
640: {\em index of probability};  
641: \sca{Alhassid \& Levine} \cite{AlhL} and \sca{Balian} \cite{Bal2}
642: use the name {\em surprisal} for $S$.  Our terminology is close to 
643: that of \sca{Mrugala} et al. \cite{MruNSS}, who call 
644: $S$ the {\em microscopic entropy}, and \sca{Hassan} et al. \cite{HasVL},
645: who call $S$ the {\em information(al) entropy operator}. 
646: What is traditionally (and in Section \ref{s.phen}) called entropy 
647: and denoted by $S$
648: is in the present setting the value $\ol S=\<S\>$.
649: 
650: 
651: \begin{thm} \label{2.6.}~\\
652: (i) A Gibbs state determines its entropy uniquely.
653: 
654: (ii) For any Hermitian quantity $f$ with strongly integrable $e^{-f}$, 
655: the mapping $\<\cdot\>_f$ defined by 
656: \lbeq{2-6a}
657: \< g \>_f:=Z_f^{-1}\sint e^{-f} g,~~~\mbox{where } Z_f:=\sint e^{-f},
658: \eeq
659: is a state. It is a Gibbs state with entropy 
660: \lbeq{2-8}
661: S_f:=\kbar (f+\log Z_f).
662: \eeq
663: (iii) The \bfi{KMS condition} (cf. \sca{Kubo} \cite{Kub0},
664: \sca{Martin \& Schwinger} \cite{MarS})
665: \lbeq{e.KMS}
666: \<gh\>_f = \<hQ_f g\>_f ~~~\mbox{for bounded } g,h
667: \eeq
668: holds. Here $Q_f$ is the linear mapping defined by
669: \[
670: Q_f g :=e^{-f}ge^{f}.
671: \]
672: \end{thm}
673: 
674: \bepf 
675: (i) If the entropies $S$ and $S'$ define the same Gibbs state then 
676: \[
677: \sint (e^{-S/\kbar}-e^{-S'/\kbar}) g = \<g\>-\<g\>=0 
678: \]
679: for all $g$, hence  \gzit{e.ean4} gives $e^{-S/\kbar}-e^{-S'/\kbar}=0$. 
680: This implies that $e^{-S/\kbar}=e^{-S'/\kbar}$, hence $S=S'$ by 
681: Proposition \ref{app2a.}.
682: 
683: (ii) The quantity $d:=e^{-f/2}$ is nonzero and satisfies $d^*=d$, 
684: $e^{-f}=d^*d\geq 0$. Hence $Z_f>0$ by (EA2), and $\rho:=Z_f^{-1}e^{-f}$ 
685: is Hermitian and nonnegative. For $h\ge 0$, the quantity $g=\sqrt{h}$
686: is Hermitian (by Proposition \ref{app2a.}) and satisfies 
687: $g\rho g^*=Z_f^{-1}(gd)(gd)^* \ge 0$, hence 
688: by (EA3),
689: \[
690: \<h\>_f=\<g^*g\>_f= \sint \rho g^*g =\sint g\rho g^* \ge 0. 
691: \] 
692: Moreover, $\<1\>_f =Z_f^{-1}\sint e^{-f}=1$. Similarly, if $g\ge 0$ 
693: then $g=h^*h$ with $h=\sqrt{g}=h^*$ and with $k:=e^{-f/2}h$, we get 
694: \[
695: Z_f\<g\>_f = \sint e^{-f}hh^*=\sint h^*e^{-f}h = \sint k^*k \ge 0.
696: \]
697: This implies (E3). The other axioms among (E1)--(E4) follow easily from the 
698: corresponding properties of the integral. Thus $\<\cdot\>_f$ is a state.
699: Finally, with the definition \gzit{2-8}, we have
700: \[
701: Z_f^{-1}e^{-f}=e^{-f-\log Z_f}=e^{-S_f/\kbar}, 
702: \]
703: whence $\<\cdot\>_f$ is a Gibbs state.
704: 
705: (iii) By (EA3),
706: $\<hQ_fg\>_f=\sint e^{-f}hQ_fg=\sint Q_fge^{-f}h =\sint e^{-f}gh 
707: =\<gh\>_f$.
708: \epf
709: 
710: Note that the state \gzit{2-6a} is unaltered when $f$ is 
711: shifted by a constant. $Q_f$ is called the \bfi{modular automorphism}
712: of the state $\<\cdot\>_f$ since $Q_f(gh)=Q_f(g)Q_f(h)$; for a classical
713: system, $Q_f$ is the identity. In the following, we shall not make use
714: of the KMS condition; however, it plays an important role in the 
715: mathematics of the thermodynamic limit (cf. \sca{Thirring} \cite{Thi}).
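
To see what $Q_f$ does in a simple case, we sketch the $N$-level 
system of Example \ref{ex.Nlevel} with $f=\beta H$ and 
$H=\Diag(E_1,\dots,E_N)$; the parameter $\beta$ is introduced here 
only for illustration. Since $e^{\pm f}$ is diagonal, 
\[
(Q_f g)_{jk}=e^{-\beta E_j}g_{jk}e^{\beta E_k}
=e^{-\beta(E_j-E_k)}g_{jk}.
\]
Thus $Q_f$ fixes all diagonal matrices and rescales the off-diagonal 
entries. The KMS condition \gzit{e.KMS} can then be checked entry by 
entry:
\[
Z_f\<hQ_fg\>_f=\sum_{j,k}e^{-\beta E_k}h_{kj}e^{-\beta(E_j-E_k)}g_{jk}
=\sum_{j,k}e^{-\beta E_j}g_{jk}h_{kj}=Z_f\<gh\>_f.
\]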
716: 
717: $Z_f$ is called the \bfi{partition function} of $f$; it is a function of
718: whatever parameters appear in a particular form given to $f$ in the
719: applications. A large part of traditional statistical mechanics is 
720: concerned with the calculation, for given $f$, of the partition 
721: function $Z_f$ and of the values $\<g\>_f$ for selected quantities $g$. 
722: As we shall see, the basic results of 
723: statistical mechanics are completely independent of the details 
724: involved, and it is this basic part that we concentrate upon in this 
725: book.
726: 
727: \begin{expl}\label{ex.canonical}
728: A \bfi{canonical ensemble}\footnote{\label{f.ensemble} 
729: Except in the traditional notions of a microcanonical, canonical, or 
730: grand canonical ensemble, we avoid the term \bfi{ensemble} which in 
731: statistical mechanics is de facto used as a synonym for state but
732: often has the connotation of a large real or imagined 
733: collection of identical copies of a system. The latter interpretation 
734: has well-known difficulties explaining why each single 
735: macroscopic system is described correctly by thermodynamics;
736: see, e.g., \sca{Sklar} \cite{Skl}.
737: } % end footnote
738: is defined as a Gibbs state whose entropy is an affine function of a 
739: Hermitian quantity $H$, called the \bfi{Hamiltonian}:
740: \[
741: S=\beta H + \const,
742: \]
743: with a constant depending on $\beta$, computable from \gzit{2-8} and
744: the partition function
745: \[
746: Z=\sint e^{-\beta H}
747: \]
748: of $f=\beta H$.
749: In particular, in the quantum case, where $\sint$ is the trace, the
750: finiteness of $Z$ implies that 
751: $S$ and hence $H$ must have a discrete spectrum that is bounded below. 
752: Hence the partition function takes the familiar form
753: \lbeq{e3.3}
754: Z=\tr e^{-\beta H} = \sum_{n \in \cal N} e^{-\beta E_n},
755: \eeq
756: where the $E_n$ ($n\in\cal N$) are the \bfi{energy levels}, the 
757: eigenvalues of $H$.
758: If the spectrum of $H$ is known, this leads to explicit formulas for 
759: $Z$. For example, a \bfi{two level system} is defined by the energy 
760: levels $0,E$ (or $E_0$ and $E_0+E$, which gives the same results), 
761: and has
762: \lbeq{e.2level}
763: Z=1+e^{-\beta E}.
764: \eeq
765: It describes a single \bfi{Fermion mode}, but also many other systems
766: at low temperature; cf. \gzit{e.2levelapprox}. In particular, it is the 
767: basis of laser-induced chemical reactions in photochemistry (see, e.g., 
768: \sca{Karlov} \cite{Kar}, \sca{Murov} et al. \cite{MurCH}), where 
769: only two electronic energy levels (the ground state and the first 
770: excited state) are relevant; cf. the discussion of 
771: \gzit{e.2levelapprox} below.
772: 
773: For a \bfi{harmonic oscillator}, defined by the energy levels $nE$, 
774: $n=0,1,2,\dots$ and describing a single \bfi{Boson mode}, we have
775: \[
776: Z=\sum_{n=0}^\infty e^{-n\beta E} = (1-e^{-\beta E})^{-1}.
777: \]
778: Independent modes are modelled by taking tensor products of single 
779: mode algebras and adding their Hamiltonians, leading to spectra which 
780: are obtained by summing the eigenvalues of the modes in all possible 
781: ways. The resulting partition function is the product of the 
782: single-mode partition functions. 
\at{expand? treat Maxwell case? $\sint f = \sum_n f(n)/n!$}
784: From here, a thermodynamic limit 
785: leads to the properties of ideal gases. Then nonideal gases due to 
786: interactions can be handled using the cumulant expansion, as 
787: indicated at the end of Section \ref{s.gen}. The details are outside 
788: the scope of this book.
789: \end{expl}
790: 
791: Since the Hamiltonian can be any Hermitian quantity, the quantum 
792: partition function formula \gzit{e3.3} can in principle be used to
793: compute the partition function of arbitrary quantized Hermitian 
794: quantities.
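
As a hedged illustration of how \gzit{e3.3} is used, here is a sketch 
for the quantum case with $\sint=\tr$. It assumes that the sum over the 
energy levels may be differentiated term by term, and the weights 
$p_n:=e^{-\beta E_n}/Z$ are notation introduced only for this 
illustration. The value of the Hamiltonian in the canonical ensemble is 
then
\[
\<H\> = Z^{-1}\tr He^{-\beta H} = \sum_{n\in\cal N} p_nE_n
= -\frac{d}{d\beta}\log Z.
\]
For the two level system \gzit{e.2level} and for the harmonic 
oscillator, this gives
\[
\<H\> = \frac{Ee^{-\beta E}}{1+e^{-\beta E}} = \frac{E}{e^{\beta E}+1}
~~~\mbox{and}~~~
\<H\> = \frac{Ee^{-\beta E}}{1-e^{-\beta E}} = \frac{E}{e^{\beta E}-1},
\]
respectively.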
795: 
796: 
797: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
798: \section{Kubo product and generating functional} \label{s.gen}
799: 
800: 
801: 
802: The negative logarithm of the partition function, the so-called
803: generating functional, plays a fundamental role in statistical 
804: mechanics.
805: 
806: We first discuss a number of general properties, discovered by 
807: \sca{Gibbs} \cite{Gib}, \sca{Peierls} \cite{Pei}, 
808: \sca{Bogoliubov} \cite{Bog}, \sca{Kubo} \cite{Kub}, 
809: \sca{Mori} \cite{Mor}, and \sca{Griffiths} \cite{Gri}. 
810: The somewhat technical setting involving the Kubo inner product is
811: necessary to handle noncommuting quantities correctly; 
812: everything would be much easier in the classical case.
813: On a first reading, the proofs in this section may be skipped.
814: 
815: \begin{prop} Let $f$ be Hermitian such that $e^{sf}$ is strongly
816: integrable for all $s\in[-1,1]$. Then 
817: \lbeq{e.kubo}
818: \<g;h\>_f:=\<g E_f h\>_f,
819: \eeq
820: where $E_f$ is the linear mapping defined for Hermitian $f$ by
821: \[
822: E_f h:=\int_0^1 ds\, e^{-sf}he^{sf},
823: \]
824: defines a bilinear, positive definite inner product 
825: $\<\cdot\,;\cdot\>_f$ on the algebra of quantities, 
826: called the \bfi{Kubo} (or \bfi{Mori} or \bfi{Bogoliubov}) 
827: \bfi{inner product}.
For all quantities $g,h$, the following relations hold:
829: \lbeq{e.kubo2}
830: \<g;h\>_f^* =\<h^*;g^*\>_f.
831: \eeq
832: \lbeq{e.definit}
833: \<g^*;g\>_f > 0 ~~~\mbox{if } g \ne 0.
834: \eeq
835: \lbeq{e.kubo1}
836: \<g;h\>_f =g\<h\>_f ~~~\mbox{if $g\in \Cz$},
837: \eeq
838: \lbeq{e.kubo0}
839: \<g;h\>_f =\<gh\>_f ~~~\mbox{if $g$ or $h$ commutes with $f$},
840: \eeq
841: \lbeq{e.E0}
E_f g = g ~~~\mbox{if $g$ commutes with $f$}.
843: \eeq
844: If $f=f(\lambda)$ depends continuously differentiably on the
845: real parameter vector $\lambda$ then 
846: \lbeq{e.deriv0}
847: \frac{d}{d\lambda} e^{-f} = - \Big(E_f \frac{df}{d\lambda}\Big)e^{-f}.
848: \eeq
849: \end{prop}
850: 
851: \bepf
852: (i) We have
853: \[
854: \<g;h\>_f^* =\<(gE_fh)^*\>_f = \<(E_fh)^*g^*\>_f
855: =\Big\<\int_0^1 ds\,e^{sf}h^*e^{-sf}g^*\Big\>_f
856: =\int_0^1 ds\<e^{sf}h^*e^{-sf}g^*\>_f.
857: \]
858: The integrand equals
859: \[
860: \sint e^{-f}e^{sf}h^*e^{-sf}g^* = \sint e^{sf}e^{-f}h^*e^{-sf}g^* 
861: =\sint e^{-f}h^*e^{-sf}g^*e^{sf} = \<h^*e^{-sf}g^*e^{sf}\>_f
862: \]
863: by (EA3), hence
864: \[
865: \<g;h\>_f^* = \int_0^1 ds\<h^*e^{-sf}g^*e^{sf}\>_f
866: = \Big\<h^* \int_0^1 ds\,e^{-sf}g^*e^{sf}\Big\>_f
867: = \<h^*E_fg^*\>_f=\<h^*;g^*\>_f.
868: \]
869: Thus \gzit{e.kubo2} holds.
870: 
(ii) Suppose that $g\ne 0$. For $s\in[0,1]$, we define $u=s/2,v=(1-s)/2$
and $g(s):= e^{-uf}ge^{-vf}$. Since $f$ is Hermitian, 
$g(s)^*= e^{-vf}g^*e^{-uf}$, hence by (EA3) and (EA2),
\[
\sint  e^{-f}g^*e^{-sf}ge^{sf}=\sint e^{-vf}g^*e^{-2uf}ge^{-vf}
=\sint g(s)^*g(s)>0, 
877: \]
878: so that
879: \[
880: \<g^*;g\>_f=\<g^*E_fg\>_f
881: =\int_0^1 ds\,\sint e^{-f}g^*e^{-sf}ge^{sf} > 0.
882: \]
883: This proves \gzit{e.definit}, and shows that the Kubo inner product is 
884: positive definite.
885: 
886: (iii) If $f$ and $g$ commute then $ge^{sf}=e^{sf}g$, hence 
887: \[
E_fg=\int_0^1 ds\, e^{-sf}e^{sf} g = \int_0^1 ds\, g = g,
889: \]
890: giving \gzit{e.E0}. The definition of the Kubo inner product then
891: implies \gzit{e.kubo0}, and taking $g\in\Cz$ gives \gzit{e.kubo1}.
892: 
893: (iv) The function $q$ on $[0,1]$ defined by
894: \[
895: q(t):= \int_0^t ds\, e^{-sf}\frac{df}{d\lambda}e^{sf}
896: +\Big(\frac{d}{d\lambda}e^{-tf}\Big) e^{tf}
897: \]
898: satisfies $q(0)=0$ and 
899: \[
900: \frac{d}{dt}q(t) = e^{-tf}\frac{df}{d\lambda}e^{tf}
901: +\Big(\frac{d}{d\lambda}e^{-tf}\Big)f e^{tf}
902: +\frac{d}{d\lambda}(-e^{-tf}f) e^{tf} = 0.
903: \]
904: Hence $q$ vanishes identically. In particular, $q(1)=0$, giving 
905: \gzit{e.deriv0}.  
906: \epf
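
To make the mapping $E_f$ more concrete, here is a sketch under a 
simplifying assumption that is not made in the text: the quantities are 
finite-dimensional matrices, and $f$ is diagonal with real eigenvalues 
$\varphi_1,\dots,\varphi_n$. The symbols $\varphi_j$ and $h_{jk}$ (the 
matrix entries of $h$) are introduced only for this illustration. Then 
$(e^{-sf}he^{sf})_{jk}=e^{-s(\varphi_j-\varphi_k)}h_{jk}$, so that
\[
(E_fh)_{jk}=h_{jk}\int_0^1 ds\,e^{-s(\varphi_j-\varphi_k)}
=\left\{\begin{array}{ll}
h_{jk}\,\D\frac{1-e^{-(\varphi_j-\varphi_k)}}{\varphi_j-\varphi_k}
& \mbox{if } \varphi_j\ne\varphi_k,\\[3mm]
h_{jk} & \mbox{if } \varphi_j=\varphi_k.
\end{array}\right.
\]
In particular, $E_fh=h$ whenever $h_{jk}=0$ for all $j,k$ with 
$\varphi_j\ne\varphi_k$, i.e., whenever $h$ commutes with $f$, in 
agreement with \gzit{e.E0}.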
907: 
908: As customary in thermodynamics, we use differentials to express
909: relations involving the differentiation by arbitrary parameters.
910: To write \gzit{e.deriv0} in differential form, we formally multiply by 
911: $d\lambda$, and obtain the \bfi{quantum chain rule} for exponentials,
912: \lbeq{e.chain}
913: d e^{-f} = (- E_fd f) e^{-f}.
914: \eeq
915: If the $f(\lambda)$ commute for all values of $\lambda$
916: then the quantum chain rule reduces to the classical chain rule.
917: Indeed, then $f$ commutes also with $\frac{df}{d\lambda}$; hence 
918: $E_f\frac{df}{d\lambda} = \frac{df}{d\lambda}$, and $E_fd f = df$.
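
As a hedged illustration of \gzit{e.deriv0}, consider the following 
sketch, in which $f_0$ and $g$ are arbitrary fixed Hermitian quantities 
introduced here, and sufficient regularity for a first order Taylor 
expansion is assumed. Take $f(\lambda)=f_0+\lambda g$ with a single 
real parameter $\lambda$. Then $df/d\lambda=g$, and \gzit{e.deriv0} at 
$\lambda=0$ gives
\[
e^{-(f_0+\lambda g)} 
= e^{-f_0}-\lambda\int_0^1 ds\,e^{-sf_0}ge^{-(1-s)f_0}+O(\lambda^2),
\]
since $(E_{f_0}g)e^{-f_0}=\int_0^1 ds\,e^{-sf_0}ge^{sf_0}e^{-f_0}$.
If $g$ commutes with $f_0$, the integral reduces to $ge^{-f_0}$, the 
first order term of the ordinary exponential series.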
919: 
920: \bigskip
921: {\em The following theorem is central to the mathematics of 
922: statistical mechanics.}
923: As will be apparent from the discussion in the next chapter, 
924: part (i) is the 
925: abstract mathematical form of the second law of thermodynamics, 
926: part (ii) allows the actual computation of thermal properties from
927: microscopic assumptions, and part (iii) is the abstract form of the 
928: first law.
929: 
930: \begin{thm} \label{t3.3}
931: Let $f$ be Hermitian such that $e^{sf}$ is strongly
932: integrable for all $s\in[-1,1]$. 
933: 
934: (i) The \bfi{generating functional}
935: \lbeq{e.gen}
936: W(f):=- \log \sint e^{-f}
937: \eeq
938: is a concave function of the Hermitian quantity $f$.
939: In particular,
940: \lbeq{e.GB}
941: W(g) \le W(f)+\<g-f\>_f.~~~
942: \mbox{\bf (\bfi{Gibbs-Bogoliubov inequality})} 
943: \eeq
944: Equality holds in \gzit{e.GB} iff $f$ and $g$ differ by a constant. 
945: 
946: (ii) For Hermitian $g$, we have
947: \lbeq{e.starh}
948: W(f+\tau g)=W(f)-\log\<e^{-f-\tau g}e^f\>_f.
949: \eeq
950: Moreover, the \bfi{cumulant expansion}
951: \lbeq{e.cumulant}
952: W(f+\tau g)
953: = W(f)+\tau\<g\>_f + \frac{\tau^2}{2}(\<g\>_f^2-\<g;g\>_f) + O(\tau^3)
954: \eeq
955: holds if the coefficients are finite.
956: 
957: (iii) If $f=f(\lambda)$ and $g=g(\lambda)$ depend continuously 
958: differentiably on $\lambda$ then the following \bfi{differentiation 
959: formulas} hold:
960: \lbeq{e.diff}
961: d\<g\>_f = \<dg\>_f-\<g;df\>_f+\<g\>_f\<df\>_f, 
962: \eeq
963: \lbeq{e.diffW}
964: dW(f)=\<df\>_f.
965: \eeq
966: (iv) The entropy of the state $\<\cdot\>_f$ is
967: \lbeq{e.ent}
968: S=\kbar(f-W(f)).
969: \eeq
970: \end{thm}
971: 
972: \bepf
973: We prove the assertions in reverse order.
974: 
975: (iv) Equation \gzit{e.gen} says that $W(f)=-\log Z_f$, which together 
976: with \gzit{2-8} gives \gzit{e.ent}.
977: 
978: (iii) We have 
979: \[
980: \bary{lll}
981: d\sint ge^{-f} &=& \sint dg e^{-f} + \sint gde^{-f}
982: =\sint dge^{-f}-\sint gE_fd fe^{-f}\\
983: &=&\sint(dg-gE_fd f)e^{-f} = Z_f\<dg-gE_fd f\>_f.
984: \eary
985: \]
986: On the other hand, 
987: $d\sint ge^{-f} = d(Z_f\<g\>_f)=dZ_f\<g\>_f+Z_fd\<g\>_f$, so that
988: \lbeq{e.s1}
989: dZ_f\<g\>_f+Z_fd\<g\>_f = Z_f\<dg-gE_fd f\>_f 
990: = Z_f\<dg\>_f-Z_f\<g;df\>_f.
991: \eeq
992: In particular, for $g=1$ we find by \gzit{e.kubo1} that 
993: $dZ_f=-Z_f\<1;df\>_f=-Z_f\<df\>_f$. Now \gzit{e.diffW} follows from
994: $dW(f)=-d\log Z_f =-dZ_f/Z_f = \<df\>_f$, and solving \gzit{e.s1} for 
995: $d\<g\>_f$ gives \gzit{e.diff}.
996: 
997: (ii) Equation \gzit{e.starh} follows from
998: \[
999: e^{-W(h)} = \sint e^{-h} = \sint e^{-h} e^f e^{-f}
1000: = \sint e^{-f} e^{-h} e^f = (\sint e^{-f}) \<e^{-h} e^f\>_f
1001: = e^{-W(f)}  \<e^{-h} e^f\>_f
1002: \]
1003: by taking logarithms and setting $h=f+\tau g$. To prove the cumulant 
1004: expansion, we introduce the function $\phi$ defined by
1005: \[
\phi(\tau):=W(f+\tau g).
1007: \]
1008: From \gzit{e.diffW}, we find $\phi'(\tau) = \<g\>_{f+\tau g}$
1009: for $f,g$ independent of $\tau$, and by differentiating this again,
1010: \[
\phi''(\tau)=\D\frac{d}{d\tau}  \<g\>_{f+\tau g}
=\D-\Big\<g E_{f+\tau g}\frac{d (f+\tau g)}{d\tau}\Big\>_{f+\tau g}
+\<g\>_{f+\tau g}^2.
1014: \]
1015: In particular, 
1016: \lbeq{e.x5}
1017: \phi'(0) = \<g\>_f,~~~
1018: \phi''(0) = \<g\>_f^2-\<gE_f g\>_f= \<g\>_f^2-\<g;g\>_f.
1019: \eeq
1020: A Taylor expansion now implies \gzit{e.cumulant}. 
1021: 
(i) Since the Cauchy-Schwarz inequality 
1023: for the Kubo inner product implies 
1024: \[
1025: \<g\>_f^2=\<g;1\>_f^2\le \<g;g\>_f\<1;1\>_f= \<g;g\>_f, 
1026: \]
1027: \gzit{e.x5} implies that 
1028: \[
1029: \frac{d^2}{d\tau^2} W(f+\tau g)\Big|_{\tau=0}\le 0
1030: \]
1031: for all $f,g$. This implies that $W(f)$ is concave.
1032: Moreover, replacing $f$ by $f+sg$, we find that $\phi''(s)\le 0$ for
1033: all $s$. The remainder form of Taylor's theorem therefore gives
1034: \[
1035: \phi(\tau)=\phi(0)+\tau\phi'(0)+\int_0^\tau ds (\tau-s)\phi''(s)
1036: \le \phi(0)+\tau\phi'(0),
1037: \]
1038: and for $\tau=1$ we get
1039: \lbeq{e.x6}
1040: W(f+g)\le W(f)+\<g\>_f.
1041: \eeq
\gzit{e.GB} follows from \gzit{e.x6} upon replacing $g$ by $g-f$.
1043: 
1044: By the derivation, equality holds in \gzit{e.x6} only if $\phi''(s)=0$ 
1045: for all $0<s<1$. By \gzit{e.x5}, applied with $f+sg$ in place of $f$, 
1046: this
1047: implies $\<g\>_{f+sg}^2 = \<g;g\>_{f+sg}$. Thus we have equality in 
1048: the Cauchy-Schwarz argument, forcing $g$ to be a multiple of $1$.
1049: Therefore equality in the Gibbs-Bogoliubov inequality \gzit{e.GB} 
1050: is possible only if $g-f$ is a constant.
1051: \epf
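
The cumulant expansion can be checked on the two level system of 
Example \ref{ex.canonical}. The following is only a sketch, assuming 
the quantum case with $\sint=\tr$ and using the abbreviation 
$p:=e^{-\beta E}/(1+e^{-\beta E})$, which is not part of the text.
With $f=\beta H$ and $g=H$, \gzit{e.2level} gives 
$W(f+\tau g)=-\log(1+e^{-(\beta+\tau)E})$. Since $g$ commutes with $f$,
\gzit{e.kubo0} gives $\<g;g\>_f=\<H^2\>_f=E^2p$, while $\<g\>_f=Ep$.
Differentiating directly, we find
\[
\frac{d}{d\tau}W(f+\tau g)\Big|_{\tau=0}=Ep,~~~
\frac{d^2}{d\tau^2}W(f+\tau g)\Big|_{\tau=0}=-E^2p(1-p),
\]
in agreement with the values $\phi'(0)=\<g\>_f$ and 
$\phi''(0)=\<g\>_f^2-\<g;g\>_f=E^2p^2-E^2p$ from \gzit{e.x5}. 
The second derivative is nonpositive, as required by the concavity 
of $W$.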
1052: 
1053: As a consequence of the  Gibbs-Bogoliubov inequality, we derive an
1054: important inequality for the entropy.
1055: 
1056: \begin{thm} \label{t4.5} 
1057: Let $S_c$ be the entropy of a reference state. Then, for an arbitrary 
1058: Gibbs state $\<\cdot\>$ with entropy $S$,
1059: \lbeq{e4.5}
1060: \< S \> \le \< S_c\>,
1061: \eeq
1062: with equality only if $S_c =S$.
1063: \end{thm}
1064: 
1065: \bepf
1066: Let $f=S/\kbar$ and $g=S_c/\kbar$. Since $S$ and $S_c$ are 
1067: entropies, $W(f)=W(g)=0$, and the Gibbs-Bogoliubov inequality 
1068: \gzit{e.GB} gives $0\le \<g-f\>_f = \<S_c-S\>/\kbar$.
1069: This implies \gzit{e4.5}. If equality holds then equality holds in 
1070: \gzit{e.GB}, so that $S_c$ and $S$ differ only by a constant.
1071: But this constant vanishes since the values agree.
1072: \epf
1073: 
1074: The difference
1075: \lbeq{4-5}
1076: \< S_c-S\>  =\< S_c\> -\< S\>  \ge 0
1077: \eeq
1078: is known as \bfi{relative entropy}.
1079: In an information theoretical context (cf. Section \ref{s.complexity}),
1080: the relative entropy may be interpreted as the amount of information
1081: in a state $\< \cdot \>$ which cannot be explained by
1082: the reference state. This interpretation makes sense since 
1083: the relative entropy vanishes precisely for the reference state. 
1084: A large relative entropy therefore indicates that the state contains 
1085: some important information not present in the reference state.
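
In the commuting case, the relative entropy takes a familiar form.
The following is a sketch under assumptions not made in the text:
the quantum case with $\sint=\tr$, and entropies $S$ and $S_c$ that 
commute and have a joint discrete spectrum, with eigenvalues 
$-\kbar\log p_n$ and $-\kbar\log p_n^c$ on common eigenvectors 
(the positive weights $p_n$ and $p_n^c$ are notation introduced here).
As in the proof of Theorem \ref{t4.5}, the state is $\<\cdot\>_f$ with 
$f=S/\kbar$ and $W(f)=0$, so that $\sum_n p_n=1$ and a quantity with 
eigenvalues $g_n$ on these eigenvectors has the value $\sum_n p_ng_n$. 
Hence
\[
\<S_c-S\> = \kbar\sum_n p_n\log\frac{p_n}{p_n^c} \ge 0,
\]
which is $\kbar$ times the Kullback--Leibler divergence of classical 
information theory. By Theorem \ref{t4.5}, it vanishes only if 
$p_n=p_n^c$ for all $n$.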
1086: 
1087: 
1088: \bfi{Approximations.} 
1089: The cumulant expansion is the basis of a well-known
1090: approximation method in statistical mechanics. Starting from special
1091: reference states $\<\cdot\>_f$ with explicitly known $W(f)$ and $E_f$ 
1092: (corresponding to so-called explicitly solvable models), one obtains 
1093: inductively expressions for values in these states by 
1094: applying the differentiation rules. (In the most important cases,
1095: the resulting formulas for the values are commonly
1096: referred to as a \bfi{Wick theorem}, cf. \sca{Wick} \cite{Wic},
1097: although the formulas are much older and were derived in 1918 by
1098: \sca{Isserlis} \cite{Iss}.
1099: For details, see textbooks on statistical mechanics, 
1100: e.g., \sca{Huang} \cite{Hua}, \sca{Reichl} \cite{Rei}.)
1101: 
From these, one can calculate the coefficients in the cumulant 
1103: expansion; note that higher order terms can be found 
1104: by proceeding as in the proof, using further differentiation. 
1105: \at{Alternatively, one may proceed on the basis of BCH-formulas for 
1106: the Lie groups defining the exactly solvable model.}
1107: This gives approximate generating functions (and by 
1108: differentiation associated values) for Gibbs states 
1109: with an entropy close to the explicitly solvable reference state.
1110: From the resulting generating function and the differentiation 
1111: formulas \gzit{e.diff}--\gzit{e.diffW},
1112: one gets as before the values for the given state.
1113: 
1114: The best tractable reference state  $\<\cdot\>_f$ to be used for a 
1115: given Gibbs state $\<\cdot\>_g$ can be obtained by minimizing the 
1116: upper bound in the Gibbs-Bogoliubov inequality \gzit{e.GB} over 
1117: all $f$ for which an explicit generating function is known.
1118: Frequently, one simply approximates $W(g)$ by the minimum of this
1119: upper bound,
1120: \lbeq{e.meanfield}
1121: W(g) \approx W_m(g):=\inf_f \Big(W(f)+\<g-f\>_f\Big).
1122: \eeq
1123: Using $W_m(g)$ in place of $W(g)$ defines a so-called 
1124: \bfi{mean field theory}; cf. \sca{Callen} \cite{Cal}.
1125: For computations from first principles (quantum field theory), see, 
1126: e.g., the survey by \sca{Berges} et al. \cite{BerTW}.
1127: 
1128: 
1129: 
1130: 
1131: 
1132: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1133: \section{Limit resolution and uncertainty} \label{s.limit}
1134: 
1135: Definition \ref{d.state} generalizes the expectation axioms of 
1136: \sca{Whittle} \cite[Section 2.2]{Whi} for classical probability theory.
1137: Indeed, the values of our quantities are traditionally called 
1138: expectation values, and refer to the mean over an ensemble of (real or 
1139: imagined) identically prepared systems. 
1140: 
1141: In our treatment, we keep the notation with pointed brackets familiar 
1142: from statistical mechanics, but use the more neutral term {\em value} 
1143: for $\<f\>$ to avoid any reference to probability or statistics.
1144: This keeps the formal machinery completely independent of controversial
1145: issues about the interpretation of probabilities. Statistics and 
1146: measurements, where the probabilistic aspect enters directly, are 
1147: discussed separately in Chapter \ref{s.model}.
1148: 
1149: The key to an interpretation of the values of quantities as objective, 
1150: observer-independent properties is an analysis of the uncertainty 
1151: inherent in the description of a system by a state, based on the 
1152: following result.
1153: 
1154: \begin{prop} 
1155: For Hermitian $g$, 
1156: \lbeq{e.res0}
1157: \<g\>^2 \le \<g^2\>.
1158: \eeq
1159: Equality holds if $g=\<g\>$. 
1160: \end{prop}
1161: 
1162: \bepf
1163: Put $\ol g = \<g\>$. Then $0\le\<(g-\ol g)^2\>=\<g^2-2\ol g g+\ol g^2\>
1164: =\<g^2\>-2\ol g \<g\>+\ol g^2=\<g^2\>-\<g\>^2$.
1165: This gives \gzit{e.res0}. If $g=\ol g$ then equality holds in this
1166: argument.
1167: \epf
1168: 
1169: \begin{dfn}
1170: The number
1171: \[
1172: \cov(f,g):=\re \<(f-\overline{f})^*(g-\overline{g}) \>
1173: \]
1174: is called the \bfi{covariance} of $f,g\in\Ez$. Two quantities $f,g$ are 
1175: called \bfi{uncorrelated} if $\cov(f,g)=0$, and \bfi{correlated} 
1176: otherwise. The number 
1177: \[
1178: \sigma(f):=\sqrt{\cov(f,f)}
1179: \]
1180: is called the \bfi{uncertainty} of $f\in\Ez$ in the state $\<\cdot\>$. 
1181: The number
1182: \lbeq{e.res}
1183: \res(g):=\sqrt{\<g^2\>/\<g\>^2-1},
1184: \eeq
1185: is called the \bfi{limit resolution}
1186: of a Hermitian quantity $g$ with nonzero value $\<g\>$.
1187: \end{dfn}
1188: 
1189: Note that (E3) and \gzit{e.res0} ensure that $\sigma(f)$ and $\res(g)$ 
1190: are nonnegative real numbers that vanish if $f,g$ are constant, 
1191: i.e., complex numbers, and $g\ne 0$.
These definitions extend those of elementary classical 
statistics, where $\Ez$ is a commutative algebra of random variables, 
to the present, more general situation; in a statistical context, 
the uncertainty 
$\sigma(f)$ is referred to as \bfi{standard deviation}.
1197: 
1198: There is no need to associate an intrinsic statistical 
1199: meaning to the above concepts. We treat the uncertainty 
1200: $\sigma(f)$ and the limit resolution $\res(g)$ simply as an absolute 
1201: and relative uncertainty measure, respectively, specifying 
how accurately one can treat the quantity as a sharp number given by its 
value. 
1204: 
1205: In experimental practice, the limit resolution is a lower bound 
1206: on the relative accuracy with which one can expect $\<g\>$ to be 
1207: determinable reliably\footnote{
1208: The situation is analogous to the limit resolution with which one can
1209: determine the longitude and latitude of a city such as Vienna.
1210: Clearly these are well-defined only up to some limit resolution
1211: related to the diameter of the city. No amount of measurements can 
1212: reduce the uncertainty below about 10km. For an extended object,
1213: the uncertainty in its position is conceptual, 
1214: not just a lack of knowledge or precision. Indeed, a point may be 
1215: defined to be an object in a state where the position has zero limit 
1216: resolution.
1217: }\ % end footnote
1218: from measurements of a single system at a single time. 
1219: In particular, a quantity $g$ is considered to be 
1220: \bfi{significant} if $\res(g)\ll 1$, while it is \bfi{noise} if 
1221: $\res(g)\gg 1$. If $g$ is a quantity and $\widetilde g$ is a good 
1222: approximation of its value then $\Delta g:=g-\widetilde g$ is 
1223: noise. Sufficiently significant quantities can be treated as 
1224: \bfi{deterministic}; the analysis of noise is the subject of 
1225: \bfi{statistics}.
1226: 
1227: 
1228: 
1229: \begin{prop} \label{p5.2}
1230: 
1231: For any state,
1232: 
1233: (i) $f\leq g \implies \<f\> \leq \<g\>$.
1234: 
1235: (ii) For $f,g\in\Ez$,
1236: \[
1237: \cov(f,g)=\re(\<f^*g\>-\<f\>^*\<g\>),
1238: \]
1239: \[
1240: \<f^*f\>=\<f\>^*\<f\>+\sigma(f)^2,
1241: \]
1242: \[
1243: |\<f\>|\leq\sqrt{\<f^*f\>}.
1244: \]
1245: 
1246: (iii) If $f$ is Hermitian then $\bar f = \<f\>$ is real and
1247: \[
1248: \sigma(f)=\sqrt{\<(f-\overline{f})^2 \>}
1249: =\sqrt{\<f^2\>-\<f\>^2}.
1250: \]
1251: 
1252: (iv) Two commuting Hermitian quantities $f,g$ are uncorrelated iff
1253: \[
1254: \<fg\>=\<f\>\<g\>.
1255: \]
1256: 
1257: \end{prop}
1258: 
1259: \bepf
1260: (i) follows from (E1) and (E3).
1261: 
1262: (ii) The first formula holds since
1263: \[
1264: \<(f-\bar f)^*(g-\bar g)\>
1265: =\<f^*g\>-\bar f^*\<g\>-\<f\>^*\bar g +\bar f^*\bar g 
1266: = \<f^*g\>-\<f\>^*\<g\>.
1267: \]
1268: The second formula follows for $g=f$, using (E1), and the third 
1269: formula is an immediate consequence.
1270: 
1271: (iii) follows from (E1) and (ii).
1272: 
(iv) If $f,g$ are Hermitian and commute then $fg$ is Hermitian by 
1274: Proposition \ref{p5.1.2}(ii), hence $\<fg\>$ is real. By (ii),
1275: $\cov(f,g)=\<fg\>-\<f\>\<g\>$, and the assertion follows.
1276: \epf
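
Combining \gzit{e.res} with Proposition \ref{p5.2}(iii) shows that, 
for Hermitian $g$ with nonzero value,
\[
\res(g)=\sqrt{\frac{\<g^2\>-\<g\>^2}{\<g\>^2}}
=\frac{\sigma(g)}{|\<g\>|},
\]
so that the limit resolution is the uncertainty relative to the value.
As a hedged illustration, consider the canonical ensemble of the two 
level system from Example \ref{ex.canonical}; this is only a sketch, 
assuming the quantum case, $\beta>0$ and $E>0$, with the abbreviation 
$p:=e^{-\beta E}/(1+e^{-\beta E})$ introduced here. Then $\<H\>=Ep$ and 
$\<H^2\>=E^2p$, hence
\[
\res(H)=\sqrt{\frac{1}{p}-1}=e^{\beta E/2}.
\]
Thus $\res(H)>1$ for all such $\beta$, and for $\beta E\gg 1$ the 
Hamiltonian of this system is noise by the criterion $\res(g)\gg 1$.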
1277: 
1278: 
1279: Formally, the essential difference between classical mechanics 
and quantum mechanics is the latter's lack of commutativity.
1281: While in classical mechanics there is in principle no lower
1282: limit to the uncertainties with which we can prepare the quantities
1283: in a system of interest,
1284: the quantum mechanical uncertainty relation for noncommuting 
1285: quantities puts strict limits on the uncertainties in the preparation
1286: of microscopic states. Here, {\em preparation} is defined informally 
as bringing the system into a state such that measuring certain 
1288: quantities $f$ gives numbers that agree with the values $\<f\>$ to an 
1289: accuracy specified by given uncertainties.
1290: 
1291: We now discuss the limits of the accuracy to which this 
1292: can be done.
1293: 
1294: 
1295: \begin{prop} \label{p5.1}~\\
1296: (i) The \bfi{Cauchy--Schwarz inequality}  
1297: \[
1298: |\< f^*g \>|^2 \le \< f^*f \>\< g^*g \>
1299: \]
1300: holds for all $f,g\in\Ez$.
1301: 
1302: (ii) The \bfi{uncertainty relation}
1303: \[
1304: \sigma(f)^2\sigma(g)^2 
1305: \geq |\cov(f,g)|^2+\left|\shalf\<f^*g-g^*f\>\right|^2
1306: \]
1307: holds for all $f,g\in\Ez$.
1308: 
1309: (iii) For $f,g\in\Ez$, 
1310: \lbeq{ecov1}
1311: \cov(f,g)=\cov(g,f)=\shalf(\sigma(f+g)^2-\sigma(f)^2-\sigma(g)^2),
1312: \eeq
1313: \lbeq{ecov}
1314: |\cov(f,g)| \leq \sigma(f)\sigma(g), 
1315: \eeq
1316: \lbeq{esig}
1317: \sigma(f+g) \leq \sigma(f)+\sigma(g).
1318: \eeq
1319: In particular,
1320: \lbeq{e.prodbound}
1321: |\<fg\>-\<f\>\<g\>|\leq\sigma(f)\sigma(g) 
1322: ~~~\mbox{for commuting Hermitian } f,g. 
1323: \eeq
1324: 
1325: \end{prop}
1326: 
1327: \bepf
1328: (i) For arbitrary $\alpha ,\beta\in \Cz$ we have
1329: \[
1330: \begin{array}{ll}
1331: 0&\le \<(\alpha f-\beta g)^*(\alpha f-\beta g )\> \\
1332: &=\alpha ^* \alpha \< f^*f \>-\alpha ^* \beta \< f^*g \>
1333: -\beta ^*\alpha \< g^*f \>+\beta\beta^* \< g^*g \>\\
1334: &=|\alpha |^2\< f^*f \>-2\re(\alpha ^* \beta \< f^*g \>)
1335: +|\beta|^2\< g^*g \>
1336: \end{array}
1337: \]
We now choose $\beta=\< g^*f \>=\< f^*g \>^*$, and obtain for arbitrary
real $\alpha $ the inequality
1340: \lbeq{f.8}
1341: 0\le \alpha ^2\< f^*f \>
1342: -2\alpha |\< f^*g \>|^2+|\< f^*g \>|^2\< g^*g \>.
1343: \eeq
1344: The further choice $\alpha=\< g^*g \>$ gives
1345: \[
1346: 0\le \< g^*g \>^2\< f^*f \>-\< g^*g \>|\< f^*g \>|^2.
1347: \]
1348: If $\< g^*g \>>0$, we find after division by $\< g^*g \>$ that (i) 
1349: holds. And if $\< g^*g \>\le 0$ then $\< g^*g \>=0$ and we have 
$\< f^*g \>=0$ since otherwise a tiny positive $\alpha $ produces a negative
1351: right hand side in \gzit{f.8}. Thus (i) also holds in this case.
1352: 
1353: (ii) Since $(f-\bar f)^*(g-\bar g)-(g-\bar g)^*(f-\bar f)=f^*g-g^*f$,
1354: it is sufficient to prove the uncertainty relation for the case of
1355: quantities $f,g$ whose value vanishes. In this case, (i) implies
1356: \[
1357: (\re \<f^*g\>)^2 +(\im \<f^*g\>)^2 =|\<f^*g\>|^2 \leq 
1358: \< f^*f \>\< g^*g \> = \sigma(f)^2\sigma(g)^2.
1359: \]
1360: The assertion follows since $\re \<f^*g\>=\cov(f,g)$ and
1361: \[
1362: i\im \<f^*g\>=\shalf(\<f^*g\>-\<f^*g\>^*)=\shalf\<f^*g-g^*f\>.
1363: \]
1364: 
1365: (iii) Again, it is sufficient to consider the case of
1366: quantities $f,g$ whose value vanishes. Then
1367: \lbeq{esig1}
1368: \begin{array}{lll}
1369: \sigma(f+g)^2 &=& \<(f+g)^*(f+g)\>
1370: =\<f^*f\>+\<f^*g+g^*f\>+\<g^*g\>\\
1371: &=& \sigma(f)^2+2\cov(f,g)+\sigma(g)^2,
1372: \end{array}
1373: \eeq
1374: and \gzit{ecov1} follows. \gzit{ecov} is an immediate consequence of
1375: (ii), and \gzit{esig} follows easily from \gzit{esig1} and 
1376: \gzit{ecov}. Finally, \gzit{e.prodbound} is a consequence of 
1377: \gzit{ecov} and Proposition \ref{p5.2}(iii).
1378: \epf
1379: 
1380: If we apply Proposition \ref{p5.1}(ii) to scalar position $q$ and 
1381: momentum $p$ variables satisfying the 
1382: \bfi{canonical commutation relation}
1383: \lbeq{ccr}
1384: [q,p]=i\hbar,
1385: \eeq
1386: we obtain 
1387: \lbeq{e6.unc0}
1388: \sigma(q)\sigma(p)\geq \shalf\hbar,
1389: \eeq
1390: the \bfi{uncertainty relation} of \sca{Heisenberg} \cite{Hei,Rob}.
It implies that no state exists where both position $q$ and momentum 
1392: $p$ have arbitrarily small uncertainty. 
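
In detail, the step from Proposition \ref{p5.1}(ii) to 
\gzit{e6.unc0} can be sketched as follows, assuming that $q$ and $p$ 
are Hermitian, that the bracket in \gzit{ccr} denotes the commutator 
$[q,p]=qp-pq$, and that $\<1\>=1$. Then 
$q^*p-p^*q=qp-pq=i\hbar$, so that
\[
\sigma(q)^2\sigma(p)^2
\geq |\cov(q,p)|^2+\left|\shalf\<[q,p]\>\right|^2
\geq \left|\shalf i\hbar\right|^2=\frac{\hbar^2}{4},
\]
and taking square roots gives \gzit{e6.unc0}. Equality is possible 
only if $\cov(q,p)=0$.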
1393: 
1394: 
1395: 
1396: 
1397: