1:
2: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3: \chapter{Quantities, states, and statistics}\label{c.quants}
4:
5: When considered in sufficient
6: detail, no physical system is truly in global equilibrium; one can
7: always find smaller or larger deviations. To describe these deviations,
8: extra variables are needed, resulting in a more complete but also more
9: complex model. At even higher resolution, this model is again
10: imperfect and an approximation to an even more complex, better model.
11: This refinement process may be repeated in several stages.
12: At the most detailed stages, we transcend the frontier of
13: current knowledge in physics, but even as this frontier recedes,
14: deeper and deeper stages with unknown details are imaginable.
15:
16: Therefore, it is desirable to have a meta-description of thermodynamics
that, starting with a detailed model, allows one to deduce the properties
18: of each coarser model, in a way that all description levels are
19: consistent with the current state of the art in physics.
20: Moreover, the results should be as independent as possible of unknown
21: details at the lower levels.
22: This meta-description is the subject of \bfi{statistical mechanics}.
23:
24: This chapter introduces the technical machinery of statistical
25: mechanics, Gibbs states and the partition function, in a uniform
26: way common to classical mechanics and quantum mechanics.
27: As in the phenomenological case, the intensive variables determine
28: the state (which now is a more abstract object), whereas the extensive
29: variables now appear as values of other abstract objects called
30: quantities. This change of setting allows the natural incorporation
31: of quantum mechanics, where quantities need not commute, while
32: values are numbers observable in principle, hence must
33: satisfy the commutative law.
34:
35: The operational meaning of the abstract concepts of quantities,
36: states and values introduced in the following becomes apparent once we
37: have recovered the phenomenological results of Chapter \ref{c.ctherm}
from the abstract theory developed in this and the next chapter.
39: Chapter \ref{c.models} discusses in more detail how the theory relates
40: to experiment.
41:
42: \at{adapt Section 1.5 to match the contents}
43:
44:
45: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
46: \section{Quantities}\label{s.quantities}
47:
48: Any fundamental description of physical systems must give account of
49: the numerical values of quantities observable in experiments when the
50: system under consideration is in a specified state. Moreover, the form
51: and meaning of states, and of what is observable in principle, must be
52: clearly defined. We consider an axiomatic conceptual
53: foundation on the basis of quantities\footnote{
54: We deliberately avoid the notion of observables, since it is not clear
55: on a fundamental level what it means to `observe' something, and since
many things (such as the fine structure constant, neutrino masses,
57: decay rates, scattering cross sections) which can be observed in nature
58: are only indirectly related to what is traditionally called an
59: `observable' in quantum mechanics. The related problem of how to
60: interpret measurements is discussed in Section \ref{s.measurement}.
61: } % end footnote
62: and their values, consistent with the conventions adopted by the
International System of Units (SI) \cite{SI}, which declares:
``{\em A quantity in the general sense
65: is a property ascribed to phenomena, bodies, or substances that can
66: be quantified for, or assigned to, a particular phenomenon,
67: body, or substance. [...]
68: The value of a physical quantity is the quantitative expression
69: of a particular physical quantity as the product of a number and a
70: unit, the number being its numerical value.}''
71:
72: In different states, the quantities of a given system may have
73: different values; the state (equivalently, the values determined by it)
74: characterizes an individual system at a particular time.
75: Theory must therefore define what to consider as quantities,
76: what as states, and how a state assigns values to a quantity.
77: Since quantities can be added, multiplied, compared, and integrated,
78: the set of all quantities has an elaborate structure whose properties
79: we formulate after the discussion of the following motivating example.
80:
81: \begin{example}\label{ex.Nlevel}
82: As a simple example satisfying the axioms to be
83: introduced, the reader may think of an $N$-level quantum system.
84: The \bfi{quantities} are the elements of the algebra
85: $\Ez=\Cz^{N\times N}$ of square complex $N\times N$ matrices, the
86: \bfi{constants} are the multiples of the identity matrix, the
87: \bfi{conjugate} $f^*$ of $f$ is given by conjugate transposition, and
88: the \bfi{integral} $\sint g = \tr g$ is the \bfi{trace}, the sum of the
89: diagonal entries or, equivalently, the sum of the eigenvalues.
90: The standard basis consisting of the $N$ \bfi{unit vectors} \idx{$e^k$}
with a one in component $k$ and zeros in all other components
corresponds to the $N$ levels of the quantum system. The Hamiltonian
93: $H$ is represented by a diagonal matrix $H=\Diag(E_1,\dots,E_N)$
94: whose diagonal entries $E_k$ are the \bfi{energy levels} of the system.
95: In the nondegenerate case, all $E_k$ are distinct, and the diagonal
96: matrices comprise all functions of $H$. Quantities representing
97: arbitrary nondiagonal matrices are less easy to interpret. However,
98: an important class of quantities are the matrices of the form
99: $P=\psi\psi^*$, where $\psi$ is a vector of norm 1; they satisfy
100: $P^2=P=P^*$ and are the quantities observed in binary measurements
101: such as detector clicks; see Section \ref{s.qprob}.
102: The \bfi{states} of the $N$-level system are represented by a
103: \bfi{density matrix} $\rho\in\Ez$, a positive semidefinite Hermitian
104: matrix with trace one. The \bfi{value} of a quantity $f\in\Ez$
105: is the number $\<f\>=\tr \rho f$. The diagonal entries $p_k:=\rho_{kk}$
106: represent the probability for obtaining a response in a binary test for
107: the $k$th quantum level; the off-diagonal entries $\rho_{jk}$
108: represent deviations from a classical mixture of quantum levels.
109: \end{example}
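
As a minimal numerical sketch of Example \ref{ex.Nlevel} for $N=2$, the
following Python fragment builds a density matrix $\rho$ and evaluates
the values $\<f\>=\tr\rho f$ of two quantities. The helper names
\verb|matmul|, \verb|trace|, \verb|value| and the particular entries of
\verb|rho| and \verb|H| are illustrative assumptions and do not appear
elsewhere in the text.

```python
# Illustrative 2-level system; all names and numbers are assumptions.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Integral of the N-level system: the sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def value(rho, f):
    """Value <f> = tr(rho f) of the quantity f in the state rho."""
    return trace(matmul(rho, f))

# Density matrix: Hermitian, positive semidefinite, trace one.
rho = [[0.75, 0.25j],
       [-0.25j, 0.25]]

# Hamiltonian H = Diag(E_1, E_2) with E_1 = 0, E_2 = 2,
# and the projector P_1 = e^1 (e^1)^* onto the first level.
H = [[0.0, 0.0],
     [0.0, 2.0]]
P1 = [[1.0, 0.0],
      [0.0, 0.0]]

mean_energy = value(rho, H)   # equals p_1 E_1 + p_2 E_2
p1 = value(rho, P1)           # equals the diagonal entry rho_11
```

Since $H$ and $P_1$ are diagonal, only the diagonal entries $p_k$ of
$\rho$ contribute to these two values; the off-diagonal entries matter
only for nondiagonal quantities.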
110:
111: \begin{dfn} ~\\
112: (i) A \bfi{$*$-algebra} is a set $\Ez$ together
113: with operations on $\Ez$ defining for any two quantities $f,g\in\Ez$
114: the \bfi{sum} $f+g\in\Ez$, the
115: \bfi{product} $fg\in\Ez$, and the \bfi{conjugate} $f^*\in\Ez$,
116: such that the following axioms (Q1)--(Q4) hold for all $\alpha\in\Cz$
117: and all $f,g,h\in\Ez$:
118:
119: (Q1)
120: ~$\Cz \subseteq \Ez$, i.e., complex numbers are special elements
121: called \bfi{constants}, for which addition, multiplication and
122: conjugation have their traditional meaning.
123:
124: (Q2)
125: ~{$(fg)h=f(gh)$,~~ $\alpha f=f\alpha $,~~ $0f=0$,~~ $1f=f$.}
126:
127: (Q3)
128: ~{$(f+g)+h=f+(g+h)$,~~ $f(g+h)=fg+fh$,~~ $f+0=f$.}
129:
130: (Q4)
131: ~{$f^{**}=f$,~~ $(fg)^* =g^* f^* $,~~ $(f+g)^* =f^* +g^*$.}
132:
133: (ii) A $*$-algebra $\Ez$ is called \bfi{commutative} if $fg=gf$ for
134: all $f,g\in\Ez$, and \bfi{noncommutative} otherwise.
135: The $*$-algebra $\Ez$ is called \bfi{nondegenerate} if
136:
137: (Q5)
138: ~{$f^* f =0 \implies f =0$.}
139:
140: (iii) We introduce the notation
141: \[
142: -f:=(-1)f,~~ f-g:=f+(-g), ~~~[f,g]:=fg-gf,
143: \]
144: \[
145: f^0:=1,~~ f^l:=f^{l-1}f~~~ (l=1,2,\dots ),
146: \]
147: \[
148: \re f := \half(f+f^*),~~~\im f := \frac{1}{2i}(f-f^*),
149: \]
150: for $f,g\in\Ez$.
151: $[f,g]$ is called the \bfi{commutator} of $f$ and $g$, and $\re f$,
152: $\im f$ are referred to as the \bfi{real part} (or \bfi{Hermitian part})
153: and \bfi{imaginary part} of $f$, respectively.
$f\in\Ez$ is called \bfi{Hermitian} if $f^*=f$.
155:
156: (iv) A \idx{$*$-homomorphism} is a mapping $\phi$ from a $*$-algebra
157: $\Ez$
158: with unity to another (or the same) $*$-algebra $\Ez'$ with unity
159: such that
160: \[
161: \phi(f+g)=\phi(f)+\phi(g),~~~\phi(fg)=\phi(f)\phi(g),~~~
162: \phi(\alpha f)=\alpha\phi(f),
163: \]
164: \[
165: \phi(f^*)=\phi(f)^*,~~~\phi(1)=1.
166: \]
167: for all $f,g$ in $\Ez$ and $\alpha\in\Cz$.
168: \end{dfn}
169:
170: Note that we assume commutativity only for the product of numbers and
171: elements of $\Ez$. In general, the product of two elements of $\Ez$
172: is indeed noncommutative.
173: However, general commutativity of the addition is a consequence of our
174: other assumptions. We prove this together with some other useful
175: relations.
176:
177: \begin{prop}\label{p5.1.2}~\\
178: (i) For all $f$, $g$, $h\in \Ez$,
179: \lbeq{e.p1}
180: (f+g)h=fh+gh,~~f-f=0,~~ f+g=g+f
181: \eeq
182: \lbeq{e.p2}
183: [f,f^*]=-2i[\re f,\im f].
184: \eeq
185: (ii) For all $f\in\Ez$, $\re f$ and $\im f$ are Hermitian. $f$ is
186: Hermitian iff $f=\re f$ iff $\im f=0$. If $f,g$ are commuting
187: Hermitian quantities then $fg$ is Hermitian, too.
188: \end{prop}
189:
190: \bepf
191: (i) The right distributive law follows from
192: \[
193: \begin{array}{lll}
194: (f+g)h&=&((f+g)h)^{* *}=(h^* (f+g)^* )^* =(h^* (f^* +g^* ))^* \\
195: &=&(h^* f^* +h^* g^* )^* =(h^* f^* )^* +(h^* g^* )^* \\
196: &=&f^{* * }h^{* * }+g^{* * }h^{* * }=fh+gh.
197: \end{array}
198: \]
199: It implies $f-f=1f-1f=(1-1)f=0f=0$. From this, we may deduce that
200: addition is commutative, as follows. The quantity $h:=-f+g$
201: satisfies
202: \[
203: -h=(-1)((-1)f+g)=(-1)(-1)f+(-1)g=f-g,
204: \]
205: and we have
206: \[
207: f+g=f+(h-h)+g=(f+h)+(-h+g)=(f-f+g)+(f-g+g)=g+f.
208: \]
209: This proves \gzit{e.p1}. If $u=\re f$, $v=\im f$ then $u^*=u,v^*=v$
210: and $f=u+iv, f^*=u-iv$. Hence
211: \[
212: [f,f^*]=(u+iv)(u-iv)-(u-iv)(u+iv)=2i(vu-uv)=-2i[\re f,\im f],
213: \]
214: giving \gzit{e.p2}.
215:
216: (ii) The first two assertions are trivial, and the third holds since
217: $(fg)^*=g^*f^*=gf=fg$ if $f,g$ are Hermitian and commute.
218: \epf
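
The identity \gzit{e.p2} and the Hermiticity of $\re f$ and $\im f$ can
be checked numerically in the matrix algebra $\Cz^{2\times 2}$. The
following sketch does this for one arbitrarily chosen matrix; the
helper names \verb|matmul|, \verb|lincomb|, \verb|conj|, \verb|comm|
and the entries of \verb|f| are assumptions made for illustration only.

```python
# Numerical check of [f, f^*] = -2i [re f, im f] for one 2x2 matrix.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(alpha, a, beta, b):
    """Entrywise linear combination alpha*a + beta*b."""
    n = len(a)
    return [[alpha * a[i][j] + beta * b[i][j] for j in range(n)]
            for i in range(n)]

def conj(a):
    """Conjugate f^* of a matrix: conjugate transposition."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return lincomb(1, matmul(a, b), -1, matmul(b, a))

# An arbitrary non-Hermitian test matrix (assumption).
f = [[1 + 2j, 3 - 1j],
     [0.5j, -2 + 1j]]

re_f = lincomb(0.5, f, 0.5, conj(f))          # (f + f^*) / 2
im_f = lincomb(1 / 2j, f, -1 / 2j, conj(f))   # (f - f^*) / (2i)

lhs = comm(f, conj(f))
rhs = [[-2j * x for x in row] for row in comm(re_f, im_f)]

# Largest entrywise deviation between both sides of the identity.
max_dev = max(abs(lhs[i][j] - rhs[i][j])
              for i in range(2) for j in range(2))
```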
219:
220:
221: \begin{dfn}~\\
222: (i) The $*$-algebra $\Ez$ is called \bfi{partially ordered} if there is
223: a partial order $\geq$ satisfying the following axioms (Q6)--(Q9)
224: for all $f,g,h\in\Ez$:
225:
226: (Q6)
227: ~$\geq$ is reflexive ($f\geq f$),
228: antisymmetric ($f\geq g \geq f \Rightarrow f=g$),
and transitive ($f\geq g \geq h \Rightarrow f \geq h$).
230:
231: (Q7)
232: ~{$f\geq g \implies f+h\geq g+h$.}
233:
234: (Q8)
235: ~{$f\geq 0 \implies f=f^*$ and $g^*fg\geq 0$.}
236:
237: (Q9)
238: ~ $1 \geq 0$.
239:
240: We introduce the notation
241: \[
242: f \leq g :\Leftrightarrow g\geq f,
243: \]
244: \[
245: \|f\|:=\inf\{\alpha\in\Rz \mid f^*f \leq \alpha^2, \alpha\geq0 \},
246: \]
247: where the infimum of the empty set is taken to be $\infty$. The number
248: $\|f\|$ is referred to as the \bfi{(spectral) norm} of $f$.
249: An element $f\in\Ez$ is called \bfi{bounded} if $\|f\|<\infty$.
250: The \bfi{uniform topology} is the topology induced on
$\Ez$ by declaring as open sets arbitrary unions of finite intersections
252: of the \bfi{open balls} $\{f\in\Ez \mid \|f-f_0\|<\eps\}$ for some
253: $\eps>0$ and some $f_0 \in\Ez$.
254: \end{dfn}
255:
256: \begin{prop}\label{p1.3}~\\
257: (i) For all quantities $f$, $g$, $h\in \Ez$ and $\lambda \in\Cz$,
258: \lbeq{e.p3}
259: f^*f\geq 0,~~ ff^*\geq 0.
260: \eeq
261: \lbeq{e.p4}
262: f^*f\leq 0 \implies \|f\|=0 \implies f=0,
263: \eeq
264: \lbeq{e.p5}
265: f\leq g \implies h^*fh\leq h^*gh,~|\lambda|f\leq|\lambda|g,
266: \eeq
267: \lbeq{e.p6}
268: f^*g+g^*f\leq 2\|f\|~\|g\|,
269: \eeq
270: \lbeq{e.p7}
\|\lambda f\|=|\lambda| \|f\|,~~~ \|f\pm g\|\leq \|f\|+\|g\|,
272: \eeq
273: \lbeq{e.p8}
274: \|f g\|\leq \|f\|~ \|g\|.
275: \eeq
276: (ii) Among the complex numbers, precisely the nonnegative real numbers
277: $\lambda$ satisfy $\lambda\geq 0$.
278:
279: \end{prop}
280:
281: \bepf
282: (i) \gzit{e.p3}--\gzit{e.p5} follow directly from
283: (Q7) -- (Q9). Now let $\alpha=\|f\|$, $\beta=\|g\|$. Then
284: $f^*f\leq \alpha^2$ and $g^*g\leq \beta^2$. Since
285: \[
286: \begin{array}{lll}
287: 0\leq (\beta f - \alpha g)^*(\beta f - \alpha g)&=&
288: \beta^2f^*f-\alpha\beta(f^*g+g^*f)+\alpha^2 g^*g\\
&\leq& \beta^2\alpha^2 -\alpha\beta(f^*g+g^*f) +\alpha^2\beta^2,
290: \end{array}
291: \]
292: $f^*g+g^*f\leq 2\alpha\beta$ if $\alpha\beta\neq 0$, and for
293: $\alpha\beta=0$, the same follows from \gzit{e.p4}. Therefore
294: \gzit{e.p6} holds. The first half of \gzit{e.p7} is trivial, and
295: the second half follows for the plus sign from
296: \[
297: (f+g)^*(f+g)=f^*f+f^*g+g^*f+g^*g
298: \leq \alpha^2+ 2\alpha\beta+\beta^2=(\alpha+\beta)^2,
299: \]
300: and then for the minus sign from the first half.
301: Finally, by \gzit{e.p5},
302: \[
303: (fg)^*(fg)=g^*f^*fg\leq g^*\alpha^2g=\alpha^2g^*g\leq\alpha^2\beta^2.
304: \]
305: This implies \gzit{e.p8}.
306:
307: (ii) If $\lambda$ is a nonnegative real number then $\lambda=f^*f\geq0$
308: with $f=\sqrt{\lambda}$. If $\lambda$ is a negative real number then
309: $\lambda=-f^*f\leq0$ with $f=\sqrt{-\lambda}$, and by antisymmetry,
310: $\lambda\geq0$ is impossible. If $\lambda$ is a nonreal number then
311: $\lambda\neq\lambda^*$ and $\lambda\geq0$ is impossible by (Q8).
312: \epf
313:
314: \begin{dfn}
315: A \bfi{Euclidean $*$-algebra} is a nondegenerate, partially ordered
316: $*$-algebra $\Ez$, whose elements are called
317: \bfi{quantities}, together with a complex-valued
318: \bfi{integral} $\sint$ defined on a subspace $\Sz$ of $\Ez$,
319: whose elements are called \bfi{strongly integrable}, satisfying
320: the following axioms (EA1)--(EA6):
321:
322: (EA1) ~
323: $g$ bounded, $h$ strongly integrable $~~\Rightarrow~~ h^*,gh,hg$
324: strongly integrable,
325:
326: (EA2) ~
327: $ \sint h^* h > 0$ ~if $h \not= 0$,
328:
329: (EA3) ~
330: $(\sint h) ^* = \sint h^*, ~~~\sint gh = \sint hg$,
331:
332: (EA4) ~
333: $\sint h^* g h= 0$ for all strongly integrable $h~~\Rightarrow~~ g=0$~~~
334: \bfi{(nondegeneracy)},
335:
336: (EA5) ~
337: $ \sint h_l^* h_l \to 0 ~~\Rightarrow~~ \sint g h_l \to 0$,~
338: $\sint h_l^* g h_l \to 0$,
339:
340: (EA6) ~
341: $h_l\downto 0~~\Rightarrow~~ \inf\sint h_l=0$~~~
342: \bfi{(Dini property)}.
343:
344: Here, integrals extend over the longest following product or quotient
345: (while later, differential operators act on the shortest syntactically
346: meaningful term), the \bfi{monotonic limit} is defined by
347: $g_l \downarrow 0$ iff, for every strongly integrable $h$, the sequence
348: (or net) $\sint h^*g_lh$ consists of real numbers converging
349: monotonically decreasing to zero.
350: \end{dfn}
351:
352: Note that the integral can often be naturally extended from strongly
353: integrable quantities to a significantly larger space of integrable
354: quantities.
355:
356: \begin{prop}
357: \lbeq{e.ean4}
358: g\in\Ez,~~\sint gf = 0 \Forall f \in \Ez \implies g=0.
359: \eeq
For strongly integrable $g,h$,
361: \lbeq{e.intcs}
362: \sint (gh)^*(gh)\le \sint g^*g~\sint h^*h.~~~
363: \mbox{\bf (\bfi{Cauchy-Schwarz inequality})}
364: \eeq
365: In particular, every strongly integrable quantity is bounded.
366: \end{prop}
367:
368: \bepf
369:
370: If $\sint gf = 0$ for all $f \in \Ez$ then this holds in particular
for $f=hh^*$ with $h$ strongly integrable. Thus $0=\sint ghh^*=\sint h^*gh$ by (EA3), and
372: (EA4) gives the desired conclusion \gzit{e.ean4}.
373: \gzit{e.intcs} holds since by (EA2), $\sint g^*h$ defines a positive
374: definite inner product on $\Sz$, and directly implies the final
375: statement.
376: \epf
377:
378: We now describe the basic Euclidean $*$-algebras relevant in
379: nonrelativistic physics. However, the remainder is completely
independent of the details of how the axioms are realized; a specific
381: realization is needed only when doing specific quantitative
382: calculations.
383:
384: \begin{expls}\label{e3.1}~\\
385: (i) \bfi{($N$-level quantum systems)}
386: The simplest family of Euclidean
387: $*$-algebras is the algebra $\Ez=\Cz^{N\times N}$ of
388: square complex $N\times N$ matrices; cf. Example \ref{ex.Nlevel}.
Here the quantities are square matrices, the constants are the multiples
390: of the identity matrix, the conjugate is conjugate transposition, and
391: the integral is the trace, the sum of the diagonal entries or,
392: equivalently, the sum of the eigenvalues. In particular, all quantities
393: are strongly integrable.
394:
395:
396: (ii) \bfi{(Nonrelativistic classical mechanics)}
397: An atomic $N$-particle system is described in classical mechanics by
398: the phase space $\Rz^{6N}$ with six coordinates -- position
399: $x^a\in\Rz^3$ and momentum $p^a\in\Rz^3$ -- for each particle.
400: The algebra
401: \[
402: \Ez_N:= C^\infty(\Rz^{6N})
403: \]
404: of smooth complex-valued functions
405: $g(x^{1:N},p^{1:N})$ of positions and momenta is a commutative
406: Euclidean $*$-algebra with complex conjugation as conjugate
407: and the \bfi{Liouville integral}
408: \[
\sint g=C^{-1} \int dp^{1:N}dx^{1:N} g(x^{1:N},p^{1:N}),
410: \]
411: where $C$ is a positive constant.
Strongly integrable quantities are the Schwartz functions in $\Ez_N$.
413: The axioms are easily verified.
414:
415: (iii) \bfi{(Classical fluids)}
416: A fluid is classically described by an atomic system with an
417: indefinite number of particles. The appropriate Euclidean $*$-algebra
418: for a single species of monatomic particles is the
419: direct sum $\Ez=\D\oplus_{N\ge 0} \Ez_N$ whose quantities are
420: infinite sequences $g=(g_0,g_1,...)$ of $g_N\in\Ez_N$, with
$\Ez_N$ as in (ii), and weighted Liouville integral
422: \[
423: \sint g=\sum_{N\ge 0}
424: C_N^{-1}\int dp^{1:N}dx^{1:N} g_N(x^{1:N},p^{1:N}).
425: \]
426: Here $C_N$ is a symmetry factor for the symmetry group of the
$N$-particle system, which equals $h^{3N}N!$ for indistinguishable
428: particles; $h= 2\pi \hbar$ is Planck's constant.
429: This accounts for the Maxwell statistics and gives the correct entropy
430: of mixing. Classical fluids with monatomic particles of several
431: different kinds require a tensor product of several such algebras, and
432: classical fluids composed of molecules require additional degrees
433: of freedom to account for the rotation and vibration of the molecules.
434:
435:
436: (iv) \bfi{(Nonrelativistic quantum mechanics)}
437: Let $\Hz$ be a Euclidean space, a dense subspace of a Hilbert space.
438: Then the algebra $\Ez:= \Lin \Hz$ of continuous linear operators
439: on $\Hz$ is a Euclidean $*$-algebra with the adjoint as conjugate and
440: the \bfi{quantum integral}
441: \[
442: \sint g= \tr g,
443: \]
444: given by the trace of the quantity in the integrand.
445: Strongly integrable quantities are the operators $g\in\Ez$ which
446: are trace class; this includes all linear operators of finite rank.
447: Again, the axioms are easily verified. In the quantum context,
448: Hermitian quantities $f$ are often referred to as \bfi{observables};
449: but we do not use this term here.
450:
451: \end{expls}
452:
453: We end this section by stating some results needed later.
454: The exposition in this and the next chapter is fully rigorous if the
455: statements of Proposition \ref{app2a.} and Proposition \ref{app1.}
456: are assumed in addition to (EA1)--(EA6).
457: We prove these propositions only in case that $\Ez$
458: is finite-dimensional\footnote{
We would appreciate being informed about possible proofs in general that
460: only use the properties of Euclidean $*$-algebras (and perhaps further,
461: elementary assumptions).
462: }. % end footnote
463: But they can also be proved if the quantities
464: involved are smooth functions, or if they have a spectral
465: resolution; cf., e.g., \sca{Thirring} \cite{Thi} (who works in the
466: framework of $C^*$-algebras and von Neumann algebras).
467:
468: \bigskip
469: \begin{prop} \label{app2a.}
470: For arbitrary quantities $f$, $g$,
471: \[
472: e^{\alpha f}e^{\beta f}=e^{(\alpha+\beta)f}~~(\alpha,\beta\in\Rz),
473: \]
474: \[
475: (e^f)^*=e^{f^*},
476: \]
477: \[
478: e^f g = g e^f ~~~\mbox{if $f$ and $g$ commute},
479: \]
480: \[
481: f^*=f \implies \log e^f=f,
482: \]
483: \[
f\ge 0 \implies \sqrt{f}\ge 0,~~(\sqrt{f})^2 =f.
485: \]
486: For any quantity $f=f(s)$ depending continuously on $s\in[a,b]$,
487: \[
488: \int_a^b ds \sint f(s) = \sint \Big(\int_a^b ds f(s)\Big),
489: \]
490: and for any quantity $f=f(\lambda)$ depending continuously
491: differentiably on a parameter vector $\lambda$,
492: \[
493: \frac{d}{d\lambda} \sint f = \sint df/d\lambda.
494: \]
495: \end{prop}
496:
497: \bepf
In finite dimensions, the first five assertions are standard
499: matrix calculus, and the remaining two statements hold since $\sint f$
500: must be a finite linear combination of the components of $f$.
501: \epf
502:
503: \begin{prop} \label{app1.}
504: Let $f,g$ be quantities depending continuously differentiably on a
505: parameter or parameter vector $\lambda $, and suppose that
506: \[
507: [f(\lambda ),g(\lambda )]=0\mbox { for all }\lambda.
508: \]
Then, for any continuously differentiable function $F$ of two
510: variables,
511: \lbeq{app1}
512: \frac {d} {d\lambda }\sint F(f,g)
513: =\sint\partial _1F(f,g)\frac {df} {d\lambda }
514: +\sint\partial _2F(f,g)\frac {dg } {d\lambda }\ .
515: \eeq
Here $\partial _1F$ and $\partial _2F$ denote differentiation with respect to the
first and second argument of $F$, respectively.
518: \end{prop}
519:
520: \bepf
521: We prove the special case $F(x,y)=x^my^n$, where (\ref{app1}) reduces
522: to
523: \lbeq{app2}
524: \frac {d} {d\lambda }\sint f^mg^n
525: =\sint mf^{m-1}g^n\frac {df} {d\lambda }
526: +\sint nf^mg^{n-1}\frac {dg} {d\lambda}.
527: \eeq
528: The general case then follows for polynomials $F(x,y)$ by taking
529: suitable linear combinations, and for arbitrary $F$ by a limiting
530: procedure. To prove (\ref{app2}), we note that, more generally,
531: \[
532: \begin{array}{lll}
533: \D\frac {d} {d\lambda }\sint f_1\dots f_{m+n}
534: &=\sint\frac {d} {d\lambda }(f_1\dots f_{m+n})\\
535: &\D=\sint\sum _{j=1} ^{m+n}f_1\dots f_{j-1}\frac {df_j} {d\lambda }
536: f_{j+1}\dots f_{m+n} \\
537: &\D=\sum _{j=1} ^{m+n}\sint f_1\dots f_{j-1}\frac {df_j} {d\lambda }
538: f_{j+1}\dots f_{m+n} \\
539: &\D=\sum _{j=1} ^{m+n}\sint f_{j+1}\dots f_{m+n}f_1\dots f_{j-1}
540: \frac {df_j} {d\lambda }\ ,
541: \end{array}
542: \]
543: using the cyclic commutativity (EA3) of the integral.
544: If we specialize to $f_j=f$ if $j\le m$, $f_j=g$ if $j>m$, and note
545: that $f$ and $g$ commute, we arrive at (\ref{app2}).
546: \epf
547:
548: Of course, the proposition generalizes to families of more than two
549: commuting quantities; but more important is the special case $g=f$:
550:
551: \begin{cor} \label{app2.}
552: For any quantity $f$ depending continuously differentiably on a
553: parameter vector $\lambda $, and any continuously differentiable
554: function $F$ of a single variable,
555: \lbeq{app3}
556: \frac {d} {d\lambda }\sint F(f)=\sint F'(f)\frac {df} {d\lambda }.
557: \eeq
558: \end{cor}
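
The differentiation rule \gzit{app3} can be checked numerically in the
matrix case, where $\sint$ is the trace. The sketch below takes
$F(x)=x^3$ and $f(\lambda)=A+\lambda B$ with two noncommuting symmetric
$2\times 2$ matrices, and compares a central finite difference of
$\sint F(f)$ with $\sint F'(f)\,df/d\lambda=3\tr f^2B$. The matrices
\verb|A|, \verb|B|, the step size, and all function names are
assumptions chosen for illustration.

```python
# Finite-difference check of d/dlambda tr F(f) = tr F'(f) df/dlambda
# for F(x) = x^3 and f(lambda) = A + lambda * B.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """The integral in the matrix case: sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

# Two symmetric matrices that do not commute (assumed test data).
A = [[1.0, 2.0],
     [2.0, -1.0]]
B = [[0.0, 1.0],
     [1.0, 3.0]]

def f_of(lam):
    """The quantity f(lambda) = A + lambda * B; its derivative is B."""
    return [[A[i][j] + lam * B[i][j] for j in range(2)] for i in range(2)]

def int_F(lam):
    """The integral of F(f) = f^3, i.e. tr f(lambda)^3."""
    m = f_of(lam)
    return trace(matmul(m, matmul(m, m)))

lam, h = 0.3, 1e-5

# Left-hand side: central difference approximation of the derivative.
numeric = (int_F(lam + h) - int_F(lam - h)) / (2 * h)

# Right-hand side: tr F'(f) df/dlambda = 3 tr(f^2 B).
m = f_of(lam)
analytic = 3 * trace(matmul(matmul(m, m), B))
```

Although $f$ and $df/d\lambda=B$ do not commute here, both numbers
agree up to discretization error, since the cyclic property (EA3) of
the integral allows all three terms of the product rule to be brought
into the same form.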
559:
560:
561: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
562: \section{Gibbs states}\label{s.gibbs}
563:
564: Our next task is to specify the formal properties of the value of a
565: quantity.
566:
567: \begin{dfn}\label{d.state}
568: A \bfi{state} is a mapping $^-$ that assigns to all quantities $f$
569: from a subspace of $\Ez$ containing all bounded quantities
570: its \bfi{value} $\overline{f}=:\< f\> \in \Cz$
571: such that for all $f,g \in \Ez$, $\alpha \in \Cz$,
572:
573: (E1)~ $\<1\> =1, ~~\<f^*\>=\<f\>^*,~~ \< f+g\> =\<f\> +\<g\> $,
574:
575: (E2)~ $\<\alpha f\> =\alpha\<f\>$,
576:
577: (E3)~ If $f \ge 0$ then $\<f\> \ge 0$,
578:
579: (E4)~ If $f_l\in\Ez,~ f_l \downarrow 0$ then $\<f_l\> \downarrow 0$.
580: \end{dfn}
581:
582: Note that this formal definition of a state -- always used in the
583: remainder of the book -- differs from the phenomenological
584: thermodynamic states defined in Section \ref{s.phen}.
585: The connection between the two notions will be made in
586: Section \ref{s.eos}.
587:
588: Statistical mechanics essentially originated with Josiah Willard Gibbs,
589: whose 1902 book \sca{Gibbs} \cite{Gib} on (at that time of course
590: only classical) statistical mechanics is still readable. See
591: \sca{Uffink} \cite{Uff} for a history of the subject.
592:
593: All states arising in thermodynamics have the following
594: particular form.
595:
596: \begin{dfn} \label{2.7.}
597: A \bfi{Gibbs state} is defined by assigning to any $g\in\Ez$ the value
598: \lbeq{2-10a}
599: \<g\>:=\sint e^{-S/\kbar} g,
600: \eeq
601: where $S$, called the \bfi{entropy} of the state, is a Hermitian
602: quantity with strongly integrable $e^{-S/\kbar}$, satisfying the
603: normalization condition
604: \lbeq{2-10}
605: \sint e^{-S/\kbar}=1,
606: \eeq
607: and $\kbar$ is the Boltzmann constant
608: \lbeq{e.kbar}
609: \kbar \approx 1.38065 \cdot 10^{-23} J/K.
610: \eeq
611: Theorem \ref{2.6.} below implies that a Gibbs state is indeed a state.
612: \end{dfn}
613:
614: The Boltzmann constant defines the units in which the entropy is
615: measured. In analogy\footnote{
616: As we shall see in \gzit{e.qmunc} and \gzit{e.thunc}, $\hbar$ and
617: $\kbar$ play indeed analogous roles in quantum mechanical and
618: thermodynamic uncertainty relations.
619: } % end footnote
620: with Planck's constant $\hbar$,
621: we write $\kbar$ in place of the customary $k$ or $k_B$, in order to
622: be free to use the letter $k$ for other purposes.
623: By a change of units one can enforce any value of $\kbar$.
624: Chemists use instead of particle number $N$ the corresponding \bfi{mole
625: number}, which differs by a fixed numerical factor, the \bfi{Avogadro
626: constant}
627: \[
628: N_A=R/\kbar \approx 6.02214 \cdot 10^{23}\fct{mol}^{-1},
629: \]
630: where $R$ is the universal gas constant \gzit{e.R}.
631: As a result, all results from statistical mechanics may be translated
632: to phenomenological thermodynamics by setting $\kbar = R$,
633: corresponding to setting $1 \fct{mol} = 6.02214 \cdot 10^{23}$,
634: the number of particles in one mole of a pure substance.
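
The conversion between particle numbers and mole numbers amounts to a
single division. The following sketch illustrates it; the numerical
value of the gas constant $R$ used here is the SI value and is an
assumption in the sense that it is not stated in this section, and the
function name \verb|moles| is hypothetical.

```python
# Relation N_A = R / kbar between the gas constant and the Boltzmann constant.

KBAR = 1.380649e-23   # Boltzmann constant in J/K
R = 8.314462618       # universal gas constant in J/(mol K); SI value, assumed here

N_A = R / KBAR        # Avogadro constant in 1/mol

def moles(num_particles):
    """Mole number corresponding to a given particle number."""
    return num_particles / N_A
```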
635:
636: What is here called entropy has a variety of alternative names in the
637: literature on statistical mechanics. For example,
638: \sca{Gibbs} \cite{Gib}, who first noticed the rich thermodynamic
639: implications of states defined by \gzit{2-10a}, called $-S$ the
640: {\em index of probability};
641: \sca{Alhassid \& Levine} \cite{AlhL} and \sca{Balian} \cite{Bal2}
642: use the name {\em surprisal} for $S$. Our terminology is close to
643: that of \sca{Mrugala} et al. \cite{MruNSS}, who call
644: $S$ the {\em microscopic entropy}, and \sca{Hassan} et al. \cite{HasVL},
645: who call $S$ the {\em information(al) entropy operator}.
646: What is traditionally (and in Section \ref{s.phen}) called entropy
647: and denoted by $S$
648: is in the present setting the value $\ol S=\<S\>$.
649:
650:
651: \begin{thm} \label{2.6.}~\\
652: (i) A Gibbs state determines its entropy uniquely.
653:
654: (ii) For any Hermitian quantity $f$ with strongly integrable $e^{-f}$,
655: the mapping $\<\cdot\>_f$ defined by
656: \lbeq{2-6a}
657: \< g \>_f:=Z_f^{-1}\sint e^{-f} g,~~~\mbox{where } Z_f:=\sint e^{-f},
658: \eeq
659: is a state. It is a Gibbs state with entropy
660: \lbeq{2-8}
661: S_f:=\kbar (f+\log Z_f).
662: \eeq
663: (iii) The \bfi{KMS condition} (cf. \sca{Kubo} \cite{Kub0},
664: \sca{Martin \& Schwinger} \cite{MarS})
665: \lbeq{e.KMS}
\<gh\>_f = \<hQ_f g\>_f ~~~\mbox{for bounded } g,h
667: \eeq
668: holds. Here $Q_f$ is the linear mapping defined by
669: \[
670: Q_f g :=e^{-f}ge^{f}.
671: \]
672: \end{thm}
673:
674: \bepf
675: (i) If the entropies $S$ and $S'$ define the same Gibbs state then
676: \[
677: \sint (e^{-S/\kbar}-e^{-S'/\kbar}) g = \<g\>-\<g\>=0
678: \]
679: for all $g$, hence \gzit{e.ean4} gives $e^{-S/\kbar}-e^{-S'/\kbar}=0$.
680: This implies that $e^{-S/\kbar}=e^{-S'/\kbar}$, hence $S=S'$ by
681: Proposition \ref{app2a.}.
682:
683: (ii) The quantity $d:=e^{-f/2}$ is nonzero and satisfies $d^*=d$,
684: $e^{-f}=d^*d\geq 0$. Hence $Z_f>0$ by (EA2), and $\rho:=Z_f^{-1}e^{-f}$
is Hermitian and nonnegative. For $h\ge 0$, the quantity $g=\sqrt{h}$
686: is Hermitian (by Proposition \ref{app2a.}) and satisfies
687: $g\rho g^*=Z_f^{-1}(gd)(gd)^* \ge 0$, hence
688: by (EA3),
689: \[
690: \<h\>_f=\<g^*g\>_f= \sint \rho g^*g =\sint g\rho g^* \ge 0.
691: \]
692: Moreover, $\<1\>_f =Z_f^{-1}\sint e^{-f}=1$. Similarly, if $g\ge 0$
then $g=h^*h$ with $h=\sqrt{g}=h^*$ and with $k:=e^{-f/2}h$, we get
694: \[
695: Z_f\<g\>_f = \sint e^{-f}hh^*=\sint h^*e^{-f}h = \sint k^*k \ge 0.
696: \]
This implies (E3). The other axioms (E1), (E2), and (E4) follow easily from the
698: corresponding properties of the integral. Thus $\<\cdot\>_f$ is a state.
699: Finally, with the definition \gzit{2-8}, we have
700: \[
701: Z_f^{-1}e^{-f}=e^{-f-\log Z_f}=e^{-S_f/\kbar},
702: \]
703: whence $\<\cdot\>_f$ is a Gibbs state.
704:
705: (iii) By (EA3),
706: $\<hQ_fg\>_f=\sint e^{-f}hQ_fg=\sint Q_fge^{-f}h =\sint e^{-f}gh
707: =\<gh\>_f$.
708: \epf
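
Part (ii) of Theorem \ref{2.6.} can be illustrated in the simplest
commutative setting, where $f$ is a diagonal matrix and the integral is
the trace. The sketch below represents $f$ by the list of its diagonal
entries, computes $Z_f$, the values of the diagonal projectors, and the
entropy \gzit{2-8}, and then checks the normalization \gzit{2-10}. The
function name \verb|gibbs_state| and the sample entries are assumptions
made for illustration.

```python
import math

KBAR = 1.380649e-23   # Boltzmann constant in J/K

def gibbs_state(f_diag):
    """Gibbs state of a diagonal quantity f, given by its diagonal entries.

    Returns the partition function Z_f, the diagonal entries of
    Z_f^{-1} e^{-f}, and the diagonal entries of the entropy
    S_f = kbar * (f + log Z_f).
    """
    weights = [math.exp(-x) for x in f_diag]
    Z = sum(weights)                       # Z_f = tr e^{-f}
    probs = [w / Z for w in weights]       # diagonal of Z_f^{-1} e^{-f}
    S = [KBAR * (x + math.log(Z)) for x in f_diag]
    return Z, probs, S

f_diag = [0.0, 1.0, 2.5]                   # assumed sample data
Z, probs, S = gibbs_state(f_diag)

# Normalization condition: tr e^{-S/kbar} should equal 1.
norm = sum(math.exp(-s / KBAR) for s in S)

# Shifting f by a constant changes Z_f but not the state.
shifted_probs = gibbs_state([x + 7.0 for x in f_diag])[1]
```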
709:
710: Note that the state \gzit{2-6a} is unaltered when $f$ is
711: shifted by a constant. $Q_f$ is called the \bfi{modular automorphism}
712: of the state $\<\cdot\>_f$ since $Q_f(gh)=Q_f(g)Q_f(h)$; for a classical
713: system, $Q_f$ is the identity. In the following, we shall not make use
714: of the KMS condition; however, it plays an important role in the
715: mathematics of the thermodynamic limit (cf. \sca{Thirring} \cite{Thi}).
716:
717: $Z_f$ is called the \bfi{partition function} of $f$; it is a function of
718: whatever parameters appear in a particular form given to $f$ in the
719: applications. A large part of traditional statistical mechanics is
720: concerned with the calculation, for given $f$, of the partition
721: function $Z_f$ and of the values $\<g\>_f$ for selected quantities $g$.
722: As we shall see, the basic results of
723: statistical mechanics are completely independent of the details
724: involved, and it is this basic part that we concentrate upon in this
725: book.
726:
727: \begin{expl}\label{ex.canonical}
728: A \bfi{canonical ensemble}\footnote{\label{f.ensemble}
729: Except in the traditional notions of a microcanonical, canonical, or
730: grand canonical ensemble, we avoid the term \bfi{ensemble} which in
statistical mechanics is de facto used as a synonym for state but
often has the connotation of a large real or imagined
collection of identical copies of a system. The latter interpretation
has well-known difficulties in explaining why each single
735: macroscopic system is described correctly by thermodynamics;
736: see, e.g., \sca{Sklar} \cite{Skl}.
} % end footnote
738: is defined as a Gibbs state whose entropy is an affine function of a
739: Hermitian quantity $H$, called the \bfi{Hamiltonian}:
740: \[
741: S=\beta H + \const,
742: \]
743: with a constant depending on $\beta$, computable from \gzit{2-8} and
744: the partition function
745: \[
746: Z=\sint e^{-\beta H}
747: \]
748: of $f=\beta H$.
749: In particular, in the quantum case, where $\sint$ is the trace, the
750: finiteness of $Z$ implies that
751: $S$ and hence $H$ must have a discrete spectrum that is bounded below.
752: Hence the partition function takes the familiar form
753: \lbeq{e3.3}
754: Z=\tr e^{-\beta H} = \sum_{n \in \cal N} e^{-\beta E_n},
755: \eeq
756: where the $E_n$ ($n\in\cal N$) are the \bfi{energy levels}, the
757: eigenvalues of $H$.
758: If the spectrum of $H$ is known, this leads to explicit formulas for
759: $Z$. For example, a \bfi{two level system} is defined by the energy
760: levels $0,E$ (or $E_0$ and $E_0+E$, which gives the same results),
761: and has
762: \lbeq{e.2level}
763: Z=1+e^{-\beta E}.
764: \eeq
765: It describes a single \bfi{Fermion mode}, but also many other systems
766: at low temperature; cf. \gzit{e.2levelapprox}. In particular, it is the
767: basis of laser-induced chemical reactions in photochemistry (see, e.g.,
768: \sca{Karlov} \cite{Kar}, \sca{Murov} et al. \cite{MurCH}), where
769: only two electronic energy levels (the ground state and the first
770: excited state) are relevant; cf. the discussion of
771: \gzit{e.2levelapprox} below.
772:
773: For a \bfi{harmonic oscillator}, defined by the energy levels $nE$,
774: $n=0,1,2,\dots$ and describing a single \bfi{Boson mode}, we have
775: \[
776: Z=\sum_{n=0}^\infty e^{-n\beta E} = (1-e^{-\beta E})^{-1}.
777: \]
778: Independent modes are modelled by taking tensor products of single
779: mode algebras and adding their Hamiltonians, leading to spectra which
780: are obtained by summing the eigenvalues of the modes in all possible
781: ways. The resulting partition function is the product of the
782: single-mode partition functions.
\at{expand? treat Maxwell case? $\sint f = \sum_n f(n)/n!$}
784: From here, a thermodynamic limit
785: leads to the properties of ideal gases. Then nonideal gases due to
786: interactions can be handled using the cumulant expansion, as
787: indicated at the end of Section \ref{s.gen}. The details are outside
788: the scope of this book.
789: \end{expl}
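
As a small check of the product rule for independent modes, consider
(as an illustrative assumption) two independent two level systems with
energy levels $0,E_1$ and $0,E_2$. Summing the eigenvalues in all
possible ways gives the four energy levels $0$, $E_1$, $E_2$, $E_1+E_2$
of the combined system, and \gzit{e3.3} yields
\[
Z=1+e^{-\beta E_1}+e^{-\beta E_2}+e^{-\beta(E_1+E_2)}
=(1+e^{-\beta E_1})(1+e^{-\beta E_2}),
\]
the product of two partition functions of the form \gzit{e.2level}.
The same computation for $N$ such modes with energies $E_1,\dots,E_N$
gives $Z=\prod_{k=1}^N(1+e^{-\beta E_k})$.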
790:
791: Since the Hamiltonian can be any Hermitian quantity, the quantum
792: partition function formula \gzit{e3.3} can in principle be used to
793: compute the partition function of arbitrary quantized Hermitian
794: quantities.
795:
796:
797: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
798: \section{Kubo product and generating functional} \label{s.gen}
799:
800:
801:
802: The negative logarithm of the partition function, the so-called
803: generating functional, plays a fundamental role in statistical
804: mechanics.
805:
806: We first discuss a number of general properties, discovered by
807: \sca{Gibbs} \cite{Gib}, \sca{Peierls} \cite{Pei},
808: \sca{Bogoliubov} \cite{Bog}, \sca{Kubo} \cite{Kub},
809: \sca{Mori} \cite{Mor}, and \sca{Griffiths} \cite{Gri}.
810: The somewhat technical setting involving the Kubo inner product is
811: necessary to handle noncommuting quantities correctly;
812: everything would be much easier in the classical case.
813: On a first reading, the proofs in this section may be skipped.
814:
815: \begin{prop} Let $f$ be Hermitian such that $e^{sf}$ is strongly
816: integrable for all $s\in[-1,1]$. Then
817: \lbeq{e.kubo}
818: \<g;h\>_f:=\<g E_f h\>_f,
819: \eeq
820: where $E_f$ is the linear mapping defined for Hermitian $f$ by
821: \[
822: E_f h:=\int_0^1 ds\, e^{-sf}he^{sf},
823: \]
824: defines a bilinear, positive definite inner product
825: $\<\cdot\,;\cdot\>_f$ on the algebra of quantities,
826: called the \bfi{Kubo} (or \bfi{Mori} or \bfi{Bogoliubov})
827: \bfi{inner product}.
For all quantities $g,h$, the following relations hold:
829: \lbeq{e.kubo2}
830: \<g;h\>_f^* =\<h^*;g^*\>_f.
831: \eeq
832: \lbeq{e.definit}
833: \<g^*;g\>_f > 0 ~~~\mbox{if } g \ne 0.
834: \eeq
835: \lbeq{e.kubo1}
836: \<g;h\>_f =g\<h\>_f ~~~\mbox{if $g\in \Cz$},
837: \eeq
838: \lbeq{e.kubo0}
839: \<g;h\>_f =\<gh\>_f ~~~\mbox{if $g$ or $h$ commutes with $f$},
840: \eeq
841: \lbeq{e.E0}
E_f g = g ~~~\mbox{if $g$ commutes with $f$}.
843: \eeq
844: If $f=f(\lambda)$ depends continuously differentiably on the
845: real parameter vector $\lambda$ then
846: \lbeq{e.deriv0}
847: \frac{d}{d\lambda} e^{-f} = - \Big(E_f \frac{df}{d\lambda}\Big)e^{-f}.
848: \eeq
849: \end{prop}
850:
851: \bepf
852: (i) We have
853: \[
854: \<g;h\>_f^* =\<(gE_fh)^*\>_f = \<(E_fh)^*g^*\>_f
855: =\Big\<\int_0^1 ds\,e^{sf}h^*e^{-sf}g^*\Big\>_f
856: =\int_0^1 ds\<e^{sf}h^*e^{-sf}g^*\>_f.
857: \]
858: The integrand equals
859: \[
860: \sint e^{-f}e^{sf}h^*e^{-sf}g^* = \sint e^{sf}e^{-f}h^*e^{-sf}g^*
861: =\sint e^{-f}h^*e^{-sf}g^*e^{sf} = \<h^*e^{-sf}g^*e^{sf}\>_f
862: \]
863: by (EA3), hence
864: \[
865: \<g;h\>_f^* = \int_0^1 ds\<h^*e^{-sf}g^*e^{sf}\>_f
866: = \Big\<h^* \int_0^1 ds\,e^{-sf}g^*e^{sf}\Big\>_f
867: = \<h^*E_fg^*\>_f=\<h^*;g^*\>_f.
868: \]
869: Thus \gzit{e.kubo2} holds.
870:
(ii) Suppose that $g\ne 0$. For $s\in[0,1]$, we define $u=s/2,v=(1-s)/2$
and $g(s):= e^{-uf}ge^{-vf}$. Since $f$ is Hermitian,
$g(s)^*= e^{-vf}g^*e^{-uf}$, hence by (EA3) and (EA2),
\[
\sint e^{-f}g^*e^{-sf}ge^{sf}=\sint e^{-vf}g^*e^{-2uf}ge^{-vf}
=\sint g(s)^*g(s)>0,
\]
878: so that
879: \[
880: \<g^*;g\>_f=\<g^*E_fg\>_f
881: =\int_0^1 ds\,\sint e^{-f}g^*e^{-sf}ge^{sf} > 0.
882: \]
883: This proves \gzit{e.definit}, and shows that the Kubo inner product is
884: positive definite.
885:
886: (iii) If $f$ and $g$ commute then $ge^{sf}=e^{sf}g$, hence
887: \[
888: E_fg=\int_0^1 ds e^{-sf}e^{sf} g = \int_0^1 ds g = g,
889: \]
890: giving \gzit{e.E0}. The definition of the Kubo inner product then
891: implies \gzit{e.kubo0}, and taking $g\in\Cz$ gives \gzit{e.kubo1}.
892:
893: (iv) The function $q$ on $[0,1]$ defined by
894: \[
895: q(t):= \int_0^t ds\, e^{-sf}\frac{df}{d\lambda}e^{sf}
896: +\Big(\frac{d}{d\lambda}e^{-tf}\Big) e^{tf}
897: \]
898: satisfies $q(0)=0$ and
899: \[
900: \frac{d}{dt}q(t) = e^{-tf}\frac{df}{d\lambda}e^{tf}
901: +\Big(\frac{d}{d\lambda}e^{-tf}\Big)f e^{tf}
902: +\frac{d}{d\lambda}(-e^{-tf}f) e^{tf} = 0.
903: \]
904: Hence $q$ vanishes identically. In particular, $q(1)=0$, giving
905: \gzit{e.deriv0}.
906: \epf
907:
908: As customary in thermodynamics, we use differentials to express
909: relations involving the differentiation by arbitrary parameters.
910: To write \gzit{e.deriv0} in differential form, we formally multiply by
911: $d\lambda$, and obtain the \bfi{quantum chain rule} for exponentials,
912: \lbeq{e.chain}
913: d e^{-f} = (- E_fd f) e^{-f}.
914: \eeq
915: If the $f(\lambda)$ commute for all values of $\lambda$
916: then the quantum chain rule reduces to the classical chain rule.
917: Indeed, then $f$ commutes also with $\frac{df}{d\lambda}$; hence
918: $E_f\frac{df}{d\lambda} = \frac{df}{d\lambda}$, and $E_fd f = df$.
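
To make the mapping $E_f$ concrete, the following sketch assumes that
the quantities are complex $n\times n$ matrices and that a basis has
been chosen in which the Hermitian $f$ is diagonal, with real diagonal
entries $f_1,\dots,f_n$; the symbols $f_j$, $h_{jk}$ and $\varphi$ are
introduced here for illustration only. Since
$(e^{-sf}he^{sf})_{jk}=e^{-s(f_j-f_k)}h_{jk}$, integration over $s$
gives
\[
(E_fh)_{jk}=\varphi(f_j-f_k)\,h_{jk},~~~
\varphi(a):=\int_0^1 ds\,e^{-sa}=\frac{1-e^{-a}}{a}
~~~\mbox{for } a\ne 0,~~~\varphi(0)=1.
\]
Thus, in this setting, $E_f$ rescales each matrix entry of $h$ by a
positive factor depending only on the difference of the corresponding
eigenvalues of $f$. If $h$ commutes with $f$ then $h_{jk}=0$ whenever
$f_j\ne f_k$, so that only the factor $\varphi(0)=1$ occurs and
$E_fh=h$, in agreement with \gzit{e.E0}.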
919:
920: \bigskip
921: {\em The following theorem is central to the mathematics of
922: statistical mechanics.}
923: As will be apparent from the discussion in the next chapter,
924: part (i) is the
925: abstract mathematical form of the second law of thermodynamics,
926: part (ii) allows the actual computation of thermal properties from
927: microscopic assumptions, and part (iii) is the abstract form of the
928: first law.
929:
930: \begin{thm} \label{t3.3}
931: Let $f$ be Hermitian such that $e^{sf}$ is strongly
932: integrable for all $s\in[-1,1]$.
933:
934: (i) The \bfi{generating functional}
935: \lbeq{e.gen}
936: W(f):=- \log \sint e^{-f}
937: \eeq
938: is a concave function of the Hermitian quantity $f$.
939: In particular,
940: \lbeq{e.GB}
941: W(g) \le W(f)+\<g-f\>_f.~~~
942: \mbox{\bf (\bfi{Gibbs-Bogoliubov inequality})}
943: \eeq
944: Equality holds in \gzit{e.GB} iff $f$ and $g$ differ by a constant.
945:
946: (ii) For Hermitian $g$, we have
947: \lbeq{e.starh}
948: W(f+\tau g)=W(f)-\log\<e^{-f-\tau g}e^f\>_f.
949: \eeq
950: Moreover, the \bfi{cumulant expansion}
951: \lbeq{e.cumulant}
952: W(f+\tau g)
953: = W(f)+\tau\<g\>_f + \frac{\tau^2}{2}(\<g\>_f^2-\<g;g\>_f) + O(\tau^3)
954: \eeq
955: holds if the coefficients are finite.
956:
957: (iii) If $f=f(\lambda)$ and $g=g(\lambda)$ depend continuously
958: differentiably on $\lambda$ then the following \bfi{differentiation
959: formulas} hold:
960: \lbeq{e.diff}
961: d\<g\>_f = \<dg\>_f-\<g;df\>_f+\<g\>_f\<df\>_f,
962: \eeq
963: \lbeq{e.diffW}
964: dW(f)=\<df\>_f.
965: \eeq
966: (iv) The entropy of the state $\<\cdot\>_f$ is
967: \lbeq{e.ent}
968: S=\kbar(f-W(f)).
969: \eeq
970: \end{thm}
971:
972: \bepf
973: We prove the assertions in reverse order.
974:
975: (iv) Equation \gzit{e.gen} says that $W(f)=-\log Z_f$, which together
976: with \gzit{2-8} gives \gzit{e.ent}.
977:
978: (iii) We have
979: \[
980: \bary{lll}
981: d\sint ge^{-f} &=& \sint dg e^{-f} + \sint gde^{-f}
982: =\sint dge^{-f}-\sint gE_fd fe^{-f}\\
983: &=&\sint(dg-gE_fd f)e^{-f} = Z_f\<dg-gE_fd f\>_f.
984: \eary
985: \]
986: On the other hand,
987: $d\sint ge^{-f} = d(Z_f\<g\>_f)=dZ_f\<g\>_f+Z_fd\<g\>_f$, so that
988: \lbeq{e.s1}
989: dZ_f\<g\>_f+Z_fd\<g\>_f = Z_f\<dg-gE_fd f\>_f
990: = Z_f\<dg\>_f-Z_f\<g;df\>_f.
991: \eeq
992: In particular, for $g=1$ we find by \gzit{e.kubo1} that
993: $dZ_f=-Z_f\<1;df\>_f=-Z_f\<df\>_f$. Now \gzit{e.diffW} follows from
994: $dW(f)=-d\log Z_f =-dZ_f/Z_f = \<df\>_f$, and solving \gzit{e.s1} for
995: $d\<g\>_f$ gives \gzit{e.diff}.
996:
997: (ii) Equation \gzit{e.starh} follows from
998: \[
999: e^{-W(h)} = \sint e^{-h} = \sint e^{-h} e^f e^{-f}
1000: = \sint e^{-f} e^{-h} e^f = (\sint e^{-f}) \<e^{-h} e^f\>_f
1001: = e^{-W(f)} \<e^{-h} e^f\>_f
1002: \]
1003: by taking logarithms and setting $h=f+\tau g$. To prove the cumulant
1004: expansion, we introduce the function $\phi$ defined by
1005: \[
\phi(\tau):=W(f+\tau g).
1007: \]
1008: From \gzit{e.diffW}, we find $\phi'(\tau) = \<g\>_{f+\tau g}$
1009: for $f,g$ independent of $\tau$, and by differentiating this again,
1010: \[
1011: \phi''(\tau)=\D\frac{d}{d\tau} \<g\>_{f+\tau g}
=\D-\Big\<gE_{f+\tau g}\frac{d (f+\tau g)}{d\tau}\Big\>_{f+\tau g}
1013: +\<g\>_{f+\tau g}^2.
1014: \]
1015: In particular,
1016: \lbeq{e.x5}
1017: \phi'(0) = \<g\>_f,~~~
1018: \phi''(0) = \<g\>_f^2-\<gE_f g\>_f= \<g\>_f^2-\<g;g\>_f.
1019: \eeq
1020: A Taylor expansion now implies \gzit{e.cumulant}.
1021:
(i) Since the Cauchy-Schwarz inequality
1023: for the Kubo inner product implies
1024: \[
1025: \<g\>_f^2=\<g;1\>_f^2\le \<g;g\>_f\<1;1\>_f= \<g;g\>_f,
1026: \]
1027: \gzit{e.x5} implies that
1028: \[
1029: \frac{d^2}{d\tau^2} W(f+\tau g)\Big|_{\tau=0}\le 0
1030: \]
1031: for all $f,g$. This implies that $W(f)$ is concave.
1032: Moreover, replacing $f$ by $f+sg$, we find that $\phi''(s)\le 0$ for
1033: all $s$. The remainder form of Taylor's theorem therefore gives
1034: \[
1035: \phi(\tau)=\phi(0)+\tau\phi'(0)+\int_0^\tau ds (\tau-s)\phi''(s)
1036: \le \phi(0)+\tau\phi'(0),
1037: \]
1038: and for $\tau=1$ we get
1039: \lbeq{e.x6}
1040: W(f+g)\le W(f)+\<g\>_f.
1041: \eeq
\gzit{e.GB} follows upon replacing $g$ by $g-f$.
1043:
1044: By the derivation, equality holds in \gzit{e.x6} only if $\phi''(s)=0$
1045: for all $0<s<1$. By \gzit{e.x5}, applied with $f+sg$ in place of $f$,
1046: this
1047: implies $\<g\>_{f+sg}^2 = \<g;g\>_{f+sg}$. Thus we have equality in
1048: the Cauchy-Schwarz argument, forcing $g$ to be a multiple of $1$.
Therefore equality in the Gibbs-Bogoliubov inequality \gzit{e.GB}
is possible only if $g-f$ is a constant. Conversely, if $g=f+c$ with a
real constant $c$ then $W(g)=W(f)+c=W(f)+\<g-f\>_f$, so that equality
holds.
1051: \epf
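
As an illustration of \gzit{e.diffW}, consider the canonical ensemble
of Example \ref{ex.canonical}, assuming that the Hamiltonian $H$ does
not depend on $\beta$. Then $f=\beta H$ gives $df=H\,d\beta$, hence
$dW=\<H\>d\beta$, so that $\<H\>=dW/d\beta$ with $W=-\log Z$.
For the two level system \gzit{e.2level}, this gives
\[
W=-\log(1+e^{-\beta E}),~~~
\<H\>=\frac{dW}{d\beta}=\frac{Ee^{-\beta E}}{1+e^{-\beta E}}
=\frac{E}{e^{\beta E}+1},
\]
and for the harmonic oscillator (assuming $\beta E>0$, so that the
geometric series converges),
\[
W=\log(1-e^{-\beta E}),~~~
\<H\>=\frac{dW}{d\beta}=\frac{Ee^{-\beta E}}{1-e^{-\beta E}}
=\frac{E}{e^{\beta E}-1}.
\]
Differentiating once more in the two level case gives
$d^2W/d\beta^2=-E^2e^{\beta E}/(e^{\beta E}+1)^2\le 0$, consistent
with the concavity asserted in part (i).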
1052:
1053: As a consequence of the Gibbs-Bogoliubov inequality, we derive an
1054: important inequality for the entropy.
1055:
1056: \begin{thm} \label{t4.5}
1057: Let $S_c$ be the entropy of a reference state. Then, for an arbitrary
1058: Gibbs state $\<\cdot\>$ with entropy $S$,
1059: \lbeq{e4.5}
1060: \< S \> \le \< S_c\>,
1061: \eeq
1062: with equality only if $S_c =S$.
1063: \end{thm}
1064:
1065: \bepf
1066: Let $f=S/\kbar$ and $g=S_c/\kbar$. Since $S$ and $S_c$ are
1067: entropies, $W(f)=W(g)=0$, and the Gibbs-Bogoliubov inequality
1068: \gzit{e.GB} gives $0\le \<g-f\>_f = \<S_c-S\>/\kbar$.
1069: This implies \gzit{e4.5}. If equality holds then equality holds in
1070: \gzit{e.GB}, so that $S_c$ and $S$ differ only by a constant.
1071: But this constant vanishes since the values agree.
1072: \epf
1073:
1074: The difference
1075: \lbeq{4-5}
1076: \< S_c-S\> =\< S_c\> -\< S\> \ge 0
1077: \eeq
1078: is known as \bfi{relative entropy}.
1079: In an information theoretical context (cf. Section \ref{s.complexity}),
1080: the relative entropy may be interpreted as the amount of information
1081: in a state $\< \cdot \>$ which cannot be explained by
1082: the reference state. This interpretation makes sense since
1083: the relative entropy vanishes precisely for the reference state.
1084: A large relative entropy therefore indicates that the state contains
1085: some important information not present in the reference state.
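
For orientation, consider the special case (assumed here only for
illustration) of a classical system with finitely many states $n$,
where $\sint$ is the sum over $n$ and quantities are functions of $n$.
Suppose that the state $\<\cdot\>$ and the reference state have
densities $p_n>0$ and $q_n>0$, each summing to $1$; the symbols $p_n$
and $q_n$ are not part of the general theory. Then $S=-\kbar\log p$,
$S_c=-\kbar\log q$, and $\<g\>=\sum_n p_ng_n$, so that
\[
\<S_c-S\>=\kbar\sum_n p_n\log\frac{p_n}{q_n}\ge 0.
\]
Up to the factor $\kbar$, this is the Kullback--Leibler divergence
of information theory, which vanishes only if $p_n=q_n$ for all $n$.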
1086:
1087:
1088: \bfi{Approximations.}
1089: The cumulant expansion is the basis of a well-known
1090: approximation method in statistical mechanics. Starting from special
1091: reference states $\<\cdot\>_f$ with explicitly known $W(f)$ and $E_f$
1092: (corresponding to so-called explicitly solvable models), one obtains
1093: inductively expressions for values in these states by
1094: applying the differentiation rules. (In the most important cases,
1095: the resulting formulas for the values are commonly
1096: referred to as a \bfi{Wick theorem}, cf. \sca{Wick} \cite{Wic},
1097: although the formulas are much older and were derived in 1918 by
1098: \sca{Isserlis} \cite{Iss}.
1099: For details, see textbooks on statistical mechanics,
1100: e.g., \sca{Huang} \cite{Hua}, \sca{Reichl} \cite{Rei}.)
1101:
From these, one can calculate the coefficients in the cumulant
1103: expansion; note that higher order terms can be found
1104: by proceeding as in the proof, using further differentiation.
1105: \at{Alternatively, one may proceed on the basis of BCH-formulas for
1106: the Lie groups defining the exactly solvable model.}
1107: This gives approximate generating functions (and by
1108: differentiation associated values) for Gibbs states
1109: with an entropy close to the explicitly solvable reference state.
1110: From the resulting generating function and the differentiation
1111: formulas \gzit{e.diff}--\gzit{e.diffW},
1112: one gets as before the values for the given state.
1113:
1114: The best tractable reference state $\<\cdot\>_f$ to be used for a
1115: given Gibbs state $\<\cdot\>_g$ can be obtained by minimizing the
1116: upper bound in the Gibbs-Bogoliubov inequality \gzit{e.GB} over
1117: all $f$ for which an explicit generating function is known.
1118: Frequently, one simply approximates $W(g)$ by the minimum of this
1119: upper bound,
1120: \lbeq{e.meanfield}
1121: W(g) \approx W_m(g):=\inf_f \Big(W(f)+\<g-f\>_f\Big).
1122: \eeq
1123: Using $W_m(g)$ in place of $W(g)$ defines a so-called
1124: \bfi{mean field theory}; cf. \sca{Callen} \cite{Cal}.
1125: For computations from first principles (quantum field theory), see,
1126: e.g., the survey by \sca{Berges} et al. \cite{BerTW}.
1127:
1128:
1129:
1130:
1131:
1132: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1133: \section{Limit resolution and uncertainty} \label{s.limit}
1134:
1135: Definition \ref{d.state} generalizes the expectation axioms of
1136: \sca{Whittle} \cite[Section 2.2]{Whi} for classical probability theory.
1137: Indeed, the values of our quantities are traditionally called
1138: expectation values, and refer to the mean over an ensemble of (real or
1139: imagined) identically prepared systems.
1140:
1141: In our treatment, we keep the notation with pointed brackets familiar
1142: from statistical mechanics, but use the more neutral term {\em value}
1143: for $\<f\>$ to avoid any reference to probability or statistics.
1144: This keeps the formal machinery completely independent of controversial
1145: issues about the interpretation of probabilities. Statistics and
1146: measurements, where the probabilistic aspect enters directly, are
1147: discussed separately in Chapter \ref{s.model}.
1148:
1149: The key to an interpretation of the values of quantities as objective,
1150: observer-independent properties is an analysis of the uncertainty
1151: inherent in the description of a system by a state, based on the
1152: following result.
1153:
1154: \begin{prop}
1155: For Hermitian $g$,
1156: \lbeq{e.res0}
1157: \<g\>^2 \le \<g^2\>.
1158: \eeq
1159: Equality holds if $g=\<g\>$.
1160: \end{prop}
1161:
1162: \bepf
1163: Put $\ol g = \<g\>$. Then $0\le\<(g-\ol g)^2\>=\<g^2-2\ol g g+\ol g^2\>
1164: =\<g^2\>-2\ol g \<g\>+\ol g^2=\<g^2\>-\<g\>^2$.
1165: This gives \gzit{e.res0}. If $g=\ol g$ then equality holds in this
1166: argument.
1167: \epf
1168:
1169: \begin{dfn}
1170: The number
1171: \[
1172: \cov(f,g):=\re \<(f-\overline{f})^*(g-\overline{g}) \>
1173: \]
1174: is called the \bfi{covariance} of $f,g\in\Ez$. Two quantities $f,g$ are
1175: called \bfi{uncorrelated} if $\cov(f,g)=0$, and \bfi{correlated}
1176: otherwise. The number
1177: \[
1178: \sigma(f):=\sqrt{\cov(f,f)}
1179: \]
1180: is called the \bfi{uncertainty} of $f\in\Ez$ in the state $\<\cdot\>$.
1181: The number
1182: \lbeq{e.res}
1183: \res(g):=\sqrt{\<g^2\>/\<g\>^2-1},
1184: \eeq
1185: is called the \bfi{limit resolution}
1186: of a Hermitian quantity $g$ with nonzero value $\<g\>$.
1187: \end{dfn}
1188:
1189: Note that (E3) and \gzit{e.res0} ensure that $\sigma(f)$ and $\res(g)$
1190: are nonnegative real numbers that vanish if $f,g$ are constant,
1191: i.e., complex numbers, and $g\ne 0$.
These definitions extend those of elementary classical
statistics, where $\Ez$ is a commutative algebra of random variables,
to the present, more general situation; in a statistical context,
the uncertainty
$\sigma(f)$ is referred to as the \bfi{standard deviation}.
1197:
1198: There is no need to associate an intrinsic statistical
1199: meaning to the above concepts. We treat the uncertainty
$\sigma(f)$ and the limit resolution $\res(g)$ simply as an absolute
and relative uncertainty measure, respectively, specifying
how accurately one can treat $f$ or $g$ as a sharp number, given by
its value.
1204:
1205: In experimental practice, the limit resolution is a lower bound
1206: on the relative accuracy with which one can expect $\<g\>$ to be
1207: determinable reliably\footnote{
1208: The situation is analogous to the limit resolution with which one can
1209: determine the longitude and latitude of a city such as Vienna.
1210: Clearly these are well-defined only up to some limit resolution
1211: related to the diameter of the city. No amount of measurements can
1212: reduce the uncertainty below about 10km. For an extended object,
1213: the uncertainty in its position is conceptual,
1214: not just a lack of knowledge or precision. Indeed, a point may be
1215: defined to be an object in a state where the position has zero limit
1216: resolution.
1217: }\ % end footnote
1218: from measurements of a single system at a single time.
1219: In particular, a quantity $g$ is considered to be
1220: \bfi{significant} if $\res(g)\ll 1$, while it is \bfi{noise} if
1221: $\res(g)\gg 1$. If $g$ is a quantity and $\widetilde g$ is a good
1222: approximation of its value then $\Delta g:=g-\widetilde g$ is
1223: noise. Sufficiently significant quantities can be treated as
1224: \bfi{deterministic}; the analysis of noise is the subject of
1225: \bfi{statistics}.
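
As a simple illustration, take the two level system of Example
\ref{ex.canonical} in a canonical ensemble, assuming $E>0$ and
$\beta>0$. The Hamiltonian $H$ has the eigenvalues $0$ and $E$, and
$Z=1+e^{-\beta E}$, so that
\[
\<H\>=\frac{Ee^{-\beta E}}{1+e^{-\beta E}},~~~
\<H^2\>=\frac{E^2e^{-\beta E}}{1+e^{-\beta E}},
\]
and \gzit{e.res} gives
\[
\res(H)=\sqrt{\frac{1+e^{-\beta E}}{e^{-\beta E}}-1}
=e^{\beta E/2}>1.
\]
Under these assumptions, the energy of a single two level system is
therefore never significant in the above sense, and it is noise when
$\beta E\gg 1$.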
1226:
1227:
1228:
1229: \begin{prop} \label{p5.2}
1230:
1231: For any state,
1232:
1233: (i) $f\leq g \implies \<f\> \leq \<g\>$.
1234:
1235: (ii) For $f,g\in\Ez$,
1236: \[
1237: \cov(f,g)=\re(\<f^*g\>-\<f\>^*\<g\>),
1238: \]
1239: \[
1240: \<f^*f\>=\<f\>^*\<f\>+\sigma(f)^2,
1241: \]
1242: \[
1243: |\<f\>|\leq\sqrt{\<f^*f\>}.
1244: \]
1245:
1246: (iii) If $f$ is Hermitian then $\bar f = \<f\>$ is real and
1247: \[
1248: \sigma(f)=\sqrt{\<(f-\overline{f})^2 \>}
1249: =\sqrt{\<f^2\>-\<f\>^2}.
1250: \]
1251:
1252: (iv) Two commuting Hermitian quantities $f,g$ are uncorrelated iff
1253: \[
1254: \<fg\>=\<f\>\<g\>.
1255: \]
1256:
1257: \end{prop}
1258:
1259: \bepf
1260: (i) follows from (E1) and (E3).
1261:
1262: (ii) The first formula holds since
1263: \[
1264: \<(f-\bar f)^*(g-\bar g)\>
1265: =\<f^*g\>-\bar f^*\<g\>-\<f\>^*\bar g +\bar f^*\bar g
1266: = \<f^*g\>-\<f\>^*\<g\>.
1267: \]
1268: The second formula follows for $g=f$, using (E1), and the third
1269: formula is an immediate consequence.
1270:
1271: (iii) follows from (E1) and (ii).
1272:
(iv) If $f,g$ are Hermitian and commute then $fg$ is Hermitian by
1274: Proposition \ref{p5.1.2}(ii), hence $\<fg\>$ is real. By (ii),
1275: $\cov(f,g)=\<fg\>-\<f\>\<g\>$, and the assertion follows.
1276: \epf
1277:
1278:
Formally, the essential difference between classical mechanics
and quantum mechanics is the latter's lack of commutativity.
1281: While in classical mechanics there is in principle no lower
1282: limit to the uncertainties with which we can prepare the quantities
1283: in a system of interest,
1284: the quantum mechanical uncertainty relation for noncommuting
1285: quantities puts strict limits on the uncertainties in the preparation
1286: of microscopic states. Here, {\em preparation} is defined informally
as bringing the system into a state such that measuring certain
1288: quantities $f$ gives numbers that agree with the values $\<f\>$ to an
1289: accuracy specified by given uncertainties.
1290:
1291: We now discuss the limits of the accuracy to which this
1292: can be done.
1293:
1294:
1295: \begin{prop} \label{p5.1}~\\
1296: (i) The \bfi{Cauchy--Schwarz inequality}
1297: \[
1298: |\< f^*g \>|^2 \le \< f^*f \>\< g^*g \>
1299: \]
1300: holds for all $f,g\in\Ez$.
1301:
1302: (ii) The \bfi{uncertainty relation}
1303: \[
1304: \sigma(f)^2\sigma(g)^2
1305: \geq |\cov(f,g)|^2+\left|\shalf\<f^*g-g^*f\>\right|^2
1306: \]
1307: holds for all $f,g\in\Ez$.
1308:
1309: (iii) For $f,g\in\Ez$,
1310: \lbeq{ecov1}
1311: \cov(f,g)=\cov(g,f)=\shalf(\sigma(f+g)^2-\sigma(f)^2-\sigma(g)^2),
1312: \eeq
1313: \lbeq{ecov}
1314: |\cov(f,g)| \leq \sigma(f)\sigma(g),
1315: \eeq
1316: \lbeq{esig}
1317: \sigma(f+g) \leq \sigma(f)+\sigma(g).
1318: \eeq
1319: In particular,
1320: \lbeq{e.prodbound}
1321: |\<fg\>-\<f\>\<g\>|\leq\sigma(f)\sigma(g)
1322: ~~~\mbox{for commuting Hermitian } f,g.
1323: \eeq
1324:
1325: \end{prop}
1326:
1327: \bepf
1328: (i) For arbitrary $\alpha ,\beta\in \Cz$ we have
1329: \[
1330: \begin{array}{ll}
1331: 0&\le \<(\alpha f-\beta g)^*(\alpha f-\beta g )\> \\
1332: &=\alpha ^* \alpha \< f^*f \>-\alpha ^* \beta \< f^*g \>
1333: -\beta ^*\alpha \< g^*f \>+\beta\beta^* \< g^*g \>\\
1334: &=|\alpha |^2\< f^*f \>-2\re(\alpha ^* \beta \< f^*g \>)
1335: +|\beta|^2\< g^*g \>
1336: \end{array}
1337: \]
We now choose $\beta=\< g^*f \>=\< f^*g \>^*$, and obtain for arbitrary
real $\alpha $ the inequality
1340: \lbeq{f.8}
1341: 0\le \alpha ^2\< f^*f \>
1342: -2\alpha |\< f^*g \>|^2+|\< f^*g \>|^2\< g^*g \>.
1343: \eeq
1344: The further choice $\alpha=\< g^*g \>$ gives
1345: \[
1346: 0\le \< g^*g \>^2\< f^*f \>-\< g^*g \>|\< f^*g \>|^2.
1347: \]
1348: If $\< g^*g \>>0$, we find after division by $\< g^*g \>$ that (i)
1349: holds. And if $\< g^*g \>\le 0$ then $\< g^*g \>=0$ and we have
$\< f^*g \>=0$ since otherwise a tiny positive $\alpha $ produces a negative
1351: right hand side in \gzit{f.8}. Thus (i) also holds in this case.
1352:
1353: (ii) Since $(f-\bar f)^*(g-\bar g)-(g-\bar g)^*(f-\bar f)=f^*g-g^*f$,
1354: it is sufficient to prove the uncertainty relation for the case of
1355: quantities $f,g$ whose value vanishes. In this case, (i) implies
1356: \[
1357: (\re \<f^*g\>)^2 +(\im \<f^*g\>)^2 =|\<f^*g\>|^2 \leq
1358: \< f^*f \>\< g^*g \> = \sigma(f)^2\sigma(g)^2.
1359: \]
1360: The assertion follows since $\re \<f^*g\>=\cov(f,g)$ and
1361: \[
1362: i\im \<f^*g\>=\shalf(\<f^*g\>-\<f^*g\>^*)=\shalf\<f^*g-g^*f\>.
1363: \]
1364:
1365: (iii) Again, it is sufficient to consider the case of
1366: quantities $f,g$ whose value vanishes. Then
1367: \lbeq{esig1}
1368: \begin{array}{lll}
1369: \sigma(f+g)^2 &=& \<(f+g)^*(f+g)\>
1370: =\<f^*f\>+\<f^*g+g^*f\>+\<g^*g\>\\
1371: &=& \sigma(f)^2+2\cov(f,g)+\sigma(g)^2,
1372: \end{array}
1373: \eeq
1374: and \gzit{ecov1} follows. \gzit{ecov} is an immediate consequence of
1375: (ii), and \gzit{esig} follows easily from \gzit{esig1} and
\gzit{ecov}. Finally, \gzit{e.prodbound} is a consequence of
\gzit{ecov} and the formula $\cov(f,g)=\<fg\>-\<f\>\<g\>$ from the
proof of Proposition \ref{p5.2}(iv).
1378: \epf
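
The relations just proved can be used to see, at least in a simple
model case, why sums of many uncorrelated contributions have a small
limit resolution. As a sketch, assume Hermitian quantities
$g_1,\dots,g_N$ that are pairwise uncorrelated and have a common value
$\<g_k\>=\mu\ne 0$ and a common uncertainty $\sigma(g_k)=\sigma$; the
names $\mu$, $\sigma$ and $N$ are introduced for this illustration
only. Since the covariance is additive in each argument, repeated
application of \gzit{esig1} to $g:=g_1+\dots+g_N$ gives
\[
\sigma(g)^2=\sum_{k}\sigma(g_k)^2+2\sum_{j<k}\cov(g_j,g_k)=N\sigma^2,
\]
while $\<g\>=N\mu$. By Proposition \ref{p5.2}(iii),
$\res(g)=\sigma(g)/|\<g\>|$ for Hermitian $g$, hence
\[
\res(g)=\frac{\sigma}{|\mu|\sqrt{N}}.
\]
Under these assumptions, the limit resolution of the sum decreases
like $N^{-1/2}$, although the uncertainty $\sigma(g)$ itself grows
like $N^{1/2}$.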
1379:
1380: If we apply Proposition \ref{p5.1}(ii) to scalar position $q$ and
1381: momentum $p$ variables satisfying the
1382: \bfi{canonical commutation relation}
1383: \lbeq{ccr}
1384: [q,p]=i\hbar,
1385: \eeq
1386: we obtain
1387: \lbeq{e6.unc0}
1388: \sigma(q)\sigma(p)\geq \shalf\hbar,
1389: \eeq
1390: the \bfi{uncertainty relation} of \sca{Heisenberg} \cite{Hei,Rob}.
It implies that no state exists where both position $q$ and momentum
1392: $p$ have arbitrarily small uncertainty.
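
In more detail, and assuming that $[q,p]$ denotes the commutator
$qp-pq$ and that $q$ and $p$ are Hermitian, we have
$q^*p-p^*q=[q,p]=i\hbar$, and the value of the constant $i\hbar$ is
$i\hbar$ in every state. Proposition \ref{p5.1}(ii) with $f=q$ and
$g=p$ therefore reads
\[
\sigma(q)^2\sigma(p)^2
\ge |\cov(q,p)|^2+\Big|\shalf i\hbar\Big|^2 \ge \frac{\hbar^2}{4},
\]
and taking square roots gives \gzit{e6.unc0}. In particular, equality
in \gzit{e6.unc0} is possible only if $q$ and $p$ are uncorrelated.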
1393:
1394:
1395:
1396:
1397: