cond-mat0608257/ahe.tex
1: %\documentclass[galley,floats, prb, amsmath,showpacs]{revtex4}
2: \documentclass[floats,prb,twocolumn,amsmath,showpacs]{revtex4}
3: 
4: %
5: \usepackage{epsfig}
6: \usepackage{colordvi}
7: \usepackage{graphicx}
8: %
9: %macros
10: \def\k{{\bf k}}
11: \def\kk{{\boldsymbol \kappa}}
12: \def\E{{\bf E}}
13: \def\B{{\bf B}}
14: \def\R{{\bf R}}
15: \def\r{{\bf r}}
16: \def\b{{\bf b}}
17: \def\q{{\bf q}}
18: %
19: \def\pw{^{({\rm W})}}
20: \def\ph{^{({\rm H})}}
21: \def\la{\langle\kern-2.0pt\langle}
22: \def\ra{\rangle\kern-2.0pt\rangle}
23: \def\vt{\vert\kern-1.0pt\vert}
24: %\def\D{{\cal D}\ph}
25: \def\D{{D}\ph}
26: %
27: \begin{document}
28: %\draft
29: \def\dvm#1{\marginpar{\small DV: #1}}
30: \def\xwm#1{\marginpar{\small XW: #1}}
31: \def\jry#1{\marginpar{\small JY: #1}}
32: \def\ivo#1{\marginpar{\small IS: #1}}
33: 
34: \title{Ab initio calculation of the anomalous Hall
35: conductivity by Wannier interpolation}
36: 
37: \author{Xinjie Wang,$^1$ Jonathan R. Yates,$^{2,3}$ Ivo Souza,$^{2,3}$ and
38: David Vanderbilt$^1$}
39: \affiliation{$^1$Department of Physics and Astronomy, Rutgers University,
40:         Piscataway, NJ 08854-8019\\
41:         $^2$Department of Physics, University of California, Berkeley, CA 94720\\
42:         $^3$Materials Science Division, Lawrence Berkeley National Laboratory, Berkeley,
43:         CA 94720}
44: \date{\today}
45: \begin{abstract}
46: The intrinsic anomalous Hall effect in ferromagnets depends on subtle 
47: spin-orbit-induced effects in the electronic structure, and 
48: recent {\it ab-initio} studies found that it was necessary to 
49: sample the Brillouin zone at millions of $k$-points to converge the
50: calculation.
51: We present an efficient first-principles approach for computing the
52: anomalous Hall conductivity.
53: We start out by performing
54: a conventional electronic-structure calculation including spin-orbit coupling
55: on a
56: uniform and relatively coarse
57: $k$-point mesh.  From the resulting Bloch states,
58: maximally-localized Wannier functions are constructed
59: which reproduce the {\it ab-initio} states up to the Fermi level.
60: The Hamiltonian and position-operator matrix elements, needed to
61: represent the
62: energy bands and Berry curvatures,
63: are then set up between the Wannier orbitals.
64: This completes the first stage of the calculation, whereby
65: the low-energy
66: {\it ab-initio} problem is transformed into an effective tight-binding form.
67: The second stage only involves Fourier transforms and
68: unitary transformations of the small matrices set up  in the first stage.
69: With these inexpensive operations, the quantities of interest are
70: interpolated onto a dense $k$-point mesh and used to evaluate the
71: anomalous Hall conductivity as a Brillouin zone integral.
72: The present scheme, which also avoids the
73: cumbersome summation over all unoccupied states in the Kubo formula,
74: is applied to bcc Fe,
75: giving excellent agreement with conventional, less efficient
76: first-principles calculations.
77: Remarkably, we find that more than 99\% of the effect can be recovered by 
78: keeping a set of terms depending only on the Hamiltonian matrix elements,
79: not on matrix elements of the position operator.
80: \end{abstract}
81: 
82: \pacs{71.15.Dx, 71.70.Ej, 71.18.+y, 75.50.Bb, 75.47.-m.}
83: 
84: \maketitle
85: 
86: \vskip2pc
87: \marginparwidth 3.1in
88: \marginparsep 0.5in
89: %\columnseprule 0pt
90: 
91: %=====================
92: \section{Introduction}
93: %=====================
94: 
95: The Hall resistivity of a ferromagnet depends not only on the
96: magnetic induction, but also on the magnetization; the latter dependence
97: is known as the anomalous Hall effect (AHE).\cite{hurd72} 
98: The AHE is used for investigating surface magnetism, and its potential
99: for investigating nanoscale magnetism, as well as for magnetic sensors and 
100: memory-device applications, is being considered.\cite{gerber02}
101: Theoretical investigations of the AHE have undergone a revival in
102: recent years, and have also led to the proposal of a spin counterpart,
103: the spin Hall effect, which has subsequently been realized experimentally.
104: 
105: The first theoretical model of the AHE was put forth by Karplus and
106: Luttinger,\cite{karplus54} who showed that it can arise in a perfect crystal
107: as a result of the spin-orbit interaction of polarized conduction
108: electrons. 
109: Later, two alternative mechanisms, skew scattering\cite{smit58}
110: and side jump scattering,\cite{berger70}
111: were proposed by Smit and Berger, respectively.
112: In skew scattering the spin-orbit interaction gives rise to an
113: asymmetric scattering cross section even if the defect potential
114: is symmetric, and in side-jump scattering the spin-orbit coupling causes the
115: scattered electron to acquire an extra transverse translation after
116: the scattering event. These two mechanisms
117: involve scattering from impurities or phonons, while the 
118: Karplus-Luttinger mechanism is a scattering-free bandstructure
119: effect.  For reasons related to the absence of an intuitive physical
120: picture and the lack of reliable quantitative estimates based on
121: bandstructure calculations, the Karplus-Luttinger theory was strongly 
122: disputed in the early literature.
123: 
124: In recent years, new insights into the Karplus-Luttinger mechanism have
125: been obtained by
126: several authors,\cite{chang96,sundaram99,onoda02,jungwirth02,haldane04} who
127: reexamined it in the modern language of Berry's phases.
128: The term ${\boldsymbol\Omega}_n(\k)$ in the equations below
129: was recognized as the Berry curvature of the Bloch states
130: in reciprocal space, a quantity which had previously appeared in the
131: theory of 
132: the integer quantum Hall effect,\cite{thouless82} and also in the
133: Berry-phase theory of polarization.\cite{ksv93}
134: The anomalous Hall conductivity (AHC)
135: is simply given as the Brillouin zone (BZ) integral of the Berry curvature weighted
136: by the occupation factor of each state,
137: %
138: \begin{eqnarray}
139:                 \sigma_{xy}=\frac{-e^2}{(2\pi)^2h}\sum_{n}\int_{\rm BZ}\,
140: d\k \, f_n(\k)\,\Omega_{n,z}(\k) \;.
141: \label{eq:sigma}
142: \end{eqnarray}
143: %
144: While this can be derived in several ways, it is perhaps most
145: intuitively understood from the semiclassical point of view, in which
146: the group velocity of an electron wavepacket in band $n$ 
147: is\cite{adams59,sundaram99}
148: %
149: \begin{equation}
150:  {\dot \r}=\frac{1}{\hbar}\frac{\partial {\cal E}_{n\k}}{\partial \k}-
151:         {\dot \k}\times {{\bf \Omega}_n(\k)}\;.
152: \label{eq:rdot}
153: \end{equation}
154: %
155: The second term, often overlooked in elementary textbook derivations,
156: is known as the ``anomalous velocity.''  The expression for
157: the current density $\bf J$ then acquires a new term
158: $ef_n(\k)\,{\dot \k}\times {\bf \Omega}_n(\k)$ which, with
159: ${\dot \k}=-e{\bf E}/\hbar$, leads to Eq.~(\ref{eq:sigma}).
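
As a sketch of the intermediate steps (assuming carriers of charge $-e$
with $e>0$, and counting states per unit volume with the measure
$d\k/(2\pi)^3$), the current density is
\begin{equation*}
{\bf J}=-e\sum_n\int_{\rm BZ}\frac{d\k}{(2\pi)^3}\,f_n(\k)\,{\dot \r}\;.
\end{equation*}
Substituting the anomalous part of Eq.~(\ref{eq:rdot}) with
${\dot \k}=-e{\bf E}/\hbar$ and taking ${\bf E}=E_y\hat{\bf y}$ gives
\begin{equation*}
J_x=-\frac{e^2}{\hbar}\sum_n\int_{\rm BZ}\frac{d\k}{(2\pi)^3}\,
f_n(\k)\,\Omega_{n,z}(\k)\,E_y\;,
\end{equation*}
and the identity $\hbar(2\pi)^3=(2\pi)^2h$ then reproduces the prefactor of
Eq.~(\ref{eq:sigma}).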
160: 
161: Recently, first-principles calculations of Eq.~(\ref{eq:sigma}) were
162: carried out for the ferromagnetic perovskite SrRuO$_3$ by
163: Fang {\it et al.},\cite{fang03} and for a transition metal, bcc Fe, by Yao
164:  {\it et al.}\cite{yao04} 
165: In both cases the calculated values compared well with experimental data,
166: lending credibility to the intrinsic mechanism.
167: The most striking feature of these calculations is
168: the strong and rapid variation of the Berry curvature %$\Omega({\bf k})$
169: in $k$-space. In particular, there are sharp peaks and valleys at places
170: where two energy bands are split by the spin-orbit coupling across the Fermi 
171: level. In order to converge the integral, the Berry curvature has to be
172: evaluated over millions of $k$-points in the Brillouin zone. In the previous
173: work this was done via a Kubo formula involving a large number of unoccupied
174: states; the computational cost was very high, even for bcc Fe, with only one
175: atom in the unit cell.
176: 
177: In this paper, we present an efficient method for computing the AHC. 
178: Unlike the conventional approach, it does not require carrying out a full 
179: {\it ab-initio} calculation for every $k$-point where the Berry
180: curvature needs to be evaluated. The actual
181: {\it ab-initio} calculation is performed on a much coarser $k$-point grid. 
182: By a post-processing step, the resulting Bloch states below
183: and immediately above the Fermi level are
184: then mapped onto well-localized Wannier functions.
185: In this
186: representation it is then possible to interpolate the Berry curvature onto any
187: desired $k$-point with very little computational effort and essentially
188: no loss of accuracy.
189: 
190: The paper is organized as follows. In Sec.~\ref{sec:background} we
191: introduce the basic definitions and describe the Kubo-formula approach
192: used in previous calculations of the intrinsic AHC.
193: In Sec.~\ref{sec:fdbc} our new Wannier-based approach is described.
194: The details of the band-structure calculation and Wannier-function
195: construction are described in
196: Sec.~\ref{sec:cd}, followed by an application of the method to
197: bcc Fe in Sec.~\ref{sec:results}.
198: Finally, Sec.~\ref{sec:conclusion} contains a brief
199: summary and discussion.
200: 
201: %==================
202: \section{Definitions and background}
203: \label{sec:background}
204: %==================
205: 
206: The key ingredient in the theory of the intrinsic anomalous Hall effect is
207: the Berry curvature ${\bf \Omega}_n({\bf k})$, defined as
208: %
209: \begin{eqnarray}
210:         {\bf \Omega}_n({\bf k})=\boldsymbol\nabla\times {\bf A}_n(\k) \;,
211:         \label{eq:bc}
212: \end{eqnarray}
213: where ${\bf A}_{n}$ is the Berry connection,
214: %
215: \begin{eqnarray}
216:         {\bf A}_n(\k)=i\langle u_{n\k}|\boldsymbol\nabla_\k|u_{n\k} \rangle\;.
217:         \label{eq:berrypot}
218: \end{eqnarray}
219: %
220: The Berry curvature can be written in an
221: equivalent but more explicit form:
222: %
223:   \begin{equation}
224:   \Omega_{n,\alpha \beta}(\k)=\epsilon_{\alpha \beta \gamma}
225:         \, {\Omega}_{n,\gamma}({\bf k}) \;,
226:   \end{equation}
227:   \begin{equation}
228:         \Omega_{n,\alpha \beta}(\k) =
229:         -2\,{\rm Im}\,\Big\langle \frac{\partial u_{n\k}}
230:         {\partial k_\alpha}
231:         \Big|\frac{\partial u_{n\k}}{\partial k_\beta} \Big\rangle \;,
232:   \label{eq:bcurv}
233:   \end{equation}
234: %
235: where the Greek letters indicate Cartesian coordinates,
236: $\epsilon_{\alpha \beta \gamma}$ is the Levi-Civita tensor and
237: $u_{n\k}$ are the cell-periodic Bloch functions. The second-rank
238: %
239: Berry curvature tensor $\Omega_{n,\alpha \beta}(\k)$ is introduced
240: for later use.  The integral of the Berry
241: curvature over a surface bounded by a closed path $C$ in $k$-space
242: is the Berry phase of that path.\cite{berry84}
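
The equivalence of Eqs.~(\ref{eq:bc}) and (\ref{eq:bcurv}) can be
checked directly. As a short worked step for the $z$ component
(the same manipulation applies to the other two), inserting
Eq.~(\ref{eq:berrypot}) into the curl gives
\begin{eqnarray*}
\frac{\partial A_{n,y}}{\partial k_x}-\frac{\partial A_{n,x}}{\partial k_y}
&=& i\Big\langle\frac{\partial u_{n\k}}{\partial k_x}\Big|
     \frac{\partial u_{n\k}}{\partial k_y}\Big\rangle
  - i\Big\langle\frac{\partial u_{n\k}}{\partial k_y}\Big|
     \frac{\partial u_{n\k}}{\partial k_x}\Big\rangle \\
&=& -2\,{\rm Im}\,\Big\langle\frac{\partial u_{n\k}}{\partial k_x}\Big|
     \frac{\partial u_{n\k}}{\partial k_y}\Big\rangle \;.
\end{eqnarray*}
The terms containing second derivatives of $u_{n\k}$ cancel in the
antisymmetric combination, and the last line uses $i(z-z^*)=-2\,{\rm Im}\,z$
for any complex number $z$.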
243: 
244: With this notation we rewrite the quantity we wish to evaluate,
245: Eq.~(\ref{eq:sigma}), as
246: %
247: \begin{eqnarray}
248:                 \sigma_{\alpha\beta}=\frac{-e^2}{(2\pi)^2h}\int_{\rm BZ}\,
249: d\k\, \Omega_{\alpha\beta}(\k) \;,
250: \label{eq:sigma_b}
251: \end{eqnarray}
252: %
253: where we have introduced the {\it total} Berry curvature
254: %
255: \begin{equation}
256: \label{eq:omega_tot}
257: \Omega_{\alpha\beta}(\k)=\sum_{n}\,f_n(\k)\,\Omega_{n,\alpha\beta}(\k).
258: \end{equation}
259: %
260: Direct evaluation of Eq.~(\ref{eq:bcurv})
261: poses a number of practical difficulties related to the presence of
262: $k$-derivatives of Bloch states, as will be discussed in the next section.
263: In previous work\cite{fang03,yao04} these were circumvented by recasting
264: Eq.~(\ref{eq:bcurv}) as a Kubo formula,\cite{thouless82} where
265: the $k$-derivatives are replaced by sums over states:
266: %
267: \begin{eqnarray}
268:         \Omega_{n,\alpha\beta}({\k})=-2{\rm Im}\sum_{m \neq n}
269:    \frac{v_{nm,\alpha}(\k)\,v_{mn,\beta}(\k)}
270:         {(\omega_{m}(\k)-\omega_{n}(\k))^2}\;,
271:          \label{eq:kubo}
272: \end{eqnarray}
273: %
274: where $\omega_{n}(\k)={\cal E}_{n\k}/\hbar$ and the 
275: matrix elements of the Cartesian velocity operators
276: $\hat v_\alpha=(i/\hbar)[\hat H,\hat r_\alpha]$ are given by\cite{blount62}
277: %
278: \begin{equation}
279: v_{nm,\alpha}(\k)=\langle \psi_{n\k}|\hat v_\alpha|\psi_{m\k}\rangle=
280: \frac{1}{\hbar}\,
281: \Big\langle u_{n\k}\Big|\frac{\partial \hat H(\k)}{\partial k_\alpha}\Big|
282: u_{m\k}\Big\rangle\;,
283: \label{eq:vel}
284: \end{equation}
285: %
286: where $\hat H(\k)=e^{-i\k\cdot\hat\r}\hat{H}e^{i\k\cdot\hat\r}$.
287: The merit of Eq.~(\ref{eq:kubo}) lies in its practical implementation on a
288: finite $k$-grid using only the wave functions at a single $k$-point.
289: As is usually the case for such linear-response
290: formulas, sums over pairs of occupied states can be avoided in the
291: $T=0$ version of the formula (\ref{eq:omega_tot}) for the total
292: Berry curvature,
293: %
294: \begin{eqnarray}
295:         \Omega_{\alpha\beta}({\k})=-2{\rm Im}\sum_{v}\sum_{c}
296:    \frac{v_{vc,\alpha}(\k)\,v_{cv,\beta}(\k)}
297:         {(\omega_{c}(\k)-\omega_{v}(\k))^2} \;,
298:          \label{eq:kubotot}
299: \end{eqnarray}
300: where $v$ and $c$ subscripts denote valence (occupied) and conduction 
301: (unoccupied) bands, respectively.  However, the evaluation of this formula
302: requires a cumbersome summation over unoccupied states.  Even when
303: the summation is truncated in practical calculations, the
304: computation can be time-consuming. Moreover, the time required to
305: calculate the matrix elements of the velocity operator in Eq.~(\ref{eq:kubo})
306: or (\ref{eq:kubotot}) is not negligible.
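
For orientation, the passage from Eq.~(\ref{eq:bcurv}) to
Eq.~(\ref{eq:kubo}) can be sketched as follows, assuming nondegenerate
bands at the $k$-point in question. First-order perturbation theory in
$\k$ gives, for $m\neq n$,
\begin{equation*}
\Big\langle u_{m\k}\Big|\frac{\partial u_{n\k}}{\partial k_\alpha}\Big\rangle
=\frac{\langle u_{m\k}|\partial\hat H(\k)/\partial k_\alpha|u_{n\k}\rangle}
      {{\cal E}_{n\k}-{\cal E}_{m\k}}
=\frac{v_{mn,\alpha}(\k)}{\omega_n(\k)-\omega_m(\k)}\;.
\end{equation*}
Inserting the completeness relation $\sum_m|u_{m\k}\rangle\langle u_{m\k}|$
between the two derivatives in Eq.~(\ref{eq:bcurv}), the $m=n$ term is
real and drops out of the imaginary part, while the $m\neq n$ terms
reproduce Eq.~(\ref{eq:kubo}). Writing $X_{nm}$ (a shorthand used only
here) for the summand of Eq.~(\ref{eq:kubo}), the Hermiticity of the
velocity matrices implies
\begin{equation*}
X_{mn}=X_{nm}^*
\quad\Longrightarrow\quad
{\rm Im}\,X_{nm}+{\rm Im}\,X_{mn}=0\;,
\end{equation*}
so that every pair of occupied states cancels in the band sum at $T=0$,
leaving only the valence-conduction pairs of Eq.~(\ref{eq:kubotot}).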
307: 
308: 
309: 
310: %====================
311: \section{Evaluation of the Berry curvature by Wannier interpolation}
312: \label{sec:fdbc}
313: %====================
314: 
315: In view of the above-mentioned drawbacks of the Kubo formula for
316: practical calculations, it would be highly desirable to have a
317: numerical scheme based on
318: the ``geometric formula'' (\ref{eq:bcurv}), in terms of the
319: occupied states only. The difficulties in implementing that
320: formula arise from the $k$-derivatives therein.
321: Since in practice one always replaces the Brillouin zone integration by a
322: discrete summation, an obvious approach would be to use a finite-difference
323: representation of the derivatives on the $k$-point grid. However, this requires
324: some care: a straightforward discretization will yield results which
325: depend on the choice of phases of the Bloch states (i.e., the choice of
326: gauge), even though Eq.~(\ref{eq:bcurv}) is in principle gauge-invariant.
327: The problem becomes more acute in the presence of band crossings and
328: avoided crossings, because
329: then it is not clear which two states at neighboring grid points should
330: be taken as ``partners'' in a finite-differences expression. 
331: (Moreover, since the system is a metal, at $T=0$ the occupation can be
332: different at neighboring $k$-points.) Successful numerical strategies 
333: for dealing with problems of this nature have been developed in the 
334: context of the Berry-phase theory of polarization of insulators, and 
335: a workable  finite-difference scheme which combines those ideas 
336: with Wannier interpolation is sketched in Appendix~B.
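
To make the gauge issue concrete, consider a $\k$-dependent phase change
$|u_{n\k}\rangle\rightarrow e^{i\chi_n(\k)}|u_{n\k}\rangle$, where
$\chi_n(\k)$ is a smooth real function introduced here only for
illustration. Equation~(\ref{eq:berrypot}) then gives
\begin{equation*}
{\bf A}_n(\k)\rightarrow{\bf A}_n(\k)-\boldsymbol\nabla_\k\chi_n(\k)\;,
\qquad
{\bf \Omega}_n(\k)\rightarrow{\bf \Omega}_n(\k)\;,
\end{equation*}
since the curl of a gradient vanishes. On a discrete grid, however, the
phases returned by a diagonalization routine at neighboring $k$-points
are unrelated, so the effective $\chi_n$ is not smooth and a naive
finite-difference estimate of the derivatives does not inherit this
invariance.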
337: 
338: We present here a different, more powerful strategy that also relies 
339: on a Wannier representation of the low-energy electronic structure.  
340: We will show that it is possible to express the needed derivatives 
341: analytically in terms of the Wannier functions, so that no finite-difference 
342: evaluation of a derivative is needed in principle. The use of Wannier
343: functions
344: allows us to achieve this while still avoiding the summation over all
345: empty states which appears in the Kubo formula as a result of applying
346: conventional $k\cdot p$ perturbation theory.
347: 
348: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
349: \subsection{Wannier representation}
350: \label{sec:wr}
351: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
352: 
353: \begin{figure}
354: \begin{center}
355: \epsfig{file=fig1.eps,width=3.4in}
356: \end{center}
357: \caption{Band structure of bcc Fe with spin-orbit coupling
358: included. Solid lines: original band structure
359: from a conventional first-principles calculation.
360: Dotted lines: Wannier-interpolated band
361: structure. The zero of energy is the Fermi
362: level.}
363: \label{fig:band}
364: \end{figure}
365: 
366: We begin by using the approach of Souza, Marzari, and Vanderbilt
367: \cite{souza01} to construct a set of Wannier functions (WFs)
368: for the metallic system
369: of interest.  For insulators, one
370: normally considers a set of WFs that span precisely the space of
371: occupied Bloch states.  Here, since we have a metallic system and
372: we want to have well-localized WFs, we choose a number of WFs
373: larger than the number $N_\k$ of occupied states at any $\k$, and 
374: only insist that the
375: space spanned by the WFs should include, as a subset, the space
376: of the occupied states, plus the first few 
377: empty states. Thus, these partially-occupied
378: WFs will serve here as a kind
379: of ``exact tight-binding basis'' that can be used
380: as a compact representation of
381: the low-energy electronic structure of the metal.
382: 
383: This is illustrated in Fig.~\ref{fig:band}, where the bandstructure
384: of bcc Fe is shown.  The details of the calculations will be presented
385: later in Sec.~\ref{sec:cd}.
386: The solid lines show the full {\it ab-initio} bandstructure, while
387: the dotted lines show the bands obtained within the Wannier
388: representation using $M=18$ WFs per cell (9 of each spin;
389: see Sec.~\ref{sec:maxloc}).
390: In the method of Ref.~\onlinecite{souza01},
391: one specifies an energy $E_{\rm win}$ lying somewhat
392: above the Fermi energy $E_{\rm f}$, and insists on finding a set
393: of WFs spanning all the states in an energy window up to $E_{\rm win}$.
394: In the calculation of Fig.~\ref{fig:band} we chose $E_{\rm win}\simeq
395: 18$\,eV, and it is evident that there is an essentially perfect match
396: between the fully {\it ab-initio} and the Wannier-represented bands up to,
397: but not above, $E_{\rm win}$.
398: 
399: More generally, we shall assume that we have $M$ WFs per unit cell
400: (where $M\ge N_\k$ everywhere in the BZ)
401: such that the Bloch-like functions given by the phased sum of the
402: Wannier orbitals,
403: %
404: \begin{equation}
405: \label{eq:blochW}
406: |u_{n\k}\pw\rangle = \sum_\R e^{-i\k\cdot(\hat\r-\R)}\,|\R n\rangle
407: \end{equation}
408: %
409: ($n=1,\ldots,M$), span the actual Bloch eigenstates $|u_{n\k}\rangle$
410: of interest
411: ($n=1,\ldots,N_\k$) at each $\k$.  It follows that, if we construct the
412: $M\times M$ Hamiltonian matrix
413: %
414: \begin{equation}
415: \label{eq:hamW}
416: H_{nm}\pw(\k)=\langle u_{n\k}\pw | \hat H(\k) | u_{m\k}\pw \rangle
417: \end{equation}
418: %
419: and diagonalize it by finding an $M\times M$ unitary rotation matrix
420: $U(\k)$ such that
421: %
422: \begin{equation}
423: U^\dagger(\k) H\pw(\k) U(\k) = H\ph(\k)
424: \label{eq:Htrans}
425: \end{equation}
426: %
427: where $H\ph_{nm}(\k)={\cal E}\ph_{n\k}\delta_{nm}$, then ${\cal E}\ph_{n\k}$
428: will be identical to the true ${\cal E}_{n\k}$ for all occupied
429: bands.  Also, the corresponding Bloch states
430: %
431: \begin{equation}
432: |u_{n\k}\ph\rangle = \sum_m |u_{m\k}\pw\rangle U_{mn}(\k)
433: \label{eq:twist}
434: \end{equation}
435: %
436: will be identical to the true eigenstates $|u_{n\k}\rangle$ for ${\cal E}\le
437: E_{\rm f}$.  (In the scheme of Ref.~\onlinecite{souza01}, these
438: properties will actually hold for energies up to $E_{\rm win}$.)
439: However, the band energies and Bloch states will {\it not} generally
440: match the true ones at higher energies, as shown in Fig.~\ref{fig:band}.
441: We thus use the superscript `H' to distinguish the projected band
442: energies ${\cal E}\ph_{n\k}$ and eigenvectors $|u_{n\k}\ph\rangle$ from
443: the true ones ${\cal E}_{n\k}$ and $|u_{n\k}\rangle$, keeping in mind
444: that this distinction is only significant in the higher-energy unoccupied
445: region (${\cal E}>E_{\rm win}$) of the projected bandstructure.
446: 
447: The unitary rotation of states expressed by the matrix $U(\k)$ is
448: often referred to as a ``gauge transformation,'' and we shall adopt
449: this terminology here.  We shall refer to the
450: Wannier-derived Bloch-like states $|u_{n\k}\pw \rangle$ as belonging
451: to the Wannier (W) gauge, while the eigenstates $|u_{n\k}\ph \rangle$ of the
452: projected bandstructure are said to belong to the Hamiltonian (H)
453: gauge.
454: 
455: Quantities such as the Berry connection ${\bf A}_n(\k)$ of
456: Eq.~(\ref{eq:berrypot}) and the Berry curvature
457: $\Omega_{n,\alpha\beta}(\k)$ of Eq.~(\ref{eq:bcurv}) clearly depend
458: upon the gauge in which they are expressed.
459: The quantity that we wish to calculate,
460: Eq.~(\ref{eq:omega_tot}), is most naturally
461: expressed in the Hamiltonian gauge, where it takes the form
462: %
463: \begin{equation}
464:   \Omega_{\alpha\beta}(\k)=\sum_{n=1}^{M} f_n(\k) \,
465:   \Omega_{n,\alpha\beta}\ph(\k)
466: \;.
467: \label{eq:omsum}
468: \end{equation}
469: %
470: Here $\Omega_{n,\alpha\beta}\ph(\k)$ is given by Eq.~(\ref{eq:bcurv})
471: with $|u_{n\k}\rangle\rightarrow|u_{n\k}\ph\rangle$.  It is permissible
472: to make this substitution because the projected bandstructure matches the
473: true one for all occupied states.
474: In practice one may take for the occupation factor $f_n(\k)
475: =\theta(E_{\rm f}-{\cal E}_{n\k})$ or introduce a small thermal
476: smearing as desired.
477: 
478: Our strategy now is to see how
479: the right-hand side of Eq.~(\ref{eq:omsum}) can be obtained by starting
480: with quantities that are defined and computed first in the Wannier
481: gauge and then transformed into the Hamiltonian gauge.
482: The resulting scheme can be viewed as a generalized Slater-Koster 
483: interpolation, which takes advantage of the smoothness in $k$-space of the
484: Wannier-gauge objects, a direct consequence of the short range of the Wannier
485: orbitals in real space.
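
As an illustration of why the Wannier-gauge objects are smooth in $\k$,
consider Eq.~(\ref{eq:hamW}). The following is a sketch, assuming that
$\hat H$ is invariant under lattice translations and that the Bloch-like
states are normalized over one unit cell. Inserting
Eq.~(\ref{eq:blochW}) gives
\begin{equation*}
H_{nm}\pw(\k)=\sum_\R e^{i\k\cdot\R}\,
\langle {\bf 0}n|\hat H|\R m\rangle\;,
\end{equation*}
and hence
\begin{equation*}
\frac{\partial H_{nm}\pw(\k)}{\partial k_\alpha}
=\sum_\R iR_\alpha\,e^{i\k\cdot\R}\,
\langle {\bf 0}n|\hat H|\R m\rangle\;.
\end{equation*}
Because the matrix elements $\langle {\bf 0}n|\hat H|\R m\rangle$ decay
rapidly with $|\R|$ for well-localized WFs, these sums converge with a
modest number of lattice vectors and can be evaluated at arbitrary $\k$.
An analogous sum over position-operator matrix elements represents the
Berry connection in the Wannier gauge.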
486: 
487: \subsection{Gauge transformations}
488: 
489: Because the gauge transformation of Eq.~(\ref{eq:twist}) involves a
490: unitary rotation among several bands, it is useful to introduce
491: generalizations of the quantities in Eqs.~(\ref{eq:bc}-\ref{eq:berrypot})
492: having two band indices instead of one. Thus, we define
493: %
494: \begin{equation}
495: A_{nm,\alpha}(\k)=i\langle u_{n}|\partial_\alpha u_{m}\rangle
496: \label{eq:Awg}
497: \end{equation}
498: %
499: and
500: %
501: \begin{equation}
502: \Omega_{nm,\alpha\beta}(\k)=
503:   i\langle \partial_\alpha u_{n}|\partial_\beta  u_{m}\rangle
504:  -i\langle \partial_\beta  u_{n}|\partial_\alpha u_{m}\rangle
505: \;,
506: \label{eq:Owg}
507: \end{equation}
508: %
509: where every object in each of these equations should consistently
510: carry either a (W) or (H) label.
511: (We have now suppressed the $\k$ subscripts
512: and introduced the notation $\partial_\alpha=\partial/\partial k_\alpha$ for
513: conciseness.)  In this notation, Eq.~(\ref{eq:omsum}) becomes
514: %
515: \begin{equation}
516:   \Omega_{\alpha\beta}(\k)=\sum_{n=1}^M f_n(\k)\,
517:   \Omega_{nn,\alpha\beta}\ph(\k)
518: \;.
519: \label{eq:omtot}
520: \end{equation}
521: %
522: This matrix is antisymmetric in the Cartesian indices.
523: Note that when $\Omega_{\alpha\beta}$ appears without a
524: (W) or (H) superscript, as on the left-hand side of this equation,
525: it denotes the total Berry curvature defined in
526: Eq.~(\ref{eq:omega_tot}).
527: 
528: The matrix representation of an ordinary operator such as the
529: Hamiltonian or the velocity can be transformed from the Wannier to the 
530: Hamiltonian
531: gauge, or vice versa, just by operating on the left and right by
532: $U^\dagger(\k)$ and $U(\k)$, as in Eq.~(\ref{eq:Htrans});
533: such a matrix is called ``gauge-covariant.''
534: Unfortunately, the matrix objects in Eqs.~(\ref{eq:Awg}-\ref{eq:Owg})
535: are not gauge-covariant, because they involve $k$-derivatives acting on
536: the Bloch states. For example, a straightforward calculation shows that
537: %
538: \begin{equation}
539: A_{\alpha}\ph=
540:     U^\dagger A_{\alpha}\pw U
541:   + i U^\dagger \,\partial_\alpha U
542: \label{eq:Atrans}
543: \end{equation}
544: %
545: where each object is an $M\times M$ matrix and matrix products are
546: implied throughout.  For every matrix object ${\cal O}$, we define
547: %
548: \begin{equation}
549: \overline{\cal O}^{(\rm H)}=
550: U^\dagger {\cal O}^{(\rm W)}U
551: \label{eq:utrans}
552: \end{equation}
553: %
554: so that, by definition,
555: $\overline{\cal O}^{(\rm H)}={\cal O}^{(\rm H)}$
556: only for gauge-covariant objects.
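
The calculation behind Eq.~(\ref{eq:Atrans}) is short enough to spell
out. Differentiating Eq.~(\ref{eq:twist}) and using the orthonormality
$\langle u_l\pw|u_p\pw\rangle=\delta_{lp}$ of the Wannier-gauge states
(assumed here to follow from the orthonormality of the WFs), one finds
\begin{eqnarray*}
A_{nm,\alpha}\ph &=&
i\sum_{lp}U^*_{ln}\,\langle u_l\pw|\partial_\alpha u_p\pw\rangle\,U_{pm}
+i\sum_{l}U^*_{ln}\,\partial_\alpha U_{lm} \\
&=& (U^\dagger A_\alpha\pw U)_{nm}
+i\,(U^\dagger\partial_\alpha U)_{nm}\;.
\end{eqnarray*}
The second term is what spoils gauge covariance; it vanishes only where
$U$ is independent of $\k$.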
557: 
558: The derivative $\partial_\alpha U$ may be obtained from ordinary
559: perturbation theory.  We adopt a notation in which
560: $\vt\phi_m\ra$ is the $m$-th $M$-component column vector of
561: matrix $U$, so that 
562: $\la\phi_n\vt H\pw \vt\phi_m\ra={\cal E}_n\ph\,\delta_{nm}$;
563: the stylized bra-ket notation is used to emphasize that objects like
564: $H\pw$ and $\vt\phi_n\ra$ are $M\times M$ matrices and $M$-component
565: vectors, i.e., operators and state vectors in the ``tight-binding
566: space'' defined by the WFs, not in the original Hilbert space.
567: Perturbation theory with respect to the parameter $\k$ takes the form
568: %
569: \begin{equation}
570: \vt\partial_\alpha\phi_n\ra=\sum_{l\not= n}
571:   \frac{\la\phi_l\vt H_\alpha\pw\vt\phi_n\ra}{{\cal E}_n\ph-{\cal E}_l\ph}
572:   \,\vt\phi_l\ra
573: \label{eq:pert}
574: \end{equation}
575: %
576: where $H_\alpha\pw\equiv\partial_\alpha H\pw$.  In matrix notation
577: this can be written
578: %
579: \begin{equation}
580: \partial_\alpha U_{mn}=\sum_l\,U_{ml}\,\D_{ln,\alpha} =  (U\D_\alpha)_{mn}
581: \label{eq:dau}
582: \end{equation}
583: %
584: where
585: %
586: \begin{equation}
587: \D_{nm,\alpha}\equiv (U^{\dagger}
588: \partial_{\alpha}U)_{nm}=
589: \begin{cases}
590:   \displaystyle
591:   \frac{\overline H_{nm,\alpha}^{(\rm H)}}{{\cal E}^{(\rm H)}_{m}
592:   -{\cal E}_{n}^{(\rm H)}}& \text{if $n\not= m$}\\ \\
593:   0& \text{if $n=m$}
594: \end{cases}
595: \label{eq:ddef}
596: \end{equation}
597: %
598: and $\overline H_{nm,\alpha}^{(\rm H)}=( U^{\dagger}
599: H_{\alpha}^{(\rm W)}U)_{nm}$ according to Eq.~(\ref{eq:utrans}).
600: Note that while $\Omega_{\alpha\beta}$ and $A_\alpha$ are Hermitian
601: in the band indices, $\D_\alpha$ is instead antihermitian.
602: The gauge choice implicit in Eqs.~(\ref{eq:pert}) and (\ref{eq:ddef}) is
603: $ \la \phi_n\vt\partial_\alpha\phi_n\ra=
604: (U^{\dagger}\partial_{\alpha}U)_{nn}=0$ (this is the so-called ``parallel
605: transport'' gauge).
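
The antihermiticity of $\D_\alpha$ can be verified in two ways. From
unitarity, $\partial_\alpha(U^\dagger U)=0$ implies
\begin{equation*}
(U^\dagger\partial_\alpha U)^\dagger=-\,U^\dagger\partial_\alpha U\;.
\end{equation*}
Alternatively, from Eq.~(\ref{eq:ddef}) and the Hermiticity of
$\overline H_\alpha^{(\rm H)}$, for $n\not= m$,
\begin{equation*}
\big(\D_{mn,\alpha}\big)^*
=\frac{\overline H_{nm,\alpha}^{(\rm H)}}
      {{\cal E}_n\ph-{\cal E}_m\ph}
=-\D_{nm,\alpha}\;.
\end{equation*}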
606: 
607: Using Eq.~(\ref{eq:dau}), Eq.~(\ref{eq:Atrans}) becomes
608: %
609: \begin{equation}
610: A_\alpha\ph=\overline{A}_\alpha\ph+i\D_\alpha
611: \label{eq:Ata}
612: \end{equation}
613: %
614: and the derivative of Eq.~(\ref{eq:twist}) becomes
615: %
616: \begin{equation}
617: |\partial_{\alpha}u_n^{(\rm H)}\rangle=
618: \sum_{m}|\partial_{\alpha} u_m^{(\rm W)}\rangle U_{mn}
619: +\sum_{m}|u_m^{(\rm H)}\rangle
620: \D_{mn,\alpha} \;.
621: \label{eq:udtrans}
622: \end{equation}
623: %
624: Plugging the latter into Eq.~(\ref{eq:Owg}), we finally obtain, after a
625: few manipulations, the matrix equations
626: %
627: \begin{eqnarray}
628: \Omega_{\alpha\beta}\ph &=&
629: \overline\Omega_{\alpha\beta}\ph - [\D_\alpha,\overline A_\beta\ph]
630: \nonumber\\
631: &&\quad + [\D_\beta,\overline A_\alpha\ph] -i[\D_\alpha,\D_\beta]\;.
632: \label{eq:om-a}
633: \end{eqnarray}
634: %
635: The band-diagonal elements $\Omega_{nn,\alpha\beta}\ph(\k)$ then
636: need to be inserted into Eq.~(\ref{eq:omtot}).
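
The manipulations leading to Eq.~(\ref{eq:om-a}) can be sketched as
follows. Using Eq.~(\ref{eq:udtrans}), the orthonormality of the states,
the relation
$U^\dagger\langle u\pw|\partial_\beta u\pw\rangle U=-i\overline A_\beta\ph$,
and the antihermiticity of $\D_\alpha$, one finds in matrix notation
\begin{eqnarray*}
i\langle\partial_\alpha u\ph|\partial_\beta u\ph\rangle &=&
i\,U^\dagger\langle\partial_\alpha u\pw|\partial_\beta u\pw\rangle U
-\overline A_\alpha\ph\,\D_\beta \\
&&\quad -\,\D_\alpha\,\overline A_\beta\ph - i\,\D_\alpha\D_\beta\;.
\end{eqnarray*}
Subtracting the same expression with $\alpha\leftrightarrow\beta$, as
required by Eq.~(\ref{eq:Owg}), the first terms combine into
$\overline\Omega_{\alpha\beta}\ph$ and the remaining ones into the three
commutators of Eq.~(\ref{eq:om-a}).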
637: 
638: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
639: \subsection{Discussion}
640: \label{sec:disc}
641: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
642: 
643: We expect, based on Eq.~(\ref{eq:kubo}), that the largest
644: contributions to the AHC will come from regions of $k$-space where there
645: are small energy splittings between bands
646: (for example, near spin-orbit-split avoided crossings).\cite{fang03}
647: In the present formulation, this will
648: give rise to small energy denominators in Eq.~(\ref{eq:ddef}),
649: leading to very large $\D_\alpha$ values in those regions.
650: These large and spiky contributions will then propagate into
651: $A_\alpha\ph$ and $\Omega_{\alpha\beta}\ph$,
652: whereas $A_\alpha\pw$ and $\Omega_{\alpha\beta}\pw$,
653: and also $\overline A_\alpha\ph$ and $\overline\Omega_{\alpha\beta}\ph$,
654: will remain with their typically smaller values.
655: Thus, these spiky contributions will be present in the second and third
656: terms, and especially in the fourth term, of Eq.~(\ref{eq:om-a}).
657: The contributions of these various terms are illustrated for the
658: case of bcc Fe in Sec.~\ref{sec:berrycurv}, and we show there that
659: the last term typically makes by far the dominant contribution,
660: followed by the second and third terms, and then by the first
661: term.
662: 
663: The dominant fourth term can be recast in the form of a Kubo
664: formula as
665: %
666: \begin{equation}
667: -2{\rm Im}\sum_{m\ne n}
668: \frac{ \la \phi_{n\k} \vt H_{\alpha}^{(\rm W)} \vt \phi_{m\k} \ra
669: \la \phi_{m\k} \vt H_{\beta}^{(\rm W)} \vt \phi_{n\k} \ra }
670: {\big({\cal E}_{m}\ph-{\cal E}_{n}\ph\big)^2}\;.
671: \label{eq:kubom}
672: \end{equation}
673: %
674: The following differences between this equation and the 
675: true Kubo formula, Eq.~(\ref{eq:kubo}), should however be kept in mind.
676: First, the summation in 
677: Eq.~(\ref{eq:kubom}) is
678: restricted to the $M$-band projected band structure. Second, above 
679: $E_{\rm win}$ the projected bandstructure deviates from the original
680: {\it ab-initio} one. Third, even below $E_{\rm win}$, where they do
681: match exactly, the ``tight-binding velocity matrix 
682: elements'' appearing in Eq.~(\ref{eq:kubom}) differ from the 
683: {\it ab-initio} ones, given by Eq.~(\ref{eq:vel}). (The relation between them
684: is particularly simple within the inner window, and follows from 
685: combining the identity
686: $A_{nm,\alpha}=i\langle\psi_n|\hat v_\alpha|
687: \psi_m\rangle/(\omega_m-\omega_n)$, valid for $m\not= n$,
688: with Eqs.~(\ref{eq:ddef}-\ref{eq:Ata}).)
689: All these differences are however exactly compensated by
690: the previous three terms in Eq.~(\ref{eq:om-a}).
691: We emphasize that all terms in that equation are
692: defined strictly within the projected space spanned by the Wannier
693: functions.
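
The step from the commutator in Eq.~(\ref{eq:om-a}) to
Eq.~(\ref{eq:kubom}) can be made explicit. From Eq.~(\ref{eq:ddef}),
\begin{equation*}
\D_{nm,\alpha}\D_{mn,\beta}
=-\frac{\overline H_{nm,\alpha}^{(\rm H)}\,\overline H_{mn,\beta}^{(\rm H)}}
       {\big({\cal E}_m\ph-{\cal E}_n\ph\big)^2}
\equiv -X_{nm}\;,
\end{equation*}
where $X_{nm}$ is a shorthand introduced only for this check. Since
$\overline H_\alpha^{(\rm H)}$ is Hermitian,
$\D_{nm,\beta}\D_{mn,\alpha}=-X_{nm}^*$, so that
\begin{equation*}
-i[\D_\alpha,\D_\beta]_{nn}
=i\sum_{m\ne n}\big(X_{nm}-X_{nm}^*\big)
=-2\,{\rm Im}\sum_{m\ne n}X_{nm}\;,
\end{equation*}
which is Eq.~(\ref{eq:kubom}) once
$\overline H_{nm,\alpha}^{(\rm H)}=
\la\phi_n\vt H_\alpha\pw\vt\phi_m\ra$ is substituted.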
694: 
695: We note in passing that it is possible to rewrite Eq.~(\ref{eq:om-a})
696: in such a way that the large spiky contributions are isolated into
697: a single term.  This alternative formulation, which turns out to be
698: related to a gauge-covariant 
699: curvature tensor, will be described in Appendix A.
700: 
701: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
702: \subsection{Sum over occupied bands}
703: \label{sec:bandsum}
704: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
705: 
706: In the above, we have proposed to compute $\Omega\ph_{nn,\alpha\beta}$
707: from Eq.~(\ref{eq:om-a}) and insert it into the band sum, Eq.~(\ref{eq:omtot}),
708: in order to compute the AHC.  However, this approach has a shortcoming
709: in that small splittings (avoided crossings) between a pair of
{\it occupied} bands $n$ and $m$ lead to large values of $\D_{nm,\alpha}$,
711: and thus to large but canceling contributions to the AHC coming from
712: $\Omega\ph_{nn,\alpha\beta}$ and  $\Omega\ph_{mm,\alpha\beta}$.  Here,
713: we rewrite the total Berry curvature (\ref{eq:omtot})
714: in  such a way that the cancellation is explicit.
715: 
716: Inserting Eq.~(\ref{eq:om-a}) into Eq.~(\ref{eq:omtot}) and interchanging
717: dummy labels $n\leftrightarrow m$ in certain terms, we obtain
718: %
719: \begin{eqnarray}
720: \Omega_{\alpha\beta}(\k)&=&\sum_n f_n\,\overline{\Omega}_{nn,\alpha\beta}\ph
721: \nonumber\\
722: &+& \sum_{nm} (f_m-f_n)\left(\D_{nm,\alpha}\overline{A}_{mn,\beta }\ph\right.
723: \nonumber\\
724: &&\;\;\left.               -\D_{nm,\beta }\overline{A}_{mn,\alpha}\ph
725:                       +i\D_{nm,\alpha}\D_{mn,\beta}\right) .
726: \label{eq:bsum}
727: \end{eqnarray}
728: %
The factors of $(f_m-f_n)$ ensure that terms arising from pairs
730: of fully occupied states give no contribution.  Thus,
731: the result of this reformulation is that individual terms in
732: Eq.~(\ref{eq:bsum}) have
733: large spiky contributions only when avoided crossings or
734: near-degeneracies occur across the Fermi energy.  This approach is
735: therefore preferable from the point of view of numerical stability,
736: and it is the formula that we have implemented in the current work.
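
The relabeling that leads to Eq.~(\ref{eq:bsum}) can be made explicit for
the last term.  As a sketch, we take the corresponding term of
Eq.~(\ref{eq:om-a}) in the commutator form
$-i[\D_\alpha,\D_\beta]_{nn}$ (this form is assumed here for
illustration).  Then
%
\begin{eqnarray*}
\lefteqn{-i\sum_n f_n\,[\D_\alpha,\D_\beta]_{nn}}\\
&=&-i\sum_{nm} f_n\left(\D_{nm,\alpha}\D_{mn,\beta}
                       -\D_{nm,\beta}\D_{mn,\alpha}\right)\\
&=&-i\sum_{nm} (f_n-f_m)\,\D_{nm,\alpha}\D_{mn,\beta}\\
&=&\;i\sum_{nm} (f_m-f_n)\,\D_{nm,\alpha}\D_{mn,\beta}\;,
\end{eqnarray*}
%
where the second equality follows from interchanging
$n\leftrightarrow m$ in the second product.  The
$D$--$\overline{A}$ terms can be treated in the same way.  For
$f_n=f_m$ the prefactor vanishes identically, which is the
cancellation between pairs of equally occupied bands referred to above.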
737: 
738: As expected from the discussion in Sec.~\ref{sec:disc}
739: and shown later in Sec.~\ref{sec:intanom}, the dominant term
740: in Eq.~(\ref{eq:bsum}) is the last one,
741: %
742: \begin{equation}
743: \Omega^{DD}_{\alpha\beta}=
744:     i \sum_{nm} (f_m-f_n) \D_{nm,\alpha}\D_{mn,\beta}
745: \label{eq:OmegaDD}
746: \end{equation}
747: %
748: or, in a more explicitly Kubo-like form,
749: %
750: \begin{equation}
751:   \Omega^{DD}_{\alpha\beta} = i \sum_{nm} (f_m-f_n) \; \frac
752:      { \overline{H}\ph_{nm,\alpha} \overline{H}\ph_{mn,\beta} }
753:      {\big({\cal E}_{m}\ph-{\cal E}_{n}\ph\big)^2}
754: \;.
755: \label{eq:kubototm}
756: \end{equation}
757: %
758: In the zero-temperature limit, the latter can easily be cast into a
form like Eq.~(\ref{eq:kubom}), but with a double sum running
760: over occupied bands $n$ and unoccupied bands $m$, very
761: reminiscent of the original Kubo formula in 
762: Eq.~(\ref{eq:kubotot}).
763: We remark that $(m_e/\hbar)\overline{H}\ph_{nm,\alpha}$ coincides with the
764: ``effective tight-binding momentum operator'' defined in 
765: Ref.~\onlinecite{graf95}.
766: 
767: It is worth pointing out that Eq.~(\ref{eq:kubom}) can be
cast explicitly as a Berry curvature, the tight-binding-space analog of
769: Eq.~(\ref{eq:bcurv}),
770: %
771: \begin{equation}
772: \Omega_{n,\alpha\beta}^{DD}=
773: -2\,{\rm Im}\,\la \partial_\alpha \phi_{n\k}
774: \vt\partial_\beta \phi_{n\k}\ra\;.
775: \label{eq:bcurv_tb}
776: \end{equation}
777: %
778: In this way Eq.~(\ref{eq:kubototm}) can be written in a form that closely
779: resembles the total Berry curvature, Eq.~(\ref{eq:omsum}):
780: %
781: \begin{equation}
782: \Omega^{DD}_{\alpha\beta}=\sum_{n=1}^{M} f_n \,
783:   \Omega_{n,\alpha\beta}^{DD}\;.
784: \label{eq:omsum_tb}
785: \end{equation}
786: 
787: 
788: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
789: \subsection{Evaluation of the Wannier-gauge matrices}
790: \label{sec:wangauge}
791: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
792: 
793: Eq.~(\ref{eq:bsum}) is our primary result.  To review,
794: recall that this is a condensed notation expressing the $M\times M$
matrix $\Omega_{nm,\alpha\beta}\ph(\k)$ in terms of the matrices
$\overline{\Omega}_{nm,\alpha\beta}\ph(\k)$, etc.
797: The basic ingredients needed are the four matrices
798: $H\pw$, $H_\alpha\pw$, $A_\alpha\pw$, and $\Omega_{\alpha\beta}\pw$
799: at a given $\k$. Diagonalization of the first of them yields the
800: energy eigenvalues needed to find the occupation factors $f_n$. It also 
801: provides the gauge transformation $U$ which is then used
802: to construct $\overline{H}_\alpha\ph$,
803: $\overline{A}_\alpha\ph$, and $\overline{\Omega}_{\alpha\beta}\ph$
804: from the other three objects 
805: via Eq.~(\ref{eq:utrans}).  Finally, $\overline{H}_\alpha\ph$ is
806: inserted into Eq.~(\ref{eq:ddef}) to obtain $\D_\alpha$, and all
807: terms in Eq.~(\ref{eq:bsum}) are evaluated.
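
The chain of operations just described can be summarized in formulas.
The following is a sketch that assumes Eq.~(\ref{eq:utrans}) has the
form $\overline{O}\ph=U^\dagger O\pw U$ and that
$\D_\alpha\equiv U^\dagger\partial_\alpha U$; both are assumptions
about the definitions referred to above:
%
\begin{eqnarray*}
U^\dagger H\pw U&=&{\rm diag}\big({\cal E}_n\ph\big)\;,\\
\overline{H}_\alpha\ph&=&U^\dagger H_\alpha\pw U\;,\\
\D_{nm,\alpha}&=&\frac{\overline{H}\ph_{nm,\alpha}}
   {{\cal E}_m\ph-{\cal E}_n\ph}\qquad (n\not= m)\;.
\end{eqnarray*}
%
The last line follows from differentiating the first one with respect to
$k_\alpha$ and using $\partial_\alpha U^\dagger=-\D_\alpha U^\dagger$:
the off-diagonal elements of the result give
$0=\overline{H}\ph_{nm,\alpha}
+({\cal E}_n\ph-{\cal E}_m\ph)\D_{nm,\alpha}$.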
808: 
809: In this section we explain how to obtain
810: the matrices $H^{(\rm W)}(\k)$, $H_\alpha^{(\rm W)}(\k)$,
811: $A_\alpha^{(\rm W)}(\k)$ and $\Omega_{\alpha\beta}^{(\rm W)}(\k)$
812: at an arbitrary point $\k$ for use in the subsequent calculations
813: described above.
814: 
815: %%%%
816: \subsubsection{Fourier transform expressions}
817: \label{sec:fourier}
818: %%%%
819: 
820: The four needed quantities can be expressed as follows:
821: %
822: \begin{equation}
823: H_{nm}\pw(\k)=\sum_\R e^{i\k\cdot\R}\;\langle{\bf 0}n|\hat{H}|\R m\rangle\,,
824: \label{eq:uu}
825: \end{equation}
826: %
827: \begin{equation}
828: H_{nm,\alpha}\pw(\k)=\sum_\R e^{i\k\cdot\R}\;
829:    iR_\alpha\,\langle{\bf 0}n|\hat{H}|\R m\rangle\,,
830: \label{eq:vv}
831: \end{equation}
832: %
833: \begin{equation}
834: A_{nm,\alpha}\pw(\k)=\sum_\R e^{i\k\cdot\R}\;
835:    \langle{\bf 0}n|\hat{r}_\alpha|\R m\rangle\,,
836: \label{eq:ww}
837: \end{equation}
838: %
839: \begin{eqnarray}
840: \Omega_{nm,\alpha\beta}\pw(\k)&=&\sum_\R e^{i\k\cdot\R}\;\Big(
841:    iR_\alpha\,\langle{\bf 0}n|\hat{r}_\beta |\R m\rangle \nonumber\\
842:   &&\qquad -iR_\beta \,\langle{\bf 0}n|\hat{r}_\alpha|\R m\rangle \Big)
843: \;.
844: \label{eq:xx}
845: \end{eqnarray}
846: %
847: (The notation $|{\bf 0}n\rangle$ refers to the $n$'th WF
848: in the home unit cell $\R={\bf 0}$.)  Eq.~(\ref{eq:uu}) follows by
849: combining Eqs.~(\ref{eq:blochW}) and (\ref{eq:hamW}), while
850: Eq.~(\ref{eq:ww}) follows by combining Eqs.~(\ref{eq:blochW}) and
851: (\ref{eq:Awg}).  Eqs.~(\ref{eq:vv}) and (\ref{eq:xx}) are then
852: obtained from (\ref{eq:uu}) and (\ref{eq:ww}) using
853: $H_{nm,\alpha}=\partial_\alpha H_{nm}$ and
854: $\Omega_{nm,\alpha\beta}=\partial_\alpha A_{nm,\beta }
855: -\partial_\beta A_{nm,\alpha}$, respectively.
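
To spell out the differentiation step: the real-space matrix elements
do not depend on $\k$, so the derivative acts only on the phase factor,
%
\[
\partial_\alpha\, e^{i\k\cdot\R}=iR_\alpha\, e^{i\k\cdot\R}\;.
\]
%
Applying this to Eq.~(\ref{eq:uu}) gives Eq.~(\ref{eq:vv}) directly, and
applying it to each of the two terms
$\partial_\alpha A_{nm,\beta}$ and $\partial_\beta A_{nm,\alpha}$
built from Eq.~(\ref{eq:ww}) gives Eq.~(\ref{eq:xx}).  In particular,
$\Omega_{nm,\alpha\beta}\pw$ is manifestly antisymmetric in
$\alpha\beta$.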
856: 
857: It is remarkable that the only real-space matrix elements that are
858: required between WFs are those of the four operators
859: $\hat{H}$ and $\hat{r}_\alpha$ ($\alpha=x$, $y$, and $z$).
860: Because the WFs are strongly localized, these matrix elements
861: are expected to decay rapidly as a function of lattice vector $\R$,
862: so that only a modest number of these real-space matrix elements
863: need to be computed and stored once and for all.  Collectively, they define our
864: ``exact tight-binding model'' and suffice to allow subsequent calculation
865: of all needed quantities.  Furthermore, the short range
of these matrix elements in real space ensures that the Wannier-gauge
867: quantities on the left-hand sides of Eqs.~(\ref{eq:uu}-\ref{eq:xx}) will
868: be smooth functions of $\k$, thus justifying the earlier discussion
869: in which it was argued that these objects should have no rapid variation or
870: enhancement in $k$-space regions where avoided crossings occur.
871: (Recall that such large, rapidly-varying contributions only appear
872: in the $\D$ matrices and in quantities that depend upon them.)
873: It should however be kept in mind that Eq.~(\ref{eq:bsum}) is not written
874: directly in terms of the smooth quantities (\ref{eq:uu}-\ref{eq:xx}), but
875: rather in terms of those quantities transformed
876: according to Eq.~(\ref{eq:utrans}). The resulting objects
877: are not smooth, since the matrices $U$ change rapidly with $\k$. However,
878: even while not smooth, they remain small.
879: 
880: 
881: %%%%%
882: \subsubsection{Evaluation of real-space matrix elements}
883: \label{sec:evalreal}
884: %%%%%
885: 
886: We conclude this section by discussing the calculation of
887: the fundamental matrix elements $\langle{\bf 0}n|\hat{H}|\R m\rangle$
888: and $\langle{\bf 0}n|\hat{r}_\alpha|\R m\rangle$.  There are several
889: ways in which these could be computed, and the choice could well vary
890: from one implementation to another.  One possibility would be to
891: construct the WFs in real space, say on a real-space grid, and then
892: to compute the Hamiltonian and position-operator matrix elements
893: directly on that grid.  In the context of a code that
894: uses a real-space basis (e.g., localized orbitals or grids), this
895: might be the best choice.  However, in the context of plane-wave
896: methods it is usually more convenient to work in reciprocal space
897: if possible.  This is in the spirit of the
898: Wannier-function construction scheme,\cite{marzari97,souza01}
899: which is formulated as a post-processing step after a conventional
900: {\it ab-initio} calculation carried out on a uniform $k$-point grid.
901: (In the following we will use the symbol $\q$ to denote the points
902: of this {\it ab-initio} mesh, to distinguish them from arbitrary
903: or interpolation-grid points denoted by $\k$.)
904: 
The end result of the Wannier-construction step is a set of $M$ Bloch-like
906: functions $|u_{n\q}\pw\rangle$ at each $\q$. The WFs are obtained from
907: them via a discrete Fourier transform:
908: %
909: \begin{equation}
910: \label{eq:wf}
911: |\R n\rangle=\frac{1}{N_q^3}
912: \sum_\q\, e^{-i\q\cdot(\R-\hat\r)}|u_{n\q}\pw\rangle \;.
913: \end{equation}
914: %
915: This expression follows from inverting Eq.~(\ref{eq:blochW}). If the
916: {\it ab initio} mesh contains
917: $N_q\times N_q\times N_q$ points, the resulting WFs are
918: really periodic functions over a supercell of dimensions
919: $L\times L\times L$, where $L=N_q a$ and $a$ is the lattice constant of 
920: the unit cell. The idea then is to choose $L$ large enough that the rapid
921: decay of the localized WFs occurs on a scale much smaller than $L$. 
922: This ensures that the matrix
923: elements
924: $\langle{\bf 0}n|\hat{H}|\R m\rangle$ and
925: $\langle{\bf 0}n|\hat{r}_\alpha|\R m\rangle$
926: between a pair of WFs separated by more than $L/2$ are negligible,
927: so that further refinement of the {\it ab-initio} mesh will have
928: a negligible impact on the accuracy of Wannier-interpolated
929: quantities. (In particular, the interpolated band structure,
930: Fig.~\ref{fig:band}, is able to reproduce tiny features of
931: the full bandstructure, such as spin-orbit-induced avoided
932: crossings, even if they occur on a length scale much smaller
933: than the {\it ab-initio} mesh spacing.)  While the choice of
934: reciprocal-space cell spanned by the vectors $\q$ is immaterial,
935: because of the periodicity of reciprocal space, this is not so
936: for the vectors $\R$. In practice we choose the $N_q\times
937: N_q\times N_q$ vectors $\R$ to be evenly distributed
938: on the Wigner-Seitz supercell of volume $N_q^3 a^3$ centered
939: around $\R=\bf 0$.\cite{souza01} This is the most isotropic choice
940: possible, ensuring that the strong decay of the matrix elements for 
941: $|\R|\sim L/2$ is achieved irrespective of direction.
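
As a rough numerical illustration, we take the schematic cubic picture
above at face value (a simplification, since the primitive cell of a bcc
crystal is not cubic) and insert the values used in
Sec.~\ref{sec:cd}, namely $N_q=8$ and $a=5.42$\,Bohr:
%
\[
L=N_q a=43.36\;{\rm Bohr}\;,\qquad
L/2=21.68\;{\rm Bohr}\;,
\]
%
with $N_q^3=512$ lattice vectors $\R$ in the supercell.  On this
estimate, the real-space matrix elements need to have decayed to a
negligible size over a distance of roughly 20\,Bohr.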
942: 
943: The matrix elements of the Hamiltonian are obtained from Eq.~(\ref{eq:wf})
944: as
945: %
946: \begin{eqnarray}
947: \langle{\bf 0}n|\hat{H}|\R m\rangle= \frac{1}{N_q^3}\sum_\q\,
948:               e^{-i\q \cdot \R}H\pw_{nm}(\q)\;,
949: \label{eq:hw}
950: \end{eqnarray}
951: %
952: which is the reciprocal of Eq.~(\ref{eq:uu}), with the sum running
953: over the coarse {\it ab-initio} mesh.  The position
954: matrix is obtained similarly by inverting  Eq.~(\ref{eq:ww}):
955: %
956: \begin{equation}
957: \langle {\bf 0} n|\hat r_{\alpha}|\R m\rangle =\frac{1}{N_q^3}\sum_\q\,
958: e^{-i\q \cdot \R} A_{nm,\alpha}\pw(\q) \;.
959: \end{equation}
960: %
961: The matrix $A_{nm,\alpha}\pw(\q)$ is then evaluated by
962: approximating the $k$-derivatives in Eq.~(\ref{eq:Awg}) by finite-differences 
963: on the {\it ab-initio} mesh using the expression\cite{marzari97}
964: %
965: \begin{equation}
966:         A_{nm,\alpha}\pw(\q)\simeq
967:         i\sum_\b\,
968:         w_b b_{\alpha}\Big ( \langle u_{n\q}\pw|u_{m,\q+\b}\pw
969:         \rangle
970:         - \delta_{nm}\Big )\;,
971: \label{eq:A_dis}
972: \end{equation}
973: %
974: where $\b$ are the vectors connecting $\q$ to its nearest
975: neighbors on the {\it ab-initio} mesh. This approximation is valid because
976: the Bloch states vary smoothly with $\k$ in the Wannier gauge.
977: We note that the overlap matrices appearing on the right-hand side are
978: available ``for free'' as they have already been computed and stored
979: during the WF construction procedure.
980: This is also the case for the matrices 
981: $H\pw(\q)$ needed in Eq.~(\ref{eq:hw}).
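
As a consistency check on the pair of transforms
(\ref{eq:uu}) and (\ref{eq:hw}), one can verify that the interpolation
returns the input data when evaluated at a point $\q'$ of the
{\it ab-initio} mesh.  Substituting Eq.~(\ref{eq:hw}) into
Eq.~(\ref{eq:uu}), with the sum over $\R$ restricted to the
$N_q^3$ lattice vectors of the supercell,
%
\begin{eqnarray*}
\lefteqn{\sum_\R e^{i\q'\cdot\R}\,
   \langle{\bf 0}n|\hat{H}|\R m\rangle}\\
&=&\frac{1}{N_q^3}\sum_\q H\pw_{nm}(\q)
   \sum_\R e^{i(\q'-\q)\cdot\R}
\;=\;H\pw_{nm}(\q')\;,
\end{eqnarray*}
%
where we used
$\sum_\R e^{i(\q'-\q)\cdot\R}=N_q^3\,\delta_{\q\q'}$ for mesh
points $\q$ and $\q'$.  The same argument applies to
$A_{nm,\alpha}\pw$.  The scheme is therefore exact on the coarse mesh
and acts as a smooth interpolation in between.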
982: 
983: It should be noted that the $k$-space finite-difference procedure outlined
984: above entails an error of order ${\cal O}(\Delta q^2)$ in the
985: values of the position operator matrix elements, where $\Delta q$
986: is the {\it ab-initio} mesh spacing.  The importance of such an error
987: is easily assessed by trying denser $q$-point meshes; in our case, we find
988: that it is not a numerically significant source of error for the
989: $8\times8\times8$ mesh that we employ in our calculations.
990: (In large measure this is simply because less than
991: 0.5\% of the total AHC comes from terms that depend on these
992: position-operator matrix elements, as will be discussed in
993: Section~\ref{sec:results}.  Indeed, we find that the
994: ${\cal O}(\Delta q^2)$ convergence of this small contribution hardly
995: shows in the convergence of the total AHC, which
996: empirically appears to be approximately exponential in the {\it
997: ab-initio} mesh density.)
998: %
999: However, if the ${\cal O}(\Delta q^2)$ convergence is a source of
1000: concern, one could adopt the direct
1001: real-space mesh integration method mentioned at the beginning of this
1002: subsection, which should be free of such errors.
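
The origin of Eq.~(\ref{eq:A_dis}) and of its
${\cal O}(\Delta q^2)$ error can be sketched as follows, assuming
that Eq.~(\ref{eq:Awg}) has the standard form
$A_{nm,\alpha}=i\langle u_n|\partial_\alpha u_m\rangle$ and that the
weights obey the completeness relation
$\sum_\b w_b\, b_\alpha b_\beta=\delta_{\alpha\beta}$ of
Ref.~\onlinecite{marzari97}.  To first order in $\b$,
%
\[
\langle u_{n\q}\pw|u_{m,\q+\b}\pw\rangle\simeq
\delta_{nm}-i\sum_\beta b_\beta\, A_{nm,\beta}\pw(\q)\;,
\]
%
so that
%
\[
i\sum_\b w_b\, b_\alpha\Big(-i\sum_\beta b_\beta\,
A_{nm,\beta}\pw\Big)
=\sum_\beta\delta_{\alpha\beta}\,A_{nm,\beta}\pw
=A_{nm,\alpha}\pw\;.
\]
%
The next term in the expansion of the overlap is quadratic in $\b$ and
hence contributes a term cubic in $\b$ to the sum; it cancels between
$\b$ and $-\b$ whenever the shell of neighbors contains both.  The
leading correction is therefore of relative order $b^2$.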
1003: 
1004: %==================
1005: \section{Computational details}
1006: \label{sec:cd}
1007: %==================
1008: 
1009: In this section we present some of the detailed steps of the
1010: calculations as they apply to our test system of bcc Fe.  First, we
1011: describe the first-principles bandstructure calculations that are
1012: carried out initially.
1013: Second, we discuss the procedure for
1014: constructing maximally localized Wannier functions for the bands of
interest following the method of Souza, Marzari, and Vanderbilt.\cite{souza01}
Third, we discuss the variable treatment of
1017: the spin-orbit interaction within these first-principles
1018: calculations, which is useful for testing the dependence of the AHC on the
1019: spin-orbit coupling.
1020: 
1021: %-------------------
1022: \subsection{Band structure calculation}
1023: %-------------------
1024: 
1025: Fully relativistic band structure calculations for bcc Fe in
1026: its ferromagnetic ground state at the experimental lattice constant
1027: $a=5.42$\,Bohr are carried out using the {\tt PWSCF}
1028: code.\cite{pwscf} A kinetic-energy cutoff of 60 Hartree is used
1029: for the planewave expansion of the valence wavefunctions (400
1030: Hartree for the charge densities).  Exchange and correlation
1031: effects are treated with the PBE generalized-gradient
1032: approximation.\cite{perdew96}
1033: 
1034: The core-valence interaction is described here by means of
1035: norm-conserving pseudopotentials which include spin-orbit
1036: effects\cite{soc2,dalcorso05} 
1037: in separable Kleinman-Bylander form.
1038: (Our overall Wannier interpolation approach
1039: is quite independent of this specific choice and can
1040: easily be generalized to other kinds of pseudopotentials or to
1041: all-electron methods.)
1042: The pseudopotential was constructed using a reference valence
1043: configuration of $3d^74s^{0.75}4p^{0.25}$. We treat the overlap
1044: of the valence states with the semicore $3p$ states using the
1045: non-linear core correction approach.\cite{nlcc} The pseudopotential
1046: core radii for the $3d$, $4s$ and $4p$ states are $1.3$, $2.0$
1047: and $2.2$\,Bohr, respectively.  We find the small cut-off radius
1048: for the $3d$ channel to be necessary in order to reproduce the
1049: all-electron bandstructure accurately.
1050: 
1051: We obtain the self-consistent ground state using a 16$\times$16$\times$16
1052: Monkhorst-Pack\cite{mp} mesh of $k$-points and a fictitious Fermi
smearing\cite{coldsmear} of 0.02\,Ry for the Brillouin-zone integration.
1054: The magnetization is along the [001] direction, so that
1055: the only non-zero component of the total Berry curvature is the one
1056: along $z$.  The spin magnetic moment is found to be
1057: 2.22\,$\mu_{\rm B}$,
1058: the same as that from an 
1059: all-electron calculation\cite{yao04} and close to the experimental
1060: value of 2.12\,$\mu_{\rm B}$. 
1061: 
1062: In order to calculate the Wannier functions, we freeze the self-consistent
1063: potential and perform a non-self-consistent calculation on a uniform
1064: $n\times n\times n$ grid of $k$-points.  We tested several grid densities
1065: ranging from $n$=4 to $n$=10 and ultimately chose $n$=8 (see end of
1066: next subsection).  Since we want to construct 18 WFs ($s$, $p$, and
1067: $d$-like for spin up and down), we need to include a sufficient number of
1068: extra bands to cover the orbital character of these intended
1069: WFs everywhere in the Brillouin zone.  With this in mind, we
1070: calculate the first 28 bands at each $k$-point, and then exclude
1071: any bands above 58\,eV (the ``outer window'' of
1072: Ref.~\onlinecite{souza01}).  The 18 WFs are then disentangled from
the remaining bands using the procedure described in the next
subsection.
1075: 
1076: %--------------
1077: \subsection{Maximally-localized spinor Wannier functions for bcc Fe}
1078: \label{sec:maxloc}
1079: %--------------
1080: 
1081: The energy bands of interest (extending up to, and just above,
1082: the Fermi energy) have mainly mixed $s$ and $d$ character and
1083: are entangled with the bands at higher energies.  In order to
1084: construct maximally-localized WFs to describe these bands,
1085: we use a two-step post-processing procedure\cite{souza01}
1086: as implemented in the {\tt WANNIER90} code.\cite{wannier} 
1087: In the first (``disentangling'') step,
1088: an 18-band subspace (the ``projected space'') is identified that
1089: minimizes the invariant part of the spread functional, subject
1090: to the constraint of including the states within an  
1091: inner energy window.\cite{souza01} We chose this window to span an energy
1092: range of 30~eV from the bottom of the valence bands
1093: (up to $E_{\rm win}$ in Fig.~\ref{fig:band}).
1094: In the second step,
1095: a set of maximally-localized WFs spanning this
1096: subspace is chosen by minimizing the gauge-dependent part of the 
1097: spread functional.\cite{marzari97}
1098: 
1099: Although the original prescription for obtaining maximally-localized Wannier
1100: functions was formulated for the spinless case, it is trivial to
1101: adapt it to treat spinor wavefunctions, in which case the resulting WFs also
1102: have spinor character: each element of the overlap matrix,
1103: which is the key input to the WF-generation code, is
1104: simply calculated as the sum of two spin components,
1105: %
1106: \begin{equation}
1107:   S_{\k,\b}^{nm} = \sum_{\sigma} \,
1108:   \langle u_{n\k}^{\sigma} | u_{m,\k+\b}^{\sigma} \rangle \;,
1109: \end{equation}
1110: %
1111: where $| u_{n,\k+\b}^{\sigma} \rangle$ is one of the two components
1112: of the spinor wavefunction. 
1113: 
1114: In order to facilitate later analysis (e.g., of the orbital and
1115: spin character of various bands), we have modified the second
1116: step as follows.  At each $k$-point on the $8\times 8\times 8$
1117: mesh, we form the 18$\times$18 matrix representation of the spin
1118: operator $\hat S_z=(\hbar/2)\hat\sigma_z$ 
1119: in the space of band states and diagonalize it.
1120: The $18$-dimensional space at this $k$-point is then divided in
1121: two $9$-dimensional subspaces, a mostly spin-up subspace spanned
1122: by the eigenstates having $S_z$ eigenvalues close to $+1$, and
1123: a mostly spin-down subspace associated with eigenvalues close
1124: to $-1$ (we will use units of
1125: $\hbar/2$ whenever we discuss $S_z$ in the remainder of the manuscript).  
1126: The spread functional is then minimized within each of
1127: these subspaces separately.  We thus emerge with 18 well-localized
1128: WFs divided into two groups: nine that are almost entirely spin-up
1129: and nine that are almost entirely spin-down (in practice we find
1130: $|\langle \hat S_z \rangle|> 0.999$ in all cases).
1131: While this procedure results in a total spread that is slightly
1132: greater than would be obtained otherwise, we find that the difference
1133: is very small in practice,
1134: and the imposition of these rules makes for a much more transparent
1135: analysis of subsequent results.  For example, it makes it much easier
1136: to track the changes in the WFs before and after the spin-orbit
1137: coupling is turned on, or to identify the spin character
1138: of various pieces of the Fermi surface.
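
For concreteness, the matrix being diagonalized can be written in terms
of the two spinor components of the states spanning the projected
space (a sketch in the notation of the overlap matrix above, in units of
$\hbar/2$):
%
\[
(S_z)_{nm}=
\langle u_{n\k}^{\uparrow}|u_{m\k}^{\uparrow}\rangle
-\langle u_{n\k}^{\downarrow}|u_{m\k}^{\downarrow}\rangle\;.
\]
%
Its eigenvalues lie between $-1$ and $+1$, and the limiting values are
reached only for states of pure spin character.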
1139: 
1140: \begin{figure}
1141: \begin{center}
1142: \epsfig{file=fig2.eps,width=2.6in}
1143: \end{center}
1144: \caption{(Color online). 
Isosurface contours of maximally-localized spin-up WFs in
1146: bcc Fe (red for positive value and blue for negative value), for
1147: the $8 \times 8 \times 8$ $k$-point sampling. (a) $sp^3d^2$-like WF
1148: centered on a Cartesian axis; (b) $d_{xy}$-like WF centered on the
1149: atom.}
1150: \label{fig:wfs}
1151: \end{figure}
1152: 
1153: To start the minimization procedure, we choose trial functions having
1154: pure spin character (up or down) and a spatial form of
1155: a Gaussian times a predetermined angular factor.
1156: In our first attempts, we chose angular factors appropriate for the
1157: three $t_{2g}$ states $d_{xy}$, $d_{xz}$, and 
1158: $d_{yz}$; the two $e_g$ states
1159: $d_{z^2}$ and $d_{x^2-y^2}$; the three $p$ states $p_x$, $p_y$, and
1160: $p_z$; and $s$.  The iterative procedure\cite{souza01}
1161: then projects these onto
1162: the band subspace and improves upon them.  We found that the spread
1163: minimization procedure converted the $t_{2g}$ trial functions into
1164: $t_{2g}$-like WFs, while it mixed the other six states to form six
1165: hybrid WFs of $sp^3d^2$-type.\cite{Pauling31}
1166: Having discovered this, we have modified our procedure accordingly:
1167: henceforth, we choose three $t_{2g}$-like trial functions and
1168: six $sp^3d^2$-like ones.  With this initialization, we find the
1169: convergence to be quite rapid, with only about 100 iterations
1170: needed to get a well-converged spread functional.
1171: 
1172: The WFs that result from this procedure are shown in Fig.~\ref{fig:wfs}.
1173: The up-spin WFs are plotted,
1174: but the WFs are very similar for both spins.
1175: An example of an $sp^3d^2$-hybrid WF is shown in
1176: Fig.~\ref{fig:wfs}(a); this one extends along the $-x$ axis,
1177: and the five others are similarly projected along the $+x$, $\pm y$,
1178: and $\pm z$ axes.  One of the $t_{2g}$-like WFs is shown in
1179: Fig.~\ref{fig:wfs}(b); this one has $xy$ symmetry, while the others
1180: have $xz$ and $yz$ symmetry.  The centers of the $sp^3d^2$-like WFs
1181: are slightly shifted from the atomic center along $\pm x$, $\pm y$,
1182: or $\pm z$, while the  $t_{2g}$-like WFs remain centered on the atom.
1183: 
1184: We studied the convergence of the WFs and interpolated bands as a function
1185: of the density $n\times n\times n$ of the Monkhorst-Pack $k$-mesh
1186: used for the initial {\it ab-initio} calculation.  We tested
1187: $n=4$, 6, 8, and 10, and found that $n=8$ provided the best tradeoff
1188: between interpolation accuracy and computational cost.  This is
1189: the mesh that was used in generating the results presented in
1190: Sec.~\ref{sec:results}.
1191: 
1192: %--------------
1193: \subsection{Variable spin-orbit coupling in the pseudopotential framework}
1194: \label{sec:spinorbit}
1195: %--------------
1196: 
1197: Since the AHE present in ferromagnetic iron is a spin-orbit-induced effect,
1198: it is obviously important
1199: to understand the role of this coupling as thoroughly as possible.
1200: For this purpose, it is very convenient to be able to treat the
1201: strength of the coupling as an adjustable parameter.  For example,
1202: by turning up the spin-orbit coupling continuously from
1203: zero and tracking how various contributions to the AHC behave, it
1204: is possible to separate out those contributions that are of linear,
1205: quadratic, or higher order in the coupling strength. 
1206: Some results of this kind will be given later in Sec.~\ref{sec:results}.
1207: 
1208: Because the spin-orbit coupling is a relativistic effect, it is
1209: appreciable mainly in the core region of the atom where the
1210: electrons have relativistic velocities.  In a pseudopotential framework
1211: of the kind adopted here, both the scalar relativistic effects and
1212: the spin-orbit coupling are included in the pseudopotential construction.
For example, in the Bachelet-Hamann semilocal pseudopotential
scheme,\cite{bachelet82} the construction procedure generates, for each
1215: orbital angular momentum $l$, a scalar-relativistic potential
1216: $V^{\rm sr}_l(r)$ and a spin-orbit difference potential
1217: $V^{\rm so}_l(r)$ which enter the Hamiltonian in the form
1218: %
1219: \begin{equation}
1220: \hat{V}_{\rm ps}=\sum_l \hat{P}_l\,\left[ V^{\rm sr}_l(r) + \lambda\,
1221:     V^{\rm so}_l(r)\, {\bf L}\cdot {\bf S}\right] \;,
1222: \end{equation}
1223: %
1224: where $\hat{P}_l$ is the projector onto states of orbital angular
1225: momentum $l$ and $\lambda$ controls the strength of spin-orbit coupling
1226: (with $\lambda$=1 being the physical value).  For the free atom,
1227: this correctly leads to eigenstates labeled by total angular
1228: momentum $j=l\pm1/2$.
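
The action of the spin-orbit term on these eigenstates follows from the
standard identity
${\bf L}\cdot{\bf S}=\frac{1}{2}({\bf J}^2-{\bf L}^2-{\bf S}^2)$:
%
\[
\langle{\bf L}\cdot{\bf S}\rangle=
\frac{\hbar^2}{2}\Big[j(j+1)-l(l+1)-\frac{3}{4}\Big]
=\left\{\begin{array}{ll}
 \frac{l}{2}\,\hbar^2\;, & j=l+\frac{1}{2}\;,\\[3pt]
 -\frac{l+1}{2}\,\hbar^2\;, & j=l-\frac{1}{2}\;.
\end{array}\right.
\]
%
For $d$ states ($l=2$) this gives $+\hbar^2$ and $-\frac{3}{2}\hbar^2$.
Weighting the two values by the degeneracies $2j+1$, i.e., $2l+2$ and
$2l$, gives zero, so that $V^{\rm sr}_l$ is the degeneracy-weighted
average over the two $j$ channels, with weights $(l+1)/(2l+1)$ and
$l/(2l+1)$.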
1229: 
1230: In our calculations, we employ fully non-local pseudopotentials instead
1231: of semilocal ones because of their computationally efficient form.
1232: In this case, controlling the strength of the spin-orbit coupling
1233: requires some algebraic manipulation.
1234: %
1235: We write the norm-conserving non-local pseudopotential operator as
1236: %
1237: \begin{equation}
1238: \hat{V}_{\rm ps}=|\beta_{lj\mu}\rangle\, D_{lj} \,\langle\beta_{lj\mu}|
1239: \label{eq:Vps}
1240: \end{equation}
1241: %
1242: where there is an implied sum running over the indices (orbital angular
1243: momentum $l$, total angular momentum $j=l\pm1/2$, and $\mu=-j,...,j$)
1244: and species and atomic position indices have been suppressed.
1245: The $|\beta_{lj\mu}\rangle$ are radial
1246: functions multiplied by appropriate spin-angular functions and the $D_{lj}$
1247: are the channel weights.
1248: %
1249: We introduce the notation
1250: $\beta^{(+)}_{l}(r)$ and $\beta^{(-)}_{l}(r)$ for the radial
parts of $|\beta_{l,l+1/2,\mu}\rangle$ and $|\beta_{l,l-1/2,\mu}\rangle$,
1252: respectively, and similarly define $D^{(\pm)}_{l}=D_{l,l\pm1/2}$.
1253: %
1254: Using this notation, we can define the scalar-relativistic
1255: (i.e., $j$-averaged) quantities
1256: %
1257: \begin{equation}
1258:         D^{\rm sr}_{l}=\frac{l+1}{2l+1}\,D^{(+)}_l
1259:                          +\frac{l}{2l+1}\,D^{(-)}_l  \;,
1260: \end{equation}
1261: %
1262: \begin{equation}
1263:  \beta^{\rm sr}_{l}(r)=
1264:    \frac{l+1}{2l+1}\,
1265:       \sqrt{\frac{D^{(+)}_l}{D^{\rm sr}_{l}}}\;\,\beta^{(+)}_l(r)
1266:   +\frac{l}{2l+1}\,
1267:       \sqrt{\frac{D^{(-)}_l}{D^{\rm sr}_{l}}}\; \,\beta^{(-)}_l(r)
1268: \end{equation}
1269: %
1270: and the corresponding spin-orbit difference quantities
1271: %
1272: \begin{equation}
1273: D^{\rm so}_{lj}=D_{lj}-D^{\rm sr}_{l}\;,
1274: \end{equation}
1275: %
1276:  \begin{equation}
1277:  |\beta^{\rm so}_{lj\mu}\rangle=
    |\beta_{lj\mu}\rangle-|\beta^{\rm sr}_{lj\mu}\rangle \;,
1279:  \end{equation}
1280:  where $|\beta^{\rm sr}_{lj\mu}\rangle$ is $\beta^{\rm sr}_{l}(r)$
1281:  multiplied by the spin-angular function with labels $(lj\mu)$.
1282: %
1283: Then the non-local pseudopotential can be written as
1284: %
1285: \begin{equation}
1286: \hat{V}_{\rm ps}=\hat{V}^{\rm sr}+\lambda\,\hat{V}^{\rm so}
1287: \end{equation}
1288: %
1289: where
1290: %
1291: \begin{equation}
1292: \hat{V}^{\rm sr}=|\beta^{\rm sr}_{lj\mu}\rangle \, D^{\rm sr}_{l} \,
1293:     \langle \beta^{\rm sr}_{lj\mu} |
1294: \label{eq:Vsr}
1295: \end{equation}
1296: %
1297: and
1298: %
1299: \begin{eqnarray}
\hat{V}^{\rm so}&=&|\beta^{\rm sr}_{lj\mu}\rangle \, D^{\rm so}_{lj}\, 
1301:     \langle \beta^{\rm sr}_{lj\mu} | \nonumber\\&&
1302:      +\,|\beta^{\rm so}_{lj\mu}\rangle \, (D^{\rm sr}_{l}+D^{\rm so}_{lj}) \,
1303:     \langle \beta^{\rm sr}_{lj\mu} | \nonumber\\&&
1304:      +\,|\beta^{\rm sr}_{lj\mu}\rangle \, (D^{\rm sr}_{l}+D^{\rm so}_{lj}) \,
1305:     \langle \beta^{\rm so}_{lj\mu} | \nonumber\\&&
1306:      +\,|\beta^{\rm so}_{lj\mu}\rangle \, (D^{\rm sr}_{l}+D^{\rm so}_{lj}) \,
1307:     \langle \beta^{\rm so}_{lj\mu} | \;.
1308: \end{eqnarray}
1309: %
1310: This clearly reduces to the desired results (\ref{eq:Vps}) for $\lambda=1$ and
1311: (\ref{eq:Vsr}) for $\lambda=0$.
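
The $\lambda=1$ limit can be checked term by term.  Writing each
projector and each weight as the sum of its scalar-relativistic and
spin-orbit parts,
%
\begin{eqnarray*}
|\beta_{lj\mu}\rangle\, D_{lj}\,\langle\beta_{lj\mu}|
&=&\big(|\beta^{\rm sr}_{lj\mu}\rangle
       +|\beta^{\rm so}_{lj\mu}\rangle\big)
   \big(D^{\rm sr}_{l}+D^{\rm so}_{lj}\big)\\
&&\times
   \big(\langle\beta^{\rm sr}_{lj\mu}|
       +\langle\beta^{\rm so}_{lj\mu}|\big)\;,
\end{eqnarray*}
%
and multiplying out, the product
$|\beta^{\rm sr}_{lj\mu}\rangle D^{\rm sr}_{l}
\langle\beta^{\rm sr}_{lj\mu}|$
reproduces Eq.~(\ref{eq:Vsr}), while the remaining products are the
four terms collected in the spin-orbit part defined above.  We also note
that, by construction of the $j$-averaged weight,
%
\[
\frac{l+1}{2l+1}\,D^{\rm so}_{l,l+1/2}
+\frac{l}{2l+1}\,D^{\rm so}_{l,l-1/2}=0\;.
\]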
1312: 
1313: 
1314: 
1315: %==================
1316: \section{Results}
1317: \label{sec:results}
1318: %==================
1319: 
1320: In this section, we present the results of the calculations of the
1321: Berry curvature and its integration over the BZ using the formulas
1322: presented in Sec.~\ref{sec:fdbc}, for the case of bcc Fe.
1323: 
1324: %--------------
1325: \subsection{Berry Curvature}
1326: \label{sec:berrycurv}
1327: %--------------
1328: 
1329: \begin{figure}
1330: \begin{center}
1331:  \epsfig{file=fig3.eps,width=2.8in}
1332: \end{center}
1333: \caption{Band structure and total Berry curvature, as calculated
1334: using Wannier interpolation, plotted along the path $\Gamma$--H--P
1335: in the Brillouin zone.
1336: (a) Computed at the full spin-orbit coupling strength $\lambda=1$.
1337: (b) Computed at the reduced strength $\lambda=0.25$.
1338: The peak marked with a star has a height of 5$\times$10$^4$\,a.u. }
1339: \label{fig:bdbc}
1340: \end{figure}
1341: 
1342: We begin by illustrating the very sharp and strong variations that
1343: can occur in the total Berry curvature, Eq.~(\ref{eq:omega_tot}),
1344: near Fermi-surface features in the bandstructure.\cite{fang03}  In
1345: Fig.~\ref{fig:bdbc}(a) we plot the energy bands (top subpanel) and the total
Berry curvature (bottom subpanel) in the vicinity of the
1347: zone-boundary point ${\rm H}=\frac{2\pi}{a}(1,0,0)$, where three
1348: states, split by the spin-orbit interaction, lie just above the Fermi
1349: level. The large spike in the Berry curvature between the H and P
points arises where two bands, split by the spin-orbit interaction, lie
1351: on either side of the Fermi level.\cite{yao04} 
1352: This gives rise to small
1353: energy denominators, and hence large contributions, mainly in
1354: Eq.~(\ref{eq:kubototm}).  On reducing the strength of the
1355: spin-orbit interaction as in Fig.~\ref{fig:bdbc}(b), the
1356: energy separation between these bands is reduced, resulting in a
1357: significantly sharper and higher spike in the Berry curvature.
1358: A second type of sharp structure is visible in Fig.~\ref{fig:bdbc},
1359: where one can see two smaller spikes, one at about 40\% and another
1360: at about 90\% of the way from $\Gamma$ to H, which
decrease in magnitude as the spin-orbit
1362: coupling strength is reduced.  These arise from pairs of bands that
1363: straddle the Fermi energy even in the absence of spin-orbit
1364: interaction.  Thus, the small spin-orbit coupling does not shift the
1365: energies of these bands significantly, but it does induce an
1366: appreciable Berry curvature that is roughly linear in the spin-orbit
1367: coupling.
1368: 
1369: \begin{figure}
1370: \begin{center}
1371: \epsfig{file=fig4.eps,width=2.8in} 
1372: \end{center}
1373: \caption{Decomposition of the total Berry curvature into contributions coming
1374: from the three kinds of terms appearing in Eq.~(\ref{eq:bsum}).
1375: The path in $k$-space is the same as in Fig.~\ref{fig:bdbc}.
1376: Dotted line is the first ($\overline{\Omega}$) term, dashed line is the sum
1377: of second and third ($D$--$\overline{A}$) terms, and solid line is the
1378: fourth ($D$--$D$) term of
1379: Eq.~(\ref{eq:bsum}). Note the log scale on the vertical axis.}
1380: \label{fig:3bc}
1381: \end{figure}
1382: 
1383: The decomposition of the total Berry curvature into its various
1384: contributions in Eq.~(\ref{eq:bsum}) is illustrated in Fig.~\ref{fig:3bc}, which plots
1385: the first (``$\overline{\Omega}$'') term,
1386: the second and third (``$D$--$\overline{A}$'') terms,
1387: and the fourth (``$D$--$D$'' or Kubo-like) term of Eq.~(\ref{eq:bsum})
1388: separately along the line $\Gamma$--H--P.  Note the logarithmic scale.
1389: The results confirm the expectations expressed in
1390: Secs.~\ref{sec:disc} and \ref{sec:bandsum}, namely, that the largest
1391: terms would be those reflecting large contributions to $D$ arising
1392: from small energy denominators.  Thus, the $\overline{\Omega}$
1393: term remains small everywhere, the $D$--$\overline{A}$ terms become
1394: one or two orders of magnitude larger at places where small energy
1395: denominators occur, and the $D$--$D$ term, Eq.~(\ref{eq:kubototm}),
1396: is another one or two orders of magnitude
1397: larger in those same regions.  Scans along other
1398: lines in $k$-space reveal similar behavior.  We may therefore expect
1399: that the $D$--$D$ term will make the dominant overall contribution
1400: to the AHC. As we shall show in the next subsection, this is precisely
1401: the case.
1402: 
1403: In order to get a better feel for the connection between Fermi surface
1404: features and the Berry curvature, we next inspect these quantities
1405: on the $k_y=0$ plane in the Brillouin zone, following Ref.~\onlinecite{yao04}.
1406: In Fig.~\ref{fig:fermispin}
1407: we plot the intersection of the Fermi surface with this plane
1408: and indicate, using color coding, the $S_z$ component of the spin
1409: carried by the corresponding wavefunctions. 
1410: The good agreement between the shape of the Fermi surface given here
1411: and in Fig.~3 of Ref.~\onlinecite{yao04} is further evidence that the
1412: accuracy of our approach matches that of all-electron methods.
1413: It is evident that the presence of the spin-orbit interaction, in
1414: addition to the exchange splitting, is sufficient to remove all
1415: degeneracies on this plane,\cite{singh73} 
1416: changing significantly the connectivity of the Fermi surface.
1417: 
1418: \begin{figure}
1419: \begin{center}
1420: \epsfig{file=fig5.eps,width=2.5in}
1421: \end{center}
1422: \caption{(Color online). 
1423: Lines of intersection between the Fermi surface and the plane $k_y=0$.
1424: Colors indicate the $S_z$ spin-component of the states
1425: on the Fermi surface (in units of $\hbar/2$).}
1426: \label{fig:fermispin}
1427: \end{figure}
1428: 
1429: \begin{figure}
1430: \begin{center}
1431: \epsfig{file=fig6.eps,width=2.6in}
1432: \end{center}
1433: \caption{(Color online).
1434: Calculated total Berry curvature $\Omega_z$ in the plane $k_y=0$
1435: (note log scale).  Intersections of the Fermi surface with this
1436: plane are again shown.}
1437: \label{fig:bc}
1438: \end{figure}
1439: 
1440: The calculated Berry curvature is shown in Fig.~\ref{fig:bc}. It
1441: can be seen that the regions in which the Berry curvature is small
1442: (light green regions) fill most of the plane.  The largest values
1443: occur at the places where two Fermi lines approach one another,
1444: consistent with the discussion of Fig.~\ref{fig:bdbc}.
1445: Of special importance are the avoided crossings between two
1446: bands having the same sign of spin, or between two bands of
1447: opposite spin.  Examples of both kinds are visible in the figure,
1448: and both tend to give rise to very large contributions in the
1449: region of the avoided crossing.
1450: Essentially, the spin-orbit interaction
1451: causes the character of these bands to change extremely rapidly with
1452: $\k$ near the avoided crossing; this is the origin of the
1453: large Berry curvature.  The large contributions near the H points
1454: correspond to the peaks that were already mentioned in the discussion of
1455: Fig.~\ref{fig:bdbc}, resulting from mixing of nearly degenerate bands
1456: by the spin-orbit interaction.
1457: 
1458: %--------------
1459: \subsection{Integrated anomalous Hall conductivity}
1460: \label{sec:intanom}
1461: %--------------
1462: 
1463: We now discuss the computation of the AHC as an integral
1464: of the Berry curvature over the Brillouin zone,
1465: Eq.~(\ref{eq:sigma_b}).  We first define a nominal $N_0\times
1466: N_0\times N_0$ mesh that uniformly fills the Brillouin zone.
1467: We next reduce this to a sum over the irreducible wedge
1468: that fills $\frac{1}{16}$th of the Brillouin zone, using the
1469: tetragonal point-group symmetry (broken from cubic by the onset
1470: of ferromagnetism), and calculate $\Omega_z$ on each mesh point
1471: using Eq.~(\ref{eq:bsum}).  Finally, following Yao {\it et al.},
1472: \cite{yao04} we implement an adaptive
1473: mesh refinement scheme in which we identify those points of
1474: the $k$-space mesh at which the computed Berry curvature exceeds
1475: a threshold value $\Omega_{\rm cut}$, and recompute $\Omega_z$ on
1476: an $N_a\times N_a\times N_a$ submesh spanning the original cell
1477: associated with this mesh point.  The AHC is then computed as a sum
1478: of $\Omega_z$ over this adaptively refined mesh with appropriate
1479: weights.
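
To make the ``appropriate weights'' concrete, the refined sum can be
sketched as follows. The notation is introduced here for illustration only
and is not part of the formalism above: $w_\k$ is the symmetry weight of the
nominal mesh point $\k$ in the irreducible wedge (normalized so that
$\sum_\k w_\k=1$), $\k_j$ are the $N_a^3$ submesh points in the cell
around $\k$, and we assume that the threshold test is applied to the
magnitude $|\Omega_z(\k)|$.
%
\begin{eqnarray}
\langle\Omega_z\rangle_{\rm BZ} &\simeq&
\sum_{\k:\,|\Omega_z(\k)|\le\Omega_{\rm cut}} w_\k\,\Omega_z(\k)
\nonumber\\
&&+\sum_{\k:\,|\Omega_z(\k)|>\Omega_{\rm cut}}
\frac{w_\k}{N_a^3}\sum_{j=1}^{N_a^3}\Omega_z(\k_j)
\;. \nonumber
\end{eqnarray}
%
Each refined point thus carries $1/N_a^3$ of the weight of its parent mesh
point, so the weights still sum to one, and the AHC follows from this
Brillouin-zone average through Eq.~(\ref{eq:sigma_b}).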
1480: 
1481: \begin{table}
1482: \caption{Convergence of the AHC with respect to the density of the
1483: nominal $k$-point mesh (left column) and the adaptive refinement
1484: scheme used to subdivide the mesh in regions of large contributions
1485: (middle column).}
1486: \begin{ruledtabular}
1487: \begin{tabular}{ccc}
1488: $k$-point mesh & Adaptive refinement & $\sigma$ $(\Omega$ ${\rm cm})^{-1}$ \\
1489: \hline
1490: $200 \times 200 \times 200$ & $3 \times 3 \times 3$ & 774.55 \cr
1491: $250 \times 250 \times 250$ & $3 \times 3 \times 3$ &774.84\cr
1492: $320 \times 320 \times 320$ & $3 \times 3 \times 3$ &775.80\cr
1493: $200 \times 200 \times 200$ & $5 \times 5 \times 5$ &765.96\cr
1494: $250 \times 250 \times 250$ & $5 \times 5 \times 5$ &766.37\cr
1495: $320 \times 320 \times 320$ & $5 \times 5 \times 5$ & 766.76\cr
1496: $200 \times 200 \times 200$ & $7 \times 7 \times 7$ &763.87\cr
1497: $250 \times 250 \times 250$ & $7 \times 7 \times 7$ &764.84\cr
1498: $320 \times 320 \times 320$ & $7 \times 7 \times 7$ & 765.10\cr
1499: $320 \times 320 \times 320$ & $9 \times 9 \times 9$ & 764.59\cr
1500: $320 \times 320 \times 320$ & $11 \times 11 \times 11$ & 764.37\cr
1501: $320 \times 320 \times 320$ & $13 \times 13 \times 13$ &764.27\cr
1502: \end{tabular}
1503: \end{ruledtabular}
1504: \label{table:convergence}
1505: \end{table}
1506: 
1507: The convergence of the AHC with respect to the choice of
1508: mesh is presented in Table~\ref{table:convergence}. 
1509: We have chosen $\Omega_{\rm cut}=1.0\times10^2$~a.u., which causes the
1510: adaptive mesh refinement to be triggered at approximately 0.11\% of
1511: the original mesh points.
1512: %
1513: The value of 751\,$(\Omega$ ${\rm cm})^{-1}$ reported previously in
1514: Ref.~\onlinecite{yao04} corresponds to a mesh similar to the one in
1515: the first line of Table~\ref{table:convergence}; our value of
1516: 775\,$(\Omega$ ${\rm cm})^{-1}$ for this mesh thus agrees to within
1517: a few percent with their value.
1518: Based on the results of Table~\ref{table:convergence}, we estimate
1519: the converged value of $\sigma$ to be
1520: 764\,$(\Omega$ ${\rm cm})^{-1}$.
1521: 
1522: It can be seen from Table~\ref{table:convergence}
1523: that a $200\times200\times200$ mesh with $3\times3\times3$
1524: refinement approaches within $\sim$1\% of the converged value.  It is
1525: also evident that the level of refinement is more important than
1526: the fineness of the nominal mesh; a $200\times200\times200$
1527: mesh with $5\times5\times5$ adaptive refinement yields a result
1528: that is within about 0.3\% of the converged value, better than a
1529: $320\times320\times320$ mesh with a lower level of refinement.
1530: 
1531: \begin{table}
1532: \caption{Contributions to the AHC coming from different regions
1533: of the Brillouin zone, as defined in the text.}
1534: \begin{ruledtabular}
1535: \begin{tabular}{cccc}
1536: $\Delta  E$  (eV)&   like-spin (\%) & opposite-spin (\%) & smooth (\%)\\
1537: \hline
1538: 0.1 & 21 & 26 & 53 \cr
1539: 0.2 & 23 & 51 & 26 \cr
1540: 0.5 & 30 & 68 & 2\cr
1541: \end{tabular}
1542: \end{ruledtabular}
1543: \label{table:percentage}
1544: \end{table}
1545: 
1546: It is interesting to decompose the total AHC into contributions coming
1547: from different parts of the Brillouin zone.  For example, as we saw in
1548: Fig.~\ref{fig:bc}, there is a smooth, low-intensity background
1549: that fills most of the volume of the Brillouin zone, and it is hard to
1550: know {\it a priori} whether the total AHC is dominated by these
1551: contributions or by the much larger ones concentrated in small
1552: regions.  With this motivation, we have somewhat arbitrarily
1553: divided the Brillouin zone into three kinds of regions, which we
1554: label as `smooth', `like-spin', and `opposite-spin'.  To do this,
1555: we check at each $k$-point whether there is an occupied band in the
1556: interval $[E_f-\Delta E,E_f]$ and an unoccupied band in the interval
1557: $[E_f,E_f+\Delta E]$, where $\Delta E$ is arbitrarily chosen to be a
1558: small energy such as $0.1$, $0.2$, or $0.5$\,eV.  If so,
1559: the $k$-point is said to belong to the `like-spin' or
1560: `opposite-spin' region depending on whether the dominant characters of
1561: the two bands below and above the Fermi energy are of the same
1562: or of opposite spin.  Otherwise, the $k$-point is assigned to the
1563: `smooth' region.
1564: As shown in Table~\ref{table:percentage}, the results depend strongly
1565: on the value of $\Delta E$. Overall, it is clear that the major
1566: contributions arise from the bands within $\pm 0.5$\,eV of $E_f$,
1567: and that neither like-spin nor opposite-spin contributions are dominant.
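
Stated compactly, and writing $\varepsilon_{n\k}$ for the band energies (a
symbol assumed here for illustration), a $k$-point is excluded from the
`smooth' region if some pair of bands $n$, $m$ satisfies
%
\begin{equation}
E_f-\Delta E\le\varepsilon_{n\k}\le E_f\le\varepsilon_{m\k}\le E_f+\Delta E
\;. \nonumber
\end{equation}
%
As a consistency check on Table~\ref{table:percentage}, the three
percentages in each row add up to 100; for example, $21+26+53=100$ for
$\Delta E=0.1$\,eV and $30+68+2=100$ for $\Delta E=0.5$\,eV.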
1568: 
1569: Next, we return to the discussion of the decomposition of
1570: the total Berry curvature in Eq.~(\ref{eq:bsum}) into
1571: $\overline{\Omega}$, $D$--$\overline{A}$, and $D$--$D$ terms.
1572: We find that these three kinds of terms account for $-$0.20\%,
1573: 0.71\%, and  99.48\%, respectively, of the total AHC.
1574: (Similarly, for the alternative decomposition of
1575: Appendix~\ref{app:alt-berry}, the second term of Eq.~(\ref{eq:bsumalt}) is
1576: found to be responsible for more than 99\% of the total.)
1577: Thus, if a 1\% accuracy is acceptable, one could
1578: actually neglect the $\overline{\Omega}$ and $D$--$\overline{A}$
1579: terms entirely, and approximate the total AHC by the $D$--$D$
1580: (Kubo-like) terms alone, Eq.~(\ref{eq:kubototm}).  
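
As a quick check of the arithmetic behind this statement, using only the
percentages quoted above:
%
\begin{eqnarray}
-0.20\%+0.71\%+99.48\% &=& 99.99\% \;, \nonumber\\
\left|-0.20\%+0.71\%\right| &=& 0.51\% \;. \nonumber
\end{eqnarray}
%
The first line recovers the total to within rounding, and the second line is
the relative error incurred by keeping only the $D$--$D$ terms, which is
indeed below 1\%.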
1581: 
1582: While we had anticipated in Sec.~\ref{sec:disc}
1583: that the $D$--$D$ terms would dominate, the extent to which 
1584: this occurred in the actual calculation is somewhat surprising and merits 
1585: further discussion. It is important to emphasize that this should
1586: not be expected to occur when using an arbitrary Wannier representation, but 
1587: only for WFs which minimize the spread functional. Indeed, only the sum of all 
1588: terms in Eq.~(\ref{eq:bsum}) is uniquely defined; taken separately, the 
1589: $\overline{\Omega}$, $D$--$\overline{A}$, and $D$--$D$ terms
1590: depend on the choice of gauge. Moreover, while the $\overline{\Omega}$ and
1591: $D$--$\overline{A}$ terms involve both the Hamiltonian and position matrix
1592: elements between WFs, the dominant $D$--$D$ term only depends on the 
1593: Hamiltonian matrix elements. Since the minimization of the gauge-dependent
1594: part of the spread functional corresponds precisely to minimizing the RMS
1595: average magnitude of the position matrix element between WFs,\cite{marzari97}
1596: it is perhaps not too surprising that we capture most of the AHC by neglecting
1597: the terms which involve position matrix elements.
1598: 
1599: From a computational point of view,
1600: the fact that the $D$--$D$ terms are fully specified by the
1601: Hamiltonian matrix elements alone means that considerable
1602: savings can be obtained by avoiding the evaluation of
1603: the Fourier transforms in Eqs.~(\ref{eq:ww}-\ref{eq:xx})
1604: at every interpolation point
1605: (and avoiding the setup of the matrix elements 
1606: $\langle {\bf 0}n|\hat r_\alpha|{\bf R}m\rangle$, which can be costly in a
1607: real-space implementation). More importantly,
1608: this observation, if it turns out to hold for other materials as well,
1609: could prove valuable for future efforts to
1610: derive approximate schemes capable of capturing the most important
1611: contributions to the AHC.
1612: 
1613: Finally, we investigate how the total AHC depends upon the
1614: strength of the spin-orbit interaction, following the approach
1615: of Sec.~\ref{sec:spinorbit} to modulate the spin-orbit strength.
1616: The result is shown in Fig.~\ref{fig:lam}.  We emphasize that our
1617: approach is a more specific test of the dependence upon spin-orbit
1618: strength than the one carried out in Ref.~\onlinecite{yao04};
1619: there, the speed of light $c$ was varied, which entails changing
1620: the strength of the various scalar relativistic terms as well.
1621: Nevertheless, both studies lead to a similar conclusion: the
1622: variation is found to be linear for small values of the spin-orbit
1623: coupling ($\lambda\ll 1$), while quadratic or other higher-order
1624: terms also become appreciable when the full interaction is included
1625: ($\lambda=1$).
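
One simple way to summarize this behavior, offered only as a schematic
parametrization and not as a fit performed here, is a low-order expansion
with hypothetical coefficients $a$ and $b$,
%
\begin{equation}
\sigma(\lambda)\simeq a\,\lambda+b\,\lambda^2+{\cal O}(\lambda^3)
\;, \nonumber
\end{equation}
%
with no constant term because the AHC vanishes when the spin-orbit
interaction is switched off.  The linear regime then corresponds to
$|b\lambda|\ll|a|$, and the curvature seen near $\lambda=1$ signals that
the $b\lambda^2$ and higher-order terms are no longer negligible there.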
1626: 
1627: \begin{figure}
1628: \begin{center}
1629: \epsfig{file=fig7.eps,width=2.6in}
1630: \end{center}
1631: \caption{Anomalous Hall conductivity vs.~spin-orbit coupling strength.}
1632: \label{fig:lam}
1633: \end{figure}
1634: 
1635: %--------------
1636: \subsection{Computational Considerations}
1637: %--------------
1638: 
1639: The computational requirements for this scheme are quite modest. 
1640: The self-consistent ground state calculation and the construction
1641: of the WFs take 2.5 hours on a single 2.2\,GHz AMD Opteron processor.
1642: The expense of computing the AHC as a sum over interpolation mesh
1643: points depends strongly on the density of the mesh.  On the same processor
1644: as above, the average CPU time to evaluate $\Omega_z$ at each
1645: $k$-point was about 14\,ms. We find that the mesh
1646: refinement operation does not significantly increase the total
1647: number of $k$-point evaluations until the refinement level $N_a$
1648: exceeds $\sim$10.  Allowing for the fact that the calculation only
1649: needs to be done in the irreducible $\frac{1}{16}$th of the
1650: Brillouin zone, the cost for the AHC evaluation on a
1651: 200$\times$200$\times$200 mesh is about 2 hours.
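
The 2-hour figure can be reproduced from the numbers quoted above.  As a
rough estimate that neglects the refinement overhead, the number of
irreducible mesh points $N_{\rm irr}$ and the total time $t$ are
%
\begin{eqnarray}
N_{\rm irr}&\simeq&\frac{200^3}{16}=5\times10^5 \;, \nonumber\\
t&\simeq&5\times10^5\times14\,{\rm ms}=7\times10^3\,{\rm s}
\simeq1.9\,{\rm hours} \;. \nonumber
\end{eqnarray}
%
For the refinement overhead we make the simplifying assumption that the
fraction $f\simeq0.0011$ of flagged mesh points quoted in
Sec.~\ref{sec:intanom} applies to this mesh as well, and that each flagged
point is replaced by $N_a^3$ evaluations.  The number of evaluations then
grows by a factor of
%
\begin{equation}
1-f+fN_a^3\simeq1+fN_a^3 \;, \nonumber
\end{equation}
%
that is, about 1.03 for $N_a=3$, 1.14 for $N_a=5$, and 2.1 for $N_a=10$.
Under this assumption the refinement work equals the cost of the nominal
mesh when $fN_a^3=1$, i.e., at $N_a\simeq9.7$, consistent with the
statement that the overhead becomes significant only around $N_a\sim10$.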
1652: 
1653: The CPU time per $k$-point evaluation is dominated (roughly 90\%)
1654: by the Fourier transform operations needed to construct the objects
1655: in Eqs.~(\ref{eq:uu}-\ref{eq:xx}). The diagonalization of the
1656: 18$\times$18 Hamiltonian matrix, and other operations needed to
1657: compute Eq.~(\ref{eq:bsum}), account for only about 10\% of the
1658: time.  The CPU requirement for the Fourier transform step is
1659: roughly proportional to the number of $\R$ vectors kept in
1660: Eqs.~(\ref{eq:uu}-\ref{eq:xx}); it is possible that this number
1661: could be reduced by exploring more sophisticated methods for
1662: truncating the contributions coming from the more
1663: distant $\R$ vectors.
1664: 
1665: Of course, the loop over $k$-points in the AHC calculation is
1666: trivial to parallelize, so for dense $k$-meshes we speed up this
1667: stage of the calculation by distributing it across multiple processors.
1668: 
1669: %==================
1670: \section{Summary and Discussion}
1671: \label{sec:conclusion}
1672: %==================
1673: 
1674: In summary, we have developed an efficient method for computing the
1675: intrinsic contribution to the anomalous Hall conductivity of a
1676: metallic ferromagnet as a Brillouin-zone integral of the Berry
1677: curvature. Our approach is based on Wannier interpolation, a powerful
1678: technique for evaluating properties that require a very dense sampling
1679: of the Brillouin zone or Fermi surface. The key idea is to map the low-energy
1680: first-principles electronic structure onto an ``exact
1681: tight-binding model'' in the basis of appropriately constructed Wannier 
1682: functions, which are typically partially occupied.
1683: In the Wannier representation the desired quantities can then be evaluated
1684: at arbitrary $k$-points at very low computational cost.
1685: All that is needed is to evaluate, once and for all, the Wannier-basis
1686: matrix elements of the Hamiltonian and a few other
1687: property-specific operators (namely, for the Berry curvature, the
1688: three Cartesian position operators).
1689: 
1690: When evaluating the Berry curvature in this 
1691: way, the summation over all unoccupied bands and the expensive
1692: calculation of the velocity matrix elements needed in the traditional Kubo
1693: formula are circumvented.\cite{explan-kubo}
1694: They are replaced by quantities defined strictly within the
1695: projected space spanned by the WFs. Our final expression for the total Berry
1696: curvature, Eq.~(\ref{eq:bsum}), consists of three types of terms,
1697: i.e., the $\overline{\Omega}$, $D$--$\overline{A}$, and $D$--$D$ terms.
1698: 
1699: We have applied this approach to calculate the AHC of bcc Fe.
1700: While our  Wannier interpolation formalism, with its
1701: decomposition (\ref{eq:bsum}),
1702: is entirely independent of the choice of an all-electron or
1703: pseudopotential method, we have chosen here a relativistic pseudopotential
1704: approach that includes scalar relativistic effects as well as the 
1705: spin-orbit interaction.  We find that this
1706: scheme successfully reproduces the fine details of the electronic
1707: structure and of the Berry curvature in good agreement with a
1708: previous calculation \cite{yao04} that used an all-electron LAPW
1709: method.\cite{singh}  The computed AHC is also quite close to that
1710: computed previously.\cite{yao04}
1711: 
1712: Interestingly, we found that more than 99\% of the total
1713: AHC comes from the $D$--$D$ term of our formalism.
1714: This term, given explicitly in Eq.~(\ref{eq:kubototm}), takes the form of 
1715: a Kubo-like Berry curvature formula  for the ``tight-binding states.''
1716: Thus we arrive at the very appealing result that
1717: a Kubo picture defined within the ``exact tight-binding space'' gives an
1718: excellent representation of the Berry curvature in the original 
1719: {\it ab-initio} space. It is worth pointing out that, unlike the other
1720: three terms, this term depends
1721: exclusively on the Hamiltonian matrix elements between the Wannier orbitals,
1722: and not on the position matrix elements. This result merits
1723: further investigation, and may be relevant for
1724: recent discussions in the
1725: tight-binding literature on how to incorporate the coupling to
1726: electromagnetic radiation in a tight-binding 
1727: description.\cite{graf95,garm01,boykin01,foreman02}
1728: 
1729: Several directions for future studies suggest themselves.
1730: For example, it would be desirable to obtain a better understanding
1731: of how the AHC depends on the weak spin-orbit interaction.  As we
1732: have seen, this weak interaction causes splittings and avoided
1733: crossings that give rise to very large Berry curvatures in very
1734: small regions of $k$-space.  There is a kind of paradox here.
1735: Our numerical tests, as in Fig.~\ref{fig:lam}, demonstrate that
1736: the AHC falls smoothly to zero as the spin-orbit strength $\lambda$
1737: is turned off, suggesting that a perturbation theory in $\lambda$
1738: should be applicable.  However, in the limit that $\lambda$
1739: becomes small, the full calculation becomes {\it more difficult},
1740: not less: the splittings occur in narrower and narrower regions
1741: of $k$-space, energy denominators become smaller, and Berry
1742: curvature contributions become larger (see Fig.~\ref{fig:bdbc}),
1743: even if the {\it integrated} contribution is going to zero.
1744: It would be of considerable interest, therefore, to explore ways
1745: to reformulate the perturbation theory in $\lambda$ so that the
1746: expansion coefficients can be computed in a robust and efficient
1747: fashion.  Because the exchange splitting is much larger than the
1748: spin-orbit splitting, it may also be of use to introduce
1749: two separate couplings that control the strengths of the
1750: spin-flip and spin-conserving parts of the spin-orbit
1751: interaction respectively, and to work out the perturbation theory
1752: in these two couplings independently.
1753: 
1754: Another promising direction is to explore whether the AHC can be
1755: computed as a Fermi-surface integral using the formulation of
1756: Haldane,\cite{haldane04} in which an integration by parts is used
1757: to convert the volume integral of the Berry curvature to a
1758: Fermi-surface integral involving Berry curvatures or potentials.
1759: Such an approach promises to be more efficient than the volume-integration
1760: approach, provided that a method can be developed for carrying out an
1761: appropriate sampling of the Fermi surface.  This is likely to be a
1762: delicate problem, however, since the weak spin-orbit splitting causes
1763: Fermi sheets to separate and reattach in a complex way at short
1764: $k$-scales, and the
1765: dominant contributions to the AHC are likely to come from precisely
1766: these portions of the reconstructed Fermi surface that are the most
1767: difficult to describe numerically.
1768: 
1769: In any case, even without such further developments, the present approach
1770: is a powerful one.  It reduces the expense needed to do an extremely
1771: fine sampling of Fermi-surface properties to the level where the
1772: AHC of a material like bcc Fe can be computed on a workstation in a
1773: few hours.  This opens the door to realistic calculations of the intrinsic
1774: anomalous Hall conductivity of much more complex materials.
1775: More generally, the techniques developed here for the AHE are readily
1776: applicable to other problems in the physics of metals which also require
1777: a very dense sampling of the Fermi surface or Brillouin zone. 
1778: For example, an extension of these ideas
1779: to the evaluation of the electron-phonon coupling matrix elements by
1780: Wannier interpolation is currently under way.\cite{giustino}
1781: 
1782: 
1783: 
1784: %%=========================================================================
1785: \acknowledgments
1786: %%=========================================================================
1787: 
1788: This work was supported by NSF Grant DMR-0549198 and by
1789: the Laboratory Directed Research and Development Program of Lawrence
1790: Berkeley National Laboratory under the Department of Energy Contract
1791: No. DE-AC02-05CH11231.
1792: 
1793: 
1794: %%=========================================================================
1795: \appendix
1796: \section{Alternative expression for the Berry curvature}
1797: \label{app:alt-berry}
1798: %%=========================================================================
1799: 
1800: In this Appendix, we return to Eq.~(\ref{eq:om-a}) and rewrite it
1801: in such a way that all of the large, rapidly varying contributions
1802: arising from small energy denominators in the expression for
1803: ${D}_\alpha$, Eq.~(\ref{eq:ddef}), are segregated into a single term.
1804: We do this by solving Eq.~(\ref{eq:Ata}) for ${D}_\alpha$
1805: and substituting into Eq.~(\ref{eq:om-a}) to obtain
1806: %
1807: \begin{equation}
1808: \Omega_{\alpha\beta}\ph =
1809: \overline\Omega_{\alpha\beta}\ph -i \left[\overline A_\alpha\ph,\overline
1810:    A_\beta\ph\right]
1811: +i\left[A_\alpha\ph,A_\beta\ph\right]
1812: \;.
1813: \label{eq:om-b}
1814: \end{equation}
1815: %
1816: Then only the last term will contain the large, rapid variations.
1817: This equation could have been anticipated based on the fact that
1818: the tensor
1819: %
1820: \begin{equation}
1821: \widetilde{\Omega}_{\alpha\beta}= \Omega_{\alpha\beta}-i [A_\alpha,A_\beta]
1822: \label{eq:gidef}
1823: \end{equation}
1824: %
1825: is well known to be a gauge-covariant quantity;\cite{mead92,marzari97}
1826: applying Eq.~(\ref{eq:utrans}) to $\widetilde{\Omega}_{\alpha\beta}$ then
1827: leads directly to Eq.~(\ref{eq:om-b}).  
1828: 
1829: This formulation provides an alternative route to the
1830: calculation of the matrix $\Omega_{\alpha\beta}\ph$:
1831: evaluate $\widetilde{\Omega}\pw_{\alpha\beta}$ in the Wannier representation
1832: using Eqs.~(\ref{eq:yya}-\ref{eq:yyb}) below,
1833: convert it to $\widetilde{\Omega}\ph_{\alpha\beta}$ via Eq.~(\ref{eq:utrans}),
1834: compute $A_\alpha\ph$ using Eq.~(\ref{eq:Ata}), and assemble
1835: %
1836: \begin{equation}
1837: \Omega_{\alpha\beta}\ph=\widetilde{\Omega}_{\alpha\beta}\ph
1838:    +i[A_\alpha\ph,A_\beta\ph]
1839: \;.
1840: \label{eq:alt}
1841: \end{equation}
1842: %
1843: The large and rapid variations then appear only in the last term involving 
1844: commutators of the $A$ matrices.
1845: 
1846: In Sec.~\ref{sec:bandsum}, we showed how to write the total Berry
1847: curvature $\Omega_{\alpha\beta}(\k)$ as a sum over bands in such a way that 
1848: potentially
1849: troublesome contributions coming from small energy
1850: denominators between pairs of occupied bands are explicitly
1851: excluded, leading to Eq.~(\ref{eq:bsum}). The corresponding expression based
1852: on Eq.~(\ref{eq:alt}) is
1853: %
1854: \begin{eqnarray}
1855: \Omega_{\alpha\beta}(\k)&=&\sum_n f_n\,\widetilde{\Omega}_{nn,\alpha\beta}\ph
1856: \nonumber\\
1857: &&\quad  +i\sum_{nm} (f_n-f_m)\,A_{nm,\alpha}\ph A_{mn,\beta }\ph
1858: \;.
1859: \label{eq:bsumalt}
1860: \end{eqnarray}
1861: %
1862: 
1863: Now, in addition to the four quantities given in
1864: Eqs.~(\ref{eq:uu}-\ref{eq:xx}), we need a corresponding equation for
1865: $\widetilde{\Omega}_{\alpha\beta}$.  After some manipulations, we find that
1866: %
1867: \begin{equation}
1868: \widetilde{\Omega}_{nn,\alpha\beta}\pw(\k)=\sum_\R e^{i\k\cdot\R}\;
1869:    w_{n,\alpha\beta}(\R)
1870: \label{eq:yya}
1871: \end{equation}
1872: %
1873: where
1874: %
1875: \begin{eqnarray}
1876: w_{n,\alpha\beta}(\R) &=&
1877: -i \sum_{\R'm}
1878:    \langle{\bf 0}n|\hat{r}_\alpha|\R' m\rangle
1879:    \langle\R' m|\hat{r}_\beta |\R n\rangle 
1880: \nonumber\\ &&
1881: +i \sum_{\R'm}
1882:    \langle{\bf 0}n|\hat{r}_\beta |\R' m\rangle
1883:    \langle\R' m|\hat{r}_\alpha|\R n\rangle 
1884: \;.
1885: \nonumber\\
1886: \label{eq:yyb}
1887: \end{eqnarray}
1888: %
1889: This formulation again requires the same basic ingredients as
1890: before, namely, the Wannier matrix elements of $\hat{H}$ and
1891: $\hat{r}_\alpha$.  In some respects it is a little more elegant
1892: than the formulation of Eq.~(\ref{eq:bsum}). However,
1893: the direct evaluation of $w_{n,\alpha\beta}$ in the Wannier
1894: representation, as given in Eq.~(\ref{eq:yyb}), is not
1895: as convenient because of the extra sum over intermediate WFs appearing
1896: there; moreover, $w_{n,\alpha\beta}$ is longer-ranged than
1897: the Hamiltonian and coordinate matrix elements.
1898: Also, one appealing feature of the formulation of Section~\ref{sec:fdbc},
1899: that more than 99\% of the effect can be recovered without using the
1900: position-operator matrix elements, is lost in this reformulation.
1901: We have therefore chosen to base our calculations and analysis on
1902: Eq.~(\ref{eq:bsum}) instead.
1903: 
1904: It is informative to obtain Eq.~(\ref{eq:alt}) in a different way:
1905: define the gauge-invariant band projection operator\cite{marzari97}
1906: %
1907: $\hat{P}_\k=\sum_{n=1}^M |u_{n\k}\rangle\langle u_{n\k}|$
1908: %
1909: and its complement $\hat{Q}_\k=1-\hat{P}_\k$. Inserting
1910: $\hat 1=\hat{Q}_\k+\hat{P}_\k$ into Eq.~(\ref{eq:Owg}) in the
1911: Hamiltonian gauge then yields directly Eq.~(\ref{eq:alt}) since, as can
1912: be easily verified, Eq.~(\ref{eq:gidef}) may be written as
1913: %
1914: \begin{equation}
1915: \widetilde{\Omega}_{nm,\alpha\beta}=
1916:   i\langle \widetilde{\partial}_\alpha u_{n}|\widetilde{\partial}_\beta  u_{m}\rangle
1917:  -i\langle \widetilde{\partial}_\beta  u_{n}|\widetilde{\partial}_\alpha
1918:   u_{m}\rangle\;,
1919: \label{eq:cov}
1920: \end{equation}
1921: %
1922: where $\widetilde{\partial}_\alpha\equiv\hat{Q}\partial_\alpha$. The
1923: gauge-covariance of $\widetilde{\Omega}_{\alpha\beta}$ follows directly from
1924: the fact that $\widetilde\partial_\alpha$
1925: is a gauge-covariant derivative, in the sense that
1926: $|\widetilde{\partial}_\alpha u_n\ph\rangle=
1927: \sum_{m=1}^M |\widetilde{\partial}_\alpha u_m\pw\rangle U_{mn}$
1928: is the same transformation law as Eq.~(\ref{eq:twist}) for the Bloch
1929: states themselves.
1930: It is apparent from this derivation that as the number $M$ of WFs increases
1931: and $\hat{P}_\k$ approaches $\hat 1$, the second term on the right-hand side
1932: of Eq.~(\ref{eq:bsumalt}) increases at the expense of the first term.
1933: Indeed, in the large-$M$ limit the entire Berry curvature is contained
1934: in the second term. For the choice of Wannier orbitals described in the
1935: main text for bcc Fe, that term already accounts for 99.8\% of the
1936: total AHC.
1937: 
1938: %==================
1939: \section{Finite-difference approach}
1940: \label{app:fda}
1941: %==================
1942: 
1943: In this Appendix, we outline an alternative scheme for computing
1944: the AHC by Wannier interpolation. The essential difference
1945: relative to the approaches described in Section~\ref{sec:fdbc} and
1946: in Appendix~\ref{app:alt-berry} is that the needed $k$-space
1947: derivatives are approximated
1948: here by finite differences instead of being expressed analytically in
1949: the Wannier representation.
1950: 
1951: This approach is most naturally applied to
1952: the zero-temperature limit where there are exactly 
1953: $N_\k$ occupied states at a given $\k$.
1954: Instead of starting from the Berry curvature of each individual band
1955: separately, as in Eq.~(\ref{eq:bcurv}), we find it convenient here to
1956: work from the outset with the total Berry curvature
1957: %
1958: \begin{eqnarray}
1959:         \Omega_{\alpha\beta}(\k)=\sum_{n=1}^{N_\k}\Omega_{nn,\alpha\beta}(\k)
1960: \label{eq:omtot-a}
1961: \end{eqnarray}
1962: %
1963: of the occupied manifold at $\k$
1964: (the zero-temperature limit of Eq.~(\ref{eq:omtot})).
1965: We now introduce a covariant derivative
1966: $\widetilde\partial_\alpha^{(N_\k)}=\hat{Q}_\k^{(N_\k)}\partial_\alpha$
1967: designed to act on the occupied states only; here
1968: $\hat{Q}_\k^{(N_\k)}=\hat{1}-\hat{P}_\k^{(N_\k)}$ and
1969: $\hat{P}_\k^{(N_\k)}=\sum_{n=1}^{N_\k}\,|u_{n\k}\rangle\langle u_{n\k}|$.
1970: The only difference with respect to the definition of
1971: $\widetilde{\partial}_\alpha$ in 
1972: Appendix~A is that the projection operator here spans
1973: the $N_\k$ occupied states only, instead of the $M$ states of the full
1974: projected space. Accordingly, terms such as ``gauge-covariance'' and
1975: ``gauge-invariance'' are to be understood here in a restricted sense.
1976: For example, the statement that $\widetilde\partial_\alpha^{(N_\k)}$ is a 
1977: gauge-covariant derivative means that
1978: under an $N_\k\times N_\k$ unitary rotation ${\cal U}(\k)$ between the
1979: occupied states at $\k$ it obeys the transformation law
1980: %
1981: \begin{equation}
1982: |\widetilde{\partial}^{(N_\k)}_\alpha u_{n\k}\rangle\rightarrow
1983: \sum_{m=1}^{N_\k} |\widetilde{\partial}^{(N_\k)}_\alpha u_{m\k}\rangle\,
1984: {\cal U}_{mn}(\k).
1985: \end{equation}
1986: %
1987: (We will use calligraphic symbols to distinguish $N_\k\times N_\k$ matrices
1988: such as $\cal U$ from their $M\times M$ counterparts such as $U$.)
1989: We now define a gauge-covariant
1990: curvature $\widetilde\Omega_{\alpha\beta}^{(N_\k)}(\k)$ by replacing
1991: $\widetilde\partial$ by $\widetilde\partial^{(N_\k)}$ in Eq.~(\ref{eq:cov}).
1992: Since the trace of a commutator vanishes, it follows from Eq.~(\ref{eq:gidef})
1993: that Eq.~(\ref{eq:omtot-a}) can be written as
1994: %
1995: \begin{eqnarray}
1996: \label{eq:partial_trace}
1997:         \Omega_{\alpha\beta}(\k)={\rm Tr}^{(N_\k)}\,\left[\,\widetilde
1998:         \Omega_{\alpha\beta}^{(N_\k)}(\k) \,\right],
1999: \end{eqnarray}
2000: %
2001: where the symbol ${\rm Tr}^{(N_\k)}$ denotes the trace over the occupied 
2002: states.
2003: 
2004: The advantage of this expression over Eq.~(\ref{eq:omtot-a})  is that the 
2005: covariant derivative of a Bloch state can be approximated by
2006: a very robust finite-differences formula:\cite{sai02,souza04}
2007: %
2008: \begin{eqnarray}
2009:         \tilde \partial_{\k}^{(N_\k)}\rightarrow \sum_{\b}w_{b}{\b}\hat
2010:         P_{\k,\b}^{(N_\k)}\; ,
2011: \label{eq:dis}
2012: \end{eqnarray}
2013: %
2014: where the sum is over shells of neighboring
2015: $k$-points,\cite{marzari97}
2016: as in Eq.~(\ref{eq:A_dis}),
2017: and we have defined the gauge-invariant operator
2018: %
2019: \begin{eqnarray}
2020:         \hat P_{\k,\b}^{(N_\k)}=\sum_{n=1}^{N_\k}|\widetilde
2021: u_{n,\k+\b}\rangle \langle u_{n\k}|\;
2022: \end{eqnarray}
2023: %
2024: in terms of the gauge-covariant ``dual states''
2025: %
2026: \begin{eqnarray}
                |\widetilde u_{n,\k+\b}\rangle=
2028:                 \sum_{m=1}^{N_\k}|u_{m,\k+\b}\rangle
2029: \left (\cal Q_{\k+\b,\k}\right )_{mn}\;.
2030: \end{eqnarray}
2031: Here $\cal Q_{\k+\b,\k}$ is the inverse of the 
2032: $N_\k\times N_\k$ overlap matrix,
2033: %
2034: \begin{eqnarray}
2035:         {\cal Q}_{\k+\b,\k}=\left(\cal S_{\k,\k+\b}\right)^{-1}\;,
2036: \end{eqnarray}
2037: %
2038: where
2039: %
2040: \begin{eqnarray}
2041:         \left(\cal S_{\k,\k+\b}\right)_{nm}=\langle u_{n\k}
2042:         | u_{m,\k+\b}\rangle\;.
2043:         \label{eq:overlap}
2044: \end{eqnarray}
2045: %
2046: The discretization (\ref{eq:dis}) is immune to arbitrary gauge phases
2047: and unitary rotations among the occupied states. Because of that
2048: property, the occurrence of band crossings and avoided crossings does
2049: not pose any special problems.
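
The reason Eq.~(\ref{eq:dis}) yields a covariant derivative can be sketched
in two steps. The argument below is only an illustration; it assumes that
the shell weights satisfy $\sum_{\b}w_b\,b_\alpha=0$ (shells containing
both $\b$ and $-\b$) and
$\sum_{\b}w_b\,b_\alpha b_\beta=\delta_{\alpha\beta}$.\cite{marzari97}
First, the dual states are dual to the occupied states at $\k$,
%
\begin{equation}
\langle u_{m\k}|\widetilde u_{n,\k+\b}\rangle=
\left({\cal S}_{\k,\k+\b}\,{\cal Q}_{\k+\b,\k}\right)_{mn}=\delta_{mn},
\end{equation}
%
so that $|\widetilde u_{n,\k+\b}\rangle=|u_{n\k}\rangle+
\hat{Q}_\k^{(N_\k)}|\widetilde u_{n,\k+\b}\rangle$. Second, to first order
in $\b$ the remainder is
$\hat{Q}_\k^{(N_\k)}|\widetilde u_{n,\k+\b}\rangle\simeq
\sum_\beta b_\beta\,|\widetilde\partial_\beta^{(N_\k)}u_{n\k}\rangle$.
Since $\hat P_{\k,\b}^{(N_\k)}|u_{n\k}\rangle=|\widetilde u_{n,\k+\b}\rangle$,
and the term proportional to $|u_{n\k}\rangle$ drops out because
$\sum_{\b}w_b\,b_\alpha=0$, one is left with
%
\begin{equation}
\sum_{\b}w_b\,b_\alpha\,\hat P_{\k,\b}^{(N_\k)}|u_{n\k}\rangle\simeq
\sum_\beta\Big(\sum_{\b}w_b\,b_\alpha b_\beta\Big)
|\widetilde\partial_\beta^{(N_\k)}u_{n\k}\rangle
=|\widetilde\partial_\alpha^{(N_\k)}u_{n\k}\rangle .
\end{equation}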
2050: 
2051: Inserting Eqs.~(\ref{eq:dis}-\ref{eq:overlap}) into  
2052: Eq.~(\ref{eq:partial_trace}) and using 
2053: $\cal Q_{\k,\k+\b}=\cal Q^\dagger_{\k+\b,\k}$, we find that an
2054: appropriate finite-difference expression for the total Berry
2055: curvature is
2056: %
2057: \begin{equation}
2058: \Omega_{\alpha \beta}^{(N_\k)}(\k)=
2059: 2\sum_{\b_1,\b_2}w_{b_1}\,w_{b_2}\,b_{1,\alpha}\,b_{2,\beta}
2060: \,\Lambda_{\k,\b_1,\b_2},
2061:         \label{eq:omega}
2062: \end{equation}
2063: %
2064: where
2065: %
2066: \begin{equation}
2067: \Lambda_{\k,\b_1,\b_2}=
2068:        -{\rm Im}\,{\rm Tr}^{(N_\k)}\,
2069:         \left[  {\cal Q}_{\k,\k+\b_1}{\cal S}_{\k+\b_1,\k+\b_2}
2070:         {\cal Q}_{\k+\b_2,\k}\right]\,.
2071:         \label{eq:Lambda}
2072: \end{equation}
2073: %
2074: This expression is manifestly gauge-invariant, since both $\cal S$ and
2075: $\cal Q$ are gauge-covariant matrices, i.e.,
2076: ${\cal S}_{\k,\k+\b}\rightarrow{\cal U}^\dagger(\k){\cal S}_{\k,\k+\b}
2077: {\cal U}(\k+\b)$, and the same transformation law holds for
2078: ${\cal Q}_{\k,\k+\b}$.
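
The invariance can be checked explicitly. As a sketch, assume that
${\cal U}(\k)$ is unitary at each of the three mesh points involved, and
let ${\cal M}_{\k_1,\k_2}$ stand for either ${\cal S}_{\k_1,\k_2}$ or
${\cal Q}_{\k_1,\k_2}=({\cal S}_{\k_2,\k_1})^{-1}$ (the symbol ${\cal M}$
is shorthand introduced only for this remark). Both transform as
${\cal M}_{\k_1,\k_2}\rightarrow
{\cal U}^\dagger(\k_1)\,{\cal M}_{\k_1,\k_2}\,{\cal U}(\k_2)$, so that
%
\begin{eqnarray}
{\cal Q}_{\k,\k+\b_1}{\cal S}_{\k+\b_1,\k+\b_2}{\cal Q}_{\k+\b_2,\k}
&\rightarrow&
{\cal U}^\dagger(\k)\,{\cal Q}_{\k,\k+\b_1}\,
{\cal U}(\k+\b_1)\,{\cal U}^\dagger(\k+\b_1)\,
{\cal S}_{\k+\b_1,\k+\b_2}
\nonumber\\
&&\times\,{\cal U}(\k+\b_2)\,{\cal U}^\dagger(\k+\b_2)\,
{\cal Q}_{\k+\b_2,\k}\,{\cal U}(\k)
\nonumber\\
&=&{\cal U}^\dagger(\k)\left[
{\cal Q}_{\k,\k+\b_1}{\cal S}_{\k+\b_1,\k+\b_2}{\cal Q}_{\k+\b_2,\k}
\right]{\cal U}(\k).
\end{eqnarray}
%
The outer factors cancel under the trace by cyclic invariance, leaving
$\Lambda_{\k,\b_1,\b_2}$ unchanged.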
2079: 
2080: Eqs.~(\ref{eq:omega}-\ref{eq:Lambda}) can be evaluated at an arbitrary point 
2081: $\k$ once the overlap 
2082: matrices ${\cal S}_{\k,\k+\b}$ are known. For that purpose we construct a 
2083: uniform mesh of spacing $\Delta k$ in the immediate vicinity of $\k$, 
2084: set up the needed shells of
2085: neighboring $k$-points $\k+\b$ on that local mesh, and then evaluate
2086: ${\cal S}_{\k,\k+\b}$ by Wannier interpolation. Since the
2087: WFs span the entire $M$-dimensional projected space, at this stage we revert
2088: to the full $M\times M$ overlap matrices $S_{\k,\k+\b}$.
2089: In the Wannier gauge they are given by a Fourier transform of the form
2090: %
2091: \begin{equation}
2092:         \left({ S}^{\rm (W)}_{\k,\k+\b}\right)_{nm}=
2093: \sum_{\R}e^{i\k \cdot \R}\langle {\bf 0} n |e^{i\b
2094: \cdot (\R-\hat\r)}|\R m\rangle\;.
2095: \end{equation}
2096: %
2097: For sufficiently small $\Delta k$, this can be approximated as
2098: %
2099: \begin{equation}
2100: \left(S^{\rm (W)}_{\k,\k+\b}\right)_{nm}
\simeq\delta_{nm}-i\b\cdot\sum_{\R}e^{i\k \cdot \R}\langle {\bf 0} n |\hat\r|\R m\rangle\;.
2102: \end{equation}
2103: %
2104: Note that the dependence of the last expression on $\Delta k$ is trivial,
since it enters only as a multiplicative prefactor. In practice one
2106: chooses $\Delta k$ to be quite small, $\sim10^{-6}$\,a.u.$^{-1}$,
2107: so as to reduce the error of the finite-differences expression.
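
The intermediate step in the small-$\Delta k$ approximation is short. It is
spelled out here assuming only that the WFs are orthonormal,
$\langle{\bf 0}n|\R m\rangle=\delta_{\R,{\bf 0}}\,\delta_{nm}$.
Expanding the exponential to first order in $\b$,
%
\begin{equation}
e^{i\b\cdot(\R-\hat\r)}\simeq 1+i\,\b\cdot\R-i\,\b\cdot\hat\r ,
\end{equation}
%
the first two terms contribute
$\sum_\R e^{i\k\cdot\R}\,(1+i\,\b\cdot\R)\,
\langle{\bf 0}n|\R m\rangle=\delta_{nm}$,
because only $\R={\bf 0}$ survives the sum and $\b\cdot\R$ vanishes there.
The third term gives the matrix elements of $\hat\r$ between WFs that
appear in the approximate expression.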
2108: 
2109: In the Wannier gauge the occupied and empty states are mixed with one another,
2110: because the WFs are partially occupied. In order to decouple the two subspaces
2111: we perform the unitary transformation
2112: %
2113: \begin{eqnarray}
2114:         {S}_{\k,\k+\b}^{\rm (H)}=U^{\dagger}(\k){S}_{\k,\k+\b}^{\rm (W)}
2115: U({\k+\b})\;.
2116: \label{eq:decouple}
2117: \end{eqnarray}
2118: %
2119: This produces the full $M\times M$ overlap matrix in the Hamiltonian
2120: gauge. The $N_\k\times N_\k$ submatrix in the upper left corner is
2121: precisely the matrix ${\cal S}_{\k,\k+\b}\ph$ needed in Eq.~(\ref{eq:Lambda}).
2122: 
2123: Like the approach described in the main text, this approach
2124: still only requires the WF matrix elements of the
2125: four operators $\hat{H}$ and $\hat{r}_\alpha$ ($\alpha=x$, $y$, and $z$).
2126: We have implemented it, and have checked that the
results agree closely with those obtained
using the method of the main text. Although not as
2129: elegant, this approach has the interesting feature
2130: of circumventing the evaluation of the matrix $D_\alpha\ph$,
2131: Eq.~(\ref{eq:ddef}). This may
2132: be advantageous in certain special situations. For example, if a
2133: parameter such as pressure is tuned in such a way that a $k$-space
2134: Dirac monopole\cite{fang03} drifts to the Fermi surface,
2135: the vanishing of the energy denominator in Eq.~(\ref{eq:ddef}) may result
2136: in a numerical instability when trying to find the monopole
2137: contribution to the AHC.
2138: 
2139: We conclude by noting that Eq.~(\ref{eq:Lambda}) is but one of many
2140: possible finite-differences expressions, and may not even be the most 
2141: convenient one to use
2142: in practice. By recalling that the Berry curvature is the Berry phase
2143: per unit area, one realizes that in the small-$\Delta k$ limit of
2144: interest, the quantity $\Lambda_{\k,\b_1,\b_2}$ in Eq.~(\ref{eq:omega})
2145: can be viewed as the discrete Berry phase $\phi$ accumulated along the 
2146: small loop
$\k\rightarrow \k+\b_1\rightarrow\k+\b_2\rightarrow\k$. As is well known, the
2148: Berry phase around a discrete loop is defined as\cite{ksv93}
2149: %
2150: \begin{equation}
2151: \phi=-\,{\rm Im}\,\ln\det
2152: \left[
2153:   {\cal S}_{\k,\k+\b_1}{\cal S}_{\k+\b_1,\k+\b_2}{\cal S}_{\k+\b_2,\k}
2154: \right]\;.
2155: \label{eq:dis_berry}
2156: \end{equation}
2157: %
2158: It can be shown 
2159: that $\phi=\Lambda_{\k,\b_1,\b_2}+
2160: {\cal O}(\Delta k^2)$, so that for small loops the two formulas agree. 
2161: Eq.~(\ref{eq:dis_berry}) has the practical advantage over
2162: Eq.~(\ref{eq:Lambda}) that it does not require 
2163: inverting the overlap matrix.
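
A brief sketch of why the two expressions agree uses only the relation
${\cal S}_{\k+\b,\k}={\cal S}^\dagger_{\k,\k+\b}$, which follows from
Eq.~(\ref{eq:overlap}). The matrices ${\cal X}$ and $\delta$ below are
auxiliary notation introduced for this argument. Define
%
\begin{equation}
{\cal X}={\cal Q}_{\k,\k+\b_1}{\cal S}_{\k+\b_1,\k+\b_2}{\cal Q}_{\k+\b_2,\k}
=\left({\cal S}^\dagger_{\k,\k+\b_1}\right)^{-1}
{\cal S}_{\k+\b_1,\k+\b_2}
\left({\cal S}^\dagger_{\k+\b_2,\k}\right)^{-1}.
\end{equation}
%
Because $\det\,({\cal S}^\dagger)^{-1}=1/(\det{\cal S})^*$ has the same phase
as $\det{\cal S}$, Eq.~(\ref{eq:dis_berry}) can be rewritten (modulo $2\pi$) as
$\phi=-{\rm Im}\,\ln\det{\cal X}=-{\rm Im}\,{\rm Tr}^{(N_\k)}\ln{\cal X}$,
whereas Eq.~(\ref{eq:Lambda}) reads
$\Lambda_{\k,\b_1,\b_2}=-{\rm Im}\,{\rm Tr}^{(N_\k)}{\cal X}$.
Writing ${\cal X}=1+\delta$ with $\delta={\cal O}(\Delta k)$ and expanding
the logarithm,
%
\begin{equation}
\phi-\Lambda_{\k,\b_1,\b_2}=
-{\rm Im}\,{\rm Tr}^{(N_\k)}\left[\ln(1+\delta)-\delta\right]
=\frac{1}{2}\,{\rm Im}\,{\rm Tr}^{(N_\k)}\,\delta^2+\ldots
={\cal O}(\Delta k^2).
\end{equation}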
2164: 
2165: 
2166: 
2167: %**************************************************************
2168: \begin{thebibliography}{10}
2169: 
2170: 
2171: \bibitem{hurd72} C.M. Hurd, {\em The Hall effect in metals and alloys}
2172: (Plenum, New York, 1972).
2173: 
2174: %\bibitem{pugh30} E.M. Pugh, Phys. Rev. 36 (1930) 1503.
2175: %(Hall Effect and the Magnetic Properties of Some Ferromagnetic Materials)
2176: 
2177: \bibitem{gerber02} A. Gerber {\it et al.}, J. Magn. Magn. Mat. {\bf 242}, 90 (2002).
2178: 
2179: \bibitem{karplus54} R. Karplus and J.M. Luttinger, Phys. Rev. {\bf 95}, 1154 (1954).
2180: %(Hall effect in ferromagnetics)
2181: 
2182: \bibitem{smit58} J. Smit, Physica {\bf 24}, 39 (1958).
2183: 
\bibitem{berger70} L. Berger, Phys. Rev. B {\bf 2}, 4559 (1970).
2185: 
2186: \bibitem{adams59} E.N. Adams and E.I. Blount, J. Phys. Chem. Solids
2187:   {\bf 10}, 286 (1959).
2188: 
2189: \bibitem{chang96} M.-C. Chang and Q. Niu, Phys. Rev. B {\bf 53}, 7010 (1996).
2190: 
2191: \bibitem{sundaram99} G. Sundaram and Q. Niu, Phys. Rev. B {\bf 59}, 14915 (1999).
2192: 
2193: \bibitem{onoda02} M. Onoda and N. Nagaosa, J. Phys. Soc. Jpn. {\bf 71}, 19
2194:   (2002).
2195: 
2196: \bibitem{jungwirth02} T. Jungwirth, Q. Niu, and A.H. MacDonald,
2197:   Phys. Rev. Lett. {\bf 88}, 207208 (2002).
2198: 
2199: \bibitem{haldane04} F.D.M. Haldane, Phys. Rev. Lett. {\bf 93}, 206602 (2004).
2200: 
2201: \bibitem{thouless82}D.J. Thouless, M. Kohmoto, M.P. Nightingale, and
2202: M. den Nijs, Phys. Rev. Lett. {\bf 49}, 405 (1982).
2203: 
2204: \bibitem{ksv93} R.D. King-Smith and D. Vanderbilt,
2205: Phys. Rev. B {\bf 47}, 1651 (1993).
2206: 
2207: \bibitem{fang03} Z. Fang {\it et al.}, Science {\bf 302}, 92 (2003).
2208: 
2209: \bibitem{yao04} Y. Yao {\it et al.}, Phys. Rev. Lett. {\bf 92}, 037204 (2004).
2210: 
2211: \bibitem{berry84}M.V. Berry, Proc. R. Soc. London Ser. A {\bf 392}, 45 (1984).
2212: 
2213: \bibitem{blount62} E.I. Blount, Solid State Phys. {\bf 13}, 305 (1962).
2214: 
2215: \bibitem{souza01} I. Souza, N. Marzari, and D. Vanderbilt,
Phys. Rev. B {\bf 65}, 035109 (2001).
2217: 
2218: \bibitem{graf95} M. Graf and P. Vogl, Phys. Rev. B {\bf 51}, 4940 (1995).
2219: 
2220: \bibitem{marzari97} N. Marzari and D. Vanderbilt, Phys.
2221: Rev. B {\bf 56}, 12847 (1997).
2222: 
2223: \bibitem{pwscf}S. Baroni, A. Dal Corso, S. de Gironcoli, P.
2224: Giannozzi, C. Cavazzoni, G. Ballabio, S. Scandolo, G. Chiarotti,
2225: P. Focher, A. Pasquarello, K. Laasonen, A. Trave, R. Car,
2226: N.~Marzari, A. Kokalj, {\tt http://www.pwscf.org/}.
2227: 
2228: \bibitem{perdew96}J.P. Perdew, K. Burke, and M. Ernzerhof, Phys.
2229: Rev. Lett. {\bf 77}, 3865 (1996).
2230: 
2231: \bibitem{soc2}G. Theurich and N.A. Hill, Phys. Rev. B {\bf 64}, 073106 (2001).
2232: 
2233: \bibitem{dalcorso05} A. Dal Corso and A.M. Conte, Phys. Rev. B {\bf 71},
2234: 115106 (2005).
2235: 
2236: \bibitem{nlcc}
S.G. Louie, S. Froyen, and M.L. Cohen,
Phys. Rev. B {\bf 26}, 1738 (1982).
2239: 
2240: \bibitem{mp}
2241: H.J. Monkhorst and J.D. Pack,
Phys. Rev. B {\bf 13}, 5188 (1976).
2243: 
2244: \bibitem{coldsmear} N. Marzari, D. Vanderbilt, A. De Vita, and M.C. Payne,
Phys. Rev. Lett. {\bf 82}, 3296 (1999).
2246: 
2247: \bibitem{wannier} A.A. Mostofi, J.R. Yates, N. Marzari, I. Souza,
2248: and D.~Vanderbilt,
2249: {\tt http://www.wannier.org/}. 
2250: 
2251: \bibitem{Pauling31}
2252: L. Pauling, J. Am. Chem. Soc. {\bf 53}, 1367 (1931).
2253: 
2254: \bibitem{bachelet82} G.B. Bachelet and M. Schl\"uter, 
2255: Phys. Rev. B {\bf 25}, 2103 (1982).
2256: 
2257: \bibitem{singh73} M. Singh, C.S. Wang and J. Callaway, Phys. Rev.
2258: B {\bf 11}, 287 (1975).
2259: 
2260: \bibitem{explan-kubo}
2261: %An advantage of the Kubo formula, however, is that it can
2262: %easily be extended to obtain the frequency-dependence of the AHC.
2263: 
2264: An advantage of the Kubo formula is that it can
2265: easily be extended to obtain the frequency-dependence of the AHC.
2266: 
2267: \bibitem{singh} D.J. Singh, {\it Planewaves, Pseudopotentials
2268: and the LAPW Method }(Kluwer Academic, Boston, 1994).
2269: 
2270: \bibitem{garm01} T.G. Pedersen, K. Pedersen, and
2271: T.B. Kriestensen, Phys. Rev. B {\bf 63}, 201101(R) (2001).
2272: 
2273: \bibitem{boykin01} T.B. Boykin, R. C. Bowen, and G. Klimeck,
2274: Phys. Rev. B {\bf 63}, 245314 (2001).
2275: 
2276: \bibitem{foreman02} B.A. Foreman, Phys. Rev. B {\bf 66}, 165212 (2002).
2277: 
2278: \bibitem{mead92} C.A. Mead, Rev. Mod. Phys. {\bf 64}, 51 (1992).
2279: 
2280: \bibitem{sai02} N. Sai, K.M. Rabe, and D. Vanderbilt,
2281: Phys. Rev. B {\bf 66}, 104108 (2002).
2282: 
2283: \bibitem{souza04} I. Souza, J. \'I\~niguez, and D. Vanderbilt,
2284: Phys. Rev. B {\bf 69}, 085106 (2004).
2285: 
2286: \bibitem{giustino} F. Giustino, J.R. Yates, I. Souza, M.L. Cohen, and
2287: S.G. Louie (unpublished).
2288: 
2289: \end{thebibliography}
2290: 
2291: %%
2292: \end{document}
2293: