1: \documentclass{emulateapj}
2: 
3: \shorttitle{DISSECTING THE GRAVITATIONAL LENS B1608+656. I.}
4: \shortauthors{Suyu et al.}
5: 
6: 
7: \usepackage{natbib}
8: \input{B1608acsAnalysis_macro}
9:  
10: %===============================================================================  
11: 
12: \begin{document}
13: 
14: \title{DISSECTING THE GRAVITATIONAL LENS B1608+656. I. LENS POTENTIAL RECONSTRUCTION\altaffilmark{*}}
15: 
16: %% Use \author, \affil, and the \and command to format
17: %% author and affiliation information.
18: %% Note that \email has replaced the old \authoremail command
19: %% from AASTeX v4.0. You can use \email to mark an email address
20: %% anywhere in the paper, not just in the front matter.
21: %% As in the title, use \\ to force line breaks.
22: 
23: \author{S.~H.~Suyu\altaffilmark{1,2,3},     
24:         P.~J.~Marshall\altaffilmark{4},
25:         R.~D.~Blandford\altaffilmark{1,2}, 
26:         C.~D.~Fassnacht\altaffilmark{5},
27:         L.~V.~E.~Koopmans\altaffilmark{6}, \\
28:         J.~P.~McKean\altaffilmark{5,7}, and
29:         T.~Treu\altaffilmark{4,8}} 
30: 
31: 
32: \email{suyu@astro.uni-bonn.de}
33: \altaffiltext{*}{Based in part on observations made with the NASA/ESA \textit{Hubble Space Telescope}, obtained at the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under 
34: NASA contract NAS 5-26555. These observations are associated with program 
35: GO-10158.}
36: \altaffiltext{1}{Theoretical Astrophysics, 103-33, California Institute of Technology, Pasadena, CA, 91125, USA}
37: \altaffiltext{2}{Kavli Institute for Particle Astrophysics and Cosmology, Stanford University, PO Box 20450, MS 29, Stanford, CA 94309, USA}
38: \altaffiltext{3}{Argelander-Institut f\"{u}r Astronomie, Auf dem H\"{u}gel 71, D-53121 Bonn, Germany}
39: \altaffiltext{4}{Department of Physics, University of California, Santa Barbara, CA 93106-9530, USA}
40: \altaffiltext{5}{Department of Physics, University of California at Davis, 1 Shields Avenue, Davis, CA 95616, USA}
41: \altaffiltext{6}{Kapteyn Astronomical Institute, University of Groningen, P.O.Box800, 9700AV Groningen, The Netherlands}
42: \altaffiltext{7}{Max-Planck-Institut f\"{u}r Radioastronomie, Auf dem H\"{u}gel 69, D-53121 Bonn, Germany}
43: \altaffiltext{8}{Sloan Fellow, Packard Fellow} 
44: 
45: 
46: \begin{abstract}
47:   Strong gravitational lensing is a powerful technique for probing
48:   galaxy mass distributions and for measuring cosmological parameters.
49:   Lens systems with extended source-intensity distributions are
50:   particularly useful for this purpose since they provide additional
51:   constraints on the lens potential (mass distribution).  We present a
52:   pixelated approach to modeling the lens potential and
53:   source-intensity distribution simultaneously.  The method makes iterative and
54:   perturbative corrections to an initial potential model.  For systems
55:   with sources of sufficient extent such that the separate lensed
56:   images are connected by intensity measurements, the accuracy in the
57:   reconstructed potential is solely limited by the quality of the
58:   data.  We apply this potential reconstruction technique to deep 
59:   \textit{Hubble Space Telescope}
60:   observations of B1608+656, a four-image gravitational lens system
61:   formed by a pair of interacting lens galaxies.  We present a
62:   comprehensive Bayesian analysis of the system that takes into
63:   account the extended source-intensity distribution, dust extinction,
64:   and the interacting lens galaxies.  Our approach allows us to
65:   compare various models of the components of the lens system, which
66:   include the point-spread function (PSF), dust, lens galaxy light,
67:   source-intensity distribution, and lens potential.  Using optimal
68:   combinations of the PSF, dust, and lens galaxy light models, we
69:   successfully reconstruct both the lens potential and the extended
70:   source-intensity distribution of B1608+656.  The resulting
71:   reconstruction can be used as the basis of a measurement of the
72:   Hubble constant.  As an illustration of the astrophysical
73:   applications of our method, we use our reconstruction of the
74:   gravitational potential to study the relative distribution of mass
75:   and light in the lensing galaxies. We find that the mass-to-light
76:   ratio for the primary lens galaxy is $(2.0\pm0.2)h \rm{\, M_{\sun}
77:     \, L_{B,\sun}^{-1}}$ within the Einstein radius ($3.9 h^{-1}\,
78:   \rm{kpc}$), in agreement with what is found for noninteracting lens
79:   galaxies at the same scales.
80: \end{abstract}
81: 
82: 
83: 
84: %% Keywords should appear after the \end{abstract} command. The uncommented
85: %% example has been keyed in ApJ style. See the instructions to authors
86: %% for the journal to which you are submitting your paper to determine
87: %% what keyword punctuation is appropriate.
88: \keywords{gravitational lensing: general --- gravitational lensing: individual (B1608+656) --- methods: data analysis --- galaxies: elliptical and lenticular, cD --- galaxies: structure}
89: 
90: 
91: 
92: %-------------------------------------------------------------------------------
93: 
94: 
95: \section{Introduction}
96: \label{sec:intro}
97: 
98: \setcounter{footnote}{8}
99: 
100: Strong gravitational lens systems provide a tool for probing galaxy
101: mass distributions (independent of their light profiles) and for
102: measuring cosmological parameters \citep*[e.g.,][and references
103: therein]{KochanekEtal06}.  Lens systems with extended source-intensity
104: distributions are of special interest because they provide additional
105: constraints on the lens potential (and hence the surface mass density)
106: due to surface brightness conservation.  In this case, simultaneous
107: determination of the source-intensity distribution and the lens
108: potential is needed.  To describe either the source-intensity or the
109: lens potential/mass distribution, there are two approaches in the
110: literature: (1) ``parametric,'' or better, ``simply parameterized,''
111: using simple, physically motivated functional forms described by a few
112: ($\sim 10$) parameters (e.g.,
113: \citeauthor{Kochanek91} \citeyear{Kochanek91};
114: \citeauthor{KneibEtal96} \citeyear{KneibEtal96}; 
115: \citeauthor{Keeton01} \citeyear{Keeton01}; 
116: \citeauthor{Marshall06} \citeyear{Marshall06};
117: \citeauthor{JulloEtal07} \citeyear{JulloEtal07}),
118: and (2) pixel-based (``pixelated,'' or ``free-form,'' or sometimes, 
119: inaccurately, ``nonparametric'') modeling on a grid, which has been 
120: done for both the source intensity (e.g.,
121: \citeauthor{WallingtonEtal96} \citeyear{WallingtonEtal96};
122: \citeauthor{WarrenDye03} \citeyear{WarrenDye03};
123: \citeauthor{TreuKoopmans04} \citeyear{TreuKoopmans04};
124: \citeauthor{DyeWarren05} \citeyear{DyeWarren05};
125: \citeauthor{Koopmans05} \citeyear{Koopmans05};
126: \citeauthor{BrewerLewis06} \citeyear{BrewerLewis06};
127: \citeauthor{SuyuEtal06} \citeyear{SuyuEtal06};
128: \citeauthor{WaythWebster06} \citeyear{WaythWebster06};
129: \citeauthor{DyeEtal08} \citeyear{DyeEtal08}) 
130: and the lens potential/mass distribution (e.g., 
131: \citeauthor{WilliamsSaha00} \citeyear{WilliamsSaha00};
132: \citeauthor{BradacEtal05} \citeyear{BradacEtal05};
133: \citeauthor{Koopmans05} \citeyear{Koopmans05};
134: \citeauthor{SahaEtal06} \citeyear{SahaEtal06};
135: \citeauthor{SuyuBlandford06} \citeyear{SuyuBlandford06};
136: \citeauthor{JeeEtal07} \citeyear{JeeEtal07};
137: \citeauthor{VegettiKoopmans08} \citeyear{VegettiKoopmans08}). 
138: Most of the developed lens modeling methods are simply parameterized.
139: In particular, for the measurement of the Hubble constant, lens
140: potential/mass models prior to \citet{SahaEtal06} have been
141: simply parameterized because most of the strong lens systems with time
142: delay measurements have only point sources (as opposed to extended
143: sources) to constrain the lens potential/mass distribution.  A precise
144: measurement of the value of $H_0$ is important for testing the flat
145: $\Lambda$-cold dark matter (CDM) model and studying dark energy.  The cosmic microwave
146: background (CMB) allows determination of cosmological parameters with
147: high accuracy with the exception of $H_0$ \citep[e.g.,][]{KomatsuEtal08}.  An
148: independent measurement of $H_0$ to better than a few percent
149: precision provides the single most useful complement to the CMB for
150: dark energy studies \citep{Hu05}.
151: 
152: \citet{Koopmans05} developed a method for pixelated source-intensity
153: and lens potential reconstruction that is based on the potential
154: correction scheme proposed by \citet*{BlandfordEtal01}.  This
155: pixelated potential reconstruction method is applicable to lens
156: systems with extended source-intensity distributions.  Pixel-based
157: modeling has the advantage over simply-parameterized modeling of
158: greater flexibility in the parametrization.  This is especially important in
159: complex lens systems (e.g., multicomponent source galaxies or multiple
160: lens galaxies), where simply-parameterized models may become
161: inadequate.  Furthermore, pixel-based modeling is capable of
162: detecting dark matter substructures \citep{Koopmans05, VegettiKoopmans08}.
163: 
164: In this paper, we present a lens modeling technique that is similar to
165: that of \citet{Koopmans05}, but in a Bayesian framework to allow quantitative
166: comparison between various source intensity and lens potential models.
167: The point-spread function (PSF), lens galaxy light, and dust models
168: are also incorporated in this scheme.  Therefore, this method provides
169: a way to rank these data models (with the five interdependent
170: components: source-intensity distribution, lens potential, PSF, lens
171: galaxy light and dust) quantitatively.  There are also propagation
172: effects due to structures along the line of sight (LOS), but we ignore these
173: for now and characterize them in a forthcoming paper (Paper II).
174: 
175: We choose to reconstruct the lens potential instead of the surface
176: mass density because (1) it is the quantity that directly relates to
177: the cosmological parameters via the time delays and angular diameter
178: distance ratios, and (2) the surface mass density can, in principle, be
179: easily obtained by differentiation.  In contrast,
180: \citet{WilliamsSaha00} and \citet{SahaEtal06} pixelized the surface
181: mass density.  Since the surface mass density over the entire lens
182: plane is required in the integral for obtaining the lens potential,
183: the conversion of the (finite) gridded mass density to the lens
184: potential is not straightforward.
185: 
186: 
187: We apply the pixelated potential reconstruction method to B1608+656
188: \citep{MyersEtal95}, a quadruple image gravitational lens system with
189: an extended source at $z_{\rm s}= 1.394$ \citep{FassnachtEtal96}, and
190: two interacting galaxy lenses at $z_{\rm d}= 0.6304$
191: \citep{MyersEtal95}.  B1608+656 is special in that it is the only
192: four-image gravitational lens system with all three independent time
193: delays between the images measured with errors of only a few percent
194: \citep{FassnachtEtal99,FassnachtEtal02}.  Thus, it provides a great
195: opportunity to measure the Hubble constant, which is the subject of
196: Paper II.  To obtain the Hubble constant to high precision, an
197: accurate lens potential model is crucial.  \citet{KoopmansFassnacht99}
198: modeled this system using simply-parameterized lens potentials, but
199: did not account for the presence of dust and the extended source
200: intensity.  \citet{KoopmansEtal03} improved on the
201: simply-parameterized modeling of the lens potential with the treatment
202: of dust, the use of a simply-parameterized extended source-intensity
203: distribution, and the inclusion of constraints from stellar dynamics.
204: However, \citet{SuyuBlandford06} showed that this most up-to-date
205: simply-parameterized lens model in \citet{KoopmansEtal03} fails
206: certain tests such as the crossing of the critical curve through the
207: saddle point of the figure-eight-shaped intensity contour of the
208: merging images.  This suggests that the pixelated potential method may
209: be better suited than a simply-parameterized method for the two
210: interacting galaxies.  In this paper, we deliver a comprehensive
211: analysis of the B1608+656 system that incorporates the effects of the
212: extended source intensity, presence of dust, and interacting lenses.
213: The dissection of B1608+656 allows us to study the relative
214: distribution of mass and light in the interacting lens galaxies.
215:  
216: The outline of the paper is as follows.  In Section
217: \ref{sec:PPRMethod}, we introduce the pixelated potential
218: reconstruction method.  We demonstrate the method using simulated data
219: in Section \ref{sec:PPRMethod:demo} and generalize the method to real
220: data in Section \ref{sec:PPRMethod:realData}.  The remaining sections
221: of the paper target B1608+656.  In Section \ref{sec:ImProc}, we
222: summarize the \textit{Hubble Space Telescope} (\HST) observations of
223: B1608+656 and present the image processing.  In Section
224: \ref{sec:PotRec:B1608}, we show a pixelated potential reconstruction
225: of B1608+656.  Finally, in Section \ref{sec:B1608prop}, we comment on
226: the mass-to-light (M/L) ratio in B1608+656 based on the results of our
227: lensing analysis.  In Paper II, we use the resulting potential
228: reconstruction of B1608+656 together with a study of the lens
229: environment to infer the value of the Hubble constant.
230: 
231: Throughout this paper, we assume a flat $\Lambda$-CDM universe with
232: $\Omega_{m}=0.26$, $\Omega_{\Lambda}=0.74$, and $H_0 = 100h \mathrm{\,
233:   km\, s^{-1}\, Mpc^{-1}}$ \citep{KomatsuEtal08}.  For the lens and
234: source redshifts in B1608+656, $1''$ on the sky corresponds to $4.9
235: h^{-1}\, \rm{kpc}$ on the lens plane and $6.1 h^{-1}\, \rm{kpc}$ on
236: the source plane.
237: 
238: % ------------------------------------------------------------------------------
239: 
240: \section{Pixelated potential reconstruction}
241: \label{sec:PPRMethod}
242: In the following subsections, we present the pixelated potential
243: reconstruction method.  Section \ref{sec:PPRMethod:method} contains
244: the formalism of the method, and Section \ref{sec:PPRMethod:matrix}
245: describes a practical implementation of the method.
246: 
247: 
248: % - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
249: 
250: \subsection{Formalism for iterative and perturbative potential corrections}
251: \label{sec:PPRMethod:method}
252: The iterative and perturbative potential correction scheme for lens
253: systems with extended sources was first suggested by
254: \citet{BlandfordEtal01} and studied by \citet{Koopmans05},
255: \citet{SuyuBlandford06}, and recently by \citet{VegettiKoopmans08}.  The pixelated potential reconstruction
256: method that we present here is similar to that in \citet{Koopmans05}
257: but differs in the numerical details and our use of Bayesian analysis,
258: which allows for model comparison.  The method in \citet{VegettiKoopmans08} 
259: is also based on Bayesian analysis and has adaptive gridding on the 
260: source plane.  In the rest of the section, we
261: briefly outline the theory of pixelated potential reconstruction.
262: 
263: The central concept for this method is to start with an initial lens
264: potential model and to correct it, perturbatively and iteratively, to
265: obtain an estimate of the true lens potential.  The initial lens
266: potential will usually be simply-parameterized (to allow faster
267: convergence with a smaller number of parameters) and ideally would be
268: close to the true potential.  It will then be refined via corrections
269: on a grid of pixels.  Obtaining the parameter values in the initial
270: lens potential is often a nonlinear process; in contrast, the
271: potential correction in each iteration is a linear inversion.
272: 
273: One way to think about this procedure is to observe that in a
274: perfectly observed image, nested intensity contours in the source
275: plane map onto multiple regions of the image plane.  Intensity is
276: preserved by the lens and so the map is from a set of single source
277: contours to the corresponding image contours.  The only freedom that
278: we have is to slide image points along the contours.  Using the fact
279: that the deflection field is curl-free effectively removes this
280: freedom.  What we describe is a procedure to determine this map that
281: takes into account a finite PSF, dust extinction, and source-intensity
282: contamination by the lens galaxy light.  In Paper II, we also include
283: the influence of propagation effects.
284: 
285: To keep the formalism simple for the moment, let us ignore the effects
286: of the PSF, dust extinction, and lens galaxy light.  Let
287: $\bmath{\theta}$ be the coordinates on the image plane and
288: $\bmath{\beta}$ be the coordinates on the source plane.  Let
289: $I_{\rm d}(\bmath{\theta})$ be the observed image intensity of a lensed
290: extended source, and let $\psi(\bmath{\theta})$ be an initial scaled
291: surface potential model\footnote{$\psi$ includes the distance ratio.}
292: for the lens system.  Given $\psi(\bmath{\theta})$, one can obtain the
293: best-fitting source-intensity distribution \citep[e.g.,][and
294: references therein]{ SuyuEtal06}.  Let
295: $I_{\rm s}(\bmath{\theta}(\bmath{\beta}))$ be the source intensity
296: translated to the image plane via the potential model,
297: $\psi({\bmath{\theta}})$, where $\bmath{\theta}$ and $\bmath{\beta}$
298: are related via the lens equation $\bmath{\beta}=\bmath{\theta}-
299: \bmath{\nabla}\psi({\bmath{\theta}})$.  We define the intensity
300: deficit (also known as the image residual) on the image plane by
301: \be \label{eq:dI} 
302: \delta I(\bmath{\theta}) = I_{\rm d}(\bmath{\theta}) - I_{\rm s}(\bmath{\theta}(\bmath{\beta})).
303: \ee
304: 
305: Suppose the initial lens potential model is perturbed from the true
306: potential, $\psi_{0}(\bmath{\theta})$, by $\delta \psi(\bmath{\theta})$:
307: \be \label{eq:potmodel}
308: \psi(\bmath{\theta}) = \psi_{0}(\bmath{\theta}) + \delta \psi(\bmath{\theta}).
309: \ee
310: For a given image (fixed $I_{\rm d}(\bmath{\theta})$) and the initial
311: potential model $\psi(\bmath{\theta})$, we can relate the intensity
312: deficit to the potential perturbation $\delta \psi(\bmath{\theta})$ by
313: \be \label{eq:pertEq}
314: \delta I(\bmath{\theta}) = \frac{\partial I_{\rm s}(\bmath{\beta})}{\partial \bmath{\beta}} \boldsymbol{\cdot} \frac {\partial \delta \psi(\bmath{\theta})}{\partial \bmath{\theta}},
315: \ee
316: to first order in $\delta \psi(\bmath{\theta})$  
317: (see e.g., \citet{SuyuBlandford06} for details). 
318: The source-intensity
319: gradient ${\partial I_{\rm s}(\bmath{\beta})}/{\partial \bmath{\beta}}$
320: implicitly depends on the potential model $\psi(\bmath{\theta})$ since
321: the source position $\bmath{\beta}$ (where the gradient is evaluated)
322: is related to $\psi(\bmath{\theta})$ via the lens equation.  We can solve
323: Equation (\ref{eq:pertEq}) for $\delta \psi(\bmath{\theta})$ given the
324: intensity deficit and source-intensity gradients, update the initial
325: (or previous iteration's) potential model, and repeat the process of
326: source-intensity reconstruction and potential correction until the
327: potential converges to the true solution with zero intensity deficit.
328: In Section \ref{sec:PPRMethod:matrix}, we focus on solving Equation
329: (\ref{eq:pertEq}).
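
The first-order relation in Equation (\ref{eq:pertEq}) can be motivated by a
short Taylor expansion.  The following is a sketch rather than a full
derivation: it assumes the lens equation in its standard form,
$\bmath{\beta}=\bmath{\theta}-\bmath{\nabla}\psi(\bmath{\theta})$, and that the
reconstructed $I_{\rm s}$ is a good approximation to the true source
intensity.  The symbol $\bmath{\beta}_0$ is introduced here only for
illustration and denotes the source position obtained with the true potential
$\psi_0(\bmath{\theta})$.  From Equation (\ref{eq:potmodel}),
\[
\bmath{\beta}_0 = \bmath{\theta} - \bmath{\nabla}\psi_0(\bmath{\theta})
                = \bmath{\beta} + \bmath{\nabla}\delta\psi(\bmath{\theta}),
\]
so that surface brightness conservation,
$I_{\rm d}(\bmath{\theta}) = I_{\rm s}(\bmath{\beta}_0)$, gives to first order
\[
I_{\rm d}(\bmath{\theta}) \simeq I_{\rm s}(\bmath{\beta})
  + \frac{\partial I_{\rm s}(\bmath{\beta})}{\partial \bmath{\beta}}
    \boldsymbol{\cdot}
    \frac{\partial \delta\psi(\bmath{\theta})}{\partial \bmath{\theta}} .
\]
Subtracting $I_{\rm s}(\bmath{\beta})$ from both sides and using Equation
(\ref{eq:dI}) recovers Equation (\ref{eq:pertEq}).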
330: 
331: % - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
332: 
333: \subsection{Implementation of pixelated potential reconstruction}
334: \label{sec:PPRMethod:matrix}
335: 
336: \subsubsection{Probability theory}
337: \label{sec:PPRMethod:matrix:probTheory}
338: The first step in solving Equation (\ref{eq:pertEq}) for the potential
339: perturbation is to obtain the source-intensity gradients and the
340: intensity deficit, which appear in the correction equation.  We follow
341: \citet{SuyuEtal06} to obtain the source-intensity distribution on a
342: grid of pixels given the current iteration's lens potential model.  In
343: this source reconstruction approach, the data (observed image) are
344: described by the vector $\data_j$, where $j=1,\ldots, N_{\rm d}$ and
345: $N_{\rm d}$ is the number of data pixels.  The source intensity is
346: described by the vector $\sr_i$, where $i=1,\ldots, N_{\rm s}$ and
347: $N_{\rm s}$ is the number of source-intensity pixels.  The observed
348: image is related to the source intensity via $\data_j =
349: \response_{ji}\sr_i + \noise_j$, where $\response_{ji}$ is the
350: so-called blurred lensing operator (mapping matrix) that incorporates
351: the lens potential (which governs the deflection of light rays) and
352: the PSF (blurring),\footnote{Dust extinction, if present, is also
353:   included in this mapping matrix $\response_{ji}$.} and $\noise_j$ is
354: the noise in the data characterized by the covariance matrix $\imCM$.
355: In the inference of $\sr_i$, we impose a prior on $\sr_i$, which can
356: be thought of as ``regularizing'' the parameters $\sr_i$ to avoid
357: overfitting to the noise in the data.  Following \citet{SuyuEtal06},
358: we use quadratic forms of the regularization (specifically,
359: zeroth-order, gradient, and curvature forms of regularization).
360: The Bayesian inference of the source-intensity distribution ($\sr_i$)
361: given the observed image ($\data_j$) is a linear inversion and is a
362: solved problem.  Having obtained the source intensity, we can
363: calculate the intensity deficit and source-intensity gradients.
364:  
365: We pixelize the lens potential to allow for a flexible parametrization
366: scheme.  To solve Equation (\ref{eq:pertEq}), we cast it into a matrix
367: equation and invert the linear system.  To write Equation
368: (\ref{eq:pertEq}) in a matrix form, we discretize the lens potential
369: on a rectangular grid of $N_{\rm p}$ pixels (which is 
370: less than the number of data pixels $N_{\rm d}$ so that the
371: potential and source-intensity pixels are not underconstrained) and
372: denote the potential perturbation by $\dpsi_i$ where $i=1,\ldots, N_{\rm
373:   p}$.  The intensity deficit on the image grid is $\dI_j=\data_j -
374: \response_{ji}\sr_i$ where $j=1,\ldots, N_{\rm d}$ (using the notation
375: from source-intensity reconstruction, $\dataVec$, $\responseSet$ and
376: $\srVec$ are the data vector, the blurred lensing operator, and the
377: source-intensity vector, respectively).  Equation (\ref{eq:pertEq})
378: now becomes 
379: \be
380: \label{eq:pertEqMat}
381: \dIVec = \PRmatSet \dpsiVec + \noiseVec,
382: \ee
383: where $\PRmatSet$ is an $N_{\rm d} \times N_{\rm p}$ matrix that
384: incorporates the PSF, the source-intensity gradient, and the gradient
385: operator that acts on $\dpsiVec$ (see the appendix for the explicit
386: form of $\PRmatSet$), and $\noiseVec$ is the noise in the data.  The
387: above equation is equivalent to
388: \be
389: \label{eq:djWithPert}
390: \dataVec = \responseSet \srVec + \PRmatSet \dpsiVec + \noiseVec.
391: \ee
392: 
393: We can infer the potential corrections $\dpsiVec$ given the data
394: $\dataVec$, source intensity $\srVec$, and source-intensity gradients
395: that are encoded in $\PRmatSet$.  In the inference, we impose a prior
396: on $\dpsiVec$.
397: The posterior probability distribution is
398: \be
399: \label{eq:dpsiPosterior}
400: \overbrace{P(\dpsiVec|\dataVec, \responseSet, \srVec, \PRmatSet, \mu, \regSet_{\dpsi})}^{\rm{posterior}} = \frac{ \overbrace{P(\dataVec|\dpsiVec, \PRmatSet, \responseSet, \srVec)}^{\rm{likelihood}} \overbrace{P(\dpsiVec|\mu, \regSetdpsi)}^{\rm{prior}}}{\underbrace{P(\dataVec|\responseSet, \srVec, \PRmatSet, \mu, \regSetdpsi)}_{\rm{evidence}}}, 
401: \ee
402: where $\mu$ and $\regSetdpsi$ are the (fixed) strength and form of
403: regularization for the potential correction inversion,
404: and all irrelevant (in)dependences have been dropped.  
405: Modeling the noise as
406: Gaussian, the likelihood is
407: \be
408: \label{eq:dpsiLikelihood}
409: P(\dataVec|\dpsiVec, \PRmatSet, \responseSet, \srVec) = \frac{\exp(-E_{\rm{D}}(\dataVec|\dpsiVec, \PRmatSet, \responseSet, \srVec))}{Z_{\rm{D}}}, 
410: \ee
411: where 
412: \bea
413: E_{\rm{D}}(\dataVec|\dpsiVec, \PRmatSet, \responseSet, \srVec) &=& \frac{1}{2}(\dataVec-\responseSet\srVec - \PRmatSet\dpsiVec)^{\rm{T}} \imCM^{-1} \nonumber \\
414: & & {\ \ \ \ \ \ \ } (\dataVec-\responseSet\srVec - \PRmatSet\dpsiVec)\\
415: & = & \frac{1}{2}\chi^2
416: \eea
417: and $Z_{\rm{D}}$ is the normalization for the probability.  We express
418: the prior in the following form:
419: \be
420: P(\dpsiVec|\mu, \regSetdpsi) = \frac{\exp(-\mu E_{\mathrm{\dpsi}}(\dpsiVec|\regSetdpsi))}{Z_{\dpsi}(\mu)}.
421: \ee
422: We use quadratic forms of the regularizing function $E_{\mathrm{\dpsi}}$.  In
423: particular, we use the curvature form of regularization (see, for example,
424: Appendix A of \citet{SuyuEtal06} for an explicit expression of the curvature
425: form of regularization).  We use this regularization instead of the zeroth-order
426: or gradient forms because the lens potential should in general be smooth, being
427: the \textit{integral} of the surface mass density.  Curvature regularization in
428: the potential corrections effectively corresponds to zeroth-order
429: regularization in the surface mass density corrections.  This implies
430: a prior preference toward zero surface mass density corrections, thus 
431: suppressing the addition of mass to the initial mass model unless the data
432: require it.
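
This correspondence can be seen from the Poisson equation of lensing.  As a
brief reminder (with $\delta\kappa$ a symbol introduced here for the correction
to the surface mass density in units of the critical density, and
$\nabla_{\bmath{\theta}}^2$ the Laplacian on the image plane), a potential
correction changes the surface mass density by
\[
\delta\kappa(\bmath{\theta}) = \frac{1}{2} \nabla_{\bmath{\theta}}^2\,
\delta\psi(\bmath{\theta}),
\]
so a penalty on the second derivatives of $\delta\psi$ acts, up to the details
of the discretization, as a penalty on the amplitude of $\delta\kappa$ itself.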
433: 
434: Maximizing the posterior of parameters $\dpsiVec$, we obtain the most probable
435: solution
436: \be 
437: \label{eq:dpsiMP}
438: \dpsiMPVec = \hessM^{-1} \boldsymbol{D},
439: \ee
440: where
441: \bea
442: \label{eq:hessDefs}
443: \nonumber \hessM & = & \hessD + \mu\hessS, \\
444: \nonumber \hessD & \equiv & \nabla \nabla E_{\mathrm{D}}(\dpsiVec)  = \PRmatSet^{\mathrm{T}} \imCM^{-1} \PRmatSet, \\
445: \nonumber \hessS & \equiv & \nabla \nabla E_{\mathrm{\dpsi}}(\dpsiVec), \\ 
446: \nonumber \boldsymbol{D} & = & \PRmatSet^{\mathrm{T}} \imCM^{-1} (\dataVec-\responseSet\srVec), \\
447: \nonumber \rm {and\ }  \nabla & \equiv & \frac{\partial}{\partial \dpsiVec}.
448: \eea
449: The matrices $\hessM$, $\hessD$ and $\hessS$ have dimensions $N_{\rm p}
450: \times N_{\rm p}$ and are, by definition, the Hessians of the exponential
451: arguments in the posterior, the likelihood, and the prior probability
452: distributions, respectively.
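
Equation (\ref{eq:dpsiMP}) can be checked directly.  As a sketch, assume that
the quadratic regularizing function takes the form
$E_{\mathrm{\dpsi}}(\dpsiVec) = \frac{1}{2}\dpsiVec^{\mathrm{T}}\hessS\dpsiVec$,
i.e., that it is minimized at $\dpsiVec=\boldsymbol{0}$ (an assumption made
here for illustration), and that $\imCM$ is symmetric.  Setting the gradient of
the exponent of the posterior to zero gives
\[
\nabla\left(E_{\mathrm{D}} + \mu E_{\mathrm{\dpsi}}\right)
 = -\PRmatSet^{\mathrm{T}}\imCM^{-1}
    \left(\dataVec - \responseSet\srVec - \PRmatSet\dpsiVec\right)
   + \mu\hessS\dpsiVec = \boldsymbol{0},
\]
which rearranges to $(\hessD + \mu\hessS)\,\dpsiVec = \boldsymbol{D}$ and hence
to $\dpsiMPVec = \hessM^{-1}\boldsymbol{D}$.  Because the exponent is quadratic
in $\dpsiVec$, this stationary point is the unique maximum of the posterior
whenever $\hessM$ is positive definite.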
453: 
454: As discussed in detail in, for example, \citet{MacKay92} and
455: \citet{SuyuEtal06}, the evidence is irrelevant in the first level of
456: inference where we maximize the posterior of parameters $\dpsiVec$ to
457: obtain the most probable parameters $\dpsiMPVec$.  However, the
458: evidence is crucial for the second level of inference for model
459: comparison, where a model incorporates the lens potential, PSF, and
460: regularizations of both the source intensity and the potential
461: correction.  If we assert that models are equally probable a
462: priori, then the evidence gives the relative probability of the
463: model given the data.  In other words, the ratio of the evidence
464: values of two models tells us how much more probable the first model
465: is relative to the second model, if we assume that the two models are
466: a priori equally probable.  
467: Since the evidence gives only the relative probability, the data set
468: needs to be kept the same for model comparison.
469: 
470: The posterior ($P(\dpsiVec|\dataVec, \responseSet, \srVec, \PRmatSet,
471: \mu, \regSet_{\dpsi})$) and the evidence ($P(\dataVec|\responseSet,
472: \srVec, \PRmatSet, \mu, \regSetdpsi)$) in Equation
473: (\ref{eq:dpsiPosterior}) are conditional on the source-intensity
474: distribution.  Ideally, we would have an expression of the posterior
475: for both $\srVec$ and $\dpsiVec$: $P(\srVec, \dpsiVec|\dataVec,
476: \responseSet, \lambda, \regSet_{\rm S}, \PRmatSet, \mu,
477: \regSet_{\dpsi})$, where $\lambda$ and $\regSet_{\rm S}$ are, respectively, the
478: strength and form of regularization for $\srVec$.  We would also
479: obtain the evidence by marginalizing over both the source-intensity and the
480: potential correction values, $P(\dataVec|\responseSet, \lambda,
481: \regSet_{\rm S}, \PRmatSet, \mu, \regSetdpsi) = \int \mathrm{d}\srVec\,
482: \mathrm{d}\dpsiVec\, P(\dataVec|\srVec, \dpsiVec, \responseSet, \PRmatSet)
483: P(\srVec, \dpsiVec | \lambda, \regSet_{\rm S}, \mu, \regSetdpsi)$.
484: However, due to the iterative nature of the method (i.e., $\srVec$
485: and $\dpsiVec$ are not inferred simultaneously), we do not have such
486: expressions for the posterior and the evidence.  Pragmatically, we use
487: the evidence from the source reconstruction (given the corrections
488: $\dpsiVec$), $P(\dataVec | \responseSet, \dpsiVec, \lambda,
489: \regSet_{\rm S})$, for comparing the potential models, PSF and
490: regularizations. Specifically, after iterating through the source-intensity 
491: reconstructions and lens potential corrections, we use the
492: final corrected lens potential for one last source-intensity
493: reconstruction and use the evidence from this final source
494: reconstruction for comparing models.  This approximation is valid
495: provided that the probability distributions of $\dpsiVec$ and the
496: regularization constant are sharply peaked at the most probable
497: values.  \citet{SuyuEtal06} showed that the delta function
498: approximation for the regularization constant is acceptable;
499: simulations of the iterative potential reconstruction method suggest
500: that the probability of $\dpsiVec$ after the final iteration is
501: sharply peaked.  Therefore, the probability of a given potential
502: model, PSF, and form of regularization is $P(\responseSet,
503: \regSet_{\rm S}| \dataVec) \propto \int \mathrm{d}\lambda\, \mathrm{d}\dpsiVec
504: \, P(\dataVec| \responseSet,\dpsiVec, \lambda, \regSet_{\rm S})
505: P(\responseSet, \regSet_{\rm S}) \sim P(\dataVec|
506: \responseSet,\hat{\dpsiVec}, \hat{\lambda}, \regSet_{\rm S})
507: P(\responseSet, \regSet_{\rm S})$, where $\hat{\dpsiVec}$ and
508: $\hat{\lambda}$ are the most probable solutions. Assuming that all
509: models are equally probable a priori (i.e., $P(\responseSet,
510: \regSet_{\rm S})$ is constant), the evidence from the source
511: reconstruction serves as a reasonable proxy to use for model
512: comparison.
513: 
514: There is an uncertainty associated with the evidence values due to
515: finite source-intensity resolution as a result of the source
516: pixelization.  The source reconstruction region is initially chosen
517: such that the mapped source region on the image plane encloses the
518: Einstein ring.  This ensures that the source region contains the
519: entire source-intensity distribution.  Throughout the iterative
520: pixelated potential reconstruction, the source region and pixelization
521: are kept the same.  In the final source reconstruction for evidence
522: computation, the evidence value depends on the pixelated source region
523: because the goodness of fit on the image plane generally changes,
524: especially in areas of significant intensity gradients, as one shifts
525: the source region.  To estimate the uncertainty in the evidence
526: values, we perform the last source reconstruction for various source
527: regions that are shifted by a fraction of a pixel from the optimized
528: one in the potential reconstruction.  The range of the resulting
529: evidence values for the various source regions then allows us to
530: quantify the uncertainty in the evidence.  In addition to the
531: uncertainty due to source pixelization, the evidence also depends on
532: the amount of regularization on $\dpsiVec$, which is discussed in 
533: Section \ref{sec:PPRMethod:matrix:technicalities}.
534: 
535: \subsubsection{Technicalities of the pixelated potential reconstruction}
536: \label{sec:PPRMethod:matrix:technicalities}
537: 
538: Solving for the potential perturbations is very similar to solving for the
539: source-intensity distribution in \citet{SuyuEtal06} except for the following
540: technical details:
541: %\renewcommand{\theenumi}{\roman{enumi}}
542: \begin{enumerate}
543: \item In each iteration, the perturbative potential correction is
544:   obtained only in an annular region instead of over the entire lens
545:   potential grid due to the need for the source-intensity gradient
546:   (see Equation (\ref{eq:pertEq})) to be measurable.  Since the
547:   extended source intensity is only non-negligible near the Einstein
548:   ring, we only have information about the source-intensity gradients
549:   in this region.  In practice, the annular region is the mapping of
550:   the finite source reconstruction grid that encloses the extended
551:   source with a minimal number of source pixels (for computational
552:   efficiency).  The annulus of potential corrections obtained at each
553:   iteration is extrapolated for the next iteration by minimizing the
554:   curvature in the potential corrections.  This allows the shape of
555:   the annular region to change as needed when the lens potential gets
556:   corrected.  In addition, the forms of the regularization matrix, as
557:   discussed in Appendix A of \citet{SuyuEtal06}, are modified
558:   accordingly to take into account the nonrectangular reconstruction
559:   region (described in more detail in the third point below).
560: \item Since Equation (\ref{eq:djWithPert}) is a perturbative equation
561:   in $\dpsiVec$, the inversion needs to be \textit{over-regularized}
562:   to enforce a small correction in each iteration.  Empirically, we
563:   set the regularization constant, $\mu$, at roughly the peak of the
564:   function $\mu E_{\rm{\delta \psi}}$ (within a factor of 10), which
565:   corresponds to the value before which the prior dominates.  The
566:   resulting evidence value from the final source-intensity
567:   reconstruction weakly depends on the value of $\mu$, and we include
568:   this dependence in the uncertainty of the evidence value.
569: \item The potential corrections are generally nonzero at the edge of
570:   the annular reconstruction region.  This calls for slightly
571:   different structures of regularization compared to those written in
572:   Appendix A in \citet{SuyuEtal06} for source-intensity reconstruction
573:   (since the source grids are chosen to enclose the entire extended
574:   source such that edge pixels have nearly zero intensities).  The
575:   regularizations are still based on derivatives of $\dpsiVec$;
576:   however, no patching with lower derivatives should be used for the
577:   edge pixels because the zeroth-order regularization at the top/right
578:   edge will incorrectly enforce the $\dpsiVec$ values to zero in those
579:   areas.  The absence of the lower derivative patches leads to a
580:   singular regularization matrix,\footnote{Having a singular
581:     regularization matrix ($\hessS$) does not prevent one from
582:     calculating $\dpsiVec$ because the matrix for inversion
583:     ($\hessM=\hessD+\mu\hessS$) is, in general, nonsingular.} which is
584:   problematic for evaluating the Bayesian evidence for lens potential
585:   correction.  However, since we do not use the evidence values to
586:   compare the forms of regularization for the potential corrections
587:   (because we use only the curvature form) nor to compare the lens
588:   potential and PSF model, the revised structure of regularization is
589:   acceptable.  We have found this structure of regularization for
590:   potential corrections to work for various types of sources (with
591:   varying sizes, shapes, number of components, etc.).
592: \end{enumerate}
593: 
594: In the source reconstruction steps of this iterative scheme, we
595: find using simulated data that over-regularizing the source
596: reconstruction in early iterations helps the process to converge.
597: This is because initial guess potentials that are significantly
598: perturbed from the true potential often lead to highly discontinuous
599: source distributions when optimally regularized (corresponding to
600: maximal Bayesian evidence, which balances the goodness of fit and the
601: prior), and over-regularization would give a more regularized source-intensity
602: gradient for the potential correction.  Unfortunately, we do
603: not have an objective way of setting this over-regularization factor
604: for the source reconstruction.  Currently, at each source
605: reconstruction iteration, we set the over-regularization factor such
606: that the magnitude of the intensity deficit is at approximately the
607: same level as that from the optimally regularized case but with a
608: smoother source-intensity distribution for numerical derivatives.
609: This scheme ensures that we do not over-regularize when we are close
610: to the true potential.  Based on simulated test runs, the recovery of
611: the true potential depends on the amount of over-regularization.  When
612: the initial guess is far from the true potential, over-regularization
613: in the early iterations is crucial for convergence.  We find that it
614: is better to over-regularize too much than too little.  Too much
615: over-regularization simply requires more iterations to converge,
616: whereas with too little the iterations may not converge at all.
617: 
618: For each iteration of source-intensity reconstruction, there is also a
619: mask on the source plane to exclude source pixels that either (1) are
620: not mapped by that iteration's lens potential on the data grid or (2)
621: have no neighboring pixels for the computation of numerical
622: derivatives.  We generalize the regularizing function for this
623: nonrectangular reconstruction region to have the right-most and
624: top-most pixels (pixels adjacent to the edge or adjacent to the masked
625: source pixels) patched with lower derivatives as we did for the edge
626: pixels in Appendix A of \citet{SuyuEtal06}.  This patching ensures
627: that the regularization matrix is nonsingular for the evaluation of
628: the Bayesian evidence.
629: 
630: Based on simulated test runs, we find that a practical stopping
631: criterion for the iterative procedure is to terminate when the
632: relative potential corrections between all image pairs are $(\dpsi_1
633: - \dpsi_2) / (\psi_1-\psi_2) < 0.1\%$, where $1$ and $2$ label the
634: images in any pair.  After this criterion is reached, further
635: iterations give a negligible contribution to the predicted Fermat
636: potential differences between the images.
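
Written out explicitly, and assuming that the magnitudes of the ratios
are intended (the absolute values and the pair indices $i,j$ below are
notation introduced here only for illustration), the criterion for a
four-image system with images A--D reads
\be
\max_{i<j} \left| \frac{\dpsi_i - \dpsi_j}{\psi_i-\psi_j} \right| < 10^{-3},
\qquad i,j \in \{{\rm A, B, C, D}\},
\ee
so that all six image pairs must satisfy the bound before the
iterations stop.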
637: 
638: \subsubsection{Mass-sheet degeneracy}
639: \label{sec:PPRMethod:matrix:MSD}
640: 
641: The restriction to using only isophotal data implies that the
642: potential correction we obtain at each iteration may be affected by
643: the ``mass-sheet degeneracy'' \citep{FalcoEtal85}. However, the
644: addition of mass sheets is suppressed by the curvature form of the
645: regularization for the potential correction and also by the large
646: amount of over-regularization.  We refer to \citet{KochanekEtal06} and
647: Paper II for a detailed description of the mass-sheet degeneracy; here
648: we review a few key points that are relevant for the potential
649: corrections.  In essence, an arbitrary symmetric paraboloid, gradient
650: sheet, and constant can be added to the potential without changing the
651: predicted lensed image:
652: \be
653: \label{eq:MassSheetTrans}
654: \psi_{\nu}(\vec{\theta}) = \frac{1-\nu}{2} \vert \vec{\theta} \vert
655: ^2 + \vec{a}\cdot\vec{\theta} + c + \nu\psi(\vec{\theta}), 
656: \ee 
657: where $\psi(\vec{\theta})$ is the original potential,
658: $\psi_{\nu}(\vec{\theta}) $ is the transformed potential, and $\nu$,
659: $\vec{a}$, and $c$ are constants.  The constants $\vec{a}$ and ${c}$
660: have no physical effects on the lens systems as they merely change the
661: origin on the source plane (which is unknowable) and change the
662: zero point of the potential (which is not observable).  The parameter
663: $1-\nu$ refers to the amount of mass sheet, which can be seen in the
664: corresponding convergence transformation:
665: $\kappa_{\nu}(\vec{\theta})=(1-\nu)+\nu\kappa(\vec{\theta})$.  To make
666: sure we remain ``close'' to the initial potential model, we set
667: $\nu=1$ and fix three points in the corrected potential after each
668: iteration to the corresponding values of the initial potential.
669: Setting $\nu=1$ ensures that the size of the extended source intensity
670: remains approximately the same, and the three fixed points allow us to
671: solve for $\vec{a}$ and $c$ in Equation (\ref{eq:MassSheetTrans}) to
672: remove irrelevant gradient sheets and constants in the reconstructed
673: potential.  We choose the three points to be three of the four (top,
674: left, right, and bottom) locations of the annular reconstruction
675: region that are midway in thickness between the annular edges.  The
676: three points are usually chosen to be at places with lower surface
677: brightness in the ring.  This technique of ``fixing'' the mass-sheet
678: degeneracy is demonstrated in Section \ref{sec:PPRMethod:demo}
679: using simulated data.
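
As a short check of Equation (\ref{eq:MassSheetTrans}) (a sketch using
the standard relations $\kappa=\frac{1}{2}\nabla^2\psi$ and
$\vec{\beta}=\vec{\theta}-\vec{\nabla}\psi$, where $\vec{\beta}$ is the
source position), note that $\nabla^2\vert\vec{\theta}\vert^2=4$ in two
dimensions, so
\be
\kappa_{\nu}(\vec{\theta}) = \frac{1}{2}\nabla^2\psi_{\nu}(\vec{\theta})
= (1-\nu) + \nu\kappa(\vec{\theta}),
\ee
while the lens equation becomes
\be
\vec{\beta}_{\nu} = \vec{\theta} - \vec{\nabla}\psi_{\nu}(\vec{\theta})
= \nu\vec{\beta} - \vec{a}.
\ee
The source plane is therefore rescaled by $\nu$ and shifted by
$-\vec{a}$, which is why $\nu=1$ preserves the source size and
$\vec{a}$ only moves the source-plane origin.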
680: 
681: \subsubsection{Summary}
682: \label{sec:PPRMethod:matrix:summary}
683: 
684: To summarize, the steps for the iterative and perturbative potential
685: reconstruction scheme via matrices are as follows. (1) Reconstruct
686: the source-intensity distribution given the initial (or corrected)
687: lens potential based on \citet{SuyuEtal06}.  (2) Compute the
688: intensity deficit and the source-intensity gradient.  (3) Solve
689: Equation (\ref{eq:djWithPert}) for the potential corrections
690: $\dpsiVec$ in the annulus of reconstruction.  (4) Update the current
691: potential using Equation (\ref{eq:potmodel}): $\boldsymbol{\psi}_{\rm
692:   {next \ iteration}} = \boldsymbol{\psi}_{\rm{current \
693:     iteration}}-\dpsiVec$.  (5) Transform the corrected potential
694: $\boldsymbol{\psi}_{\rm {next \ iteration}}$ via Equation
695: (\ref{eq:MassSheetTrans}) so that $\nu=1$ and the transformed
696: corrected potential has the same values as the initial potential at
697: the three fixed points. (6) Extrapolate the transformed corrected
698: potential for the next iteration. (7) Interpolate the transformed
699: corrected potential onto the resolution of the data grid for the next
700: iteration's source reconstruction.  (8) Repeat the process using
701: the extrapolated and finely gridded reconstructed potential, and stop
702: the process when the relative potential correction between any pair
703: of images is $<0.1\%$.
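
Step (5) amounts to a small linear solve.  As an illustration (the
symbols $\vec{\theta}_k$, $\Delta_k$, and $\psi_{\rm initial}$ are
notation introduced only for this sketch), let
$\vec{\theta}_k=(\theta_{k,1},\theta_{k,2})$ with $k=1,2,3$ be the
fixed points and $\Delta_k = \psi_{\rm initial}(\vec{\theta}_k) -
\psi_{\rm next \ iteration}(\vec{\theta}_k)$.  With $\nu=1$, Equation
(\ref{eq:MassSheetTrans}) requires $\vec{a}\cdot\vec{\theta}_k + c =
\Delta_k$, i.e.,
\be
\left( \begin{array}{ccc}
\theta_{1,1} & \theta_{1,2} & 1 \\
\theta_{2,1} & \theta_{2,2} & 1 \\
\theta_{3,1} & \theta_{3,2} & 1
\end{array} \right)
\left( \begin{array}{c} a_1 \\ a_2 \\ c \end{array} \right)
=
\left( \begin{array}{c} \Delta_1 \\ \Delta_2 \\ \Delta_3 \end{array} \right),
\ee
which has a unique solution provided that the three points are not
collinear.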
704: 
705: 
706: % - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
707: 
708: \section{Demonstration: potential perturbation due to an invisible mass clump}
709: \label{sec:PPRMethod:demo}
710: 
711: In the previous section, we have outlined a method of pixelated
712: potential reconstruction.  In this section, we will demonstrate this
713: method using simulated data with a lens consisting of two mass
714: components.
715: 
716: \subsection{Simulated data}
717: We use singular isothermal ellipsoid (SIE) potentials
718: \citep{KormannEtal94} to test the potential reconstruction method.
719: For this demonstration, we let the lens be composed of two SIEs at
720: the same redshift $z_{\rm d}=0.3$: a main component and a perturber.  The
721: main lens has a one-dimensional velocity dispersion of $260 \rm{\, km \, s^{-1}}$, an axis
722: ratio of $0.75$, and a semimajor axis position angle of $45^{\circ}$
723: (from vertical in the counterclockwise direction).  The (arbitrary) origin
724: of the coordinates is set such that the lens is centered at $(2.5'',
725: 2.5'')$, the center of the $5''\times5''$ image.  The perturbing SIE
726: is centered at $(3.8'', 2.5'')$ with a velocity dispersion of $50
727: \rm{\, km\, s^{-1}}$, axis ratio of $0.60$, and semimajor axis position
728: angle of $70^{\circ}$.  The exact potential is the sum of these two
729: SIEs.  We model the source intensity as an elliptical distribution
730: inside the caustics at $z_{\rm s}=3.0$ with an extended component (of peak
731: intensity of 1.0 in arbitrary units) and a central point source (of
732: intensity 3.0).  This source is chosen such that the lensed image
733: resembles B1608+656.  We use $100\times100$ image pixels each of size
734: $0.05''$ (typical pixel size of the \HST Advanced Camera for Surveys
735: (ACS)), $30\times30$ source pixels each of size $0.025''$, and
736: $25\times25$ potential pixels each of size $0.2''$.  To obtain the
737: simulated data, we map the source-intensity distribution to the image
738: plane using the exact lens potential and the lens equation, convolve
739: the lensed image with a Gaussian PSF whose $\rm{FWHM}=0.15''$ and add
740: Gaussian noise of variance $0.015$.  Fig.~\ref{fig:PR:demo1:simData}
741: shows the simulated source in the left-hand panel and the simulated
742: noisy data image in the middle panel.  The Fermat potential differences
743: between the images are listed in Table \ref{tab:PR:demo1:fermPot}.
744: The images are labeled by A, B, C, D, and their locations are
745: $(1.77'',1.02'')$, $(3.90'',3.59'')$, $(3.54'',1.26'')$, and
746: $(1.34'',3.38'')$, respectively.
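
For orientation, a few derived quantities follow from the numbers
above (simple arithmetic on the quoted values, not additional
simulation inputs).  The image and potential grids both span
$100\times0.05''=25\times0.2''=5''$ on a side, so each potential pixel
covers $4\times4$ image pixels, and the source grid spans
$30\times0.025''=0.75''$.  For the Gaussian PSF and noise,
\be
\sigma_{\rm PSF} = \frac{\rm FWHM}{2\sqrt{2\ln 2}} \approx 0.064''
\approx 1.3 \ {\rm image\ pixels}, \qquad
\sigma_{\rm noise} = \sqrt{0.015} \approx 0.12,
\ee
where the noise level is assumed to be quoted in the same arbitrary
units in which the extended source component peaks at 1.0.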
747: 
748: 
749: \begin{figure*} \begin{center}
750: \includegraphics[width=165mm]{f1.ps}
751: \end{center} 
752: \caption[Demonstration of potential reconstruction: simulated data]
753: {\label{fig:PR:demo1:simData} Demonstration of potential reconstruction:
754: simulated data and potential perturbation.  Left-hand panel: the simulated
755: source-intensity distribution with an extended component (of peak intensity of
756: 1.0 in arbitrary units) and a central point source (of intensity 3.0) on a
757: $30\times30$ grid. The solid curves are the astroid caustics of the initial
758: potential that consists of only the main SIE.  Middle panel: the simulated
759: image of the source-intensity distribution on the left using the true potential
760: consisting of two SIEs (convolution with Gaussian PSF and addition of noise are
761: included, as described in the text).  The solid line is the critical curve of
762: the initial potential and the dotted lines mark the annular region to which the
763: source grid maps (using the mapping matrix $\responseSet$).  Right-hand
764: panel: the fractional potential perturbation in the initial potential model. 
765: The Xs mark the three points where we fix the potential perturbation to zero. 
766: In both the middle and right-hand panels, the asterisk and the plus sign
767: indicate the positions of the main SIE component and the perturbing SIE
768: component, respectively.}
769: \end{figure*}
770: 
771: \begin{table*}
772: \begin{center}
773: \caption[Demonstration of potential reconstruction: actual and predicted Fermat potential differences]{\label{tab:PR:demo1:fermPot} The relative Fermat potential ($\phi = (\vec{\theta}-\vec{\beta})^2/2-\psi$) between the four images of the true potential and of the reconstructed potential for a few selected iterations  }
774: \begin{tabular}{c c c c c c}\tableline \tableline
775:  & & & & Source Position \\
776: Potential  & $\phi_{AB}$ & $\phi_{CB}$ & $\phi_{DB}$ & (arcsec) \\
777: \tableline
778: True & 0.141 & 0.234 & 0.437 & $\dots$ \\
779: Initial &  $0.172\pm0.189$ & $0.228\pm0.156$ & $0.437\pm0.041$ & $(2.587,2.483)\pm(0.013,0.076)$ \\
780: Iteration=0 & $0.178\pm0.070$ & $0.246\pm0.068$ & $0.479\pm0.010$ & $(2.608, 2.483)\pm(0.006,0.034)$ \\
781: Iteration=2 & $0.161\pm0.011$ & $0.242\pm0.010$ & $0.471\pm0.011$ & $(2.623, 2.484)\pm(0.005, 0.005)$ \\
782: Iteration=9 & $0.151\pm0.006$ & $0.244\pm0.004$ & $0.454\pm0.006$ & $(2.621, 2.484)\pm(0.003, 0.002)$ \\
783: \tableline
784: $\nu$ & 0.96 & & & \\
785: Iteration=9 & $0.145\pm0.006$ & $0.234\pm0.004$ & $0.436\pm0.006$ & \\
786: \tableline
787: \end{tabular}
788: \end{center}
789: \tablecomments{We use the average source position of the four source positions for the computation of the Fermat potential.  The four source positions deviate by $\sim 0.1''$ in the initial model, and agree within $\sim 0.005''$ at iteration=9.  The uncertainties in the predicted relative Fermat potential are due to the uncertainties in the source position.  The good agreement between the predicted Fermat potential values for the initial potential and the true values is coincidental due to the use of the average source position.}
790: \end{table*}
791: 
792: \subsection{Iterative and perturbative potential corrections}
793: 
794: We take the initial guess of the lens potential to be the main SIE
795: component but with the position angle changed from $45^{\circ}$ to $40^{\circ}$.  
796: This corresponds to a typical scenario where the perturbing
797: SIE is faint/dark so that it is not detected in the image, and hence is
798: not incorporated in the smooth parametrized model of the main SIE
799: component.  The rotation in the position angle of the main SIE
800: component corresponds to a situation where the mass of the galaxy does
801: not strictly follow the light, but the position angle of the lens mass
802: distribution is initially adopted from the position angle of the lens
803: galaxy light.  Hereafter, ``initial potential'' refers to this
804: initial guess of the potential model (as opposed to the true/exact
805: potential).  Fig.~\ref{fig:PR:demo1:simData} shows the potential
806: perturbation relative to the initial potential in the right-hand
807: panel. In obtaining this plot, the initial potential has a constant
808: gradient plane and offset added such that the top, left, and bottom
809: midpoints in the annulus (marked by Xs in the plot) are fixed to the
810: true potential with zero potential perturbation (as described in the
811: passage following Equation (\ref{eq:MassSheetTrans})).  In the
812: iterative potential reconstruction process, the reconstructed
813: potential at each iteration also has these three points in the annulus
814: fixed to the initial model.  The locations of the three fixed points
815: have no impact on recovering the true potential when the source is
816: extended enough to form an Einstein ring on the image plane.  However,
817: if the source is compact, then the locations of the three points do matter
818: and they are chosen to be at places where the information content
819: (image intensity) is low.
820: 
821: We perform 10 iterations of the perturbative potential correction
822: method outlined in Section \ref{sec:PPRMethod:matrix}.  The iterations
823: are labeled ``PI'' from 0 to 9.  For each source reconstruction
824: iteration, we adopt the curvature form of regularization and use the
825: source-intensity reconstruction for the evaluation of the source-intensity 
826: gradients that are needed for the potential correction.  The
827: source inversions are over-regularized in early iterations in order to obtain
828: smooth source reconstructions for evaluating the gradients.  For each
829: potential correction iteration, we use the curvature form of
830: regularization, and set the regularization constant for the potential
831: reconstruction to be $10\times$ the value of $\mu$ where $\mu
832: E_{\rm{\dpsi}}$ peaks in iteration=0.  This regularization value is
833: $\sim 10^8$ and is used for all subsequent iterations (since we find
834: that the peak in $\mu E_{\rm{\dpsi}}$ changes little as the iterations
835: proceed).  For comparison, the ``optimal'' regularization constant is
836: $\sim 10^2$ at iteration=0 and is $\sim 10^7$ at iteration=9.
837: Therefore, the potential reconstruction inversions are heavily
838: over-regularized in the early iterations to keep the corrections to
839: first order; as the lens potential gets corrected, the amount of
840: over-regularization diminishes as the inversion approaches the linear
841: regime with small intensity deficits.  We show figures of source
842: reconstructions and potential corrections for some, but not all, of
843: the iterations.
844: 
845: The top row of Fig.~\ref{fig:PR:demo1:itn0_2_9} shows the results of
846: PI=0.  The over-regularized reconstructed source in the left-hand
847: panel does not resemble the original source, and the (normalized)
848: image residual in the middle-left panel shows prominent arc features
849: due to the presence of both the misaligned initial model and the SIE
850: potential perturbation.  The reconstructed $\dpsiVec$ in the
851: middle-right panel has the same structure as the exact $\dpsiVec$
852: in Fig.~\ref{fig:PR:demo1:simData}, though the magnitude is smaller
853: due to the correction being a perturbative one.  A plot of the image
854: residual after correction $(=\dIVec - \PRmatSet\dpsiVec)$ continues to
855: show arc features though less prominent than in the top middle-left
856: panel in Fig.~\ref{fig:PR:demo1:itn0_2_9}.  The same image residual
857: plot with the true potential perturbation also shows similar arc
858: features, which indicates that Equation (\ref{eq:pertEq}) is indeed a
859: perturbative equation and thus justifies the over-regularization in
860: the potential correction step.
861: 
862: \begin{figure*}
863: \begin{center}
864: \includegraphics[width=180mm]{f2a.ps}
865: \includegraphics[width=180mm]{f2b.ps}
866: \includegraphics[width=180mm]{f2c.ps}
867: \end{center}
868: \caption[Demonstration of potential reconstruction: results of source-intensity 
869: reconstruction and potential correction for iteration = 0, 2,
870: and 9] {\label{fig:PR:demo1:itn0_2_9} Demonstration of potential
871:   reconstruction: results of source-intensity reconstruction and
872:   potential correction for iteration = 0, 2, and 9.  The top row shows
873:   the results for PI=0.  Left-hand panel: the reconstructed source
874:   intensity using curvature regularization that is over-regularized to
875:   ensure a smooth resulting source for evaluation of the gradients.
876:   The caustic curves in solid are those of the initial potential.
877:   Middle-left panel: the normalized image residual (difference between
878:   the simulated image and the predicted image from the reconstructed
879:   source in the left-hand panel, in units of the estimated pixel
880:   uncertainty from the data image covariance matrix).  The prominent
881:   arc features are due to the potential perturbation.  Middle-right
882:   panel: the reconstructed $\dpsiVec$ using the source-intensity
883:   gradients and image residual.  Right-hand panel: the amount of
884:   potential perturbation that remains to be corrected.  The middle and
885:   bottom rows show the results for PI=2 and PI=9, respectively, with
886:   the panels arranged in the same way as in the top row.  As the
887:   iterative potential correction proceeds, the source better resembles
888:   the original source in Fig.~\ref{fig:PR:demo1:simData}, the image
889:   residual becomes less prominent, and the magnitude of the
890:   reconstructed $\dpsiVec$ decreases.  At PI=9, the source in the
891:   left-hand panel has been faithfully reconstructed, resulting in
892:   negligible image residual in the middle-left panel.  The remaining
893:   potential perturbation in the right-hand panel, now close to zero,
894:   cannot be fully corrected due to the noise in the data.}
895: \end{figure*}
896: 
897: 
898: The second row of Fig.~\ref{fig:PR:demo1:itn0_2_9} shows the results
899: of PI=2.  The reconstructed source in the left-hand panel better
900: resembles the original source in Fig.~\ref{fig:PR:demo1:simData}.  The
901: amount of misfit in the image residual has decreased in the
902: middle-left panel, signaling that we are correcting toward the true
903: potential.  The middle-right panel is the potential correction in
904: PI=2, and the right-hand panel is the amount of perturbation that
905: remains after PI=2.  The amount of potential perturbation remaining is
906: closer to zero compared to the top row, which is a sign that the
907: iterative method converges.
908: 
909: 
910: The bottom panels in Fig.~\ref{fig:PR:demo1:itn0_2_9} show the results
911: of PI=9, the last iteration.  The source is faithfully recovered in
912: the left-hand panel, resulting in negligible image residual in the
913: middle-left panel (reduced $\chi^2=1.02$ inside the
914: annulus\footnote{The reduced $\chi^2$ is given by $\chi^2 /(
915:   N_{\rm{pix\ in\ annulus}}-\gamma)$, where $N_{\rm{pix\ in\
916:       annulus}}$ is the number of data pixels in the annulus that
917:   encloses the ring and $\gamma$ is an estimate of the number of
918:   ``effective'' parameters \citep[e.g.][]{MacKay92, SuyuEtal06}.}).
919: The centroid of the source is slightly shifted compared to the
920: original because of our adding constant gradients to fix the three
921: points in the potential corrections.  The absolute position of the
922: source is irrelevant as we can arbitrarily set the coordinates; it is
923: only the relative positions on the source plane that matter.  The
924: source positions are shifted \textit{relative} to the plotted caustic
925: curve only because these caustic curves are the ones from the initial
926: potential guess (they were not computed for the reconstructed
927: potential due to the low resolution in the reconstructed potential
928: grids).  If we were to plot the caustic curve of the corrected
929: potential, we would find no overall shift in the source with respect
930: to the caustic curve.  The middle-right panel shows the final
931: iteration's potential correction, which is barely visible due to the
932: negligible image residual left to correct.  The right-hand panel shows
933: that most of the potential perturbation to the true potential has been
934: corrected, though there is still some left.  However, this amount of
935: remaining uncorrected potential perturbation leads to image residuals
936: that are effectively masked by the noise in the data.  We have thus
937: reached the limit in the potential correction that is set by noise in
938: the data.
939: 
940: Table \ref{tab:PR:demo1:fermPot} lists the predicted Fermat potential
941: differences for the initial potential guess and for the corrected
942: potential in PI=0, 2, and 9.  We use the average source position (also
943: listed in the table) of the four mapped source positions for the
944: computation of the Fermat potential.  The uncertainty in the predicted
945: Fermat potential difference comes from the error in the source
946: position due to discrepancies in the mapped source positions of the
947: four images.  The mapped source positions agree within $\sim 0.005''$
948: (i.e., within a fifth of a source pixel) in the final iteration, a
949: significant improvement over $\sim 0.1''$ for the initial potential.  The
950: convergent Fermat potential differences in PI=9 are systematically
951: higher than the true Fermat potential differences.  This is because
952: lensing only allows us to recover the Fermat potential differences up
953: to a constant factor due to the mass-sheet degeneracy.  The
954: transformation in Equation (\ref{eq:MassSheetTrans}) would scale the
955: Fermat potential difference by a factor of $\nu$.  The last row in
956: Table \ref{tab:PR:demo1:fermPot} shows that a mass-sheet
957: transformation with $\nu=0.96$ leads to the predicted Fermat potential
958: values agreeing with the true values within the uncertainties.  We
959: expect this particular simulation's reconstructed pixelated
960: potential to be different from the true potential by a mass-sheet
961: transformation of $\nu\sim 0.96$ due to the unaccounted mass of the
962: secondary SIE ($\sim 4\%$ of the primary SIE) in the initial model.
963: In the iterative potential corrections, mass additions are suppressed
964: in the annulus due to the regularization. This breaks the mass-sheet
965: degeneracy, but underestimates the total mass within the annulus (the
966: SIE perturber was not included in the initial model): the
967: reconstructed potential, therefore, continues to have a deficit of mass
968: in the annulus.  Since the value of the convergence in the annulus is
969: generally less than $1$, the reconstructed potential is thus approximately a
970: mass-sheet transformation of the true potential with the mass deficit
971: in the form of a constant sheet.
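
These numbers can be checked with simple arithmetic (an illustrative
estimate that assumes the mass of an SIE within a fixed aperture scales
as the square of its velocity dispersion, and neglects the different
axis ratios and centroids of the two components).  The velocity
dispersions of the two SIEs give
\be
\left(\frac{50}{260}\right)^2 \approx 0.037,
\ee
consistent with the quoted $\sim 4\%$, and rescaling the PI=9 values in
Table \ref{tab:PR:demo1:fermPot} by $\nu=0.96$ gives
\be
0.96\times(0.151,\ 0.244,\ 0.454) \approx (0.145,\ 0.234,\ 0.436),
\ee
to be compared with the true values $(0.141,\ 0.234,\ 0.437)$.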
972:   
973: The simulation we have shown is one of the worst-case scenarios where
974: even the total mass of the initial lens model enclosed within the Einstein
975: ring is wrong.  For initial potential models that have the correct
976: amount of mass within the Einstein ring (this enclosed mass is what
977: lensing can robustly measure to $\sim 1\%-2\%$ accuracy in real systems)
978: and with the mass-sheet degeneracy broken (using external information
979: such as stellar dynamics), the reconstructed potential would
980: faithfully recover the Fermat potential.
981: 
982: 
983: \subsection{Discussion}
984: 
985: This demonstration shows that the iterative and perturbative potential
986: reconstruction method works in practice. Using simulated data, we find
987: that potential perturbations $\lesssim 5\%$ (which may correspond to
988: as much as $\sim 20\%$ in the relative potential perturbations between
989: image pairs) are correctable, though the actual amount depends on the
990: amount of over-regularization for both the source inversion and the
991: potential correction, and on the extendedness of the source-intensity
992: distribution.  In the case where the solution converges, the
993: magnitudes of the relative potential corrections between image pairs
994: steadily decrease, and we end the iterative procedure when the
995: stopping criterion (described in Section \ref{sec:PPRMethod:matrix})
996: is met.
997: 
998: Regarding the size of the source-intensity distribution, the more
999: extended a source is, the better we can recover the potential.  When
1000: the source is extended enough to be lensed into a closed ring, the
1001: true potential can be fully recovered (up to the limit set by the
1002: noise in the data) from potential corrections based on Equation 
1003: (\ref{eq:djWithPert}).  When the source is extended to cover about
1004: half of the Einstein ring, then the corrected potential faithfully
1005: reproduces the source with negligible image residual, but the relative
1006: Fermat potentials may not be recovered due to a slight relative offset
1007: in the potential between the images.  This is because the ``connecting
1008: characteristics'' \citep[see][]{SuyuBlandford06} that fix the potential
1009: difference between the images go through regions without much
1010: signal (light of the lensed source).  Therefore, the potential is
1011: locally corrected at regions near the images (where there is light),
1012: but the global offset between the regions cannot be determined.
1013: 
1014: For sources that are small in extent, the potential correction also
1015: depends on the points we choose to fix to the initial potential model.
1016: Since an isolated image is generally more prone to having its
1017: potential offset relative to the other images, we set two of the
1018: three fixed points in the gaps on both sides of the most isolated
1019: image and one point near the connecting images.
1020: 
We find that a wrong PSF model (e.g., of a different width) would lead
to an intensity deficit that would not be correctable by the iterative
1023: potential reconstruction method.  Therefore, an uncorrectable image
1024: residual is a sign that our model of the system (other than the lens
1025: potential) is wrong.
1026: 
1027: The potential grid that we used was $25\times25$, which we find to be
1028: a good balance between the number of degrees of freedom and goodness
1029: of fit.  The higher the number of potential pixels, the better one can
fit to the image residual; however, degenerate solutions also become
more probable.  The Bayesian evidence from the
1032: source reconstruction in principle can be used to compare the
1033: different potential grids.  In general, we find that a potential grid
1034: that is $\sim 4$ times coarser than the image grid works well.
1035: 
1036: In Section \ref{sec:PPRMethod:realData}, we generalize this iterative potential
1037: reconstruction method, which has been shown to work on simulated data,
1038: to treat real gravitational lens images such as B1608+656.
1039: 
1040: 
1041: % - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
1042: 
1043: \section{Generalization to realistic data: incorporating dust
1044:   extinction and lens galaxy light}
1045: \label{sec:PPRMethod:realData}
In the previous section, we demonstrated the method of
pixelated potential reconstruction using simulated data.  The mock
data contained only the image of the lensed source; in reality, there
would also be light from the lens galaxy.  Furthermore, in some cases,
1050: such as B1608+656, dust is present and absorbs light from both the
1051: source galaxy and the lens galaxy.  Based on results of the previous
1052: section, an accurate extraction of the light from the lensed extended
1053: source is crucial for reconstructing the lens potential.  Therefore,
1054: we will generalize the formalism given in Section \ref{sec:PPRMethod}
1055: to incorporate the lens galaxy light and dust.
1056: 
1057: Suppose that we have a set of PSF, dust, and lens galaxy light
1058: models (the process of obtaining these models is described in detail
1059: in Section \ref{sec:ImProc}), a lens potential model, and
1060: the observed image.  Separating the observed image into two
1061: components, the lensed source and the lens galaxy, we can
1062: model the observed image (as a vector for the intensities of the
1063: image pixels) as 
1064: \be
1065: \label{eq:dataVecComp}
1066: \dataVec = \overbrace{\blurSet \cdot \dustSet \cdot \lensSet \cdot
1067:   \srVec}^{\rm{lensed \ extended \ source}} + \overbrace{ \blurSet
1068:   \cdot \dustSet \cdot \glightVec}^{\rm{lens\ galaxy}} +\ \noiseVec,
1069: \ee 
1070: where $\blurSet$ is a PSF blurring matrix, $\dustSet$ is a dust
1071: extinction matrix, $\lensSet$ is the lensing matrix (containing the
1072: lens potential model), $\srVec$ is the source-intensity distribution,
1073: $\glightVec$ is the lens galaxy intensity distribution, and
1074: $\noiseVec$ is the noise in the data characterized by the covariance
1075: matrix $\imCM$.  This is an extended version of the equation
1076: $\dataVec=\responseSet\srVec+\noiseVec$  in \citet{SuyuEtal06} with
1077: $\responseSet$ replaced by $\blurSet \cdot \dustSet \cdot \lensSet$
1078: and $\dataVec$ replaced by $\dataVec -
1079: \blurSet\cdot\dustSet\cdot\glightVec$.  The order of the matrix
products in both terms is obtained by tracing backwards along the
1081: light rays: we first encounter the PSF blurring from the telescope
1082: ($\blurSet$), then dust extinction ($\dustSet$) in the lens plane,
1083: then the strong lensing effects ($\lensSet$) in the case of the lensed
1084: source, and finally the origin of light ($\srVec$ or $\glightVec$).
1085: 
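As a minimal sketch of this correspondence (the symbol $\tilde{\dataVec}$ and the explicit grouping below are shorthand introduced here for illustration only, not quantities defined elsewhere in this paper), subtracting the lens galaxy term from both sides of Equation (\ref{eq:dataVecComp}) gives
\be
\tilde{\dataVec} \equiv \dataVec - \blurSet\cdot\dustSet\cdot\glightVec
 = \left(\blurSet\cdot\dustSet\cdot\lensSet\right)\cdot\srVec + \noiseVec,
\ee
which has the same linear form as $\dataVec=\responseSet\srVec+\noiseVec$.  The source inversion should therefore carry over unchanged, provided that the PSF, dust, and lens galaxy light models are held fixed during the inversion.
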
1086: Here we assume that the dust lies in a screen in front of the lensed
1087: source and the lens galaxy.  This assumption is not strictly valid for
1088: the lens galaxy if the dust were to have originated from G2
1089: \citep{SurpiBlandford03}.  In this case, the dust and stars are
1090: mingled together in the lens galaxy.  It is beyond the scope of this
1091: paper to treat this mixed light and dust problem.  However, we note
1092: that the dust screen assumption is acceptable since the aim is to
1093: obtain an accurate lensed source-intensity distribution (for which the
1094: dust screen assumption is valid) and not the lens galaxy
1095: intensity distribution near the core where the mixing effects would
1096: dominate.  Furthermore, in simple toy models, where either the dust and
1097: stars are uniformly mixed or the dust is a screen lying inside
1098: the lens galaxy, we find that the extinction of the lens galaxy light
1099: is well approximated as extinction by a foreground dust screen with a
1100: reduced visual extinction.  Our simple foreground dust screen model
1101: thus provides an {\it effective} extinction that incorporates the
1102: reduced extinction for the lens and the full extinction by a
1103: foreground dust screen for the lensed source.
1104: 
1105: If the lensed source contains a bright core such as an active galactic nucleus (AGN), then we
could consider extending Equation (\ref{eq:dataVecComp}) and modeling the observed
1107: image as
1108: \be
1109: \label{eq:dataVecCompAGN}
1110: \dataVec = \blurSet \cdot \dustSet \cdot \lensSet \cdot \srVec +
1111: \sum_{i=1}^{N_{\rm{images}}} \dust_i \alpha_i {\rm PSF}(\vec{\theta}_i) + 
1112: \blurSet \cdot \dustSet \cdot \glightVec + \noiseVec, 
1113: \ee
1114: where the light from the extended part of the host (the first term) would be
1115: modeled
1116: separately from that from 
1117: the point sources (the second term), and $\alpha_i$ are the
1118: intensities (flux per unit solid angle in a pixel) of the point sources 
1119: (which are generally not the
1120: same for all images due to finite resolution---both lensing and microlensing
1121: give rise to different magnification of the point-like source---and, 
1122: in the case of a time-varying core, time delay difference).  
1123: However, it is the extended image surface brightness that provides
1124: the information needed
1125: to reconstruct the lens potential.
1126: For B1608+656, by taking into account
1127: the errors in the modeling associated with the presence of the point sources
1128: (see Section \ref{sec:ACSimprocess}), we will find that a separate modeling of
1129: the point sources is not necessary for reconstructing the lens potential.
1130: 
1131: Given $\blurSet$, $\dustSet$, $\glightVec$, $\lensSet$ and $\dataVec$,
1132: one can solve for the most probable source-intensity distribution
1133: $\srMPVec$, as in \citet{SuyuEtal06}.  Furthermore, one can use the
1134: Bayesian evidence of the source reconstruction to rank different
1135: models of PSF, dust extinction, lens galaxy light, and lens potential
1136: (see Section \ref{sec:PPRMethod:matrix:probTheory}).  When we compare
1137: models, we mark an annular region enclosing the Einstein ring and use
1138: the same annulus of data for all models (where models refer
1139: collectively to the lens potential, PSF, dust, lens galaxy light, and
1140: regularization).  For the chosen data set, we determine the source
1141: region that maps to the annular region and reconstructs the source
1142: intensities in this region.  The shape of this source region is
1143: generally not rectangular, so we generalize the regularization schemes
1144: in Appendix A of \citet{SuyuEtal06} to patch the right-most and
1145: top-most pixels (pixels adjacent to the edge of grid or adjacent to
1146: the unmapped source pixels) with lower derivatives.  We will use the
1147: Bayesian evidence values from the source reconstruction in Sections
1148: \ref{sec:ImProc} and \ref{sec:PotRec:B1608} to compare various PSF,
1149: dust, lens galaxy light and lens potential models for B1608+656.
1150: 
1151: To include the effects of galaxy light and dust in the pixelated
1152: potential reconstruction method, we incorporate $\dustSet$ and
1153: $\glightVec$ into Equation (\ref{eq:djWithPert}) as in Equation
1154: (\ref{eq:dataVecComp}), and include $\dustSet$ into $\PRmatSet$ (see
1155: the Appendix for this inclusion).  After these adjustments, we can
1156: iteratively correct for the lens potential in real systems given a
1157: PSF, a dust, and a lens galaxy light model based on the machinery we
1158: developed in the previous sections.
1159: 
1160: To conclude, we have outlined and demonstrated an iterative and
1161: perturbative potential correction scheme where the accuracy in the
1162: reconstruction is limited by the noise in the data.  The inputs for
1163: this method are an initial guess of the lens potential as well as
1164: assumptions regarding the PSF, dust, and lens galaxy light.  The
1165: outputs are the reconstructed potential on a grid of pixels, the
1166: reconstructed source-intensity distribution, and the Bayesian evidence
1167: from source reconstruction, given the assumptions.  Our goal is to
1168: apply this method to the well-observed lens system B1608+656, and we
1169: begin by describing our \HST observations
1170: of B1608+656 in Section \ref{sec:ImProc}.
1171: 
1172: 
1173: %-------------------------------------------------------------------------------
1174: \section{Image processing of B1608+656}
1175: \label{sec:ImProc}
1176: 
1177: 
1178: \subsection{\HST observations of B1608+656}
1179: \label{sec:ImProc:ACSobs}
1180: 
1181: B1608+656 was observed with the ACS camera on \HST in the F606W and
F814W filters in 2004 August (Proposal 10158; PI: Fassnacht),
1183: specifically to get high signal-to-noise ratio (S/N) images of the lensed
1184: source emission.  Table \ref{tab:B1608HSTobs} summarizes the
1185: observations.  Each orbit of the ACS visits consisted of one
1186: four-exposure dither pattern in either F606W or F814W through the Wide
1187: Field Channel (WFC).  We used the same dither pattern described in
1188: \citet{YorkEtal05} to permit drizzling to a higher angular resolution
1189: than the default ACS CCD pixel size ($\sim 0.05''$).  This subpixel
1190: scale is especially important for characterizing the PSF.
1191: 
1192: In order to correct for the dust extinction in the lens system, we also include
1193: the Near Infrared Camera and Multi-Object Spectrometer 1 (NICMOS) F160W images
(Proposal 7422; PI: Readhead).  Details of the NICMOS observations are also
1195: listed in Table \ref{tab:B1608HSTobs}.
1196: 
1197: \begin{table*}
1198: \begin{center}
1199: \caption{\label{tab:B1608HSTobs} \HST observations of B1608+656}
1200: \begin{tabular}{ccccccc}
1201: \tableline
1202: \tableline
1203: Proposal & Proposal & Date & Instrument & Filter & Exposures & Exposure Time \\
1204: PI & ID & & & & & (s) \\
1205: \tableline
1206: C. Fassnacht &  10158 & 2004 Aug 24 & ACS/WFC & F606W & 4 & 609 \\
1207: & & & & & 4 & 646 \\
1208: & & & & F814W & 4 & 632 \\
1209: & & & &  & 4 & 646 \\
1210: & & 2004 Aug 25 & ACS/WFC & F606W & 8 & 609 \\
1211: & & & & & 8 & 646 \\
1212: & & & & F814W & 8 & 632 \\
1213: & & & &  & 8 & 646 \\
1214: & & 2004 Aug 29 & ACS/WFC & F606W & 4 & 609 \\
1215: & & & & & 4 & 646 \\
1216: & & & & F814W & 4 & 632 \\
1217: & & & &  & 4 & 646 \\
1218: & & 2004 Sept 17 & ACS/WFC & F606W & 4 & 609 \\
1219: & & & & F814W & 4 & 632 \\
1220: & & & &  & 4 & 646 \\
1221: & & & &  & 4 & 646 \\
1222: %\tableline
1223: A. Readhead & 7422 & 1998 Feb 7 & NIC1 & F160W & 5 & 3840 \\
1224: & & & & & 1 & 2048 \\
1225: & & & & & 1 & 896 \\
1226: \tableline
1227: \end{tabular}
1228: \end{center}
1229: \end{table*}
1230: 
1231: 
1232: The ACS images of B1608+656 are presented in
1233: Fig.~\ref{fig:B1608acsF606F814} and show the two lensing galaxies and
1234: the presence of a dust lane through the system.  We need to correct
1235: for both the dust lane and the light from the lens galaxies, which can
1236: affect the isophotes of the Einstein ring of the extended lensed
source.  Before we can determine the amount of extinction, we first
need to unify the resolutions of the images in the different wavelength
bands, because the PSF depends on wavelength.  This requires PSF modeling,
deconvolution, and reconvolution of the images.  Having unified the
1241: resolutions of the images, we can determine the intrinsic colors of
1242: the various components (lens galaxies, lensed source galaxy, AGN at
1243: the core of the source galaxy) in the system that are required for the
1244: dust correction.  After correcting for dust, we can then determine the
1245: light profiles of G1 and G2 by fitting them with S\'ersic profiles
1246: ($I(r) \propto \exp(-(r/a)^{1/n})$ where $r$ is the radial coordinate,
$a$ is a scale length, and $n$ is known as the S\'ersic index;
\citealt{Sersic68}).  It is only at this stage, with the PSF, dust map, and
1249: lens galaxies' light profiles, that we can recover the lensed Einstein ring
1250: surface brightness distribution for lens potential modeling.
1251: 
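For orientation, the S\'ersic form quoted above can be related to the more common half-light parameterization.  As a sketch (the symbols $I_0$, $r_{\rm e}$, and $b_n$ are introduced here for illustration and are not used elsewhere in this paper),
\be
I(r) = I_0 \exp\left[-\left(\frac{r}{a}\right)^{1/n}\right]
     = I_0 \exp\left[-b_n\left(\frac{r}{r_{\rm e}}\right)^{1/n}\right],
\qquad a = \frac{r_{\rm e}}{b_n^{\,n}},
\ee
where $r_{\rm e}$ is the half-light radius and $b_n$ is the constant chosen so that $r_{\rm e}$ encloses half of the total light.  The cases $n=1$ and $n=4$ correspond to exponential and de Vaucouleurs profiles, respectively.
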
1252: To execute the above plan of attack, in Section
1253: \ref{sec:drizzling}, we begin by describing the drizzling process for the ACS
1254: images that are used for the analysis.  In Sections \ref{sec:psf}--\ref{sec:lensGalLight},
1255:  we present a suite of PSF, dust, and lens
1256: galaxies' light models and describe in detail how they are obtained.
1257: Finally, in Section \ref{sec:ImProcModelComp}, we compare these
1258: models.
1259: 
1260: 
1261: % - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
1262: 
1263: \subsection{Image drizzling}
1264: \label{sec:drizzling}
1265: 
1266: In the following subsections, we briefly describe the drizzling process for
1267: combining the dithered ACS images and discuss the alignment of the NICMOS image
1268: to the ACS image.
1269: 
1270: % -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
1271: 
1272: \subsubsection{ACS image processing}
1273: \label{sec:ACSimprocess}
1274: 
1275: The ACS data were reduced using the multidrizzle package
1276: \citep{KoekemoerEtal02} in an early version of the HAGGLeS
1277: image-processing pipeline (P. J. Marshall et al. 2009, in preparation), producing
1278: drizzled images with a $0.03''$ pixel scale.  The drizzled ACS images
1279: are shown in Fig.~\ref{fig:B1608acsF606F814}.  The corresponding
1280: output weight images from multidrizzle give the values for the inverse
1281: variance of each pixel.  We approximate the noise covariance matrix as
1282: diagonal and use the variance pixel values for the diagonal entries,
1283: even though drizzling will correlate the noise between adjacent
1284: pixels.  It is assumed that the effect of drizzling can be modeled as
1285: having a diagonal covariance matrix with the diagonal elements
1286: rescaled \citep{CasertanoEtal00}.  In practice, we do not need to do
1287: the rescaling because the ranking of the models using the
1288: \textit{relative} log evidence values from the source reconstruction
1289: is insensitive to rescaling of the covariance matrix.
1290: 
1291: A pixelated representation of a continuous intensity distribution
1292: generally introduces error in the interpolated intensity values
1293: between pixels, especially for intensity distributions with sharp
1294: features.  
1295: This error should be 
1296: incorporated into the likelihood function.
1297: Therefore,
1298: for modeling the source-intensity distribution on a grid (in Sections
1299: \ref{sec:ImProcModelComp} and \ref{sec:PotRec:B1608}), we also include
1300: the error due to pixelization on the image and source planes (which we
1301: call ``regridding error'') in the image covariance matrix.  We express
1302: the regridding error on the image plane in terms of the data (instead
1303: of on the source plane and transforming it to the image plane) in
1304: order to obtain a noise map that is independent of the pixelated lens
1305: modeling.  The regridding error associated with pixel $i$ is
1306: \be
1307: \label{eq:regridError}
1308: (\sigma_{\rm grid}^2)_i = \frac{1}{12} \mu_i
1309: \frac{\Delta\beta^2}{\Delta\theta^2}
1310: \sum_{\parbox{17mm}{\centering\scriptsize\it j $\in$ {\rm pixels} \\ {\rm adjacent\ to\ } i}}^{N_{\rm adj}} \frac{(d_j-d_i)^2}{N_{\rm adj}}, 
1311: \ee 
1312: where $\mu_i$ is the lensing magnification at pixel $i$, $\Delta\beta$
1313: is the source pixel size, $\Delta\theta$ is the image pixel size,
1314: $N_{\rm adj}$ is the number of pixels adjacent to pixel $i$, and $d_i$
1315: ($d_j$) is the image intensity at pixel $i$ ($j$).  The summation
1316: divided by $\Delta\theta^2$ in the above equation is a conservative
estimate of the error due to pixelization on the image plane.
1318: Since sharper features in the image have larger gradients (hence,
1319: larger values for the summations), the regridding error is higher in
1320: these areas by construction.
1321: The
1322: remaining quantities in the equation, $\mu_i\Delta\beta^2 / 12$,
1323: account for the uncertainty in the predicted image (the source image
1324: mapped to the image plane) due to the pixelization of the source-intensity 
1325: distribution.  The factor $1/12$ is the second moment of a
1326: uniform distribution between $-0.5$ and $0.5$.  When one constructs
1327: the predicted image by mapping each image pixel to the source plane
1328: and reading off the source-intensity value, the mapped source position
1329: (of an image pixel) is generally not centered on a source pixel, but
has on average a $(1/\sqrt{12})$-pixel shift from the center of the
1331: source pixel.  Therefore, $\Delta\beta/\sqrt{12}$ is the effective
1332: size of the source pixel, which is then magnified by (on average)
1333: $\sqrt{\mu_i}$ due to lensing.  In the pixelated potential
1334: reconstruction, we approximate the magnification at each image pixel
1335: (which requires the second derivative of the potential) by the value
computed from the initial potential because (1) the approximation
keeps the regridding error independent of the pixelated
potential modeling and (2) the corrected potential values are
1339: obtained on an annular region only a few pixels thick.  Having
1340: obtained an estimate for the regridding error, we add it in quadrature
1341: to the variance from the weight image to obtain the entries of the
1342: approximated diagonal covariance matrix.  
1343: 
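The factor $1/12$ quoted above can be checked directly.  As a short worked step (with $x$ a hypothetical offset of the mapped position from the pixel center, in units of the source pixel size and assumed to be uniformly distributed),
\be
\langle x^2 \rangle = \int_{-1/2}^{1/2} x^2 \, {\rm d}x = \frac{1}{12},
\ee
so the rms offset is $\Delta\beta/\sqrt{12} \approx 0.29\,\Delta\beta$.  Writing $\sigma_{{\rm w},i}^2$ for the variance of pixel $i$ from the weight image (a symbol introduced here only for illustration), and reading ``in quadrature'' as a sum of variances, the diagonal entries of the approximated covariance matrix would be
\be
\sigma_i^2 = \sigma_{{\rm w},i}^2 + (\sigma_{\rm grid}^2)_i.
\ee
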
1344: The inclusion of the
1345: regridding error is important for source-intensity reconstructions
1346: with sharp intensity features (such as the presence of a bright core);
1347: it has the effect of stabilizing the evidence values with respect to 
1348: choices in the source pixelization.  
1349: Without including the regridding error, a pixelated description of,
1350: for example, a source-intensity distribution with a bright core would
1351: be highly sensitive to the centering of the core on the source pixels.
1352: A small mismatch could create large image residuals near the cores
1353: that would veto an otherwise good lensing model, which has the rest of
1354: the extended features well described.  Such an undesirable effect is
1355: mostly removed by the inclusion of the regridding error.
For B1608+656, the ratio of the
regridding error to the error from the multidrizzle weight image is
$\sim 30$ near the image centroids and $\sim 1$ in other parts
of the Einstein ring.
1360: 
1361: \begin{figure*}
1362: \begin{center}
1363: \includegraphics[width=75mm]{f3a.ps}
1364: \includegraphics[width=75mm]{f3b.ps}
1365: \caption[Drizzled \HST ACS F606W and F814W images]
1366: {\label{fig:B1608acsF606F814} Left-hand (right-hand) panel: drizzled
1367:   \HST ACS F606W (F814W) images with $0.03''$ pixels from 9 (11) \HST
1368:   orbits.  The dust lane and interacting galaxy lenses are clearly
1369:   visible.  The white dots indicate the centroid positions of the
1370:   images.}
1371: \end{center}
1372: \end{figure*}
1373: 
1374: % -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
1375: 
1376: \subsubsection{NICMOS image processing}
1377: \label{NICMOSimprocess}
1378: The NICMOS F160W image was taken from \citet{KoopmansEtal03}.
Drizzled images on rectangular grids for different instruments are
generally neither at the same resolution nor aligned.  This is the
case for the NICMOS and ACS images.  We use SWarp\footnote{A package developed by Emmanuel Bertin at the Institut d'Astrophysique de Paris for resampling and coadding FITS images.} to
align the combined NICMOS image to the ACS images.  The final SWarped
NICMOS F160W image with a $0.03''$ pixel scale is shown in
1384: Fig.~\ref{fig:B1608nicF160}.
1385: 
1386: \begin{figure}
1387: \begin{center}
1388: \includegraphics[width=80mm]{f4.ps}
1389: \end{center}
1390: \caption[SWarped \HST NICMOS F160W image]{\label{fig:B1608nicF160}
  \HST NICMOS F160W image, SWarped to align with the ACS frame
1392:   with a $0.03''$ pixel size.  The white dots indicate the centroid
1393:   positions of the images.}
1394: \end{figure}
1395: 
1396: 
1397: % - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
1398: 
1399: \subsection{PSF modeling}
1400: \label{sec:psf}
1401: In this subsection, we describe the procedure for obtaining the PSFs for
1402: each of the ACS and the NICMOS data sets.
1403: 
1404: % -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
1405: 
1406: \subsubsection{ACS PSF}
1407: 
1408: The ACS PSF is both spatially and temporally varying
1409: \citep[e.g.][]{RhodesEtal07}.  One source of temporal variation is the
1410: ``breathing'' of the telescope while it orbits, which causes the focal
1411: length (and, hence, the PSF) of the telescope to change.  Instead of
1412: adopting a universal PSF, we take the approach of modeling several
1413: PSFs using different means, and quantitatively comparing them using
1414: the Bayesian analysis described in Section
1415: \ref{sec:PPRMethod:matrix:probTheory}.  This has the advantage of
1416: using the data (the observed image) to rank the models.  For each of
1417: the two drizzled ACS images, we create five models for the PSF either based
1418: on the TinyTim package \citep{KristHook97} or from the
1419: unsaturated stars in the field: (1) drizzled PSF (``PSF-drz'') from a
1420: set of TinyTim simulations \citep[following][]{RhodesEtal07}, (2)
1421: single (nondrizzled) TinyTim PSF (``PSF-f3'') with a telescope focus
1422: value of $-3$, (3) the closest star (``PSF-C'') located at $\sim
1423: 9''$ in the northeast direction from B1608+656 in the drizzled ACS
1424: field with a Vega magnitude of 21.3 in F814W, (4) bright star \#1
1425: (``PSF-B1'') that is located at $\sim 1.9'$ southwest of B1608+656 in
1426: the drizzled ACS field with a Vega magnitude of 18.7 in F814W, and (5)
1427: bright star \#2 (``PSF-B2'') that is located at $\sim 1.6'$ south of
1428: B1608+656 in the drizzled ACS field with a Vega magnitude of 19.1.
1429: 
1430: The TinyTim frame(s) were drizzled and resampled to pixel sizes of
1431: $0.03''$ to match the resolution of the ACS images.  We keep in mind
1432: that the TinyTim PSFs (PSF-drz and PSF-f3) may be insufficient due to
1433: the time-varying nature of the PSF and the aging of the detector
1434: since the TinyTim code was written.  We expect the closest star to
1435: B1608+656 (PSF-C) to be a good approximation to the PSF because the
1436: spatial variation of the PSF across $\sim 9''$ should be negligible
1437: and any temporal variations are the same as in the lens system.
1438: However, this closest star is not bright enough to see the secondary
1439: maxima in the PSF, so we additionally include two of the brightest
1440: stars in the drizzled field mentioned above.  For each of the stars in
1441: F606W and F814W, we make a small cutout around the star ($25\times25$
1442: pixels for PSF-C, $51\times51$ pixels for PSF-B1, and $41\times41$
1443: pixels for PSF-B2) and center it on a $200\times200$ grid, which is
1444: the size of the drizzled science image cutouts of B1608+656 that are
1445: used for the image processing.
1446: 
1447: 
1448: % -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
1449: 
1450: \subsubsection{NICMOS PSF}
1451: 
1452: The NICMOS PSF is thought to be more stable, and thus we assume a
1453: TinyTim model for it.  The output TinyTim PSF is in the CCD frame of
1454: NICMOS with pixel size $0.043''$.  As with the F160W science image,
1455: the PSF was SWarped to be aligned with the ACS images with $0.03''$
1456: pixels.  Since there is only one PSF model for NICMOS, PSF
1457: specifications throughout the rest of this paper refer to the ACS
1458: PSFs.
1459: 
1460: % - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
1461: 
1462: \subsection{Dust correction}
1463: \label{sec:dustCorr}
1464: 
1465: With observations in two or more wavelengths, we can correct for the
1466: dust extinction using empirical dust extinction laws.  We adopt the
1467: extinction law of \citet{CardelliEtal89} with the following dust
1468: extinction ratios at the redshift of the lens $z_{\rm d}=0.63$ for $R_V=3.1$
1469: (Galactic extinction): 
1470: ${A_{\rm{F606W}}}/{A_{\rm{V}}}=1.56$,
1471: ${A_{\rm{F814W}}}/{A_{\rm{V}}}=1.14$, and
1472: ${A_{\rm{F160W}}}/{A_{\rm{V}}}=0.41$,
1473: where $A_{\rm \lambda}$ is the extinction (difference between the
1474: observed and intrinsic magnitudes) at wavelength $\lambda$.  These
1475: dust extinction ratios agree with the values from the extinction law
1476: in \citet{Pei92} to within $1.5\%$.  In order to correct for the
1477: extinction, we need to know the intrinsic colors of the objects
1478: (details in Section \ref{sec:dustCorr:IntColor}).  For each color type
1479: of object (the lens galaxies, the source galaxy, and the AGN of source
1480: galaxy), we denote the intrinsic color by $Q_F=(m_{F,
1481:   \rm{intrinsic}}-m_{1,\rm{intrinsic}})$ where $F=1,\ldots, N_{\rm{b}}$
1482: is in sequence from the reddest to the bluest wavelengths (by
1483: construction $Q_{1}=0$), and $N_{\rm{b}}$ is the number of wavelength
1484: bands used for dust correction.  Combining the dust extinction ratios
1485: and the definition of intrinsic colors, we can model the observed
1486: magnitudes at each image pixel in each of the wavelength bands $F$ in
1487: terms of $A_V$ and the intrinsic magnitude of the reddest wavelength
1488: band $m_{1,\rm{intrinsic}}$ as
1489: \be
1490: \label{eq:mobsInAvm1}
1491: m_F \equiv m_{F,\rm{observed}} =  m_{1,\rm{intrinsic}} + Q_F + A_V k_F + n_F,
1492: \ee
1493: where $k_F \equiv {A_{F}}/{A_V}$ are constants given by the extinction
1494: law and $n_F$ is the noise in the data of wavelength band $F$.  We can
1495: solve for $A_V$ and $m_{1,\rm{intrinsic}}$ at each image pixel by
1496: minimizing the following $\chi^2_{\rm{dust}}$ for each pixel:
1497: \be
1498: \label{eq:dustChi2}
1499: \chi^2_{\rm{dust}} = \sum_{F=1}^{N_{\rm{b}}} \left(m_F - m_{1,\rm{intrinsic}} - Q_F - A_V k_F \right)^2.
1500: \ee
1501: We have weighted the images of the different bands equally because the
1502: uncertainty associated with $m_F$ is negligible compared to that of
1503: $Q_F$, and the uncertainties in $Q_F$ are of comparable magnitudes for
1504: the different bands $F$ relative to the reddest.  The solution that
1505: minimizes $\chi^2_{\rm{dust}}$ is
1506: \bea
1507: \label{eq:AvSolnToChi2}
A_V =& \bigg[ &\frac{1}{N_{\rm{b}}}\left(\sum_F k_F \right) \left(\sum_F m_F \right) \nonumber \\
& & - \frac{1}{N_{\rm{b}}}\left(\sum_F k_F \right) \left(\sum_F Q_F \right) \nonumber \\
1510: & & - \sum_F k_F m_F + \sum_F k_F Q_F \bigg] \bigg/ \nonumber \\
1511: & \bigg[&  \frac{1}{N_{\rm{b}}} \left(\sum_F k_F\right)^2 - \sum_F k_F^2 \bigg],
1512: \eea
1513: and
1514: \be
1515: \label{eq:m1SolnToChi2}
1516: m_{1,\rm{intrinsic}} = \frac{1}{N_{\rm{b}}}\left(\sum_F m_F - \sum_F
1517:   Q_F - \sum_F A_V k_F \right), 
1518: \ee 
1519: where the sums over $F$ go from $1,\ldots, N_{\rm{b}}$.  We emphasize
1520: that Equations (\ref{eq:AvSolnToChi2}) and (\ref{eq:m1SolnToChi2})
1521: give the $A_V$ and $m_{1,\rm{intrinsic}}$ at each pixel.  Since
1522: $A_V$ varies from pixel to pixel (depending on the amount of dust seen
1523: in that pixel), the various $A_V$ values of all pixels provide a dust map.
1524: Similarly, the $m_{1,\rm{intrinsic}}$ values of all pixels give the
1525: dust-corrected image in the reddest wavelength band.  The resulting
1526: values of $m_{1,\rm{intrinsic}}$ and the intrinsic colors yield the
1527: intrinsic (dust-corrected) magnitudes in the other bands
1528: $m_{F,\rm{intrinsic}}$ where $F=2,\ldots, N_{\rm{b}}$.  For any one
1529: band $F$, we can then construct the diagonal
1530: dust matrix $\dustSet$ in Equation (\ref{eq:dataVecComp}) whose
1531: nonzero entries are $10^{-0.4 A_V k_F}$.
1532: 
1533: % -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
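The solution above follows from the two normal equations of the least-squares problem.  As a sketch of the intermediate step, setting $\partial\chi^2_{\rm{dust}}/\partial m_{1,\rm{intrinsic}}=0$ and $\partial\chi^2_{\rm{dust}}/\partial A_V=0$ in Equation (\ref{eq:dustChi2}) gives
\bea
\sum_F \left(m_F - m_{1,\rm{intrinsic}} - Q_F - A_V k_F\right) &=& 0, \nonumber \\
\sum_F k_F \left(m_F - m_{1,\rm{intrinsic}} - Q_F - A_V k_F\right) &=& 0.
\eea
The first of these is Equation (\ref{eq:m1SolnToChi2}); substituting it into the second and collecting terms in $A_V$ yields Equation (\ref{eq:AvSolnToChi2}).  As an illustrative check, assuming all three bands are used ($N_{\rm{b}}=3$, ordered F160W, F814W, F606W) with the extinction ratios quoted above, $\sum_F k_F = 3.11$ and $\sum_F k_F^2 \approx 3.90$, so the denominator of Equation (\ref{eq:AvSolnToChi2}) is $3.11^2/3 - 3.90 \approx -0.68$.  This is nonzero, so the solution is well defined.  Likewise, because $m = -2.5\log_{10} f + {\rm const}$ for a flux $f$, an extinction of $A_V k_F$ magnitudes corresponds to a multiplicative attenuation of the flux,
\be
\frac{f_{F,\rm{observed}}}{f_{F,\rm{intrinsic}}} = 10^{-0.4 A_V k_F},
\ee
consistent with the nonzero entries of $\dustSet$.
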
1534: 
1535: \subsubsection{Obtaining the intrinsic colors}
1536: \label{sec:dustCorr:IntColor}
1537: The dust correction method outlined above requires the intrinsic
1538: colors to be determined from the color maps.  To construct the color
1539: maps, we need to unify the different resolutions of the images in
1540: different bands (due to the wavelength dependence of the PSF).  We do
1541: so by deconvolving the F606W, F814W, and F160W images using their
1542: corresponding PSFs, and reconvolving the images with the F814W PSF for
1543: each set of the five ACS PSFs and the single NICMOS PSF described in
1544: Section \ref{sec:psf}.  Reconvolved images are preferred to
1545: deconvolved images, because the latter show small-scale features (of a
1546: few pixels' size) that are artificial due to the amplification of the
1547: noise during the deconvolution process.  We select the F814W PSF for
1548: the reconvolution because F814W will be used for the lens potential
1549: modeling, due to its high S/N compared with F160W and
1550: its less severe dust extinction compared with F606W.  In working with
1551: the reconvolved images, we assume that the dust varies on a
1552: scale larger than the F814W PSF, which is true for the regions near
1553: the Einstein ring.  For the deconvolution, we use IDL's
1554: \texttt{max\_entropy} iterative routine that is based on the algorithm
1555: by \citet{HollisEtal92}.  We were unable to deconvolve the ACS F814W
image using PSF-f3.  This suggests that PSF-f3 is a bad model, which
we had expected due to temporal variations in the PSF.  PSF-f3 is a
1558: single-epoch PSF whereas the F814W image was drizzled from multiple
1559: exposures.  We, therefore, discard this PSF model.
1560: 
1561: For each set of PSF models (PSF-drz, PSF-C, PSF-B1, and PSF-B2 for
1562: ACS, and TinyTim PSF for NICMOS), we construct the color maps
1563: F606W--F814W, F606W--F160W, and F814W--F160W from the reconvolved F606W,
1564: F814W, and F160W images.  Fig.~\ref{fig:colorMapB1Star} shows the three
1565: color maps derived for PSF-B1.  
A region of bluer color slightly west of G1 is visible in all three color maps.
1567: Since the centroid of
1568: this blue region is offset from the centroid of G1, we believe that
1569: this blue region arises from differential reddening and not from
1570: intrinsic color variations within G1, which is an elliptical galaxy
1571: \citep{SurpiBlandford03}.  Since elliptical galaxies typically contain
1572: little dust, \citet{KoopmansFassnacht99} and \citet{SurpiBlandford03}
1573: suggested that the dust comes from G2, likely a dusty late-type
1574: galaxy, through dynamical interaction.  This may explain why the
1575: spectrum of G1 shows signatures of a young stellar population plus a
1576: poststarburst population \citep{DresslerGunn83, MyersEtal95,
1577:   SurpiBlandford03, KoopmansEtal03}: gas from G2 may have been
1578: transferred to G1, where the tidal interactions may have triggered
1579: star formation.
1580: 
1581: The color maps also show regions of bluer color around images C and D,
1582: and we again attribute these mostly to differential reddening, because
1583: the image positions are offset from the centroids of these
1584: blue regions, especially in F606W--F160W and F814W--F160W.  Furthermore,
1585: we find more dust at the crossing point of the isophotal separatrix
1586: (the figure-eight-shaped intensity contour) of the image pair
1587: A--C. This is encouraging, as lensing models indeed predict the
1588: crossing point to be closer to image A (see discussion in Section \ref{sec:resultDustMaps}).  
1589: However, these bluer regions near
1590: images C and D may also arise from the lensed source being
1591: intrinsically bluer than the surrounding emission.  The F606W--F814W
1592: color for these blue regions is consistent with typical star-forming
1593: galaxies \citep[e.g.][]{ColemanEtal80}.  In the F606W--F814W color map,
1594: there is a faint ridge of redder color connecting images A and C.
1595: This may be due to the asymmetry in the stellar PSF model (with the
1596: star position not exactly centered within a pixel), which would cause
1597: the F606W and F814W isophotes to shift relative to each other after
1598: the deconvolution and reconvolution.  For the color maps from the
1599: other PSF models, we find that the color maps from PSF-C and PSF-B2
1600: look similar to that from PSF-B1 with varying amounts of noise due to
1601: varying brightnesses of the stellar PSFs.  PSF-drz gave color maps that
1602: differ from those from the stellar PSFs (PSF-C, PSF-B1 and PSF-B2)
1603: because PSF-drz, especially in the F606W band, did not exhibit a
1604: single brightness peak but a string of equal-brightness pixels at the
1605: center due to frame alignment difficulties during the drizzling
1606: process.  This caused the brightest pixels in the Einstein ring to
1607: shift by $\sim 1$ pixel after the deconvolution and the reconvolution
1608: process in F606W, and created artificial sharp highlights tracing the
1609: edge of the ring in the F606W--F814W color map.  As will be seen in
1610: Section \ref{sec:ImProcModelComp}, this leads to PSF-drz and its
1611: resulting dust map giving a lower goodness of fit in the lens
1612: inversion, and hence being ranked lower compared with other models.
1613: 
1614: \begin{figure*}
1615: \begin{center}
1616: \includegraphics[width=48mm]{f5a.ps}
1617: \includegraphics[width=48mm]{f5b.ps}
1618: \includegraphics[width=48mm]{f5c.ps}
1619: \end{center}
1620: \caption[Color maps from F606W, F814W, and F160W bands of B1608+656]{\label{fig:colorMapB1Star} From left to right: the derived color maps F606W--F814W, F606W--F160W, and F814W--F160W using PSF-B1.}
1621: \end{figure*}
1622: 
1623: 
1624: In each of the color maps, we define three color regions for the three
1625: color components: one within the Einstein ring for the lens galaxies
1626: (we assume G1 and G2 to have the same colors), one for the Einstein
1627: ring of the lensed extended source, and one for the lensed AGN (core
1628: of the extended source).  Following \citet{KoopmansEtal03}, we
1629: determine the bluest color within each region, assume that this part
1630: of the region was not absorbed by dust, and adopt this color as the
1631: intrinsic color.  This assumes that each of the three components has a
1632: constant intrinsic color.  This allows us to obtain the
1633: \textit{differential} reddening for each of the components across the
1634: lensed image; \textit{absolute} reddening is not needed because a
1635: uniform dust screen does not affect lens modeling.  Table
1636: \ref{tab:intrinsicColors} lists the intrinsic colors for each of the
1637: three pairs of color maps.  The intrinsic colors of F606W--F814W are
1638: not identical to the difference between F606W--F160W and F814W--F160W,
1639: but agree within the uncertainties (0.02--0.1).
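
As an illustrative check of this consistency (a worked example using
only the PSF-B1 entries of Table \ref{tab:intrinsicColors}; the symbol
$\Delta$ is introduced here solely for this check), the F606W--F814W
color implied by the two F160W-based colors follows from
\[
(\mathrm{F606W}-\mathrm{F160W}) - (\mathrm{F814W}-\mathrm{F160W})
 = \mathrm{F606W}-\mathrm{F814W} .
\]
Writing $\Delta$ for the implied color minus the tabulated
F606W--F814W color, the PSF-B1 rows give
$\Delta_{\rm AGN} = (1.85-1.1)-0.72 = 0.03$,
$\Delta_{\rm Ring} = (2.10-1.3)-0.76 = 0.04$, and
$\Delta_{\rm Lens} = (1.85-0.82)-1.04 = -0.01$,
all of which lie within the quoted uncertainty range of 0.02--0.1.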
1640: 
1641: \begin{table}
1642: \begin{center}
1643: \caption[Intrinsic colors of the AGN, Einstein ring, and lens galaxies in B1608+656]{\label{tab:intrinsicColors} Intrinsic colors of the AGN, Einstein ring, and lens galaxies in B1608+656
1644: }
1645: \begin{tabular}{llccc}
1646: \tableline
1647: \tableline
1648:   & & F606W--F814W & F814W--F160W & F606W--F160W \\
1649: \tableline
1650: %\cline{2-5} 
1651: PSF-drz & AGN & $0.50$ & $1.4$  & $1.91$ \\
1652: %\cline{2-5}   
1653:   & Ring & $0.70$      & $1.5$  & $2.20$ \\
1654: %\cline{2-5}  
1655:   & Lens & $0.84$      & $1.0$  & $1.88$ \\
1656: %\tableline                                           
1657: PSF-C & AGN & $0.78$   & $1.3$  & $2.10$ \\
1658:   & Ring & $0.84$      & $1.5$  & $2.30$ \\
1659:   & Lens & $1.04$      & $1.0$  & $2.05$ \\
1660: %\tableline                                           
1661: PSF-B1 & AGN & $0.72$  & $1.1$  & $1.85$ \\
1662:   & Ring & $0.76$      & $1.3$  & $2.10$ \\
1663:   & Lens & $1.04$      & $0.82$ & $1.85$ \\
1664: %\tableline                                           
1665: PSF-B2 & AGN & $0.70$  & $1.17$ & $1.99$ \\
1666:   & Ring & $0.80$      & $1.3$  & $2.10$ \\
1667:   & Lens & $1.01$      & $0.85$ & $1.92$ \\
1668: \tableline
1669: \end{tabular}
1670: \end{center}
1671: \tablecomments{The intrinsic colors are based on color maps derived from the four ACS PSF models (PSF-drz (drizzled TinyTim), PSF-C (closest star), PSF-B1 (bright star \#1), and PSF-B2 (bright star \#2)) and the single NICMOS TinyTim PSF.  The intrinsic colors for each of the three color regions are determined from the bluest colors in the respective region.  The uncertainties on the intrinsic colors vary from 0.02 to 0.1.  The higher uncertainties are associated with the F160W image, which has a lower S/N.}
1672: \end{table}
1673: 
1674: % -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
1675: 
1676: \subsubsection{Resulting dust maps}
1677: \label{sec:resultDustMaps}
1678: With the intrinsic colors determined for each PSF model, we obtain two
1679: dust maps ($A_V$ maps) using (1) only the ACS F606W and F814W images
1680: and (2) the ACS F606W and F814W images together with the NICMOS F160W
1681: image.  In this way, we can assess whether the inclusion of the lower
1682: S/N NICMOS image (with the much broader PSF) improves the
1683: dust correction.  
1684: 
1685: The left-hand panel of Fig.~\ref{fig:AVcorrF814WB1Star3band} is the
1686: resulting $A_V$ dust map derived using PSF-B1 and using images in all
1687: three bands.  The dust map shows the east-west dust lane through the
1688: system (absorbing light from C, G2, G1, and D) that is visible in the
1689: original drizzled ACS F606W and F814W images.  There is little
1690: extinction near images A and B, but there are faint rings surrounding
1691: the images that are mostly due to imperfect F160W deconvolution.  We
1692: note that the low S/N exterior to the Einstein ring
1693: results in the dust map being noisy in this area.  We make sure that
1694: these noisy areas are not included in the Bayesian evidence
1695: computations in Sections \ref{sec:ImProcModelComp} and
1696: \ref{sec:PotRec:B1608}.  The right-hand panel of
1697: Fig.~\ref{fig:AVcorrF814WB1Star3band} is the resulting dust-corrected
1698: F814W image that exhibits two signs of proper dust correction: the
1699: correctly shifted crossing point of the isophotal separatrix of the
1700: image pair A--C, as shown more clearly in Fig.~\ref{fig:overlayF814W},
1701: and the smoother lens galaxy profiles.  As a result of recovering the
1702: absorbed light, the dust-corrected image has higher intensity values
1703: than the uncorrected image.  Therefore, we create a weight map for the
1704: dust-corrected image by scaling the multidrizzle weight image in order to keep
1705: the S/N of each pixel the same (before and after
1706: dust correction).  This ``dust-corrected weight image'' will be used
1707: in the next section for determining the lens galaxy light.
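
One way to write this scaling is sketched below.  It assumes, which
is not stated above, that the multidrizzle weight image $w$ is an
inverse-variance map; the symbols $f$ and $A_{\rm F814W}$ (the F814W
extinction in magnitudes implied by the $A_V$ map and the adopted
extinction law) are introduced here only for illustration.  The dust
correction multiplies the intensity $d$ of a pixel by
$f = 10^{0.4 A_{\rm F814W}}$, so keeping $d/\sigma$ unchanged requires
$\sigma' = f\sigma$, and hence
\[
w' = \frac{w}{f^{2}} = w \, 10^{-0.8 A_{\rm F814W}} .
\]
For example, a pixel with $A_{\rm F814W} = 0.5$ would have its
intensity raised by $f = 10^{0.2} \simeq 1.58$ and its weight lowered
by a factor $10^{-0.4} \simeq 0.40$.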
1708: 
1709: The dust maps obtained from the other PSF models with or without the
1710: inclusion of the NICMOS image show similar features except for the
1711: following two dust maps. 
1712: \begin{enumerate}
1713: \item The ACS-only (no NICMOS) dust map from PSF-B2 showed a
1714: faint ridge of dust connecting images A and C.  As explained, this may
1715: be due to the asymmetrical/bad PSF model.  Since the dust map otherwise
1716: exhibits the correct features, we keep this dust map for the next
1717: analysis step.
1718: \item The ACS-only dust map from PSF-drz
1719: showed prominent artificial lensing arc features due to the
1720: $\sim 1$ pixel offset in the image positions/arcs between the deconvolved
1721: and reconvolved F606W and F814W images.  Therefore, we discard this
1722: dust map of the ACS-only images for PSF-drz, but keep the dust map
1723: derived from using all three bands (that includes NICMOS).
1724: \end{enumerate}
1725: 
1726: After discarding the ACS PSF-f3
1727: and the ACS-only dust map from PSF-drz, we have a total of
1728: seven dust maps (and resulting dust-corrected F814W images).  
1729: All of these are reasonable
1730: dust corrections to use since they are derived using representative PSFs and intrinsic colors.  We will compare these dust maps and PSF
1731: models in Section \ref{sec:ImProcModelComp}.
1732: 
1733: 
1734: \begin{figure*}
1735: %\includegraphics[width=170mm]{}
1736: \begin{center}
1737: \includegraphics[width=75mm]{f6a.ps}
1738: \includegraphics[width=75mm]{f6b.ps}
1739: \end{center}
1740: \caption[Dust map and dust-corrected F814W image]{\label{fig:AVcorrF814WB1Star3band} Left-hand panel: the $A_V$ map obtained from dust correction with PSF-B1 using all three bands of images and the intrinsic colors listed in Table \ref{tab:intrinsicColors}.  The Galactic dust extinction law was assumed.  The dust lane through images C, G2, G1, and D is visible.  Right-hand panel: dust-extinction-corrected F814W image using PSF-B1 and the three-band dust map in the left-hand panel.  Compared to the right-hand panel in Fig.~\ref{fig:B1608acsF606F814}, the light profile of G1 is more elliptical and the crossing point of the isophotal separatrix of images A and C has shifted toward A after the dust correction.}
1741: \end{figure*}
1742: 
1743: \begin{figure}
1744: %\includegraphics[width=170mm]{}
1745: \begin{center}
1746: \includegraphics[width=80mm]{f7.ps}
1747: \end{center}
1748: \caption[Overlay of the dust-corrected and galaxy-subtracted image and the lens
1749: potential]{\label{fig:overlayF814W} 
1750: Crossing isophotes of the B1608+656 Einstein ring. Shown here is  the
1751: dust-corrected and galaxy-subtracted F814W image (solid contours), with the
1752: critical curves of the SPLE1+D (isotropic) potential model
1753: \citep{KoopmansEtal03} overlaid (dashed curves).  The inset shows a
1754: ``zoomed-in'' view of the region between images C and A; here, the dotted curves
1755: in the zoomed-in panel are the intensity contours of the  galaxy-subtracted
1756: F814W image \emph{without} dust correction.  After dust correction, the 
1757: crossing point of the isophotal separatrix (the center of the figure-eight
1758: isophote)  is shifted toward the critical curve, indicating successful dust
1759: correction.}
1760: 
1761: \end{figure}
1762: 
1763: 
1764: % - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
1765: 
1766: \subsection{Lens galaxy light}
1767: \label{sec:lensGalLight}
1768: 
1769: For each of the seven resulting dust-corrected F814W images in Section
1770: \ref{sec:resultDustMaps} and its corresponding PSF, we create an
1771: elliptical mask for the lens galaxies' region that excludes the
1772: Einstein ring, and fit the lens galaxies' light to elliptical S\'ersic
1773: profiles using GALFIT \citep{PengEtal02}.  In particular, we restrict
1774: the S\'ersic indices to one of the following pairs: $(n_{\rm{G1}},
1775: n_{\rm{G2}}) = (1,1), (2,2), (3,3), (3,4), (4,3), (4,4)$.  There are
1776: more pairings with $n=3$ and $n=4$ since previous works by, for
1777: example, \citet{BlandfordEtal01} and \citet{KoopmansEtal03} found G1
1778: to be well described by $n=4$ (de Vaucouleurs profile).  With the
1779: dust-corrected weight image, we obtain a reduced $\chi^2$ value for
1780: each of the profile fittings.  For each dust-corrected F814W image, we
1781: pick the S\'ersic index pair with the lowest reduced $\chi^2$ from the
1782: fit (top two pairs in the case of PSF-drz) and list it in Table
1783: \ref{tab:galfitChi2}.  As an illustration,
1784: Fig.~\ref{fig:galfitB1Star3band} shows the GALFIT S\'ersic
1785: $(n_{\rm{G1}}, n_{\rm{G2}}) = (3,4)$ results of the dust-corrected
1786: F814W image using the three-band dust map from PSF-B1.  The dark (light)
1787: patches in the upper right-hand corner of the middle (right-hand)
1788: panel result from the noisy dust map due to the low S/N in
1789: this area.  Apart from this area and the lens galaxies' cores, most of
1790: the observed lens galaxies' light matches the dusted S\'ersic profiles
1791: in the middle panel, as shown in the residual map in the right-hand
1792: panel.  The misfit near the cores could be due to intrinsic color
1793: variations in the lens galaxies, the dust screen assumption, PSF
1794: imperfections, and/or inapplicability of a single S\'ersic model at
1795: the center.  Nonetheless, accurate light fitting near the cores of the
1796: lens galaxies is not important; it is for the isophotes of the
1797: Einstein ring that we need to have accurate dust and lenses' light
1798: corrections for the lens modeling.  For the ring, the dust screen
1799: assumption in our approach is valid.
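
For reference, the reduced $\chi^2$ values used above to rank the
S\'ersic index pairs can be read as follows.  This is a sketch that
assumes the standard definition and an inverse-variance weight image;
the symbols $d_i$, $m_i$, $w_i$, and $N_{\rm dof}$ are introduced here
only for illustration:
\[
\chi^2_{\rm red} = \frac{1}{N_{\rm dof}} \sum_{i \in {\rm mask}}
  w_i \left( d_i - m_i \right)^2 ,
\]
where $d_i$ is the dust-corrected F814W intensity in pixel $i$ of the
elliptical mask, $m_i$ is the PSF-convolved S\'ersic model, $w_i$ is
the dust-corrected weight image, and $N_{\rm dof}$ is the number of
masked pixels minus the number of free parameters.  Under this
reading, values well above unity, such as those in Table
\ref{tab:galfitChi2}, indicate residual structure beyond the pixel
noise, consistent with the misfit near the cores noted above.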
1800: 
1801: \begin{table}
1802: \begin{center}
1803: \caption[Best-fit S\'ersic light profiles for the lens galaxies]{\label{tab:galfitChi2} Best-fitting S\'ersic light profiles for the lens galaxies G1 and G2 for the seven different dust-corrected F814W images based on different PSF and dust maps
1804: }
1805: \begin{tabular}{cccc}
1806: \tableline
1807: PSF & Dust Map & S\'ersic Indices $(n_{\rm{G1}}, n_{\rm{G2}})$ & Reduced $\chi^2_{\rm{lens\ light}}$ \\
1808: \tableline
1809: drz & Three-band & $(3,4)$ & 4.48 \\
1810: %\tableline
1811: drz & Three-band & $(3,3)$ & 4.53 \\
1812: %\tableline
1813: C & Three-band & $(3,4)$ & 5.11 \\
1814: %\tableline
1815: C & Two-band & $(3,3)$ & 6.13 \\
1816: %\tableline 
1817: B1 & Three-band & $(3,4)$ & 5.53 \\
1818: %\tableline 
1819: B1 & Two-band & $(2,2)$ & 7.16 \\
1820: %\tableline 
1821: B2 & Three-band & $(2,2)$ & 5.95\\
1822: %\tableline 
1823: B2 & Two-band & $(2,2)$ & 8.19\\
1824: \tableline
1825: \end{tabular}
1826: \end{center}
1827: \tablecomments{In the PSF column, ``drz'' = drizzled TinyTim, ``C'' = closest star, ``B1'' = bright star \#1, and ``B2'' = bright star \#2.  In the dust map column, ``two-band'' represents the dust map obtained from just the two ACS bands, and ``three-band'' represents the dust map obtained from the two ACS and the one NICMOS band.}
1828: \end{table}
1829: 
1830: \begin{figure*}
1831: \begin{center}
1832: \includegraphics[width=48mm]{f8a.ps}
1833: \includegraphics[width=48mm]{f8b.ps}
1834: \includegraphics[width=48mm]{f8c.ps}
1835: \end{center}
1836: \caption[Lens galaxy light fitting using S\'ersic profiles]
1837: {\label{fig:galfitB1Star3band} S\'ersic lens galaxy light profile
1838:   fitting to the dust-corrected F814W image, with PSF-B1 and its
1839:   corresponding three-band dust map, using GALFIT.  The left-hand panel
1840:   shows the best-fit S\'ersic light profiles with S\'ersic indices
1841:   $(n_{\rm{G1}},n_{\rm{G2}})=(3,4)$.  The middle panel shows the dust-extincted 
1842:   galaxy light profiles, which is the left-hand panel with
1843:   the dust extinction added back in.  The right-hand panel shows the image
1844:   residual (the difference between the F814W drizzled image in
1845:   Fig.~\ref{fig:B1608acsF606F814} and the middle panel) with misfit
1846:   near the cores of the lens galaxies of $\sim 25$--$35\%$.}
1847: 
1848: \end{figure*}
1849: 
1850: % - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
1851: 
1852: \subsection{Comparison of PSF, dust, and lens galaxy light models}
1853: \label{sec:ImProcModelComp}
1854: 
1855: Following the method outlined in Section \ref{sec:PPRMethod:realData},
1856: we can use the Bayesian evidence from the source-intensity
1857: reconstruction to compare the different PSF ($\blurSet$), dust
1858: ($\dustSet$) and lens galaxy light ($\glightVec$) models.  For each
1859: set of $\blurSet$, $\dustSet$, and $\glightVec$, we obtain the
1860: corresponding galaxy-subtracted F814W image
1861: ($\dataVec-\blurSet\cdot\dustSet\cdot\glightVec$) that is analogous to
1862: the one shown in the right-hand panel of
1863: Fig.~\ref{fig:galfitB1Star3band}.  We then make a $130\times130$ pixel
1864: cutout of the $0.03''$ galaxy-subtracted image and use the SPLE1+D
1865: (isotropic) lens potential model in \citet{KoopmansEtal03}, which is
1866: the most up-to-date simply-parameterized lens potential model for
1867: B1608+656, for the source-intensity reconstruction.  Due to the source
1868: and image pixelizations, we include the regridding error (described in
1869: Section \ref{sec:ACSimprocess}) in the image covariance matrix.
1870: 
1871: We select an annular region enclosing the Einstein ring, and use the
1872: data inside this region for the source-intensity reconstructions for
1873: each set of the PSF, dust, and lens galaxy light models.  The source
1874: grid, which we fix to have $32\times32$ pixels, has pixel sizes that
1875: are $\sim 0.022''$ to cover the marked elliptical annular region when
1876: mapped to the image plane.  This is sufficient for achieving
1877: reasonable reconstructions and is computationally manageable.  In the
1878: inversions, we reduce the PSF to $15\times15$ pixels to keep the
1879: matrices such as $\blurSet$ reasonably sparse for computing speed.  We
1880: try three forms of regularization: zeroth-order, gradient and
1881: curvature (e.g. Appendix A of \citeauthor{SuyuEtal06}
1882: \citeyear{SuyuEtal06}).
1883: 
1884: Table \ref{tab:B1608SingleSrRecEvid} lists the suite of PSF, dust, and
1885: lens galaxy light models we obtained in the previous sections.  We
1886: label the different models by numbers from 1 to 11 in the left-most
1887: column.  Models 9 and 10 correspond to the mixing of the dust maps and
1888: lens galaxy light profiles derived from PSF-B1 with PSF-C and vice
1889: versa.  Model 11, which is included as a consistency check, uses
1890: PSF-B1 and has no dust correction applied.  For each set of models,
1891: the source-intensity distribution for B1608+656 is reconstructed.  As
1892: an example, Fig.~\ref{fig:B1608SrRecB1Star3band} shows the results of
1893: the source reconstruction with gradient regularization using PSF-B1,
1894: its corresponding three-band dust map, and the resulting S\'ersic
1895: ($n_{\rm{G1}},n_{\rm{G2}})=(3,4)$ galaxy light profile.  The top
1896: left-hand panel shows the reconstructed source-intensity distribution
1897: that is approximately localized, an indication that the lens potential
1898: model is close to the true potential model.  In the top-middle panel,
1899: the pixels that are far from the source but are inside the
1900: caustics have lower $1\sigma$ error values than the pixels outside the
1901: caustics due to higher image multiplicity inside the caustics.  The
1902: bottom right-hand panel shows significant image residuals (the reduced
1903: $\chi^2$ is 1.9 inside the annulus), a sign that the PSF, dust, lens
1904: galaxy light, and/or the lens potential models are not optimal.  In
1905: Section \ref{sec:PotRec:B1608}, we will use the pixelated potential
1906: correction scheme, which is more suitable for interacting galaxy
1907: lenses, to improve the simply-parameterized SPLE1+D (isotropic) model.
1908: 
1909: The source-intensity reconstructions using other PSF and lens galaxy
1910: light models with three-band dust maps give overall similar inverted
1911: source intensities and image residuals, but the source intensities can
1912: be more or less localized and the magnitude and structures of the
1913: image residuals vary for different model sets.  However, the
1914: source-intensity reconstructions using models with two-band dust maps
1915: result in source intensities that are not localized, and the image
1916: residuals show surpluses of light in the ring region and deficits of
1917: light in the lens galaxy region (corresponding to the color regions we
1918: marked for obtaining the intrinsic colors).  The reason is that with
1919: only two bands, the resulting dust-corrected F814W image is highly
1920: sensitive to relative shifts between the F606W and F814W images (due to an
1921: imperfect PSF model, deconvolution, and reconvolution) and errors in
1922: the modeled intrinsic colors.  The abrupt change in the modeled
1923: intrinsic colors across the boundaries of the color regions creates
1924: artificial surpluses or deficits of dust-corrected light near the
1925: boundaries.  This effect is suppressed with the addition of the F160W
1926: image because the F160W image suffers relatively little extinction,
1927: and the error due to misalignment in the images and abrupt change in
1928: the modeled intrinsic colors is reduced when one has more than two
1929: bands.  A few tests suggest that the error in the dust-corrected image
1930: due to the range of intrinsic colors listed in Table
1931: \ref{tab:intrinsicColors} overwhelms the error associated with the
1932: foreground dust screen assumption for the lens galaxy light.
1933: 
1934: The source-intensity reconstruction in Model 11 with no dust
1935: correction shows significant image residuals in the extended ring,
1936: with overall surpluses of light surrounding images A and B and deficits
1937: surrounding images C and D.  The source intensity is also poorly
1938: reconstructed, being nonlocalized and noisy.  This illustrates the
1939: importance of dust correction for the initial SPLE1+D (isotropic)
1940: model.
1941: 
1942: \begin{table}
1943: \begin{center}
1944: \caption[PSF, dust, and lens galaxies' light model comparison based on Bayesian source inversion]{\label{tab:B1608SingleSrRecEvid} PSF, dust, and lens galaxies' light model comparison based on Bayesian source inversion 
1945: }
1946: \begin{tabular}{cccccc}
1947: \tableline
1948: \tableline
1949:  & PSF & Dust Map & S\'ersic $(n_{\rm{G1}}, n_{\rm{G2}})$ & Reg. Type & Log Evidence \\
1950:  & & & & & ($\times 10^4$)\\
1951: \tableline
1952: 1 & drz & Three-band & $(3,4)$ & grad & $1.49$ \\
1953: %\tableline
1954: 2 & drz & Three-band & $(3,3)$ & grad & $1.48$ \\
1955: %\tableline
1956: 3 & C & Three-band & $(3,4)$ & grad & $1.60$ \\
1957: %\tableline
1958: 4 & C & Two-band & $(3,3)$ & zeroth & $1.40$ \\
1959: %\tableline 
1960: 5 & B1 & Three-band & $(3,4)$ & grad & $1.56$ \\
1961: %\tableline 
1962: 6 & B1 & Two-band & $(2,2)$ & zeroth & $1.10$ \\
1963: %\tableline 
1964: 7 & B2 & Three-band & $(2,2)$ & grad & $1.55$ \\
1965: %\tableline 
1966: 8 & B2 & Two-band & $(2,2)$ & zeroth & $1.23$ \\
1967: %\tableline 
1968: 9 & C & B1/three-band & $(3,4)$ & zeroth & $1.56$ \\
1969: %\tableline 
1970: 10 & B1 & C/two-band & $(3,3)$ & zeroth & $1.36$ \\
1971: %\tableline
1972: 11 & B1 & --- & $(3,4)$ & zeroth & $1.27$ \\
1973: \tableline
1974: \end{tabular}
1975: \end{center}
1976: \tablecomments{For each set of the PSF, dust, and lens galaxy light profiles derived in Sections \ref{sec:psf}--\ref{sec:lensGalLight}, the Bayesian log evidence value is from the source-intensity reconstruction using the SPLE1+D (isotropic) model in \citet{KoopmansEtal03}.  The uncertainty in the log evidence value due to source pixelization is $\sim 0.03\times10^4$.  In the PSF column, ``drz'' = drizzled TinyTim, ``C'' = closest star, ``B1'' = bright star \#1, and ``B2'' = bright star \#2.  In the dust map column, we list ``two-band'' for the dust map obtained from just the two ACS bands and ``three-band'' for the dust map obtained from the two ACS and the one NICMOS band.  Unless otherwise indicated in the dust map column, the PSF model used for the dust map derivation was the same as the corresponding PSF model in the PSF column that was used for source reconstruction.  For completeness, we restate the S\'ersic indices in Table \ref{tab:galfitChi2} in the lens galaxy light profile column, which were obtained for the corresponding dust maps and PSFs specified in the dust map column.  The column of ``Reg.~Type'' refers to the preferred type of regularization for the source reconstruction, based on the highest Bayesian evidence value.  It can be one of three types: zeroth-order, gradient, or curvature.}
1977: \end{table}
1978: 
1979: 
1980: \begin{figure*}
1981: \begin{center}
1982: \includegraphics[width=150mm]{f9.ps}
1983: \end{center}
1984: 
1985: \caption[source-intensity reconstruction of B1608+656 with PSF-B1, its corresponding three-band dust map and lens galaxy light]
1986: {\label{fig:B1608SrRecB1Star3band} Source-intensity reconstruction of
1987: B1608+656  (assuming model \#5 in Table \ref{tab:B1608SingleSrRecEvid}). 
1988: Top panels from left
1989: to right: the reconstructed source-intensity distribution with the caustic
1990: curves of the SPLE1+D (isotropic) model overlaid, 
1991: the $1\sigma$ error for the source-intensity values, 
1992: the S/N of the reconstruction (i.e., 
1993: the ratio of the top
1994: left-hand to the top-middle panel).  
1995: Bottom panels from left to right: the observed
1996: F814W galaxy-subtracted image, the reconstructed image using the reconstructed
1997: source in the top left-hand panel, and the normalized image residual (i.e., 
1998: the map of the 
1999: difference between the bottom left-hand and the bottom middle panels, in units
2000: of the estimated pixel uncertainty from the data image covariance matrix).}
2001: 
2002: \end{figure*}
2003: 
2004: \subsubsection{Results of Comparison}
2005: \label{sec:ImProcModelComp:CompResults}
2006: 
2007: 
2008: Table \ref{tab:B1608SingleSrRecEvid} summarizes the results of model
2009: comparison.  The ``Reg.~Type'' column denotes the preferred type of
2010: regularization for the source reconstruction based on the highest
2011: Bayesian evidence value \citep{SuyuEtal06}.  It can be one of the
2012: three types that we use: zeroth-order, gradient, and curvature.  The
2013: last column lists the log evidence values from the inversions.
2014: Assuming the different models to be equally probable a
2015:   priori, we use these evidence values for model comparison.  The log
2016: evidence values range from $1.1\times10^4$ to $1.6\times 10^4$ with
2017: uncertainties of $\sim 0.03\times10^4$ due to the finite source
2018: resolution.
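
To make the comparison explicit (a sketch in notation introduced here
only for illustration: $M_i$ denotes model $i$ of Table
\ref{tab:B1608SingleSrRecEvid} and $\mathcal{E}_i = P(\dataVec|M_i)$
its evidence), equal prior probabilities reduce Bayes' theorem to
\[
\frac{P(M_i|\dataVec)}{P(M_j|\dataVec)}
 = \frac{P(\dataVec|M_i)}{P(\dataVec|M_j)}
 = \frac{\mathcal{E}_i}{\mathcal{E}_j} ,
\]
so that models are ranked by the difference of their log evidence
values.  For example, Models 3 and 5 differ by
$(1.60-1.56)\times10^4 = 0.04\times10^4$, which is comparable to the
$\sim 0.03\times10^4$ uncertainty on each value, whereas Models 5 and
6 differ by $(1.56-1.10)\times10^4 = 0.46\times10^4$, far larger than
that uncertainty.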
2019: 
2020: The list shows that the three-band dust models have higher evidence values
2021: than the two-band dust models.  This is attributed to the two-band dust
2022: models showing image residuals from the aforementioned artificial
2023: surpluses and deficits of light in the dust-corrected image.  The
2024: addition of the NICMOS F160W image to the ACS images (F606W and
2025: F814W) for the dust correction is, therefore, crucial due to (1) the
2026: proximity in the wavelengths of the ACS images and (2) the reduction
2027: in the error associated with image misalignments and simplistic
2028: intrinsic color models.
2029: 
2030: The three-band dust models also have higher evidence values than the
2031: no-dust model.  This further validates the three-band dust correction, as
2032: already indicated by Fig.~\ref{fig:overlayF814W}.  The evidence value
2033: of the no-dust model lies in the midst of the values for the two-band dust
2034: models, suggesting that the systematic effects in the two-band dust maps
2035: are comparable to the corrections that the dust maps are meant to
2036: achieve, thus leading to little improvement in the lens modeling.
2037: 
2038: The difference between the evidence values in Models 1 and 2 (where
2039: the models only differ in the S\'ersic light profiles) is, in general,
2040: smaller than the difference between one of these two models and
2041: another PSF/dust model.  Therefore, the source reconstruction (part of
2042: lens modeling) seems to be less sensitive to the galaxy light profiles
2043: than the PSF/dust models.  This is in agreement with our finding that
2044: the dust-corrected image depends more on the PSF and the intrinsic
2045: color models than on the form of the lens galaxy light and the dust
2046: associated with the lens galaxies.  Models 1 and 2 with PSF-drz have
2047: log evidence values on the low side of the collection of models with
2048: three-band dust maps, which is expected because PSF-drz lacks a single
2049: central brightness peak owing to misalignments in the drizzling process.
2050: The other models with three-band dust maps (Models 3, 5, 7 and 9) have
2051: effectively the same evidence values within the uncertainties.  The
2052: models with two-band dust maps (Models 4, 6, 8, and 10) lead to a range
2053: of evidence values with the PSF-C dust map being preferred to the
2054: PSF-B1 and PSF-B2 dust maps.  The two-band dust maps suggest that the
2055: shape of the primary maximum in the PSF is more important in the
2056: modeling than the inclusion of secondary maxima since PSF-C, which we
2057: expect to have a more accurate shape for the primary PSF maximum than
2058: PSF-B1 and PSF-B2, does not have the secondary maxima whereas PSF-B1
2059: and PSF-B2 do.  The asymmetry in the PSF due to the star not being
2060: centered on a single pixel may also explain the less-preferred PSF-B1
2061: and PSF-B2.  The distinction between the various stellar PSFs vanishes
2062: with the three-band dust maps, possibly due to the higher amount of noise
2063: in the three-band dust map with the inclusion of the lower S/N
2064: NICMOS image.  In this case, the effects of the PSF variations across
2065: the field are suppressed.
2066: 
2067: All models preferred either the zeroth-order or gradient form of
2068: regularization, but never the curvature form; however, we note that
2069: the differences in the log evidence values between the different
2070: regularization schemes ($\lesssim 3\times 10^2$) are on the order of
2071: the uncertainties due to source pixelization, and the resulting
2072: reconstructions for different types of regularizations are almost
2073: identical.  This is because differences in evidence values between
2074: models are currently dominated by changes in goodness of fit rather
2075: than subtle differences between the prior forms.  Only when the image
2076: residual is reduced will the prior (regularization) begin to play a
2077: greater role in preventing the reconstruction from fitting to noise in the
2078: data by keeping the source model simple.
2079: 
2080: This section has illustrated a method of creating sensible PSF, dust,
2081: and lens galaxy light models for the gravitational lens B1608+656.  We
2082: have obtained a representative sample of models, and have compared
2083: these models quantitatively.  This collection of PSF, dust, and lens
2084: galaxy light models leads to image residuals that cannot be beaten
2085: down further unless we improve the SPLE1+D (isotropic)
2086: simply-parameterized lens potential model by \citet{KoopmansEtal03} to
2087: take into account the two \textit{interacting} galaxy lenses.  The
2088: pixelated potential reconstruction of B1608+656 is the subject of Section
2089: \ref{sec:PotRec:B1608}.
2090: 
2091: %-------------------------------------------------------------------------------
2092: 
2093: \section{Pixelated lens potential of B1608+656}
2094: \label{sec:PotRec:B1608}
2095: 
2096: We reconstruct the lens potential for each set of the PSF, dust, and
2097: lens galaxies' light in Models 2--11 in Table
2098: \ref{tab:B1608SingleSrRecEvid}.  We describe in detail the potential
2099: reconstruction using Model 5, which is one of the four models that,
2100: within the uncertainties, have the highest Bayesian evidence value
2101: before the potential correction.  At the end of the section, we
2102: discuss the differences in the potential reconstruction between the
2103: various PSF, dust, and lens galaxies' light models.
2104: 
2105: To reconstruct the lens potential of B1608+656, we use a
2106: $130\times130$ pixel cutout of the drizzled ACS/F814W image with the pixel
2107: size $0.03''$ shown in Fig.~\ref{fig:B1608acsF606F814}.  The
2108: galaxy-subtracted F814W image
2109: ($=\dataVec-\blurSet\cdot\dustSet\cdot\glightVec$) is a $130\times130$
2110: subimage of the right-hand panel in Fig.~\ref{fig:galfitB1Star3band}
2111: with $200\times200$ pixels.
2112: 
2113: We follow the potential reconstruction method that was shown to
2114: succeed in Section \ref{sec:PPRMethod:demo}.  For the initial lens
2115: potential model, we use the SPLE1+D (isotropic) model from
2116: \citet{KoopmansEtal03}. We perform nine iterations (labeled as 0--8) of
2117: pixelated potential corrections on B1608+656.  For each iteration, we
2118: first reconstruct the source intensity on a $32\times32$ grid with
2119: pixel sizes of $0.022''$.  The source region is chosen so that it maps
2120: to a completely joined annulus on the image plane (so that we can
2121: determine the relative potential difference between images).  As in
2122: Section \ref{sec:ImProcModelComp}, the PSF is reduced to a
2123: $15\times15$ matrix to keep the inversion matrices sparse (and
2124: computation time low).  Furthermore, we use only the curvature type of
2125: regularization for the source reconstruction to reduce computation
2126: time and to have regularized source-intensity gradients for the
2127: potential corrections.  The source inversions are over-regularized in
2128: the early iterations to ensure a smooth resulting source for taking
2129: gradients.  The source over-regularization factors start at 1000 and
2130: are gradually decreased to 1 at iteration=8.  With the resulting
2131: source-intensity gradients and intensity deficits from the source
2132: reconstruction, we perform the potential correction on a grid of
2133: $30\times30$ pixels.  We use the curvature form of regularization for
2134: each potential correction iteration.  To keep the corrections linear,
2135: the potential corrections are also over-regularized with the
2136: regularization constant ($\mu$) set at $10$ times the value where $\mu
2137: E_{\rm{\dpsi}}$ peaks, as in Section \ref{sec:PPRMethod:demo}.  The
2138: corrected potential has the midpoints in the left, bottom, and right
2139: parts of the annular reconstruction region fixed to the initial
2140: potential model.
2141: 
2142: The top row of Fig.~\ref{fig:PotRec:B1608} shows the results of
2143: iteration=0 of source and potential reconstruction.  The left-hand
2144: panel shows the reconstructed source that has been over-regularized by
2145: a factor of 1000.  The caustics are those of the initial SPLE1+D
2146: (isotropic) model.  The source is localized and compact, a sign that
2147: the initial SPLE1+D (isotropic) potential we started from is close to
2148: the true model.  The middle-left panel shows significant image
2149: residuals that are to be corrected, especially near the cores of the
2150: images due to the over-regularization of the source-intensity
2151: distribution.  The annular region marks the region of data that we use
2152: for the evidence computation in the final iteration of source
2153: reconstruction.  Using the gradient from the reconstructed source and
2154: the intensity deficit, the middle-right panel shows the potential
2155: reconstruction of iteration=0 and the right-hand panel shows the
2156: fraction of the accumulated potential corrections relative to the
2157: initial model.
2158: 
2159: \begin{figure*}
2160: \begin{center}
2161: \includegraphics[width=180mm]{f10a.ps}
2162: \includegraphics[width=180mm]{f10b.ps}
2163: \includegraphics[width=180mm]{f10c.ps}
2164: \end{center}
2165: \caption[Pixelated potential reconstruction of
2166: B1608+656]{\label{fig:PotRec:B1608} Results of the iterative pixelated
2167:   potential reconstruction of B1608+656.  Top row, which shows the
2168:   results of iteration=0: the left-hand panel shows the
2169:   over-regularized curvature source reconstruction, the middle-left
2170:   panel shows the normalized image residual (in units of the estimated
2171:   pixel uncertainty from the data image covariance matrix) based on
2172:   the inverted source, the middle-right panel shows the potential
2173:   corrections on an annulus using the curvature form of regularization,
2174:   and the right-hand panel shows the accumulated potential corrections
  relative to the initial potential model.  The source is localized,
  an indication that the initial model is close to the true potential
  model, but not at it because significant image residuals are
2178:   present. Middle row, which shows the results of iteration=2: the
2179:   panels are arranged in the same way as in the top row.  Compared to
2180:   iteration=0, the image residuals and the potential corrections are
2181:   both smaller.  Bottom row, which shows the results of iteration=8:
2182:   the panels are arranged in the same way as in the top row.  The
2183:   resulting source of the corrected potential is more localized than
2184:   that of the uncorrected potential in
2185:   Fig.~\ref{fig:B1608SrRecB1Star3band}, and the image residual
2186:   corresponds to a reduced $\chi^2$ of 1.1.  The accumulated potential
2187:   correction is only $\sim 2\%$.}
2188: \end{figure*}
2189: 
2190: 
2191: The middle row of Fig.~\ref{fig:PotRec:B1608} shows the result of
2192: iteration=2 of source and potential reconstruction.  Compared to
iteration=0, which has the same over-regularization factors, the source
2194: reconstruction is slightly smoother, the image residual has decreased,
2195: and the potential correction is not as large.
2196: 
2197: In the iterations from 3 to 8, the potential corrections are small;
2198: therefore, the source reconstruction and image residual change only
2199: gradually during these iterations.  The bottom row of
2200: Fig.~\ref{fig:PotRec:B1608} shows the results of iteration=8 (the last
2201: iteration).  The reconstructed source in the left-hand panel has more
2202: background noise than iteration=2 because the source is now optimally
2203: regularized.  The source after the potential correction is more
2204: localized than that before the potential correction in
2205: Fig.~\ref{fig:B1608SrRecB1Star3band}, which is a good indication that
2206: the reconstructed potential is closer to the true potential (up to the
2207: mass-sheet degeneracy).  The normalized image residual in the
2208: middle-left panel shows an overall decrease in the image residual
2209: compared with that in Fig.~\ref{fig:B1608SrRecB1Star3band}.  There
remains an intensity deficit near the image locations since the
intensities of point-like images do not generally match due to the
time delays and variability.  This misfit can also be due to the
undersampling of the PSF.  There is also a remaining image residual near
2214: image C that is likely due to imperfections in the dust correction.
2215: Nonetheless, the reduced $\chi^2$ inside the annulus is 1.1 
2216: (keeping in mind the unscaled
2217: nature of our image pixel uncertainties).  The right-hand panel in the
2218: bottom row of Fig.~\ref{fig:PotRec:B1608} shows that the final
2219: accumulated potential correction relative to the initial model is
2220: only $\sim 2\%$.  The structure of the accumulated potential
2221: correction may seem to resemble the simulation in
2222: Fig.~\ref{fig:PR:demo1:simData}; however, this does not mean that the
2223: potential correction in B1608+656 corresponds to a mass clump as in
2224: the simulation.  We point out that the maps of the potential
2225: corrections that generally look similar (due to the fixing of the
2226: three points in the annulus) may lead to very different convergence
2227: maps.
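
The last point can be illustrated numerically.  The following sketch
assumes the standard relation $\kappa=\frac{1}{2}\nabla^2\psi$ between
the convergence and the lens potential and approximates the Laplacian
with a five-point stencil.  The grid, pixel size, and toy potential
corrections are hypothetical and are not taken from the B1608+656
reconstruction.  Adding a constant and a linear gradient to a potential
correction (the freedom that is removed by fixing three points) leaves
the convergence unchanged, whereas a small-scale ripple with only
$2\%$ of the peak amplitude changes the convergence substantially.

\begin{verbatim}
```python
import numpy as np

def convergence(psi, pix):
    """Return kappa = 0.5 * Laplacian(psi) on the interior grid pixels."""
    lap = (psi[2:, 1:-1] + psi[:-2, 1:-1] + psi[1:-1, 2:] + psi[1:-1, :-2]
           - 4.0 * psi[1:-1, 1:-1]) / pix**2
    return 0.5 * lap

pix = 0.05  # hypothetical potential pixel size (arcsec)
x1, x2 = np.meshgrid(np.arange(30) * pix, np.arange(30) * pix, indexing="ij")

# Toy smooth potential correction (arbitrary amplitude and width).
dpsi = 1e-2 * np.exp(-((x1 - 0.7)**2 + (x2 - 0.7)**2) / (2 * 0.3**2))

# Same correction plus a constant and a plane.
with_plane = dpsi + 3e-3 + 2e-3 * x1 - 1e-3 * x2

# Same correction plus a ripple with 2% of the peak amplitude.
ripple = 2e-4 * np.sin(2 * np.pi * x1 / (4 * pix))
with_ripple = dpsi + ripple

kappa = convergence(dpsi, pix)
kappa_plane = convergence(with_plane, pix)
kappa_ripple = convergence(with_ripple, pix)

# Fractional changes relative to the peak of each unperturbed map.
max_dpsi_change = np.abs(ripple).max() / np.abs(dpsi).max()
max_kappa_change = np.abs(kappa_ripple - kappa).max() / np.abs(kappa).max()
```
\end{verbatim}

In this toy case the ripple changes the potential map by about $2\%$ of
its peak, while the convergence changes by more than half of its peak
value; the plane leaves the convergence unchanged to rounding error.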
2228: 
2229: The potential reconstruction described above is for Model 5.  After
2230: repeating the procedure for the other models, we find that the image
2231: residual and source reconstruction in the final iteration for the
2232: other three-band dust models are similar in feature to Model 5.  In
contrast, the source reconstructions for the two-band dust maps continue to show
2234: nonlocalized source intensities with spurious light pixels outside of
2235: the main component.  Furthermore, parts of the artificial surpluses or
2236: deficits of the dust-corrected light near the color boundaries remain
2237: after the potential correction.  For Model 11 with no dust, the
2238: potential corrections lead to a localized source with image residuals
2239: that show misfit only near the image cores and locations of the dust
2240: lane.
2241: 
2242: These results of the potential reconstructions can be quantified using
2243: the Bayesian evidence values from the source reconstruction of the
2244: final corrected potential.  Table
2245: \ref{tab:B1608SingleSrRecEvidAfterPPR} lists the evidence values for
2246: Models 2 to 11.  The uncertainties in the evidence values are due to
2247: the source pixelization and the possible range of over-regularization
2248: for the source-intensity reconstruction and lens potential correction.
2249: We explored over-regularization factors in the range between 1 and
2250: 1000 for the source intensity, and various factors within 30 of the
2251: regularization constant $\mu$ that corresponds to the peak of $\mu
2252: E_{\dpsi}$ for the potential correction.  The table shows that the
2253: three-band dust maps are consistently ranked higher than the two-band dust
2254: maps, indicating the importance of including the NICMOS image for the
2255: dust correction.  All three-band dust maps give the same evidence values
2256: within the uncertainties, indicating that the various PSF and three-band
2257: dust models are all acceptable.  Furthermore, the resulting Fermat
2258: potential differences between the images for these models agree within
2259: the uncertainties.  Model 11 with no dust leads to the same evidence
2260: value as the values for three-band dust models.  The predicted Fermat
2261: potential differences between the images for Model 11 are also in
ranges similar to those of the three-band dust models.  This shows that
2263: the global structure of the lens potential remains relatively intact
2264: after the dust correction to give similar predicted Fermat potential
2265: values, even though local pixelated potential corrections are flexible
2266: enough to mimic the effects of dust extinction.  It is encouraging
2267: that the dust extinction in B1608+656 does not alter the surface
brightness in such a systematic way as to change the global structure of
2269: the lens potential.  This robustness in the global structure of the
2270: lens potential is important for inferring the value of the Hubble
2271: constant.
2272: 
2273: 
2274: \begin{table}
2275: \begin{center}
2276: \caption[Ranked model comparison after potential reconstruction]{\label{tab:B1608SingleSrRecEvidAfterPPR} Ranked model comparison after potential reconstruction
2277: }
2278: \begin{tabular}{cccc}
2279: \tableline
2280: \tableline
2281: Model & PSF & dust & log evidence  \\
2282:       &     &      & $(\times10^4)$ \\
2283: \tableline
2284: 5 & B1 & Three-band    & $1.77\pm0.05$ \\
2285: 9 & C  & B1/three-band & $1.76\pm0.04$ \\
2286: 3 & C  & Three-band    & $1.76\pm0.05$ \\
2287: 11 & B1 & ---      & $1.76\pm0.05$ \\
2288: 2 & drz& Three-band    & $1.75\pm0.05$ \\
2289: 7 & B2 & Three-band    & $1.75\pm0.05$ \\
2290: 10& B1 & C/two-band  & $1.61\pm0.05$ \\
2291: 4 & C  & Two-band    & $1.58\pm0.05$ \\
2292: 6 & B1 & Two-band    & $1.41\pm0.05$ \\
2293: 8 & B2 & Two-band    & $1.40\pm0.05$ \\
2294: \tableline
2295: \end{tabular}
2296: \end{center}
2297: \tablecomments{In the PSF column, ``drz'' = drizzled TinyTim, ``C'' = closest star, ``B1'' = bright star \#1, and ``B2'' = bright star \#2.  In the dust map column, ``two-band'' represents the dust map obtained from just the two ACS bands, and ``three-band'' represents the dust map obtained from the two ACS and the one NICMOS band.  The uncertainty in the log evidence from the source-intensity reconstruction is due to the source pixelization and the possible range of over-regularization for the source-intensity reconstruction and lens potential correction.  Within the uncertainties, Models 5, 9, 3, 11, 2 and 7 have the highest evidence values.  Note that the three-band dust maps are ranked higher than the two-band dust maps.}
2298: \end{table}
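
The grouping of models in Table
\ref{tab:B1608SingleSrRecEvidAfterPPR} can be reproduced with a short
script.  The sketch below is illustrative only: the criterion for
``within the uncertainties'' (a difference from the top-ranked value no
larger than the quadrature sum of the two uncertainties) is an
assumption, as is the function name.

\begin{verbatim}
```python
import math

# (model, log evidence, uncertainty) in units of 1e4, copied from the table
# of evidence values after the potential reconstruction.
models = [
    (5, 1.77, 0.05), (9, 1.76, 0.04), (3, 1.76, 0.05), (11, 1.76, 0.05),
    (2, 1.75, 0.05), (7, 1.75, 0.05), (10, 1.61, 0.05), (4, 1.58, 0.05),
    (6, 1.41, 0.05), (8, 1.40, 0.05),
]

def consistent_with_best(entries):
    """Return the models whose evidence matches the best one within errors."""
    ranked = sorted(entries, key=lambda m: m[1], reverse=True)
    _, best, best_err = ranked[0]
    # Assumed criterion: difference <= quadrature sum of the two errors.
    return [name for name, value, err in ranked
            if best - value <= math.hypot(best_err, err)]

top = consistent_with_best(models)
```
\end{verbatim}

With this criterion the top group consists of Models 5, 9, 3, 11, 2,
and 7, in agreement with the table comments.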
2299: 
2300: 
2301: In summary, for the top PSF, dust, and lens galaxies' light models, the
2302: pixelated potential correction scheme was successfully applied to
2303: B1608+656 leading to potential corrections of $\sim 2\%$.  This is
2304: only a small amount of correction, indicating that the smooth
2305: potential model in \citet{KoopmansEtal03} is remarkably good.  The
2306: resulting source is also well localized.
2307: 
2308: This completes the dissection of the gravitational lens B1608+656.
2309: The image residual is not fully eliminated possibly due to imperfect
2310: PSF, dust, lens galaxies' light modeling, variability in the point
2311: source intensities, finite source resolution, and/or undersampled PSF.
2312: In Paper II, we use the models in Table
2313: \ref{tab:B1608SingleSrRecEvidAfterPPR} to derive $H_0$ and to estimate
2314: its uncertainty associated with the modeling.
2315: 
2316: 
2317: % ------------------------------------------------------------------------------
2318: 
2319: \section{Mass and light in the B1608+656 lens system}
2320: \label{sec:B1608prop}
2321: 
2322: The clean dissection of the lens system in the previous sections
2323: allows us to study the mass and light in G1 and G2.
2324: 
2325: Since the amount of potential correction is small, we can safely
2326: neglect the implied corrections when estimating the mass associated
2327: with the lens galaxies.  Integrating the SPLE1+D (isotropic) surface
2328: mass density of each of the lens galaxies within their respective
2329: Einstein radii, the mass of G1 enclosed within $r_{\rm{E;G1}}=0.81''$
2330: is $M_{\rm{G1}}=1.9\times10^{11} h^{-1}\, \rm{M_{\sun}}$, and the mass
2331: of G2 enclosed within $r_{\rm{E;G2}}=0.28''$ is
$M_{\rm{G2}}=2.8\times10^{10} h^{-1}\, \rm{M_{\sun}}$.  Our dust
2333: correction enables us to recover the intrinsic luminosity of the lens
2334: galaxies.  We use the fitted S\'ersic light profiles to estimate the
2335: luminosity of G1 and G2.  Integrating the flux of G1 and G2 within
2336: $r_{\rm{E;G1}}$ and $r_{\rm{E;G2}}$, respectively, the \textit{total
2337:   mass} to rest-frame B-band light ratio of G1 is $({\rm M}/L_{\rm
2338:   B})_{\rm{G1}}=(2.0 \pm 0.2) h\, \rm{M_{\sun}\, L_{\rm B,\sun}^{-1}}$
2339: and of G2 is $({\rm M}/L_{\rm B})_{\rm{G2}}=(1.5\pm 0.2) h\, \rm{M_{\sun}\,
2340:   L_{\rm B,\sun}^{-1}}$.  The total mass and M/L of G1 are
2341: consistent with those from earlier works on B1608+656
2342: \citep[e.g.,][]{FassnachtEtal96} after taking into account the
2343: difference in the Einstein radius (due to the different number of
2344: components in the lens model) and the lowered M/L as a result of the
2345: dust correction.  The M/L ratio of G1 is low compared to the lens
2346: galaxies in \citet{TreuKoopmans04}, which have M/L in the range
2347: $\sim $3--8$\, \rm{M_{\sun}/L_{\rm B,\sun}}$.  This is consistent with
2348: the spectrum of G1 showing signatures of both young and poststarburst
2349: populations, since these types of galaxies can have lower M/L ratios
2350: by a factor of $\sim 10$ compared to other E/S0 galaxies at similar
2351: redshifts \citep[e.g.,][]{vanDokkumStanford03}.  Therefore, even
2352: though B1608+656 consists of two interacting galaxy lenses that lie in
2353: a group \citep{FassnachtEtal06}, the M/L ratio of G1 is consistent
2354: with those in noninteracting lens systems.
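
As a simple consistency check, the enclosed rest-frame B-band
luminosities implied by the quoted masses and mass-to-light ratios
follow from $L_{\rm B}=M/({\rm M}/L_{\rm B})$.  The sketch below only
rearranges the central values quoted above (uncertainties are ignored);
since $M\propto h^{-1}$ and ${\rm M}/L_{\rm B}\propto h$, the results
scale as $h^{-2}$.  The function name is hypothetical.

\begin{verbatim}
```python
def implied_luminosity(mass, mass_to_light):
    """Luminosity in h^-2 L_sun from mass (h^-1 M_sun) and M/L (h M_sun/L_sun)."""
    return mass / mass_to_light

lum_g1 = implied_luminosity(1.9e11, 2.0)  # G1 within its Einstein radius
lum_g2 = implied_luminosity(2.8e10, 1.5)  # G2 within its Einstein radius
```
\end{verbatim}

This gives approximately $9.5\times10^{10}\,h^{-2}$ and
$1.9\times10^{10}\,h^{-2}$ solar B-band luminosities for G1 and G2,
respectively.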
2355: 
2356: 
2357: % ------------------------------------------------------------------------------
2358: 
2359: \section{Conclusions}
2360: \label{sect:concl}
2361: 
2362: In this paper, we have described and tested an iterative and
2363: perturbative lens potential reconstruction scheme whose accuracy in
2364: the recovered lens potential is in principle solely limited by the
2365: noise in the data, provided we have extended sources giving
well-connected ring-like images.  The method is based on a Bayesian
2367: analysis, which provides a quantitative approach for comparing
2368: different models of the various constituents of a lens system: PSF,
2369: dust, lens galaxy light, and lens potential.  We applied this method
2370: to the gravitational lens B1608+656 with deep \HST ACS observations.
2371: We presented an image processing technique for obtaining a suite of
2372: PSF, dust, and lens galaxies' light models, and compared these models
2373: quantitatively.  For each model, we reconstructed the lens potential
2374: on a grid of pixels, using the simply-parameterized SPLE1+D
2375: (isotropic) model in \citet{KoopmansEtal03} as our initial model.  The
2376: reconstructions for the models with three-band dust maps were deemed
2377: successful in that they led to an acceptable level of image residual
2378: and a well-localized inferred source-intensity distribution.
2379: 
2380: From our analysis, we draw the following conclusions.
2381: 
2382: \begin{enumerate}
2383: 
2384: \item The potential reconstruction method, which simultaneously determines
2385: the extended source intensity and the lens potential distributions on grids
2386: of pixels, can correct for potential perturbations that are $\lesssim5\%$.
2387: 
2388: \item The mass-sheet degeneracy is broken in the potential corrections by
2389: choosing forms of regularization that suppress large deviations from the
2390: initial (mass-constrained) model unless the data require them.
2391: 
2392: \item The NICMOS F160W image is needed to complement the ACS F606W and F814W
2393: images for dust correction in order to avoid systematic errors.
2394: 
2395: \item The level of potential correction required in B1608+656 was found to
2396: be $\sim 2\%$, validating the use of the simply-parameterized model of
2397: \citet{KoopmansEtal03}.
2398: 
2399: \item The effect of dust extinction does not alter the global structure of
2400: the lens potential, and hence the predicted Fermat potential differences
2401: between the images.
2402: 
2403: \item The mass and ${\rm M}/L_{\rm B}$ of G1 inside $r_{\rm E}=0.81''$ are
2404: $1.9\times10^{11} h^{-1}\, \rm{M_{\sun}}$ and $(2.0\pm0.2) h \,
2405: \rm{M_{\sun}\, L_{\rm B,\sun}^{-1}}$, respectively.  These values are
consistent with the spectral type of this galaxy and with previous, less
accurate estimates of its M/L ratio.
2408: 
2409: \end{enumerate}
2410: 
2411: Although the pixelated potential reconstruction method can be applied to any
2412: lens system with an extended source-intensity distribution, it is
2413: particularly useful for measuring $H_0$ in time-delay lenses.  B1608+656 is
the only four-image gravitational lens system that has all three
2415: independent relative time delays measured with errors of a few percent
2416: \citep{FassnachtEtal99, FassnachtEtal02}.  However, current and future
2417: imaging surveys (such as the Canada-France-Hawaii Telescope (CFHT) Legacy Survey,
2418: the Panoramic Survey Telescope \& Rapid Response System, the Large Synoptic
Survey Telescope, and the Joint Dark Energy Mission) either are or soon will be producing many more lenses:
2420: we can anticipate building up a sample of lens systems that can be 
2421: fruitfully studied using the methods we have developed.
2422: 
2423: %-------------------------------------------------------------------------------
2424: 
2425: \acknowledgments We thank M. Brada{\v c}, J. Krist, R.
2426: Massey, C. Peng, J. Rhodes, and P. Schneider for useful
2427: discussions and the anonymous referee for helpful comments 
2428: that improved the presentation of the paper.  
2429: We are grateful to M. Brada{\v c} and T.
2430: Schrabback for their help with the image processing.  S.H.S. thanks the
2431: Kavli Institute for Theoretical Physics for the Graduate Fellowship in
2432: the fall of 2006 and for hosting the gravitational lensing workshop,
2433: during which significant progress on this work was made.  S.H.S.
2434: acknowledges the support of the NSERC (Canada) through the
2435: Postgraduate Scholarship.  C.D.F. and J.P.M. acknowledge support under the \HST
2436: program \#GO-10158. Support for program \#GO-10158 was provided by
2437: NASA through a grant from the Space Telescope Science Institute, which
2438: is operated by the Association of Universities for Research in
2439: Astronomy, Inc., under NASA contract NAS 5-26555.  C.D.F. and J.P.M.
2440: acknowledge the support from the European Community's Sixth Framework
2441: Marie Curie Research Training Network Programme, contract no.
2442: MRTN-CT-2004-505183 ``ANGLES.''  L.V.E.K. is supported in part through an
2443: NWO-VIDI career grant (project number 639.042.505).  T.T. acknowledges
2444: support from the NSF through CAREER award NSF-0642621, by the Sloan
2445: Foundation through a Sloan Research Fellowship, and by the Packard
2446: Foundation through a Packard Fellowship.  This work was supported in
2447: part by the NSF under award AST-0444059, the Deutsche Forschungsgemeinschaft under
2448: the project SCHN 342/7--1 (S.H.S.), the TABASGO foundation in the
2449: form of a research fellowship (P.J.M.), and by the US Department of
2450: Energy under contract number DE-AC02-76SF00515.  Based in part on
2451: observations made with the NASA/ESA \textit{Hubble Space Telescope},
2452: obtained at the Space Telescope Science Institute, which is operated
2453: by the Association of Universities for Research in Astronomy, Inc.,
2454: under NASA contract NAS 5-26555. These observations are associated
2455: with program \#GO-10158.
2456: 
2457: 
2458: %-------------------------------------------------------------------------------
2459:  
2460: \bibliographystyle{apj}
2461: \bibliography{B1608acsAnalysis}
2462: 
2463: %-------------------------------------------------------------------------------
2464: \appendix
2465: \section{The matrix operator for pixelated potential correction}
2466: \label{app:PRmatrix}
2467: 
2468: A comparison of the potential correction Equation (\ref{eq:pertEq}) with
2469: its matrix form in Equation (\ref{eq:pertEqMat}) shows that the matrix
2470: operator $\PRmatSet$ needs to include the PSF blurring, the
2471: reconstructed source-intensity gradient, and the gradient operator
2472: that acts on the potential perturbations $\dpsiVec$.  We will consider
2473: each of these in the reverse order.
2474: 
2475: Before discussing the gradient operator, we need to define the domain
2476: over which the gradient operates.  Recall that the potential corrections
2477: are obtained on an annular region that contains the Einstein ring of
2478: the lensed source.  This region was obtained by tracing all the
2479: potential pixels back to the source plane (from the lens equation) and
2480: seeing which ones land on the finite source region of reconstruction.
Only those potential pixels that trace back to the finite source
2482: region will have values of the source-intensity gradient for potential
2483: correction via Equation (\ref{eq:pertEq}).  These pixels tend to mark
2484: an annular region.  Therefore, we need to find the gradient operator on
2485: this annular region for $\dpsi$.
2486: 
2487: To construct the gradient operator, we use finite differencing to
obtain numerical derivatives.  For simplicity, first consider an $M
2489: \times N$ rectangular grid with $x_1$ and $x_2$ as axes and $(i,j)$ as
2490: pixel indices (typically $M\sim N\sim 30$).  In this case, the partial
2491: derivatives of a function $f_{i,j}$ defined on the grid are:
2492: \bea
2493: \label{eq:NumDerRectGrid}
2494: \frac {\partial f_{i,j}}{\partial x_1} & = & \left\{ \begin{array}{ll} \frac{1}{2\Delta x_1}(-3f_{1,j}+4f_{2,j}-f_{3,j}) & \textrm {if $i=1$}\\
2495: \frac{1}{2\Delta x_1}(f_{i+1,j}-f_{i-1,j}) & \textrm {if $i=2,\ldots,M-1$}\\
2496: \frac{1}{2\Delta x_1}(f_{M-2,j}-4f_{M-1,j}+3f_{M,j}) & \textrm {if $i=M$} \end{array} \right. \nonumber \\
2497: \frac{\partial f_{i,j}}{\partial x_2} & = &\left\{ \begin{array}{ll} \frac{1}{2\Delta x_2}(-3f_{i,1}+4f_{i,2}-f_{i,3}) & \textrm {if $j=1$}\\
2498: \frac{1}{2\Delta x_2}(f_{i,j+1}-f_{i,j-1}) & \textrm {if $j=2,\ldots,N-1$}\\
2499: \frac{1}{2\Delta x_2}(f_{i,N-2}-4f_{i,N-1}+3f_{i,N}) & \textrm {if $j=N$} \end{array} \right. ,
2500: \eea
2501: where $\Delta x_1$ and $\Delta x_2$ are, respectively, the pixel sizes in the $x_1$
2502: and $x_2$ directions.  For the annular region of potential
2503: corrections, we only need to elaborate slightly on Equation
2504: (\ref{eq:NumDerRectGrid}).  Fig.~\ref{fig:App:annulus} shows a typical
2505: annular region and the types of pixels when numerically
2506: differentiating in the $x_1$ direction.  The edge pixels of the
2507: annulus, which are denoted by ``e'' in the figure for the $x_1$
2508: direction, are treated as though they are like the edge pixels of the
2509: rectangular grid (so that the $i=1$, $i=M$, $j=1$, or $j=N$ expressions
2510: are used) when the edge pixels are adjacent to at least two other
pixels in the annulus in the direction in which the numerical
derivative is taken.  If an edge pixel of the annulus is adjacent to
only one other pixel in the direction in which the numerical derivative
2514: is taken, such as the shaded pixels in the figure for the $x_1$
2515: direction, then we construct the gradient by taking the difference
2516: between the two and dividing by the pixel size.  For example, if
2517: $f_{i,j}$ is at the edge, and $f_{i+1,j}$ is also in the annulus
2518: (which will have to be an edge pixel if $f_{i+2,j}$ is not in the
2519: annulus), then the numerical derivatives in the $x_1$ direction for
2520: both $f_{i,j}$ and $f_{i+1,j}$ are
2521: \be
2522: \label{eq:NumDer2pix}
2523: \frac{\partial f_{i,j}}{\partial x_1} = \frac{f_{i+1,j}-f_{i,j}}{\Delta x_1}.
2524: \ee
2525: A similar equation applies for the $x_2$ direction.  If an edge pixel
2526: in the annulus is ``exposed'' in the sense that in one of the
2527: directions $x_1$ or $x_2$, it has no adjacent pixels in the annulus,
2528: then this pixel is removed from the annular region of reconstruction
2529: as no numerical derivative can be formed.  An example of an
2530: ``exposed'' pixel in the $x_1$ direction is the hashed pixel in the
2531: figure.  Following the above prescription, we can obtain the values
2532: $({\partial f_{i,j}}/{\partial x_1}, {\partial f_{i,j}}/{\partial
2533:   x_2})$ of all the $(i,j)$ pixels in the annulus in terms of values of
2534: the function in the annulus $f_{kl}$.  Factoring out the $f_{kl}$
2535: values, we obtain the gradient operator defined as two matrices:
2536: $\boldsymbol{\mathsf{D}}_1$ for ${\partial }/{\partial x_1}$ and
2537: $\boldsymbol{\mathsf{D}}_2$ for ${\partial }/{\partial x_2}$.
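
The prescription above can be sketched in Python with NumPy.  This is
an illustrative implementation under stated assumptions rather than the
code used for the reconstruction: the function names are hypothetical,
the operator is stored as a dense array for brevity, and exposed pixels
are pruned repeatedly until none remain, which is one reading of the
removal rule.  The index $i$ runs along the first array axis ($x_1$) and
$j$ along the second ($x_2$).

\begin{verbatim}
```python
import numpy as np

def prune_exposed(mask):
    """Iteratively drop pixels lacking a neighbour along x1 or along x2."""
    mask = mask.copy()
    while True:
        pad = np.pad(mask, 1, constant_values=False)
        has_x1 = pad[:-2, 1:-1] | pad[2:, 1:-1]   # neighbour at i-1 or i+1
        has_x2 = pad[1:-1, :-2] | pad[1:-1, 2:]   # neighbour at j-1 or j+1
        kept = mask & has_x1 & has_x2
        if np.array_equal(kept, mask):
            return mask
        mask = kept

def runs(line):
    """Yield (start, stop) for each run of consecutive True values."""
    start = None
    for k, inside in enumerate(line):
        if inside and start is None:
            start = k
        elif not inside and start is not None:
            yield start, k
            start = None
    if start is not None:
        yield start, len(line)

def gradient_operator(mask, axis, dx):
    """Dense D such that D @ f[mask] approximates df/dx (axis 0: x1, 1: x2)."""
    n_p = int(mask.sum())
    index = -np.ones(mask.shape, dtype=int)
    index[mask] = np.arange(n_p)
    D = np.zeros((n_p, n_p))
    # After the swap, each row runs along the differentiation direction.
    for line, idx in zip(np.swapaxes(mask, axis, 1),
                         np.swapaxes(index, axis, 1)):
        for a, b in runs(line):
            p = idx[a:b]
            if b - a < 2:
                raise ValueError("exposed pixel: call prune_exposed first")
            if b - a == 2:  # two-pixel run: same simple difference for both
                D[p[0], p] = np.array([-1.0, 1.0]) / dx
                D[p[1], p] = np.array([-1.0, 1.0]) / dx
                continue
            # One-sided three-point stencils at the two ends of the run.
            D[p[0], p[:3]] = np.array([-3.0, 4.0, -1.0]) / (2 * dx)
            D[p[-1], p[-3:]] = np.array([1.0, -4.0, 3.0]) / (2 * dx)
            # Central differences for the interior pixels of the run.
            for k in range(1, b - a - 1):
                D[p[k], p[k - 1]] = -1.0 / (2 * dx)
                D[p[k], p[k + 1]] = 1.0 / (2 * dx)
    return D

# Toy annulus on a 30 x 30 grid (radii and pixel size are hypothetical).
pix = 0.05
i, j = np.meshgrid(np.arange(30), np.arange(30), indexing="ij")
r = np.hypot(i - 14.5, j - 14.5)
annulus = prune_exposed((r > 6) & (r < 12))
D1 = gradient_operator(annulus, axis=0, dx=pix)
D2 = gradient_operator(annulus, axis=1, dx=pix)
f = 2.0 * (i * pix) + 3.0 * (j * pix)  # linear test function
```
\end{verbatim}

All of the stencils are exact for a linear function, so applying the two
operators to $f=2x_1+3x_2$ restricted to the annulus returns 2 and 3,
respectively, on every pixel.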
2538: 
2539: \begin{figure}
2540: \begin{center}
2541: \includegraphics[width=80mm]{fa1.eps}
2542: \end{center}
2543: \caption[Numerical derivative on the annular region of potential
2544: reconstruction]{\label{fig:App:annulus} Typical annular region for
2545:   potential corrections and the form of $\partial f_{i,j}/\partial
2546:   x_1$ for each pixel.  The blank pixels use the $i=2,\ldots,M-1$
2547:   expression for $\partial f_{i,j}/\partial x_1$ in Equation
2548:   (\ref{eq:NumDerRectGrid}).  The pixels with ``e'' are edge pixels that
2549:   use the $i=1$ or $i=M$ expressions for $\partial f_{i,j}/\partial
2550:   x_1$ in Equation (\ref{eq:NumDerRectGrid}).  The shaded pixels use
2551:   Equation (\ref{eq:NumDer2pix}) for $\partial f_{i,j}/\partial x_1$.
2552:   The hashed pixel is an example of an ``exposed'' pixel with no
2553:   adjacent pixel in the $x_1$ direction.}
2554: \end{figure}
2555: 
2556: To conform to the data grid (since the image residual and image
covariance matrix are defined on the data grid), we use bilinear
2558: interpolation.  We overlay the data grid on the coarser grid, and for
2559: every data pixel that lies inside the annular region on the coarse
2560: grid, we bilinearly interpolate to get, effectively, gradient
2561: operators on the data grid.  This gives us an $N_{\rm d} \times N_{\rm
2562:   p}$ matrix $\boldsymbol{\mathsf{G}}$ where each row (corresponding
2563: to a data pixel that lies within the annulus) has four nonzero values
2564: that correspond to the coefficients of bilinearly interpolating among
2565: the four coarse potential pixels surrounding this data pixel.
2566: Associated with each data pixel are the source-intensity gradient
2567: values (${\partial I}/{\partial \beta_1}$ and ${\partial I}/{\partial
2568:   \beta_2}$) that were obtained by mapping the data pixel back to the
2569: source plane using the lens equation, and interpolating on the
2570: reconstructed source-intensity gradient on the source grid.  We define
2571: matrices $\boldsymbol{\mathsf{G}}_1$ and $\boldsymbol{\mathsf{G}}_2$
2572: as the matrix $\boldsymbol{\mathsf{G}}$ multiplied by the source-intensity 
2573: gradient components ${\partial I}/{\partial \beta_1}$ and
2574: ${\partial I}/{\partial \beta_2}$, respectively.  By definition,
2575: $\boldsymbol{\mathsf{G}}_1$ and $\boldsymbol{\mathsf{G}}_2$ are also
2576: $N_{\rm d} \times N_{\rm p}$ matrices.
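
A minimal sketch of the interpolation matrix is given below.  It
assumes uniformly spaced coarse pixel centers and skips any data pixel
whose four surrounding coarse pixels are not all inside the annulus;
both choices, and all names in the code, are illustrative assumptions.
The random positions and gradient values stand in for the data-pixel
locations and the interpolated source-intensity gradient.

\begin{verbatim}
```python
import numpy as np

def bilinear_matrix(data_x1, data_x2, grid_x1, grid_x2, index):
    """Return G (N_d x N_p); each nonzero row holds four bilinear weights.

    grid_x1, grid_x2 : uniformly spaced coarse pixel-centre coordinates.
    index[i, j]      : column of coarse pixel (i, j), or -1 outside the annulus.
    """
    n_p = int(index.max()) + 1
    G = np.zeros((len(data_x1), n_p))
    d1 = grid_x1[1] - grid_x1[0]
    d2 = grid_x2[1] - grid_x2[0]
    for row, (x1, x2) in enumerate(zip(data_x1, data_x2)):
        i = int(np.floor((x1 - grid_x1[0]) / d1))
        j = int(np.floor((x2 - grid_x2[0]) / d2))
        if not (0 <= i < len(grid_x1) - 1 and 0 <= j < len(grid_x2) - 1):
            continue  # data pixel lies outside the coarse grid
        corners = index[i:i + 2, j:j + 2]
        if (corners < 0).any():
            continue  # assumed: require all four corners in the annulus
        t = (x1 - grid_x1[i]) / d1
        u = (x2 - grid_x2[j]) / d2
        weights = np.array([[(1 - t) * (1 - u), (1 - t) * u],
                            [t * (1 - u), t * u]])
        G[row, corners.ravel()] = weights.ravel()
    return G

# Toy setup: a full 6 x 6 coarse grid and 50 random data positions.
grid = np.arange(6) * 0.05
index = np.arange(36).reshape(6, 6)
rng = np.random.default_rng(0)
data_x1 = rng.uniform(0.0, 0.24, size=50)
data_x2 = rng.uniform(0.0, 0.24, size=50)
G = bilinear_matrix(data_x1, data_x2, grid, grid, index)

# Scale each row by a stand-in for dI/dbeta_1 at that data pixel.
grad_b1 = rng.normal(size=50)
G1 = grad_b1[:, None] * G
```
\end{verbatim}

Multiplying each row of the interpolation matrix by the gradient value
at that data pixel is equivalent to left-multiplying by a diagonal
matrix of gradient values, which avoids forming that diagonal matrix
explicitly.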
2577: 
2578: Lastly, we represent the PSF as a blurring matrix (operator)
$\blurSet$ that is of dimensions $N_{\rm d} \times N_{\rm d}$ (see,
e.g., Section \ref{sec:PPRMethod:realData}; \citealt{TreuKoopmans04}).
2581: Note that this matrix $\blurSet$ is different from the matrix in
2582: Section \ref{sec:PPRMethod:matrix:probTheory} that is the Hessian of
$E_{\rm D}$.
2584: 
2585: Combining all the pieces together, the matrix operator $\PRmatSet$ is 
2586: \be
2587: \label{eq:PRmatSetExpression}
2588: \PRmatSet = \blurSet \cdot \boldsymbol{\mathsf{G}}_1 \cdot \boldsymbol{\mathsf{D}}_1 + \blurSet \cdot \boldsymbol{\mathsf{G}}_2 \cdot \boldsymbol{\mathsf{D}}_2,
2589: \ee 
2590: which is of dimensions $N_{\rm d}\times N_{\rm p}$.
2591: 
2592: For the gravitational lens system B1608+656, we also need to include
2593: the effects of dust extinction, which we express as a diagonal matrix
2594: $\dustSet$.  Tracing back along the light rays, we encounter the dust
2595: immediately after the PSF blurring (for the light from the lensed
2596: source).  Therefore, we include it in Equation
2597: (\ref{eq:PRmatSetExpression}) after $\blurSet$ to get the following
2598: expression for the matrix operator $\PRmatSet$ that includes dust:
2599: \be
2600: \label{eq:PRmatSetDustedExpression}
2601: \PRmatSet = \blurSet \cdot \dustSet \cdot \boldsymbol{\mathsf{G}}_1 \cdot \boldsymbol{\mathsf{D}}_1 + \blurSet \cdot \dustSet \cdot \boldsymbol{\mathsf{G}}_2 \cdot \boldsymbol{\mathsf{D}}_2.
2602: \ee 
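
The assembly in Equations (\ref{eq:PRmatSetExpression}) and
(\ref{eq:PRmatSetDustedExpression}) can be sketched as follows.  The
function and argument names are hypothetical, the diagonal dust matrix
is passed as a one-dimensional array of its diagonal entries, and the
random arrays in the example are placeholders that only exercise the
dimension bookkeeping.

\begin{verbatim}
```python
import numpy as np

def potential_correction_operator(B, G, D1, D2, grad_b1, grad_b2, dust=None):
    """Assemble B . K . (G1 . D1 + G2 . D2); K is the diagonal dust matrix.

    B : (N_d, N_d) blurring matrix;  G : (N_d, N_p) interpolation matrix;
    D1, D2 : (N_p, N_p) gradient operators;
    grad_b1, grad_b2 : (N_d,) source-intensity gradient at each data pixel;
    dust : (N_d,) diagonal of the dust matrix, or None for no dust.
    """
    G1 = grad_b1[:, None] * G   # scale each row by dI/dbeta_1
    G2 = grad_b2[:, None] * G   # scale each row by dI/dbeta_2
    inner = G1 @ D1 + G2 @ D2   # shape (N_d, N_p)
    if dust is not None:
        inner = dust[:, None] * inner  # diagonal matrix applied row by row
    return B @ inner

# Placeholder inputs with N_d = 12 data pixels and N_p = 5 potential pixels.
rng = np.random.default_rng(1)
n_d, n_p = 12, 5
B = rng.normal(size=(n_d, n_d))
G = rng.normal(size=(n_d, n_p))
D1 = rng.normal(size=(n_p, n_p))
D2 = rng.normal(size=(n_p, n_p))
g1, g2 = rng.normal(size=n_d), rng.normal(size=n_d)
dust = rng.uniform(0.5, 1.0, size=n_d)

P_no_dust = potential_correction_operator(B, G, D1, D2, g1, g2)
P_dust = potential_correction_operator(B, G, D1, D2, g1, g2, dust=dust)
```
\end{verbatim}

Factoring the blurring and dust matrices out of the sum requires only
one multiplication by the $N_{\rm d}\times N_{\rm d}$ blurring matrix
instead of two.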
2603: 
2604: 
2605: 
2606: %-------------------------------------------------------------------------------
2607: 
2608: \end{document}
2609: