astro-ph0302587/ms.tex
1: \documentclass[12pt,preprint]{aastex}
2: 
3: %% This manuscript uses the AASTeX v5.x LaTeX 2e macros.
4: 
5: %% The first piece of markup in an AASTeX v5.x document
6: %% is the \documentclass command which calls the preprint style. 
7: %% (LaTeX will ignore any data that comes before this command.)
8: 
9: %% Examples of commands for styles:
10: %% \documentclass[12pt,preprint]{aastex}  => single spaced, single column
11: %% \documentclass[manuscript]{aastex}   => double spaced, single column
12: %% \documentclass[preprint2]{aastex}  => single spaced, double column
13: 
14: %% To create new macros, use \newcommand. 
15: %% These should appear before the \begin{document} command.
16: %% \documentclass[preprint2]{aastex}
17: 
18: \usepackage{psfig}
19: \input{epsf}
20: \newcommand{\be}{\begin{equation}}
21: \newcommand{\ee}{\end{equation}}
22: \newcommand{\ba}{\begin{eqnarray}}
23: \newcommand{\ea}{\end{eqnarray}}
24: \newcommand{\bsf}[1]{\mbox{\begin{bfseries}\textsf{{#1}}\end{bfseries}}}
25: \newcommand{\nbsf}[1]{\mbox{\textsf{{#1}}}}
26: 
27: %% To insert a short comment on the title page:
28: %% \slugcomment{Not to appear in Nonlearned J., 45.}
29: 
30: %% Running head information may be supplied, although
31: %% this information may be modified by the editorial offices.
32: %% The left head contains a list of authors,
33: %% usually a maximum of three (otherwise use et al.).  The right
34: %% head is a modified title of up to roughly 44 characters.  Running heads
35: %% will not print in the manuscript style.
36: 
37: \shorttitle{Semi--linear lens inversion}
38: \shortauthors{Warren \& Dye}
39: 
40: %% This is the end of the preamble. Start doc:
41: 
42: %\received{2002 November 18}
43: \begin{document}
44: 
45: %% LaTeX will automatically break titles if they run longer than
46: %% one line. However, use \\ to force a line break.
47: 
48: \title{Semi--linear gravitational lens inversion}
49: 
50: %% Use \author, \affil, and the \and command to format
51: %% author and affiliation information.
52: %% Note that \email has replaced the old \authoremail command
53: %% from AASTeX v4.0. \email can be used to mark an email address
54: %% anywhere in the paper, not just in the front matter.
55: %% As in the title, \\ forces line breaks.
56: 
57: \author{S. J. Warren and S. Dye } \affil{Astrophysics Group, Blackett
58: Laboratory, Imperial College London, Prince Consort Road, London, SW7 2BW, UK}
59: 
60: \begin{abstract}
61: 
62: We describe a new method for analyzing gravitational lens images, for
63: the case where the source light distribution is pixelized. The method
64: is suitable for high resolution, high $S/N$ data of a multiply--imaged
65: extended source.  For a given mass distribution, we show that the step
66: of inverting the image to obtain the deconvolved pixelized source
67: light distribution, and the uncertainties, is a linear one. This means
68: that the only parameters of the non--linear problem are those required
69: to model the mass distribution. This greatly simplifies the search for
70: a min.$-\chi^2$ fit to the data and speeds up the inversion. The
71: method is extended in a straightforward way to include linear
72: regularization. We apply the method to simulated Einstein ring images
73: and demonstrate the effectiveness of the inversion for both the
74: unregularized and regularized cases.
75: 
76: \end{abstract}
77: 
78: %% Keywords should appear after the \end{abstract} command.
79: %% See the instructions to authors to determine appropriate 
80: %% keyword punctuation.
81: 
82: \keywords{gravitational lensing}
83: 
84: 
85: \section{Introduction}
86: \label{sec:intro}
87: 
88: This paper is concerned with the problem of inversion of a
89: gravitationally--lensed image of an extended source, i.e. a galaxy
90: rather than a star or quasar. This problem is interesting because
91: lensed images of extended sources provide more information than images
92: of point sources, and so potentially are more useful for determining
93: the mass profiles in galaxies and clusters of galaxies. Also, because
94: of the magnification, one can measure structure in the light profile
95: of the source at enhanced resolution. In this paper we show how this
96: problem can be separated in a natural way into linear and non--linear
97: dimensions, and how this provides a number of advantages.
98: 
99: In this introduction we review solutions to the inversion problem and
100: introduce some of the terminology used in the remainder of the paper.
101: In all the solutions described here the mass in the lens is
102: parameterized. Nevertheless the analysis applies equally to a
103: pixelized mass distribution.
104: 
105: When presented with the lensed image of an extended source, the
106: unknowns to solve for are the source light profile and the lens mass
107: profile, and the uncertainties in these quantities. One approach to
108: this problem, suggested by Kayser and Schramm (1988), uses the fact
109: that regions of the source that are multiply--imaged have the same
110: surface--brightness. For a trial mass distribution, the method traces
111: image pixels to the source plane where the counts in different image
112: pixels mapping to the same source pixel are compared. The solution for
113: the mass is obtained by minimizing the dispersion in the image pixel
114: counts for such multiply--imaged source pixels. Kochanek et al. (1989)
115: successfully applied this approach to the inversion of the radio
116: Einstein ring MG1131+0456. The algorithm was refined by Wallington,
117: Kochanek, and Koo (1995) who applied it to the triply--imaged giant
118: arc in the galaxy cluster Cl\,0024+1654.
119: 
120: The main shortcoming of this approach is that it does not deal with
121: the image point--spread--function (psf). If psf smearing of the image
122: (either instrumental or atmospheric) is significant, the light profile
123: of the source is not correctly recovered by backward tracing the
124: image, even if the mass distribution is exactly known. To deal with
125: the psf, a forward approach is needed i.e. one chooses a model for the
126: source light profile (parameterized or pixelized), and a model for the
127: mass (parameterized or pixelized), forms the image, convolves it with
128: the psf, and compares it to the actual image, adjusting the source and
129: lens models to minimize a merit function e.g. $\chi^2$.
130: 
131: An argument for choosing to parameterize rather than pixelize the
132: source light profile is that it forces the solution to be smooth.
133: Nevertheless, the source light profile may be complex. This is true,
134: for example, in the cases of MG1131+0456 and Cl\,0024+1654, cited
135: above. A large number of parameters might be required to provide a
136: satisfactory description. Without clues to the character of the source
137: it is extremely difficult to select the best parameterization i.e. the
138: one which provides a satisfactory fit with the smallest number of
139: parameters. In the most extreme example Tyson, Kochanski, \&
140: Dell'Antonio (1998) used 232 parameters to model the source light
141: distribution of the galaxy lensed by the cluster Cl\,0024+1654.
142: 
143: If the source light profile is complex it is natural to consider
144: pixelizing the source, i.e. the counts in each pixel are free
145: parameters. This removes the difficulty in finding a good
146: parameterization for the source, and thereby avoids any bias in the
147: fitted mass profile resulting from a poor choice. On the other hand,
148: due to the deconvolution, and because the pixels are independent, the
149: solution can be noisy. It is possible to achieve a smooth pixelized
150: solution by adding a suitable `regularizing' term to the merit
151: function. This term, if minimized on its own, would produce a smooth
152: source light profile. By adding this term to $\chi^2$ the final
153: solution involves a balance between obtaining the best fit to the
154: image (minimizing $\chi^2$), and obtaining a smooth source solution
155: (minimizing the regularizing term). Wallington, Kochanek, and Narayan
156: (1996) apply this approach to the case of the radio Einstein ring MG
157: 1654+134. They use a maximum entropy approach i.e. the regularizing
158: term to be minimized is the negative of the entropy, the negentropy.
159: Labeling the counts in source pixel $i$ by $s_i$, the source plane
160: negentropy is $\sum_i s_i\ln(s_i)$, and the
161: merit function
162: they minimize is 
163: \be
164: \label{eq:eqa}
165:   G=\chi^2_{im}+\lambda\sum_i s_i\ln(s_i) 
166: \ee 
167: (here we have followed the notation in Press et al., 2001, \S 18.7).
168: The purpose of the multiplier $\lambda$ is to give more or less weight
169: to the negentropy term.
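
As a brief aside, using only the definitions above, the reason an
iterative source cycle is needed with this merit function can be seen
by differentiating equation (\ref{eq:eqa}) with respect to the counts
in one source pixel:
\[
  \frac{\partial G}{\partial s_i}=
  \frac{\partial\chi^2_{im}}{\partial s_i}+\lambda\left[\ln(s_i)+1\right].
\]
The logarithm makes the condition $\partial G/\partial s_i=0$
non--linear in $s_i$, so the source pixel counts cannot be obtained by
a single matrix inversion.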
170: 
171: The inversion proceeds as follows: For a fixed value of $\lambda$, the
172: solution is determined by searching through the multi--parameter space
173: for the minimum of the merit function. The number of dimensions of the
174: parameter space to search is the sum of the number of source pixels
175: and the number of parameters used for the mass. This search is most
176: efficiently achieved with a pair of nested cycles. The inner (source)
177: cycle searches for the best source light profile for a fixed mass
178: profile. The outer (mass) cycle adjusts the mass profile. Outside this
179: cycle is a third ($\lambda$) cycle where the multiplier is
180: adjusted. Because the negentropy term acts to smooth the source, as
181: $\lambda$ increases, the $\chi^2_{im}$ term also increases, i.e. the
182: fit becomes worse. The principle for reaching the final solution
183: (e.g. Press et al., 2001, \S 18.4) is to start with $\lambda$ large,
184: then progressively reduce the weight of the regularizing term until
185: the $\chi^2_{im}$ becomes satisfactory. In other words the solution
186: has the smoothest source that provides a satisfactory fit to the
187: image. `Satisfactory' is usually interpreted as reaching the criterion
188: for the $\chi^2$ for the image
189: $\chi^2_{im}=\min(\chi^2_{im})+\sigma(\chi^2_{im})$.  With three nested
190: cycles, the $\lambda$, mass, and source cycles, the routine can be
191: slow.
192: 
193: In this paper we describe a new technique which we suggest simplifies
194: and clarifies the problem in a number of ways. In purely formal terms,
195: the method is very similar to the maximum entropy method of Wallington
196: et al.: Algebraically, we simply replace the negentropy term in the merit
197: function (\ref{eq:eqa}) with a linear regularization term. However,
198: the insight we bring is to show that for a fixed mass distribution,
199: the minimization of the merit function is now a linear problem
200: i.e. can be solved by matrix inversion. In other words the source
201: cycle \---\ the innermost of the three minimization cycles \---\ is
202: eliminated. This has major benefits. In the first place the inversion
203: is much quicker, thereby allowing a more thorough search for the best
204: fit mass model. At the same time, the uncertainty in identifying the
205: true minimum over the source pixels is removed. The method also greatly simplifies
206: calculation of the uncertainties, as we show below. More generally,
207: the formalism clarifies the essence of the problem: The source
208: parameters are linear dimensions and the mass parameters are
209: non--linear dimensions. For this reason we call the method
210: `semi--linear'.
211: 
212: At this point it is worth noting that, because of magnification and
213: multiple imaging, the number of constraints on the solution can be
214: much greater than the number of parameters to solve for. In this
215: respect the lens inversion problem differs from many inversion
216: problems encountered in astronomy (for example image
217: deconvolution). We find, as a consequence, that in many circumstances
218: the regularization term can be removed altogether. So the $\lambda$
219: cycle is also eliminated. The merit function is then just
220: $\chi^2_{im}$, and this is our starting point for the presentation of the
221: theory. For a fixed mass profile, the pixelized source light
222: distribution that produces the min.$-\chi^2_{im}$ fit is obtained by
223: linear inversion. The mass profile is then adjusted to find the
224: minimum of these min.$-\chi^2_{im}$ fits. The advantage, besides speed
225: (only the mass cycle remains), is that the solution is unbiased, since
226: there is no smoothing of the source.
227: 
228: The outline of the remainder of the paper is as follows. In
229: \S\ref{sec:theory} we explain the basic theory, demonstrating that for
230: a fixed mass profile, the problem of obtaining the source light profile
231: by $\chi^2$ minimization in the image plane is a linear one, and
232: obtaining the covariance matrix for the counts in the source
233: pixels. We then extend the basic theory to include a linear
234: regularization term. In \S\ref{sec:sims} we apply the method to a
235: realistic problem, assessing the performance for different psf widths,
236: and different source pixel sizes, with and without linear
237: regularization. In \S\ref{sec:conc} we provide a summary of the main
238: points, together with recommendations for applying the method.
239: 
240: \section{Theory}
241: \label{sec:theory}
242: 
243: In this section we present the theory of semi--linear inversion,
244: firstly without regularization (\S\ref{sec:theory.unreg}), and then
245: with regularization (\S\ref{sec:theory.reg}). In each sub--section we
246: begin with the case where the mass is fixed, and then treat the
247: general case, minimizing also on the mass parameters.
248: 
249: \subsection{Semi--linear inversion without regularization}
250: \label{sec:theory.unreg}
251: 
252: \subsubsection{Fixed mass: Eliminating the source cycle}
253: \label{sec:theory.unreg.fix}
254: 
255: Without any regularizing term, the merit function is simply
256: $G=\chi^2_{im}$. The basic problem is to find the counts in the source
257: pixels that, for a given mass distribution, minimize the merit
258: function $G$, i.e. give the best fit to the observed image. Pixels in
259: the source plane are labeled $i=1,I$. There is no restriction on how
260: the source plane is tessellated. In principle, the pixels could change
261: in both size and shape across the source region, which itself could be
262: of any shape. Pixels in the image plane are labeled $j=1,J$. It is
263: assumed in the following that the image pixels include counts from the
264: image of the lensed source only i.e. the images of any lensing
265: galaxies, and the mean sky count, have been subtracted. Also we suppose
266: that the data in each image pixel are independent i.e. are
267: characterized by the counts $d_j$, and dispersion $\sigma_j$, with no
268: covariance between pixels (appropriate for CCD data).
269: 
270: The inversion proceeds as follows: Choose the mass model parameters,
271: then, for each source pixel $i$, in turn, form the image for unit
272: counts (surface brightness) by appropriate ray tracing and convolution
273: with the known point spread function i.e. compute the counts in the
274: $i$th image $f_{ij},j=1,J$. The problem now is to combine these $I$
275: images with scalings $s_i,i=1,I$, to minimize $G$. These scalings are
276: the deconvolved intrinsic source surface--brightness distribution.
277: 
278: The problem is of a standard type. The merit function is written
279: \be
280: \label{eq:eqb}
281:   G=\chi^2_{im}=\sum_{j=1}^{J}
282:   \left[\frac{\sum_{i=1}^{I}s_if_{ij}-d_j}{\sigma_j}\right]^2.
283: \ee
284: 
285: Minimizing with respect to each of the source terms we have a set of
286: $I$ simultaneous equations of the form
287: \be
288: \label{eq:eqc}
289:   \frac{1}{2}\frac{\partial{G}}{\partial{s_i}}=0=\sum_{j=1}^{J}
290:   \left[\frac{f_{ij}\sum_{k=1}^{I}s_kf_{kj}-f_{ij}d_j}{\sigma_j^2}\right]
291: \ee
292: where the reason for the factor $\frac{1}{2}$ will soon become clear.
293: These equations may be written in matrix form
294: \be
295: \label{eq:eqd}
296:   \bsf{F}\bsf{S}=\bsf{D}.
297: \ee
298: Here \bsf{S} is a column matrix of length $I$ containing the elements
299: $s_i$, to be solved for. \bsf{F} is a symmetric $I\times I$ matrix,
300: with elements
301: $\bsf{F}_{ik}=\sum_{j=1}^{J}f_{ij}f_{kj}/\sigma_j^2$. Finally \bsf{D}
302: is a column matrix of length $I$ containing the elements
303: $\bsf{D}_i=\sum_{j=1}^{J}f_{ij}d_{j}/\sigma_j^2$.
304: 
305: The solution for the counts in the source pixels is then simply
306: obtained by matrix inversion
307: \be
308: \label{eq:eqe}
309:   \bsf{S}=\bsf{F}^{-1}\bsf{D}
310: \ee
311: thus eliminating the source cycle.
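
The structure of equations (\ref{eq:eqb}) to (\ref{eq:eqe}) can be
sketched compactly as follows. The symbols $\bsf{A}$ and $\bsf{b}$ are
introduced here for illustration only and are not used elsewhere:
$\bsf{A}$ is the $J\times I$ matrix with elements
$\bsf{A}_{ji}=f_{ij}/\sigma_j$, and $\bsf{b}$ is the column matrix of
length $J$ with elements $\bsf{b}_j=d_j/\sigma_j$. Then
\begin{eqnarray*}
  G &=& \left|\bsf{A}\bsf{S}-\bsf{b}\right|^2
   = \bsf{S}^{\rm T}\bsf{F}\bsf{S}-2\,\bsf{S}^{\rm T}\bsf{D}
   +\sum_{j=1}^{J}\frac{d_j^2}{\sigma_j^2}, \\
  \bsf{F} &=& \bsf{A}^{\rm T}\bsf{A}, \qquad
  \bsf{D} = \bsf{A}^{\rm T}\bsf{b},
\end{eqnarray*}
and substituting $\bsf{S}=\bsf{F}^{-1}\bsf{D}$ gives
\[
  \min_{\bsf{S}}\,G=\sum_{j=1}^{J}\frac{d_j^2}{\sigma_j^2}
  -\bsf{D}^{\rm T}\bsf{F}^{-1}\bsf{D}.
\]
Written this way, the minimum $\chi^2_{im}$ for a trial mass model
follows directly from \bsf{F} and \bsf{D}, and \bsf{F} can be inverted
only if the $I$ model images $f_{ij}$ are linearly independent over
the $J$ image pixels.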
312: 
313: The solution for the errors has a particularly simple form. We seek the
314: covariance matrix for the source pixels. Noting that
315: \be
316:   \label{eq:eqf}
317:   \bsf{F}_{ik}=\frac{1}{2}\frac{\partial{^2G}}{\partial{s_i}\partial{s_k}},
318: \ee
319: we see that the matrix \bsf{F} is one half times the Hessian matrix of
320: $\chi^2_{im}$, which is to say that \bsf{F} is the {\em curvature
321: matrix} of the problem (Press et al., 2001, \S\S15.4, 15.5) \---\ this
322: was the reason for using the factor $\frac{1}{2}$ in equation
323: (\ref{eq:eqc}). We now show that the matrix $\bsf{C}=\bsf{F}^{-1}$ is
324: the required covariance matrix of $\bsf{S}$.
325: 
326: For independent image pixels, the covariance between source pixels $i$ 
327: and $k$ is given by
328: \be
329:   \label{eq:eqg}
330:   \sigma_{ik}^2=\sum_{j=1}^{J}\sigma_{j}^2\frac{\partial{s_i}}
331:   {\partial{d_j}}\frac{\partial{s_k}}{\partial{d_j}}.
332: \ee
333: Using equation (\ref{eq:eqe}) this becomes
334: \be
335:   \label{eq:eqi}
336:   \sigma_{ik}^2=\sum_{j=1}^{J}\sigma_{j}^2
337:   \sum_{l=1}^{I}\bsf{C}_{il}\frac{f_{lj}}{\sigma_j^2}
338:   \sum_{m=1}^{I}\bsf{C}_{km}\frac{f_{mj}}{\sigma_j^2}.
339: \ee
340: Multiplying this out gives
341: \be
342:   \label{eq:eqj}
343:   \sigma_{ik}^2=\bsf{C}_{ik}
344: \ee
345: as required.
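
For completeness, the intermediate step between equations
(\ref{eq:eqi}) and (\ref{eq:eqj}) can be written out, using the
definition of $\bsf{F}_{lm}$ and the symmetry of \bsf{C}:
\begin{eqnarray*}
  \sigma_{ik}^2 &=& \sum_{l=1}^{I}\sum_{m=1}^{I}\bsf{C}_{il}\bsf{C}_{km}
  \sum_{j=1}^{J}\frac{f_{lj}f_{mj}}{\sigma_j^2} \\
  &=& \sum_{l=1}^{I}\sum_{m=1}^{I}\bsf{C}_{il}\bsf{F}_{lm}\bsf{C}_{mk}
  = [\bsf{C}\bsf{F}\bsf{C}]_{ik} = \bsf{C}_{ik},
\end{eqnarray*}
where the last equality holds because $\bsf{C}\bsf{F}$ is the identity
matrix.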
346: 
347: We see that for the case of fixed mass, the covariance matrix for the
348: source pixel counts is computed in the inversion step, without the
349: need for further calculation. We shall refer to this $I\times I$
350: matrix as the {\em source covariance matrix} hereafter. Even though it
351: is not the complete solution for the source pixel errors (because the
352: mass parameters have been fixed), the source covariance matrix is
353: extremely useful, for example, in exploring different mass models and
354: pixelizations (\S\ref{sec:sims}).
355: 
356: It is worth noting that the semi--linear inversion solution, either
357: with or without regularization, differs in character from the maximum
358: entropy solution. With the semi--linear method the counts in any
359: source pixel are unbounded, so the best--fit value could be negative,
360: since some image pixels contain negative counts (i.e. are below mean
361: sky). With the maximum entropy method all source counts must be
362: positive. The semi--linear method provides the best estimate of the
363: counts in a source pixel, and the solution is satisfactory provided
364: the result is consistent with being positive. If the counts in any
365: source pixel are significantly negative (e.g. $<-4\sigma$) this would
366: indicate a bad mass model. This possibility of testing the suitability
367: of the mass model with a source--plane statistic can be viewed as an
368: extra positive feature of the semi--linear method.
369: 
370: \subsubsection{Mass cycle}
371: \label{sec:theory.unreg.mass}
372: 
373: The full solution proceeds by searching through the mass--distribution
374: parameter space, at each point minimizing $\chi^2_{im}$ by linear
375: inversion, to find the smallest of these min.$-\chi^2_{im}$ values,
376: the global minimum. Because the number of dimensions of the parameter
377: space for the non--linear search has been greatly reduced, it is now a
378: much simpler problem to locate the true minimum securely. 
379: 
380: The solution for the errors is more complicated than above, since we
381: have added in the non--linear mass dimensions. If there are $L$
382: parameters that describe the mass, labeled $m_l$, we need to form the
383: $(I+L)\times(I+L)$ (symmetric) curvature matrix of the problem. But
384: note that the majority of the terms, the $I\times I$ terms
385: $\frac{1}{2}\frac{\partial{^2G}}{\partial{s_i}\partial{s_k}}$, have
386: already been computed and are the elements of the matrix \bsf{F} at
387: the global minimum. The remaining terms, the $L$ rows (and columns) of
388: terms such as
389: $\frac{1}{2}\frac{\partial{^2G}}{\partial{m_l}\partial{m_n}}$, and
390: $\frac{1}{2}\frac{\partial{^2G}}{\partial{m_l}\partial{s_i}}$, can be
391: filled in by simple measurement of the shape of the $\chi^2_{im}$
392: surface about the global minimum. The covariance matrix for the mass
393: and source parameters is the inverse of this curvature matrix.  We
394: shall refer to this $(I+L)\times(I+L)$ matrix as the {\em full
395: covariance matrix} hereafter.
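
The layout of this curvature matrix can be sketched in block form. The
block labels \bsf{B} and \bsf{M}, and the step size $h_l$, are
introduced here for illustration only:
\[
  \left(\begin{array}{cc}
    \bsf{F} & \bsf{B} \\
    \bsf{B}^{\rm T} & \bsf{M}
  \end{array}\right), \qquad
  \bsf{B}_{il}=\frac{1}{2}\frac{\partial^2G}{\partial s_i\partial m_l},
  \qquad
  \bsf{M}_{ln}=\frac{1}{2}\frac{\partial^2G}{\partial m_l\partial m_n}.
\]
One possible way to measure a diagonal element of \bsf{M}, assuming a
small step $h_l$ in the parameter $m_l$, is the central difference
\[
  \bsf{M}_{ll}\approx\frac{G(m_l+h_l)-2G(m_l)+G(m_l-h_l)}{2h_l^2},
\]
with the source counts and the remaining mass parameters held at their
best--fit values. By the standard result for the inverse of a
partitioned matrix, the $L\times L$ mass block of the full covariance
matrix is then
$[\bsf{M}-\bsf{B}^{\rm T}\bsf{F}^{-1}\bsf{B}]^{-1}$.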
396: 
397: \subsection{Semi--linear inversion with regularization}
398: \label{sec:theory.reg}
399: 
400: \subsubsection{Fixed mass: Eliminating the source cycle}
401: \label{sec:theory.reg.fix}
402: 
403: The possibility of replacing the negentropy term in equation
404: (\ref{eq:eqa}) by a term (a linear regularization term) which
405: preserves the linearity of the min.$-\chi^2$ approach is made evident
406: by the linearity of equation (\ref{eq:eqc}) with respect to the source
407: parameters. Clearly we can form a merit function by adding to
408: $\chi^2_{im}$ any term $G_L$ which is a linear combination of terms
409: $s_is_k$
410: \be
411:   \label{eq:eqk}
412:   G_L=\sum_{i,k}a_{ik}s_is_k
413: \ee
414: since the partial differentials of these additional terms will also be
415: linear. One example of a linear regularization term is
416: $G_L=\sum_{i=1}^{I}s_i^2$. The choice of $G_L$ is discussed below.
417: 
418: Writing the merit function generally as
419: \be
420:   \label{eq:eql}
421:   G=\chi^2_{im}+\lambda G_L
422: \ee
423: then, following through the same analysis as in
424: \S\ref{sec:theory.unreg.fix}, the solution for the counts in the
425: source pixels can be written
426: \be
427: \label{eq:eqm}
428:   \bsf{S}=[\bsf{F}+\lambda\bsf{H}]^{-1}\bsf{D}.
429: \ee
430: We call the matrix \bsf{H} the regularization matrix. The elements of
431: \bsf{H} are 
432: \be 
433:   \label{eq:eqn}
434:   \bsf{H}_{ik}=\frac{1}{2}\frac{\partial{^2G_L}}{\partial{s_i}\partial{s_k}}.
435: \ee
436: For example, if the regularization term is $G_L=\sum_{i=1}^{I}s_i^2$,
437: then we have $\bsf{H}=\bsf{I}$, the identity matrix.
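
The step from equation (\ref{eq:eql}) to equation (\ref{eq:eqm}) can
be made explicit. Differentiating equation (\ref{eq:eql}) and using
equations (\ref{eq:eqc}) and (\ref{eq:eqn}) gives
\[
  \frac{1}{2}\frac{\partial G}{\partial s_i}=
  \sum_{k=1}^{I}\bsf{F}_{ik}s_k-\bsf{D}_i
  +\lambda\sum_{k=1}^{I}\bsf{H}_{ik}s_k=0,
\]
i.e. $[\bsf{F}+\lambda\bsf{H}]\bsf{S}=\bsf{D}$, which is linear in
\bsf{S} and leads directly to equation (\ref{eq:eqm}). For $G_L$ of the
form of equation (\ref{eq:eqk}) the elements are
$\bsf{H}_{ik}=\frac{1}{2}(a_{ik}+a_{ki})$, so \bsf{H} is symmetric
whether or not the coefficients $a_{ik}$ are.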
438: 
439: The form of $G_L$ should be chosen to penalize noisy solutions. The
440: choice $G_L=\sum_{i=1}^{I}s_i^2$, termed ``zeroth--order''
441: regularization in the literature, is one attempt to achieve
442: this. Other widely--used linear regularization terms include {\em
443: gradient} and {\em curvature} forms. These three regularization forms
444: correspond, loosely speaking, to the prejudice that the source light
445: profile is, respectively, approximately zero, constant, or planar (see
446: Press et al., 2001, \S18.5, for a detailed account of the theory of
447: linear regularization and its implementation). In practice, if
448: $\lambda$ is not too large, all three terms serve to smooth the source
449: in a rather similar way, and there is little to distinguish between
450: the solutions.
451: 
452: The gradient and curvature forms consider the differences between
453: counts in neighboring pixels. Until now we have used a
454: one--dimensional numbering scheme for the source pixels. In this case,
455: since we need to take account of the relative spatial locations of
456: pixels in the source plane we use coordinates $x, y$. The simplest
457: gradient term is
458: \be 
459:   \label{eq:eqo}
460:   G_L=\sum_{x,y}[s_{x,y}-s_{x+1,y}]^2+\sum_{x,y}[s_{x,y}-s_{x,y+1}]^2.
461: \ee 
462: Another form uses $[s_{x,y}-0.5(s_{x+1,y}+s_{x,y+1})]^2$. In forming
463: the sum it is necessary to translate the indices $x,y$ to the index
464: $i$, and then equation (\ref{eq:eqn}) is used to form the matrix
465: \bsf{H}. Note that zeroth--order regularization is computationally by
466: far the simplest method, since it does not involve this step of
467: translation of indices.
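
As a small illustration of this translation, consider a hypothetical
one--dimensional source of three pixels, $i=1,2,3$, for which the
gradient term reduces to $G_L=(s_1-s_2)^2+(s_2-s_3)^2$. Equation
(\ref{eq:eqn}) then gives
\[
  \bsf{H}=\left(\begin{array}{rrr}
    1 & -1 & 0 \\
    -1 & 2 & -1 \\
    0 & -1 & 1
  \end{array}\right).
\]
For a rectangular grid of $N_x\times N_y$ source pixels, one possible
mapping (an assumed convention; any one--to--one mapping would serve)
is $i=x+(y-1)N_x$ with $x=1,N_x$ and $y=1,N_y$, so that horizontal
neighbors differ by 1 in $i$ and vertical neighbors differ by $N_x$.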
468: 
469: The simplest curvature form is
470: \ba 
471:   \label{eq:eqp}
472:   G_L=\sum_{x,y}[s_{x,y}-0.5(s_{x-1,y}+s_{x+1,y})]^2 \nonumber \\
473:   +\sum_{x,y}[s_{x,y}-0.5(s_{x,y-1}+s_{x,y+1})]^2.
474: \ea
475: Another form uses
476: $[s_{x,y}-0.25(s_{x-1,y}+s_{x+1,y}+s_{x,y-1}+s_{x,y+1})]^2$.
477: 
478: The source covariance matrix for the regularized case is fortunately
479: only slightly more difficult to compute than for the unregularized
480: case. Writing $\bsf{R}=[\bsf{F}+\lambda\bsf{H}]^{-1}$, and following
481: the same line of reasoning as in \S\ref{sec:theory.unreg.fix}, we
482: obtain the analogous equation to equation (\ref{eq:eqi})
483: \be
484:   \label{eq:eqq}
485:   \sigma_{ik}^2=\sum_{j=1}^{J}\sigma_{j}^2
486:   \sum_{l=1}^{I}\bsf{R}_{il}\frac{f_{lj}}{\sigma_j^2}
487:   \sum_{m=1}^{I}\bsf{R}_{km}\frac{f_{mj}}{\sigma_j^2}.
488: \ee
489: Multiplying out we obtain
490: \be
491:   \label{eq:eqr}
492:   \sigma_{ik}^2=\bsf{R}_{ik}-
493:   \lambda\sum_{l=1}^I \bsf{R}_{il}[\bsf{RH}]_{kl}.
494: \ee
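
The step from equation (\ref{eq:eqq}) to equation (\ref{eq:eqr}) can be
filled in as follows. Carrying out the sum over $j$ as in the
unregularized case gives
$\sigma_{ik}^2=[\bsf{R}\bsf{F}\bsf{R}]_{ik}$. Since
$\bsf{R}^{-1}=\bsf{F}+\lambda\bsf{H}$, we may write
$\bsf{F}=\bsf{R}^{-1}-\lambda\bsf{H}$, and therefore
\[
  \bsf{R}\bsf{F}\bsf{R}=\bsf{R}[\bsf{R}^{-1}-\lambda\bsf{H}]\bsf{R}
  =\bsf{R}-\lambda\bsf{R}\bsf{H}\bsf{R},
\]
whose $(i,k)$ element is equation (\ref{eq:eqr}) because \bsf{R} and
\bsf{H} are symmetric. Setting $\lambda=0$ recovers the unregularized
result, equation (\ref{eq:eqj}).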
495: 
496: \subsubsection{Mass cycle}
497: \label{sec:theory.reg.mass}
498: 
499: The procedure for the full solution is the same as for the
500: unregularized case. One searches through the mass--distribution
501: parameter space, at each point minimizing $G$ by linear inversion, to
502: find the smallest of these min.$-G$ values, the global minimum. In the
503: regularized case, however, there is no simple solution for the full
504: covariance matrix. In the unregularized case, we were able to use the
505: fact that the inverse of the full curvature matrix is the full
506: covariance matrix. But in the regularized case this is no longer true
507: since we are minimizing $G=\chi^2_{im}+\lambda G_L$. Instead, an alternative
508: approach must be used, for example, a Monte Carlo method which inverts
509: an ensemble of realisations of the image by adding noise to the
510: original image.
511: 
512: \section{Simulations}
513: \label{sec:sims}
514: 
515: In this section we apply the semi--linear inversion method to a
516: realistic test problem. To validate the linear inversion step, we
517: begin with the case of fixed mass. We quantify the effectiveness of
518: the method under variations of the image $S/N$, psf width, and source
519: pixel size, for both the unregularized and regularized cases. We then
520: present an analysis of the full problem, allowing the mass parameters
521: to vary. Finally we discuss the advantages of the unregularized and
522: regularized approaches, for different practical applications.
523: 
524: \subsection{Test problem}
525: \label{sec:sims.test}
526: 
527: To make the computations more useful we have based our investigation
528: on a realistic simulation of a deep image of an Einstein ring
529: gravitational lens system observed with the Advanced Camera for
530: Surveys (ACS) aboard HST. The camera has a pixel size of
531: $0.05\arcsec$.  We have used cosmological parameters $\Omega_m=0.3$,
532: $\Omega_{\Lambda}=0.7$. The lens is placed at $z=0.3$ and the source
533: lies at $z=3.0$.
534: 
535: Figure 1 illustrates the test problem. The lens (not shown) is
536: modelled as a singular isothermal ellipsoid with one--dimensional
537: velocity dispersion $260$ km s$^{-1}$ and ellipticity
538: $e=1-b/a=0.25$. The semi--major axis of the lens is aligned at
539: $40^{\circ}$ counterclockwise from the vertical. The Einstein angle is
540: $\theta_E=4\pi\sigma^2D_{ds}/(c^2D_s)=1.58\arcsec$. The source, shown
541: in the top left panel, is contained within a square of size
542: $0.75\arcsec$ and is modelled as two circular sources of Gaussian
543: profile, binned in $0.05\arcsec$ pixels, the same as the image pixel
544: size. The peak surface brightness of each source is 1.0, in arbitrary
545: units.  One source lies inside the inner caustic, while the second
546: source straddles the inner caustic. This source configuration
547: resembles that inferred for the gravitational lens $0047-2808$ (Wayth
548: et al., 2003). To create a realistic ACS simulation the image was
549: formed by ray tracing, then convolved with the point spread function,
550: and noise added (in the manner described in the following
551: paragraph). For simplicity we modelled the psf as a Gaussian, and
552: chose FWHM $0.08\arcsec$ which is the resolution of a
553: diffraction--limited telescope of the same diameter as HST, at a
554: wavelength $\lambda=800$\,nm. Because of the slight undersampling, the
555: convolution is made on a sub--pixelized grid and then binned up to the
556: full pixel size.
557: 
558: The data pixels used for the inversion were the 3626 pixels within the
559: annulus in the image plane marked in the figure. This annulus is
560: defined by the region covered by imaging the entire source
561: plane.\footnote{The region of the central image should also be
562: included for non--singular mass models.} An important point to note is
563: that the analysis region must at least cover this annulus, otherwise
564: the counts in some source pixels will be unconstrained and the
565: inversion will fail. A larger region may be used, but if it becomes
566: too large the usefulness of the $\chi^2$ statistic is diminished, as
567: then most of the pixels are in the background. The final step in the
568: simulation is to apply uniform Gaussian random noise over the image
569: plane. The noise level is defined in terms of the total $S/N_{im}$
570: integrated over the annulus. The same noise realisation was used for
571: all the simulations, but scaled in order to vary $S/N_{im}$.
572: 
573: The upper middle panel shows the final simulated ACS image. This
574: image, with source pixel size $0.05\arcsec$, $S/N_{im}=60$, and psf
575: FWHM$=0.08\arcsec$, is the reference test problem to invert. We later
576: vary these three parameters. The parameters of the different models we
577: have run are listed in Table \ref{table1}. Col. (1) provides the
578: simulation number. The reference problem is numbered 1. Col. (2) gives
579: the source pixel size in arcsec, col. (3) the summed $S/N$ in the
580: image, and col. (4) the psf FWHM in arcsec. Col. (5) is a label U or R
581: depending on whether the inversion was unregularized or
582: regularized. Then, in the case of regularized inversion, col. (6)
583: provides the degree of regularization, quantified by the parameter
584: $N_\lambda$, which is the increase of $\chi^2_{im}(\nu)$ from the
585: minimum in units of the standard deviation $\sigma(\chi^2_{im}(\nu))$.
586: Recalling the discussion in \S\ref{sec:intro}, a value $N_\lambda=1$ in this column
587: corresponds to the usual criterion for the maximum allowed degree of
588: regularization. The other columns are explained in the next section.
589: 
590: \clearpage
591:   
592: \begin{figure}
594: \plotone{f1.eps}
595: \caption{This plot shows the unregularized solution for the reference
596: problem, line (1), Table 1. The source plane, top left panel and
597: bottom row, is $0.75\arcsec\times0.75\arcsec$ with $0.05\arcsec$
598: pixels, and is centered on the optic axis. The source comprises two
599: circular Gaussian components and is shown top left. Also marked is the
600: line of the inner caustic for the isothermal ellipsoid lens. The
601: image, convolved with the psf, FWHM $0.08\arcsec$, and with noise
602: added, is shown upper middle. The image pixel size is $0.05\arcsec$
603: and the image box size is $5.0\arcsec\times5.0\arcsec$.  The lower
604: left panel is the source light distribution reconstructed from the
605: image by semi--linear inversion without regularization. The upper
606: right panel is the image of this source, convolved with the psf. The
607: lower right panel displays the $1\sigma$ uncertainty for the source
608: pixels, and the lower middle panel is the source $S/N$ image. The
609: dotted square is the region over which $S/N_{so}$ is measured. In this
610: and the following two figures counts in pixels in both the image and
611: source plane are in units of surface brightness.}\label{fig:lowres.unreg}
612: \end{figure}
613: 
614: \clearpage
615: 
616: \begin{table}
617: \footnotesize
618: \centering
619: \begin{tabular}{|rcccccccrccr|}
620: %
621: \hline
622: %
623: % Column headings:
624: %
625: (1) & (2) & (3) & (4) & (5) & (6) & (7) & (8) & (9) & (10) & (11) &
626: (12) \\
627: sim. &
628: source                           & 
629: $S/N_{im}$     & 
630: psf	           &  
631: U/R &
632: $N_\lambda$ &
633: $\chi^2_{im}(\nu)$                    &  
634: $\chi^2_{so}(\nu)$                     &   
635: $S/N_{so}$      & 
636: $|\Delta s/\sigma|$          & 
637: $\Delta s_{rms}$             &
638: note   \\
639: no. & pix. size & & FWHM & & & & & & max. & & \\
640:  & $''$ & & $''$ & & & & & & & & \\
641: %
642: \hline
643: %
644: % entries:
645: % unreg:
646:  1 & 0.050 & 60.0 & 0.08 & U &   & $0.956\pm0.024$ & $1.088\pm0.094$ &
647:  79.9 & 2.80 & 0.082 & Fig. 1 \\
648: % unreg:
649:  2 & 0.050 & 30.0 & 0.08 & U &   & $0.956\pm0.024$ & $1.088\pm0.094$ & 
650: 40.6 & 2.80 & 0.163 & \\
651: % NO PSF, unreg:
652:  3 & 0.050 & 60.0 & 0.00 & U &   & $0.958\pm0.024$ & $1.052\pm0.094$ & 
653: 111.2& 2.64 & 0.040 & \\
654: % unreg:
655:  4 & 0.050 & 60.0 & 0.16 & U &   & $0.956\pm0.024$ & $1.090\pm0.094$ & 
656: 24.6 & 2.73 & 0.252 & \\
657: % unreg:
658:  5 & 0.025 & 60.0 & 0.08 & U &   & $0.952\pm0.027$ & $1.003\pm0.047$ &
659:  20.2 & 2.84 & 0.634 & Fig. 2 \\ \hline
660: % regularised with lambda=0.42 to give delta chi^2 of 1*sig(chi^2):
661:  6 & 0.050 & 60.0 & 0.08 & R & 1 & $0.980\pm0.024$ & $1.138\pm0.094$ & 
662: 111.8& 3.02 & 0.031 & \\
663: % regularised with lambda=0.94 to give delta chi^2 of 2*sig(chi^2):
664:  7 & 0.050 & 60.0 & 0.08 & R & 2 & $1.004\pm0.024$ & $1.461\pm0.094$ & 
665: 120.8& 4.35 & 0.028 & \\
666: %
667: % regularised with lambda=0.015 to give delta chi^2 of 1*sig(chi^2):
668: 8 & 0.025 & 60.0 & 0.08 & R & 1 & $0.979\pm0.027$ & $1.003\pm0.047$ &
669: 64.5 & 3.14 & 0.242 & Fig. 3 \\
670: % regularised with lambda=0.100  to give delta chi^2 of 3*sig(chi^2):
671: 9 & 0.025 & 60.0 & 0.08 & R & 3 & $1.033\pm0.027$ & $1.003\pm0.047$ & 
672: 86.6& 3.40 & 0.123 & Fig. 3 \\
673: % regularised with lambda=0.380  to give delta chi^2 of 5*sig(chi^2):
674: 10 & 0.025 & 60.0 & 0.08 & R & 5 & $1.088\pm0.027$ & $1.004\pm0.047$ & 
675: 98.2& 3.34 & 0.070 & \\
676: \hline
677: %
678: \end{tabular}
679: \caption{\normalsize Dependence of reconstruction performance on
680: source plane pixel size, simulated ring image noise, psf width, and
681: degree of regularization.}
682: \label{table1}
683: \end{table}
684: 
685: \clearpage
686: 
687: \subsection{Fixed mass}
688: \label{sec:sims.fix}
689: 
690: The main purpose of the simulations is to illustrate the linear
691: inversion step, the `inner cycle', i.e. that part of the semi--linear
692: inversion method that differs from previous methods. For this reason
693: in this sub--section we fix the mass parameters at the input
694: values. The image inversion, therefore, is achieved in a single step
695: using equations (\ref{eq:eqe}) and (\ref{eq:eqm}), for the
696: unregularized and regularized cases respectively, and using equations
697: (\ref{eq:eqj}) and (\ref{eq:eqr}) for the source covariance matrix. We
698: consider the full problem, solving also for the mass parameters in
699: \S\ref{sec:sims.mass}.
700: 
701: \subsubsection{Unregularized inversion}
702: \label{sec:sims.fix.unreg}
703: 
704: The unregularized inversion of the reference problem is provided in
705: the remaining panels of Figure 1. The lower left panel shows the
706: reconstructed source and the upper right panel shows the image of the
707: reconstructed source convolved with the psf, i.e. the
708: minimum--$\chi^2_{im}$ model fit to the simulated image. The bottom right
709: panel shows the source $\sigma$ image, i.e. the standard deviation in
710: each pixel. This provides a visual impression of the uncertainties;
711: note how the region of lowest $\sigma$ is bounded by the inner
712: caustic. However, the whole covariance matrix is required for a proper
713: interpretation of the results.  The lower middle panel is the source
714: $S/N$ image. In Figures 1--3, for the source $\sigma$ and
715: $S/N$ images the grayscale covers the full range of numbers in the
716: panel. For the other panels the same grayscale range is used in each
717: figure, to allow comparison of the relative noise levels.
718:  
719: We measure several quantities to assess the quality of the inversion,
720: listed in the remaining columns of Table \ref{table1}. The reduced
721: $\chi^2$ in the image plane, $\chi^2_{im}(\nu)$, is provided in
722: col. (7). The quoted uncertainty is given by $\sqrt{2/\nu}$, where
723: $\nu$ is the number of degrees of freedom, i.e. the number of image
724: pixels (3626) minus the number of source pixels (225 or 900).  The
725: reduced $\chi^2$ in the source plane, $\chi^2_{so}(\nu)$, and its
726: uncertainty, is provided in col. (8). To account for the covariance
727: terms this is computed using
728: \be
729: \label{eq:eqs}
730: \chi^{2}_{so}=\sum_{i,k} \Delta s_i \bsf{C}^{-1}_{ik} \Delta s_k
731: = \sum_{i,k} \Delta s_i \bsf{F}_{ik} \Delta s_k 
732: \ee 
733: where $\Delta s_i$ is the residual in the $i$th pixel.  Here the
734: number of degrees of freedom is the number of source pixels. Col. (9)
735: provides the $S/N$ summed over the small box in the source plane shown
736: in Figure 1. The noise is computed as the square root of the sum
737: of the elements of the sub--matrix formed by extracting from
738: the source covariance matrix $\bsf{C}$ the rows and columns
739: corresponding to the pixels in the box. Col. (10) provides the
740: absolute value of the significance $\Delta s/\sigma$ of the worst--fit
741: source pixel, and col. (11) lists the rms of the residuals in the
742: source plane.
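
As a consistency check on the quoted uncertainties, the values of $\sqrt{2/\nu}$ implied by the pixel counts given above can be worked out directly and compared with cols (7) and (8) of Table \ref{table1}. This is only illustrative arithmetic using the numbers already stated in the text:
\[
\begin{array}{lll}
\mbox{image plane, 225 source pixels:} & \nu=3626-225=3401, & \sqrt{2/3401}\simeq0.024,\\
\mbox{image plane, 900 source pixels:} & \nu=3626-900=2726, & \sqrt{2/2726}\simeq0.027,\\
\mbox{source plane, 225 source pixels:} & \nu=225, & \sqrt{2/225}\simeq0.094,\\
\mbox{source plane, 900 source pixels:} & \nu=900, & \sqrt{2/900}\simeq0.047.
\end{array}
\]
The summed signal--to--noise of col. (9) can be written in the same spirit. Denoting by $B$ the set of source pixels inside the box (a symbol introduced here for convenience), and assuming the signal is simply the sum of the reconstructed counts $s_i$ in the box, the description above corresponds to
\[
S/N_{so}=\frac{\sum_{i\in B}s_i}{\left(\sum_{i\in B}\sum_{k\in B}\bsf{C}_{ik}\right)^{1/2}},
\]
so that the covariance terms between pixels in the box contribute to the noise as well as the variances.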
743: 
744: The results for the reference problem, line (1) in Table \ref{table1},
745: are all satisfactory: The reduced $\chi^2$ values in the image and
746: source planes are both consistent with $1.0$, and the significance of
747: the worst pixel, $2.80\sigma$ (col. 10), is not unexpected given that
748: there are 225 source pixels. The summed $S/N$ in the source box is an
749: improvement on $S/N_{im}$. This might be expected since the box is
750: restricted to the small region of the source plane containing nearly
751: all the signal. At the same time it shows that the $S/N$ is not
752: greatly degraded by amplification of noise in the deconvolution
753: step. We return to this issue below. We interpret these results as
754: meaning that the inversion has succeeded and produced the correct
755: solution to the well--posed problem of finding the source--pixel
756: counts that give the best fit to the image.
757: 
758: In simulation 2 we doubled the noise in the image plane. Comparing
759: lines (1) and (2) in the table, the effect of this is to double the
760: noise in the source plane (col. 11), and so halve the $S/N$ of the
761: detected source (col. 9), as expected.
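
This scaling follows from the linearity of the inversion, and can be sketched as follows. We assume, as in \S\ref{sec:theory}, that $\bsf{F}$ is built from inverse--variance weighted sums over the image pixels, so that it scales as $\sigma_j^{-2}$; the factor $\alpha$ is notation introduced here. If every image--pixel uncertainty is multiplied by a common factor $\alpha$, then
\[
\sigma_j\rightarrow\alpha\sigma_j\quad\Longrightarrow\quad
\bsf{F}\rightarrow\alpha^{-2}\bsf{F},\qquad
\bsf{C}=\bsf{F}^{-1}\rightarrow\alpha^{2}\bsf{C},
\]
and the standard deviation of every source pixel is multiplied by $\alpha$. For $\alpha=2$ the entries of Table \ref{table1} give $0.163/0.082\simeq1.99$ in col. (11) and $79.9/40.6\simeq1.97$ in col. (9), in line with this expectation.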
762: 
763: \clearpage
764: 
765: \begin{figure}
766: \label{fig:hires.unreg}
767: \plotone{f2.eps}
768: \caption{The plot shows the unregularized solution for the same
769: problem as in Figure 1, but with source pixels half as large, and
770: corresponds to line (5), Table 1.  The source plane, top left panel
771: and bottom row, is $0.75\arcsec\times0.75\arcsec$ with $0.025\arcsec$
772: pixels, and is centered on the optic axis. The source comprises two
773: circular Gaussian components and is shown top left. Also marked is the
774: line of the inner caustic for the isothermal ellipsoid lens. The
775: image, convolved with the psf, FWHM $0.08\arcsec$, and with noise
776: added, is shown upper middle. The image pixel size is $0.05\arcsec$
777: and the image box size is $5.0\arcsec\times5.0\arcsec$.  The lower
778: left panel is the source light distribution reconstructed from the
779: image by semi--linear inversion without regularization. The grayscale
780: range is the same as in Figure 1. The reconstruction is poor, because
781: the source pixel size is too small. The upper right panel is the image
782: of this source, convolved with the psf. The lower right panel displays
783: the $1\sigma$ uncertainty for the source pixels, and the lower middle
784: panel is the source $S/N$ image. The dotted square is the region over
785: which $S/N_{so}$ is measured. In each of Figures 1--3, counts in
786: pixels in both the image and source plane are in units of surface
787: brightness.}
788: \end{figure}
789: 
790: \clearpage
791: 
792: % - all numbers from either recon6_sntest.f or lens3.f
793: % - s/n in box figures from recon6_sntest.f. In lowres case, box has
794: %   limits x=7-14 y=3-10 inc. Hi res, box is x=13-28, y=5-20
795: % - s/n in lensed image ring from lens3.f ('ring'=region in lens plane 
796: %      which traces to source plane)
797: % - for PSFs, Gaussian PSF of 1sig width=0.034'' (=ACS WFC) has FWHM 
798: %   of 0.08'' (or 1sig width = 0.68 pixels for 0.05 pix size)
799: %
800: % For comparison, the total signal to total noise in the annulus traced
801: % by the best fit lens model for our HST image of 0047 gives S/N=44.
802: %
803: % To get integrated S/N of exactly 60 and 30, need to use noise
804: % levels in lens3.f (ie. 'noise0' parameter) of:
805: %       0.092 and 0.184 respectively in lo-res case
806: %       0.082 and 0.163 respectively in hi res case
807: 
808: {\em Variation of psf FWHM.} We have investigated the effect of
809: varying the width of the psf. Line (3) provides the results for no
810: psf, and line (4) provides the results for a psf FWHM of
811: $0.16\arcsec$, double the reference value. Comparison of lines (1),
812: (3), and (4) shows that as the psf FWHM increases the noise in the
813: source plane, col. (11), increases and the source detection $S/N$,
814: col. (9), decreases.  This is as expected: In Fourier space the effect
815: of the psf is to suppress the amplitude of the power spectrum of the
816: source for large wave numbers. Therefore in the deconvolution process
817: the noise on these scales is amplified. As the psf is broadened the
818: power suppression is greater, and so the noise amplification in the
819: deconvolution step is greater. The reduction in $S/N_{so}$ from 111.2
820: (line 3) to 79.9 (line 1), in going from no psf to psf FWHM of
821: $0.08\arcsec$, is quite modest. This demonstrates that satisfactory
822: inversion of ACS images using $0.05\arcsec$ source pixels is possible
823: without regularization.
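
The Fourier--space argument can be made semi--quantitative. The following is a sketch only: it assumes a circular Gaussian psf, and the symbols $\sigma_{psf}$, $k$ (angular wavenumber) and $\tilde{P}$ are introduced here for illustration.
\[
\sigma_{psf}=\frac{\rm FWHM}{2\sqrt{2\ln2}}\simeq\frac{\rm FWHM}{2.355},\qquad
\tilde{P}(k)=\exp\left(-\frac{k^2\sigma_{psf}^2}{2}\right).
\]
Deconvolution corresponds to division by $\tilde{P}(k)$, so noise at wavenumber $k$ is amplified by the factor $\exp(+k^2\sigma_{psf}^2/2)$. Doubling the FWHM from $0.08\arcsec$ to $0.16\arcsec$ ($\sigma_{psf}$ from $\simeq0.034\arcsec$ to $\simeq0.068\arcsec$) multiplies the exponent by four, i.e. raises the amplification factor at fixed $k$ to the fourth power. The wavenumbers that matter in the source plane depend on the local magnification, so this estimate indicates the trend rather than predicting the entries in Table \ref{table1}.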
824: 
825: Regardless of the degree of amplification of noise, the various
826: statistical quantities in cols (7)--(11) of Table 1, lines (1)--(4),
827: are all reasonable. This shows that in these cases the inversion is
828: well behaved, and in none of the cases is the matrix ${\bsf F}$
829: singular. This contrasts with the usual inversion problem, for example
830: image deconvolution. With image deconvolution the number of parameters
831: to solve for (the counts in the deconvolved image pixels) is typically
832: the same as or greater than the number of constraints (the number of
833: image pixels). In lensing, because of magnification, the number of
834: image pixels may be much greater than the number of source
835: pixels. This suggests, further, that in regions where the
836: magnification is greatest it would be possible to use source pixels
837: smaller than the image pixels. We consider this issue below.
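
For the simulations considered here the degree of over--determination can be stated explicitly, using the pixel counts given earlier in this section (3626 image pixels, and 225 or 900 source pixels):
\[
\frac{3626}{225}\simeq16.1\ \ (0.05\arcsec\ \mbox{source pixels}),\qquad
\frac{3626}{900}\simeq4.0\ \ (0.025\arcsec\ \mbox{source pixels}).
\]
These are averages over the source plane. The number of constraints per source pixel will be larger where the magnification is high and smaller where it is low.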
838: 
839: The results of these first four simulations indicate that, in some
840: circumstances, provided the psf is not too broad, unregularized
841: semi--linear inversion provides a useful solution.
842: 
843: {\em Variation of source pixel size.} Figure 2 shows the same problem
844: as Figure 1 but with $0.025\arcsec$ source pixels rather than
845: $0.05\arcsec$ pixels. The results are summarized in line (5) of Table
846: \ref{table1}. The quality of the reconstruction, lower left panel, is
847: now dramatically worse, and outside the central region is clearly
848: unsatisfactory. (The grayscale range of this panel is the same as in
849: Figure 1.) Compared to line (1) the noise in the source (col. 11) has
850: risen by a factor 8, whereas intuitively one would expect only a
851: factor 2 increase (4 times as many pixels). This is indicative of
852: large amplification of noise, because the psf has suppressed the
853: signal on these scales. This is a consequence of the fact that the
854: separation in the image plane of the images of two adjacent source
855: pixels is smaller than the psf size. Put another way, a resolution
856: element in the image plane, traced back to the source plane, is
857: oversampled by the source pixel size. The source covariance matrix now
858: contains large, predominantly negative, covariance terms which
859: correspond to the odd/even appearance in the outer regions of the
860: source plane.
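
The size of the excess can be read off Table \ref{table1}. This is a simple illustration, which assumes that for uncorrelated noise the rms per pixel, in units of surface brightness, scales as the square root of the number of pixels:
\[
\frac{\Delta s_{rms}(\mbox{line 5})}{\Delta s_{rms}(\mbox{line 1})}=\frac{0.634}{0.082}\simeq7.7,
\qquad\mbox{expected}\ \sqrt{4}=2,
\qquad\mbox{excess}\ \simeq\frac{7.7}{2}\simeq3.9.
\]
The remaining factor of nearly 4 is the additional noise amplification attributed above to the deconvolution of the psf.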
861: 
862: The high noise level in the outer parts of the source--plane belies
863: the usefulness of this image. In fact the source is strongly detected,
864: albeit at reduced $S/N_{so}=20$, even though not readily apparent to
865: the eye. The source is clearly visible in the $S/N$ image, however. At
866: the same time the $\chi^2$ values in both the image and source planes
867: remain satisfactory. Because of the larger magnification, the
868: reconstruction is much better within the caustic line. This suggests
869: it would be advantageous to use a variable pixel size across the
870: source plane. For example, with reference to Figure 2, a scheme where
871: the pixel size is $0.05\arcsec$ outside the caustic and $0.025\arcsec$
872: inside might be appropriate. We need to identify a criterion for
873: choosing the pixel size that avoids the excessive amplification of
874: noise evident in Figure 2. There are clearly three variables which
875: determine the minimum source pixel size: The image pixel size, $a$,
876: the psf FWHM, $b$, and the magnification, $c$. We have had some
877: success with a scheme which relates the source pixel size to the
878: variable $\max(a,b/2)/c^{1/2}$. The results will be reported elsewhere
879: (Dye and Warren, in prep.).
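
To illustrate how this variable behaves, take the reference values $a=0.05\arcsec$ and $b=0.08\arcsec$, so that $\max(a,b/2)=\max(0.05\arcsec,0.04\arcsec)=0.05\arcsec$. The magnifications $c=1$ and $c=4$ used below are illustrative values chosen here, not values measured from the simulation, and the constant of proportionality between the pixel size and this variable is left unspecified:
\[
\frac{\max(a,b/2)}{c^{1/2}}=\frac{0.05\arcsec}{\sqrt{1}}=0.05\arcsec\quad(c=1),\qquad
\frac{\max(a,b/2)}{c^{1/2}}=\frac{0.05\arcsec}{\sqrt{4}}=0.025\arcsec\quad(c=4).
\]
In this example a factor 4 in magnification corresponds to the factor 2 in pixel size between the regions outside and inside the caustic.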
880: 
881: \clearpage
882: 
883: \begin{figure}
884: \label{fig:hires.reg1}
885: \plotone{f3.eps}
886: \caption{\footnotesize The plot shows regularized solutions for the same problem as
887: in Figure 2, with different degrees of regularization. The middle row
888: is for $N_\lambda=1$ (corresponding to line (8) of Table 1), and the
889: bottom is for $N_\lambda=3$ (line (9) of Table 1). The source plane,
890: top left panel, and middle and bottom rows, is
891: $0.75\arcsec\times0.75\arcsec$ with $0.025\arcsec$ pixels, and is
892: centered on the optic axis. The source comprises two circular Gaussian
893: components and is shown top left. Also marked is the line of the inner
894: caustic for the isothermal ellipsoid lens. The image, convolved with
895: the psf, FWHM $0.08\arcsec$, and with noise added, is shown upper
896: middle.  The image pixel size is $0.05\arcsec$ and the image box size
897: is $5.0\arcsec\times5.0\arcsec$.  The left middle panel is the source
898: light distribution reconstructed from the image by semi--linear
899: inversion with regularization, $N_\lambda=1$. The solution is much
900: less noisy than the unregularized solution, Figure 2. The upper right
901: panel is the image of this source, convolved with the psf. The middle
902: right panel displays the $1\sigma$ uncertainty for the source
903: pixels. The center panel of the middle row
904: is the source $S/N$ image. The bottom row is the set of corresponding
905: source-plane images for the case $N_\lambda=3$. Note the larger errors
906: in the outermost band in the bottom right panel. This is a
907: consequence of the choice of a gradient regularization term, since
908: these pixels have fewer neighbours. The dotted square is
909: the region over which $S/N_{so}$ is measured. In each of Figures 1--3,
910: counts in pixels in both the image and source plane are in units of
911: surface brightness.}
912: \end{figure}
913: 
914: \clearpage
915: 
916: \subsubsection{Regularized inversion}
917: \label{sec:sims.fix.reg}
918: 
919: We now include linear regularization in the inversion. All the results
920: reported here used the gradient form, equation (\ref{eq:eqo}). The results
921: are quite similar for the different linear regularizing schemes
922: described in \S\ref{sec:theory.reg.fix}, however.
923: 
924: Lines (6) and (7) in Table \ref{table1} are the results for the
925: reference problem with different degrees of regularization.  As the
926: regularizing term increases, the source becomes smoother and
927: $\chi^2_{im}$ increases.  Line (6) is for $N_\lambda=1$. Comparing
928: line (6) to line (1) we see that the effect of regularization is to
929: suppress the noise in the reconstructed source and to increase
930: substantially the source $S/N$, col. (9). This is at the expense of a
931: poorer match to the true source light profile, as measured by cols (8)
932: and (10). For $N_\lambda=2$ the agreement with the input source is no
933: longer acceptable.
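
The values of $N_\lambda$ in col. (6) can be checked against col. (7) of Table \ref{table1}. Writing the definition of $N_\lambda$ as a formula, with $\chi^2_{im,min}(\nu)$ denoting the unregularized value for the same pixel size (a symbol introduced here) and $\sigma(\chi^2_{im}(\nu))=\sqrt{2/\nu}$,
\[
N_\lambda=\frac{\chi^2_{im}(\nu)-\chi^2_{im,min}(\nu)}{\sigma(\chi^2_{im}(\nu))},
\]
the tabulated entries give
\[
\begin{array}{ll}
\mbox{line (6):} & (0.980-0.956)/0.024=1.0,\\
\mbox{line (7):} & (1.004-0.956)/0.024=2.0,\\
\mbox{line (8):} & (0.979-0.952)/0.027=1.0,\\
\mbox{line (9):} & (1.033-0.952)/0.027=3.0,\\
\mbox{line (10):} & (1.088-0.952)/0.027\simeq5.0,
\end{array}
\]
where lines (6) and (7) are referred to line (1), and lines (8)--(10) to line (5).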
934: 
935: In lines (8) to (10) we provide solutions for source pixel size
936: $0.025\arcsec$, and psf FWHM of $0.08\arcsec$, and different degrees
937: of regularization, $N_\lambda=1, 3, 5$. These results compare directly
938: to the unregularized solution to the same problem, line (5). The
939: solutions for simulation no. 8, $N_\lambda=1$, and no. 9,
940: $N_\lambda=3$, are shown in Figure 3. The visual improvement,
941: comparing the sequence of Figure 2 (unregularized), Figure 3 middle
942: row (regularized, $N_\lambda=1$), and Figure 3 bottom row (regularized,
943: $N_\lambda=3$), is dramatic.
944: 
945: Comparing lines (8) to (10) against line (5) we see that, again,
946: regularization successfully suppresses noise, increasing the $S/N$ of
947: the detection of the source.  As $N_\lambda$ increases, in this case
948: $\chi^2_{so}$ increases only very slowly, much more slowly than in the
949: case for larger pixels. This is partly due to the fact that we chose a
950: smooth source, and the results would be different for a source with
951: more small--scale structure. Nevertheless, it indicates that the
952: standard criterion for the degree of regularization to apply,
953: $N_\lambda=1$, is somewhat arbitrary.
954: 
955: To summarise this sub--section, using a realistic problem, we have
956: validated the theory of the linear inversion step set out in
957: \S\ref{sec:theory}. This is the step that differs from the
958: maximum--entropy method of Wallington et al. (1996), and therefore is
959: the main point of the paper.
960: 
961: \subsection{Mass cycle}
962: \label{sec:sims.mass}
963: 
964: In the present sub-section we report the results of solving the complete
965: problem i.e. determining both the mass profile and the source light
966: distribution.
967: 
968: We first consider the unregularized case. Referring back to the
969: example of Figure 1, the problem is to invert the image at upper
970: middle. The free parameters are the five parameters describing the mass:
971: $x$, $y$, ellipticity, position angle, and velocity dispersion. We
972: searched through the parameter space to find the minimum--$\chi^2$ fit. At
973: the minimum the matrix ${\bsf F}$ supplies most of the terms of the
974: curvature matrix. Following the precepts of \S
975: \ref{sec:theory.unreg.mass}, the remaining terms were filled in by
976: measuring the relevant second partial derivatives of the $\chi^2$
977: surface. We found the surface to be completely smooth and parabolic
978: near the minimum. The full curvature matrix was inverted to obtain the full
979: covariance matrix for all the parameters, the mass terms as well as
980: the counts in the source pixels. We found the input mass parameters
981: were correctly recovered to within the uncertainties. We checked the
982: full covariance matrix against the results of Monte--Carlo simulations
983: and found excellent agreement. This confirms that, provided the chosen
984: source pixel size is not too small, the unregularized semi--linear
985: inversion method is a practical solution to the problem of inversion
986: of a gravitational lens image with a resolved source.
987: 
988: We also compared the terms in the {\em source covariance matrix}
989: ${\bsf C}={\bsf F}^{-1}$ against the corresponding terms in the {\em
990: full covariance matrix}. The differences are relatively
991: small. Therefore the matrix ${\bsf C}$, at the global minimum,
992: provides an approximation to the true source--pixel errors that may be
993: very useful in the exploration stage, when considering different mass
994: models and different pixelizations.
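
The relation between the two matrices can be made explicit. As a sketch, write the full curvature matrix in block form, with $p_a$ ($a=1,\ldots,5$) denoting the mass parameters. The symbols $\bsf{A}$, $\bsf{B}$, $\bsf{M}$ and $p_a$ are notation introduced for this illustration, and we assume the usual convention that the curvature matrix is one half of the matrix of second derivatives of $\chi^2$:
\[
\bsf{A}=\left(\begin{array}{cc}\bsf{F} & \bsf{B}\\ \bsf{B}^{T} & \bsf{M}\end{array}\right),\qquad
B_{ia}=\frac{1}{2}\frac{\partial^2\chi^2}{\partial s_i\,\partial p_a},\qquad
M_{ab}=\frac{1}{2}\frac{\partial^2\chi^2}{\partial p_a\,\partial p_b}.
\]
The source--source block of the full covariance matrix $\bsf{A}^{-1}$ is then
\[
\left(\bsf{F}-\bsf{B}\bsf{M}^{-1}\bsf{B}^{T}\right)^{-1}
=\bsf{F}^{-1}+\bsf{F}^{-1}\bsf{B}\left(\bsf{M}-\bsf{B}^{T}\bsf{F}^{-1}\bsf{B}\right)^{-1}\bsf{B}^{T}\bsf{F}^{-1},
\]
which differs from $\bsf{C}=\bsf{F}^{-1}$ by a term of second order in the coupling block $\bsf{B}$. For a positive--definite $\bsf{A}$ this extra term is positive semi--definite, so $\bsf{C}$ can underestimate the source--pixel variances but never overestimate them. The difference is small when the coupling between source counts and mass parameters is weak.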
995: 
996: In the regularized case we found, generally, that the procedure
997: converged more rapidly than in the unregularized case. Regularized
998: inversion can produce solutions which are not true representations of
999: the source (\S \ref{sec:sims.disc}, Table 1). Nevertheless, we found
1000: that the solution for the mass parameters is very
1001: insensitive to the degree of regularization. In the regularized case,
1002: the curvature of the merit function cannot be used to obtain the full
1003: covariance matrix, so an alternative approach such as a Monte Carlo
1004: method must be adopted (\S\ref{sec:theory.reg.mass}). The source
1005: covariance matrix, equation (\ref{eq:eqr}), at the global minimum, is
1006: therefore particularly useful here as an approximation to the true source--pixel
1007: errors.
1008: 
1009: \subsection{Regularized {\em vs} unregularized}
1010: \label{sec:sims.disc}
1011: 
1012: We now discuss the relative advantages of the
1013: unregularized and regularized approaches. This may seem surprising
1014: given the excellent results achieved with regularization (comparing
1015: Figures 2 and 3). The weakness of the unregularized inversion is that
1016: in deconvolving the psf, the noise at large wavenumbers is
1017: boosted. The regularization term in the merit function imposes
1018: smoothness on the solution. In effect, the deconvolution (division in
1019: Fourier space) is limited to the smaller wave numbers. However, this
1020: means that any real structure in the source at large wavenumbers is
1021: also suppressed. We are imposing a prejudice that the source is smooth
1022: and this might not be justified (see comments in \S1). Regularization
1023: introduces positive covariance between adjacent pixels, forcing the
1024: counts to be similar. The regularized solution, then, is not so
1025: different to the unregularized solution with larger pixels. In this
1026: respect, it is interesting to compare the unregularized solution with
1027: $0.05\arcsec$ source pixels (Figure 1), with the $N_\lambda=1$
1028: regularized solution with $0.025\arcsec$ source pixels (Figure
1029: 3). Noting that the two values of the source $S/N$ are quite similar
1030: (79.9, 64.5, respectively, Table \ref{table1}), it is a debatable point
1031: whether there is more information in the latter figure.
1032: 
1033: A further point to note is that the regularized inversion can produce
1034: solutions which are satisfactory in terms of the fit to the image but
1035: which are not true representations of the source in the sense that
1036: the $\chi^2_{so}$ is unsatisfactory. For example in line (7), Table
1037: \ref{table1}, both $\chi^2_{so}$ and $|\Delta s/\sigma|$ are
1038: unsatisfactory. Measured by the same statistics, none of the
1039: unregularized inversions in the Table is unsatisfactory. The
1040: unregularized inversion gives a noisier but unbiased solution for the
1041: source light distribution, while the regularized inversion gives a
1042: smoother but biased solution.
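
The origin of the bias can be seen in a short sketch. We assume that the regularized solution has the linear form $\hat{\bsf{s}}=[\bsf{F}+\lambda\bsf{H}]^{-1}\bsf{D}$ of equation (\ref{eq:eqm}), that the true source $\bsf{s}_{\rm true}$ is representable on the pixel grid, and that $\bsf{D}=\bsf{F}\bsf{s}_{\rm true}+\bsf{n}$ with a zero--mean noise term $\bsf{n}$. The symbols $\hat{\bsf{s}}$, $\bsf{s}_{\rm true}$, $\bsf{H}$ and $\bsf{n}$ are notation adopted for this illustration. Averaging over noise realisations,
\[
\langle\hat{\bsf{s}}\rangle=[\bsf{F}+\lambda\bsf{H}]^{-1}\bsf{F}\,\bsf{s}_{\rm true}
=\bsf{s}_{\rm true}-\lambda\,[\bsf{F}+\lambda\bsf{H}]^{-1}\bsf{H}\,\bsf{s}_{\rm true}.
\]
For $\lambda=0$ the second term vanishes and the solution is unbiased. For $\lambda>0$ the expected solution departs from the true source by an amount that grows with $\lambda$ and with the size of $\bsf{H}\,\bsf{s}_{\rm true}$, i.e. with the amount of structure in the source that the regularization term penalizes.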
1043: 
1044: Despite regularization biasing the source, in the full mass cycle we
1045: find that the best--fit mass parameters show little sensitivity to the
1046: degree of regularization. Furthermore, the regularized solution has
1047: the advantage that it converges more quickly. We discuss the practical
1048: significance of these two points in \S\ref{sec:conc}. Associated with
1049: this is the fact that regularization allows source pixels of
1050: almost any size, unlike the unregularized case, where the pixel size must be
1051: chosen carefully. This can yield further speed advantages in the
1052: initial stages of an analysis, before the solution is refined.
1053: 
1054: Overall we consider there are important advantages to using both
1055: regularized and unregularized inversion in exploring the solution to a
1056: particular problem, and the choice will depend on the question being posed
1057: and any a priori knowledge concerning the source. Perhaps equally
1058: importantly, however, it makes sense to match the source pixel size to
1059: the data information content in terms of the $S/N$ at different
1060: wavenumbers, or, in other words, to vary the pixel size depending on
1061: the magnification, as suggested in \S\ref{sec:sims.fix.unreg}.
1062: 
1063: \section{Summary and recommendations}
1064: \label{sec:conc}
1065: 
1066: We have developed a new method for the inversion of
1067: gravitationally--lensed images of extended sources for the case where
1068: the source light profile is pixelized. The method separates the linear
1069: dimensions of the problem (the counts in the source pixels) from the
1070: non--linear dimensions (the mass parameters). The method has been
1071: extended in a natural way to allow linear regularization of the
1072: inversion. The core of the routine is the procedure for inverting an
1073: image given a fixed mass profile. We have shown that this step,
1074: including deconvolution of the psf, with or without regularization, is
1075: a linear one. Since this step is usually achieved by an iterative search of the
1076: source parameter space for the merit--function minimum, the linear solution
1077: is reached much more quickly. The non--linear part of the problem has
1078: been reduced to the search for a minimum in the space of the mass
1079: parameters only. In the case of unregularized inversion, the full
1080: covariance matrix for all the (source$+$mass) parameters can be
1081: obtained very quickly. In the case of regularized inversion, a useful
1082: approximation to the covariance matrix for the source counts is
1083: obtained very simply, but Monte Carlo methods are needed to obtain the
1084: full covariance matrix.
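
The structure of the method can be summarized schematically. The notation below is introduced for this summary and is a sketch rather than a restatement of the equations of \S\ref{sec:theory}: $G$ is the merit function, $\bsf{p}$ the vector of mass parameters, $\bsf{s}$ the vector of source counts, and $\bsf{H}$ the regularization matrix, with $\lambda=0$ in the unregularized case. We assume the linear step has the closed form of equations (\ref{eq:eqe}) and (\ref{eq:eqm}):
\[
\min_{\bsf{p},\,\bsf{s}}G(\bsf{s},\bsf{p})=\min_{\bsf{p}}\,G\left(\hat{\bsf{s}}(\bsf{p}),\bsf{p}\right),\qquad
\hat{\bsf{s}}(\bsf{p})=\left[\bsf{F}(\bsf{p})+\lambda\bsf{H}\right]^{-1}\bsf{D}(\bsf{p}).
\]
For the reference problem the non--linear search therefore runs over the five mass parameters only, rather than over all $5+225=230$ parameters.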
1085: 
1086: How the semi--linear method should be applied in practice depends on
1087: the problem posed. If one is interested in the quantitative details of
1088: the source light profile, for example, whether some apparent feature
1089: is real, then we recommend the unregularized solution. This is because
1090: regularization produces source profiles which are too smooth.  Without
1091: regularization, an optimal source pixel size should be chosen. Having
1092: too large a source pixel may cause interesting detail to be
1093: lost. However, if the source pixel size is too small, the inverted
1094: image may have low $S/N$ because of amplification of noise in the
1095: deconvolution step.  If, on the other hand, one is interested only in
1096: the mass parameters, a regularized solution would be the appropriate
1097: choice: The mass parameters are rather insensitive to the degree of
1098: regularization and one benefits from an increase in inversion speed.
1099: 
1100: Another consideration is that pixelizing the source uses a large
1101: number of parameters. As a rule, one is interested in finding the
1102: model with the smallest number of parameters that provides a
1103: satisfactory fit to an image. Therefore, in many cases, one might
1104: simply use the semi--linear method of inversion to provide an image of
1105: the source to guide the choice of parameterization. Here, again, the
1106: regularized solution might be the preferred option.
1107: 
1108: In general, because it is so much easier to implement (\S
1109: \ref{sec:theory.reg.fix}), we recommend using zeroth--order
1110: regularization. Nevertheless, other considerations may override
1111: simplicity. The zeroth--order regularization term, in common with the
1112: maximum--entropy regularization term, is a local measure, independent
1113: of the counts in adjacent pixels. This can be an advantage or a
1114: disadvantage, depending on the actual light profile in the source.
1115: 
1116: In the simulations presented here, we have used square source pixels
1117: which form a regular grid. However, since the resolution across the
1118: image plane is fixed while the magnification varies, the resolution
1119: across the source varies. Therefore, to maximize the information
1120: content in the reconstruction of the source it is necessary to use a
1121: variable source pixel size. We will present an analysis of
1122: semi--linear inversion with variable source pixel size in a future
1123: paper (Dye and Warren, in prep.).
1124: 
1125: \acknowledgments
1126: We have benefited from discussions with Paul Hewett, Geraint Lewis,
1127: Leon Lucy, and Randall Wayth.
1128: 
1129: \begin{thebibliography}{}
1130: 
1131: \bibitem[Kayser \& Schramm(1988)]{ks88} Kayser, R. \& Schramm, T.,
1132: 1988, A\&A, 191, 39 
1133: 
1134: \bibitem[Kochanek et al.(1989)]{k89} Kochanek, C.S., Blandford, R.D.,
1135: Lawrence, C.R., \& Narayan, R., 1989, \mnras, 238, 43
1136: 
1137: \bibitem[Press et al.(2001)]{nr01} Press, W.H., Teukolsky, S.A.,
1138: Vetterling, W.T., \& Flannery, B.P., 2001, `Numerical Recipes in
1139: Fortran 77, 2nd Edition', Cambridge University Press
1140: 
1141: \bibitem[Tyson et al.(1998)]{ty98} Tyson, J.A., Kochanski, G.P., \&
1142: dell'Antonio, I.P., 1998, \apj, 498, 107
1143: 
1144: \bibitem[Wallington et al.(1995)]{wkk95} Wallington, S., Kochanek,
1145: C.S., \& Koo, D., 1995, \apj, 441, 58
1146: 
1147: \bibitem[Wallington et al.(1996)]{wkn96} Wallington, S., Kochanek, C.S.,
1148: \& Narayan, R., 1996, \apj, 465, 64
1149: 
1150: \bibitem[Wayth et al.(2003)]{wa03} Wayth, R.S., Warren, S.J., Lewis, G.F.,
1151: \& Hewett, P.C., 2003, \mnras, in preparation
1152: 
1153: \end{thebibliography}
1154: 
1155: \end{document}
1156: 
1157: 
1158: