%% 0712.4140/ms.tex
1: %%
2: %% Beginning of file 'sample.tex'
3: %%
4: %% Modified 2005 December 5
5: %%
6: %% This is a sample manuscript marked up using the
7: %% AASTeX v5.x LaTeX 2e macros.
8: 
9: \documentclass[12pt,preprint]{aastex}
10: %\documentclass[preprint2]{aastex}
11: 
12: %% manuscript produces a one-column, double-spaced document:
13: 
14: %\documentclass[manuscript]{aastex}
15: %\usepackage{amsmath}
16: 
17: \newcommand{\vdag}{(v)^\dagger}
18: 
19: \begin{document}
20: 
21: \title{Bayesian Image Reconstruction \\
22:  Based on Voronoi Diagrams}
23: 
24: \author{G. F. Cabrera\altaffilmark{1, 2}, S. Casassus\altaffilmark{1}}
25: %\affil{Departamento de Astronom\'ia, Universidad de Chile, Santiago, Casilla 36-D, Chile}
26: \and
27: \author{N. Hitschfeld\altaffilmark{2}}
28: \email{guille@das.uchile.cl}
29: 
30: \altaffiltext{1}{Departamento de Astronom\'ia, Universidad de Chile,
31:   Santiago, Casilla 36-D, Chile} 
32: \altaffiltext{2}{Departamento de Ciencias de la Computaci\'on, 
33:   Universidad de Chile, Santiago}
34: 
35: %\affil{Departamento de Ciencias de la Computaci\'on, Universidad de Chile, Santiago}
36: 
37: \begin{abstract}
38: We present a Bayesian Voronoi image reconstruction technique (VIR) for
39: interferometric data. Bayesian analysis applied to the inverse problem
40: allows us to derive the \emph{a-posteriori} probability of a novel
41: parameterization of interferometric images. We use a variable Voronoi
42: diagram as our model in place of the usual fixed pixel grid.
43: % To describe the probability we calculate the likelihood
44: % and \emph{a-priori} probabilities, which need the intensity
45: % distributions to be discretized. We use a $\sigma_\mathrm{q}$ quantum
46: % for this purpose. 
47: A quantization of the intensity field allows us to calculate the
48: likelihood function and \emph{a-priori} probabilities.  The Voronoi
49: image is optimized including the number of polygons as free
50: parameters.
51: % We show reconstructions made over visibilities simulated using the
52: % Cosmic Background Imager (CBI) and compare them with a MEM
53: % reconstruction.
54: We apply our algorithm to deconvolve simulated interferometric data.
55: Residuals, restored images and $\chi^2$ values are used to compare our
56: reconstructions with fixed grid models. VIR has the advantage of
57: modeling the image with few parameters, obtaining a better image from
58: a Bayesian point of view.
59: %as required by Bayesian theory.
60: \end{abstract}
61: 
62: \keywords{methods: data analysis --- methods: numerical --- methods:
63: statistical --- techniques: image processing --- techniques:
64: interferometric}
65: 
66: \section{Introduction}
67: %\subsection{Interferometric Reconstruction Basics}
68: % \subsection{Introduction to Image Reconstruction for Interferometric Data}
69: 
70: % Interferometers sample the Fourier transform of the sky intensity
71: % field.
72: 
73: % A general problem in astronomy is to obtain the sky image from the
74: % data. The data consists in the image convolved with the instrument
75: % response. 
76: 
77: % Astronomical interferometric data are the sum of instrumental noise
78: % with the convolution of the sky image and the instrumental response.
79: Astronomical interferometric data result from the addition of
80: instrumental noise to the convolution of the sky image and the
81: instrumental response.  Because of incomplete sampling in the $(u, v)$
82: plane, obtaining sky images from interferometric data is an instance
83: of the inverse problem, and involves reconstruction algorithms.
84: 
85: %\subsection{Existing Methods} 
86: 
87: % TESIS
88: %The CLEAN algorithm was created by \cite{CLEAN}. It starts with a
89: %direct Fourier transform of the visibilities by which it obtains the
90: %dirty map (DM). Then, on each iteration it subtract a dirty beam (DB)
91: %image centered in the maximum intensity point of the image, normalized
92: %by a $\gamma I_0$ value, where $I_0$ is the maximum intensity value
93: %and $\gamma$ is called the \emph{loop gain}. This is repeated until
94: %the noise level on the map is reached. The final reconstructed map
95: %consists of the sum of the resulting map with all the components
96: %removed on previously iterations but in the form of clean beams. 
97: 
The CLEAN method consists of modeling the side-lobe disturbances and
subtracting them iteratively from the dirty map \citep[][]{CLEAN}.
CLEAN works well for simple sources observed with low noise. However,
if the source has many complex features, or if the data are too noisy,
CLEAN performs only a few iterations and returns a noisy image
\citep[][]{CLEAN}. Another shortcoming is that CLEAN involves some
ad-hoc parameters (the loop gain, stopping criteria, clean beam) that
bias the final reconstruction, in the sense that CLEAN can give many
different reconstructions for the same dataset.
107: % image user-dependent
108: % (sesgada?).
109: 
110: The maximum entropy method (MEM) finds the image that simultaneously
111: best fits the data, within the noise level, and maximizes the entropy
112: $S$. This is done by minimizing
113: \begin{equation}
114:   L_\mathrm{MEM} = \chi^2 - \lambda S,\label{eq:LMEM}
115: \end{equation}
116: where, for the case of interferometric data, $\chi^2$ can be
117: calculated as
118: \begin{equation}
119:   \chi^2 = \sum_{k=1}^{N_\mathrm{Vis}}\frac{||V_k^\mathrm{obs} -
120:     V_k^\mathrm{mod}||^2}{\sigma_k^2},
121: \end{equation}
122: where the sum runs over all the $N_\mathrm{Vis}$ visibilities, the
123: symbol $||z||$ stands for the modulus of the complex number $z$ and
124: $\sigma_k$ is the root mean square (rms) noise of the corresponding
125: visibility. $\lambda$ is a control parameter and the entropy $S$
126: varies for different implementations \citep[e.g.][]{Nar&Nit}. The
127: entropy is used as a regularizing term in a degenerate inverse
128: problem, when there are more free parameters than data. Different
129: formulations for $S$ appear in the literature. Some examples are
130: $\sum_i\ln(I_i)$, $\sum_iI_i\ln(I_i)$, $\sum_i\ln(p_i)$,
131: $\sum_ip_i\ln(p_i)$, where $I_i$ is the specific intensity value at
132: pixel $i$ and $p_i = I_i/\sum_iI_i$ \citep[see][and references
133: therein]{P&P}.
134: 
135: \cite{Corn&Ev} used MEM in the AIPS VM task. Their method makes some
136: approximations that diagonalize the Hessian matrix required to
137: optimize their merit function. They used an entropy of the form $S =
138: -\sum_iI_i\log{(I_i/ m_i)}$, where the sum extends over all the pixels
139: $i$, $\{I_i\}_{i=1}^n$ is the model image and $\{m_i\}_{i=1}^n$ is a
140: prior image. However, the neglect of the side-lobe contribution to the
141: Hessian may lead the optimization to local minima that still bear
instrumental artifacts. \cite{Casassus} implemented a MEM algorithm
143: based on the conjugate gradient method, without the use of the
144: Cornwell and Evans approximation.
145: %\cite{Casassus} have used a similar approach but without the use of
146: %any approximation. 
They used an entropy of the form $S = -\sum_iI_i\log{(I_i/ M)}$, where
$\{I_i\}_{i=1}^n$ is the model image and $M$ is an intensity value
much smaller than the noise, i.e., they start from a blank image
prior.
151: % Casassus et al. created their own MEM
152: % because the neglect of the side-lobes (visible in the dirty map
153: % surrounding the primary beam half maxima contour) contribution to the
154: % Hessian of the merit function in Cornwell and Evans MEM was not well
155: % suited to their ends.
156: % They did not use Cornwell and Evans MEM because their neglect of the
157: % side-lobes (visible in the dirty map surrounding the primary beam half
158: % maxima contour) contribution to the Hessian of the merit function was
159: % not well suited to their ends.
160: 
161: % The idea of introducing the entropy is to smooth the final image.
162: 
163: Bayesian analysis is a powerful tool for image reconstruction
164: techniques. In this application, our goal is to find the most probable
165: image by maximizing its \emph{a-posteriori} probability. For Bayesian
166: methods, the \emph{a-priori} and likelihood distributions are
167: needed. To derive the \emph{a-priori} probability the definition of an
168: intensity quantum is needed. This quantum represents the minimum
169: measurable intensity unit. The intensity in each pixel can be
170: interpreted as a number of quanta $I_i = \sigma_\mathrm{q}N_i$, where
171: $I_i$ is the intensity in pixel $i$, $\sigma_\mathrm{q}$ is the
172: quantum size and $N_i$ the number of quanta in pixel $i$.
173: 
174: \cite{P&P} used Bayesian analysis in the Pixon algorithm. They use a
175: variable model and maximize $P(I,M|D)$, that is, the probability of
176: the image $I$ and model $M$ given the data $D$. In their approach the
177: model used to parameterize the image is a set of Gaussians which are
178: used to average a pseudo-image. The pseudo-image starts as a maximum
179: residual likelihood reconstruction and a local Gaussian pixon is
180: assigned to each of its pixels. The number of pixons, and hence the
181: number of free parameters, is reduced in each iteration. 
182: 
183: \cite{W&S} have used Bayesian analysis for interferometric data, but
184: using a fixed pixel grid to parameterize the model image. They use
185: Gibbs sampling to determine the posterior density distribution.
186: 
187: The most typical model used in astronomy to represent the sky
188: brightness distribution consists of a pixel grid.  A big disadvantage
189: of this grid is that the number of pixels remains fixed as well as
190: their size. Often, uniform pixel grids involve more free parameters
191: than really needed to fit the data.
192: 
193: The purpose of this paper is to explore Bayesian reconstruction with
194: image models based on Voronoi tessellations in place of the usual
195: pixelated image. We call this new deconvolution method ``Voronoi image
196: reconstruction'' (VIR, hereafter). The advantage of using Voronoi
197: models is that it is possible to use a smaller number of free
198: parameters, as required by Bayesian theory. Our purpose is not optimal
199: CPU efficiency; we search for the optimal image and model from a
200: Bayesian point of view.
201: 
202: We used the Cosmic Background Imager \citep[CBI,][]{pad02} to
203: illustrate our method. The CBI is a planar interferometer array with
204: 13 antennas, each 0.9~m in diameter, mounted on a 6~m tracking
205: platform. An example of CBI baselines is shown in Figure
206: \ref{fig:baselines}. The radius of the hole at the center of the $(u,
207: v)$ plane is the reciprocal of the minimum distance between two
208: antennas, measured in wavelengths. The side-lobes of the CBI are
209: caused mainly by this central hole in the $(u, v)$ baselines.
210: % This central hole in the $(u, v)$ baselines causes the side-lobes of
211: % the CBI.
212: 
213: \begin{figure}[h]
214: \epsscale{.50}
215: \plotone{f1.eps}
216: \caption{Coverage in the $(u, v)$ plane of the CBI in the
217:   configuration used for our simulations.\label{fig:baselines}}
218: \end{figure}
219: 
220: 
221: We briefly summarize the elements of Bayesian theory that determine
222: the probability distributions concerning our problem
223: (Section~\ref{sec:Bayes}).  The new model based on Voronoi
224: tessellations is described (Section~\ref{sec:Voronoi}), as well as
225: optimization issues involved in our problem
226: (Section~\ref{sec:Optimization}).  We discuss implementation details
227: such as the optimal quantum size and number of Voronoi polygons
228: (Section~\ref{sec:Implementation}), compare reconstructions made with
229: MEM and VIR (Section~\ref{sec:Results}) and finally summarize our
230: results (Section~\ref{sec:Conclusions}).
231: 
232: % The Cosmic Background Imager (CBI, \cite{pad02}) data is a particular
233: % case where image reconstruction is difficult. The CBI is a planar
234: % interferometer array with 13 antennas, each 0.9~m in diameter, mounted
235: % on a 6~m tracking platform, which rotates in parallactic angle to
236: % provide uniform $uv$-coverage. The CBI receivers operate in 10
237: % frequency channels, with 1~GHz bandwidth each, giving a total
238: % bandwidth of 26--36~GHz. It is located in Llano de Chajnantor,
239: % Atacama, Chile. An example of CBI baselines is shown in Figure
240: % \ref{fig:baselines}. The hole in the center of the $(u, v)$ plane is
241: % caused because of the minimum distance that two antennas can be
242: % arranged. This sparse distribution of the $(u, v)$ baselines causes
243: % the CBI to have strong side-lobes.
244: 
245: \section{Bayesian Theory} \label{sec:Bayes}
246: 
247: An image model is required to parameterize the sky brightness
248: distribution. The most typical model used in astronomy is a
249: rectangular grid of uniform pixels. That configuration of pixels is
250: the model $M$, and the distribution of brightness in the model is
251: called an image $I$.  We search for the image that represents as
252: accurately as possible the visibility data $D$.  The Bayesian image
253: reconstruction approach, using a fixed model, tries to find the image
254: that maximizes the probability $P(I | D, M)$, i.e. find the most
255: probable image given the data and the model.
256: 
Using Bayes' theorem, we obtain
258: \begin{equation}
259:   P(I | D, M) = \frac {P (D| I, M)P(I | M)}{P (D | M)}.
260: \end{equation}
261: Since the data is fixed, $P(D | M)$ is a constant in the problem when
262: the model is not considered as a variable. Thus, the fixed image model
263: optimization problem reduces to
264: \begin{equation}
265:   \max_I P(I | D, M) = \max_I P (D| I, M)P(I | M). \label{eq:P(I|D|M)}
266: \end{equation}
The first term, $P (D|I, M)$, is called the likelihood, and measures
how well the image reproduces the data. The second term, $P (I | M)$,
is called the image prior, and gives the \emph{a-priori} probability
of the image given the model, i.e., how probable the image is given
only the model.
272: 
273: In the case of having a variable model, what we would like to find is
274: the image and model that maximize $P(I, M| D)$, i.e. find the most
275: probable image and model given the data. In this case we find
276:  \begin{eqnarray}
277:    P(I, M| D) & = & P(I|D,M)P(M|D)\nonumber\\
278:    & = & \frac {P (D| I, M)P(I| M)P(M|D)}{P (D | M)}\nonumber\\
279:    & = & \frac {P (D| I, M)P(I| M)P(M)}{P (D)}.
280:  \end{eqnarray}
281: Since the data is fixed, $P(D)$ is constant in our problem. As we
282: cannot privilege one model over another in the absence of image and
283: data, $P(M)$ is the same for all models, so it is not important for
284: our analysis. This way, our optimization problem reduces to
285: % the same one we would have with a fixed model in Eq. \ref{eq:P(I|D|M)}
286:  \begin{equation}
287:   \max_{I, M} P(I, M| D) = \max_{I, M} P (D| I, M)P(I | M).
288:  \end{equation}
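
The last step in the derivation of $P(I, M| D)$ above uses Bayes'
theorem applied to the model alone,
\begin{displaymath}
  P(M|D) = \frac{P(D|M)P(M)}{P(D)},
\end{displaymath}
so that the factor $P(D|M)$ cancels between numerator and denominator
and only $P(M)$ and $P(D)$ remain.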
289: 
290: \subsection {Probability Distributions}
291: 
292: Our data is a set of $N_\mathrm{Vis}$ observed visibilities
293: $\{V_1^\mathrm{obs}, V_2^\mathrm{obs}, \cdots,
294: V_{N_\mathrm{Vis}}^\mathrm{obs}\}$.  If we have a certain model $M$
295: and image $I$, we obtain model visibilities $\{V_k^\mathrm{mod}\}$
296: by simulating the interferometric observations over our image:
297: \begin{equation} 
298: V^\mathrm{mod}_k = V^\mathrm{mod}(u_k,v_k) = \int_{-\infty}^{+\infty} A(x,y)
299: I(x,y)\exp\left[2\pi i (u_kx+v_ky)\right]
300: \frac{dx\,dy}{\sqrt{1-x^2-y^2}} ~, \label{eq:vmodel}
301: \end{equation}
302: where $\{u_k, v_k\}$ are the coordinates of baseline $k$ in the $(u,
303: v)$ plane and $A$ is the primary beam. We thus have a set of
$N_\mathrm{Vis}$ model visibilities.  Assuming that the visibilities
are mutually independent and that the noise is Gaussian, the
likelihood is
306: \begin{eqnarray}
307:   P (D|I, M) & = &
308:   P(\{V_k^\mathrm{obs}\}_{k=1}^{N_\mathrm{Vis}}|\{V_k^\mathrm{mod}(I, M)\}_{k=1}^{N_\mathrm{Vis}})
309:   = \prod_{k=1}^{N_\mathrm{Vis}} P (V_k^\mathrm{obs}|V_k^\mathrm{mod}) \nonumber\\
310:   & = &
311:   \prod_{k=1}^{N_\mathrm{Vis}}\frac{1}{2\pi\sigma_k^2}e^{-||V_k^\mathrm{obs}
312:   - V_k^\mathrm{mod}||^2/2\sigma_k^2}.
313: \end{eqnarray}
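Taking the negative logarithm of this likelihood makes its relation to
$\chi^2$ explicit:
\begin{displaymath}
  -\ln{P (D|I, M)} =
  \frac{1}{2}\sum_{k=1}^{N_\mathrm{Vis}}\frac{||V_k^\mathrm{obs} -
  V_k^\mathrm{mod}||^2}{\sigma_k^2} +
  \sum_{k=1}^{N_\mathrm{Vis}}\ln{(2\pi\sigma_k^2)}
  = \frac{1}{2}\chi^2 + \mathrm{const}.
\end{displaymath}
The second sum depends only on the noise levels $\sigma_k$, not on the
image or the model, so it can be dropped in the optimization:
maximizing the likelihood is equivalent to minimizing $\chi^2$.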
314: % where the symbol $||z||$ stands for the modulus of the complex number
315: % $z$ and $\sigma_k$ is the rms noise of the corresponding visibility.
316: 
317: To obtain the image prior, $P (I | M)$, we calculate the statistical
318: weight of a given distribution of counts \citep[as
319: in][]{P&P,W&S}. Consider a model consisting of $n$ cells. In the case
of a traditional image, each pixel would be a cell. A total of $N$
intensity quanta, each of size $\sigma_\mathrm{q}$, fall into these
cells.
323: % These quanta are used to determine the number of distinguishable
324: % events in each cell.
325: In the case of a pixelated image, the intensity in each pixel $i$
326: would be $I_i = \sigma_\mathrm{q} N_i$, where $I_i$ is the intensity
in cell $i$. Each quantum could fall into any of the $n$ cells, so the
total number of possible configurations for the $N$ quanta is
$n^N$. The probability of the image given the model is the probability
of a certain state $\{N_1, N_2, \cdots, N_n\}$ that represents that
image, where $N_i$ is the number of quanta in cell $i$.  Consider a
given image configuration defined by a particular distribution
$\{N_i\}$. The image is unchanged under the $N!$ possible permutations
of the quanta, provided each $N_i$ stays constant. Of these, the
$\prod_i N_i!$ permutations that only swap quanta within a cell leave
the assignment of quanta to cells unchanged, so $N!/\prod_i N_i!$
distinct assignments produce the same image.
337: % If we place all the quanta in the desired configuration, it is
338: % possible to swap them and still have the same configuration
339: % $\{N_i\}$. The number of possible swaps are $N!$. But swapping between
340: % elements of the same cell would leave the same state.
The model $M$ consists of the Voronoi diagram and the total number of
quanta (i.e., $n$, the positions of the generators, and $N$); thus the
\emph{a-priori} probability is
344: % reduces to $n$, thus the \emph{a-priori} probability is
345: \begin{equation}
346: P (I | M) = P (\{N_i\}|n, N) = \frac{N!}{n^N\prod_i N_i!}.
347: \label{eq:apriori}
348: \end{equation}
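As a simple illustration of Eq. \ref{eq:apriori} (the numbers are
chosen here only as an example), consider $n = 2$ cells and $N = 3$
quanta. The state $\{N_1, N_2\} = \{2, 1\}$ has
\begin{displaymath}
P (\{2, 1\}|n = 2, N = 3) = \frac{3!}{2^3\, 2!\, 1!} = \frac{3}{8},
\end{displaymath}
and the four possible states $\{3, 0\}$, $\{2, 1\}$, $\{1, 2\}$ and
$\{0, 3\}$ have probabilities $1/8$, $3/8$, $3/8$ and $1/8$, which add
up to one as required.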
349: % \cite{P&P} and \cite{W&S} have used this same Bayesian approach for
350: % defining their regularizing term.
351: 
352: As explained above, $\sigma_\mathrm{q}$ is an intensity quantum. It is
353: also possible to describe the number of quanta per cell using a flux
354: quantum $\sigma_i^\mathrm{F}$, where $i$ is the index of the cell to
355: which we associate the quantum. This flux quantum can be expressed in
356: terms of the intensity quantum as $\sigma_i^\mathrm{F} =
357: \sigma_\mathrm{q}A_i$, where $A_i$ is the area of cell $i$. In this
358: case, the number of quanta per cell is $N_i =
359: F_i/\sigma_i^\mathrm{F}$, where $F_i = I_i A_i$ is the flux of cell
360: $i$. This leads to $N_i = I_i/\sigma_\mathrm{q}$, which is the same
361: expression for $N_i$ obtained using the intensity quantum
362: $\sigma_\mathrm{q}$. Using these cell-dependent flux quanta, the
363: probability of a quantum falling into each cell will be $\frac{1}{n}$
364: for every cell, leaving the \emph{a-priori} probability the same as
365: Eq. \ref{eq:apriori}.
366: 
367: \subsection {MEM and Natural Entropy}
368: 
369: % As said before, MEM tries to obtain an image that adjusts to the data
370: % within the noise level while maximizes the entropy. This is done by
371: % minimizing $L = \chi^2 - \lambda S$.
372: 
373: In Bayesian theory, for a fixed model, the image $I$ can be found by
374: optimizing the \emph{a-posteriori} probability:
375: \begin{eqnarray}
376:   \max_I P (I | D, M) & = & \min_I(-\ln{P (D| I, M)P(I | M)})\nonumber\\
377:   & = & \min_I\sum_{k=1}^{N_\mathrm{Vis}}\frac{||V_k^\mathrm{obs} -
378:   V_k^\mathrm{mod}||^2}{2\sigma_k^2} - \ln\bigg(\frac{N!}{n^N\prod_i
379:   N_i!}\bigg)\nonumber\\
380:   & = & \min_I\frac{1}{2}\chi^2 - S,\label{eq:funcL}
381: \end{eqnarray}
382: where we have defined the natural entropy $S =
383: \ln\bigg(\frac{N!}{n^N\prod_i N_i!}\bigg)$. \cite{W&S} call the term
384: $\ln{(N!/\prod_i N_i!)}$ the multiplicity prior. In the limit of large
385: $N_i$,
386: \begin{eqnarray}
387: S 
388: % TESIS
389: % & = & \ln{N!} - N\ln{n} - \sum_i\ln{N_i!}\nonumber\\
390: % & \simeq & N\ln{N} - N -N\ln{n} - \sum_i(N_i\ln{N_i} - N_i)\nonumber\\
391: % & = & N\ln{N} - N\ln{n} - \sum_i N_i\ln{N_i}\nonumber\\
392: & \simeq & N\ln\frac{N}{n} - \sum_i N_i\ln{N_i},\label{eq:SStirling}
393: \end{eqnarray}
394: and it can be seen that the Bayesian method is very similar to MEM in
395: the sense that we are adjusting the image to the data while maximizing
396: an entropy of the form of Eq. \ref{eq:SStirling}.
397: % \cite{W&S} made a similar analysis for the entropic term, where they
398: % call the term $\ln{(N!/\prod_i N_i!)}$ the multiplicity prior.
399: % We used the exact version of the natural entropy, but did this entropy
400: % analysis to compare the MEM algorithm and the Bayesian method.
401: VIR uses the natural entropy as a regularizing term.
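
For reference, Eq. \ref{eq:SStirling} follows from Stirling's
approximation, $\ln{k!} \simeq k\ln{k} - k$, together with $\sum_i N_i
= N$:
\begin{eqnarray*}
S & = & \ln{N!} - N\ln{n} - \sum_i\ln{N_i!}\\
& \simeq & N\ln{N} - N - N\ln{n} - \sum_i(N_i\ln{N_i} - N_i)\\
& = & N\ln\frac{N}{n} - \sum_i N_i\ln{N_i}.
\end{eqnarray*}
Writing $p_i = N_i/N$, this is equivalently $S \simeq -N\sum_i
p_i\ln{(n p_i)} = -N\ln{n} - N\sum_i p_i\ln{p_i}$, so for fixed $n$
and $N$ it differs from the $\sum_ip_i\ln(p_i)$ form quoted in the
Introduction only by the factor $-N$ and an additive constant.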
402: 
403: % Write about MEM, the election of the entropy function and the Bayesian entropy.
404: 
405: \section {A New Image Model based on Voronoi Diagrams} \label{sec:Voronoi}
406: 
A Voronoi diagram is a division of the Euclidean plane into $n$
regions $\mathcal{V}_i$ defined by $n$ points $\vec{x_i}$ (called
sites or generators) such that every point $\vec{x}$ in the plane
410: belongs to $\mathcal{V}_i$ if and only if $||\vec{x} - \vec{x_i}|| <
411: ||\vec{x} - \vec{x_j}||\ \forall\ j\neq i$. The result of the above
412: definition is a set of polygons defined by the generators. Figure
413: \ref{fig:Voronoi} shows an example of a Voronoi diagram. For further
414: details on Voronoi diagrams see \cite{Voronoi}.
415: 
416: \begin{figure}[h]
417: \epsscale{.50}
418: \plotone{f2.eps}
419: \caption{Example of Voronoi diagram.\label{fig:Voronoi}}
420: \end{figure}
421: 
We propose a 2D Voronoi diagram as our image model, in place of the
usual uniform pixel grid. We associate an intensity $I_i$ to
424: each of these polygons. The advantage of using a Voronoi diagram is
425: that we can use just as many cells (i.e. free parameters) as the data
426: requires. Our optimization parameters will be the position of each
427: generator $\vec{x_i} = (x_i, y_i)$, and the intensity at each cell,
428: $I_i$.
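
In symbols, and introducing the indicator notation
$\mathbf{1}_{\mathcal{V}_i}$ only for this illustration, the model
image is piecewise constant,
\begin{displaymath}
  I(x, y) = \sum_{i=1}^{n} I_i\,\mathbf{1}_{\mathcal{V}_i}(x, y),
\end{displaymath}
where $\mathbf{1}_{\mathcal{V}_i}(x, y)$ equals one inside
$\mathcal{V}_i$ and zero elsewhere. Substituting this form into
Eq. \ref{eq:vmodel} gives
\begin{displaymath}
  V^\mathrm{mod}_k = \sum_{i=1}^{n} I_i \int_{\mathcal{V}_i} A(x,y)
  \exp\left[2\pi i (u_kx+v_ky)\right]
  \frac{dx\,dy}{\sqrt{1-x^2-y^2}},
\end{displaymath}
so the model visibilities are linear in the intensities $I_i$, while
the generator positions enter only through the cell boundaries.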
429: 
430: With our new model $M$ consisting of $n$ generators ($3\times n$
431: parameters, $x_i, y_i$ and $I_i$ for each generator), we can vary the
432: number of free parameters as required by the optimization problem. We
433: can see in equation (\ref{eq:funcL}) that the
434: % QUANTUM importance of the 
entropy $S$ increases as the number of cells $n$ decreases.
436: 
437: \section{Optimization} \label{sec:Optimization}
438: 
439: The optimization problem can be seen as a maximization of the
440: \emph{a-posteriori} probability $\max_{I, M} P (I, M | D)$, or
441: equivalently as a minimization of the more convenient merit function
442: $L = \frac{1}{2}\chi^2 - S$.
443: %There are different approaches for optimizing functions. 
The conjugate gradient method (CG) is often used for minimization
problems where derivatives can be easily calculated. Although it
usually converges quickly, CG may converge to a local minimum,
depending on the initial conditions.
448: % We also tried the use of
449: % Monte Carlo methods, such as Metropolis-Hasting, Gibbs sampling and
450: % some efficient Monte Carlo methods, and Genetic algorithms with no
451: % satisfactory results.  
452: The use of other optimization algorithms is postponed to future work.
453: 
454: % TESIS
455: % Monte Carlo methods (MM) and genetic algorithms (GAs) are maximization
456: % algorithms that try to find the global maximum of the merit
457: % function. MM are used for generating samples that follow a certain
458: % density distribution, while GAs are heuristics based on evolution of
459: % species that find the vector of parameters that maximize certain
460: % function. GA doesn't have a proof of why they work, but they often do.
461: 
462: %Write about the different approaches for the optimization function
463: %(maximization of the probability or minimization of the object
464: %function).
465: 
466: % TESIS
467: % \subsection{Conjugate Gradient}
468: 
The CG method searches parameter space using the gradient of the
470: function to be minimized. The derivatives of this function are
471: \begin{eqnarray}
472:   \frac{\partial L}{\partial x} & = & \frac{1}{2}\frac{\partial \chi^2
473:   }{\partial x} - \frac{\partial S}{\partial x},\\ 
474:   \frac{\partial \chi^2}{\partial x} & = & 2\sum_{k=1}^{N_\mathrm{Vis}}
475:   \frac{1}{\sigma_k^2} \mathrm{Re}\left((V_k^{\mathrm{mod}}
476:   - V_k^{\mathrm{obs}})^* \frac{\partial
477:   V_k^{\mathrm{mod}}}{\partial x}\right) \label{dLdx},
478: \end{eqnarray}
479: where $x$ is any of the optimization parameters ($x_i$, $y_i$ or
480: $I_i$). The derivatives of the visibilities with respect to the
position $\vec{x}_i = (x_i, y_i)$ of the $i$th generator are
482: \begin{eqnarray}
  \frac{\partial V_k^{\mathrm{mod}}}{\partial x_i} & = & 
  \sum_{j\in J_i} \bigg[(I_i - I_j)
    \sum_{l|\mathrm{pixel~}l\in a_{ij}}A_l\Delta t_l
    (M_xt_l + b_x)e^{(t_lc_2 + s_0c_1)}\bigg],\\ 
  \frac{\partial V_k^{\mathrm{mod}}}{\partial y_i} & = & 
  \sum_{j\in J_i} \bigg[(I_i - I_j)
    \sum_{l|\mathrm{pixel~} l\in a_{ij}}A_l\Delta t_l
    (M_yt_l + b_y)e^{(t_lc_2 + s_0c_1)}\bigg],
491: \end{eqnarray}
where $I_i$ is the intensity in cell $i$, $J_i$ is the set of
indices of the polygons adjacent to $\mathcal{V}_i$, $a_{ij}$ is the
edge which divides polygons $\mathcal{V}_i$ and $\mathcal{V}_j$, $l$
runs over the pixels which intersect $a_{ij}$, and $A$ is the CBI
primary beam. For further details see Sec.~\ref{ap:derivatives}.
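
The expression for $\partial \chi^2/\partial x$ in equation
(\ref{dLdx}) can be checked directly. For a complex quantity $z$ that
depends on a real parameter $x$, $\partial ||z||^2/\partial x =
\partial (z^*z)/\partial x = 2\,\mathrm{Re}\left(z^*\,\partial
z/\partial x\right)$. Taking $z = V_k^{\mathrm{mod}} -
V_k^{\mathrm{obs}}$, and noting that the observed visibilities do not
depend on $x$,
\begin{displaymath}
  \frac{\partial}{\partial x}\,\frac{||V_k^{\mathrm{obs}} -
  V_k^{\mathrm{mod}}||^2}{\sigma_k^2} =
  \frac{2}{\sigma_k^2}\,\mathrm{Re}\left((V_k^{\mathrm{mod}} -
  V_k^{\mathrm{obs}})^*\frac{\partial V_k^{\mathrm{mod}}}{\partial
  x}\right),
\end{displaymath}
and summing over $k$ recovers the expression above.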
497: 
498: The derivative of the visibilities with respect to the intensity of
499: each cell $I_i$ is
500: \begin{equation}
501:   \frac{\partial V_k^{\mathrm{mod}}}{\partial I_i} =
502:   \frac{\sin{(\pi u_k\Delta x)}\sin{(\pi v_k\Delta y)}}{\pi^2u_kv_k}
  \sum_{\textrm{\scriptsize{pixels }}l\in \mathcal{V}_i}A_l
504:   e^{2\pi i(u_kx_l+v_ky_l)} \label{eq:dVdI},
505: \end{equation}
506: where $\vec{k}_k = (u_k, v_k)$ is the baseline corresponding to the
507: pair of antennas $k$, $\Delta x$ and $\Delta y$ are the pixel width
508: and height, and the sum extends over all the pixels inside
509: $\mathcal{V}_i$.
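
The prefactor in Eq. \ref{eq:dVdI} can be understood as the Fourier
transform of a single rectangular pixel. As a sketch, assume that the
primary beam is approximately constant over a pixel ($A \simeq A_l$)
and that the field is small enough that $\sqrt{1-x^2-y^2} \simeq
1$. The contribution of pixel $l$, centered on $(x_l, y_l)$, then
factorizes into two one-dimensional integrals of the form
\begin{displaymath}
  \int_{x_l - \Delta x/2}^{x_l + \Delta x/2} e^{2\pi i u_k x}\,dx =
  e^{2\pi i u_k x_l}\,\frac{\sin{(\pi u_k\Delta x)}}{\pi u_k},
\end{displaymath}
and similarly for $y$. The product of the two integrals, weighted by
$A_l$ and summed over the pixels inside $\mathcal{V}_i$, gives
Eq. \ref{eq:dVdI}.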
510: 
The entropy depends only on the intensities $I_i$, so $\frac{\partial
S}{\partial x_i} = \frac{\partial S}{\partial y_i} = 0$, and (see
Sec.~\ref{ap:derivativesdS})
514: \begin{equation} 
  \frac{\partial S}{\partial I_i} = \frac{1}{\sigma_\mathrm{q}}
  \left(\sum_{k=1}^{N}\frac{1}{k} - \ln{n} - \sum_{k=1}^{N_i}\frac{1}{k}\right).
517: \end{equation} 
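One way to obtain this expression is sketched here, under the
assumption that the factorials are extended to non-integer arguments
through $\ln{k!} = \ln{\Gamma(k+1)}$ (the derivation in
Sec.~\ref{ap:derivativesdS} may proceed differently). The derivative
is $d\ln{\Gamma(k+1)}/dk = \psi(k+1)$, where $\psi$ is the digamma
function; for integer $k$, $\psi(k+1) = -\gamma_\mathrm{E} +
\sum_{m=1}^{k}1/m$, with $\gamma_\mathrm{E}$ the Euler--Mascheroni
constant. Since $N = \sum_j N_j$,
\begin{displaymath}
  \frac{\partial S}{\partial N_i} = \psi(N+1) - \ln{n} - \psi(N_i+1)
  = \sum_{m=1}^{N}\frac{1}{m} - \ln{n} - \sum_{m=1}^{N_i}\frac{1}{m},
\end{displaymath}
where $\gamma_\mathrm{E}$ cancels, and the chain rule with $N_i =
I_i/\sigma_\mathrm{q}$ supplies the factor $1/\sigma_\mathrm{q}$.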
518: 
519: %T % TESIS
520: %T \subsection{Genetic Algorithm}
521: %T 
522: %T Genetic Algorithms (GAs) are methods based on evolution of species
523: %T that find the vector of parameters that maximize certain function. As
524: %T said before, GA doesn't have a proof of why they work, but they often
525: %T do.
526: %T 
527: %T PIKAIA is a particular implementation of genetic algorithm
528: %T consisting in a set of points in the parameter space (called
529: %T \emph{population}), which usually start by choosing random values. On
530: %T each iteration the following procedure is made:
531: %T \begin{enumerate}
532: %T \item All points in the population are evaluated by the merit
533: %T function to be maximized.
534: %T \item Pairs of points in the set are selected (\emph{parents}) with a
535: %T probability proportional to its function value and breeded producing
536: %T two new points (\emph{offspring}). This step is repeated until the
537: %T number of offspring produced equals the number of individuals in the
538: %T current population.
539: %T \item Replace the old population by the new offspring.
540: %T \end{enumerate}
541: %T These steps are repeated until some termination criterion is satisfied. 
542: %T 
543: %T PIKAIA's idea is to simulate the process of natural selection. It
544: %T involves the breeding of the most qualified (or most probable) beings,
545: %T which produce offspring with similar characteristics. The less probable
546: %T points start to disappear on each generation. For the breeding process
547: %T PIKAIA encodes the parameters to a string-like structure
548: %T (\emph{chromosome}). The creation of the new offspring involves
549: %T crossover and mutation of their parents chromosomes. This creates a new
550: %T pair of chromosomes that afterwards are decoded to obtain the offspring
551: %T parameters.
552: %T 
553: %T Suppose you have a point $(x_1, x_2, \cdots , x_n)$ in the
554: %T $n$-dimension parameter space. PIKAIA's encoding consists in directly
555: %T fragmenting each parameter $x_i$ into simple decimal integers and then
556: %T concatenating them into a single string. The number of digits taken in
557: %T account $nd$ is given by the user. For example, if the parameter space
558: %T has two dimensions, a point $(x_1, x_2) = (0.123456789,0.987654321)$
559: %T using $nd = 8$ would be encoded into the string
560: %T ``1234567898765432''. The decoding process is straightforward, the
561: %T string should be splitted into two strings of size $nd$ and transform
562: %T them to their decimal form, $1234567898765432\rightarrow
563: %T (0.12345678,0.98765432)=(x_1,x_2)$. Note that in the whole process a
564: %T truncation lost has occurred.
565: %T 
566: %T PIKAIA's crossover process consist of splitting each chromosome in a
567: %T unique random position and then swap their first and second parts in
568: %T order to obtain two new chromosomes. For example, if we have,
569: %T 1234567898765432 as one of the parents and 7654321023456789 as the
570: %T other one, and we randomly select to cut them at site number 10, the
571: %T splitted chromosomes would be $123456789\leftrightarrow 8765432$ and
572: %T $765432102\leftrightarrow 3456789$. Then, the swap of the splitted
573: %T parts would be $123456789\leftrightarrow 3456789$ and
574: %T $765432102\leftrightarrow 8765432$, which finally produce the two
575: %T offspring 1234567893456789 and 7654321028765432. After this, both new
576: %T chromosomes are decodified in order to obtain the offspring
577: %T parameters. In literature, this is called one point uniform
578: %T crossover. ``One point'' because chromosomes are splitted in only one
579: %T place, and ``uniform'' because the place to be splitted is selected by a
580: %T random uniform distribution.
581: %T 
582: %T PIKAIA also involves a mutation process in which each gene (or digit)
583: %T in the offspring chromosome can be changed with a small probability
584: %T $P_\mathrm{m}$. If the gene is to be changed, a new digit is chosen
585: %T randomly in order to replace the old one. For example, in our
586: %T chromosome 7654321028765432, digit number two can be changed with a
587: %T probability $P_\mathrm{m}$ to another random number, for example, 1,
588: %T leaving the offspring as 7154321028765432. In literature, this is
589: %T called one point uniform mutation.
590: %T 
591: %T Crossover and mutation can create individuals very similar or very
592: %T different from their parents. It all depends on the splitting point in
593: %T the crossover process and in the chromosome to be changed in the
594: %T mutation process. These selections may be done in one of the leading
595: %T digits or in one of the least significant digits of the chromosome,
596: %T making some points to move further from their parents than others.
597: %T 
598: %T We used PIKAIA for our reconstruction problem, using as parameters the
599: %T positions and intensities of our Voronoi polygons ($(x_i, y_i)$ and
600: %T $I_i$). We used a population of 100 individuals and stopped after
601: %T $10^6$ generations. Because our parameter space is too big, the GA
602: %T took a lot of time and did not converge to a satisfactory result.
603: 
604: %
605: % What genetic algorithms are and how they work. Particularly, explain
606: % PIKAIA.
607: % 
608: % \subsection{Monte Carlo Methods}
609: % Generation of samples which follow a particular probability
610: % function. How can we use this for optimization. Brief explanation of
611: % uniform sampling, importance sampling and rejection sampling.
612: % \subsubsection{Metropolis-Hasting}
613: % \subsubsection{Gibbs Sampling}
614: % \subsubsection{Efficient Monte Carlo Methods}
615: % Brief explanation of Hamiltonian Monte Carlo, Overrelaxation,
616: % Simulated Annealing and Skilling's Method, and why didn't we used
617: % them.
618: % 
619: % \subsection{Hybrid Methods}
620: % How we combined different of the above algorithms.
621: % \subsubsection{Incremental Algorithms}
622: % Starting with few polygons and incrementing them in each step.
623: 
624: \section{VIR Design and Implementation} \label{sec:Implementation}
625: 
We designed VIR and implemented it in C++ as six modules, which include
algorithms for:
628: \begin{itemize}
629:   \item the generation of the Voronoi diagram
630:   \item calculation of model visibilities
631:   \item calculation of the merit function $L$ to be optimized as well
632:     as its derivatives
633:   \item fitting a Voronoi diagram to an image
634:   \item the CG method
635:   \item the optimization of the number of polygons
636: \end{itemize}
637: %We have implemented them by using object oriented programming in c++.
638: 
VIR uses the CG method from \cite{NumericalRecipes} and searches for
the positions and intensities of the Voronoi polygons, $\{x_i, y_i,
I_i\}$, that minimize our merit function $L$. The CG method modifies
the intensities and also moves the Voronoi generators, which in turn
changes the shapes of the Voronoi polygons.
645: % for optimizing the
646: % position and intensity of each Voronoi cell. 
647: % We decided to use object oriented programming and programmed a set of
648: % classes in C++.
A general problem with CG is that it usually converges to local
minima. For VIR in particular, although the intensities of the Voronoi
polygons adjust well, the positions of the generators are difficult to
modify substantially. The VIR parameter space is smooth enough in
intensity space for CG to converge to a good solution, but the
parameter space in cell generator positions is very structured, and CG
quickly gets stuck in local minima.
656: 
Because CG easily falls into local minima, we needed a
good approximation for the initial Voronoi diagram. For this purpose
we used a pixelated version of the Bayesian algorithm, where the model
was a uniform grid.
661: % QUANTUM
662: % But, as explained below, the importance of the natural entropy term
663: % increases for many free parameters, which causes plainer images than
664: % expected. 
665: We decided to do a pure $\chi^2$ (maximum likelihood, ML)
666: reconstruction and use the fifth CG iteration as our starting
667: image. We chose this particular iteration because on inspection the
668: modeled images were still smooth.
669: %, this iteration picks up smooth intensity fields. 
670: Pure $\chi^2$ reaches convergence with noisy images, where the
671: true image is unrecognizable.
672: %by visual inspection of its residuals. 
673: We then fitted a Voronoi diagram to the image (see
674: Sec.~\ref{ap:fitting}) and ran CG using the positions and intensities
675: of the generators as our free parameters, which led to our final
676: reconstruction. Truncation to a level of $10^{-5}$ quanta was used to
677: enforce positivity.
678: 
679: % \subsection{Quantum Size and Number of Voronoi Generators}
680: 
An important issue to consider is the size of the quantum
$\sigma_\mathrm{q}$. \cite{W&S} treat $\sigma_\mathrm{q}$ as a free
parameter, but, as we now explain, $\sigma_\mathrm{q}$ was held
constant in this implementation of VIR. We treat the number of quanta
per cell as a continuous variable in order to use the CG method.
Entropy is maximized at $\sigma_\mathrm{q} = \infty$, where, for a
given configuration of intensities $\{I_i\}$, $N = 0$ and $S = 0$. For
every other value of $N$ the entropy is negative. Because the $N_i$
are continuous, the intensities $I_i = \sigma_\mathrm{q}N_i$ can still
take reasonable values for large $\sigma_\mathrm{q}$ (using small
$N_i$). Figure \ref{fig:SvsN} shows $S$ as a function of $N$ for $51$
Voronoi generators and three different intensity distributions using
the model tessellation of Figure \ref{fig:reconstruccion}c. We
considered: (1) the VIR intensities of Figure
\ref{fig:reconstruccion}c, (2) a uniform intensity distribution ($N_i
= \frac{N}{n}$ $\forall$ $i$), and (3) a spike where all $N$ quanta
are in a single cell ($N_i = N$, $N_j =0$ $\forall$ $j\neq i$). The
curves of Figure \ref{fig:SvsN} are obtained by keeping the
intensities fixed and modifying $\sigma_\mathrm{q}$ in order to obtain
different $N$. Figure~\ref{fig:SvsN} shows that the entropy is
maximized at $N = 0$, independently of the intensities $\{I_i\}$ of
the model. The optimal value $\sigma_\mathrm{q} = \infty$ is therefore
reached whenever the number of quanta per cell is treated as a
continuous variable. If the numbers of quanta per cell were discrete
variables, as in \cite{W&S}, the choice of a large $\sigma_\mathrm{q}$
would admit only zero values for every cell: if one or more quanta
fell in a given cell, the intensity of that cell would diverge as
$\sigma_\mathrm{q}$ for arbitrarily large $\sigma_\mathrm{q}$, causing
a large $\chi^2$ value. Therefore, in our continuous optimization the
intensity quantum must be determined \emph{a priori}.
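
As a check of the limiting cases plotted in Figure~\ref{fig:SvsN}, the
entropy $S = \ln\left(N!/(n^N\prod_i N_i!)\right)$ can be evaluated in
closed form for the spike and uniform distributions. This sketch
assumes that the factorials are continued to real arguments through
the Gamma function, $N! \rightarrow \Gamma(N+1)$; that continuation is
an assumption made here for illustration only:
\begin{eqnarray*}
S_\mathrm{spike} & = & \ln\Gamma(N+1) - N\ln n - \ln\Gamma(N+1)
  = -N\ln n, \\
S_\mathrm{uniform} & = & \ln\Gamma(N+1) - N\ln n -
  n\ln\Gamma\left(\frac{N}{n}+1\right).
\end{eqnarray*}
Both expressions vanish at $N = 0$, and $S_\mathrm{spike}$ is negative
for every $N > 0$ when $n > 1$.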
711: 
712: \begin{figure}[h]
713: %\epsscale{1.50}
714: \plotone{f3.eps}
\caption{Entropy values for different $N$, with $n = 51$ and
  $\{I_i\}$ kept fixed. This is achieved by varying
  $\sigma_\mathrm{q}$. (a) VIR reconstruction intensities. (b) Uniform
  intensity distribution, $N_i = \frac{N}{n}$ $\forall$ $i$. (c)
  Only one cell has all the quanta, $N_i = N$, $N_j =0$ $\forall$
  $j\neq i$.
721:   \label{fig:SvsN}}
722: \end{figure}
723: 
724: \begin{figure}[h!]
725: %  \includegraphics[scale=1.115]{figuras/reconstrucciones51.ps}
726:   \epsscale{1}
727:   \plotone{f4_color.eps}
728:   \caption{Comparison of MEM and VIR reconstruction techniques for a
729:     SNR of $\sim52$. (a) The true image. (b) Dirty map. (c) VIR
730:     reconstruction with its polygons drawn. (d) VIR
731:     reconstruction. (e) Dirty map of the VIR reconstruction
732:     residuals. (f) Restored image for the VIR model. (g) MEM
733:     reconstruction. (h) Dirty map of the MEM reconstruction
734:     residuals. (i) Restored image for the MEM model.
735:     \label{fig:reconstruccion}}
736: \end{figure}
737: 
738: 
739: %QUANTUM
740: %The importance of the entropy diminishes for small $N$. Figure
741: %\ref{fig:SvsN} shows $S$ as a function of $N$ for $57$ generators
742: %Voronoi diagrams with 3 different intensity distributions using the
743: %same model tessellation of Figure \ref{fig:reconstruccion}a. We
744: %considered: 1- the VIR intensities of Figure
745: %\ref{fig:reconstruccion}a, 2- a uniform intensity distribution image
746: %($N_i = \frac{N}{n}$ $\forall$ $i$), 3- a spike where all $N$ are only
747: %in one cell ($N_i = N$, $N_j =0$ $\forall$ $j\neq i$). The curves of
748: %Figure \ref{fig:SvsN} are obtained by keeping the intensities fixed
749: %and modifying $\sigma_\mathrm{q}$ in order to obtain different $N$.
750: %The diminution of $||S||$ when $N$ diminishes means that for larger
751: %$\sigma_\mathrm{q}$, $\chi^2$ becomes more important than the entropy,
752: %resulting in noisy optimal images. We reach this conclusion because
753: %%This is caused by the fact that 
754: %we are optimizing our $N_i$ as continuous variables, which means that
755: %even for large $\sigma_\mathrm{q}$, the intensities $I_i =
756: %\sigma_\mathrm{q}N_i$ will have reasonable values (using small
757: %$N_i$). If the number of quanta per cell were discrete variables, as
758: %in \cite{W&S}, the choice of a big $\sigma_\mathrm{q}$ would admit
759: %only zero values for every cell. Otherwise, if one or more quanta fell
760: %in a given cell, the intensity of that cell would diverge as
761: %$\sigma_\mathrm{q}$ for arbitrarily large $\sigma_\mathrm{q}$, causing
762: %a big $\chi^2$ value.
763: %
764: %Therefore, in our continuous optimization the intensity quantum must
765: %be determined a-priori. 
766: In the Bayesian description of the entropy we count events that fall
767: in each cell. It seems reasonable to take the noise level as the
768: minimum value of intensity we can distinguish. So, $\sigma_\mathrm{q}$
769: should approximate the estimated thermal noise in the naturally
770: weighted dirty map. The definition of the weighted dirty map
771: \citep[e.g.][]{Briggs} is
772: \begin{eqnarray}
773:   I^\mathrm{D} (x, y) \equiv
774:   \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}W(u, v)V(u, v)e^{-2\pi
775:   i(ux + vy)}dudv,\\
776:   W(u, v) = \frac{1}{\sum_kw_k} \sum_kw_k\delta(u-u_k, v-v_k),
777: \end{eqnarray}
where the sums extend over all visibilities, $w_k$ is the weight
given to visibility $k$, and $\delta$ is the two-dimensional Dirac
delta function. Propagating the thermal noise, we get for the standard
deviation of the dirty map
782: \begin{equation}
783: \sigma_\mathrm{rms}^\mathrm{D} =
784: \sqrt{\frac{\sum_kw_k^2\sigma_k^2}{(\sum_kw_k)^2}}, \label{eq:sigmaD}
785: \end{equation}
where $\sigma_k$ are the standard deviations of the visibilities. To
take into account model pixels correlated by the interferometer beam,
we should multiply the previous expression by $\sqrt{N_\mathrm{beam}}$,
where $N_\mathrm{beam}$ is the number of pixels inside a beam
pattern. This leads to
791: \begin{equation}
792:   \sigma_\mathrm{rms} = \sqrt{\frac{\sum_kw_k^2\sigma_k^2}
793:   {(\sum_kw_k)^2}}\sqrt{N_\mathrm{beam}}.
794: \end{equation}
For natural weights, $w_k = \frac{1}{\sigma_k^2}$, this reduces to
796: \begin{equation}
797:   \sigma_\mathrm{rms} = \sqrt{\frac{N_\mathrm{beam}}{\sum_kw_k}} =
798:   \sqrt{\frac{N_\mathrm{beam}}{\sum_k\frac{1}{\sigma_k^2}}}.
799: \end{equation}
We calculated the noise with natural weighting, $w_k =
\frac{1}{\sigma_k^2}$, because this is the weight we give to each
individual visibility in the optimization of the merit function.
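
The step from the general expression to the natural-weight result can
be made explicit. With $w_k = 1/\sigma_k^2$, each term of the
numerator satisfies $w_k^2\sigma_k^2 = w_k$, so
\begin{displaymath}
  \frac{\sum_kw_k^2\sigma_k^2}{(\sum_kw_k)^2} =
  \frac{\sum_kw_k}{(\sum_kw_k)^2} = \frac{1}{\sum_kw_k}.
\end{displaymath}
As an illustration (a hypothetical case, not our dataset), for $K$
visibilities with equal noise $\sigma_k = \sigma$ this gives
$\sigma_\mathrm{rms} = \sigma\sqrt{N_\mathrm{beam}/K}$.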
803: 
Once we have the value of $\sigma_\mathrm{q}$ we search for the
optimal number of cells $n$. In Figure \ref{fig:Lvsn} we plot the
optimal merit function for different $n$ and
$\sigma_\mathrm{q}$. These reconstructions were made on a simulation
of CBI observations of a mock sky image (Figure
\ref{fig:reconstruccion}a). We averaged over 100 reconstructions with
different realizations of Gaussian noise. The average curves shown in
Figure \ref{fig:Lvsn} start at $n = 10$ and end at $n = 100$, for
even $n$. A single reconstruction for all $n$ took about two hours
on an AMD Athlon64 XP3000 processor with 1GB of DDR RAM at 333 MHz,
so the 300 reconstructions
% using the three different values of $\sigma_\mathrm{q}$ 
took about $25$ CPU days; we distributed the work over six
computers, so it took about $5$ days of wall-clock time. For a
signal-to-noise ratio (SNR) of $\sim 52$ and $\sigma_\mathrm{q} =
\sigma_\mathrm{rms}$, the optimal number of polygons $n$ is, on
average, between 50 and 55. When $\sigma_\mathrm{q}$ is reduced to
$\frac{1}{10}\sigma_\mathrm{rms}$, the optimal merit function is
found, on average, at $n$ close to $30$. For $\sigma_\mathrm{q} =
10\sigma_\mathrm{rms}$, the optimal $n$ is found between $80$ and
$90$. As we increase the value of $\sigma_\mathrm{q}$ we reach lower
values of the merit function, as discussed above, and the optimal
number of polygons increases.
827: %This is because as $\sigma_\mathrm{q}$
828: %increases, $\chi^2$ becomes more important than the entropy giving a
829: %reconstruction similar to a ML, and so more parameters are needed to
830: %model the noise.
831: 
832: \begin{figure}[h]
833: %\includegraphics[scale = 1.3]{figuras/resumen.ps}
834:   \epsscale{.50}
835: \plotone{f5.eps}
836: \caption{The merit function $L$ for different $\sigma_\mathrm{q}$ and
837:   number of polygons $n$. The lines are averages taken over 100
838:   different realizations of noise for each $n$. (a) Reconstructions
839:   made using $\sigma_\mathrm{q} =
840:   \frac{1}{10}\sigma_\mathrm{rms}$. (b) Reconstructions made using
841:   $\sigma_\mathrm{q} = \sigma_\mathrm{rms}$. (c) Reconstructions made
842:   using $\sigma_\mathrm{q} = 10\sigma_\mathrm{rms}$. (d) $L$ as a
843:   function of $n$ for a practical application of VIR to the simulated
844:   visibilities used in the reconstructions of
845:   Figure~\ref{fig:reconstruccion}. In this practical application, the
846:   minimum $L$ was found at $n = 51$, and is indicated by a vertical
847:   line.
848:   %(d) Dots represent the reconstruction of our simulation for
849:   %different $n$. A vertical line is drawn at $n = 51$, where the
850:   %minimum was found. 
851:   \label{fig:Lvsn}}
852: \end{figure}
853: 
854: %\subsection{Number of Voronoi Generators}
855: %Optimal number of polygons (still running, I had a few errors).
856: 
857: \section{Example Reconstruction} \label{sec:Results}
858: \subsection{Mock Dataset}
859: % We test the VIR algorithm over CBI simulations. 
860: % As said before, the CBI strong side-lobes make image reconstruction a
861: % difficult task.
862: The mock sky image we used for simulations is a $256\times 256$ image
863: consisting of three Gaussians and a rectangle. Figure
864: \ref{fig:reconstruccion}a shows this image on a $128\times128$ pixel
field. Pixels are $0.75'\times0.75'$, while the CBI's primary beam has
a FWHM of $45'$ (60 pixels), so most of the emission lies under the
beam. We simulated a CBI observation of 3620 visibilities of this
image and added Gaussian noise to the visibilities in order to reach a
SNR of $\sim52$. This SNR was calculated by taking the maximum
870: intensity from the dirty map using natural weights, and using the
871: noise $\sigma_\mathrm{rms}^\mathrm{D}$ (see
872: Eq. \ref{eq:sigmaD}). Simulation of the CBI observations is performed
873: with the MockCBI program (Pearson 2000, private communication), which
874: calculates the visibilities $V(u,v)$ on the input images $I(x,y)$ with
875: the same $uv$ sampling as a reference visibility dataset
876: (Eq.~\ref{eq:vmodel}). Thus MockCBI creates the visibility dataset
877: that would have been obtained had the sky emission followed the true
878: image. Figure \ref{fig:reconstruccion}b shows the dirty map calculated
879: over these simulated visibilities using the DIFMAP package
880: \citep[][]{she97}. The CBI's primary beam is drawn as a dashed
881: circle. The secondary side-lobes due to the central discontinuity in
882: $u$-$v$ coverage can be distinguished in Figure
883: \ref{fig:reconstruccion}b at a level comparable to the true emission.
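
The numbers quoted above can be written compactly as
\begin{displaymath}
  \frac{45'}{0.75'} = 60~\mathrm{pixels}, \qquad
  \mathrm{SNR} = \frac{\max_{x,y} I^\mathrm{D}(x, y)}
  {\sigma_\mathrm{rms}^\mathrm{D}} \sim 52,
\end{displaymath}
where $I^\mathrm{D}$ is the naturally weighted dirty map of the
simulated visibilities and $\sigma_\mathrm{rms}^\mathrm{D}$ is given
by Eq.~\ref{eq:sigmaD}. The $\max$ notation is introduced here only as
shorthand for the peak of the dirty map.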
884: 
885: \subsection{MEM Reconstruction}
886: The VIR method was compared with the MEM algorithm described in
887: \cite{Casassus}. To fit the model image to the observed visibilities,
888: MEM calculates the model visibilities required by its merit function
889: $L_\mathrm{MEM}$.
890: % MEM fits model visibilities, calculated on a model
891: % image, to the observed visibilities. 
The model visibilities are those that a simulation of CBI
observations would produce had the sky followed the model image.
% (``CBI-simulated visibilities'' hereafter)
The free parameters of our MEM model are the pixels of the
$64\times64$ model image. The functional we minimize is
$L_\mathrm{MEM} = \chi^2 - \lambda S$, with the entropy $S= - \sum_i
I_i \log (I_i/M)$, where $M$ is a default pixel value well below the
noise level, and $\{I_i\}_{i=1}^{N}$ is the model image. We started
with the fifth iteration of a pure $\chi^2$ reconstruction ($\lambda =
0$) as initial condition for the CG minimization. This is the same ML
initial condition used in our VIR method. Figure
\ref{fig:reconstruccion}g shows the reconstructed image using $\lambda
= \frac{100}{\sigma_\mathrm{rms}}$ and $M =
10^{-2}\sigma_\mathrm{rms}$, inset on a larger $128\times128$
image.\footnote{We chose to display the sky images in a larger field
than the domain of free parameters; larger fields are required to
highlight secondary side-lobes.}
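
For reference, the gradient of this functional with respect to a pixel
intensity, as needed by a gradient-based minimizer such as CG, follows
directly from the definitions above. Assuming that $\log$ denotes the
natural logarithm (another base would only rescale $\lambda$) and that
the entropy is $S = -\sum_i I_i\log(I_i/M)$,
\begin{displaymath}
  \frac{\partial L_\mathrm{MEM}}{\partial I_i} =
  \frac{\partial \chi^2}{\partial I_i} +
  \lambda\left[\log\left(\frac{I_i}{M}\right) + 1\right].
\end{displaymath}
Taken alone, the entropy term is stationary at $I_i = M/e$, so in the
absence of data constraints it pulls each pixel toward a value of
order the default $M$.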
909: 
910: % We avoided the use of Cornwell-Evans MEM (\cite{Corn&Ev}) because its
911: % neglect of side-lobes in the optimization algorithm is not well suited
912: % to the CBI, and because the relatively small number of visibilities
913: % % ($\sim 3000$ for each on-off cycle)
914: % allows working in the $uv-$plane and fit for the observed visibilities
915: % directly, rather than work in the sky plane and deconvolve the
916: % synthetic beam.  The model functional we minimize is $L =
917: % \frac{1}{2}\chi^2 - \lambda S$, with the entropy $S= - \sum_i I_i \log
918: % I_i/M$, where $M$ is a default pixel value well below the noise level,
919: % and $\{I_i\}_{i=1}^{N}$ is the model image.  $\lambda$ is a Lagrange
920: % multiplier that we adjust to ensure that $\chi^2 \sim \Omega$, where
921: % $\Omega$ is the number of observed visibilities ($\times 2$ to include
922: % real and imaginary parts).  The exact values of $\lambda$ and $M$ are
923: % subjective, and we guide our criterion by the quality of the
924: % visibility residuals and output image.  Image positivity is enforced
925: % by truncation of pixel intensities. We started with the fifth
926: % iteration of a pure $\chi^2$ reconstruction ($\lambda = 0$) as initial
927: % condition for the CG minimization. This is the same ML initial
928: % condition used in our VIR method. Figure \ref{fig:reconstruccion}c
929: % shows the reconstructed image using $\lambda =
930: % \frac{100}{\sigma_\mathrm{rms}}$ and $M = 10^{-2}\sigma_\mathrm{rms}$
931: % overlaid on a $128\times128$ image. Convergence is achieved in $\sim
932: % 30$ iterations using the Fletcher-Reeves conjugate-gradient algorithm
933: % of \cite{NumericalRecipes} obtaining a reduced $\chi^2$ value of
934: % 1.016.
935: 
936: \subsection{VIR Reconstruction}
937: The MEM algorithm described above requires the prior assignment of the
938: $\lambda$ and $M$ parameters as well as the entropy formula. In
939: contrast, our VIR algorithm is free from such arbitrary parameters
940: (provided the optimal $\sigma_\mathrm{q}$ is indeed equal to
941: $\sigma_\mathrm{rms}$). For our VIR method, we only need to find the
942: number of polygons to be used. In order to find the optimal number of
943: polygons we reconstructed with different numbers of generators in a
944: range covering each natural number from $n = 6$ to $n = 100$. We found
945: a minimum at $n = 51$. Figure \ref{fig:Lvsn} summarizes this
946: search. The whole search for a particular realization of noise took
947: about 10 hours on the AMD Athlon64 XP3000 processor with 1GB of DDR
948: RAM at 333 MHz. The VIR reconstruction using 51 polygons is shown in
949: Figure \ref{fig:reconstruccion}c, where the Voronoi cells have also
950: been drawn. Figure \ref{fig:reconstruccion}d shows the same model but
951: without drawing the Voronoi mesh.
952: 
953: \subsection{Results}
954: % MODEL
955: The quality of each reconstruction can be assessed by visual
956: inspection, comparing the VIR and MEM model images with the true
957: image.  The MEM model looks similar to the true image but is
958: noisy. The density of Voronoi generators in the VIR model is greater
959: where there is more emission in the true image, approximating the true
960: image with only a few polygons.  We calculated $\chi^2_\mathrm{im} =
961: \sum_i(I_i^\mathrm{mod} - I_i^\mathrm{true})^2$, where
962: $I_i^\mathrm{mod}$ is the intensity at pixel $i$ of the model image
963: (MEM or VIR), $I_i^\mathrm{true}$ is the intensity at pixel $i$ of the
964: true image, and the sum extends over all pixels in the
965: images. $\chi^2_\mathrm{im}$ gives a measure of how well the model
966: fits the true image. It can be seen in Table \ref{table:funciones}
967: that the VIR reconstruction has a better $\chi^2_\mathrm{im}$ than
968: MEM, showing that the VIR model is closer to the true image than the
969: MEM model.
970: 
\begin{deluxetable}{crrrr}
\tabletypesize{\small}
\tablecaption{Comparison between MEM and VIR
reconstructions.\label{table:funciones}}
\tablewidth{0pt}
\tablehead{\colhead{Model} & \colhead{$\chi^2$} & \colhead{$\frac{\chi^2}{n_\mathrm{data}}$}
  & \colhead{$L$} & \colhead{$\chi^2_\mathrm{im}$}}
977: \startdata
978: MEM & 7354.85 & 1.016 & 12192.6 & 0.001608 \\
979: VIR & 7221.04 & 0.997 & 3753.28 & 0.001396 \\
980: \enddata
981: \end{deluxetable}
982: 
983: % RESIDUALS
Figures \ref{fig:reconstruccion}e and \ref{fig:reconstruccion}h show
the residuals of the VIR and MEM models. Residual images are the dirty
maps of the visibility residuals, calculated with the optimal model
visibilities. Figure \ref{fig:reconstruccion}e shows that the VIR
residuals contain only noise. In the MEM residuals (Figure
\ref{fig:reconstruccion}h), on the other hand, the object shape can
clearly be distinguished, as well as the CBI's side-lobes. The object
seems to be more compact in the model than in its MEM residuals; as
expected, these residuals are convolved with the synthetic beam.
994: 
995: % RESTORED
996: Restored images
997: %(convolution of the reconstruction with the CBI beam plus the residuals) 
are shown in Figures \ref{fig:reconstruccion}f and
\ref{fig:reconstruccion}i. These images are obtained by convolving the
models with a Gaussian point spread function (PSF) given by DIFMAP and
adding the dirty map of the residual visibilities.
% The restored images show similar results. 
Figures \ref{fig:reconstruccion}f and \ref{fig:reconstruccion}i show
that VIR produces improved restored images relative to
MEM. The VIR restored image is similar to that expected given the
instrumental noise: it approximates the true image convolved with a
Gaussian PSF plus a uniform noise level. In the MEM restored image, on
the other hand, the CBI side-lobes can still be distinguished.
1009: 
1010: % STATS
The number of optimization parameters in MEM is $64\times64 = 4096$,
while the VIR method has only $51$ triplets (each cell's $(x, y)$
position and intensity), i.e. $153$ free parameters. This smaller
number of parameters makes the Bayesian entropy greater than in the
pixelated version, giving a smaller value of our merit function
$L$ to be minimized.
1017: 
1018: Table \ref{table:funciones} also shows
1019: $\frac{\chi^2}{n_\mathrm{data}}$ values, where $n_\mathrm{data}$ is
1020: the number of data points ($3620\times 2$ in our case). A good
1021: reconstruction should have a $\frac{\chi^2}{n_\mathrm{data}}$ value
1022: close to $1$. It can be seen that the VIR model gives a value of
1023: $\frac{\chi^2}{n_\mathrm{data}}$ closer to $1$ than the MEM
1024: reconstruction.
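
Explicitly, with $n_\mathrm{data} = 3620\times 2 = 7240$ and the
$\chi^2$ values of Table \ref{table:funciones},
\begin{displaymath}
  \frac{7354.85}{7240} \simeq 1.016 \quad (\mathrm{MEM}), \qquad
  \frac{7221.04}{7240} \simeq 0.997 \quad (\mathrm{VIR}),
\end{displaymath}
so the deviations from unity are about $1.6\times10^{-2}$ for MEM and
$2.6\times10^{-3}$ for VIR.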
1025: % The VIR model is more satisfactory in this $\chi^2$ sense than the MEM
1026: % model.
1027: 
1028: % Table \ref{table:funciones} shows different parameters used to
1029: % evaluate MEM and VIR reconstructions. We calculated two kinds of
1030: % reduced $\chi^2$. For a large number of parameters $n$, the use of a
1031: % reduced $\chi^2$ of the form $\chi^2 = \frac{\chi^2}{m}$ is needed,
1032: % where $m$ is the number of data points ($3620\times 2$ in our
1033: % case). When using a smaller $n$ the reduced $\chi^2$ is $\chi^2_\nu =
1034: % \frac{\chi^2}{\nu}$, where $\nu = m - n$ is the number of degrees of
1035: % freedom. In MEM models the number of free parameters $n$ can be larger
1036: % than $m$. It can be seen in Table \ref{table:funciones} that the MEM
1037: % reconstruction has a reduced $\chi^2$ value close to 1 only when using
1038: % the first definition. By contrast, the VIR reconstruction has a
1039: % reduced $\chi^2$ close to 1 for both definitions. VIR models are
1040: % satisfactory in the reduced $\chi^2$ sense, while MEM models lack a
1041: % statistical interpretation.
1042: 
1043: % MEM model does not give a good fit in this sense.
1044: % Although it cannot be strictly comparable, the Bayesian $L$ function
1045: % gives a much smaller value for VIR than for MEM (the MEM model
1046: % optimizes $L_\mathrm{MEM}$, not $L$). This reflects the increase in
1047: % entropy with the number of free parameters.
1048: 
1049: \section{Conclusions} \label{sec:Conclusions}
1050: We have introduced a Bayesian Voronoi image reconstruction (VIR)
1051: technique for interferometric data where the image is represented by a
1052: Voronoi tessellation in place of the usual pixelated image. The
1053: advantage of Voronoi models is that we can use a smaller number of
1054: free parameters, as required by the Bayesian analysis of a discretized
1055: intensity field. Our purpose is not optimal CPU efficiency; we search
1056: for the optimal image and model from a Bayesian point of view. The
free parameters of our model are the positions $(x_i, y_i)$ of the
Voronoi generators and their intensities $I_i$. The following points summarize our work:
1059: \begin {itemize}
1060: % \item We used a new model for representing the sky plane consisting of
1061: %   a Voronoi tessellation. The free parameters of our model are the
1062: %  Voronoi generators positions $(x_i, y_i)$ and intensities $I_i$.
1063: \item We discretized the intensity field in order to calculate \emph{a
1064:   priori} probabilities. We defined a quantum intensity value
1065:   $\sigma_\mathrm{q}$ such that $I_i = \sigma_\mathrm{q} N_i$, where
1066:   $I_i$ is the intensity at cell $i$ and $N_i$ the number of quanta in
1067:   cell $i$.
1068: % \item We defined our merit function as $-\ln{P (I, M | D)}$ and used a
1069: %   conjugate gradient algorithm for optimizating it.
\item We calculated the analytical derivatives required by the
  conjugate gradient method and cross-checked them by finite
  differences. Because the parameter space in cell generator
  positions is very structured, the positions of the Voronoi
  generators are difficult to change. As initial condition we took a
  Voronoi tessellation fitted to an interrupted maximum likelihood
  reconstruction.
1077:   % (making the position of the  generators 
1078:   % hard to move), we adjusted a Voronoi tessellation to the fifth 
1079:   % iteration of a maximum likelihood reconstruction as initial 
1080:   % condition.
1081: \item We simulated a CBI observation over a true image and
1082:   reconstructed sky images from this mock visibility dataset using MEM
1083:   and VIR.
1084: \item We defined the value of $\sigma_\mathrm{q}$ as the estimated
1085:   noise of the dirty map and searched for the optimal number of
1086:   Voronoi polygons for our example dataset.
1087: \item We finally compared the MEM and VIR models, residuals and
1088:   restored images. The VIR model is closer to our true image than the
1089:   MEM model. Residuals and restored images are also better in VIR than
1090:   in MEM. We found that VIR model visibilities give a better fit to
1091:   the data than MEM, in the sense that $\chi^2$ is closer to its
1092:   expected value.
1093:   
1094:   %We also calculated $\chi^2$ values for quantitative comparison,
1095:   %obtaining a value closer to the number of data points for VIR.
1096: \end{itemize}
1097: 
1098: % Analysis of the correct quantum size for the \emph{a-priori} probability was
1099: % also made as well as its dependency in the optimal number of polygons.
1100: %
1101: % Our implementation consists in a conjugate gradient method which
1102: % maximizes the \emph{a-posteriori} probability of the image. Function
1103: % derivatives with respect to Voronoi generators and intensities are
1104: % needed, calculated analytically and cross checked by finite
1105: % differences.
1106: %
1107: % Reconstructions over simulated data on a true image using our
1108: % algorithm and a MEM one were compared.
1109: 
1110: \acknowledgments
We are grateful to Tim Pearson for advice on FFTs and the use of
MockCBI. G.F.C. and S.C. acknowledge support from FONDECYT grant
1113: 1060827, and from the Chilean Center for Astrophysics FONDAP 15010003.
1114: 
1115: \appendix
1116: 
1117: \section{Derivatives}\label{ap:derivatives}
1118: 
1119: Our merit function for minimization is
1120: \begin{eqnarray}
1121: L & = &
1122: \frac{1}{2}\sum_{j=1}^{N_\mathrm{Vis}}\frac{||V_j^{\mathrm{mod}}
1123:   - V_j^{\mathrm{obs}}||^2}{\sigma_j^2}
1124: -\ln\left(\frac{N!}{n^N \prod_{i=1}^{n}N_{i}!}\right)\\ & = &
1125: \frac{1}{2}\chi^2 - S.
1126: \end{eqnarray}
1127: So, the derivative of $L$ with respect to any variable $x$ is
1128: \begin{equation}
1129: \frac{\partial L}{\partial x} = \frac{1}{2}\frac{\partial
1130: \chi^2}{\partial x} - \frac{\partial S}{\partial x}
1131: \end{equation}
1132: 
1133: \subsection {Calculation of the Derivatives of $\chi^2$}
1134: 
The derivatives of $\chi^2$ with respect to any variable $x$ can be obtained as follows:
1136: \begin{eqnarray}
1137: \frac{\partial}{\partial x} \frac{1}{2}\chi^2
1138:   & = & \frac{\partial}{\partial x}
1139:     \left(\frac{1}{2}\sum_{k=1}^{N_\mathrm{Vis}}\frac{||V_k^{\mathrm{mod}} -
1140:     V_k^{\mathrm{obs}}||^2}{\sigma_k^2}\right)\nonumber\\
1141: % TESIS
1142: %   & = & \frac{1}{2}\sum_{k=1}^{N_\mathrm{Vis}} \frac{1}{\sigma_k^2}
1143: %     \frac{\partial}{\partial x}
1144: %     \left((V_k^{\mathrm{mod}} -
1145: %     V_k^{\mathrm{obs}})(V_k^{\mathrm{mod}} -
1146: %     V_k^{\mathrm{obs}})^*\right)\nonumber\\
1147: %   & = & \frac{1}{2}\sum_{k=1}^{N_\mathrm{Vis}} \frac{1}{\sigma_k^2}
1148: %     \frac{\partial}{\partial x}
1149: %     \left(\mathrm{Re}(V_k^{\mathrm{mod}} - V_k^{\mathrm{obs}})^2
1150: %     + \mathrm{Im}(V_k^{\mathrm{mod}} - V_k^{\mathrm{obs}})^2
1151: %     \right)\nonumber\\
1152:   & = & \sum_{k=1}^{N_\mathrm{Vis}} \frac{1}{\sigma_k^2}
1153:     \left(\mathrm{Re}(V_k^{\mathrm{mod}} - V_k^{\mathrm{obs}})
1154:     \mathrm{Re}\left(\frac{\partial V_k^{\mathrm{mod}}}{\partial x}\right)
1155:     %\nonumber\\ &&
1156:     + \mathrm{Im}(V_k^{\mathrm{mod}} - V_k^{\mathrm{obs}})
1157:     \mathrm{Im}\left(\frac{\partial V_k^{\mathrm{mod}}}{\partial x}\right)
1158:     \right),\nonumber\\
1159:   && \label{eq:dLdI}
1160: \end{eqnarray}
where it is necessary to calculate the derivatives of the model
visibilities with respect to $x$.
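
The intermediate step uses $||z||^2 = \mathrm{Re}(z)^2 +
\mathrm{Im}(z)^2$ for a complex number $z$. Writing $\Delta_k \equiv
V_k^{\mathrm{mod}} - V_k^{\mathrm{obs}}$ (a shorthand introduced only
for this remark) and noting that $V_k^{\mathrm{obs}}$ does not depend
on the real variable $x$, each term of the sum gives
\begin{displaymath}
  \frac{1}{2}\frac{\partial}{\partial x}||\Delta_k||^2 =
  \mathrm{Re}(\Delta_k)
  \mathrm{Re}\left(\frac{\partial V_k^{\mathrm{mod}}}{\partial x}\right)
  + \mathrm{Im}(\Delta_k)
  \mathrm{Im}\left(\frac{\partial V_k^{\mathrm{mod}}}{\partial x}\right).
\end{displaymath}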
1163: 
1164: \subsubsection {Calculation of $\frac{\partial V_k^{\mathrm{mod}}}{\partial I_i}$}
1165: 
1166: % As said before, $V(\vec{k}) = \int A(\vec{x}) I(\vec{x})
1167: % e^{2\pi i\vec{k}\vec{x}}d\vec{x}$, but according to 
1168: In our Voronoi tessellation representation of the sky image
1169: \begin{equation} \label{eq:VisDiscreta}
1170: V(\vec{k}) = \sum_i^{N_\mathrm{V}} I_i\int_{\mathcal{V}_i} A(\vec{x}) e^{2\pi
1171: i\vec{k}\vec{x}}d\vec{x},
1172: \end{equation}
where $N_\mathrm{V}$ is the number of polygons, $\mathcal{V}_i$ is
polygon $i$ and $I_i$ its intensity. We neglected the $\sqrt{1 - x^2 -
y^2}$ term, which is close to $1$, but it can easily be included in
$A(\vec{x})$. Differentiating with respect to $I_i$ and defining
$f_k(\vec{x})\equiv A(\vec{x}) e^{2\pi i\vec{k_k}\vec{x}}$ we obtain
1178: \begin{eqnarray}
1179: \frac{\partial V_k^{\mathrm{mod}}}{\partial I_i} & = &
1180: \int\int_{\mathcal{V}_i} f_k(\vec{x})d^2x, \\
& = & \frac{\sin{(\pi u_k\Delta x)}\sin{(\pi v_k\Delta y)}}{\pi^2u_kv_k}
  \sum_{\textrm{\scriptsize{pixels }}l\in \mathcal{V}_i}A_l
  e^{2\pi i(u_kx_l+v_ky_l)}, \\
& \simeq & \Delta x\Delta y
  \sum_{\textrm{\scriptsize{pixels }}l\in \mathcal{V}_i}A_l
1186:   e^{2\pi i(u_kx_l+v_ky_l)}
1187: \end{eqnarray}
1188: for small $\Delta x$ and $\Delta y$.
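
The pixel sum above can be sketched as follows. This is an illustrative sketch only: the function names, the four-pixel cell, the unit primary beam and the baseline values are assumptions made for the example. It evaluates both the pixel-integrated prefactor and its small-pixel limit $\Delta x\Delta y$.

```python
import cmath
import math


def axis_factor(freq, width):
    """Integral of exp(2 pi i freq x) over a pixel of the given width, centred at 0."""
    if freq == 0:
        return width
    return math.sin(math.pi * freq * width) / (math.pi * freq)


def dV_dI(pixels, u, v, dx, dy, exact=True):
    """Derivative of one model visibility with respect to a cell intensity.

    pixels: iterable of (x_l, y_l, A_l) for the pixels inside the cell,
            with (x_l, y_l) the pixel centre and A_l the primary beam there.
    """
    if exact:
        area = axis_factor(u, dx) * axis_factor(v, dy)
    else:
        area = dx * dy  # small-pixel limit
    phase_sum = sum(a * cmath.exp(2j * math.pi * (u * x + v * y))
                    for x, y, a in pixels)
    return area * phase_sum


# Hypothetical cell made of four unit-beam pixels near the phase centre.
dx = dy = 1e-3                      # pixel size in radians (assumed)
cell = [(i * dx, j * dy, 1.0) for i in range(2) for j in range(2)]
u, v = 30.0, -12.0                  # baseline in wavelengths (assumed)
exact_val = dV_dI(cell, u, v, dx, dy, exact=True)
approx_val = dV_dI(cell, u, v, dx, dy, exact=False)
print(abs(exact_val - approx_val) / abs(approx_val))
```

For these values the relative difference between the two forms is well below one per cent, which illustrates why the small-pixel form is usually adequate.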
1189: % This integral can be computed numerically, and its domain is only
1190: % polygon $i$.
1191: 
1192: \subsubsection {Calculation of $\frac{\partial V_k^{\mathrm{mod}}}{\partial x_i}$ and $\frac{\partial V_k^{\mathrm{mod}}}{\partial y_i}$}
1193: 
To evaluate $\frac{\partial V_k}{\partial x_i}$ we move the generator
$\vec{x}_i$ by an infinitesimal amount $\delta_x$ parallel to the
$\hat{x}$ axis, as in Figure \ref{fig:VoronoiDelta}. We then calculate
1197: \begin{equation}
1198: \frac{\partial V_k}{\partial x_i} =
\lim_{\delta_x\to 0}\frac{\Delta V_k}{\delta_x},\label{eq:DerivadaLimite}
1200: \end{equation}
1201: where $\Delta V_k = V_k(\vec{x}_1, \cdots,
1202: \vec{x}_i + \vec{\delta_x}, \cdots, \vec{x}_{N_V}) -
1203: V_k(\vec{x}_1, \cdots, \vec{x}_i, \cdots, \vec{x}_{N_V})$.
1204: 
1205: \begin{figure}[h]
1206: %  \plotone{figuras/Voronoi_deltaB&N.ps}
1207:   \plotone{f6.eps}
1208:   \caption{Voronoi tessellation before and after translating the site
1209:     $\vec{x}_i$ by $\vec{\delta_x}$. Voronoi ge\-ne\-rators are
1210:     represented with dots. The solid lines are the polygons before
1211:     moving $\vec{x_i}$. The dotted lines represent the new polygons
1212:     after varying $\vec{x_i}$.}
1213:   \label{fig:VoronoiDelta}
1214: \end{figure}
1215: 
1216: It can be seen in Figure \ref{fig:VoronoiDelta} that when moving the
1217: generator $\vec{x}_i$, the only polygons modified are $\mathcal{V}_i$
1218: and its neighbors. Using this, Eq. \ref{eq:VisDiscreta} leads to
1219: \begin{eqnarray}
1220: \Delta V_k & = & I_i'\int_{\mathcal{V}_i'}f_k(\vec{x})d\vec{x} -
1221: I_i\int_{\mathcal{V}_i}f_k(\vec{x})d\vec{x}\nonumber\\
1222: & & + \sum_{j\in J_i}\bigg(I_j'\int_{\mathcal{V}_j'}f_k(\vec{x})d\vec{x} -
1223: I_j\int_{\mathcal{V}_j}f_k(\vec{x})d\vec{x}\bigg),\label{eq:deltaVis}
1224: \end{eqnarray}
1225: where $\mathcal{V}_i$ is the polygon generated by $\vec{x}_i$ before
1226: moving, $\mathcal{V}_i'$ is the same polygon after moving $\vec{x}_i$,
1227: $J_i$ is the set of indices of the polygons that are neighbors to
1228: $\mathcal{V}_i$ and $J_i'$ is the set of indices of the polygons that
1229: are neighbors to $\mathcal{V}_i'$.
1230: 
1231: It can be seen in Figure \ref{fig:VoronoiDelta} that
1232: \begin{eqnarray}
\mathcal{V}_i = (\mathcal{V}_i\cap \mathcal{V}_i')\cup\big(\mathcal{V}_i\setminus (\mathcal{V}_i\cap \mathcal{V}_i')\big), & \mathcal{V}_i' =
(\mathcal{V}_i\cap \mathcal{V}_i')\cup\big(\mathcal{V}_i'\setminus (\mathcal{V}_i\cap \mathcal{V}_i')\big),\\
1235: \mathcal{V}_j = (\mathcal{V}_j\cap \mathcal{V}_j')\cup(\mathcal{V}_i'\cap \mathcal{V}_j), & \mathcal{V}_j' = (\mathcal{V}_j\cap
1236: \mathcal{V}_j')\cup(\mathcal{V}_i\cap \mathcal{V}_j'),
1237: \end{eqnarray}
so Eq. \ref{eq:deltaVis} becomes
1239: \begin{eqnarray}
1240: \Delta V_k & = & (I_i'-I_i)\int_{\mathcal{V}_i'\cap
1241: \mathcal{V}_i}f_k(\vec{x})d\vec{x}\nonumber\\
1242: && + \sum_{j\in J_i}\bigg[(I_i' - I_j)\int_{\mathcal{V}_i'\cap
1243: \mathcal{V}_j}f_k(\vec{x})d\vec{x} + (I_j' - I_i)\int_{\mathcal{V}_i\cap
1244: \mathcal{V}_j'}f_k(\vec{x})d\vec{x}\bigg].
1245: \end{eqnarray}
In our case the intensities of the cells do not depend on the positions
of the generators, so we obtain
1248: \begin{equation}\label{eq:DeltaVis}
1249: \Delta V_k = \sum_{j\in J_i}\bigg[(I_i-I_j)\bigg(\int_{\mathcal{V}_i'\cap
1250: \mathcal{V}_j}f_k(\vec{x})d\vec{x} - \int_{\mathcal{V}_i\cap
1251: \mathcal{V}_j'}f_k(\vec{x})d\vec{x}\bigg)\bigg].
1252: \end{equation}
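As a brief check of this step (a supplementary remark that uses only the
decompositions given above), substituting $\mathcal{V}_j$ and
$\mathcal{V}_j'$ into Eq. \ref{eq:deltaVis} also yields, for each
neighbor $j$, a contribution from the region common to both polygons,
\[
(I_j' - I_j)\int_{\mathcal{V}_j\cap \mathcal{V}_j'}f_k(\vec{x})d\vec{x},
\]
which vanishes, together with the $(I_i' - I_i)$ term, once $I_i' = I_i$
and $I_j' = I_j$. The remaining terms then combine as
\[
(I_i - I_j)\int_{\mathcal{V}_i'\cap \mathcal{V}_j}f_k(\vec{x})d\vec{x}
+ (I_j - I_i)\int_{\mathcal{V}_i\cap \mathcal{V}_j'}f_k(\vec{x})d\vec{x}
= (I_i - I_j)\bigg(\int_{\mathcal{V}_i'\cap \mathcal{V}_j}f_k(\vec{x})d\vec{x}
- \int_{\mathcal{V}_i\cap \mathcal{V}_j'}f_k(\vec{x})d\vec{x}\bigg),
\]
which is the summand of Eq. \ref{eq:DeltaVis}.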
1253: 
1254: It can be seen in Figure \ref{fig:VoronoiDelta} that to obtain $\Delta
1255: V_k$ we must integrate only over the shaded regions.  For this
1256: purpose, for each region between $\vec{x}_i$ and $\vec{x}_j$ we will
1257: define a coordinate system
1258: \begin{eqnarray}
\hat{s} = -\cos{\alpha_j}\hat{x} + \sin{\alpha_j}\hat{y}, & \hat{t}
1260: = \sin{\alpha_j}\hat{x} + \cos{\alpha_j}\hat{y},\label{eq:sistema_coordenadas}
1261: \end{eqnarray}
1262: where $\alpha_j$ is the angle formed by the $-\hat{x}$ axis and the
1263: edge $a_{ij}$ between $\vec{x}_i$ and $\vec{x}_j$ (see Figure
1264: \ref{fig:VoronoiCoordenadas}). Using this change of coordinates, the
1265: integral over the region of interest is
1266: \begin{equation}
1267: \int_{\mathcal{V}_i'\cap \mathcal{V}_j}f_k(x, y)dxdy = \int_{\mathcal{V}_i'\cap
1268: \mathcal{V}_j}f_k(s,t)dsdt.\label{eq:int}
1269: \end{equation}
1270: 
1271: \begin{figure}[h]
1272:   \epsscale{.50}
1273: %  \plotone{figuras/VoronoiCoordenadasB&N.ps}
1274:   \plotone{f7.eps}
1275:   \caption{Change of coordinates from $(x, y)$ to $(s, t)$.}
1276:   \label{fig:VoronoiCoordenadas}
1277: \end{figure}
1278: 
Let $\vec{x_i} = (x_i, y_i)$ be the position of the generator of cell
$i$, $\vec{x_j} = (x_j, y_j)$ that of one of its neighbors, and
$\vec{x_i}' = (x_i + \delta_x, y_i)$ the position of the generator
after moving it by $\delta_x$. We define $\vec{x_0} \equiv (x_0, y_0)$
as the intersection of the segment joining $\vec{x_i}$ and $\vec{x_j}$
with the corresponding edge $a_{ij}$. In the same way, we define
$\vec{x_0}' = (x_0', y_0')$ as the intersection of the segment joining
$\vec{x_i}'$ and $\vec{x_j}$ with the corresponding edge $a_{ij}'$. It
can be seen in Figure \ref{fig:VoronoiCoordenadas} that
$x_0 = \frac{x_i + x_j}{2}$, $x_0' = x_0 + \frac{\delta_x}{2}$ and $y_0'
= y_0 = \frac{y_i + y_j}{2}$.
1290: 
1291: The edge $a_{ij}$ is defined in the new coordinate system by
1292: \begin{equation}
1293: s = s_0 = -x_0\cos{\alpha_j} + y_0\sin{\alpha_j}.
1294: \end{equation}
1295: In the same way, the edge $a_{ij}'$ is defined in the original
1296: coordinate system by
1297: \begin{equation}
1298: y = m(x - x_0') + y_0,
1299: \end{equation}
1300: where
1301: \begin{equation}
1302: m \equiv \frac{x_i + \delta_x - x_j}{y_j - y_i}.
1303: \end{equation}
1304: We can define the same line in our new coordinate system as
1305: \begin{equation}
1306: s = m't+b',
1307: \end{equation}
1308: where
1309: \begin{eqnarray}
1310: m' & \equiv & -\frac{\cos{\alpha_j} + m\sin{\alpha_j}}{\sin{\alpha_j}
1311:   - m\cos{\alpha_j}},\\ 
1312: b' & \equiv & \frac{-mx_0' +
1313:   y_0}{\sin{\alpha_j} - m\cos{\alpha_j}}.
1314: \end{eqnarray}
To first order in $\delta_x$, these can be approximated as
1316: \begin{eqnarray}
1317: m' & \simeq & \delta_xM_x,\\
1318: b' & \simeq &s_0 + \delta_xB_x,
1319: \end{eqnarray}
1320: where
1321: \begin{eqnarray}
1322: M_x & \equiv &\frac{\sin^2\alpha_j}{y_j - y_i} =
1323: \frac{\sin{\alpha_j}\cos{\alpha_j}}{x_i - x_j},\\ 
1324: B_x & \equiv &
1325: \frac{\sin{\alpha_j}}{y_j-y_i}(s_0\cos{\alpha_j} + x_i) =
1326: \frac{\cos{\alpha_j}}{x_i-x_j}(s_0\cos{\alpha_j} + x_i).
1327: \end{eqnarray}
1328: 
The integral in Eq. \ref{eq:DeltaVis}, which we will evaluate using our
new coordinate system, is
1331: \begin{eqnarray}
1332: \mathcal{I} & = & \int_{\mathcal{V}_i'\cap \mathcal{V}_j}f_k(\vec{x})d\vec{x} - \int_{\mathcal{V}_i\cap
1333: \mathcal{V}_j'}f_k(\vec{x})d\vec{x}\\
1334: & = & \int\int_{a_{ij}}^{a_{ij}'}A(\vec{x})e^{2\pi i(ux + vy)}dxdy.\label{eq:Ixy}
1335: \end{eqnarray}
Treating $A(\vec{x})$ as a pixelated image in the $(s, t)$ coordinate
system, Eq. \ref{eq:Ixy} becomes
\begin{equation}
\mathcal{I} = \sum_{l\ \in \mathrm{\ pixels\ of\ }
a_{ij}}A_l\int_{t_{ijl}^1}^{t_{ijl}^2}\int_{s_0}^{m't+b'} e^{2\pi i(ux(s,
t) + vy(s, t))}dsdt,
\end{equation}
where $t_{ijl}^1$ and $t_{ijl}^2$ are the $t$ coordinates of the
beginning and end of the portion of the edge $a_{ij}$ that intersects pixel
$l$.  Expanding the previous expression,
1346: \begin{eqnarray}
1347: \mathcal{I} & = & \sum_{l}A_l\int_{t_{ijl}^1}^{t_{ijl}^2}\int_{s_0}^{m't+b'}
1348: e^{2\pi i(u(-s\cos{\alpha_j} + t \sin{\alpha_j}) + v(s\sin{\alpha_j} +
1349: t \cos{\alpha_j}))}dsdt,\label{eq:integralPixel}\\ 
1350: % & = &
1351: % \sum_{l}A_l\int_{t_{ijl}^1}^{t_{ijl}^2}\int_{s_0}^{m't+b'} e^{2\pi
1352: % i(sc_1 + tc_2)}dsdt,\\ 
1353: & \simeq & 
1354: \sum_{l}\frac{A_l}{\pi c_2}e^{2\pi i(s_0c_1 + \bar{t}_{ijl}c_2)}\kappa_{ijl}\delta_x,
1355: \label{eq:IExacto}
1356: \end{eqnarray}
1357: where we defined
1358: \begin{eqnarray}
1359:   c_1 & \equiv & -u\cos{\alpha_j} + v\sin{\alpha_j},\\ 
1360:   c_2 & \equiv & u\sin{\alpha_j} + v\cos{\alpha_j},\\ 
1361:   \kappa_{ijl} & \equiv &
1362:   (M_x\bar{t}_{ijl} + B_x)\sin{(\pi c_2\Delta t_{ijl})} \\\nonumber 
1363:   &&+ i\frac{M_x}{2} \bigg(\frac{\sin(\pi c_2\Delta t_{ijl})}{\pi c_2} -
1364:   \Delta t_{ijl}\cos{(\pi c_2\Delta t_{ijl})}\bigg),\\ 
1365:   \bar{t}_{ijl} & \equiv & \frac{t_{ijl}^1 + t_{ijl}^2}{2},\\ 
  \Delta t_{ijl} & \equiv & t_{ijl}^2 - t_{ijl}^1.
1367: \end{eqnarray}
1368: 
In the calculation above we integrated over the fraction of the edge
that falls inside pixel $l$ and then summed these integrals over the
whole edge $a_{ij}$. It is also possible to approximate the integral of
Eq. \ref{eq:integralPixel} as $\int_{t_{ijl}^1}^{t_{ijl}^2}g(t) dt \simeq
g(\bar{t}_{ijl})\Delta t_{ijl}$, which is equivalent to keeping only the
leading order in $\Delta t_{ijl}$ of the integral $\mathcal{I}$ of
Eq. \ref{eq:IExacto}, obtaining
1376: \begin{equation}
1377:   \mathcal{I} = \sum_{l} A_l\Delta t_{ijl} (M_x\bar{t}_{ijl} + B_x)e^{2\pi
1378:     i(\bar{t}_{ijl}c_2 + s_0c_1)}\delta_x.\label{eq:IAprox}
1379: \end{equation}
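
The relation between the closed form of Eq. \ref{eq:IExacto} and the midpoint form of Eq. \ref{eq:IAprox} can be checked numerically for a single pixel. The sketch below is illustrative only: it works per unit $A_l\delta_x$, it assumes that $\Delta t_{ijl}$ denotes the full length $t_{ijl}^2 - t_{ijl}^1$ of the edge portion inside the pixel, and all parameter values are hypothetical. A brute-force quadrature of $\int (M_x t + B_x)e^{2\pi i(s_0c_1 + tc_2)}dt$ serves as the reference.

```python
import cmath
import math


def edge_term_exact(M, B, c1, c2, s0, t1, t2):
    """Closed-form per-pixel term, per unit A_l * delta_x."""
    t_bar = 0.5 * (t1 + t2)
    dt = t2 - t1  # assumed: full length of the edge portion inside the pixel
    arg = math.pi * c2 * dt
    kappa = ((M * t_bar + B) * math.sin(arg)
             + 1j * (M / 2) * (math.sin(arg) / (math.pi * c2)
                               - dt * math.cos(arg)))
    phase = cmath.exp(2j * math.pi * (s0 * c1 + t_bar * c2))
    return phase * kappa / (math.pi * c2)


def edge_term_midpoint(M, B, c1, c2, s0, t1, t2):
    """Midpoint approximation g(t_bar) * dt of the same term."""
    t_bar = 0.5 * (t1 + t2)
    dt = t2 - t1
    return dt * (M * t_bar + B) * cmath.exp(2j * math.pi * (s0 * c1 + t_bar * c2))


def edge_term_quadrature(M, B, c1, c2, s0, t1, t2, n=20000):
    """Reference value: composite midpoint rule on n sub-intervals."""
    h = (t2 - t1) / n
    total = 0j
    for k in range(n):
        t = t1 + (k + 0.5) * h
        total += (M * t + B) * cmath.exp(2j * math.pi * (s0 * c1 + t * c2))
    return total * h


# Hypothetical values for one pixel crossed by an edge.
params = dict(M=3.0, B=-1.2, c1=40.0, c2=25.0, s0=0.01, t1=0.020, t2=0.021)
exact = edge_term_exact(**params)
approx = edge_term_midpoint(**params)
reference = edge_term_quadrature(**params)
print(abs(exact - reference) / abs(exact), abs(exact - approx) / abs(exact))
```

With these values the closed form matches the quadrature to numerical precision, while the midpoint form differs from it at roughly the $10^{-3}$ level.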
1380: 
1381: We found by direct evaluation that the difference between
1382: Eq.~\ref{eq:IAprox} and Eq.~\ref{eq:IExacto} is negligible, so, for
1383: simplicity, we will use Eq. \ref{eq:IAprox}. Introducing
1384: Eq. \ref{eq:IAprox} in Eq. \ref{eq:DeltaVis}, we obtain
1385: \begin{equation}
1386:   \Delta V_k = \delta_x\sum_{j\in
1387:     J_i}\bigg[(I_i-I_j)\sum_{l} A_l\Delta t_{ijl} (M_x\bar{t}_{ijl} +
1388:     B_x)e^{2\pi i(\bar{t}_{ijl}c_2 + s_0c_1)}\bigg],
1389: \end{equation}
so, according to Eq. \ref{eq:DerivadaLimite}, the derivative of the
$k$-th visibility with respect to the $x$ coordinate of generator $i$ is
1392: \begin{eqnarray}
1393:   \frac{\partial V_k}{\partial x_i} & = &
  \lim_{\delta_x\to 0}\frac{\Delta V_k}{\delta_x},\\
1395:   & = & \sum_{j\in
1396:     J_i}\bigg[(I_i-I_j)\sum_{l} A_l\Delta t_{ijl} (M_x\bar{t}_{ijl} +
1397:     B_x)e^{2\pi i(\bar{t}_{ijl}c_2 + s_0c_1)}\bigg].
1398: \end{eqnarray}
Similarly, for the derivative with respect to the $y$ coordinate of
generator $i$ we obtain
1401: \begin{equation}
1402:   \frac{\partial V_k}{\partial y_i} = \sum_{j\in
1403:     J_i}\bigg[(I_i-I_j)\sum_{l} A_l\Delta t_{ijl} (M_y\bar{t}_{ijl} +
1404:     B_y)e^{2\pi i(\bar{t}_{ijl}c_2 + s_0c_1)}\bigg],
1405: \end{equation}
1406: where
1407: \begin{eqnarray}
1408: M_y & \equiv &\frac{\cos^2\alpha_j}{x_i - x_j} =
1409: \frac{\sin{\alpha_j}\cos{\alpha_j}}{y_j - y_i},\\ 
1410: B_y & \equiv & \frac{\sin{\alpha_j}}{y_j-y_i}(s_0\sin{\alpha_j} - y_i)
1411: = \frac{\cos{\alpha_j}}{x_i-x_j}(s_0\sin{\alpha_j} - y_i).
1412: \end{eqnarray}
1413: 
1414: \subsection {Calculation of the Derivatives of $S$}\label{ap:derivativesdS}
1415: 
1416: We defined our entropy as 
1417: \begin{eqnarray}
1418: S & = & \ln\left(\frac{N!}{n^N
1419:   \prod_{i=1}^{n}N_{i}!}\right) \\
1420:   & = & \ln(N!) - N\ln(n) - \sum_{i=1}^{n}\ln(N_i!)\\
1421:   & = & \ln\Big(\Gamma(N + 1)\Big) - N\ln(n) - \sum_{i=1}^{n}\ln\Big(\Gamma(N_i +
1422:   1)\Big),
1423: \end{eqnarray}
1424: where $N_i = \frac{I_i}{\sigma_\mathrm{q}}$ is the number of quanta
1425: in cell $i$, $N = \sum_iN_i$ and $\Gamma$ is the Gamma function. It
1426: can be seen that this function does not depend on the position of the
1427: Voronoi generators, so
1428: \begin{equation}
1429:   \frac{\partial S}{\partial x_i} = \frac{\partial S}{\partial y_i} = 0.
1430: \end{equation}
1431: 
Using Weierstrass' definition of the Gamma function,
\[\Gamma(z) = z^{-1}e^{-\gamma z}\prod_{n=1}^\infty
  \left[\left(1 + \frac{z}{n}\right)^{-1} e^{z/n}\right],\]
where $\gamma$ is Euler's constant, we obtain for integer $z$
\begin{equation}
  \frac{\partial \ln \Big(\Gamma(z + 1)\Big)}{\partial z} = -\gamma +
  \sum_{n=1}^z\frac{1}{n},
\end{equation}
so the derivative of $S$ with respect to $I_i$ is
\begin{equation}
  \frac{\partial S}{\partial I_i} = \frac{1}{\sigma_\mathrm{q}}
  \left(\sum_{k=1}^{N}\frac{1}{k} - \ln{n} - \sum_{k=1}^{N_i}\frac{1}{k}\right).
\end{equation}
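
The harmonic-sum expression for $\frac{\partial S}{\partial I_i}$ can be checked against a finite difference of the log-Gamma form of $S$. The following is a minimal sketch under the assumption of integer occupation numbers $N_i$; the function names, the occupation numbers and the value of $\sigma_\mathrm{q}$ are hypothetical.

```python
import math


def entropy(counts):
    """S = ln(N!) - N ln(n) - sum_i ln(N_i!), written with log-Gamma."""
    n = len(counts)
    total = sum(counts)
    return (math.lgamma(total + 1) - total * math.log(n)
            - sum(math.lgamma(c + 1) for c in counts))


def harmonic(m):
    """Harmonic number H_m = sum_{k=1}^{m} 1/k for a non-negative integer m."""
    return sum(1.0 / k for k in range(1, m + 1))


def dS_dI(counts, i, sigma_q):
    """Analytic derivative of S with respect to I_i for integer counts."""
    n = len(counts)
    total = sum(counts)
    return (harmonic(total) - math.log(n) - harmonic(counts[i])) / sigma_q


# Hypothetical occupation numbers N_i = I_i / sigma_q for four cells.
counts = [12, 7, 30, 3]
sigma_q = 0.25
i = 1

analytic = dS_dI(counts, i, sigma_q)

# Central finite difference in N_i, converted to a derivative in I_i.
h = 1e-5
plus = [c + h if j == i else c for j, c in enumerate(counts)]
minus = [c - h if j == i else c for j, c in enumerate(counts)]
numeric = (entropy(plus) - entropy(minus)) / (2 * h) / sigma_q
print(analytic, numeric)
```

Note that $N = \sum_i N_i$ also changes when $N_i$ is perturbed, which is why the first harmonic sum appears in the derivative.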
1445: 
1446: \subsection {Finite Difference Cross Check on the Derivatives}
1447: 
1448: Numerical calculation of the derivatives by finite differences is not
1449: very accurate, in particular for the position of the
1450: generators. Finite difference derivatives are calculated as
1451: $\frac{\partial L}{\partial x} = \frac{ L(x + \delta) -
1452: L(x)}{\delta}$, where $\delta$ is a small displacement of $x$. In the
1453: case of the positions of the generators, if $\delta$ is too small, the
1454: pixelization of the Voronoi diagram (needed to obtain the model
1455: visibilities) will not change after the displacement $\delta$. On the
1456: other hand, if $\delta$ is too big, the generator displacement may
1457: cause the function to change abruptly, as explained below. That is why
1458: we calculated the analytical expression for the derivatives.
1459: 
To verify that our derivatives are correctly calculated and
programmed, we compared our analytical result with a numerical
calculation. We created a Voronoi tessellation of $50$ polygons with
random positions and intensities and calculated the analytical and
numerical derivatives using these parameters $\{x_i, y_i, I_i\}$. For
$\frac{\partial L}{\partial x_i}$ and $\frac{\partial L}{\partial
y_i}$, this numerical cross check consists of displacing each Voronoi
generator by a quantity $\delta$ ranging from $-0.1$ to $0.1$ in steps
of $10^{-3}$, in units of the total size of the square image. We
evaluated the merit function $L$ at each displaced position, thus
obtaining two sequences $\{L_i\}_{i=1}^{2\times 10^2}$. We then fitted
a polynomial of order four to the curve defined by each sequence
$\{L_i\}$ and calculated the derivative of the polynomial at $\delta =
0$. For $\frac{\partial L}{\partial I_i}$ we varied the intensity of
cell $i$ by an amount ranging from $-\sigma_\mathrm{q}$ to
$\sigma_\mathrm{q}$, fitted a polynomial of order four in the same
way, and calculated its derivative. Figure~\ref{fig:dLdx} shows this
cross check for $\frac{\partial L}{\partial x_i}$ and $\frac{\partial
L}{\partial I_i}$. Although the derivatives are similar, they are not
exactly the same for $\frac{\partial L}{\partial x_i}$. This is caused
by the coarseness of the polynomial fit, as explained below.
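
The polynomial-fit cross check can be sketched as follows. This is an illustrative stand-in and not the code used for Figure~\ref{fig:dLdx}: the two test functions replace the real merit function $L$, the fit is a plain least-squares solve of the normal equations, and all names are hypothetical. The smooth function has slope $3$ at $\delta = 0$; the second one adds a jump near $\delta = 0$ to mimic an abrupt change in $L$.

```python
import math


def polyfit_slope_at_zero(deltas, values, order=4):
    """Least-squares polynomial fit; returns the fitted derivative at delta = 0."""
    scale = max(abs(d) for d in deltas)      # rescale to [-1, 1] for conditioning
    u = [d / scale for d in deltas]
    m = order + 1
    # Normal equations (V^T V) c = V^T y for the Vandermonde matrix V.
    ata = [[sum(ui ** (r + c) for ui in u) for c in range(m)] for r in range(m)]
    aty = [sum(ui ** r * yi for ui, yi in zip(u, values)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        aty[col], aty[piv] = aty[piv], aty[col]
        for r in range(col + 1, m):
            f = ata[r][col] / ata[col][col]
            for c in range(col, m):
                ata[r][c] -= f * ata[col][c]
            aty[r] -= f * aty[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in reversed(range(m)):
        s = aty[r] - sum(ata[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = s / ata[r][r]
    return coeffs[1] / scale                 # linear coefficient, back in delta units


def smooth_L(d):
    return math.sin(3 * d) + d * d           # true slope at 0 is 3


def jumpy_L(d):
    return smooth_L(d) + (0.5 if d > 0.02 else 0.0)


deltas = [k * 1e-3 for k in range(-100, 101)]   # -0.1 to 0.1 in steps of 1e-3
slope_smooth = polyfit_slope_at_zero(deltas, [smooth_L(d) for d in deltas])
slope_jumpy = polyfit_slope_at_zero(deltas, [jumpy_L(d) for d in deltas])
print(slope_smooth, slope_jumpy)
```

The fit recovers the slope of the smooth function closely, whereas the jump biases the fitted slope strongly, in line with the behavior described for generators that lie close to one another.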
1482: 
1483: \begin{figure}[h]
1484:   \epsscale{1.1}
1485: %  \plottwo{figuras/dx.ps}{figuras/dI.ps}
1486:   \plottwo{f8a.eps}{f8b.eps}
1487:   \caption{Verification of the derivatives. The solid line shows
1488:   analytical derivatives, and dots show numerical
1489:   approximations. \emph{Left:} $\frac{\partial L}{\partial
1490:   x_i}$. \emph{Right:} $\frac{\partial L}{\partial I_i}$. 
1491:   The polygon identifier $i$ is indicated on the $x$-axis.
1492:   % The polygon $i$ is represented on the $x$-axis.
1493:   \label{fig:dLdx}}
1494: \end{figure}
1495: 
1496: % MOCKCBI
1497: % In order to increase the calculation speed, MockCBI uses a fast
1498: % Fourier transform (FFT) to obtain a grid of the $uv$ plane. To obtain
1499: % the visibility at a certain baseline MockCBI interpolates the
1500: % visibility value at the four pixels surrounding the $(u, v)$
1501: % point. Figure \ref{fig:comparacionVis} shows a comparison of the
1502: % visibilities obtained using MockCBI and a direct Fourier transform
1503: % (DFT). We plotted correlations between visibility real and imaginary
1504: % parts, and modulus. We also fitted a straight line $f(x) = ax + b$ to
1505: % these data obtaining $a = 0.93,$ $b = 0.14$ for the real part, $a =
1506: % 0.51,$ $b = -0.06$ for the imaginary part and $a = 1.28,$ $b = 0.01$
1507: % for the modulus. It can be seen that using a FFT and then
1508: % interpolating to obtain simulated visibilities is not very accurate,
1509: % though it is considerably fastest for implementation purpose. The FFT
1510: % takes $\mathcal{O}((n_x\times n_y)\log{(n_x\times n_y)})$, while the
1511: % DFT takes $\mathcal{O}((N_\mathrm{Vis})(n_x\times n_y))$, where $n_x$
1512: % is the image width in pixels, $n_y$ is the image height in pixels, and
1513: % $N_\mathrm{Vis}$ is the number of visibilities. The DFT computation
1514: % time is considerably larger than the FFT, an important issue for our
1515: % method, so we decided to use MockCBI although is not exact. Figure
1516: % \ref{fig:dLdx} shows a comparison of the numerical and the analytical
1517: % derivatives for each polygon, using MockCBI (i.e. a FFT) and a DFT to
1518: % calculate model visibilities. Analytical-MockCBI derivatives use
1519: % MockCBI only when calculating model visibilities, the analytical
1520: % derivatives are calculated as above using these visibilities. It can
1521: % be seen that difference between using MockCBI and a DFT is small.
1522: 
Figure \ref{fig:ajuste} shows the curve fit for $\frac{\partial
L}{\partial x_i}$ for three different generators (numbers 37, 36 and
18, respectively). It can be seen in Figure \ref{fig:ajuste} that the
polynomial fits the function values quite well for polygon number 37,
so in Figure \ref{fig:dLdx} both derivatives are the same. In
contrast, for polygons number 36 and 18 the fitted polynomial does not
follow the function $L$ near $\delta = 0$, causing a slight difference
in their derivatives in Figure \ref{fig:dLdx}. For polygon number 18
the polynomial does not fit the curve at all. This is the main problem
of using a numerical approximation for the derivatives with respect to
$\{\vec{x_i}\}$: when two generators are closer than $\delta$, the
generator displacement causes the function $L$ to change abruptly (see
Figure \ref{fig:VorProblem}).
1536: 
1537: \begin{figure}[h!]
1538: %  \includegraphics[scale=1.05]{figuras/ajuste37.ps}
1539: %  \includegraphics[scale=1.05]{figuras/ajuste36.ps}
1540: %  \includegraphics[scale=1.05]{figuras/ajuste18.ps}
1541:   \epsscale{0.4}
1542:   \plotone{f9a.eps}
1543: 
1544:   \plotone{f9b.eps}
1545: 
1546:   \plotone{f9c.eps}
1547:   \caption{Examples of polynomial fits, used to determine numerical
1548:   derivatives of the optimization function $L$ for a particular
1549:   generator. Dots represent $L$ vs the polygon displacement $\delta$
1550:   in $x$ and the solid line shows the fourth order polynomial fit to
1551:   $L$. A vertical line is drawn at $\delta = 0$, where the derivatives
1552:   were calculated. \emph{Top:} Generator number 37, with a
1553:   satisfactory polynomial fit. \emph{Middle:} Generator number 36, the
1554:   curve is not a good fit at $\delta = 0$. \emph{Bottom:}
1555:   $18^\mathrm{th}$ generator, the curve is not a good fit because $L$
1556:   shows an abrupt variation near $\delta = 0$, which is due to the
1557:   proximity of another generator.\label{fig:ajuste}}
1558:   
1559:   % does not fit well to $L$ because the generator is too close to another
1560:   % one.\label{fig:ajuste}}
1561: \end{figure}
1562: 
1563: \begin{figure}[h]
1564:   \epsscale{0.9}
1565: %  \plottwo{figuras/VorDelta1.eps}{figuras/VorDelta2.eps}
1566:   \plottwo{f10a.eps}{f10b.eps}
1567:   \caption{Translation of a generator close to another. \emph{Left:}
1568:   Before moving generator $\vec{x_i}$, polygon $i$, the darker polygon
1569:   in the image, is on the left. \emph{Right:} After moving generator
1570:   $\vec{x_i}$, by a displacement of $\delta$, polygon $i$ is on the
1571:   right of polygon $j$.  When displacing generator $\vec{x_i}$ the
1572:   diagram changes considerably, with a concomitant abrupt variation in
1573:   $L$.
1574:   %making the optimization function $L$ to change as well.
1575:   \label{fig:VorProblem}}
1576: \end{figure}
1577: 
It can be seen that the analytical and numerical derivatives in
Figure \ref{fig:dLdx} are almost the same. As explained above,
differences arise in cases where the polynomial does not fit the
variations of the merit function $L$ well (for example, when two
generators are too close). An accurate calculation therefore requires
the analytical derivatives.
1584: 
1585: \section{Fitting a Voronoi Tessellation to an Image}\label{ap:fitting}
Once we have a reasonable reconstruction for a pixelated image, we
would like to fit a Voronoi tessellation to it in order to have a good
starting point for the CG. This is done incrementally.

We start with a mesh consisting of only one polygon. We calculate the
error per polygon as
1592: \begin{equation}
1593:   e_i^2 = \sum_l
1594:   (I_i - I^\mathrm{im}_l)^2,
1595: \end{equation}
1596: where the sum runs over all the pixels that fall inside polygon $i$,
1597: $I_i$ is the intensity of that polygon and $I^\mathrm{im}_l$ is the
intensity of pixel $l$ in the image to be fitted. In each iteration we
add a new polygon inside the one with the greatest error. The new
generator is inserted at the position of the pixel whose intensity
differs most from the mesh intensity.
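
The incremental procedure described above can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the text: the first generator is placed at the image center, each cell takes the mean intensity of its pixels, pixels are assigned to cells by brute-force nearest-generator search, and pixels that already host a generator are skipped to avoid duplicates. All function names are hypothetical.

```python
def assign_cells(shape, generators):
    """Label every pixel with its nearest generator (brute force)."""
    rows, cols = shape
    cells = [[] for _ in generators]
    for r in range(rows):
        for c in range(cols):
            nearest = min(range(len(generators)),
                          key=lambda g: (generators[g][0] - r) ** 2
                          + (generators[g][1] - c) ** 2)
            cells[nearest].append((r, c))
    return cells


def fit_voronoi(image, n_generators):
    """Incrementally place generators; returns (generators, cell intensities)."""
    shape = (len(image), len(image[0]))
    generators = [(shape[0] // 2, shape[1] // 2)]  # assumed starting position
    while True:
        cells = assign_cells(shape, generators)
        # Assumed: each cell takes the mean intensity of its pixels.
        means = [sum(image[r][c] for r, c in cell) / len(cell) for cell in cells]
        if len(generators) >= n_generators:
            return generators, means
        # Error per polygon: e_i^2 = sum_l (I_i - I_l)^2 over its pixels.
        errors = [sum((m - image[r][c]) ** 2 for r, c in cell)
                  for cell, m in zip(cells, means)]
        worst = max(range(len(cells)), key=errors.__getitem__)
        # Skip pixels that already host a generator to avoid duplicates.
        candidates = [p for p in cells[worst] if p not in generators]
        if not candidates:
            return generators, means
        new = max(candidates,
                  key=lambda p: abs(image[p[0]][p[1]] - means[worst]))
        generators.append(new)


# Hypothetical 5 x 5 image with a single bright pixel.
image = [[0.0] * 5 for _ in range(5)]
image[2][3] = 1.0
generators, intensities = fit_voronoi(image, 2)
print(generators, intensities)
```

In this toy case the second generator lands on the bright pixel, since that pixel deviates most from the mean intensity of the single initial cell.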
1602: 
1603: % I'm not sure where to put this. I guess its an incremental algorithm,
1604: % so I left it here.
1605: 
1606: % {\it Facilities:} \facility{CBI}.
1607: 
1608: \begin{thebibliography}{}
1609: %\bibitem[Briggs et al.(1999)]{Briggs} Briggs D. S., Schwab F. R.,
1610: %  Sramek R. A., 1999, in Taylor G. B., Carilli C. L., Perley R. A., eds,
1611: %  Synthesis Imaging in Radio Astronomy II, ASP Conf. Ser. 180,
1612: %  Astron. Soc. Pac., San Francisco, p. 127
\bibitem[Briggs et al.(1999)]{Briggs} Briggs, D. S., Schwab, F. R. \&
  Sramek, R. A. 1999, ASP Conf. Ser., 180, 127
\bibitem[Casassus et al.(2006)]{Casassus} Casassus, S., Cabrera, G. F.,
  F\"orster, F., Pearson, T. J., Readhead, A. C. S. \& Dickinson, C. 2006,
  \apj, 639, 951
\bibitem[Cornwell \& Evans(1985)]{Corn&Ev} Cornwell, T. J. \& Evans,
  K. F. 1985, A\&A, 143, 77
\bibitem[H\"ogbom(1974)]{CLEAN} H\"ogbom, J. A. 1974, A\&AS, 15, 417
\bibitem[Narayan \& Nityananda(1986)]{Nar&Nit} Narayan, R. \&
  Nityananda, R. 1986, \araa, 24, 127
\bibitem[Okabe et al.(1992)]{Voronoi} Okabe, A., Boots, B. \&
  Sugihara, K. 1992, Spatial Tessellations: Concepts and Applications
  of Voronoi Diagrams, John Wiley \& Sons
\bibitem[Padin et al.(2002)]{pad02} Padin, S., et al. 2002, \pasp,
  114, 83
\bibitem[Pi\~na \& Puetter(1993)]{P&P} Pi\~na, R. K. \& Puetter,
  R. C. 1993, \pasp, 105, 630
\bibitem[Press et al.(1992)]{NumericalRecipes} Press, W. H., Flannery,
  B. P., Teukolsky, S. A. \& Vetterling, W. T. 1992, Numerical Recipes in
  C, Cambridge University Press
\bibitem[Shepherd(1997)]{she97} Shepherd, M. C. 1997, in ASP
  Conf. Ser. 125, Astronomical Data Analysis Software and Systems VI,
  ed. G.~Hunt \& H.~E.~Payne (San Francisco: ASP), 77
\bibitem[Sutton \& Wandelt(2006)]{W&S} Sutton, E. C. \& Wandelt,
  B. D. 2006, \apjs, 162, 401
1637: % \bibitem[Auri\`ere(1982)]{aur82} Auri\`ere, M.  1982, \aap,
1638: %   109, 301
1639: \end{thebibliography}
1640: 
1641: \clearpage
1644: 
1645: % \begin{figure}
1646: %   \begin{minipage}[t]{2.1in}
1647: %     %\epsscale{1.1}
1648: %     %\plottwo{figuras/gaussianas.ps}{figuras/reconstruccion.ps}
1649: %     \begin{center}
1650: %       TRUE
1651: %       \includegraphics[scale=.32]{figuras/gaussianas128.ps}
1652: %       DIRTY MAP
1653: %       \includegraphics[scale=.32]{figuras/dirtyMap128K.ps}
1654: %     \end{center} 
1655: %   \end{minipage}
1656: %   \begin{minipage}[t]{2.1in}
1657: %     \begin{center}
1658: %       MEM RECONSTRUCTION
1659: %       \includegraphics[scale=.32]{figuras/MEM_CBIFinal.ps}
1660: %       MEM RESIDUALS
1661: %       \includegraphics[scale=.32]{figuras/residuosMEM128K.ps}
1662: %       MEM RESTORED
1663: %       \includegraphics[scale=.32]{figuras/MEMrestaurada128.ps}
1664: %     \end{center}
1665: %   \end{minipage}
1666: %   \begin{minipage}[t]{2.1in}
1667: %     \begin{center}
1668: %       VIR RECONSTRUCTION
1669: %       \includegraphics[scale=.32]{figuras/VIR_48_nsr50_128.ps}
1670: %       VIR RESIDUALS
1671: %       \includegraphics[scale=.32]{figuras/residuosVIR128K.ps}
1672: %       VIR RESTORED
1673: %       \includegraphics[scale=.32]{figuras/VIRrestaurada128.ps}
1674: %     \end{center}
1675: %   \end{minipage}
1676: %   \caption{Comparison of MEM and VIR reconstruction techniques for a
1677: %     nsr of $\sim51$.
1678: %     \label{fig:reconstruccion}}
1679: % \end{figure}
1680: 
1681: % MOCKCBI
1682: % \begin{figure}
1683: %   \epsscale{1.1}
1684: %   \includegraphics[scale=0.423]{figuras/comparacionVisR.ps}
1685: %   \includegraphics[scale=0.423]{figuras/comparacionVisI.ps}
1686: %   \includegraphics[scale=0.423]{figuras/comparacionVis.ps}
1687: %   \caption{Simulated visibility correlations over the full range of
1688: %     the $uv$-radii obtained by MockCBI and by a direct Fourier
1689: %     transform for each visibility baseline. Plots show real
1690: %     (\emph{left}) and imaginary (\emph{center}) parts, and
1691: %     modulus (\emph{right}).\label{fig:comparacionVis}}
1692: % \end{figure}
1693: 
1694: \end{document}
1695: