1: \NeedsTeXFormat{LaTeX2e}[1995/06/01]
2: \documentclass[10pt]{article}
3: \usepackage{epsfig,graphics}
4: \usepackage{cite}
%%%%% French typesetting packages.
6: %\usepackage[english,french]{babel}
7: %\usepackage{english}
8: \usepackage[T1]{fontenc}
9: \usepackage[swedish,english]{babel}
10: %%%%% Package for theorems
11: \usepackage{ntheorem}
12:
%%%%% Standard mathematical sets
14: \newcommand{\N}{{\bf N}}
15: \newcommand{\Z}{{\bf Z}}
16: \newcommand{\Q}{{\bf Q}}
17: \newcommand{\R}{{\bf R}}
18: \newcommand{\C}{{\bf C}}
19: \newcommand{\Qu}{{\bf H}}
\newcommand{\card}{\mathop{\rm card}\nolimits}
21:
%%%% Abbreviations for definition, theorem, and proof.
23:
24: \newtheorem{df}{Definition}[section]
25: \newtheorem{theorem}{Theorem}[section]
26: \newtheorem{prop}{Proposition}[section]
27: \newtheorem{lemma}{Lemma}[section]
28:
29: \begin{document}
30:
31: %%%%%%%%%%%%%%%% Title %%%%%%%%%%%%%%%%%%%%%
32: \title{\textsf{Almost sure convergence of the minimum bipartite matching
33: functional in Euclidean space}}
34:
35: %%%%%%%%%%%%%%%%% Authors %%%%%%%%%%%%%%%%%%%%
36: \author{\textsf{J.H.~Boutet de Monvel$^*$ and O.C.~Martin$^\dag$} \\
37: \textsf{\small $^*$Center for Hearing and Communication Research, Karolinska Institutet, 17176}\\
38: \textsf{\small Stockholm, Sweden; $^\dag$Laboratoire de Physique Th\'eorique et Mod\`eles Statistiques,}\\
39: \textsf{\small Universit\'e de Paris-Sud, 91405 Orsay, France;}}
40:
41: %%%%%%%%%%%% Date and Title %%%%%%%%%%%%%%%%%
42: \date{To appear in Combinatorica}
43: \maketitle
44:
45: %%%%%%%%%%%%%%% Abstract %%%%%%%%%%%%%%
46: \begin{abstract}
47: Let $L_N = L_{MBM}(X_1,\ldots ,X_N; Y_1,\ldots ,Y_N)$ be the minimum length of a
48: bipartite matching between two sets of points in $\mathbf{R}^d$, where
49: $X_1,\ldots ,X_N,\ldots$ and $Y_1,\ldots ,Y_N,\ldots$ are random points independently and
50: uniformly distributed in $[0,1]^d$. We prove that for $d \ge 3$, $L_N/N^{1-1/d}$ converges
51: with probability one to a constant $\beta_{MBM}(d)>0$ as $N\to \infty $.
52: \end{abstract}
53:
54: %%%%%%%%%%%%%%%%%%% Text proper %%%%%%%%%%%%%%%%%%%%%%%%
55: \section{Introduction and statement of the result.}
56:
57: \noindent Given two sets of $N$ points $X=\{X_1,...,X_N\}$ and $Y=\{Y_1,...,Y_N\}$ in
58: $\R^d$, a bipartite matching of $X$ and $Y$ is a perfect matching $M$ on the set $X\cup Y$,
59: such that each pair in $M$ is made of one point of $X$ and one point of $Y$. The length of such
60: a matching is defined to be the sum of the euclidean lengths of the edges formed by its pairs.
61: The (euclidean) minimum bipartite matching problem (MBMP) then asks one to find a
62: bipartite matching of $X$ and $Y$ whose length is as small as possible. We shall denote by
63: $L_{MBM}(X,Y)$ the length of a minimum bipartite matching of $X$ and $Y$.
64:
65: A related problem is the simple minimum matching problem (MMP), where one is asked
66: to find a perfect matching of smallest euclidean length on a set $X=\{X_1,...,X_N\}\subset \R^d$.
67: The subadditive methods inaugurated by Beardwood, Halton and Hammersley
68: (BHH) \cite{BHH59_PCPS} and further developed
69: in \cite{Steele81_AP,Rhee93_AAP,RedmondYukich94_AAP}, show
70: that a strong limit theorem applies to the length $L_{MM}(X)$ of a simple minimum matching
71: on $X$, when the points $X_1,\ldots, X_N$ are random.
72: The theorem states that for any dimension $d$, if $X_1,\ldots, X_N,\ldots$ is a sequence of
73: points distributed independently and uniformly in a bounded region $\Omega\subset {\mathbf R}^d$,
then the ratio $L_{MM}(X_1,\ldots ,X_N)/N^{1-1/d}$ converges almost surely to
${\rm Vol(\Omega)}^{1/d}\beta_{MM}(d)$, where ${\rm Vol(\Omega)}$ denotes the Lebesgue
measure of $\Omega$ and $\beta_{MM}(d)>0$ is a universal constant depending only upon $d$.
77:
78: The functional $L_{MBM}$ does not satisfy this form of limit theorem in dimensions
79: $1$ and $2$. For $d=1$, the MBMP amounts to a sorting problem and it is not difficult
80: to show that if $X$ and $Y$ both consist of $N$ points independently and uniformly
81: distributed in $[0,1]$, there are constants $0<C_1<C_2$ such that
82: $C_1\sqrt N\le L_{MBM}(X,Y)\le C_2 \sqrt N$ with probability $1-o(1)$ as
83: $N\to \infty$. Moreover in that case the variance of $L_{MBM}(X,Y)/\sqrt{N}$ does
84: {\it not} converge to zero as $N\to \infty$. ($L_{MBM}$ is not ``self-averaging'',
85: in the statistical physics' terminology.)
For $d=2$ Ajtai et al. \cite{Ajtai&Al84_C} proved a remarkable fact: if the sets
$X,Y$ are now distributed in $[0,1]^2$, then for some constants $0<C_1<C_2$ independent of
$N$, one has $C_1\sqrt{N\log N}\le L_{MBM}(X,Y)\le C_2\sqrt{N\log N}$ with
probability $1-o(1)$. Numerical simulations suggest that $L_{MBM}(X,Y)/\sqrt{N\log N}$
converges to a non-random constant as $N\to \infty$; however, this has not yet been proved.
91:
92: In this article, we show that for any $d\ge 3$ we recover a BHH theorem for the functional
93: $L_{MBM}$.
94:
95: \begin{theorem}\label{th1}
96: Let $X_1,...,X_N,...$ and $Y_1,...,Y_N,...$ be two sequences of
97: random points independently and uniformly distributed in $[0,1]^d$, where
98: $d\ge 3$, and let $L_N = L_{MBM}(X_1,\ldots ,X_N;Y_1,\ldots ,Y_N)$.
99: There exists a constant $\beta_{MBM}(d)>0$ such that
100: with probability one
101: $$ \lim_{N\to \infty} L_N/ N^{1-1/d} = \beta_{MBM}(d).$$
102: \end{theorem}
103:
104: \section{Proof of Theorem \ref{th1}.}
105:
To begin, we remark that to prove this theorem it will suffice to
establish that the mean value $EL_N/N^{1-1/d}$ converges to a constant
$\beta_{MBM}(d)$. This is a consequence of the following lemma \cite{Talagrand92_AAP}:
109:
110: \begin{lemma}
111: For any $t>0$, one has
112: $$P(|{L_N\over N^{1-1/d}}- E({L_N\over N^{1-1/d}})| > t) \le 2 \exp(-{N^{1-2/d} t^2\over 8d}).$$
113: \end{lemma}
114:
115: \noindent This result follows from the application of Azuma's inequality \cite{Azuma67_TMJ}
116: and the martingale difference method to $L_N$, in a way by now standard in the
117: probabilistic theory of combinatorial optimisation \cite{Steele97_Book}.
118: Given the lemma, the theorem follows easily from the convergence of
119: $EL_N/N^{1-1/d}$ as $N\to \infty$, by applying the Borel-Cantelli lemma.
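
\noindent For the reader's convenience, the Borel-Cantelli step can be sketched as follows
(this computation is not spelled out in the argument above and uses only the lemma).
Fix $t>0$ and let $A_N(t)$ denote the event appearing in the lemma, a notation introduced
here for illustration only. Since $1-2/d\ge 1/3$ when $d\ge 3$,
$$ \sum_{N\ge 1} P(A_N(t)) \le \sum_{N\ge 1} 2\exp\Big(-{N^{1/3} t^2\over 8d}\Big) < \infty , $$
so that with probability one only finitely many of the events $A_N(t)$ occur.
Applying this to $t=1/j$, $j=1,2,\ldots$, shows that
$L_N/N^{1-1/d}-EL_N/N^{1-1/d}\to 0$ almost surely.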
120:
121: We have now to establish that for $d\ge 3$ the quantity
122: $EL_N/N^{1-1/d}$ indeed converges to a constant $\beta_{MBM}(d)>0$.
123: To prove this we exploit the subadditivity properties of $L_{MBM}$, in the spirit
124: of Steele's theory of subadditive Euclidean functionals \cite{Steele81_AP}.
125: Let us divide the unit cube $[0,1]^d$ into disjoint
126: similar subcubes $Q_k,~k=1,\ldots ,m^d$ with edges of length $1/m$,
127: and compare the value of $L_{MBM}(X,Y)$ to
128: the sum
129: \begin{equation} \label{SumOnCubes}
130: \sum_{k=1}^{m^d} L_k,
131: \end{equation}
where $L_k$ is the value of the functional $L_{MBM}$ for the set of points
$X_i$ and $Y_i$ which belong to $Q_k$. A difficulty arises as in
134: general the $Q_k$'s do not contain the same number of points $X_i$ and of
135: points $Y_i$. (In fact the special properties of the MBMP in dimensions $1$ and
136: $2$ originate from the fluctuations of the differences between these numbers
137: around their mean value $0$.)
138: To give meaning to the sum (\ref{SumOnCubes}) we need to generalize the
139: functional $L_{MBM}$ to matchings between two sets of different cardinalities.
140: There are several ways to do this; we shall define
141: $L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})$ by imposing that
142: the minimum matching contains as few unmatched points as possible. That is if
143: $N_1>N_2$, we leave $N_1-N_2$ points of $X$ unmatched, whereas if
144: $N_1<N_2$ we leave $N_2-N_1$ points of $Y$ unmatched.
145:
146: Although expression (\ref{SumOnCubes}) now makes sense, it is still not possible
147: to write a subadditivity inequality of the same form as the one studied
148: in \cite{Steele81_AP}. Indeed, such a form (which Steele calls ``geometric
149: subadditivity'') implies an upper bound of the form $CN^{1-1/d}$ for the functional
150: at hand \cite{Steele97_Book}, and it is easy to see that no such bound applies
151: to $L_{MBM}(X,Y)$. We shall however see that a geometric subadditivity
152: property holds {\it in the mean} for the functional $L_{MBM}$.
153: Suppose that the points $X_1,\ldots X_{N_1},Y_1,\ldots Y_{N_2}$ belong to an
154: arbitrary cube $Q$ having edge length $a$, and divide $Q$ into
155: disjoint cubes $Q_p,~p=1,\ldots 2^d$ by splitting each edge in two halves.
156: Construct in each $Q_p$ an optimal matching in the sense just defined,
157: between the $n_{1,p}$ points $X_i$ and the $n_{2,p}$ points $Y_i$ in $Q_p$,
158: and denote its length by $L_p$.
There are $|n_{1,p}-n_{2,p}|$ points left unpaired in each
$Q_p$, so if $L_0$ denotes the length of an optimal matching for these
points one has
162: \begin{eqnarray} \label{Decimation}
163: L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots ,Y_{N_2}) \le
164: \sum_{p=1}^{2^d} L_p + L_0 \nonumber\\
165: \le \sum_{p=1}^{2^d} L_p + {1\over 2} a\sqrt d
166: \sum_{p=1}^{2^d} |n_{1,p}-n_{2,p}|,
167: \end{eqnarray}
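where the last inequality holds because an optimal matching of the leftover points has at most ${1\over 2}\sum_p |n_{1,p}-n_{2,p}|$ edges, each of length at most the diameter $a\sqrt d$ of $Q$.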
169:
We shall apply this to $Q=[0,1]^d$. Let $Q_{p_1},~p_1=1,\ldots ,2^d$,
171: be the cubes obtained in the above subdivision; let $Q_{p_1p_2}$ be
172: the cubes obtained by splitting in two halves the edges of each cube $Q_{p_1}$;
173: and so on. By repeating this operation $K$ times, we get a subdivision with
174: cubes $Q_{p_1\ldots p_K}$ whose edges are of length $1/2^K$. Let
175: $n_{1,p_1\ldots p_K}$ and $n_{2,p_1\ldots p_K}$ be respectively
176: the number of points $X_i$ and $Y_i$ in $Q_{p_1\ldots p_K}$. Apply
177: (\ref{Decimation}) first to the $Q_{p_1,\ldots p_{K-1}}$'s, then to the
178: $Q_{p_1\ldots p_{K-2}}$'s, etc, keeping at each step only those points which
179: are still unpaired. It is easy to convince oneself that the number of unpaired
180: points in each $Q_{p_1,\ldots p_{K-k}}$ just after step $k$ is given by
181: $|n_{1,p_1,\ldots p_{K-k}}-n_{2,p_1,\ldots p_{K-k}}|$. After step $k=K$ one
182: obtains a matching between $X_1,\ldots X_{N_1}$ and $Y_1,\ldots Y_{N_2}$
183: where all the points but $|N_1-N_2|$ are matched.
184: One is thus led to the following inequality:
185: \begin{eqnarray} \label{SousAddMBMP}
186: L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})
187: \le \sum_{p_1\ldots p_K} L_{p_1\ldots p_K} \nonumber\\
188: + \sum_{k=1}^K {\sqrt d\over 2^k}
189: \sum_{p_1\ldots p_k} |n_{1,p_1\ldots p_k}-n_{2,p_1\ldots p_k}|.
190: \end{eqnarray}
191: We now proceed to derive a subadditivity property for the mean
192: value of $L_{MBM}(X,Y)$. We first consider the case where
193: $N_1=\card X$ and $N_2=\card Y$ are not fixed integers but are independent Poisson
194: random variables with the same mean value $N$, the elements of $X$ and $Y$ being
195: chosen independently and uniformly in $[0,1]^d$. For a given $k$, the numbers
196: $n_{1,p_1,\ldots p_k}$ and $n_{2,p_1,\ldots p_k}$ are then also independent
197: Poisson random variables, with parameter $N/2^{kd}$. Let
198: $M(N)= EL_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})$.
199: It is immediate by homogeneity that
200: \begin{equation}
201: EL_{p_1\ldots p_K} = 2^{-K} M(N/2^{Kd}).
202: \end{equation}
203: Moreover from the well known properties of Poisson variables we have
204: \begin{equation} \label{RMSPoisson}
205: E|n_{1,p_1\ldots p_k}-n_{2,p_1\ldots p_k}| \le
206: \sqrt 2 \Big( {N\over 2^{kd}} \Big)^{1/2}.
207: \end{equation}
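\noindent A short justification of (\ref{RMSPoisson}), written out here as a sketch: with
$\lambda = N/2^{kd}$ (a shorthand used only in this remark), the two counts are independent
with mean and variance $\lambda$, so the Cauchy-Schwarz inequality gives
$$ E|n_{1,p_1\ldots p_k}-n_{2,p_1\ldots p_k}| \le
\Big( E(n_{1,p_1\ldots p_k}-n_{2,p_1\ldots p_k})^2 \Big)^{1/2}
= (\lambda+\lambda)^{1/2} = \sqrt 2\, \lambda^{1/2}. $$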
208: By taking mean values in (\ref{SousAddMBMP}) we obtain:
209: \begin{equation}
210: M(N) \le 2^{K(d-1)}M(N/2^{Kd}) + \sqrt{2dN} \sum_{k=1}^K 2^{k(d/2-1)}.
211: \end{equation}
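\noindent To make the bookkeeping explicit (a sketch of the computation behind the two
terms): at level $k$ there are $2^{kd}$ cubes, each contributing at most
$\sqrt 2\,(N/2^{kd})^{1/2}$ by (\ref{RMSPoisson}), with weight $\sqrt d/2^k$, so that
$$ {\sqrt d\over 2^k}\cdot 2^{kd}\cdot \sqrt 2\, \Big({N\over 2^{kd}}\Big)^{1/2}
= \sqrt{2dN}\; 2^{k(d/2-1)}, $$
while the $2^{Kd}$ cubes of the finest level contribute
$2^{Kd}\cdot 2^{-K}M(N/2^{Kd}) = 2^{K(d-1)}M(N/2^{Kd})$, which is the first term of the
inequality above.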
This inequality has been obtained for a subdivision of $[0,1]^d$ which
consists of $2^{Kd}$ similar cubes. Suppose now that we start from the
subdivision $\Sigma$ into $m^d$ similar cubes $Q_k,~k=1,\ldots ,m^d$,
215: where $m$ is an arbitrary integer. One can then reproduce the previous
216: construction in the following manner. Let $m=2^K+r$
217: where $0\le r<2^K$. Consider the cube $Q_0=[0,2^{K+1}/m]^d$ and form the
218: natural subdivision $\Sigma_0$ of $Q_0$ by $2^{(K+1)d}$ cubes
$Q_{p_0\ldots p_K}$ whose edges have length $1/m$. We can carry out with
$Q_0$ and $\Sigma_0$ a $(K+1)$-step construction similar to the one
which led to (\ref{SousAddMBMP}). The only differences are that $Q_0$ has
222: edges of length $2^{K+1}/m$ rather than $1$, and that some of the
223: $Q_{p_0\ldots p_K}$'s, namely those which belong to
224: $\Sigma_0$ but not to $\Sigma$, are empty.
225: Nevertheless, we may write
226: \begin{eqnarray}
L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots ,Y_{N_2}) - \sum_{k=1}^{m^d} L_k \nonumber \\
228: \le \sum_{k=0}^K {\sqrt d 2^{K-k} \over m}
229: \sum_{p_0\ldots p_k} |n_{1,p_0\ldots p_k}-n_{2,p_0\ldots p_k}| \nonumber \\
230: \le \sum_{k=0}^K {\sqrt d \over 2^k}
231: \sum_{p_0\ldots p_k} |n_{1,p_0\ldots p_k}-n_{2,p_0\ldots p_k}|.
232: \end{eqnarray}
Now $n_{1,p_0\ldots p_k}$ and $n_{2,p_0\ldots p_k}$ are Poisson
variables with parameter at most $2^{(K-k)d} N/m^d \le 2^{-kd}N$, so we
still have
236: \begin{equation}
237: E|n_{1,p_0\ldots p_k}-n_{2,p_0\ldots p_k}| \le
238: \sqrt 2 \Big({N \over 2^{kd}} \Big)^{1/2}.
239: \end{equation}
240: Taking average values one is led to
241: \begin{equation}
242: M(N) \le m^{d-1} M(N/m^d) +
243: 2^d \sqrt{2dN} \sum_{k=0}^K 2^{k(d/2-1)}.
244: \end{equation}
245: Dividing this last inequality by $N^{1-1/d}$ and then replacing $N$ by
246: $m^dN$, we get
247: \begin{equation}
248: {M(m^dN) \over (m^dN)^{1-1/d}} \le {M(N)\over N^{1-1/d}} +
249: {2^d\sqrt{2d} \over N^{1/2-1/d}} \sum_{k=0}^K 2^{-k(d/2-1)}.
250: \end{equation}
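\noindent The passage from positive to negative exponents in the sum deserves a word; the
following is a sketch of the intermediate step, using only $m\ge 2^K$. After the
substitution, the error term reads
$$ {2^d\sqrt{2d\,m^dN}\over (m^dN)^{1-1/d}} \sum_{k=0}^K 2^{k(d/2-1)}
= {2^d\sqrt{2d}\over N^{1/2-1/d}}\; m^{-(d/2-1)} \sum_{k=0}^K 2^{k(d/2-1)}, $$
and since $m^{-(d/2-1)}\le 2^{-K(d/2-1)}$ for $d\ge 2$, the right-hand side is at most
$2^d\sqrt{2d}\, N^{-(1/2-1/d)} \sum_{k=0}^K 2^{-(K-k)(d/2-1)}$, which is the bound stated
above after relabelling $K-k$ as $k$.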
251: If $d>2$, the sum on the r.h.s. of the last inequality is bounded above independently of
252: $N$, and is divided by a positive power of $N$. Elementary analysis now shows that the
253: ratio $M(N)/N^{1-1/d}$ necessarily converges to a limit $\beta_{MBM}(d)$ as $N\to \infty$.
254: Indeed, let $f(t) = M(t^d)/t^{d-1}$. One verifies at once that $f(t)$ satisfies
\begin{equation} \label{fInequality}
f(mt)\le f(t)+C_d/t^{d/2-1}
\end{equation}
for all $t>0$ and any integer $m\ge 1$, where $C_d$ is a constant depending only on $d$; $f(t)$ is continuous,
since $M(N)$ is a continuous function of $N$.
So the expression $f(t) + C_d/t^{d/2-1}$ is bounded in $[1,2]$ and since
$[1,\infty[$ is the union of the intervals $m[1,2], m\ge 1$, it follows
from (\ref{fInequality}) that $f(t)$ remains bounded as $t\to \infty$,
thus $\limsup f(t) < \infty$. Now define $\beta=\liminf f(t)$. For any
$\epsilon >0$, choose $t_0\gg 1$ and
$\eta >0$ such that $f(t)+C_d/t^{d/2-1} < \beta + \epsilon$
for $t$ in the interval $I=[t_0-\eta,t_0+\eta]$.
Since the intervals $mI$, $m\ge 1$, cover a whole interval
$[A,\infty[$ for an $A$ sufficiently large
(consecutive intervals $mI$ and $(m+1)I$ overlap as soon as $m\ge (t_0-\eta)/(2\eta)$),
it follows again from (\ref{fInequality}) that
$\limsup f(t)\le \beta+\epsilon$.
Since $\epsilon$ is arbitrary one has $\limsup f(t)=\beta$, hence
$f(t) \to \beta$ as $t\to \infty$, from which it follows that
$\lim_{N\to \infty} M(N)/N^{1-1/d}=\beta$. Q.E.D.
274:
We have thus shown that, for $d\ge 3$, one has
\begin{equation} \label{PoissonMBMPAsymptotics}
EL_{MBM}(X_1,\ldots,X_{N_1};Y_1,\ldots,Y_{N_2})
\sim \beta_{MBM}(d)N^{1-1/d},~N\to \infty
\end{equation}
280: when $N_1$ and $N_2$ are independent Poisson variables with parameter $N$.
281: The same result for the mean value $EL_N$, where $N$ is a fixed integer,
282: follows then easily. Indeed, we have the obvious bound
283: \begin{eqnarray}
284: |L_{MBM}(X_1,\ldots X_N;Y_1,\ldots Y_N)-
285: L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})| \nonumber\\
286: \le \sqrt d (|N_1-N|+|N_2-N|),
287: \end{eqnarray}
288: whence taking mean values,
289: \begin{equation}
290: |EL_N - EL_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})| \le 2 \sqrt{2dN},
291: \end{equation}
292: and dividing by $N^{1-1/d}$ we deduce that
293: \begin{equation}
\lim_{N\to \infty} {EL_N\over N^{1-1/d}} = \beta_{MBM}(d).
295: \end{equation}
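\noindent In more detail (a sketch; the constant is not optimized): $N_1$ and $N_2$ have
mean and variance $N$, so $E|N_i-N|\le \sqrt N\le \sqrt{2N}$ for $i=1,2$ by the
Cauchy-Schwarz inequality, and
$$ {2\sqrt{2dN}\over N^{1-1/d}} = {2\sqrt{2d}\over N^{1/2-1/d}} \to 0 \quad (N\to\infty) $$
as soon as $d\ge 3$, so that $EL_N/N^{1-1/d}$ and $M(N)/N^{1-1/d}$ have the same limit.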
296: Theorem \ref{th1} is now proved.
297:
298: \section{Concluding remarks.}
299:
300: \noindent 1) Our decimation procedure does not give back the bounds
301: proven by Ajtai {\it et al.} in $d=2$, but a weaker
302: $O(\sqrt{N} \ln N)$ bound.
303: It is believed that a self-averaging theorem applies also to the
304: functional $L_{MBM}$ in dimension $2$ \cite{Smith89_Thesis}.
305:
306: \noindent 2) The estimation of the constants $\beta_{MBM}(d)$ is also an
307: interesting problem. A remarkable result of Talagrand \cite{Talagrand92_AAP}
shows that one has $\beta_{MBM}(d)= \sqrt{d/(2\pi e)}\, (1+O(\ln d / d))$ as
309: $d\to \infty$. It is conjectured that a $1/d$ series expansion actually exists
310: for $\beta_{MBM}(d)$.
311:
312: \noindent 3) M\'ezard and Parisi have obtained detailed analytic predictions for
313: the {\it random link} versions of the MMP and the MBMP \cite{MezardParisi87_JdP},
314: where the distance matrix between the points $X_i$ and $Y_j$ is replaced by a matrix of
315: independent and identically distributed entries. (Some of these predictions, for the random
316: assignment problem, have been proven recently by Aldous \cite{Aldous01_RSA}.)
317: Numerical studies \cite{BoutetMartin97_PRL,HBM98_EPJB} indicate that for the MMP and the
318: MBMP, the random link model provides one with a very good ``mean-field'' approximation to
319: the Euclidean model in the large $d$ limit. Except for simpler combinatorial problems
320: however \cite{BertsimasVanRyzin90_ORL}, very few rigorous results are known for comparing
321: the euclidean and the random link models.
322:
323: \bigskip
{\noindent \bf \large Acknowledgments}
325:
\noindent It is a pleasure to thank J.M. Steele for fruitful discussions and for pointing out
reference \cite{Talagrand92_AAP} to us.
328:
329:
330: %%%%%%%%%%%%%%%%%% Bibliography %%%%%%%%%%%%%%%%%%%%%%%%%
331: \bibliography{co,jbdm}
332: \bibliographystyle{perroten}
333:
334: \end{document}
335:
336: