0205:math0205140/mbm.tex

1: \NeedsTeXFormat{LaTeX2e}[1995/06/01]

2: \documentclass[10pt]{article}

3: \usepackage{epsfig,graphics}

4: \usepackage{cite}

5: %%%%% French typesetting packages.

6: %\usepackage[english,french]{babel}

7: %\usepackage{english}

8: \usepackage[T1]{fontenc}

9: \usepackage[swedish,english]{babel}

10: %%%%% Package for theorems

11: \usepackage{ntheorem}

12:

13: %%%%% Standard mathematical sets

14: \newcommand{\N}{{\bf N}}

15: \newcommand{\Z}{{\bf Z}}

16: \newcommand{\Q}{{\bf Q}}

17: \newcommand{\R}{{\bf R}}

18: \newcommand{\C}{{\bf C}}

19: \newcommand{\Qu}{{\bf H}}

20: \newcommand{\card}{\mathop{\rm card}\nolimits}

21:

22: %%%% Abbreviations for definitions, theorems, and proofs.

23:

24: \newtheorem{df}{Definition}[section]

25: \newtheorem{theorem}{Theorem}[section]

26: \newtheorem{prop}{Proposition}[section]

27: \newtheorem{lemma}{Lemma}[section]

28:

29: \begin{document}

30:

31: %%%%%%%%%%%%%%%% Title %%%%%%%%%%%%%%%%%%%%%

32: \title{\textsf{Almost sure convergence of the minimum bipartite matching

33: functional in Euclidean space}}

34:

35: %%%%%%%%%%%%%%%%% Authors %%%%%%%%%%%%%%%%%%%%

36: \author{\textsf{J.H.~Boutet de Monvel$^*$ and O.C.~Martin$^\dag$} \\

37: \textsf{\small $^*$Center for Hearing and Communication Research, Karolinska Institutet, 17176}\\

38: \textsf{\small Stockholm, Sweden; $^\dag$Laboratoire de Physique Th\'eorique et Mod\`eles Statistiques,}\\

39:  \textsf{\small Universit\'e de Paris-Sud, 91405 Orsay, France;}}

40:

41: %%%%%%%%%%%% Date and Title %%%%%%%%%%%%%%%%%

42: \date{To appear in Combinatorica}

43: \maketitle

44:

45: %%%%%%%%%%%%%%% Abstract %%%%%%%%%%%%%%

46: \begin{abstract}

47: Let $L_N = L_{MBM}(X_1,\ldots ,X_N; Y_1,\ldots ,Y_N)$ be the minimum length of a

48: bipartite matching between two sets of points in $\mathbf{R}^d$, where

49: $X_1,\ldots ,X_N,\ldots$ and $Y_1,\ldots ,Y_N,\ldots$ are random points independently and

50: uniformly distributed in $[0,1]^d$. We prove that for $d \ge 3$,  $L_N/N^{1-1/d}$ converges

51: with probability one to a constant $\beta_{MBM}(d)>0$ as $N\to \infty $.

52: \end{abstract}

53:

54: %%%%%%%%%%%%%%%%%%% Text proper %%%%%%%%%%%%%%%%%%%%%%%%

55: \section{Introduction and statement of the result.}

56:

57: \noindent Given two sets of $N$ points $X=\{X_1,...,X_N\}$ and $Y=\{Y_1,...,Y_N\}$ in

58: $\R^d$, a bipartite matching of $X$ and $Y$ is a perfect matching $M$ on the set $X\cup Y$,

59: such that each pair in $M$ is made of one point of $X$ and one point of $Y$. The length of such

60: a matching is defined to be the sum of the euclidean lengths of the edges formed by its pairs.

61: The (euclidean) minimum bipartite matching problem (MBMP) then asks one to find a

62: bipartite matching of $X$ and $Y$ whose length is as small as possible. We shall denote by

63: $L_{MBM}(X,Y)$ the length of a minimum bipartite matching of $X$ and $Y$.

64:

65: A related problem is the simple minimum matching problem (MMP), where one is asked

66: to find a perfect matching of smallest euclidean length on a set $X=\{X_1,...,X_N\}\subset \R^d$.

67: The subadditive methods inaugurated by Beardwood, Halton and Hammersley

68: (BHH) \cite{BHH59_PCPS} and further developed

69: in \cite{Steele81_AP,Rhee93_AAP,RedmondYukich94_AAP}, show

70: that a strong limit theorem applies to the length $L_{MM}(X)$ of a simple minimum matching

71: on $X$, when the points $X_1,\ldots, X_N$ are random.

72: The theorem states that for any dimension $d$, if  $X_1,\ldots, X_N,\ldots$ is a sequence of

73: points distributed independently and uniformly in a bounded region $\Omega\subset {\mathbf R}^d$,

74: then the ratio $L_{MM}(X_1,\ldots X_N)/N^{1-1/d}$ converges almost surely to

75: ${\rm Vol(\Omega)}^{1/d}\beta_{MM}(d)$, where ${\rm Vol(\Omega)}$ denotes the Lebesgue

76: measure of $\Omega$ and $\beta_{MM}(d)>0$ is a universal constant depending only upon $d$.

77:

78: The functional $L_{MBM}$ does not satisfy this form of limit theorem in dimensions

79: $1$ and $2$. For $d=1$, the MBMP amounts to a sorting problem and it is not difficult

80: to show that if $X$ and $Y$ both consist of $N$ points independently and uniformly

81: distributed in $[0,1]$, there are constants $0<C_1<C_2$ such that

82: $C_1\sqrt N\le L_{MBM}(X,Y)\le C_2 \sqrt N$ with probability $1-o(1)$ as

83: $N\to \infty$. Moreover in that case the variance of $L_{MBM}(X,Y)/\sqrt{N}$ does

84: {\it not} converge to zero as $N\to \infty$. ($L_{MBM}$ is not ``self-averaging'',

85: in the terminology of statistical physics.)

86: For $d=2$ Ajtai et al. \cite{Ajtai&Al84_C} proved a remarkable fact: if the sets

87: $X,Y$ are now distributed in $[0,1]^2$, then for some constants $C_1,C_2$ independent of

88: $N$, one has $C_1\sqrt{N\log N}\le L_{MBM}(X,Y)\le C_2\sqrt{N\log N}$ with

89: probability $1-o(1)$. Numerical simulations suggest that $L_{MBM}(X,Y)/\sqrt{N\log N}$

90: converges to a non-random constant as $N\to \infty$; however, this has not yet been proved.

91:

92: In this article, we show that for any $d\ge 3$ we recover a BHH theorem for the functional

93: $L_{MBM}$.

94:

95: \begin{theorem}\label{th1}

96: Let $X_1,...,X_N,...$ and $Y_1,...,Y_N,...$ be two sequences of

97: random points independently and uniformly distributed in $[0,1]^d$, where

98: $d\ge 3$, and let $L_N = L_{MBM}(X_1,\ldots ,X_N;Y_1,\ldots ,Y_N)$.

99: There exists a constant $\beta_{MBM}(d)>0$ such that

100: with probability one

101: $$ \lim_{N\to \infty} L_N/ N^{1-1/d} = \beta_{MBM}(d).$$

102: \end{theorem}

103:

104: \section{Proof of Theorem \ref{th1}.}

105:

106: To begin, we remark that to prove this theorem it will suffice to

107: establish that $L_N/N^{1-1/d}$ converges in mean value to a constant

108: $\beta_{MBM}(d)$. This is a consequence of the following lemma  \cite{Talagrand92_AAP}:

109:

110: \begin{lemma}

111: For any $t>0$, one has

112: $$P(|{L_N\over N^{1-1/d}}- E({L_N\over N^{1-1/d}})| > t) \le 2 \exp(-{N^{1-2/d} t^2\over 8d}).$$

113: \end{lemma}

114:

115: \noindent This result follows from the application of Azuma's inequality \cite{Azuma67_TMJ}

116: and the martingale difference method to $L_N$, in a way by now standard in the

117: probabilistic theory of combinatorial optimisation \cite{Steele97_Book}.

118: Given the lemma, the theorem follows easily from the convergence of

119: $EL_N/N^{1-1/d}$ as $N\to \infty$, by applying the Borel-Cantelli lemma.

120:
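\noindent For the reader's convenience, here is a sketch of the Borel--Cantelli step. It is our own spelling-out of the argument and assumes only the lemma above; the notation $A_N(t)$ is introduced here for illustration. For fixed $t>0$, let $A_N(t)$ be the event that $|L_N/N^{1-1/d}-EL_N/N^{1-1/d}|>t$. Since $1-2/d\ge 1/3$ when $d\ge 3$, the lemma gives
$$ \sum_{N\ge 1} P(A_N(t)) \le \sum_{N\ge 1} 2\exp\Big(-{N^{1/3} t^2\over 8d}\Big) < \infty , $$
so with probability one only finitely many of the events $A_N(t)$ occur. Applying this to $t=1/j$, $j=1,2,\ldots$, shows that $L_N/N^{1-1/d}-EL_N/N^{1-1/d}\to 0$ almost surely. For $d=2$ the exponent $N^{1-2/d}$ is constant, and this argument gives no information.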

121: We now have to establish that for $d\ge 3$ the quantity

122: $EL_N/N^{1-1/d}$ indeed converges to a constant $\beta_{MBM}(d)>0$.

123: To prove this we exploit the subadditivity properties of $L_{MBM}$, in the spirit

124: of Steele's theory of subadditive Euclidean functionals \cite{Steele81_AP}.

125: Let us divide the unit cube $[0,1]^d$ into disjoint

126: similar subcubes $Q_k,~k=1,\ldots ,m^d$ with edges of length $1/m$,

127: and compare the value of $L_{MBM}(X,Y)$ to

128: the sum

129: \begin{equation} \label{SumOnCubes}

130: \sum_{k=1}^{m^d} L_k,

131: \end{equation}

132: where $L_k$ is the value of the functional $L_{MBM}$ for the set of points

133: $X_i$ and $Y_i$ which belong to $Q_k$. A difficulty arises as in

134: general the $Q_k$'s do not contain the same number of points $X_i$ and of

135: points $Y_i$. (In fact the special properties of the MBMP in dimensions $1$ and

136: $2$ originate from the fluctuations of the differences between these numbers

137: around their mean value $0$.)

138: To give meaning to the sum (\ref{SumOnCubes}) we need to generalize the

139: functional $L_{MBM}$ to matchings between two sets of different cardinalities.

140: There are several ways to do this; we shall define

141: $L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})$  by imposing that

142: the minimum matching contains as few unmatched points as possible.  That is, if

143: $N_1>N_2$, we leave $N_1-N_2$ points of $X$ unmatched, whereas if

144: $N_1<N_2$ we leave $N_2-N_1$ points of $Y$ unmatched.

145:

146: Although expression (\ref{SumOnCubes}) now makes sense, it is still not possible

147: to write a subadditivity inequality of the same form as the one studied

148: in \cite{Steele81_AP}. Indeed, such a form (which Steele calls ``geometric

149: subadditivity'') implies an upper bound of the form $CN^{1-1/d}$ for the functional

150: at hand \cite{Steele97_Book}, and it is easy to see that no such bound applies

151: to $L_{MBM}(X,Y)$.  We shall however see that a geometric subadditivity

152: property holds {\it in the mean} for the functional $L_{MBM}$.

153: Suppose that the points $X_1,\ldots X_{N_1},Y_1,\ldots Y_{N_2}$ belong to an

154: arbitrary cube $Q$ having edge length $a$, and divide $Q$ into

155: disjoint cubes $Q_p,~p=1,\ldots 2^d$ by splitting each edge in two halves.

156: Construct in each $Q_p$ an optimal matching in the sense just defined,

157: between the $n_{1,p}$ points $X_i$ and the $n_{2,p}$ points $Y_i$ in $Q_p$,

158: and denote its length by $L_p$.

159: The number of points left unpaired in each $Q_p$ is $|n_{1,p}-n_{2,p}|$,

160: so if $L_0$ denotes the length of an optimal matching for these

161: points one has

162: \begin{eqnarray} \label{Decimation}

163: L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots ,Y_{N_2}) \le

164: \sum_{p=1}^{2^d} L_p + L_0 \nonumber\\

165: \le \sum_{p=1}^{2^d} L_p + {1\over 2} a\sqrt d

166: \sum_{p=1}^{2^d} |n_{1,p}-n_{2,p}|,

167: \end{eqnarray}

168: where the last inequality is obtained by bounding $L_0$ in an obvious way.
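To spell out this bound (a sketch under our reading of the construction): the points left unpaired inside the cubes $Q_p$ number $\sum_p |n_{1,p}-n_{2,p}|$ in total, and they all lie in $Q$. Any matching among them uses at most half that many edges, and each edge joins two points of $Q$, hence has length at most the diameter $a\sqrt d$ of $Q$. Therefore
$$ L_0 \le {1\over 2}\, a\sqrt d \sum_{p=1}^{2^d} |n_{1,p}-n_{2,p}| . $$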

169:

170: We shall apply this to $Q=[0,1]^d$. Let $Q_{p_1},~p_1=1,\ldots ,2^d$,

171: be the cubes obtained in the above subdivision; let $Q_{p_1p_2}$ be

172: the cubes obtained by splitting in two halves the edges of each cube $Q_{p_1}$;

173: and so on. By repeating this operation $K$ times, we get a subdivision with

174: cubes $Q_{p_1\ldots p_K}$ whose edges are of length $1/2^K$. Let

175: $n_{1,p_1\ldots p_K}$ and $n_{2,p_1\ldots p_K}$ be respectively

176: the number of points $X_i$ and $Y_i$ in $Q_{p_1\ldots p_K}$. Apply

177: (\ref{Decimation}) first to the $Q_{p_1,\ldots p_{K-1}}$'s, then to the

178: $Q_{p_1\ldots p_{K-2}}$'s, etc., keeping at each step only those points which

179: are still unpaired. It is easy to convince oneself that the number of unpaired

180: points in each $Q_{p_1,\ldots p_{K-k}}$ just after step $k$ is given by

181: $|n_{1,p_1,\ldots p_{K-k}}-n_{2,p_1,\ldots p_{K-k}}|$. After step $k=K$ one

182: obtains a matching between $X_1,\ldots X_{N_1}$ and $Y_1,\ldots Y_{N_2}$

183: where all the points but $|N_1-N_2|$ are matched.

184: One is thus led to the following inequality:

185: \begin{eqnarray} \label{SousAddMBMP}

186: L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})

187: \le \sum_{p_1\ldots p_K} L_{p_1\ldots p_K} \nonumber\\

188: + \sum_{k=1}^K {\sqrt d\over 2^k}

189: \sum_{p_1\ldots p_k} |n_{1,p_1\ldots p_k}-n_{2,p_1\ldots p_k}|.

190: \end{eqnarray}

191: We now proceed to derive a subadditivity property for the mean

192: value of $L_{MBM}(X,Y)$. We first consider the case where

193: $N_1=\card X$ and $N_2=\card Y$ are not fixed integers but are independent Poisson

194: random  variables with the same mean value $N$, the elements of $X$ and $Y$ being

195: chosen independently and uniformly in $[0,1]^d$. For a given $k$, the numbers

196: $n_{1,p_1,\ldots p_k}$ and $n_{2,p_1,\ldots p_k}$ are then also independent

197: Poisson random variables, with parameter $N/2^{kd}$. Let

198: $M(N)= EL_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})$.

199: It is immediate by homogeneity that

200: \begin{equation}

201: EL_{p_1\ldots p_K} = 2^{-K} M(N/2^{Kd}).

202: \end{equation}

203: Moreover from the well known properties of Poisson variables we have

204: \begin{equation} \label{RMSPoisson}

205: E|n_{1,p_1\ldots p_k}-n_{2,p_1\ldots p_k}| \le

206: \sqrt 2 \Big( {N\over 2^{kd}} \Big)^{1/2}.

207: \end{equation}
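A short justification of this bound, assuming only that $n_1=n_{1,p_1\ldots p_k}$ and $n_2=n_{2,p_1\ldots p_k}$ (abbreviations used here only) are independent Poisson variables with the same parameter $\lambda=N/2^{kd}$: since $E(n_1-n_2)=0$, the Cauchy--Schwarz inequality gives
$$ E|n_1-n_2| \le \big(E(n_1-n_2)^2\big)^{1/2} = \big({\rm Var}\,n_1+{\rm Var}\,n_2\big)^{1/2} = \sqrt{2\lambda} . $$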

208: By taking mean values in (\ref{SousAddMBMP}) we obtain:

209: \begin{equation}

210: M(N) \le 2^{K(d-1)}M(N/2^{Kd}) + \sqrt{2dN} \sum_{k=1}^K 2^{k(d/2-1)}.

211: \end{equation}
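In more detail (a sketch of the bookkeeping, as we understand it): there are $2^{Kd}$ cubes $Q_{p_1\ldots p_K}$, each contributing $2^{-K}M(N/2^{Kd})$ in the mean, and $2^{kd}$ cubes at level $k$, each contributing at most $\sqrt 2\,(N/2^{kd})^{1/2}$ by (\ref{RMSPoisson}). Hence
$$ M(N) \le 2^{Kd}\, 2^{-K} M(N/2^{Kd}) + \sum_{k=1}^K {\sqrt d\over 2^k}\, 2^{kd}\, \sqrt 2\, \Big({N\over 2^{kd}}\Big)^{1/2} , $$
and the $k$-th term of the last sum equals $\sqrt{2dN}\, 2^{k(d/2-1)}$.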

212: This inequality has been obtained for a subdivision of $[0,1]^d$ which

213: consists of $2^{Kd}$ similar cubes. Suppose now that we start from the

214: subdivision $\Sigma$ into $m^d$ similar cubes $Q_k,~k=1,\ldots ,m^d$,

215: where $m$ is an arbitrary integer. One can then reproduce the previous

216: construction in the following manner. Let $m=2^K+r$

217: where $0\le r<2^K$. Consider the cube $Q_0=[0,2^{K+1}/m]^d$ and form the

218: natural subdivision $\Sigma_0$ of $Q_0$ by $2^{(K+1)d}$ cubes

219: $Q_{p_0,\ldots p_K}$ whose edges have length $1/m$. With

220: $Q_0$ and $\Sigma_0$ we can carry out a $(K+1)$-step construction similar to the one

221: which led to (\ref{SousAddMBMP}). The only differences are that $Q_0$ has

222: edges of length $2^{K+1}/m$ rather than $1$, and that some of the

223: $Q_{p_0\ldots p_K}$'s, namely those which belong to

224: $\Sigma_0$ but not to $\Sigma$, are empty.

225: Nevertheless, we may write

226: \begin{eqnarray}

227: L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots ,Y_{N_2}) - \sum_{k=1}^{m^d} L_k \nonumber \\

228: \le \sum_{k=0}^K {\sqrt d 2^{K-k} \over m}

229: \sum_{p_0\ldots p_k} |n_{1,p_0\ldots p_k}-n_{2,p_0\ldots p_k}| \nonumber \\

230: \le \sum_{k=0}^K {\sqrt d \over 2^k}

231: \sum_{p_0\ldots p_k} |n_{1,p_0\ldots p_k}-n_{2,p_0\ldots p_k}|.

232: \end{eqnarray}

233: Now $n_{1,p_0\ldots p_k}$ and $n_{2,p_0\ldots p_k}$ are Poisson

234: variables with parameter lower than $2^{(K-k)d} N/m^d \le 2^{-kd}N$ so we

235: still have

236: \begin{equation}

237: E|n_{1,p_0\ldots p_k}-n_{2,p_0\ldots p_k}| \le

238: \sqrt 2 \Big({N \over 2^{kd}} \Big)^{1/2}.

239: \end{equation}

240: Taking average values one is led to

241: \begin{equation}

242: M(N) \le m^{d-1} M(N/m^d) +

243: 2^d \sqrt{2dN} \sum_{k=0}^K 2^{k(d/2-1)}.

244: \end{equation}

245: Dividing this last inequality by $N^{1-1/d}$ and then replacing $N$ by

246: $m^dN$, we get

247: \begin{equation}

248: {M(m^dN) \over (m^dN)^{1-1/d}} \le {M(N)\over N^{1-1/d}} +

249: {2^d\sqrt{2d} \over N^{1/2-1/d}} \sum_{k=0}^K 2^{-k(d/2-1)}.

250: \end{equation}
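The passage from the previous inequality can be checked as follows (our own verification of the arithmetic, not taken from the source). Replacing $N$ by $m^dN$ and dividing by $(m^dN)^{1-1/d}=m^{d-1}N^{1-1/d}$ turns the error term into
$$ {2^d\sqrt{2d\,m^dN}\over m^{d-1}N^{1-1/d}} \sum_{k=0}^K 2^{k(d/2-1)} = {2^d\sqrt{2d}\over N^{1/2-1/d}}\; m^{-(d/2-1)} \sum_{k=0}^K 2^{k(d/2-1)} . $$
Since $m\ge 2^K$ and $d\ge 2$, one has $m^{-(d/2-1)}\le 2^{-K(d/2-1)}$, and the substitution $k\to K-k$ then yields the sum $\sum_{k=0}^K 2^{-k(d/2-1)}$, which is the error term in the inequality just displayed.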

251: If $d>2$, the sum on the r.h.s. of the last inequality is bounded above independently of

252: $N$, and is divided by a positive power of $N$. Elementary analysis now shows that the

253: ratio $M(N)/N^{1-1/d}$ necessarily converges to a limit $\beta_{MBM}(d)$ as $N\to \infty$.

254: Indeed, let $f(t) = M(t^d)/t^{d-1}$. One verifies at once that $f(t)$ satisfies

255: \begin{equation} \label{fInequality}

256: f(mt)\le f(t)+C_d/t^{d/2-1}

257: \end{equation}

258: for all $t>0$ and any integer $m$; $f(t)$ is continuous,

259: since $M(N)$ is a continuous function of $N$.

260: So the expression $f(t) + C_d/t^{d/2-1}$ is bounded in $[1,2]$ and since

261: $[1,\infty[$ is the union of the intervals $m[1,2], m\ge 1$, it follows

262: from (\ref{fInequality}) that $f(t)$ remains bounded as $t\to \infty$,

263: thus $\lim^* f(t) < \infty$. Now define $\beta=\lim_* f(t)$. For any

264: $\epsilon >0$, choose $t_0\gg 1$ and

265: $\eta >0$ such that $f(t)+C_d/t^{d/2-1} < \beta + \epsilon$

266: for $t$ in the interval $I=[t_0-\eta,t_0+\eta]$.

267: Since the intervals $mI$, $m\ge 1$ span a whole interval

268: $[A,\infty[$ for an $A$ sufficiently large,

269: it follows again from (\ref{fInequality}) that

270: $\lim^* f(t)\le \beta+\epsilon$.

271: Since $\epsilon$ is arbitrary one has $\lim^* f(t)=\beta$, hence

272: $f(t) \to \beta$ as $t\to \infty$,  from which it follows that

273: $\lim_{N\to \infty} M(N)/N^{1-1/d}=\beta$. Q.E.D.

274:

275: We have thus shown for $d\ge 3$, that one has

276: \begin{equation} \label{PoissonMBMPAsymptotics}

277: EL_{MBM}(X_1,\ldots,X_{N_1};Y_1,\ldots,Y_{N_2})

278: \sim \beta_{MBM}(d)N^{1-1/d},~N\to \infty

279: \end{equation}

280: when $N_1$ and $N_2$ are independent Poisson variables with parameter $N$.

281: The same result for the mean value $EL_N$, where $N$ is a fixed integer,

282: follows then easily. Indeed, we have the obvious bound

283: \begin{eqnarray}

284: |L_{MBM}(X_1,\ldots X_N;Y_1,\ldots Y_N)-

285: L_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})| \nonumber\\

286: \le \sqrt d (|N_1-N|+|N_2-N|),

287: \end{eqnarray}

288: whence taking mean values,

289: \begin{equation}

290: |EL_N - EL_{MBM}(X_1,\ldots X_{N_1};Y_1,\ldots Y_{N_2})| \le 2 \sqrt{2dN},

291: \end{equation}

292: and dividing by $N^{1-1/d}$ we deduce that

293: \begin{equation}

294: \lim_{N\to \infty} {EL_N\over N^{1-1/d}} = \beta_{MBM}(d).

295: \end{equation}
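The two estimates used in this last step can be justified as follows (a sketch under our reading of the definitions, with the fixed-$N$ and Poisson samples taken from the same sequences of points). One can check that adding or removing a single point changes $L_{MBM}$ by at most $\sqrt d$, the diameter of $[0,1]^d$, which gives the bound on the difference of the two functionals. Moreover, since $N_1$ and $N_2$ are Poisson variables with mean and variance $N$,
$$ E|N_i-N| \le ({\rm Var}\,N_i)^{1/2} = \sqrt N \le \sqrt{2N},\qquad i=1,2, $$
and $\sqrt{N}/N^{1-1/d}=N^{-(1/2-1/d)}\to 0$ as $N\to\infty$ when $d\ge 3$.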

296: Theorem \ref{th1} is now proved.

297:

298: \section{Concluding remarks.}

299:

300: \noindent 1) Our decimation procedure does not give back the bounds

301: proven by  Ajtai {\it et al.} in $d=2$, but a weaker

302: $O(\sqrt{N} \ln N)$ bound.

303: It is believed that a self-averaging theorem applies also to the

304: functional $L_{MBM}$ in dimension $2$ \cite{Smith89_Thesis}.

305:

306: \noindent 2) The estimation of the constants $\beta_{MBM}(d)$ is also an

307: interesting problem. A remarkable result of Talagrand \cite{Talagrand92_AAP}

308: shows that one has $\beta_{MBM}(d)= \sqrt{d/(2\pi e)}\, (1+O(\ln d / d))$ as

309: $d\to \infty$. It is conjectured that a $1/d$ series expansion actually exists

310: for $\beta_{MBM}(d)$.

311:

312: \noindent 3) M\'ezard and Parisi have obtained detailed analytic predictions for

313: the {\it random link} versions of the MMP and the MBMP  \cite{MezardParisi87_JdP},

314: where the distance matrix between the points $X_i$ and $Y_j$ is replaced by a matrix of

315: independent and identically distributed entries. (Some of these predictions, for the random

316: assignment problem, have been proven recently by Aldous \cite{Aldous01_RSA}.)

317: Numerical studies \cite{BoutetMartin97_PRL,HBM98_EPJB} indicate that for the MMP and the

318: MBMP, the random link model provides one with a very good ``mean-field'' approximation to

319: the Euclidean model in the large $d$ limit. Except for simpler combinatorial problems

320: however \cite{BertsimasVanRyzin90_ORL}, very few rigorous results are known for comparing

321: the euclidean and the random link models.

322:

323: \bigskip

324: {\noindent \bf \large Acknowledgments}

325:

326: \noindent It is a pleasure to thank J.M. Steele for fruitful discussions and pointing to us

327: reference \cite{Talagrand92_AAP}.

328:

329:

330: %%%%%%%%%%%%%%%%%% Bibliography %%%%%%%%%%%%%%%%%%%%%%%%%

331: \bibliography{co,jbdm}

332: \bibliographystyle{perroten}

333:

334: \end{document}

335:

336: