
1: \documentclass[12pt]{article}

2:

3: \textwidth6.25in \textheight8.5in \oddsidemargin.25in

4: \topmargin0in

5:

6: \usepackage{epsfig}

7: %\usepackage{showkeys} % HERE: Comment this out in final version

8:

9: %\renewcommand{\baselinestretch}{2.0}

10:

11: \def\be{\begin{equation}}

12: \def\ee{\end{equation}}

13: \def\la{\langle}

14: \def\ra{\rangle}

15: \def\IP{\hbox{\rm I\kern -1.6pt{\rm P}}}

16: \def\IC{{\hbox{\rm C\kern-.58em{\raise.53ex\hbox{$\scriptscriptstyle|$}}

17:     \kern-.55em{\raise.53ex\hbox{$\scriptscriptstyle|$}} }}}

18: \def\IN{\hbox{I\kern-.2em\hbox{N}}}

19: \def\IR{\hbox{\rm I\kern-.2em\hbox{\rm R}}}

20: \def\ZZ{\hbox{{\rm Z}\kern-.3em{\rm Z}}}

21: \def\IT{\hbox{\rm T\kern-.38em{\raise.415ex\hbox{$\scriptstyle|$}} }}

22: %\newtheorem{theorem}{Theorem}[section]

23: \newtheorem{theorem}{Theorem}

24: \newtheorem{lemma}[theorem]{Lemma}

25: \newtheorem{sublemma}[theorem]{Sublemma}

26: \newtheorem{proposition}[theorem]{Proposition}

27: \newtheorem{corollary}[theorem]{Corollary}

28: \newtheorem{remark}[theorem]{Remark}

29:

30: \begin{document}

31:

32: \title{On the complexity of curve fitting algorithms}

33: \author{N. Chernov, C. Lesort, N. Sim\'{a}nyi\\

34: Department of Mathematics\\

35: University of Alabama at Birmingham\\

36: Birmingham, AL 35294, USA}

37: \date{\today}

38: \maketitle

39:

40: %The corresponding author:\\

41:

42: %\noindent

43: %Nikolai Chernov, the address above,\\

44: %E-mail: chernov@math.uab.edu\\

45: %Fax: 1-205-934-9025

46:

47: %\newpage

48:

49: \begin{abstract}

50: We study a popular algorithm for fitting polynomial curves to scattered

51: data based on the least squares with gradient weights. We show that

52: sometimes this algorithm admits a substantial reduction of complexity,

53: and, furthermore, find precise conditions under which this is possible.

54: It turns out that this is, indeed, possible when one fits circles but

55: not ellipses or hyperbolas.

56: \end{abstract}

57:

58: %\vspace*{3cm}

59: %\begin{center}

60: %Keywords: least squares fit, curve fitting, algebraic gradient weight

61: %fitting, complexity.

62: %\end{center}

63:

64: %\newpage

65:

66: %\renewcommand{\theequation}{\arabic{section}.\arabic{equation}}

67:

68: %\section{Introduction}

69: %\label{secI} \setcounter{equation}{0}

70:

71: In many applications one needs to fit a curve described by a polynomial

72: equation

73: $$

74:            P(x,y;\Theta)=0

75: $$

76: (here $\Theta$ denotes the vector of unknown parameters) to

77: experimental data $(x_i,y_i)$, $i=1,\ldots,n$. In this equation $P$ is

78: a polynomial in $x$ and $y$, and its coefficients are either unknown

79: parameters or functions of unknown parameters. For example, a number of

80: recent publications \cite{CBH01,GGS94,LM00} are devoted to the problem

81: of fitting quadrics $Ax^2+ Bxy+ Cy^2+ Dx+ Ey+ F=0$, in which case

82: $\Theta=(A,B,C,D,E,F)$ is the parameter vector. The problem of fitting

83: circles, given by equation $(x-a)^2+ (y-b)^2 -R^2=0$ with three

84: parameters $a,b,R$, also arises in practice \cite{CO84,Ka98}.

85:

86: It is standard to assume that the data $(x_i,y_i)$ are noisy

87: measurements of some true (but unknown) points $(\bar{x}_i,\bar{y}_i)$

88: on the curve, see \cite{BC86,CL03a,Ka96,Ka98} for details. The noise

89: vectors $e_i=(x_i-\bar{x}_i,y_i-\bar{y}_i)$ are then assumed to be

90: independent Gaussian vectors with zero mean and a scalar covariance

91: matrix, $\sigma^2 I$. In this case the maximum likelihood estimate of

92: $\Theta$ is given by the {\em orthogonal least squares fit} (OLSF),

93: which is based on the minimization of the function

94: \be

95:        {\cal F}(\Theta) =  \sum_{i=1}^n d_i^2

96:          \label{Fmain1}

97: \ee

98: where $d_i$ denotes the distance from the point $(x_i,y_i)$ to the

99: curve $P(x,y;\Theta)=0$.

100:

101: Under these assumptions the OLSF is statistically optimal -- it

102: provides estimates of $\Theta$ whose covariance matrix attains its

103: Rao-Cramer lower bound \cite{CL03a,Ka96,Ka98}. The OLSF is widely used

104: in practice, especially when one fits simple curves such as lines or

105: circles. However, for more general curves the OLSF becomes intractable,

106: because the precise distance $d_i$ is hard to compute. In those cases

107: one resorts to various alternatives, and the most popular one is the

108: {\em algebraic fit} (AF) based on the minimization of

109: \be

110:        {\cal F}_{\rm a}(\Theta) =

111:        \sum_{i=1}^n w_i\, [P(x_i,y_i;\Theta)]^2

112:          \label{Fmain2}

113: \ee

114: where $w_i=w(x_i,y_i;\Theta)$ are suitably defined weights. The choice

115: of the weight function $w(x,y;\Theta)$ is important.

116: % The minimization of (\ref{Fmain2}) is usually much cheaper than that

117: % of (\ref{Fmain1}), but the weights $w_i$ must be chosen wisely.

118: The AF is known \cite{CL03a} to provide a statistically optimal

119: estimate of $\Theta$ (in the sense that the covariance matrix will

120: attain its Rao-Cramer lower bound) if and only if the weight

121: function satisfies

122: \be

123:    w(x,y;\Theta) = a(\Theta) / \|\nabla P(x,y;\Theta)\|^2

124:      \label{wgrad}

125: \ee

126: for all points $x,y$ on the curve, i.e.\ such that

127: $P(x,y;\Theta)=0$. Here $\nabla P = (\partial P/\partial

128: x,\partial P/\partial y)$ is the gradient vector of the polynomial

129: $P$, and $a(\Theta)>0$ may be an arbitrary function of $\Theta$

130: (in practice, one simply sets $a(\Theta)=1$). Any other choice of

131: $w$ will result in the loss of accuracy, see \cite{CL03a}. We call

132: $w(x,y;\Theta)$ a {\em gradient weight function} if it satisfies

133: (\ref{wgrad}) for all $x,y$ on the curve $P(x,y;\Theta)=0$. The AF

134: (\ref{Fmain2}) with a gradient weight function $w(x,y;\Theta)$ is

135: commonly referred to as the {\em gradient weighted algebraic fit}

136: (GRAF). It was introduced in the mid-seventies \cite{Tu74} and

137: recently became standard for polynomial curve fitting, see, for

138: example, \cite{CBH01,LM00,Ta91}.

139:

140: Even though the GRAF is much cheaper than the OLSF, it is still a

141: nonlinear problem requiring iterative methods. For example, in a

142: popular {\em reweight procedure} \cite{Sa82,Ta91} one uses the $k$-th

143: approximation $\Theta^{(k)}$ to compute the weights $w_i =

144: w(x_i,y_i;\Theta^{(k)})$ and then finds $\Theta^{(k+1)}$ by minimizing

145: (\ref{Fmain2}) regarding the just computed $w_i$'s as constants. Note

146: that if the parameters $\Theta$ are the coefficients of $P$, then

147: (\ref{Fmain2}), with fixed weights, becomes a quadratic function in

148: $\Theta$, and its minimum can be easily found. Another algorithm is

149: based on solving the equation $\nabla_{\Theta}{\cal F}_{\rm a}(\Theta)

150: = 0$, i.e.\

151: \be

152:    \sum P_i^2 \, \nabla_{\Theta} w_i +

153:    2 \sum w_i \, P_i \, \nabla_{\Theta} P_i = 0

154:       \label{weq}

155: \ee

156: for which various iterative schemes could be used. In the case of

157: fitting quadrics, for example, the most advanced algorithms are the

158: renormalization method \cite{Ka96}, the heteroscedastic

159: error-in-variables method \cite{LM00} and the fundamental numerical

160: scheme \cite{CBH01}. In all these algorithms, one needs to evaluate

161: ${\cal O}(n)$ terms at each iteration. Therefore, the complexity of

162: those algorithms is ${\cal O}(kn)$, where $k$ is the number of

163: iterations. Moreover, each algorithm requires access to individual

164: coordinates $x_i,y_i$ of the data points at each iteration. These

165: difficulties can sometimes be avoided in a remarkable way, as we show

166: next.

167:

168: Suppose we need to fit circles given by equation

169: $$

170:      P(x,y)=(x-a)^2+ (y-b)^2-R^2=0.

171: $$

172: Then we have

173: \be

174:    \|\nabla P(x,y;\Theta)\|^2 =

175:     4(x-a)^2 +4(y-b)^2

176:    = 4P(x,y) + 4R^2

177:      \label{4444}

178: \ee

179: hence $\|\nabla P(x,y;\Theta)\|^2 = 4R^2$ for all the points

180: $(x,y)$ lying on the circle $P(x,y)=0$, and we can set

181: $w(x,y;\Theta) = 1/R^2$. Therefore

182: \begin{eqnarray}

183:     {\cal F}_{\rm a}(a,b,R) &=&

184:        \sum_{i=1}^n R^{-2} \left[x_i^2+y_i^2-2ax_i-2by_i+

185:        a^2+b^2-R^2\right]^2\nonumber\\

186:        &=& R^{-2}[z_1+az_2+bz_3+a^2z_4+b^2z_5+abz_6

187:        +cz_7+ac z_8+bc z_9+ c^2n]

188:          \label{FmainC}

189: \end{eqnarray}

190: where we denoted $c=a^2+b^2-R^2$ for brevity, and

191: $$

192:    z_1=\sum (x_i^2+y_i^2)^2,\ z_2=-4\sum x_i(x_i^2+y_i^2),\ldots

193: $$

194: are some expressions involving $x_i$ and $y_i$ only.
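
For illustration, multiplying out the square in (\ref{FmainC}) term by
term suggests the following complete list. This expansion is ours (it is
not spelled out above) and should be checked before it is relied upon:
$$
\begin{array}{lll}
  z_1=\sum (x_i^2+y_i^2)^2, &
  z_2=-4\sum x_i(x_i^2+y_i^2), &
  z_3=-4\sum y_i(x_i^2+y_i^2), \\
  z_4=4\sum x_i^2, &
  z_5=4\sum y_i^2, &
  z_6=8\sum x_iy_i, \\
  z_7=2\sum (x_i^2+y_i^2), &
  z_8=-4\sum x_i, &
  z_9=-4\sum y_i.
\end{array}
$$
As a quick sanity check, take a single data point $(x_1,y_1)=(1,0)$
and the unit circle $a=b=0$, $R=1$, so that $c=-1$ and $n=1$. Then
$z_1=1$ and $z_7=2$, and the bracket in (\ref{FmainC}) reduces to
$z_1+cz_7+c^2n = 1-2+1 = 0$, as it should for a point lying on the
circle.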

195:

196: The minimization of (\ref{FmainC}) is still a nonlinear problem

197: requiring iterative methods \cite{CO84,CL02,Pr87}, but it has

198: obvious advantages over the reweight procedure described above and

199: other generic methods for solving the equation (\ref{weq}). First

200: of all, the values of $z_1,\ldots,z_9$ only need to be computed

201: once, and then the cost of minimization of (\ref{FmainC}) will not

202: depend on $n$ anymore. Thus, the complexity of this algorithm is

203: ${\cal O}(n) + {\cal O}(k)$, where ${\cal O}(n)$ is the cost of

204: evaluation of $z_1,\ldots,z_9$ and ${\cal O}(k)$ is the cost of

205: some $k$ iterations spent on the subsequent minimization of ${\cal

206: F}_{\rm a}(a,b,R)$. Moreover, once the values of $z_1,\ldots,z_9$

207: are computed and stored, the coordinates $x_i,y_i$ can be

208: destroyed. Practically, $z_1,\ldots,z_9$ can be computed

209: ``on-line'', when the data are collected. The minimization

210: procedure per se can be implemented ``off-line'', without storage

211: of (or access to) the data points. The quantities $z_1,\ldots,z_9$

212: here play the role of sufficient statistics.

213:

214: Inspired by the above example, we might say that the problem of fitting

215: a polynomial curve $P(x,y;\Theta)=0$ {\em admits a reduction of

216: complexity} if there are $\ell$ functions

217: $z_j(x_1,y_1,\ldots,x_n,y_n)$, $1\leq j\leq\ell$, with $\ell $ being

218: independent of $n$ and $\Theta$, and a gradient weight function

219: $w(x,y;\Theta)$ such that

220: \be

221:        {\cal F}_{\rm a} = f(z_1,\ldots,z_{\ell};\Theta)

222:          \label{Fzz}

223: \ee

224: i.e.\ ${\cal F}_{\rm a}$ is a function of $z_1,\ldots,z_{\ell}$ and

225: $\Theta$ only.

226:

227: This definition does not suggest how to find the functions

228: $z_1,\ldots,z_{\ell}$ in practical terms, though. Since ${\cal F}_{\rm

229: a}$ is given by (\ref{Fmain2}) with $P(x_i,y_i;\Theta)$ being a

230: polynomial in $x_i,y_i$, the most natural (if not the only) way to

231: construct the functions $z_1,\ldots,z_{\ell}$ is to express the

232: gradient weight function (\ref{wgrad}) in the form

233: \be

234:        w(x,y;\Theta) = \sum_{k=1}^K C_k(\Theta)\, D_k(x,y)

235:          \label{wCD}

236: \ee

237: where $C_k$ are functions of the parameter vector $\Theta$ alone, and

238: $D_k$ are functions of $x$ and $y$ only (here the number of terms, $K$,

239: must be independent of $\Theta$). Indeed, suppose that the

240: representation (\ref{wCD}) is found. Since $P^2$ is a polynomial in

241: $x,y$, we can expand it as

242: $$

243:     P^2(x,y) = \sum_{p,q} c_{p,q}x^py^q

244: $$

245: where $c_{p,q} = c_{p,q} (\Theta)$ denote its coefficients. Now the

246: function ${\cal F}_{\rm a}$ can be evaluated as

247: \begin{eqnarray*}

248:    {\cal F}_{\rm a} &=& \sum_{k=1}^K\sum_{p,q}

249:    C_k(\Theta)c_{p,q}(\Theta)

250:    \sum_{i=1}^n x_i^py_i^qD_k(x_i,y_i) \\

251:    &=& \sum_{k=1}^K\sum_{p,q}

252:    C_k(\Theta)c_{p,q}(\Theta)

253:    z_{k,p,q}

254: \end{eqnarray*}

255: where

256: $$

257:    z_{k,p,q} = \sum_{i=1}^n x_i^py_i^qD_k(x_i,y_i)

258: $$

259: The values of $z_{k,p,q}$ depend on the data $x_i,y_i$ only, hence we

260: obtain the desired representation (\ref{Fzz}). Therefore, (\ref{wCD})

261: implies (\ref{Fzz}). We believe that the converse is also true, i.e.\

262: the conditions (\ref{Fzz}) and (\ref{wCD}) are actually equivalent, but

263: we do not attempt to prove that.

264:

265: Motivated by the above considerations, we adopt the following

266: definition: the problem of fitting a polynomial curve $P(x,y;\Theta)=0$

267: {\em admits a reduction of complexity} if the gradient weight function

268: (\ref{wgrad}) can be expressed in the form (\ref{wCD}).

269:

270:

271: As we have seen, the problem of fitting circles admits a reduction

272: of complexity (and so does the simpler problem of fitting lines).

273: Now if the problem of fitting ellipses and/or hyperbolas admitted

274: a reduction of complexity as defined above, we would be able to

275: dramatically improve the known GRAF algorithms

276: \cite{CBH01,Ka96,LM00}. Unfortunately, this is impossible -- there

277: are deep mathematical reasons which prevent a reduction of

278: complexity in the case of ellipses, hyperbolas, and parabolas.
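
The case of lines, mentioned above, can be written out explicitly. The
following is a sketch in our own notation (the symbols
$\alpha,\beta,\gamma$ and $s_{xx},s_{xy},\ldots$ are introduced here
only for illustration). For $P(x,y)=\alpha x+\beta y+\gamma$ the
gradient is $\nabla P=(\alpha,\beta)$, so
$\|\nabla P\|^2=\alpha^2+\beta^2$ does not depend on $x,y$ at all, and
(\ref{wCD}) holds with $K=1$, $C_1=1/(\alpha^2+\beta^2)$ and
$D_1\equiv 1$. Expanding the square in (\ref{Fmain2}) then gives
$$
   {\cal F}_{\rm a} =
   \frac{\alpha^2 s_{xx}+2\alpha\beta\, s_{xy}+\beta^2 s_{yy}
   +2\alpha\gamma\, s_x+2\beta\gamma\, s_y+n\gamma^2}
   {\alpha^2+\beta^2}
$$
where
$$
   s_{xx}=\sum x_i^2,\quad s_{xy}=\sum x_iy_i,\quad
   s_{yy}=\sum y_i^2,\quad s_x=\sum x_i,\quad s_y=\sum y_i
$$
depend on the data only. Five sums, computed once, therefore suffice
for fitting lines.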

279:

280: In this paper we find general conditions on the polynomial

281: $P(x,y;\Theta)$ under which the problem of fitting the curve

282: $P(x,y;\Theta)=0$ allows a reduction of complexity. It turns out

283: that lines and circles satisfy these conditions, but ellipses,

284: hyperbolas, and parabolas do not. Our results thus demonstrate (in

285: a rigorous mathematical way) that fitting noncircular conics is an

286: intrinsically more complicated problem than fitting circles or

287: lines.

288:

289: For convenience, let us denote

290: $$

291:    Q(x,y;\Theta) :=

292:    \|\nabla P(x,y;\Theta)\|^2 =

293:    (\partial P/\partial x)^2+(\partial P/\partial y)^2

294: $$

295: Clearly, $Q(x,y;\Theta)$ is itself a polynomial in $x$ and $y$. Our

296: subsequent arguments will involve some facts from complex analysis. We

297: will treat $x$ and $y$ as {\em complex}, rather than {\em real},

298: variables.

299:

300: \medskip\noindent{\bf Theorem}. {\em The problem of fitting curves

301: $P(x,y;\Theta)=0$ admits a reduction of complexity (as defined

302: above) if and only if the system of polynomial

303: equations}

304: \be

305:   \left \{

306:   \begin{array}{c}

307:   P(x,y) = 0\\

308:   Q(x,y) = 0

309:   \end{array} \right .

310:   \label{PQ0}

311: \ee

312: {\em has no solutions, real or complex.}

313: \medskip

314:

315: Before we prove our theorem, we shall show how to use it. For the

316: problem of fitting circles, we have already computed $Q=4P+4R^2$, see

317: (\ref{4444}), hence the system (\ref{PQ0}) has indeed no solutions for

318: nondegenerate circles (for which $R\neq 0$).

319:

320: When using the theorem, the following {\em invariance} property

321: will be helpful. Let $(x,y)\mapsto (\tilde{x},\tilde{y})$ be a

322: transformation of the $xy$ plane that is a composition of

323: translations, rotations, mirror reflections and similarities (the

324: latter are defined by $(x,y)\mapsto (cx,cy)$ for some $c\neq 0$).

325: Denote by $\tilde{P}(\tilde{x},\tilde{y})$ the polynomial $P$ in

326: the new coordinates $\tilde{x},\tilde{y}$. Then the system

327: (\ref{PQ0}) has a solution (real or complex) if and only if the

328: corresponding system

329: $$

330:   \left \{

331:   \begin{array}{c}

332:   \tilde{P}(\tilde{x},\tilde{y}) = 0\\

333:   \tilde{Q}(\tilde{x},\tilde{y}) = 0

334:   \end{array} \right .

335: $$

336: has a solution, real or complex. Here $\tilde{Q} = \|\nabla

337: \tilde{P}\|^2$. This simple fact, which can be verified directly

338: by the reader, allows us to simplify the polynomial $P(x,y)$

339: before applying the theorem.

340:

341: Consider the problem of fitting ellipses and hyperbolas. By using

342: a translation and rotation of the $xy$ plane we can always reduce

343: the polynomial $P$ to a canonical form $ax^2+by^2+c=0$ (with

344: $a\neq b$ and $abc\neq 0$). Then $Q=4a^2x^2+4b^2y^2$ and we arrive

345: at a system of equations

346: $$

347:   \left \{

348:   \begin{array}{c}

349:    ax^2+by^2+c = 0\\

350:    a^2x^2+b^2y^2 = 0

351:   \end{array} \right .

352: $$

353: It is easy to see that it always has a solution

354: $$

355:    x=\pm\sqrt{\frac{bc}{a(a-b)}},\quad

356:    y=\pm\sqrt{-\frac{ac}{b(a-b)}}

357: $$

358: (note that $x$ or $y$ may be an imaginary number, which is allowed

359: by our theorem). Therefore, the problem does not admit a reduction

360: of complexity.
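
The solution stated above can be confirmed by direct substitution (a
routine check, added here for convenience). With
$x^2=bc/\bigl(a(a-b)\bigr)$ and $y^2=-ac/\bigl(b(a-b)\bigr)$ we have
$$
   ax^2+by^2+c = \frac{bc}{a-b}-\frac{ac}{a-b}+c
   = -c+c = 0,
   \qquad
   a^2x^2+b^2y^2 = \frac{abc}{a-b}-\frac{abc}{a-b} = 0 .
$$
As a numerical instance (the numbers are our own choice), the ellipse
$x^2+2y^2-1=0$ has $a=1$, $b=2$, $c=-1$, which gives $x^2=2$ and
$y^2=-1/2$. One common zero of $P$ and $Q$ is therefore
$(x,y)=(\sqrt{2},\,{\bf i}/\sqrt{2})$, a point with an imaginary
$y$-coordinate.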

361:

362: If our curve is a parabola, then we can use its canonical equation

363: $y=cx^2$ for $c>0$, hence $P=y-cx^2$ and $Q=4c^2x^2+1$. Here again we

364: have a common zero of $P$ and $Q$ at the point $x={\bf i}/(2c)$ and

365: $y=-1/(4c)$. Thus, no conic sections (except circles) satisfy the

366: conditions of our theorem.
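
The common zero for the parabola can be checked in one line (a routine
verification, added here for convenience). At $x={\bf i}/(2c)$ we have
$x^2=-1/(4c^2)$, hence with $y=-1/(4c)$
$$
   P = y-cx^2 = -\frac{1}{4c}+\frac{1}{4c} = 0,
   \qquad
   Q = 4c^2x^2+1 = -1+1 = 0 .
$$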

367:

368: %Even though we only prove that the conditions of our theorem are

369: %sufficient for a reduction of complexity, we believe they are also

370: %necessary (but we do not attempt to prove their necessity).

371:

372: We now prove our theorem. Since $w(x,y;\Theta)$ must be a gradient

373: weight function, the requirement (\ref{wCD}) is equivalent to

374: \be

375:      \frac{1}{Q(x,y)} = \sum_{k=1}^K C_k(\Theta)\, D_k(x,y)

376:      \ \ \ \ \ \ {\rm whenever}\ \ \ \

377:      P(x,y)=0

378:        \label{QU}

379: \ee

380: (here we incorporated the factor $a(\Theta)$ into the coefficients

381: $C_k(\Theta)$, for convenience). We emphasize that the left

382: identity in (\ref{QU}) does not have to hold on the entire $xy$

383: plane; it only has to hold {\em on the curve} $P(x,y)=0$. If we

384: denote that curve by $\cal L$, then (\ref{QU}) can be restated as

385: \be

386:      \frac{1}{Q(x,y)} = \sum_{k=1}^K C_k(\Theta)\, D_k(x,y)

387:      \ \ \ \ \ \ {\rm whenever}\ \ \ \

388:      (x,y)\in{\cal L}

389:        \label{QU1}

390: \ee

391:

392: The functions $D_k(x,y)$ in (\ref{QU}) cannot be arbitrary; they

393: must be easily computable, i.e.\ available in machine

394: arithmetic. That is, they must be combinations of elementary

395: functions -- polynomials, exponentials, logarithms, trigonometric

396: functions, etc. In that case $D_k(x,y)$ are analytic functions of

397: $x$ and $y$. Therefore, they have analytic extensions to the

398: two-dimensional complex space $\IC^2$. We note that they need

399: not be {\em entire functions}, i.e.\ analytic everywhere in

400: $\IC^2$; they may have some singularities. For example, the

401: function $(1+x^2+y^2)^{-1}$ is analytic in $\IR^2$ but has

402: singularities in $\IC^2$, e.g.\ the point $x={\bf i}$, $y=0$ is

403: a singularity. Also, those extensions may be multivalued

404: functions (examples are $\ln x$ or $\sqrt{x}$).

405:

406: Now, the following function will also be analytic in $\IC^2$:

407: $$

408:   G(x,y) = 1 - Q(x,y)\sum_{k=1}^K C_k(\Theta)\, D_k(x,y)

409: $$

410: since it is a combination of analytic functions. By (\ref{QU1}),

411: it vanishes on the curve $\cal L$ in the real $xy$ plane. Consider

412: the subset ${\cal Z}\subset\IC^2$ defined by the equation

413: $P(x,y)=0$, where $x$ and $y$ are treated as complex variables.

414: Note that $\cal L$ is a curve on the two-dimensional manifold

415: $\cal Z$. We will prove that the function $G(x,y)$ vanishes on the

416: entire $\cal Z$.

417:

418: We can assume that $P(x,y)$ is an irreducible polynomial

419: (otherwise we can apply our argument to each irreducible factor of

420: $P$). Then $\cal Z$ is an algebraic variety, hence it admits a

421: complex parametrization (a complex coordinate, $z$), and the

422: restriction of the function $G$ onto $\cal Z$ will be an analytic

423: function of $z$. It is known in complex analysis that if an

424: analytic function $G(z)$, $z\in\IC$, vanishes on a one-dimensional

425: curve in $\IC$, then it is identically zero on $\IC$, hence

426: $G(z)\equiv 0$ for all $z\in\IC$. In our case the curve on which

427: $G$ vanishes is $\cal L$ (and we assume, of course, that it is a

428: nondegenerate curve for all the relevant values of the parameter

429: $\Theta$). Hence, $G$ vanishes on the entire $\cal Z$, and

430: therefore

431: \be

432:      G(x,y)=0

433:      \ \ \ \ \ \ {\rm whenever}\ \ \ \

434:      (x,y)\in{\cal Z}

435:        \label{GP}

436: \ee

437: On the other hand, if the system of equations (\ref{PQ0}) has a

438: complex solution $(x,y)$, then (\ref{GP}) would be impossible,

439: since any solution of (\ref{PQ0}) lies on the manifold $\cal Z$

440: (because $P(x,y) = 0$), and at the same time $Q(x,y)=0$ implies

441: $G(x,y) = 1$. Therefore, if the system (\ref{PQ0}) has a solution

442: (real or complex), then the representation (\ref{wCD}) cannot

443: possibly exist.

444:

445: %One can argue here that if the solutions of the system (\ref{PQ0})

446: %belonged to the singularities of the functions $D_k(x,y)$, then

447: %(\ref{GP}) might still hold on the domain of the function $G(x,y)$.

448: %However, this objection is easy to overturn. Indeed, the functions

449: %$D_k(x,y)$ are independent of the parameter vector $\Theta$, hence

450: %their singularities in $\IC^2$ are fixed. On the contrary, both

451: %functions $P(x,y)$ and $Q(x,y)$ in (\ref{PQ0}) depend on $\Theta$, thus

452: %the solutions of that system ``float'' in $\IC^2$ as $\Theta$ changes,

453: %and so they cannot be always ``blocked'' by the singularities of $D_k$.

454:

455: It remains to show that if the system (\ref{PQ0}) has no solutions,

456: then the representation (\ref{wCD}) is possible, and hence our problem

457: indeed admits a reduction of complexity. Assuming that (\ref{PQ0}) has

458: no solutions, we will construct the representation (\ref{wCD}) in the

459: simplest, polynomial form:

460: \be

461:        w(x,y;\Theta) = \sum_{p,q} w_{p,q}(\Theta)\, x^py^q

462:          \label{wmn}

463: \ee

464: the degree of this polynomial being independent of the parameter

465: $\Theta$. Consider a polynomial equation

466: \be

467:     P(x,y)\, U(x,y) + Q(x,y)\, W(x,y) = 1

468:       \label{UW}

469: \ee

470: where $U(x,y)$ and $W(x,y)$ are unknown polynomials. A classical

471: mathematical theorem, Hilbert's Nullstellensatz \cite{ZS}, says

472: that the equation (\ref{UW}) has polynomial solutions $U(x,y)$ and

473: $W(x,y)$ if and only if $P(x,y)$ and $Q(x,y)$ have no common

474: zeroes in $\IC^2$, i.e.\ whenever the system (\ref{PQ0}) has no

475: complex solutions, which is exactly what we have assumed. Note

476: that since $P$ and $Q$ depend on $\Theta$, so do $U$ and $W$,

477: but we suppress this dependence in the equation (\ref{UW}).

478:

479: Now the polynomial $W(x,y)$ solving (\ref{UW}) gives us the weight

480: function $w(x,y;\Theta)=W(x,y)$, and it is easy to see that

481: $$

482:      W(x,y) = 1/Q(x,y)

483:      \ \ \ \ \ \ {\rm whenever}\ \ \ \

484:      P(x,y) = 0

485: $$

486: Technically, the theorem is proved, but we make a further

487: practical remark. Suppose we know that the system (\ref{PQ0}) has

488: no solutions, so that the problem admits a reduction of

489: complexity. In this case we need to find the polynomial $W(x,y)$

490: solving (\ref{UW}) in an explicit form, in order to determine the

491: weight function $w(x,y;\Theta)$. To this end we describe a finite

492: and relatively simple algorithm for computing the coefficients

493: $w_{p,q}$ of the polynomial $W$. We substitute the expansions

494: $$

495:        W(x,y) = \sum_{p,q} w_{p,q}\, x^py^q

496:        \ \ \ \ \ {\rm and}\ \ \ \ \

497:        U(x,y) = \sum_{p,q} u_{p,q}\, x^py^q

498: $$

499: into the identity (\ref{UW}) and then equate the terms on the left hand

500: side and those on the right hand side with the same degrees of the

501: variables $x,y$. This gives a linear system of equations for the

502: unknown coefficients $w_{p,q}$ and $u_{p,q}$. This might be a large

503: system (its size depends on the degrees of $U$ and $W$), but it is a

504: linear system whose solution can be found by routine matrix methods. If

505: the assumed degrees of $U$ and $W$ are high enough, then the above

506: system is always solvable by the so-called {\em effective

507: Nullstellensatz}, see \cite{S67}. By solving that system we can obtain

508: explicit formulas for the coefficients $w_{p,q}$ and $u_{p,q}$. In fact,

509: we only need the coefficients of $W$, not $U$. Lastly, we remark that

510: those coefficients will be rational functions of the coefficients of

511: the polynomial $P(x,y)$, hence they will be easily computable.
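
To illustrate the procedure on the one case treated explicitly above,
consider the circle. This worked example is ours, a sketch assuming
that constant polynomials $U\equiv u_0$ and $W\equiv w_0$ suffice
(the symbols $u_0,w_0$ are introduced only for this illustration).
By (\ref{4444}) we have $Q=4P+4R^2$, so (\ref{UW}) becomes
$$
   P\,u_0+(4P+4R^2)\,w_0 = (u_0+4w_0)\,P + 4R^2w_0 = 1 .
$$
Equating the coefficient of $P$ and the constant term yields the
linear system $u_0+4w_0=0$, $4R^2w_0=1$, whose solution is
$$
   W = w_0 = \frac{1}{4R^2}, \qquad U = u_0 = -\frac{1}{R^2} .
$$
Thus $W=1/Q$ on the circle, which agrees with the weight $1/R^2$
used earlier up to the constant factor $a(\Theta)=4$ allowed by
(\ref{wgrad}). Both coefficients are rational functions of $R$, and
the system is solvable precisely because $R\neq 0$.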

512:

513: \noindent{\bf Acknowledgement}. N. Chernov is partially supported by

514: NSF grant DMS-0098788 and N.~Sim\'{a}nyi is partially supported by NSF

515: grant DMS-0098773.

516:

517:

518:

519: \begin{thebibliography}{99}

520:

521: \bibitem{BC86} M. Berman and D. Culpin,

522:     The statistical behaviour of some least squares estimators of the centre and radius of a

523:     circle,  {\em J. R. Statist. Soc. B}, {\bf 48}, 1986, 183--196.

524:

525: \bibitem{CO84} N. I. Chernov and G. A. Ososkov,

526:     Effective algorithms for circle fitting,

527:     {\em Comp. Phys. Comm.} {\bf 33}, 1984, 329--333.

528:

529: \bibitem{CL02} N. Chernov and C. Lesort,

530:     {\rm Fitting circles and lines by least squares: theory and experiment},

531:     preprint, available at http://www.math.uab.edu/cl/cl1

532:

533: \bibitem{CL03a} N. Chernov and C. Lesort,

534:     {\rm Statistical efficiency of curve fitting algorithms},

535:     preprint, available at http://www.math.uab.edu/cl/cl2

536:

537: \bibitem{CBH01} W. Chojnacki, M. J. Brooks, and A. van den Hengel,

538:     {\rm Rationalising the renormalisation method of Kanatani},

539:     {\em J. Math. Imaging \& Vision}, {\bf 14}, 2001, 21--38.

540:

541: \bibitem{GGS94} W. Gander, G. H. Golub, and R. Strebel,

542:     {\rm Least squares fitting of circles and ellipses},

543:     {\em BIT} {\bf 34}, 1994, 558--578.

544:

545: \bibitem{Ka96} K. Kanatani,

546:     {\em Statistical Optimization for Geometric Computation: Theory and Practice},

547:     Elsevier Science, Amsterdam, 1996.

548:

549: \bibitem{Ka98} K. Kanatani,

550:     {\rm Cramer-Rao lower bounds for curve fitting},

551:     {\em Graph. Models Image Proc.} {\bf 60}, 1998, 93--99.

552:

553: \bibitem{LM00} Y. Leedan and P. Meer,

554:     {\rm Heteroscedastic regression in computer vision: Problems with bilinear

555:     constraint},

556:     {\em Intern. J. Comp. Vision}, {\bf 37}, 2000, 127--150.

557:

558: \bibitem{Pr87} V. Pratt,

559:     {\rm Direct least-squares fitting of algebraic surfaces},

560:     {\em Computer Graphics} {\bf 21}, 1987, 145--152.

561:

562: \bibitem{Sa82} P. D. Sampson,

563:     {\rm Fitting conic sections to very scattered data:

564:     an iterative refinement of the Bookstein algorithm},

565:     {\em Comp. Graphics Image Proc.} {\bf 18}, 1982, 97--108.

566:

567: \bibitem{S67} J. R. Shoenfield, {\em Mathematical logic}, Reading, Mass.,

568: Addison-Wesley, 1967, p. 100, Ex. 18 (e).

569:

570: \bibitem{Ta91} G. Taubin,

571:     {\rm Estimation Of Planar Curves, Surfaces And Nonplanar

572:     Space Curves Defined By Implicit Equations,

573:     With Applications To Edge And Range Image Segmentation},

574:     {\em IEEE Transactions on Pattern Analysis and Machine

575:     Intelligence},  {\bf 13}, 1991, 1115--1138.

576:

577: \bibitem{Tu74} K. Turner, {\em Computer perception of curved

578: objects using a television camera}, Ph.D.\ Thesis, Dept.\ of Machine

579: Intelligence, University of Edinburgh, 1974.

580:

581: \bibitem{ZS} O. Zariski and P. Samuel, {\em Commutative algebra}, Vol. 2.

582: Princeton, N.J., Van Nostrand [1958-60], p. 164.

583:

584: \end{thebibliography}

585:

586: \end{document}
