1: 
2: \documentclass[12pt]{article}
3: 
4: \usepackage{latexsym, amsmath}
5: \setlength{\oddsidemargin}{0pt}
6: \setlength{\textwidth}{6.4in}
7: 
8: \newtheorem{theorem}{Theorem}
9: \newtheorem{lemma}{Lemma}
10: \newtheorem{corollary}{Corollary}
11: \newtheorem{definition}{Definition}
12: \newtheorem{remark}{Remark}
13: \newtheorem{conjecture}{Conjecture}
14: \newtheorem{proposition}{Proposition}
15: \newtheorem{algorithm}{Algorithm}
16: \newenvironment{proof}{{\it Proof:\/}}{\hfill $\Box$\\ }
17: 
18: 
19: 
20: \newcommand{\Z}{\mbox{\bf Z}}
21: \newcommand{\F}{\mbox{\bf  F}}
22: \newcommand{\Q}{\mbox{\bf  Q}}
23: \newcommand{\C}{\mbox{\bf  C}}
24: \newcommand{\E}{{\cal  E}}
25: \newcommand{\ord}{\mbox{\rm ord}}
26: 
27: \newcommand{\Y}{{\cal Y}}
28: 
29: 
30: \title{Hard Problems of Algebraic Geometry Codes}
31: 
32: \date{}
33: 
34: \author{Qi Cheng\thanks{School of Computer Science,
35: the University of Oklahoma,
36: Norman, OK 73019, USA.
37: Email: {\tt qcheng@cs.ou.edu.}
38: This research is partially supported by NSF Career
39: Award CCR-0237845
40: }
41: }
42: 
43: \begin{document}
44: 
45: \maketitle
46: 
47: 
48: \begin{abstract}
49: 
50: The minimum distance is one of the most important combinatorial
51: characterizations of a code. The maximum likelihood decoding
52: problem is one of the most important algorithmic problems of a code.
53: While these problems
54: are known to be hard for general linear codes, the techniques used to
55: prove their hardness often rely on the construction of artificial
56: codes.
57: In general, much less is known about
58: the hardness of the specific classes of natural linear codes.
59: In this paper, we show
60: that both problems are
61:  NP-hard for algebraic geometry codes.
62: We achieve this by reducing a well-known NP-complete problem 
63: to these 
64: problems using a randomized algorithm. 
The codes used in the reductions
are based on elliptic curves. They have positive rates,
67: but the alphabet sizes are exponential 
68: in the block lengths.
69: \end{abstract}
70: 
71: \section{Introduction}
72: 
73: An $[n,k]_q$ linear error-correcting code is a linear 
74: subspace of a vector space $\F_q^n$, where $\F_q$ denotes the
75: finite field of $q$ elements, 
and $k$ denotes the dimension of the subspace. A
{\em generator matrix} $G$ for a linear code is a $k \times n$
matrix of row rank $k$, which defines a linear mapping from $\F_q^k$ (called
the {\em message space}) to
$\F_q^n$. Therefore, the code $C$ is
$$ C =\{ a G \mid a\in \F_q^k\}. $$
82: We call a vector in $C$ a codeword.
83: The most important codes include the Reed-Solomon codes,
84: the Reed-Muller codes,
85: the BCH codes and the algebraic geometry codes.
86: 
The {\em Hamming distance} between two codewords $x$ and $y$
is the weight (number of nonzero coordinates) of $x-y$.
89: The minimum distance of a code is the minimum
90: Hamming distance between any two codewords.
91: If the code is linear,
92: then the vector $x-y$ is a codeword, and the minimum distance of the
93: code is equal to the minimum weight of any codeword.
94: 
95: 
96: 
97: Given a linear code as input, 
98: how hard is it to compute the minimum distance?
99:  This problem had been open for two decades
100: before it was finally solved by Vardy in 1997 \cite{Vardy97},
101: when he proved that the problem is NP-complete.
102: Interestingly, determining whether a code contains
103: a codeword of a given weight was known to be NP-complete much 
104: earlier \cite{BerlekampMc78}.
However, if we know that the minimum distance of a code
is $d$, it merely implies that there is a codeword
of weight $d$ and that, for any $w<d$, there is no codeword
of weight $w$. It is not clear, for any $n \geq w>d$,
whether there exists a codeword of weight $w$.
Thus there is no straightforward reduction from this problem
to the minimum distance problem.
112: 
113: 
Dumer et al.\ \cite{DumerMi03} studied how hard it
is to approximate the minimum distance of a linear code.
They showed that the minimum distance
of a linear code is not approximable within
any constant factor in random polynomial time,
unless NP$=$RP.
The codes used in their work and in Vardy's \cite{Vardy97} are
artificially designed. These results show that
it is hard to compute the minimum distance for {\em general} linear codes,
but say nothing specific about any of the well-studied
and widely deployed codes.
126: 
127: To use a code in practice, one must have an efficient
128: decoding algorithm. Traditionally, {\em unique decoding algorithms}, 
129: which correct errors of weight at most half of the minimum distance
130: of a code,
have been investigated for natural classes of codes. The discovery of
such algorithms, which provide a means to correct errors,
enabled the widespread application of error-correcting codes.
{\em List decoding algorithms} can
correct more errors and output a list of
codewords, any of which may be the intended message.
In the last decade,
spectacular success has been achieved in the area of list decoding;
its influence can be seen throughout theoretical computer science,
ranging from approximation algorithms and average case complexity
to pseudorandomness and derandomization.
142: %a huge impact can be felt in theoretical computer science,
143: %from PCP to average case studies.
144: The ultimate goal, the {\em maximum likelihood decoding} problem, is
145: one of the central problems in algorithmic coding theory. For any
146: vector $y$ in $\F_q^n$, it asks for a codeword $x$ to minimize the
distance between $x$ and $y$. Given that a received word is equally
likely to contain an error in any position, codewords that are closest
to the received word (i.e., differ in the fewest coordinates) are most
likely to encode the intended message.  This problem was proved to be
NP-hard for general linear codes \cite{BerlekampMc78}.  Proving
NP-hardness for classes of useful codes is more difficult and
subtle.  The only result of this kind to date is that of
\cite{GuruswamiVa05} on the NP-completeness of maximum likelihood
decoding for Reed-Solomon codes. A related result by Cheng and Wan
\cite{ChengWa04}
shows that decoding Reed-Solomon codes at a certain radius
is at least as hard as the discrete logarithm problem over finite fields.
159: 
160: In this paper, we prove that the minimum distance problem
161: and the maximum likelihood decoding problem are NP-hard
162: for a natural class of codes, namely, the algebraic-geometry
163: codes. The algebraic geometry codes can be seen as 
164: a  generalization of the
165: Reed-Solomon codes. 
166: While the study of algebraic geometry codes began as a 
167: purely mathematical pursuit, an increased understanding of their
168: unique combinatorial properties 
169: promises that they 
170: will find real-world applications
171: in the foreseeable future.
172: 
173: 
174: In combinatorics, it is often hard to explicitly
175: construct an  object which is, in certain aspects, better than 
176: a random object.
177: A family of algebraic geometry codes 
178: is one of a few bright spots,
179: where we can  explicitly construct a code having more
180: codewords than a random code given the block length and the
181: minimum distance.
182: Moreover,  given proper representations, these codes  possess a
183: polynomial time list decoding algorithm \cite{GuruswamiSu99},
184: which corrects errors well beyond half of the minimum distance.
185: In contrast, a random code usually does not have a good decoding algorithm
186: due to its lack of algebraic structure.
187: 
188: 
189: Proving the NP-hardness of the maximum likelihood
190: decoding of algebraic geometry codes (MLDAGC) 
191: answers the most important question
192: about the decodability of this class of codes.
193: Proving the NP-hardness of the
194: minimum distance problem for algebraic geometry codes (MDPAGC) is also 
195: well motivated.
196: The designed distance, which is a lower bound of
197: the minimum distance,  can be easily obtained from 
198: the description of the codes. Less attention is paid to the problem
199: of computing the exact minimum distance.
200: 
Also, the minimum distance problem for general linear codes
defied solution for such a long time that one would imagine that
the problem for codes with algebraic structure is more subtle.
204: If a code has a good list decoding algorithm,
205: while at the same time computing its minimum distance is hard,
206: then  we cannot easily find
207: a center of a Hamming ball with
208: the list decodable radius that contains two codewords at 
209: the minimum distance from each other. This illustrates deep structural
210: information about the code which may uncover properties of the
211: code that we have not yet realized.
212: 
A nice surprise about our proofs is their conceptual simplicity.
We use the subset sum problem directly, thus all of
the results on the preprocessing subset sum problem
carry over readily to algebraic geometry codes.
However, our reductions are
randomized, which we would prefer to avoid.
The need for randomization seems to occur in places
where we deal
with number theory and primes.
In \cite{Vardy97} and \cite{GuruswamiVa05},
an irreducible polynomial over $\F_2$
is needed. Even though no deterministic polynomial time algorithm is known
that finds an irreducible polynomial of a given degree
over an arbitrary finite field,
there does exist a deterministic algorithm
that finds an irreducible polynomial of a given degree
over finite fields with a fixed number of elements \cite{Shoup90}.
This explains why the reductions in \cite{Vardy97} and \cite{GuruswamiVa05}
are deterministic.
232: 
233: 
234: 
235: Our reduction always maps a ``Yes'' instance to a ``Yes'' instance,
236: and maps a ``No'' instance to a ``No'' instance
in {\em expected polynomial time}. The reduction in \cite{DumerMi03}
is a {\em reverse unfaithful random reduction}, which always maps a
239: ``No'' instance to a ``No'' instance, but with a small probability,
240: maps  a ``Yes'' instance to a ``No'' instance.
241: 
242: 
The minimum distance problem and the maximum likelihood
decoding problem correspond to
the shortest vector problem and the closest vector problem
in integral lattices.
These problems have received a lot of
attention recently \cite{Ajtai98, Khot04}.
Attempts to
find a reduction from the minimum distance problem
of linear codes to the shortest vector problem
of lattices
have failed so far.
254: 
255: 
256: 
257: 
258: \section{Elliptic curves}
259: 
The Reed-Solomon code of block length $n$ and dimension $k$
is obtained by evaluating polynomials of degree at most $k-1$ at
a set of
$n$ distinct elements of a finite field.
For a linear $[n,k]_q$ code with minimum distance $d$, the Singleton bound
asserts that $d \leq n -k +1 $.
The Reed-Solomon codes are optimal, in that they satisfy the
Singleton bound with equality.
It is trivial to read off the minimum distance of a Reed-Solomon code
from the block length and the dimension.
270: 
271: The algebraic geometry codes are natural generalizations of
272: the Reed-Solomon codes.
273: Let $K$ be a function field over a finite field $\F$.
274: Let $A_1, A_2, \cdots, A_n, B_1, B_2, \cdots, B_m$
275: be $\F$-rational places.
276: Let $a_1, a_2, \cdots, a_n, b_1, b_2, \cdots, b_m$
277: be positive integers.
Given a divisor $A = \sum_{i =1}^n a_i A_i - \sum_{i =1}^m b_i B_i$,
define $L(A)$ to be the set consisting of the zero function and
all functions that have poles only at $A_1,
A_2, \cdots, A_n$, with multiplicities at most $a_1, a_2, \cdots, a_n$
respectively, and have zeros at $B_1, B_2, \cdots, B_m$ with
multiplicities at least $b_1, b_2, \cdots, b_m$ respectively.
The functions in $L(A)$ form a linear space over the field $\F$.
It has dimension no less than $\deg(A) - g +1$,
where $g$ is the genus of the function field, and
$\deg(A) = \sum_{i =1}^n a_i  - \sum_{i =1}^m b_i $.
287: For the divisor $A$, we can construct
288: a linear code, 
289: whose codewords are  obtained by evaluating the functions
290: in $L(A)$
291: at rational places $P_1, P_2, \cdots, P_n$,
292: where $\{P_1, P_2, \cdots, P_n \} \cap \{
293: A_1, A_2, \cdots, A_n, B_1, B_2, \cdots, B_m\} = \emptyset$.
294: 
295:  
296: To prove that computing minimum
297: distances of algebraic geometry codes is NP-hard, we use
298: codes defined by curves of genus one, i.e.,  elliptic curves.
We first review some facts about elliptic curves.
300: An elliptic curve is a smooth cubic curve.
301: Let $\F$ be a field. If the characteristic of $\F$
302: is neither $2$ nor $3$,
303: we may assume that an elliptic curve is given
304: by an equation 
305: $$ y^2 = x^3 + ax +b, \hspace{0.5in} a,b\in \F.$$
306: The discriminant of this curve is defined as $-16(4a^3 + 27b^2)$.
307: It is essentially  the
308: discriminant of the polynomial $x^3 + ax + b$.
It must be nonzero for the curve to be smooth.
310: For detailed information about elliptic curves,
311: we refer the reader to Silverman's book \cite{Silverman86}.
312: The
313: set of $\F$-rational
314: points on the elliptic curve consists of the solution set over $\F$
315: of the
316: equation plus a point at infinity, denoted by $O$. 
317: These points form an
318: abelian group with the infinity point as the identity.  We use $E(\F)$
319: to denote the group. From now on, let $\F$ be the finite field $\F_q$.
320: The following properties of elliptic curves
321: are relevant to our result.
322: 
323: 
324: 
325: \begin{enumerate}
326: \item Let $P_1, P_2, \cdots, P_n, P$ be elements in $E(\F_q)$.
327: If $m_1 P_1 + m_2 P_2 + \cdots + m_n P_n = P$,
328: where $m_i$, $1\leq  i \leq n$, are positive integers,
329: then
there is a function having zeros at $P_1, P_2, \cdots, P_n$ with
multiplicities $m_1, m_2, \cdots, m_n$ respectively,
a pole at $P$ with multiplicity $1$,
and a pole at $O$ with multiplicity $m_1 + m_2 + \cdots + m_n -1$.
334: We can compute the function in time
335: polynomial in $m_1+ m_2+\cdots+m_n$ and $\log q$ \cite{HuangIe94}.
336: \item For a given divisor $A$, we can in polynomial time
337: compute a basis of $L(A)$. 
338: In particular, since $(x)_{\infty} = 2O$,
339: $(y)_{\infty} = 3O$, and consequently, $(x^i)_{\infty} = 2iO$,  
340: $(x^{i-1} y)_{\infty} = (2i+1)O$,  we can compute a basis
341: for $L(\alpha O)$ quickly, and it contains only monomials.
\item If $\deg(A)\geq 1$, then the dimension of $L(A)$ is $\deg(A)$.
343: \item Let $p \equiv 2 \pmod{3} $ be a prime.
344: The curve $y^2 =x^3 +1$ is a supersingular elliptic curve over $\F_p$.
345: The group $E(\F_p)$ contains $p+1$ elements and it is cyclic.
346: \end{enumerate}
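
As an illustration of properties 2 and 3 (a small worked example added here;
the value $\alpha = 5$ is chosen arbitrarily), the monomials in $x$ and $y$
with pole order at most $5$ at $O$ are
$$ (1)_{\infty} = 0, \quad (x)_{\infty} = 2O, \quad (y)_{\infty} = 3O, \quad
(x^2)_{\infty} = 4O, \quad (xy)_{\infty} = 5O. $$
Their pole orders are pairwise distinct, so they are linearly independent, and
$\{1, x, y, x^2, xy\}$ is a basis of $L(5O)$, in agreement with
$\dim L(5O) = \deg(5O) = 5$. No monomial has pole order exactly $1$, which is
consistent with $L(O)$ containing only the constants.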
347: 
348: \begin{lemma}\label{curveconstruction}
349: For any prime $q > 3$, we can in randomized polynomial time
350: find another prime $p = O(q^2)$ and
351: construct an elliptic curve $E/\F_p$ and a point $G\in E(\F_p)$
such that $G$ has order $q$.
353: \end{lemma}
354: 
355: \begin{proof}
Find another prime $p$ such that $p \equiv -1 \pmod{q}$ and
$p \equiv 2 \pmod{3}$. This can be done easily if randomness is
allowed.
We can first solve the system of congruences using
the Chinese Remainder Theorem. If the solution is $p \equiv a \pmod{3q}$,
we select a random number $1\leq x\leq q$, and test whether $a + 3qx$
is prime. By the Siegel-Walfisz theorem concerning
the density of primes in arithmetic progressions,
the probability that we get a prime
is at least $1 / \log^{O(1)} 3q$. Set $p=a+3qx$ if we find a prime.
366: 
367: 
368: 
369: 
Consider the curve $E: y^2 =x^3 + 1$ over $\F_p$. By property 4 above, it is supersingular
and $E(\F_p)$ is a cyclic group of order $p+1$.
We try to find a point $P$ in the group such that ${p+1 \over q} P \not= O$.
Since the group is cyclic, the number of points $P$ such that
${p+1 \over q} P = O$ is ${p+1 \over q}$,
so a random point succeeds with probability $1 - 1/q$.
Once we find a $P$ satisfying
${p+1 \over q} P \not= O$, set $G = {p+1 \over q} P$.
Since $qG = (p+1)P = O$, $G \not= O$ and $q$ is prime,
$G \in E(\F_p)$ is a point of order
$q$.
380: \end{proof}
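
The construction can be traced on a toy instance (a worked example added for
illustration; the parameters $q = 5$, $x = 1$ and the starting point $P$ are
our own choices and are far too small for any reduction). The congruences
$p \equiv -1 \pmod{5}$ and $p \equiv 2 \pmod{3}$ give $p \equiv 14 \pmod{15}$,
and $x = 1$ yields the prime $p = 14 + 15 = 29$. The group $E(\F_{29})$ of
$y^2 = x^3 + 1$ has $p + 1 = 30$ elements, so the cofactor is
${p+1 \over q} = 6$. The point $P = (3, 12)$ lies on the curve, since
$12^2 \equiv 28 \equiv 3^3 + 1 \pmod{29}$, and repeated doubling and addition
give
$$ 2P = (22, 21), \qquad 4P = (19, 4), \qquad G = 6P = 4P + 2P = (4, 23) \not= O. $$
One can confirm the order of $G$ directly:
$$ 2G = (8, 22), \qquad 4G = (4, 6) = -G, \qquad \mbox{hence } 5G = O. $$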
381: 
382: 
383: The curve we construct is supersingular, therefore it is not suitable
384: for elliptic curve cryptosystems if 
385: $p$ is small, since the discrete logarithm problem on those elliptic curves
386: can be reduced to the discrete logarithm problem in $\F_{p^2}$.
387: For practical purposes, there is an efficient method  
388: based on the theory of complex multiplication to
construct a nonsupersingular curve of a given order,
but
it seems hard to analyze its performance rigorously.
392: 
393: In the proof, we need randomness  to find a large order point
394: on an elliptic curve. To deterministically
395: find any point on an elliptic curve is still an
396: open problem, even though 
397: an efficient and simple Las Vegas algorithm exists.
398: 
399: 
400: \section{The NP-hardness proof of the MDPAGC}
401: 
402: We reduce the following well known subset sum problem
403: to the problem of computing minimum distances 
404: of algebraic geometry codes.
405: 
406: \begin{description}
407: \item[Instance:] A set of $n$  positive integers 
408: $A = \{a_1, a_2, a_3, \cdots, a_n\}$, a positive integer $b$ and 
409: a positive integer $k < n$.
410: \item[Question:] Is there a nonempty subset $\{ a_{i_1}, a_{i_2}, \cdots,
411: a_{i_k} \} \subseteq A$ of cardinality $k$ such that 
412: $$ a_{i_1} + a_{i_2} + \cdots + a_{i_k} = b. $$ 
413: \end{description}
414: 
First we prove that a slight variant of the problem is also NP-hard.
416: 
417: \begin{lemma}
418: The following problem ({\em prime field subset sum problem}) is NP-hard:
419: \begin{description}
420: \item[Instance:] A prime $q$, a set of $n$  positive integers 
421: $A = \{a_1, a_2, a_3, \cdots, a_n\} $, an integer $b$ and 
422: a positive integer $k < n$.
423: \item[Question:] Is there a nonempty subset $\{ a_{i_1}, a_{i_2}, \cdots,
424: a_{i_k} \} \subseteq A $ of cardinality $k$ such that 
$$ a_{i_1} + a_{i_2} + \cdots + a_{i_k} \equiv b \pmod{q}. $$
426: \end{description}
427: \end{lemma}
428: 
429: 
430: To prove the lemma, we simply reduce the
431: subset sum problem to it by finding a prime bigger than
432: $ a_1 + a_2 + a_3 + \cdots + a_n  +b$ in an instance of
433: the subset sum problem.
434: It is interesting to note that it seems hard to
435: prove the NP-completeness under the polynomial time 
436: Karp reduction, since such a reduction would give
437: rise to a deterministic algorithm to find 
438: a prime bigger than a given number,
but no such algorithm is known. The problem was listed as
open in \cite{Adleman94}. Derandomizing the algorithm
is very interesting, given that
a deterministic polynomial
time primality testing algorithm was discovered recently
444: \cite{AgrawalKa02}.
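
The reason this choice of $q$ works can be spelled out as follows (a short
verification added for the reader; the numerical instance is hypothetical).
For any $k$ distinct indices $i_1, \cdots, i_k$,
$$ 0 < a_{i_1} + \cdots + a_{i_k} \leq a_1 + \cdots + a_n < q
\quad \mbox{and} \quad 0 < b < q, $$
and two integers in the range $(0, q)$ are congruent modulo $q$ only if they
are equal. Hence the congruence has a solution exactly when the original
equation does. For instance, for $A = \{1, 2, 4\}$, $b = 6$ and $k = 2$ we
have $1 + 2 + 4 + 6 = 13$, so we may take $q = 17$; the pair sums are $3$,
$5$ and $6$, and only $2 + 4$ is congruent to $6$ modulo $17$.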
445: 
446: 
447: \begin{theorem}
Given an
instance of the prime field subset sum problem,
450: we can in  randomized polynomial time, construct an algebraic geometry
451: code $[n,k]_p$ with $p = O(q^2)$ 
452: such that if the answer to the prime field subset sum problem is ``YES'',
453: then the code has minimum distance $n-k$. If the answer to the
454: prime field subset sum problem is ``NO'', then the code 
455: has minimum distance $n -k +1$.
456: \end{theorem}
457: 
458: 
459: \begin{proof}
460: Given an instance of the prime field subset sum problem,
461: by Lemma~\ref{curveconstruction}, we can construct 
462: an elliptic curve $E$ over $\F_p$, $p= O(q^2)$ , with a point $G$ of order $q$.
463: Let $Q = bG$.
Now consider the algebraic geometry code generated by
465: evaluating functions in $L(Q + (k-1)O)$ at
466: $$P_1 = a_1 G, P_2 = a_2 G, \cdots, P_n = a_n G.$$
467: By the Singleton bound, we know that
468: the minimum distance is at most $n-k+1$.
469: This code has designed distance $n-k$, thus
470: the minimum distance is at least $n-k$. 
Let $f_1,f_2, \cdots, f_k$ be a basis of $L(Q+(k-1)O)$.
The generator matrix of the code is
$$
\begin{pmatrix}
f_1 (P_1) & f_1(P_2) & \dots & f_1(P_n)\\
f_2 (P_1) & f_2(P_2) & \dots & f_2(P_n)\\
\hdotsfor{4}\\
f_k (P_1) & f_k (P_2) & \dots & f_k(P_n)
\end{pmatrix}.
$$
481: 
If there exists a subset $\{ a_{i_1}, a_{i_2}, \cdots,
a_{i_k} \} \subseteq \{a_1, a_2, \cdots, a_n\} $ such that
$ a_{i_1} + a_{i_2} + \cdots + a_{i_k} \equiv b \pmod{q}, $
then $P_{i_1} + P_{i_2} + \cdots + P_{i_k} = Q$ in $E(\F_p)$.
Thus there exists a function $f$ having simple zeros at $P_{i_1}$,
$P_{i_2}$, $ \cdots, P_{i_k}$,
a simple pole at $Q$, and a pole at $O$
of multiplicity $k-1$. We have $f\in L(Q + (k-1)O)$.
Such a function is unique up to a constant factor.
The codeword corresponding to $f$ has weight $n-k$,
because it has $k$ zeros in $\{ P_1, P_2, \cdots, P_n\}$.
493: 
In the other direction, if the minimum weight of the codewords
is $n-k$, there exists a function $f \in L(Q + (k-1)O)$ which has zeros at
$k$ of the points $P_1, P_2, \cdots, P_n$.
Denote them by $P_{i_1}, P_{i_2}, \cdots, P_{i_k}$.
Since $f$ can have no more than
$k$ poles, counting multiplicities,
it must have exactly $k$ zeros, and all the zeros have
single multiplicity.
Thus it must have $k$ poles as well.
It has a pole at $Q$ with multiplicity $1$ and
a pole at $O$ with multiplicity $k-1$. That is to say,
$(f) = P_{i_1} + P_{i_2} + \cdots + P_{i_k} -  Q - (k-1) O$.
Hence in $E(\F_p)$
$$P_{i_1} + P_{i_2} + \cdots + P_{i_k} =  Q.$$
We have
$$  a_{i_1}G + a_{i_2}G + \cdots + a_{i_k}G = b G. $$
Since $G$ has order $q$, it follows that
$a_{i_1} + a_{i_2} + \cdots + a_{i_k} \equiv b \pmod{q}$.
511: 
512: 
513: 
514: \end{proof}
515: 
The reductions in the proofs are randomized.
We need to use randomness to find a prime of a certain size
and a point of prime order on an elliptic curve.
Once we find such a prime or point, we can
provide a proof of the primality or the order.
In contrast, Dumer et al.\ \cite{DumerMi03}
need randomness to locate a good center
for a Hamming ball of a certain radius containing many codewords.
Even though a random received word qualifies with high probability,
no proof of this fact can be provided.
527: 
528: 
529: \begin{corollary}
530: If there is a  polynomial time Las Vegas algorithm to compute the minimum
531: distance of an algebraic geometry code, then $NP\subseteq ZPP$.
532: If there is a  polynomial time randomized algorithm to compute
533: the minimum distance of an algebraic geometry code,
534: then $NP\subseteq RP$.
535: \end{corollary}
536: 
537: 
538: 
539: \begin{corollary}
540: Deciding whether an algebraic geometry code is
541: maximum distance separable is NP-hard.
542: \end{corollary}
543: 
544: 
545: 
We can also use one-point divisor codes by
reducing the following problem to MDPAGC.
The details are left to the full paper.
549: \begin{description}
550: \item[Instance:] A set of $n$ integers $\{a_1, a_2, \cdots, a_n\}$ 
551: and $k$, a prime $q$.
\item[Question:] Are there $k$ integers $a_{i_1}, a_{i_2},
\cdots, a_{i_k}$ such that
$$a_{i_1} + a_{i_2}  + \cdots + a_{i_k} \equiv 0 \pmod{q}?$$
555: \end{description}
556: 
557: %Reduction: use the set partition problem.
558: 
559: 
560: %\begin{lemma}
561: %For any given prime $p$,
562: %\end{lemma}
563: 
564: \section{A time complexity lower bound for computing the minimum distance}
565: 
566: 
567: 
From the above analysis, it is easy to see that
we can in time $2^n (\log q)^{O(1)}$ compute
the minimum distance of an elliptic code in $[n,k]_q$.
Does there exist a better algorithm?
If a problem is NP-hard, we do not expect to find an algorithm solving
it in polynomial time, not even in subexponential time.
574: However, for NP-hard problems, sometimes we 
575: can find exponential algorithms beating the trivial
576: exhaustive search. What can we do in the case of
577: the minimum distance problem of algebraic geometry codes?
578: We can ask the same question for
579:  general linear
580: codes as well: can we compute the minimum distance 
581: in time $2^{cn} (\log q)^{O(1)}$
582: for some small $c$?
583: 
Ajtai et al.\ \cite{AjtaiKu01} have studied the problem.
585: They proposed an algorithm
586: that solves the problem in time $2^{O(n)}$
587: if the field size is bounded by a polynomial in $n$.
588: The exact constant hidden in big-O is not calculated
589: in their paper.
590: 
591: 
592: 
The elliptic curve discrete logarithm problem (ECDLP)
is to compute $l$ such that $Q =  lP$, given $P,Q\in E(\F_q)$.
It is obviously
an NP-easy problem, and is not believed to
be NP-hard. There is certainly a randomized
polynomial time reduction from the ECDLP
to any NP-hard problem, including the minimum distance
problem of an algebraic geometry code.
In this section, we present a succinct reduction.
We reduce ECDLP over $\F_q$ to the problem
of computing the minimum distance of algebraic geometry
codes in $[n, k]_q$, where $n \leq \lfloor \log q\rfloor$.
605: 
606: It is assumed in the elliptic curve cryptography that
607: there is no algorithm which runs in time $q^c$ for $c <  1/2$
608: to solve ECDLP in $\F_q$.
609: Under the assumption, we  have a
610: lower bound on the time complexity of computing
611: the minimum distance of linear codes.
612: 
613: 
614: \begin{theorem}
615: For any constant $c>0$,
616: if there is an algorithm which in time $2^{c n}(\log q)^{O(1)}$
617: computes the minimum distance of a linear code $[n,k]_q$,
618: then the ECDLP over $\F_q$ can be solved in time $q^c$.
619: \end{theorem}
620: 
621: 
622: \begin{proof}
623: 
Suppose that we need to compute the discrete logarithm of
$Q$ to the base $P$ on the elliptic curve
$E(\F_q)$. W.l.o.g., we assume that $P$ has a prime order
$p $. Note that, by the Hasse bound, we must have $p \leq q +1 +2\sqrt{q}$.

Denote the largest even number which is not bigger than
$\lfloor \log p \rfloor$ by $n$.
Randomly select a positive integer $r < p$ and
compute $R = rQ$. With probability ${ n \choose n/2}/2^n > 1/n^{O(1)}$,
the discrete logarithm of $R $ is an integer that, when written in binary,
has exactly $n/2$ ones and $n/2$ zeros.
635: 
Now consider the code $C$ generated by evaluating functions in
$L(R + (n/2-1) O)$ at $P_0 = P, P_1 = 2P, P_2 = 2^2P, \cdots,
P_{n-1} = 2^{n-1} P$.
By reasoning similar to that of the previous section, the minimum distance of the code
is $n/2$ iff $R$ can be written as a sum of $n/2$ points
from $P_0, P_1, \cdots, P_{n-1}$. Denote the set of these $n/2$ points
by $D$.
Let $C_i$ be the code generated by evaluating functions in
$L(R + (n/2-1) O)$ at $P_0, P_1, \cdots, P_{i-1}, P_{i+1},
\cdots, P_{n-1}$.
This code has block length $n-1$ and dimension $n/2$, so its minimum
distance is either $n/2-1$ or $n/2$.
We can find $D$
by asking whether
the minimum distance of $C_i$, for $0\leq i\leq n-1$,
is $n/2-1$.
Basically, $P_i\in D$ iff the answer for $C_i$ is ``No''.
We solve the discrete logarithm problem immediately
after we get $D$.
653: \end{proof}
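
Two quantitative steps of the proof can be made explicit (a sketch added for
the reader; logarithms are to base $2$, and the estimate of the central
binomial coefficient is the standard consequence of Stirling's formula).
First,
$$ {n \choose n/2} \Big/ 2^n \sim \sqrt{2 \over \pi n}, $$
so on the order of $\sqrt{n}$ random choices of $r$ are expected before a
suitable $R$ is found. Second, every query concerns a code of block length at
most $n \leq \log p$, so the $n + 1$ queries made for one choice of $r$ cost
at most
$$ (n+1) \, 2^{c n} (\log q)^{O(1)} \leq p^{c} (\log q)^{O(1)}
\leq (q + 1 + 2\sqrt{q})^{c} (\log q)^{O(1)} = O(q^{c}) (\log q)^{O(1)}, $$
that is, $q^c$ up to a factor polynomial in $\log q$.
Finally, once $D$ is known, the logarithm of $R$ is
$s = \sum_{P_i \in D} 2^i$, and since $R = rQ$ the logarithm of $Q$ is
$l \equiv s \, r^{-1} \pmod{p}$.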
654: 
655: 
656: 
657: 
658: 
659: \section{The maximum likelihood decoding for AG-codes is NP-hard}
660: 
661: 
662: 
663: 
664: 
665: 
666: 
667: 
668: 
669: 
The dimension of the linear space $L( (k-1) O) $ over $\F_q$
is $k-1$ for an elliptic curve. The dimension of the linear space
$L(Q + (k-1) O)$, $Q \not= O$, is $k $.  Let $f_1, f_2, \cdots, f_{k-1}$
be a basis for $L((k-1) O) $, and let $f'$ be a function in
$  L(Q + (k-1) O) - L((k-1) O) $.
Then $f_1, f_2, \cdots, f_{k-1}$
and $f'$ form a basis for $L(Q + (k-1) O)$.
It is fairly easy to find an $f'$. We can simply
pick one point $Q' \not\in \{Q, O \}$ and compute $Q'' = Q - Q'$.
Let $l_1$ be the line passing through $Q'$ and $Q''$, and let $l_2$ be the line
passing through $Q$ and $-Q$. We then set $f' = l_1/l_2$.
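
To check that this $f'$ has the required poles (a verification added here,
assuming $k \geq 2$ and taking $l_1$ to be the tangent line when
$Q' = Q''$), note that $Q' + Q'' + (-Q) = O$ in the group, so $l_1$ meets the
curve at $Q'$, $Q''$ and $-Q$, while $l_2$ is the vertical line through $Q$
and $-Q$:
$$ (l_1) = Q' + Q'' + (-Q) - 3O, \qquad (l_2) = Q + (-Q) - 2O. $$
Hence
$$ (f') = (l_1) - (l_2) = Q' + Q'' - Q - O. $$
Since $Q'$ and $Q''$ are different from $Q$ and $O$, nothing cancels: $f'$
has a simple pole at $Q$ and a simple pole at $O$, so it lies in
$L(Q + O) \subseteq L(Q + (k-1)O)$ but not in $L((k-1)O)$.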
681: 
682: 
683: \begin{lemma}\label{distancelemma}
684: Consider the code generated by evaluating functions
685: in $L((k-1) O) $ at $P_1, P_2, \cdots, P_n$.
686: Suppose the received word is $R = (f'(P_1), f'(P_2), \cdots, f'(P_n))$.
687: Then 
688: \begin{enumerate}
\item the distance from $R$ to the code is either $ n - k +1$ or $ n-k $;
\item the distance from $R$ to the code is $n- k $
iff there is a subset $P_{i_1}, \cdots, P_{i_{k}}$
of $P_1, P_2, \cdots, P_n$ such that
 $$  P_{i_1}+ P_{i_2} +  \cdots + P_{i_{k}} = Q.  $$
694: \end{enumerate}
695: \end{lemma}
696: 
697: \begin{proof}
698: 
699: It is clear that  $R$ is not a codeword, since
700: if $f'\in L(Q+(k-1)O)$ takes the same values  as a function in $L((k-1) O)$
701: at $n$ distinct points, it must be equal to the function,
702: but $f'$ has a pole at $Q$.
703: 
704: If the distance is   less than $n-k$,
705: it means that there is a function 
706: $f \in L( (k-1) O)$ such that $f'-f$ has more than $k$ distinct zeros
707: in $\{P_1, P_2, \cdots, P_n \}$.
708: But $f'-f\in L(Q + (k-1)O)$, it has at most $k$ poles. A contradiction.
709: 
710: If the distance from $R$ to the code is $n-k$,
711: there is a function 
712: $f \in L( (k-1) O)$ such that $f'-f$ has $k$ distinct zeros.
713: Let them be $P_{i_1}, \cdots, P_{i_{k}} $.
714: The function $f'-f$  
715: must have a pole at $Q$ with multiplicity $1$ and a pole at $O$ with
716: multiplicity $k-1$. Therefore,
717: we have $ (f'-f) = P_{i_1} + \cdots + P_{i_{k}} - Q - (k-1) O$ and 
718: in $E(\F_p)$
719:    $$  P_{i_1} + \cdots + P_{i_{k}} = Q.  $$ 
720: 
721: 
In the other direction, suppose there is a
subset $P_{i_1}, \cdots, P_{i_{k}}$
of $P_1, P_2, \cdots, P_n$ such that
 $$  P_{i_1}+ P_{i_2} +  \cdots + P_{i_{k}} = Q.  $$
This implies that there is a function $g $ such that
$$(g) =  P_{i_1} + \cdots + P_{i_{k}} - Q - (k-1) O.$$
It is clear that $g\in L(Q + (k-1)O)$, thus
$g = f + a f'  $, where
$f \in L( (k-1)O)$  and $a\not= 0$.
The vector $R$ is at distance $n-k$ from the codeword
obtained by evaluating the function
$-f/a $ at $P_1, P_2, \cdots, P_n$.
734: 
To prove that the distance is at most $n-k+1$,
compute $P' = Q - P_1 - P_2 - \cdots - P_{k-1}$.
If $P' \in \{ P_{k}, P_{k+1}, \cdots, P_n \}$,
then we have shown that the distance
from $R$ to the code is $n-k$. Assume that
this is not the case. There exists a function $g'$ such that
$$ (g') = P_{1} + \cdots + P_{k-1}+P' - Q - (k-1) O. $$
Since $g' \in L(Q + (k-1)O)$, we have that $g' = a f' + f$
for some $f\in L((k-1)O)$ and $a\in \F_q^*$.
Since $g'$ vanishes at $P_1, \cdots, P_{k-1}$, the word $R$ agrees with the
codeword obtained from $-f/a$ in at least $k-1$ positions.
This shows that the distance from $R$ to
the code is at most $n-k+1$.
746: \end{proof}
747: 
748: 
749: 
750: 
751: \begin{theorem}
752: Given a received vector,
753: computing the distance from the vector 
754: to an elliptic code is NP-hard.
755: Therefore, the maximum likelihood decoding problem for
756: algebraic geometry codes is NP-hard.
757: \end{theorem}
758: 
759: \begin{proof}
760: Given an instance of the prime field subset sum problem,
761: we construct 
762: an elliptic curve $E$ over $\F_p$, $p= O(q^2)$ , with a point $G$ of order $q$.
763: Let $Q = bG$, and let $f'$ be a function in $L(Q+(k-1)O) - L((k-1)O)$.
764: Now consider an algebraic geometry code generated by
765: evaluating functions in $L( (k-1)O)$ at
766: $P_1 = a_1 G, P_2 = a_2 G, \cdots, P_n = a_n G$.
767: According to Lemma~\ref{distancelemma},
768: the answer to the prime field subset sum instance 
is ``Yes'' if and only if the distance
770: from $R = (f'(P_1), f'(P_2), \cdots, f'(P_n))$ to the code is $n-k$.
771: 
772: \end{proof}
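
The role of the point $G$ in this reduction can be made explicit.
The following is a sketch under the assumption, which is our reading of the
prime field subset sum problem, that an instance consists of
$a_1, \cdots, a_n, b \in \F_q$ together with a target subset size $k$.
For any indices $i_1, \cdots, i_k$ we have
$$ P_{i_1} + \cdots + P_{i_k} = (a_{i_1} + \cdots + a_{i_k}) G, $$
and since $G$ has order $q$ and $Q = bG$,
$$ (a_{i_1} + \cdots + a_{i_k}) G = Q
   \quad \mbox{if and only if} \quad
   a_{i_1} + \cdots + a_{i_k} \equiv b \pmod{q}. $$
Hence the subsets of size $k$ that sum to $b$ correspond exactly to the
choices of $k$ points among $P_1, \cdots, P_n$ that sum to $Q$.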
773: 
Applying the result on the subset sum problem with preprocessing
\cite{Lobstein90}, we obtain the following corollary.
776: 
777: 
778: \begin{corollary}
779: There is a sequence of algebraic geometry codes $C_1, C_2, \cdots,
780: C_i, \cdots$, where $C_i \in [i,k]_{q_i}$,
781: such that the existence of polynomial size circuits
782: which solve their maximum  likelihood decoding problems implies that
783: $NP\subseteq P/poly$.
784: \end{corollary}
785: 
786: \section{Concluding remarks}
787: 
In this paper, we prove that computing the minimum distance
and maximum likelihood decoding 
are NP-hard for algebraic geometry codes.
Our results rule out
polynomial time solutions for these two problems unless $NP=ZPP$.
793: 
794: The Reed-Solomon codes can be thought of as a special
795: case of algebraic geometry codes, in which
796: we use the rational function field. 
Let $O$ be the point at infinity on the projective line.
The functions $1, x, x^2, \cdots, x^k$ form
a basis for $L(kO)$. In \cite{ChengWa04}, the authors study Hamming balls 
centered at the vectors $(r(x)/h(x))_{x\in \F_q}$, where $r$ and
$h$ are polynomials, in order  
to prove that bounded distance decoding
for the Reed-Solomon codes is hard.
The function $r(x)/h(x)$ has poles at points other than $O$.
Some results in  \cite{GuruswamiRu05} follow a similar line.
In the proof of Lemma \ref{distancelemma}, 
we use $f'$ to generate a received word; it has a pole
at a place other than $O$.
809: We suspect that further exploration of this connection between rational
810: functions with a different pole 
811: and decoding  problems would prove fruitful.
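
As a minimal illustration of this analogy, take the hypothetical choice
$r(x) = 1$ and $h(x) = x - \alpha$ for some $\alpha \in \F_q$, and let
$P_\alpha$ denote the point $x = \alpha$ on the projective line
(both $\alpha$ and $P_\alpha$ are introduced here only for the example). Then
$$ \left( \frac{1}{x - \alpha} \right) = O - P_\alpha, $$
so $1/(x-\alpha)$ lies in $L(P_\alpha + kO)$ but not in $L(kO)$,
just as $f'$ lies in $L(Q + (k-1)O)$ but not in $L((k-1)O)$.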
812: 
813: 
814: Our results use algebraic geometry codes based on elliptic curves.
815: In many ways, the elliptic codes are very similar to
816: the Reed-Solomon codes.
817: Intuitively we expect that the decoding problem for elliptic codes
818: is the easiest among all algebraic geometry codes.
819: We leave it as an open problem to
820: prove that both problems are NP-hard for codes based on curves
821: of any fixed genus.
822: 
823: 
The most interesting families of algebraic geometry codes 
have a fixed alphabet.
The codes in our results have alphabets of exponential size.
Nonetheless, we observe that
none of the known decoding algorithms for algebraic geometry
codes is sensitive to the size
of the alphabet. Our results indicate that if 
831: a polynomial time maximum likelihood decoding
832: algorithm for  algebraic geometry codes 
833: does exist, it can only work for codes with a small alphabet size.
834: We conjecture that the maximum likelihood decoding is NP-hard
835: even for a family of
836: algebraic geometry codes with a fixed alphabet,
837: and leave it as an open problem.
838: 
839: 
840: \section*{Acknowledgments}
841: We thank Daqing Wan and Elizabeth Murray for helpful discussions.
842: 
843: 
844: 
845: 
846: \begin{thebibliography}{10}
847: 
848: \bibitem{Adleman94}
849: Len Adleman.
850: \newblock Algorithmic number theory-the complexity contribution.
851: \newblock In {\em Proc.\ $35$th IEEE Symp.\ on Foundations of Comp.\ Science},
852:   pages 88--113, 1994.
853: 
854: \bibitem{AgrawalKa02}
855: Manindra Agrawal, Neeraj Kayal, and Nitin Saxena.
\newblock {PRIMES} is in {P}.
857: \newblock http://www.cse.iitk.ac.in/news/primality.pdf, 2002.
858: \newblock To appear in {\em Annals of Mathematics}.
859: 
860: \bibitem{Ajtai98}
861: Miklos Ajtai.
\newblock The shortest vector problem in {$L_2$} is {NP}-hard for randomized
  reductions (extended abstract).
864: \newblock In {\em Proc.\ $30$th ACM Symp.\ on Theory of Computing}, pages
865:   10--19, 1998.
866: 
867: \bibitem{AjtaiKu01}
868: Miklos Ajtai, Ravi Kumar, and D.~Sivakumar.
869: \newblock A sieve algorithm for the shortest lattice vector problem.
\newblock In {\em Proc.\ $33$rd ACM Symp.\ on Theory of Computing}, pages
871:   601--610, 2001.
872: 
873: \bibitem{BerlekampMc78}
874: Elwyn~R. Berlekamp, Robert~J. McEliece, and Henk~C. van Tilborg.
875: \newblock On the inherent intractability of certain coding problems.
\newblock {\em IEEE Transactions on Information Theory}, 24(3):384--386, 1978.
877: 
878: \bibitem{ChengWa04}
879: Qi~Cheng and Daqing Wan.
\newblock On the list and bounded distance decodibility of the {Reed-Solomon}
  codes (extended abstract).
882: \newblock In {\em Proc.\ $45$th IEEE Symp.\ on Foundations of Comp.\ Science},
883:   pages 335--341, 2004.
884: 
885: \bibitem{DumerMi03}
886: Ilya Dumer, Daniele Micciancio, and Madhu Sudan.
887: \newblock Hardness of approximating the minimum distance of a linear code.
888: \newblock {\em IEEE Transactions on Information Theory}, 49(1):22--37, 2003.
889: 
890: \bibitem{GuruswamiRu05}
891: Venkatesan Guruswami and Atri Rudra.
\newblock Limits to list decoding {Reed-Solomon} codes.
893: \newblock In {\em Proc.\ $37$th ACM Symp.\ on Theory of Computing}, 2005.
894: 
895: \bibitem{GuruswamiSu99}
896: Venkatesan Guruswami and Madhu Sudan.
897: \newblock Improved decoding of {Reed-Solomon} and algebraic-geometry codes.
898: \newblock {\em IEEE Transactions on Information Theory}, 45(6):1757--1767,
899:   1999.
900: 
901: \bibitem{GuruswamiVa05}
902: Venkatesan Guruswami and Alexander Vardy.
\newblock Maximum-likelihood decoding of {Reed-Solomon} codes is {NP}-hard.
\newblock In {\em Proceedings of SODA}, 2005.
905: 
906: \bibitem{HuangIe94}
907: Ming-Deh Huang and Doug Ierardi.
\newblock Efficient algorithms for the {Riemann-Roch} problem and for addition in
  the {J}acobian of a curve.
910: \newblock {\em Journal of Symbolic Computation}, 18:519--539, 1994.
911: 
912: \bibitem{Khot04}
913: Subhash Khot.
914: \newblock Hardness of approximating the shortest vector problem in lattices.
915: \newblock In {\em Proc.\ $45$th IEEE Symp.\ on Foundations of Comp.\ Science},
916:   2004.
917: 
918: \bibitem{Lobstein90}
919: Antoine Lobstein.
920: \newblock The hardness of solving subset sum with preprocessing.
921: \newblock {\em IEEE Transactions on Information Theory}, 36(4):943--946, 1990.
922: 
923: \bibitem{Shoup90}
924: Victor Shoup.
925: \newblock On the deterministic complexity of factoring polynomials over finite
926:   fields.
927: \newblock {\em Information Processing Letters}, 33:261--267, 1990.
928: 
929: \bibitem{Silverman86}
930: Joseph~H. Silverman.
931: \newblock {\em The arithmetic of elliptic curves}.
932: \newblock Springer-Verlag, 1986.
933: 
934: \bibitem{Vardy97}
935: Alexander Vardy.
936: \newblock The intractability of computing the minimum distance of a code.
\newblock {\em IEEE Transactions on Information Theory}, 43(6):1757--1766, 1997.
938: 
939: \end{thebibliography}
940: 
941: 
942: 
943: 
944: 
945: \end{document}
946: 
947: