0711.3183/pal.tex
1: \documentclass[12pt]{article}
2: \usepackage{amsmath,amsthm,amsfonts}
3: \usepackage{fullpage}
4: \usepackage{graphicx}
5: \usepackage[dvips]{epsfig}
6: \usepackage{epsf}
7: \usepackage{float}
8: \usepackage{wasysym}
9: 
10: \def\divides{{\  | \ }}
11: \def\intersect{ \ \cap \ }
12: \def\union{\ \cup \ }
13: 
14: \newtheorem{theorem}{Theorem}
15: \newtheorem{corollary}[theorem]{Corollary}
16: \newtheorem{proposition}[theorem]{Proposition}
17: \newtheorem{lemma}[theorem]{Lemma}
18: \newtheorem{claim}[theorem]{Claim}
19: 
20: \title{Detecting Palindromes, Patterns and Borders in Regular Languages}
21: 
22: \author{Terry Anderson, Narad Rampersad\footnote{Author's current address:
23: Department of Mathematics and Statistics, University of Winnipeg,
24: 515 Portage Ave., Winnipeg, MB R3B 2E9, Canada.},
25: Nicolae Santean\footnote{Author's current address:
26: Department of Computer and Information Sciences,
27: Indiana University South Bend, 1700 Mishawaka Ave.,
28: P.O. Box 7111, South Bend, IN 46634, U.S.A.}, and Jeffrey Shallit\\
29: School of Computer Science\\
30: University of Waterloo\\
31: Waterloo, ON  N2L 3G1, Canada\\
32: {\tt tanderson@uwaterloo.ca} \\
33: {\tt n.rampersad@uwinnipeg.ca} \\
34: {\tt nsantean@iusb.edu} \\
35: {\tt shallit@graceland.uwaterloo.ca}\medskip\\
36: John Loftus\\
37: Luzerne County Community College\\
38: 1333 South Prospect Street\\
39: Nanticoke, PA  18634, U.S.A.\\
40: {\tt jloftus@luzerne.edu}}
41: 
42: \begin{document}
43: \date{\today}
44: \maketitle
45: 
46: \begin{abstract}
47: Given a language $L$ and a nondeterministic finite automaton $M$,
48: we consider whether we can determine efficiently (in the size of $M$)
49: if $M$ accepts at least one word in $L$, or infinitely many words.
50: Given that $M$ accepts at least one word in $L$,
51: we consider how long a shortest word can be.
52: The languages $L$ that we examine include the
53: palindromes, the non-palindromes, the $k$-powers, the non-$k$-powers,
54: the powers, the non-powers (also called primitive words), the words matching
55: a general pattern, the bordered words, and the unbordered words.
56: 
57: \bigskip\noindent
58: \textbf{Keywords:} palindrome, $k$-power, primitive word, pattern,
59: bordered word.
60: \end{abstract}
61: 
62: \section{Introduction}
63: 
64: Let $L \subseteq \Sigma^*$ be a fixed language, and let $M$ be a
65: deterministic finite automaton (DFA) or nondeterministic finite
66: automaton (NFA) with input alphabet $\Sigma$.
67: In this paper we are interested in three questions:
68: 
69: \begin{enumerate}
70: 
71: \item Can we efficiently decide (in terms of the size of $M$)
72: if $L(M)$ contains at least 
73: one element of $L$, that is, if $L(M) \intersect L \not= \emptyset$?
74: 
75: \item Can we efficiently decide if $L(M)$ contains infinitely
76: many elements of $L$, that is, if $L(M) \intersect L$ is infinite?
77: 
78: \item Given that $L(M)$ contains at least one element of $L$, what is
79: a good upper bound on a shortest element of $L(M) \intersect L$?
80: 
81: \end{enumerate}
82: 
83: We can also ask the same questions about $\overline{L}$, the
84: complement of $L$.
85: 
86: As an example, consider the case where $\Sigma = \lbrace {\tt a} \rbrace$,
87: $L$ is the set of primes written in unary, that is,
88: $\lbrace {\tt a}^i \ : \ i \text{ is prime } \rbrace$, and $M$ is an NFA with
89: $n$ states.  
90: 
91: To answer questions (1) and (2), we first rewrite $M$ in Chrobak
92: normal form \cite{Chrobak:1986}.  Chrobak normal form consists of an 
93: NFA $M'$ with a 
94: ``tail'' of $O(n^2)$ states, followed by a single nondeterministic 
95: choice to a set of disjoint cycles containing at most $n$ states.  
96: Computing this normal form can be achieved in $O(n^5)$ steps
97: by a result of Martinez \cite{Martinez:2002}.  
98: 
99: Now we examine each of the cycles produced by this transformation.
100: Each cycle accepts a finite union of sets of the form $({\tt a}^t)^*
101: {\tt a}^c$, where $t$ is the size of the cycle and $c \leq n^2 + n$;
102: both $t$ and $c$ are given explicitly from $M'$.  Now, by Dirichlet's
103: theorem on primes in arithmetic progressions, $\gcd(t,c) = 1$ for at
104: least one pair $(t,c)$ induced by $M'$ if and only if $M$ accepts
105: infinitely many elements of $L$.  This can be checked in $O(n^2)$
106: steps, and so we get a solution to question (2) in polynomial time.
107: 
108: Question (1) requires a little more work.  From our answer to question
109: (2), we may assume that $\gcd(t,c) > 1$ for all pairs $(t,c)$, for
110: otherwise $M$ accepts infinitely many elements of $L$ and hence at
111: least one element.  Each element in such a set is of length $kt+c$ for
112: some $k \geq 0$.   Let $d = \gcd(t,c) \geq 2$.  Then $kt+c = (kt/d +
113: c/d)d$.  If $k > 1$, this quantity is at least $2d$ and hence
114: composite.  Thus it suffices to check the primality of $c$ and $t+c$,
115: both of which are at most $n^2 + 2n$.  We can precompute
116: the primes $\leq n^2 + 2n$ in
117: $O(n^2)$ time using a modification of the sieve of Eratosthenes
118: \cite{Pritchard:1987}, and check if any of them are accepted.  This
119: gives a solution to question (1) in polynomial time.
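
\medskip\noindent
As an illustration of the two tests above (the pairs below are
hypothetical and are not derived from any particular automaton), suppose
the cycles of $M'$ induce the pairs
$(t,c) \in \lbrace (4,3), (6,4), (6,3), (5,0) \rbrace$.
\begin{itemize}
\item For $(4,3)$ we have $\gcd(4,3) = 1$, so by Dirichlet's theorem the
lengths $3, 7, 11, \ldots$ include infinitely many primes.
\item For $(6,4)$ we have $d = 2$; neither $c = 4$ nor $t + c = 10$ is
prime, so no word of prime length arises from this pair.
\item For $(6,3)$ we have $d = 3$, and $c = 3$ is prime (the case $k = 0$).
\item For $(5,0)$ we have $d = 5$; here $c = 0$ is not prime, but
$t + c = 5$ is (the case $k = 1$), which is why both $c$ and $t+c$
must be checked.
\end{itemize}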
120: 
121: On the other hand, answering question (3) essentially amounts to
122: estimating the size of the least prime in an arithmetic progression, an
123: extremely difficult question that is still not fully resolved
124: \cite{Heath-Brown:1992}, although it is known that there is a
125: polynomial upper bound.
126: 
127: Even the case where $L$ is regular can be difficult.  Suppose $L$
128: is represented as the complement of a language accepted by an NFA $M'$ with
129: $n$ states.  Then if $L(M) =\Sigma^*$, question (1) amounts to asking
130: if $L(M') \not= \Sigma^*$, which is PSPACE-complete
131: \cite[Section 10.6]{Aho&Hopcroft&Ullman:1974}.  Question (2) amounts to
132: asking if $\overline{L(M')}$ is infinite, which is also
133: PSPACE-complete \cite{Kao&Shallit&Xu:2007}.  
134: Question (3) amounts to asking for good bounds on the smallest string not
135: accepted by an NFA.  There is an evident upper bound of $2^n$, and
136: there are examples known that achieve $2^{cn}$ for some constant
137: $c > 0$, but more detailed analysis is still lacking
138: \cite{Ellul&Krawetz&Shallit&Wang:2004}.
139: 
140: Thus we see that asking these questions, even for relatively simple
141: languages $L$, can quickly take us to
142: the limits of what is known in formal language theory and number theory.
143: 
144: In this paper we examine questions (1)--(3) in the case where $M$
145: is an NFA and $L$ is either the set of palindromes, the set of
146: $k$-powers, the set of powers, the set of words matching a general pattern,
147: the set of bordered words, or their complements.
148: 
149:    In some of these cases, there is previous work.
150: For example, Ito et al.\
151: \cite{Ito&Katsura&Shyr&Yu:1988} studied several
152: circumstances in which primitive words (non-powers) may appear in
153: regular languages. As a typical result in
154: \cite{Ito&Katsura&Shyr&Yu:1988}, we mention:
155: ``A DFA over an alphabet of $2$ or more letters accepts a primitive
156: word iff it accepts one of length $\leq 3n-3$, where $n$ is the
157: number of states of the DFA''. 
158: Horv\'ath, Karhum\"aki and Kleijn \cite{Horvath&Karhumaki&Kleijn:1987} 
159: addressed the decidability problem of whether a language
160: accepted by an NFA is palindromic (i.e., every element is a palindrome).
161: They showed that the
162: language accepted by an NFA with $n$ states is palindromic
163: if and only if all its words of length shorter than $3n$
164: are palindromes. 
165: 
166:      Here is a summary of the rest of the paper.  In Section~\ref{nn},
167: we define the objects of study and our notation.  
168: 
169: In Section~\ref{onepal}, we begin our study of palindromes.  We give
170: efficient algorithms to test if an NFA accepts at least one palindrome,
171: or infinitely many.  We also show that a shortest palindrome accepted
172: is of length at most quadratic in the number of states, and further, that quadratic examples
173: exist.  In Section~\ref{algpal}, we give efficient algorithms to test
174: if an NFA accepts at least one non-palindrome, or infinitely many.
175: Further, we give a tight bound on the length of a shortest
176: non-palindrome accepted.
177: 
178: In Section~\ref{pow_test}, we begin our study of patterns.  We show that
179: it is PSPACE-complete to test if a given NFA accepts a word matching a
180: given pattern.  As a special case of this problem we consider testing
181: if an NFA accepts a $k$-power.  We give an
182: algorithm to test if a $k$-power is accepted that runs in polynomial time for fixed $k$.
183: If $k$ is not fixed, the problem is PSPACE-complete.  
184: We also study the problem of accepting a power of exponent $\geq k$,
185: and of accepting infinitely many $k$-powers.
186: 
187: In Section~\ref{kp},
188: we give a polynomial-time algorithm to decide if a non-$k$-power
189: is accepted.  We also give upper and lower bounds
190: on the length of a shortest $k$-power accepted.  In
191: Section~\ref{powers}, we give an efficient algorithm for
192: determining if an NFA accepts at least one non-power.
193: In Section~\ref{smallkp}, we bound the length of the smallest power.
194: Section~\ref{add2pow} gives some additional results on powers.
195: 
196: In Section~\ref{bord}, we show how to test if an NFA accepts a bordered
197: word, or infinitely many,
198: and show that a shortest bordered word accepted can be of
199: quadratic length.  In Section~\ref{unbord} we give an algorithm
200: to test if an NFA accepts an unbordered word, or infinitely many,
201: and we establish a linear upper bound on the length of a shortest
202: unbordered word.
203: 
204: \section{Notions and notation}\label{nn}
205: 
206: Let $\Sigma$ be an alphabet, i.e., a nonempty, finite set
207: of symbols (letters). By $\Sigma^*$ we denote the set of
208: all finite words (strings of symbols) over $\Sigma$, and by
209: $\epsilon$, the empty word (the word having zero
210: symbols). The operation of concatenation (juxtaposition) of
211: two words $u$ and $v$ is denoted by $u\cdot v$, or simply
212: $uv$.  If $w \in \Sigma^*$ is written in the form $w=xy$ for
213: some $x,y \in \Sigma^*$, then the word $yx$ is said to be a
214: {\it conjugate} of $w$.
215: 
216: For $w\in \Sigma^*$, we denote by $w^R$ the word
217: obtained by reversing the order of symbols in $w$. A {\it
218: palindrome} is a word $w$ such that $w = w^R$. If $L$ is a
219: language over $\Sigma$, i.e., $L\subseteq \Sigma^*$, we say
220: that $L$ is {\it palindromic} if every word $w \in L$ is a
221: palindrome.
222: 
223: Let $k \geq 2$ be an integer.  A word $y$ is a
224: \emph{$k$-power} if $y$ can be written as $y = x^k$ for
225: some non-empty word $x$.  If $y$ cannot be so written for
226: any $k \geq 2$, then $y$ is \emph{primitive}. A $2$-power
227: is typically referred to as a \emph{square}, and a
228: $3$-power as a \emph{cube}.
229: 
230: Patterns are a generalization of powers.  A \emph{pattern}
231: is a non-empty word $p$ over a \emph{pattern alphabet} $\Delta$.  The
232: letters of $\Delta$ are called \emph{variables}.  A pattern
233: $p$ \emph{matches} a word $w \in \Sigma^*$ if there exists a non-erasing
234: morphism $h : \Delta^* \to \Sigma^*$ such that $h(p) = w$.  Thus,
235: a word $w$ is a $k$-power if it matches the pattern $a^k$.
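
For instance (an illustrative example, with pattern variables $x, y$ and
$\Sigma = \lbrace {\tt a}, {\tt b} \rbrace$ chosen here for concreteness),
the pattern $p = xyx$ matches the word ${\tt abaab}$ via the non-erasing
morphism $h(x) = {\tt ab}$, $h(y) = {\tt a}$, since
$h(p) = {\tt ab} \cdot {\tt a} \cdot {\tt ab}$.  On the other hand, $p$
does not match ${\tt aab}$: a non-erasing $h$ with $|h(p)| = 3$ must map
each variable to a single letter, and then $h(p)$ begins and ends with
the same letter.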
236: 
237:       Bordered words are generalizations of powers.  We say a
238: word $x$ is {\it bordered}
239: if there exist words $u \in \Sigma^+$, $w \in \Sigma^*$
240: such that $x = uwu$.  In this case, the word $u$ is said to be a
241: {\it border} for $x$.  Otherwise, $x$ is {\it unbordered}.
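
For example, ${\tt abcab}$ is bordered, with border $u = {\tt ab}$ and
$w = {\tt c}$.  The word ${\tt aab}$ is unbordered: a border $u$ would
satisfy $2|u| \leq 3$, hence $|u| = 1$, but the first and last letters of
${\tt aab}$ differ.  To see in what sense bordered words generalize
powers, note that every $k$-power $z^k$ with $k \geq 2$ is bordered: take
$u = z$ and $w = z^{k-2}$.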
242: 
243: A nondeterministic finite automaton (NFA) over $\Sigma$
244: is a $5$-tuple $M=(Q, \Sigma, \delta, q_0, F)$ where $Q$
245: is a finite set of states, $\delta : Q\times \Sigma
246: \rightarrow 2^{Q}$ is a next-state
247: function, $q_0$ is an initial state and $F\subseteq Q$ is a
248: set of final states. We sometimes view $\delta$ as a
249: transition table, i.e., as a set consisting of tuples $(p,
250: a, q)$ with $p, q\in Q$ and $a\in\Sigma$.
251: The machine $M$ is deterministic (DFA) if $\delta$ is a function
252: mapping $Q\times \Sigma\rightarrow Q$. We consider only {\em
253: complete} DFAs, that is, those whose transition function
254: is a total function.  Sometimes we use NFA-$\epsilon$, which are
255: NFAs that also allow transitions on the empty word.
256: 
257: The size of $M$ is the total number $N$ of its
258: states and transitions. When we want to emphasize the components of $M$,
259: we say $M$ has $n$ states and $t$ transitions, and define $N := n+t$.
260: The language of $M$,
261: denoted by $L(M)$, belongs to the family of {\em regular
262: languages} and consists of those words accepted by $M$ in
263: the usual sense. A {\em successful path}, or {\em
264: successful computation} of $M$ is any computation starting
265: in the initial state and ending in a final state. The label
266: of a computation is the input word that triggered it; thus,
267: the language of $M$ is the set of labels of all successful
268: computations of $M$.
269: 
270: A state of $M$ is {\em accessible} if there exists a path in the
271: associated transition graph, starting from $q_0$ and ending
272: in that state. By convention, there exists a path from each
273: state to itself labeled with $\epsilon$. A state $q$ is
274: {\em coaccessible} if there exists a path from $q$ to some
275: final state. A state which is both accessible and
276: coaccessible is called {\em useful}, and if it is not
277: coaccessible it is called {\em dead}.
278: 
279: We note that if $M$ is an NFA or NFA-$\epsilon$, we can remove all states that
280: are not useful in linear time (in the number of states and transitions)
281: using depth-first search.  We observe that $L(M) \not= \emptyset$ if
282: and only if any states remain after this process, which can be
283: tested in linear time.  Similarly, if $M$ is an NFA, then $L(M)$ is infinite
284: if and only if the corresponding digraph has a directed cycle.  
285: This can also be tested in linear time.
286: 
287: If $M$ is an NFA-$\epsilon$, then to check if $L(M)$ is infinite
288: we need to know not only that the corresponding digraph has a cycle, but
289: that it has a cycle labeled by a non-empty word.  This can also be
290: checked in linear time as follows.  Let us suppose that all non-useful
291: states of $M$ have been removed.  We wish to test whether there is
292: some edge of the digraph of $M$ that is part of some cycle and is not
293: labeled by the empty word.  We now observe that an edge of a digraph
294: belongs to a directed cycle if and only if both of its endpoints lie within
295: the same strongly connected component.  It is well known that the strongly
296: connected components of a graph can be computed in linear time
297: (see \cite[Section~22.5]{CLRS01}).  Once the strongly connected components
298: of the NFA-$\epsilon$ are known, we simply check the edges not
299: labeled by $\epsilon$ to determine if there is such an edge with both
300: endpoints in the same strongly connected component.  Thus we can
301: determine if $L(M)$ is infinite in linear time.
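
As a small illustration (the automaton below is a hypothetical example),
consider the NFA-$\epsilon$ with initial state $q_0$, further states
$q_1, q_2$, final state $q_2$, $\epsilon$-transitions from $q_0$ to $q_1$
and from $q_1$ to $q_0$, and a transition from $q_1$ to $q_2$ labeled
${\tt a}$.  All states are useful, and the strongly connected components
are $\lbrace q_0, q_1 \rbrace$ and $\lbrace q_2 \rbrace$.  The digraph has
a cycle, but the only edge not labeled by $\epsilon$ joins two different
components, so the language accepted, $\lbrace {\tt a} \rbrace$, is
finite.  If we add an $\epsilon$-transition from $q_2$ to $q_1$, all
three states lie in one strongly connected component containing the edge
labeled ${\tt a}$, and the language accepted becomes ${\tt a}^+$, which
is infinite.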
302: 
303: Although the results of this paper are generally stated as applying
304: to NFAs, by virtue of the preceding algorithm, one sees that the
305: results apply equally well to NFA-$\epsilon$.
306: 
307: We will also need the following well-known results
308: \cite{Hopcroft&Ullman:1979}:
309: 
310: \begin{theorem}
311: Let $M$ be an NFA with $n$ states.  Then
312: \begin{itemize}
313: \item[(a)]   $L(M) \not= \emptyset$ if and only if $M$ accepts a word
314: of length $< n$.
315: 
316: \item[(b)] $L(M)$ is infinite if and only if $M$ accepts a word
317: of length $\ell$, $n \leq \ell < 2n$.
318: \end{itemize}
319: \label{hopcroft}
320: \end{theorem}
321: 
322: If $L \subseteq \Sigma^*$ is a language, the \emph{Myhill--Nerode equivalence
323: relation} $\equiv_L$ is the equivalence relation defined as
324: follows:  for $x,y \in \Sigma^*$, $x \equiv_L y$ if for all $z \in \Sigma^*$,
325: $xz \in L$ if and only if $yz \in L$.  The classical Myhill--Nerode theorem
326: asserts that if $L$ is regular, the equivalence relation $\equiv_L$ has
327: only finitely many equivalence classes.
328: 
329: For a background on finite automata and regular languages
330: we refer the reader to Yu \cite{YU97}.
331: 
332: \section{Testing if an NFA accepts at least one palindrome}
333: \label{onepal}
334: 
335:      Over a unary alphabet, every string is a palindrome, so problems
336: (1)--(3) become trivial.  Let us assume, then, that the alphabet $\Sigma$
337: contains at least two letters.  Although the palindromes over such an
338: alphabet are not regular, the language
339: $$ \lbrace x \in \Sigma^* \ : \ x x^R \in  L(M) \text{ or there exists } a \in \Sigma \text{ such that } x a x^R \in L(M) \rbrace$$
340: is, in fact, regular, as is often shown in a beginning course in formal
341: languages \cite[p.\ 72, Exercise 3.4 (h)]{Hopcroft&Ullman:1979}.  We
342: can take advantage of this as follows:
343: 
344: \begin{lemma}
345:       Let $M$ be an NFA with $n$ states and $t$ transitions.  Then there
346: exists an NFA-$\epsilon$ $M'$ with $n^2+1$ 
347: states and $\leq 2t^2$ transitions such that
348: $$L(M') = \lbrace x \in \Sigma^* \ : \ x x^R \in L(M) \text{ or there
349: 	exists } a \in \Sigma \text{ such that } x a x^R \in L(M) \rbrace.$$
350: \label{pal-con}
351: \end{lemma}
352: 
353: \begin{proof}
354: Let $M=(Q, \Sigma, \delta, q_0, F)$ be an NFA
355: with $n$ states. We construct an NFA-$\epsilon$
356: $M' = (Q', \Sigma, \delta', q'_0, F')$ as follows:
357: We let $Q'=Q\times Q\cup \{q_0'\}$, where $q_0'$ is the new initial state,
358: and we define 
359: the set of final states by
360: $$F' = \lbrace [p, p] \ : \ p\in Q\}\cup \{[p, q] \ : \ \text{ there exists }
361: a \in \Sigma \text{ such that } q \in \delta(p,a) \rbrace.$$
362: The transition function $\delta'$ is defined as follows:
363: $$ \delta'(q'_0, \epsilon) = \lbrace [q_0,q] \ : \ q \in F \rbrace$$
364: and
365: $$\delta'([p,q], a) = \lbrace [r,s] \ : \ r \in \delta(p,a) \text{ and }
366: q \in \delta(s, a) \rbrace.$$
367: 
368: It is clear that $M'$ accepts the desired language and consists of at most
369: $n^2+1$ states and $2t^2$ transitions.
370: \end{proof}
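
\medskip\noindent
To illustrate the construction on a small (hypothetical) instance, let
$M$ have states $q_0, q_1, q_2, q_3$, final state $q_3$, and transitions
$\delta(q_0, {\tt a}) = \lbrace q_1 \rbrace$,
$\delta(q_1, {\tt b}) = \lbrace q_2 \rbrace$, and
$\delta(q_2, {\tt a}) = \lbrace q_3 \rbrace$, all other values of
$\delta$ being empty, so that $L(M) = \lbrace {\tt aba} \rbrace$.  In
$M'$, the initial state $q'_0$ has an $\epsilon$-transition to
$[q_0, q_3]$, and
$$\delta'([q_0,q_3], {\tt a}) = \lbrace [r,s] \ : \ r \in \delta(q_0, {\tt a})
\text{ and } q_3 \in \delta(s, {\tt a}) \rbrace = \lbrace [q_1, q_2] \rbrace.$$
Since $q_2 \in \delta(q_1, {\tt b})$, the state $[q_1,q_2]$ is final, so
$M'$ accepts $x = {\tt a}$, corresponding to the palindrome
$x {\tt b} x^R = {\tt aba} \in L(M)$.  The state $[q_0,q_3]$ itself is
not final, so $\epsilon \notin L(M')$, as expected, since neither
$\epsilon$ nor any single letter belongs to $L(M)$.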
371: 
372: \begin{corollary}
373:     Given an NFA $M$ with $n$ states and $t$ transitions,
374: we can determine if $M$ accepts a palindrome in $O(n^2 + t^2)$ time.
375: \end{corollary}
376: 
377: \begin{proof}
378:       We create $M'$ as in the proof of Lemma~\ref{pal-con},
379: and remove all states that are not useful, and
380: their associated transitions.  Now $M$ accepts
381: at least one palindrome if and only if $L(M')\not=\emptyset$, which can
382: be tested in time linear in the number of transitions and states of $M'$.
383: \end{proof}
384: 
385:       From Lemma~\ref{pal-con}, we obtain two other interesting
386: corollaries.
387: 
388: \begin{corollary}
389:       Given an NFA $M$, we can determine if $L(M)$ contains infinitely
390: many palindromes in quadratic time.
391: \label{inf-pal}
392: \end{corollary}
393: 
394: \begin{proof}
395:        We create $M'$ as in the proof of Lemma~\ref{pal-con}, and remove
396: all states that are not useful, and their associated transitions.
397: $M$ accepts infinitely many palindromes if and only if $L(M')$ is infinite,
398: which can be tested in linear time, as described in Section~\ref{nn}.
399: \end{proof}
400: 
401: \begin{corollary}
402:      If an NFA $M$ with $n$ states accepts at least one palindrome, it accepts a
403: palindrome of length $\leq 2n^2 -1$.
404: \end{corollary}
405: 
406: \begin{proof}
407:       Suppose $M$ accepts at least one palindrome.  Then $M'$, as in
408: Lemma~\ref{pal-con}, accepts a word.  Although $M'$ has $n^2+1$ states,
409: the only transition from the initial state $q'_0$ is 
410: an $\epsilon$-transition to one of the other $n^2$ states.  Thus if
411: $M'$ accepts a word, it must accept a word $w$ of length $\leq n^2 - 1$.
412: Then $M$ accepts
413: either $w w^R$ or $w a w^R$ for some $a \in \Sigma$, and both are palindromes, so $M$
414: accepts a palindrome of length at most $2(n^2 - 1) + 1 = 2n^2 - 1$.
415: \end{proof}
416: 
417:      For a different proof of this corollary, see Rosaz \cite{Rosaz:2002}.
418: 
419:       We observe that the quadratic bound is tight, up to 
420: a multiplicative constant, in the case of alphabets with at
421: least two letters, and even for DFAs:
422: 
423: \begin{proposition}
424: For infinitely many $n$ there exists a DFA $M_n$
425: over a $2$-letter alphabet such that
426:     \begin{itemize}
427:     \item[(a)] $M_n$ has $n$ states;
428:     \item[(b)] the shortest palindrome accepted by $M_n$ is
429:     of length $\geq n^2/2 - 3n + 5$.
430:     \end{itemize}
431: \end{proposition}
432: 
433: \begin{proof}
434:      For $t \geq 2$,
435: consider the language $L_t = ({\tt a}^t)^+ {\tt b} ({\tt a}^{t-1})^+$.
436: This language evidently can be accepted by a DFA with $n = 2t+2$ states.
437: For a word $w \in L_t$ to be a palindrome, we must have
438: $w = {\tt a}^{c_1 t} {\tt b} {\tt a}^{c_2 (t-1)}$, for some
439: integers $c_1, c_2 \geq 1$, with $c_1 t = c_2 (t-1)$.  Since $t$ and
440: $t-1$ are relatively prime, we must have $t-1 \divides c_1$ and
441: $t \divides c_2$.  Thus the shortest palindrome in $L_t$ is
442: ${\tt a}^{t(t-1)} {\tt b} {\tt a}^{t(t-1)}$, which is of length
443: $2t^2 - 2t + 1 = n^2/2 - 3n + 5$.  
444: \end{proof}
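
\medskip\noindent
As a check of the arithmetic in a small case, take $t = 3$.  Then
$L_3 = ({\tt aaa})^+ {\tt b} ({\tt aa})^+$ and $n = 2t + 2 = 8$.  A word
${\tt a}^{3 c_1} {\tt b} {\tt a}^{2 c_2}$ is a palindrome exactly when
$3 c_1 = 2 c_2$, and the smallest solution with $c_1, c_2 \geq 1$ is
$c_1 = 2$, $c_2 = 3$.  The shortest palindrome in $L_3$ is therefore
${\tt a}^6 {\tt b} {\tt a}^6$, of length
$$ 13 = 2 \cdot 3^2 - 2 \cdot 3 + 1 = 8^2/2 - 3 \cdot 8 + 5. $$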
445: 
446: \section{Testing if an NFA accepts at least one non-palindrome}
447: \label{algpal}
448: 
449:     In this section we consider the problem of deciding if an 
450: NFA accepts at least one non-palindrome.  Evidently, if an NFA
451: fails to accept a non-palindrome, it must accept nothing but
452: palindromes, and so we discuss the complementary decision problem:
453: 
454: \medskip
455: 
456: \centerline{Given an NFA $M$, is $L(M)$ palindromic?}
457: 
458: \medskip
459: 
460:     Again, the problem is trivial for a unary alphabet, so we 
461: assume $|\Sigma| \geq 2$.
462: 
463: Horv\'ath, Karhum\"aki, and Kleijn
464: \cite{Horvath&Karhumaki&Kleijn:1987} proved that the
465: question is recursively solvable.
466: In particular, they proved the following theorem:
467: 
468: \begin{theorem}
469: $L(M)$ is palindromic if and only if $\lbrace x \in L(M) \
470: : \ |x| < 3n \rbrace$ is palindromic, where $n$ is the
471: number of states of $M$. \label{hkk}
472: \end{theorem}
473: 
474: While a naive implementation of Theorem~\ref{hkk} would
475: take exponential time, in this section we show how to
476: test palindromicity in polynomial time. We
477: also show the bound of $3n$ in
478: Theorem~\ref{hkk} is tight for NFAs, and we improve the bound for
479: DFAs.
480: 
481: First, we show how to construct a ``small'' NFA $M'_s$, for
482: some integer $s >1$, that has the following properties:
483: 
484: \begin{itemize}
485: \item[(a)] no word in $L(M'_s)$ is a palindrome;
486: 
487: \item[(b)] $M'_s$ accepts all non-palindromes of length $< s$  (in addition to some
488: other non-palindromes).
489: 
490: \end{itemize}
491:      The idea in this construction is the following:  on input
492: $w$ of length $r<s$, we ``guess'' an index $i$, $1 \leq i
493: \leq r/2$, such that $w[i] \not= w[r+1-i]$.  We then
494: ``verify'' that there is indeed a mismatch $i$ characters
495: from each end. We can re-use states, as illustrated in
496: Figure~\ref{fig:pred2} for the case $\Sigma = \lbrace {\tt
497: a,b,c} \rbrace$ and $s = 10$.
498: 
499: \begin{figure}[H]
500: \input nonpal.tex
501: \caption{Accepting non-palindromes over $\lbrace {\tt
502: a,b,c} \rbrace$ for $s = 10$.} \label{fig:pred2}
503: \end{figure}
504: 
505:       The resulting NFA $M'_s$ has
506: % $(|\Sigma| + 2)(\lfloor (t-1)/2 \rfloor)$
507: $O(|\Sigma| s)$ states
508: and
509: % $2 (\lfloor (t-1)/2 \rfloor) |\Sigma| (|\Sigma| + 1) - 2|\Sigma|$
510: $O(|\Sigma|^2 s)$ transitions.  A similar construction appears
511: in \cite{Shallit&Breitbart:1996}.
512: 
513: Given an NFA $M$ with $n$ states, we now construct the
514: cross-product with $M'_{3n}$, and obtain an NFA $A$ that
515: accepts $L(M) \ \cap \ L(M'_{3n})$. We claim that $L(A) =
516: \emptyset$ if and only if $L(M)$ is palindromic. For if
517: $L(A) = \emptyset$, then $M$ accepts no non-palindrome of
518: length $< 3n$, and so by Theorem~\ref{hkk}, $L(M)$ is
519: palindromic. If $L(A) \not= \emptyset$, then since
520: $L(M'_{3n})$ contains only non-palindromes, we see that
521: $L(M)$ is not palindromic.
522: 
523: We can determine if $L(A) = \emptyset$ efficiently by
524: adding a new final state $q_f$ and
525: $\epsilon$-transitions from all the final states of $A$
526: to $q_f$, then performing a depth-first search to detect
527: whether there are any paths from $q_0$ to $q_f$.  This can
528: be done in time linear in the number of states and
529: transitions of $A$.  If $M$ has $n$ states and $t$
530: transitions, then $A$ has $O(n^2)$ states and
531: $O(tn)$ transitions.   Hence we have proved the following theorem.
532: 
533: \begin{theorem}
534: Let $M$ be an NFA with $n$ states and $t$ transitions.
535: The algorithm sketched above determines whether
536: $M$ accepts a palindromic language in $O(n^2 + tn)$ time.
537: \label{thm2}
538: \end{theorem}
539: 
540:       A different method runs slightly slower, but allows us
541: to do a little more.    We can mimic the construction for palindromes
542: in Section~\ref{onepal}, but adapt it for non-palindromes.  Given
543: an NFA $M$, we construct an NFA-$\epsilon$ $M'$ that accepts the language
544: \begin{eqnarray*}
545: \lbrace x \in \Sigma^* &:& \text{there exists } x' \in \Sigma^*,
546: a \in \Sigma \text{ such that } |x| = |x'|, x \not= {x'}^R,\\
547: &&\text{ and } x x' \in L(M) \text{ or } x a x' \in L(M) \rbrace.
548: \end{eqnarray*}
549: The construction is similar to that in Lemma~\ref{pal-con}.  On input
550: $x$, we simulate $M$ on $x x'$ and $x a x'$ symbol-by-symbol, moving
551: forward from the start state and backward from a final state.
552: We need an additional boolean ``flag'' for each state to record whether or not
553: we have processed a character in $x'$ that would mismatch the corresponding
554: character in $x$.   If $M$ has $n$ states and $t$ transitions,
555: this construction produces an NFA-$\epsilon$ $M'$ with
556: $\leq 1+2n^2$ states and $O(t^2)$ transitions.  From this we get,
557: in analogy with Corollary~\ref{inf-pal}, the following proposition.
558: 
559: \begin{proposition}
560:      Given an NFA $M$ with $n$ states and $t$ transitions, we can determine in 
561: $O(n^2 + t^2)$ time if $M$ accepts infinitely many non-palindromes.
562: \end{proposition}
563: 
564: We now turn to the question of the optimality of the $3n$
565: bound given in Theorem~\ref{hkk}. For an NFA over an
566: alphabet of at least $2$ symbols, the bound is indeed
567: optimal, as the following example shows.
568: 
569: \begin{proposition}
570: Let $\Sigma$ be an alphabet of at least two symbols, containing the
571: letters $\tt a$ and $\tt b$.
572: For $n \geq 1$
573: define $L_n =  ({\tt a}^{n-1} \Sigma)^* {\tt a}^{n-1}$.
574: Then $L_n$ can be accepted by an NFA with $n$ states and a shortest
575: non-palindrome in $L_n$ is ${\tt a}^{n-1} {\tt a} {\tt a}^{n-1} {\tt b}
576: {\tt a}^{n-1}$.
577: \label{prope}
578: \end{proposition}
579: 
580: \begin{proof}
581: The details are straightforward.
582: \end{proof}
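
\medskip\noindent
For instance, take $n = 2$ and $\Sigma = \lbrace {\tt a}, {\tt b} \rbrace$,
so that $L_2 = ({\tt a} \Sigma)^* {\tt a}$.  An NFA with two states
$q_0$ (initial) and $q_1$ (final), with a transition from $q_0$ to $q_1$
on ${\tt a}$ and transitions from $q_1$ to $q_0$ on both ${\tt a}$ and
${\tt b}$, accepts $L_2$.  The words of $L_2$ of length at most $3$ are
${\tt a}$, ${\tt aaa}$, and ${\tt aba}$, all palindromes, while
${\tt aaaba} \in L_2$ has length $5 = 3n-1$ and is not a palindrome.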
583: 
584: For DFAs, however, the bound of $3n$ can be improved to
585: $3n-3$. To show this, we need some preliminary results. A
586: language $L$ is called {\it slender} if there is a constant
587: $C$ such that, for all $n \geq 0$, the number of words of
588: length $n$ in $L$ is less than $C$. The following
589: characterization of slender regular languages has been
590: independently rediscovered several times
591: \cite{Kunze&Shyr&Thierrin:1981,Shallit:1994,Paun&Salomaa:1995}.
592: 
593: \begin{theorem}
594: \label{slender}
595: Let $L \subseteq \Sigma^*$ be a regular language.  Then $L$ is slender
596: if and only if it can be written
597: as a finite union of languages of the form $u v^* w$, where
598: $u,v,w \in \Sigma^*$.
599: \end{theorem}
600: 
601: Next we prove the following useful lemma concerning DFAs accepting
602: slender languages.
603: 
604: \begin{lemma}
605: Let $L$ be a slender language accepted by a DFA $M$ with
606: $n$ states, over an alphabet of two or more symbols.  Then
607: $M$ must have a dead state. \label{dead-lemma}
608: \end{lemma}
609: 
610: \begin{proof}
611: Without loss of generality, assume that every state of $M =
612: (Q, \Sigma, \delta, q_0, F)$ is reachable from $q_0$, and
613: that $\Sigma$ contains the symbols $a$ and $b$. We distinguish two
614: cases:
615: \begin{enumerate}
616: \item $M$ accepts a finite language. Consider the states reached
617: from $q_0$ on $a$, $a^2$, $a^3, \ldots$.  Eventually some
618: state $q$ must be repeated.  This state $q$ must be a dead
619: state, for if not, $M$ would accept an infinite language.
620: 
621: \item $M$ accepts an infinite language. Then $M$ has at
622: least one {\em fruitful} cycle, that is, a cycle that
623: produces infinitely many words in $L(M)$ as labels of
624: paths starting at $q_0$, entering the cycle, going around
625: the cycle some number of times,  then exiting and
626: eventually reaching a final state. Let $C_1$ be one
627: fruitful cycle, and consider the following successful path
628: involving $C_1$: $q_0 {\buildrel\alpha\over\longrightarrow}
629: q {\buildrel u \over\longrightarrow} q {\buildrel \beta
630: \over\longrightarrow} f$, where $f\in F$ and the repetition
631: of $q$ denotes the cycle $C_1$, labeled with $u$. Without
632: loss of generality assume the first letter of $u$ is $a$.
633: Since $M$ is complete, denote $p=\delta(q, b)$.
634: 
635: We claim that from $p$ one cannot reach a fruitful cycle $C_2$.
636: Indeed, assume the contrary; this means that there exists
637: a successful path $q_0 {\buildrel\alpha\over\longrightarrow}
638: q {\buildrel u \over\longrightarrow} q {\buildrel \gamma
639: \over\longrightarrow} r {\buildrel v \over\longrightarrow}
640: r {\buildrel \mu \over\longrightarrow} f'$, where $f'\in F$, $\gamma$ begins with $b$,
641: and the repetition of $r$ denotes the cycle $C_2$ labeled
642: with $v$. Let $j$ be an arbitrary positive integer, and let $0 \leq i \leq j$.
643: There exist two positive integers $k, l$ such that
644: $k|u|=l|v|=m$. With this notation, observe that the
645: words $\alpha u^{k(j-i)}\gamma v^{l(j+i)}\mu$ are all
646: accepted by $M$ and have the same length $2mj + |\alpha\gamma\mu|$.
647: They are pairwise distinct, since $u$ begins with $a$ and $\gamma$ with $b$.
648: Since there are $j+1$ such words and $j$ is arbitrary, the number of words
649: of a given length in $L(M)$ is unbounded, a contradiction.
650: 
651: Thus, there are only finitely many successful paths
652: starting from $p$. However, considering the states reached
653: from $p$ by the words $a$, $a^2$, $a^3, \ldots$, one such
654: state must repeat. This state is dead, for the alternative would
655: contradict the finiteness of successful paths from $p$.
656: \end{enumerate}
657: \end{proof}
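
\medskip\noindent
As a simple illustration, let $\Sigma = \lbrace a, b \rbrace$ and
$L = a^*$, which is slender since it is of the form $u v^* w$ with
$u = w = \epsilon$ and $v = a$.  Any complete DFA accepting $L$ must
send the initial state, on input $b$, to a state from which no final
state is reachable, that is, to a dead state.  By contrast, $\Sigma^*$
is accepted by a one-state DFA with no dead state, but it contains $2^m$
words of each length $m$ and so is not slender.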
658: 
659: \begin{corollary}
660:     If $M$ is a DFA over an alphabet of at least two letters
661: and $L(M)$ is palindromic, then $M$ has a dead state.
662: \label{dead-cor}
663: \end{corollary}
664: 
665: \begin{proof}
666:     If $L(M)$ is palindromic, then by
667: \cite[Theorem 8]{Horvath&Karhumaki&Kleijn:1987}
668: it can be written as a finite union of languages of the form
669: $u v (tv)^* u^R$, where $u, v, t \in \Sigma^*$ and $v, t$ are
670: palindromes.  By Theorem~\ref{slender}, this means
671: $L(M)$ is slender.  By Lemma~\ref{dead-lemma}, $M$ has a dead state.
672: \end{proof}
673: 
674:    We are now ready to prove the improved bound of $3n-3$ for DFAs.
675: 
676: \begin{theorem}
677: Let $M$ be a DFA with $n$ states.  Then $L(M)$ is palindromic if and
678: only if $\lbrace x \in L(M) \ : \ |x| < 3n-3 \rbrace$ is palindromic.
679: \end{theorem}
680: 
681: \begin{proof}
682:       One direction is clear.
683: 
684:      If $M = (Q, \Sigma, \delta, q_0, F)$ is over a unary alphabet,
685: then $L(M)$ is always palindromic, so the criterion is trivially true.
686: 
687:     Otherwise $M$ is over an alphabet of at least two letters.
688: Assume  $\lbrace x \in L(M) \ : \ |x| < 3n-3 \rbrace$ is palindromic.  From
689: Corollary~\ref{dead-cor}, we see that $M$ must have a dead state.
690: But then we can delete such a dead state and all associated transitions,
691: and all states reachable from the deleted dead state, to get a new NFA $M'$
692: with at most $n-1$ states that accepts the same language.
Since $M'$ has at most $n-1$ states and $3(n-1) = 3n-3$,
we know from Theorem~\ref{hkk} that the palindromicity of
$\lbrace x \in L(M') \ : \ |x| < 3n-3 \rbrace$ implies that
$L(M') = L(M)$ is palindromic.
696: \end{proof}
697: 
698:    Finally, we observe that $3n-3$ is the best possible bound
699: in the case of DFAs.  To do so, we simply use the language $L_n$
700: from Proposition~\ref{prope} and observe it can be accepted by
701: a DFA with $n+1$ states; yet the shortest non-palindrome is of
702: size $3n-1$.
703: 
704: We end this section by noting that the related, but fundamentally
705: different, problem of testing if $L = L^R$ was shown by Hunt
706: \cite{Hunt:1973} to be PSPACE-complete.
707: 
708: \section{Testing if an NFA accepts a word matching a pattern}
709: \label{pow_test}
710: 
711: In this section we consider the computational complexity of testing
712: if an NFA accepts a word matching a given pattern.
713: Specifically, we consider the following decision problem.
714: 
715: \begin{quotation}
716: \noindent{\bf NFA PATTERN ACCEPTANCE}
717: 
718: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$ and a
719: pattern $p$ over some alphabet $\Delta$.
720: 
721: \noindent QUESTION: Does there exist $x \in \Sigma^+$ such that
722: $x \in L(M)$ and $x$ matches $p$?
723: \end{quotation}
724: 
725: Since the pattern $p$ is given as part of the input, this problem
726: is actually somewhat more general than the sort of problem
727: formulated as Question~1 of the introduction, where the language
728: $L$ was fixed.
729: 
730: We first consider the following result of Restivo and Salemi
731: \cite{Restivo&Salemi:2001} (a more detailed proof appears in
732: \cite{Castiglione&Restivo&Salemi:2004}).  We give here a boolean matrix
733: based proof (see Zhang \cite{Zhang:1999} for a study of this boolean matrix
734: approach to automata theory) that illustrates our general approach to
735: the other problems treated in this section.
736: 
737: \begin{theorem}[Restivo and Salemi]
738: \label{res_sal}
739: Let $L$ be a regular language and let $\Delta$ be an alphabet.
740: The set $P_\Delta$ of all non-empty patterns $p \in \Delta^*$
741: such that $p$ matches a word in $L$ is effectively regular.
742: \end{theorem}
743: 
744: \begin{proof}
745: Let $M = (Q,\Sigma,\delta,q_0,F)$ be an NFA such that $L(M) = L$.
746: Suppose that $Q = \{0,1,\ldots,n-1\}$.
747: For $a \in \Sigma$, let $B_a$ be the $n \times n$ boolean matrix whose
748: $(i,j)$ entry is $1$ if $j \in \delta(i,a)$ and $0$ otherwise.
749: Let $\mathcal{B}$ denote the semigroup generated by the $B_a$'s
750: along with the identity matrix.
751: For $w = w_0 w_1 \cdots w_s$, where $w_i \in \Sigma$ for $i = 0,\ldots,s$,
752: we write $B_w$ to denote the matrix product $B_{w_0} B_{w_1} \cdots B_{w_s}$.
753: 
754: Without loss of generality, let $\Delta = \{1,2,\ldots,k\}$.
755: Observe that there exists a non-empty pattern
756: $p = p_0 p_1 \cdots p_r$, where $p_i \in \Delta$ for $i = 0,\ldots,r$,
757: and a non-erasing morphism $h : \Delta^* \to \Sigma^*$ such that $h(p) \in L$
758: if and only if there exist $k$ boolean matrices
759: $B_1,\ldots,B_k \in \mathcal{B}$ such that $B_i = B_{h(i)}$ for
760: $i \in \Delta$ and $B = B_{p_0} B_{p_1} \cdots B_{p_r}$ describes
761: an accepting computation of $M$.
762: 
763: We construct an NFA $M' = (Q',\Delta,\delta',P,F')$ for $P_\Delta$
764: as follows.  For simplicity, we permit $M'$ to have multiple initial states,
765: as specified by the set $P$.  We define $Q' = \mathcal{B}^{k+1}$.
766: The set $P$ of initial states is given by $P = \mathcal{B}^k \times I$,
767: where $I$ denotes the identity matrix.  In other words, the NFA $M'$ uses the
768: first $k$ components of its state to record an initial guess of $k$ boolean
769: matrices $B_1,\ldots,B_k \in \mathcal{B}$.  Let $[B_1,\ldots,B_k,A]$
770: denote some arbitrary state of $M'$.  For $i=1,\ldots,k$, the
771: transition function $\delta'$ maps $[B_1,\ldots,B_k,A]$ to
772: $[B_1,\ldots,B_k,AB_i]$.  In other words, on input
773: $p = p_0 p_1 \cdots p_r\in \Delta^*$, $M'$ uses the last component of
774: its state to compute the product $B = B_{p_0} B_{p_1} \cdots B_{p_r}$.
775: The set $F'$ of final states of $M'$ consists of all states of the form
776: $[B_1,\ldots,B_k,B]$, where the matrix $B$ contains a $1$ in some entry
777: $(0,j)$, where $j \in F$.  In other words, $M'$ accepts if and only if
778: $B$ describes an accepting computation of $M$.
779: \end{proof}
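
To illustrate the matrices used in this proof, consider a small example
(the automaton is a toy choice of ours, introduced only for illustration).
Let $M$ have states $Q = \{0,1\}$, initial state $0$, $F = \{0\}$, and
transitions $\delta(0,a) = \{1\}$ and $\delta(1,b) = \{0\}$, so that
$L(M) = (ab)^*$.  Then
\[
B_a = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
B_b = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
B_{ab} = B_a B_b = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.
\]
Take $\Delta = \{1\}$ and the pattern $p = 11$.  With the guess $B_1 = B_{ab}$
(that is, $h(1) = ab$), the automaton $M'$ moves from $[B_1, I]$ to
$[B_1, B_1]$ and then to $[B_1, B_1 B_1]$, where
\[
B_1 B_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\]
has a $1$ in entry $(0,0)$, and $0 \in F$.  Hence $M'$ accepts $p$,
reflecting the fact that $h(p) = abab \in L(M)$.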
780: 
By considering unary patterns of the form $a^k$, we obtain the following
782: corollary of Theorem~\ref{res_sal}.
783: 
784: \begin{corollary}
785: Let $L \subseteq \Sigma^*$ be a regular language.  The set of exponents $k$
786: such that $L$ contains a $k$-power is the union of a finite set with a finite
787: union of arithmetic progressions.  Further, this set of exponents is
788: effectively computable.
789: \end{corollary}
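
For instance (the languages here are our own illustrative choices), let
$L = ab(abab)^* = \{ (ab)^{2m+1} : m \geq 0 \}$.  If $x^k = (ab)^{2m+1}$ for
a non-empty word $x$, then by Corollary~\ref{ls_cor} below $x$ and $ab$ are
powers of a common word, and since $ab$ is primitive we get $x = (ab)^j$
for some $j \geq 1$.  Thus
\[
x^k \in L \iff jk = 2m+1 \ \text{for some } m \geq 0 ,
\]
so the exponents $k \geq 2$ for which $L$ contains a $k$-power are exactly
the odd integers $3, 5, 7, \ldots$, a single arithmetic progression.
By contrast, for the finite language $\{ a^6 \}$ the exponents $k \geq 2$
are exactly $2$, $3$, and $6$, a finite set.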
790: 
791: Observe that Theorem~\ref{res_sal} implies the decidability
792: of the {\bf NFA PATTERN ACCEPTANCE} problem.  We prove the following
793: stronger result.
794: 
795: \begin{theorem}
796: \label{pattern}
797: The {\bf NFA PATTERN ACCEPTANCE} problem is PSPACE-complete.
798: \end{theorem}
799: 
800: \begin{proof}
801: We first show that the problem is in PSPACE.  By Savitch's theorem
802: \cite{Savitch:1970} it suffices to give an NPSPACE algorithm.
803: Let $M = (Q,\Sigma,\delta,q_0,F)$, where $Q = \{0,1,\ldots,n-1\}$.
804: For $a \in \Sigma$, let $B_a$ be the $n \times n$ boolean matrix whose
805: $(i,j)$ entry is $1$ if $j \in \delta(i,a)$ and $0$ otherwise.
806: Let $\mathcal{B}$ denote the semigroup generated by the $B_a$'s along with
807: the identity matrix.  For $w = w_0 w_1 \cdots w_s \in \Sigma^*$, we write
808: $B_w$ to denote the matrix product $B_{w_0} B_{w_1} \cdots B_{w_s}$.
809: 
Let $\Delta$ be the set of letters occurring in $p$.  We may suppose that
811: $\Delta = \{1,2,\ldots,k\}$.  First, we non-deterministically guess
812: $k$ boolean matrices $B_1, \ldots, B_k$.  Next, for each $i$, we
813: verify that $B_i$ is in the semigroup $\mathcal{B}$ by
814: non-deterministically guessing a word $w = w_0 w_1 \cdots w_s$
815: such that $B_i = B_w$.  Since there are at most
816: $2^{n^2}$ possible $n \times n$ boolean matrices, we may assume that
817: $s \leq 2^{n^2}$.  We thus guess $w$ symbol-by-symbol and compute a
sequence of matrices
\[
B_{w_0}, B_{w_0w_1}, \ldots, B_{w_0w_1 \cdots w_s},
\]
reusing space after performing each matrix multiplication.
823: We maintain an $O(n^2)$ bit counter to keep track of the
824: length $s$ of our guessed word $w$.  If $s$ exceeds $2^{n^2}$, we reject
825: on this branch of the non-deterministic computation.
826: 
827: Finally, if $p = p_0 p_1 \cdots p_r$, we compute the matrix product
828: $B = B_{p_0} B_{p_1} \cdots B_{p_r}$ and accept if and only if $B$
829: describes an accepting computation of $M$.
830: 
831: To show hardness we reduce from the following PSPACE-complete problem 
832: \cite[Problem AL6]{Garey&Johnson:1979}.
833: 
834: \begin{quotation}
835: \noindent{\bf DFA INTERSECTION}
836: 
837: \noindent INSTANCE: An integer $k \geq1$ and $k$ DFAs
838: $A_1,A_2,\ldots,A_k$, each over the alphabet $\Sigma$.
839: 
840: \noindent QUESTION: Does there exist $x \in \Sigma^*$ such that $x$
is accepted by each $A_i$, $1 \leq i \leq k$?
842: \end{quotation}
843: 
844: Let $\#$ be a symbol not in $\Sigma$.  We construct, in linear time, a
845: DFA $M$ to accept the language
846: $L(A_1) \,\#\, L(A_2) \,\#\, \cdots L(A_k) \,\#$.
847: Any word in $L(M)$ matching the pattern $a^k$ is of the form $(x\#)^k$.
848: It follows that $M$ accepts a word matching $a^k$ if and only if there
849: exists $x$ such that $x \in L(A_i)$ for $1 \leq i \leq k$.
850: This completes the reduction.
851: \end{proof}
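
As a small illustration of this reduction (the automata are hypothetical
and chosen only for concreteness), let $k = 2$, $\Sigma = \{0,1\}$,
$L(A_1) = 0^*1$, and $L(A_2) = (00)^*1$.  Then $M$ accepts
$0^*1 \,\#\, (00)^*1 \,\#$.  Every word of $L(M)$ contains exactly two
occurrences of $\#$ and ends with $\#$, so if it has the form $y^2$, then
$y = x\#$ with $x \in \Sigma^*$, and
\[
(x\#)^2 = x \,\#\, x \,\# \in L(M) \iff x \in L(A_1) \cap L(A_2) = (00)^*1 .
\]
For example, $001\#001\# \in L(M)$ is a square, whereas
$01\#01\# \not\in L(M)$ because $01 \not\in L(A_2)$.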
852: 
We may define several variations or special cases of the
854: {\bf NFA PATTERN ACCEPTANCE} problem, such as:
855: {\bf NFA ACCEPTS A $k$-POWER},
856: {\bf NFA ACCEPTS A $\geq k$-POWER},
857: {\bf NFA ACCEPTS INFINITELY MANY $k$-POWERS},
858: {\bf NFA ACCEPTS INFINITELY MANY $\geq k$-POWERS}, etc.
859: We define and consider the computational complexity of these variations
860: below.
861: 
862: \begin{quotation}
863: \noindent{\bf NFA ACCEPTS A $k$-POWER}.
864: 
865: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$ and an
866: integer $k \geq 2$.
867: 
868: \noindent QUESTION: Does there exist $x \in \Sigma^+$ such that
869: $M$ accepts $x^k$?
870: \end{quotation}
871: 
872: \begin{quotation}
873: \noindent{\bf NFA ACCEPTS A $\geq k$-POWER}.
874: 
875: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$.
876: 
877: \noindent QUESTION: Does there exist $x \in \Sigma^+$ and an integer
878: $\ell \geq k$ such that $M$ accepts $x^\ell$?
879: \end{quotation}
880: 
881:     The {\bf NFA ACCEPTS A $\geq k$-POWER} problem is actually an infinite
882: family of problems, each indexed by an integer $k \geq 2$.
883: If $k$ is fixed, the {\bf NFA ACCEPTS A $k$-POWER} problem can
884: be solved in polynomial time, as we now demonstrate.
885: 
886: \begin{proposition}\label{fixed-k}
887: Let $M$ be an NFA with $n$ states and $t$ transitions, and set
888: $N = n+t$, the size of $M$.
For any fixed integer $k \geq 2$, there is an algorithm running in
$O(n^{2k-1} t^k) = O(N^{3k-1})$ time
891: to determine if $M$ accepts a $k$-power.
892: \end{proposition}
893: 
894: \begin{proof}
895: For a language $L \subseteq \Sigma^*$, we define
896: $$L^{1/k} = \{ x \in \Sigma^* : x^k \in L \}.$$
897: 
898: Let $M = (Q, \Sigma, \delta, q_0, F)$ be an NFA with $n$ states.  
899: We will construct an NFA-$\epsilon$ $M'$ such that $L(M') = L(M)^{1/k}$.
900: To determine whether or not $M$ accepts a $k$-power, 
901: it suffices to check whether or not $M'$ accepts a non-empty word.
902: 
903: The idea behind the construction of $M'$ is as follows.
904: On input $x$, $M'$ first guesses $k-1$ states
905: $g_1, g_2, \ldots, g_{k-1} \in Q$ and then checks that
906: \begin{itemize}
907: \item
908:   $g_1 \in \delta(q_0,x)$,
909: \item
910:   $g_{i+1} \in \delta(g_i,x)$ for $i = 1,2,\ldots,k-2$, and
911: \item
912:   $\delta(g_{k-1},x) \cap F \neq \emptyset$.
913: \end{itemize}
914: It is clear that such states $g_1, g_2, \ldots, g_{k-1}$ exist
915: if and only if $x^k \in L(M)$.
916: 
917: Formally, the construction of $M'$ is as follows.
We define the NFA-$\epsilon$ $M' = (Q', \Sigma, \delta', q_0', F')$ such that:
919: 
920: \begin{itemize}
921: \item $Q' = \{ q_0' \} \cup Q^{2k-1}$.  That is, except for $q_0'$,
922: each state of $M'$ is a $(2k-1)$-tuple of the form
923: $[g_1, g_2, \ldots, g_{k-1}, p_0, p_1, \ldots, p_{k-1}]$.
924: The state $g_i$ represents the $i$-th state guessed from $M$.
925: The NFA $M'$ will simulate in parallel the computations of $M$ on input
926: $x$ starting from states $q_0, g_1, g_2, \ldots, g_{k-1}$ respectively.
927: The state $p_0$ represents the current state of the simulation beginning
928: from state $q_0$, and the states $p_1, p_2, \ldots, p_{k-1}$ represent the
929: current states of the simulations beginning from states
930: $g_1, g_2, \ldots, g_{k-1}$, respectively.
931: 
932: \item $q_0'$ is an additional state not in $Q^{2k-1}$.
933: This state will have outgoing $\epsilon$-transitions for each
934: different combination of guesses $g_i$.
935: The transition function on the start state is defined as
936: $$\delta'( q_0', \epsilon) = \{ [g_1, g_2, \ldots, g_{k-1},
937: q_0, g_1, g_2, \ldots, g_{k-1}] : 
938: \forall i \in \{1,2,\ldots,k-1\}, g_i \in Q \}.$$
939: 
940: \item We define the transition function $\delta'$ on all other states
941: as:
942: \begin{eqnarray*}
943: \lefteqn{\delta'([g_1, g_2, \ldots, g_{k-1}, p_0, p_1, \ldots, p_{k-1}], a)=}\\
944: & & \{[g_1, g_2, \ldots, g_{k-1}, p_0', p_1', \ldots, p_{k-1}'] :
945: \forall i \in \{0,1,\ldots,k-1\}, p_i' \in \delta (p_i, a)\}
946: \end{eqnarray*}
947: for all $a \in \Sigma$.
948: 
949: \item $F' = \{ [g_1, g_2, \ldots, g_{k-1}, g_1, g_2, \ldots, g_{k-1}, t]: t \in F \}$.
950: That is, we reach a state in $F'$ on input $x$ exactly when the guessed
951: states $g_i$ verify the conditions described above.
952: \end{itemize}
953: 
954: It should be clear from the construction that $M'$ accepts $L(M)^{1/k}$.
955: The number of states in $M'$ is $n^{2k-1} + 1$, as, except for $q_0'$,
956: each state is a $(2k-1)$-tuple in which each coordinate can take on
957: $|Q|=n$ possible values.  For each state there are at most $t^k$ distinct
transitions.  Testing whether or not $M'$ accepts a non-empty
959: word can be done in linear time (since the only $\epsilon$-transitions are
960: transitions outgoing from $q_0'$), so the running time of our algorithm is
961: $O(n^{2k-1} t^k)$.
962: \end{proof}
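
To make the construction concrete, here is a hand-worked instance for
$k = 2$ (the automaton is a toy example of ours, not part of the results above).
Let $M$ have states $Q = \{0,1\}$, $q_0 = 0$, $F = \{0\}$, and
transitions $\delta(0,a) = \{1\}$ and $\delta(1,b) = \{0\}$, so that
$L(M) = (ab)^*$.  The states of $M'$ other than $q_0'$ are triples
$[g_1, p_0, p_1]$, so $M'$ has $2^3 + 1 = 9$ states.
On input $x = ab$ with the guess $g_1 = 0$, the computation is
\[
q_0' \xrightarrow{\ \epsilon\ } [0,0,0] \xrightarrow{\ a\ } [0,1,1]
\xrightarrow{\ b\ } [0,0,0] .
\]
For $k = 2$ the final states are those of the form $[g_1, g_1, t]$ with
$t \in F$, so $[0,0,0] \in F'$ and $M'$ accepts $ab$, in agreement with
$(ab)^2 \in L(M)$.  With the other guess, $g_1 = 1$, the computation
starts in $[1,0,1]$ and blocks on the letter $a$, since
$\delta(1,a) = \emptyset$.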
963: 
964:     As before, we can use the same automaton to test if infinitely many
965: $k$-powers are accepted.
966: 
967: \begin{corollary}
968:      We can decide if an NFA $M$ with $n$ states and $t$ transitions
969: accepts infinitely many $k$-powers in $O(n^{2k-1} t^k)$ time.
970: \end{corollary}
971: 
972: If $k$ is not fixed, we have the following result, which is an immediate
973: consequence of Theorem~\ref{pattern} if $k$ is given in unary.  However,
974: the problem remains in PSPACE even if $k$ is given in binary, as we now
975: demonstrate.
976:      
977: \begin{theorem}
978: \label{kpow_alg}
979: The problem {\bf NFA ACCEPTS A $k$-POWER} is PSPACE-complete.
980: \end{theorem}
981: 
982: \begin{proof}
983: We first show that the problem is in PSPACE.  By Savitch's theorem
984: \cite{Savitch:1970} it suffices to give an NPSPACE algorithm.
985: Let $M = (Q,\Sigma,\delta,q_0,F)$, where $Q = \{0,1,\ldots,n-1\}$.
986: For $a \in \Sigma$, let $B_a$ be the $n \times n$ boolean matrix whose
987: $(i,j)$ entry is $1$ if $j \in \delta(i,a)$ and $0$ otherwise.
988: Let $\mathcal{B}$ denote the semigroup generated by the $B_a$'s.
989: 
990: We non-deterministically guess a boolean matrix $B$ and
verify that $B \in \mathcal{B}$ (i.e., $B = B_x$ for some $x \in \Sigma^+$),
992: as illustrated in the proof of Theorem~\ref{pattern}.
993: Finally, we compute $B_x^k$ efficiently by repeated squaring
994: and verify that $B_x^k$ contains a $1$ in  position $(q_0,f)$ for some
995: $f \in F$.
996: 
997: The proof for PSPACE-hardness is precisely that given in the proof
998: of Theorem~\ref{pattern}.
999: \end{proof}
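
The repeated-squaring step can be spelled out as follows (a routine
sketch; the exponent $13$ is only an illustrative choice).  Writing
$13 = 8 + 4 + 1$ in binary, we compute
\[
B^2 = B \cdot B, \qquad B^4 = B^2 \cdot B^2, \qquad B^8 = B^4 \cdot B^4,
\qquad B^{13} = B^8 \cdot B^4 \cdot B ,
\]
which uses three squarings and two further multiplications.
In general, $B^k$ can be obtained with at most
$2 \lfloor \log_2 k \rfloor$ boolean matrix multiplications, and only a
constant number of $n \times n$ matrices need to be stored at any time.
Hence this step runs in time and space polynomial in $n$ and in the number
of bits of $k$, which is why the binary encoding of $k$ causes no difficulty.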
1000: 
1001: \begin{theorem}
1002: \label{pow_alg}
1003: For each integer $k \geq 2$, the problem {\bf NFA ACCEPTS A $\geq k$-POWER}
1004: is PSPACE-complete.
1005: \end{theorem}
1006: 
1007: \begin{proof}
1008: To show that the problem is in PSPACE, we use the same algorithm
1009: as in the proof of Theorem~\ref{kpow_alg}, with the following modification.
1010: In order to verify that $M$ accepts an $\ell$-power for some $\ell \geq k$,
1011: we first observe that by the same argument as in the proof of
1012: Proposition~\ref{exponent_bd} below, if $M$ accepts such an $\ell$-power,
then $M$ accepts an $\ell$-power for some $\ell$ with $k \leq \ell < k+n$.
1014: Thus, after non-deterministically computing $B_x$, we must compute
1015: $B_x^\ell$ for all $k \leq \ell < k+n$, and verify that at least
1016: one $B_x^\ell$ contains a $1$ in position $(q_0,f)$ for some $f \in F$.
1017: 
1018: To show PSPACE-hardness, we again reduce from the {\bf DFA INTERSECTION}
1019: problem.  Suppose that we are given $r$ DFAs $A_1,A_2,\ldots,A_r$ and we wish
1020: to determine if the $A_i$'s accept a common word $x$.  We may suppose
1021: that $r \geq k$, since for any fixed $k$ such a restriction does not affect
1022: the PSPACE-completeness of the {\bf DFA INTERSECTION} problem.
1023: Let $j$ be the smallest non-negative integer such that $r+j$ is prime.
1024: By Bertrand's Postulate \cite[Theorem~418]{Hardy&Wright:1979},
1025: we may take $j \leq r$.  We now construct, in linear time,
1026: a DFA $M$ to accept the language
1027: $L(A_1) \,\#\, L(A_2) \,\#\, \cdots L(A_r) \,\# (\Sigma^* \,\#)^j$.
1028: The DFA $M$ accepts a $\geq k$-power if and only if it accepts an
1029: $(r+j)$-power.  Moreover, $M$ accepts an $(r+j)$-power if and only if
1030: there exists $x$ such that $x \in L(A_i)$ for $1 \leq i \leq r$.
1031: This completes the reduction.
1032: \end{proof}
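
For example (the numbers are chosen by us for illustration), if $r = 8$,
then $9$ and $10$ are composite and $11$ is prime, so $j = 3$ and $M$
accepts
$L(A_1) \,\#\, \cdots L(A_8) \,\# (\Sigma^* \,\#)^3$.
Every word in $L(M)$ contains exactly $11$ occurrences of $\#$ and ends
with $\#$.  Let $|y|_\#$ denote the number of occurrences of $\#$ in $y$.
If a word of $L(M)$ equals $y^\ell$ with $\ell \geq 2$, then
$\ell \cdot |y|_\# = 11$, and since $11$ is prime we get $\ell = 11$ and
$|y|_\# = 1$.  Thus $y = x\#$ with $x \in \Sigma^*$, and
\[
(x\#)^{11} \in L(M) \iff x \in \bigcap_{1 \leq i \leq 8} L(A_i) .
\]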
1033: 
1034: In a similar fashion, we now show that the following decision problems
1035: are PSPACE-complete:
1036: 
1037: \begin{quotation}
1038: \noindent{\bf NFA ACCEPTS INFINITELY MANY $k$-POWERS}.
1039: 
1040: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$ and an
1041: integer $k \geq 2$.
1042: 
1043: \noindent QUESTION: Does $M$ accept $x^k$ for infinitely many words $x$?
1044: \end{quotation}
1045: 
1046: \begin{quotation}
1047: \noindent{\bf NFA ACCEPTS INFINITELY MANY $\geq k$-POWERS}.
1048: 
1049: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$.
1050: 
1051: \noindent QUESTION: Are there infinitely many pairs $(x,i)$ such that
1052: $i \geq k$ and $M$ accepts $x^i$?
1053: \end{quotation}
1054: 
1055:       Again, the {\bf NFA ACCEPTS INFINITELY MANY $\geq k$-POWERS}
1056: problem is actually an infinite family of problems, each indexed by
1057: an integer $k \geq 2$.
1058: We will prove that these decision problems are PSPACE-complete
1059: by reducing from the following problem.
1060: 
1061: \begin{quotation}
1062: \noindent{\bf INFINITE CARDINALITY DFA INTERSECTION}.
1063: 
1064: \noindent INSTANCE: An integer $k \geq 1$ and  $k$ DFAs
1065: $A_1,A_2,\ldots,A_k$, each over the alphabet $\Sigma$.
1066: 
1067: \noindent QUESTION: Do there exist infinitely many
1068: $x \in \Sigma^*$ such that $x$
1069: is accepted by each $A_i$, $1 \leq i \leq k$?
1070: \end{quotation}
1071: 
1072: \begin{lemma}
1073:      The decision problem {\bf INFINITE CARDINALITY DFA INTERSECTION}
1074: is PSPACE-complete.
1075: \end{lemma}
1076: 
1077: \begin{proof}
1078:      First, let's see that the problem is in PSPACE.  If the largest
1079: DFA has $n$ states, then there is a DFA with at most $n^k$ states
1080: that accepts $\bigcap_{1 \leq i \leq k}  L(A_i)$.  Now from
1081: Theorem~\ref{hopcroft} (b), we know that there exist infinitely many
$x$ accepted by each $A_i$ if and only if there is a word $x$ of
length $\ell$, $n^k \leq \ell < 2n^k$, accepted by all the $A_i$.
1084: We can simply guess the symbols of $x$, ensuring with a counter that
1085: $n^k \leq |x| < 2n^k$, and checking by simulation that $x$ is accepted
1086: by all the $A_i$.  The counter uses at most $k \log n + \log 2$ bits,
1087: which is polynomial in the size of the input.  This shows the problem
1088: is in nondeterministic polynomial space, and hence, by Savitch's theorem
1089: \cite{Savitch:1970}, in PSPACE.
1090: 
1091:      Now, to see that {\bf INFINITE CARDINALITY DFA INTERSECTION}
1092: is PSPACE-hard, we reduce from {\bf DFA INTERSECTION}.  For each
1093: DFA $A_i = (Q_i, \Sigma, \delta_i, q_{0,i}, F_i)$,
1094: we modify it to $B_i$ as follows:  we add a new initial
1095: state $q'_{0,i}$, and add the same transitions from it as from $q_{0,i}$.
1096: We then change all final states to non-final, and we make $q'_{0,i}$
1097: final.  
We add a transition from each state that was
previously final to $q'_{0,i}$ on a new letter $\cent$ (the same letter is used for
each $A_i$), and a transition from all other states on $\cent$ to a new
dead state $d$.  Finally, we add transitions on all letters from $d$
1102: to itself.   We claim $B_i$ is a DFA and
1103: $L(B_i) = (L(A_i)\cent)^*$.   Furthermore,
1104: $\bigcap_{1 \leq i \leq k}  L(A_i) \not= \emptyset$ if and only if
1105: $\bigcap_{1 \leq i \leq k} L(B_i)$ is infinite.
1106: 
1107: Suppose $\bigcap_{1 \leq i \leq k}  L(A_i) \not= \emptyset$.  Then
there exists $x$ accepted by each of the $A_i$.  Then every word in
$(x\cent)^*$ is accepted by each of the $B_i$, so
1110: $\bigcap_{1 \leq i \leq k}  L(B_i) $ is infinite.
1111: 
1112: Now suppose $\bigcap_{1 \leq i \leq k}  L(B_i) $ is infinite.  Choose
1113: any nonempty 
1114: $x \in \bigcap_{1 \leq i \leq k}  L(B_i)  = \bigcap_{1 \leq i \leq k}
1115: (L(A_i)\cent)^*$.  Thus $x$ must be of the form $y_1 \cent y_2 \cent
1116: \cdots y_j \cent$
for some $j \geq 1$, where each $y_m$, $1 \leq m \leq j$, is accepted by all the $A_i$.
1118: Hence, in particular, $y_1$ is accepted by all the $A_i$, and so
1119: $\bigcap_{1 \leq i \leq k}  L(A_i) \not= \emptyset$.
1120: \end{proof}
1121: 
We are now ready to prove the following theorem.
1123: 
1124: \begin{theorem}  
1125: The decision problem {\bf NFA ACCEPTS INFINITELY MANY $k$-POWERS} is
1126: PSPACE-complete.
1127: \end{theorem}
1128: 
1129: \begin{proof}
1130:     First, let's see that the problem is in PSPACE.  We claim that
1131: an NFA $M$ with $n$ states accepts infinitely many $k$-powers if and only
1132: if it accepts a $k$-power $x^k$ with $2^{n^2} \leq |x| < 2^{n^2+1}$.
1133: 
1134: One direction is clear.  For the other direction,
1135: we use boolean matrices, as in the proof of Theorem~\ref{kpow_alg}.  
1136: We can construct a DFA $M' = (Q', \Sigma, \delta', q'_0, F')$
1137: of $2^{n^2}$ states that accepts $L^{1/k} =
1138: \lbrace x \in \Sigma^* \ : \ x^k \in L(M) \rbrace$, as follows:  the states are
1139: $n \times n$ boolean matrices.  The initial state $q'_0$ is the
1140: identity matrix. If $B_a$ is the boolean matrix with
a $1$ in entry $(i,j)$ if $q_j \in \delta(q_i,a)$ and $0$ otherwise,
1142: then $\delta'(B, a) = B B_a$.  The set of final states is
1143: $F' = \lbrace B \ : \ \text{the $(0,j)$ entry of $B^k$ is $1$ for some }
1144: q_j \in F \rbrace.$  
1145: 
1146: The idea of this construction is that  if $x = a_1 a_2 \cdots a_i$, then
$\delta'(q'_0, x) = B_{a_1} \cdots B_{a_i}$.    Now we use
1148: Theorem~\ref{hopcroft} (b) to conclude that $M'$ accepts infinitely
1149: many words if and only if it accepts a word $x$ with
1150: $2^{n^2} \leq |x| < 2^{n^2+1}$.  But $L(M') = L(M)^{1/k}$.
1151: 
1152: Thus, to check if $M$ accepts infinitely many $k$-powers, we simply
1153: guess the symbols of $x$, stopping when $2^{n^2} \leq |x| < 2^{n^2+1}$,
1154: and verify that $M$ accepts $x^k$.  We can do this by accumulating
$B_{a_1} \cdots B_{a_i}$ and raising the result to the $k$-th power, as before.
1156: We need $n^2+1$ bits to keep track of the counter, so the result is in
1157: NPSPACE, and hence in PSPACE.
1158: 
1159: Now we argue that {\bf NFA ACCEPTS INFINITELY MANY $k$-POWERS} is
1160: PSPACE-hard.  To do so, we reduce from 
1161: {\bf INFINITE CARDINALITY DFA INTERSECTION}.  Given DFAs
1162: $A_1, A_2, \ldots, A_k$, we can easily construct a DFA $A$ to accept
1163: $L(A_1) \# \cdots L(A_k) \#$.  Clearly $A$ accepts infinitely many
1164: $k$-powers if and only if $\bigcap_{1 \leq i \leq k} L(A_i)$ is infinite.
1165: \end{proof}
1166: 
1167: \begin{theorem}  
1168: For each integer $k \geq 2$, the problem
1169: {\bf NFA ACCEPTS INFINITELY MANY $\geq k$-POWERS} is PSPACE-complete.
1170: \end{theorem}
1171: 
1172: \begin{proof}
1173: Left to the reader.
1174: \end{proof}
1175: 
1176: 
1177: \section{Testing if an NFA accepts a non-$k$-power}
1178: \label{kp}
1179: 
1180: In the previous section we showed that it is computationally hard
1181: to test if an NFA accepts a $k$-power (when $k$ is not fixed).
1182: In this section we show how to test if an NFA accepts a
1183: non-$k$-power.  Again, we find it more congenial to discuss the
1184: opposite problem, which is whether an NFA accepts nothing but
1185: $k$-powers.
1186: 
1187: First, we need several classical
1188: results from the theory of combinatorics on words.  
1189: The following theorem is due to Lyndon and Sch\"utzenberger
1190: \cite{Lyndon&Schutzenberger:1962}.
1191: 
1192: \begin{theorem}
1193: \label{ls_eqn}
1194: If $x$, $y$, and $z$ are words satisfying an equation $x^i y^j = z^k$,
1195: where $i,j,k \geq 2$, then they are all powers of a common word.
1196: \end{theorem}
1197: 
1198: The next result is also due to Lyndon and Sch\"utzenberger.
1199: 
1200: \begin{theorem}
1201: \label{lyn_schu}
1202: Let $u$ and $v$ be non-empty words.  If $uv = vu$, then there exists
1203: a word $x$ and integers $i,j \geq 1$, such that $u = x^i$ and $v = x^j$.
1204: In other words, $u$ and $v$ are powers of a common word.
1205: \end{theorem}
1206: 
1207: The following result can be derived from Theorem~\ref{lyn_schu}.
1208: 
1209: \begin{corollary}
1210: \label{ls_cor}
1211: Let $u$ and $v$ be non-empty words.  If $u^r = v^s$ for some $r,s \geq 1$,
1212: then $u$ and $v$ are powers of a common word.
1213: \end{corollary} 
1214: 
1215: Ito, Katsura, Shyr, and Yu \cite{Ito&Katsura&Shyr&Yu:1988}
1216: gave a proof of the next proposition.
1217: 
1218: \begin{proposition}
1219: \label{shyr}
1220: Let $u$ and $v$ be non-empty words.  If $u$ and $v$ are not powers of
1221: a common word, then for any integers $r,s \geq 1$, $r \neq s$,
1222: at least one of $u^rv$ or $u^sv$ is primitive.
1223: \end{proposition}
1224: 
1225: The next result is due to Shyr and Yu \cite{Shyr&Yu:1994}.
1226: 
1227: \begin{theorem}
1228: \label{p+q+}
1229: Let $p$ and $q$ be primitive words, $p \neq q$.  The set $p^+q^+$
1230: contains at most one non-primitive word.
1231: \end{theorem}
1232: 
1233: 
1234: Next we prove the following analogue
1235: of Theorem~\ref{hkk}, from which we will derive an
1236: efficient algorithm for testing if a finite automaton
1237: accepts only $k$-powers.
1238: 
1239: \begin{theorem}
1240: \label{k-pow}
1241: Let $L$ be accepted by an $n$-state NFA $M$ and let $k \geq 2$ be an integer.
1242: \begin{enumerate}
1243: \item Every word in $L$ is a $k$-power if and only if every word in the set
1244: $\lbrace x \in L : |x| \leq 3n \rbrace$ is a $k$-power.
1245: \item All but finitely many words in $L$ are $k$-powers if and only if
1246: every word in the set $\lbrace x \in L : n \leq |x| \leq 3n \rbrace$
1247: is a $k$-power.
1248: \end{enumerate}
1249: Further, if $M$ is a DFA over an alphabet of size $\geq 2$, then the bound $3n$
1250: may be replaced by $3n-3$.
1251: \end{theorem}
1252: 
1253: Ito, Katsura, Shyr, and Yu \cite{Ito&Katsura&Shyr&Yu:1988}
1254: proved a similar result for primitive words: namely, that
1255: if $L$ is accepted by an $n$-state DFA over an alphabet of
1256: two or more letters and contains a
1257: primitive word, then it contains a primitive word of length
1258: $\leq 3n-3$. In other words, every word in $L$ is a power if and only if
1259: every word in the set $\lbrace x \in L : |x| \leq 3n-3 \rbrace$ is a power.
1260: However, this result does not imply
1261: Theorem~\ref{k-pow}, as one can easily construct a regular
1262: language $L$ where every word in $L$ that is not a
1263: $k$-power is nevertheless non-primitive:  for example, $L = \lbrace a^{k+1}
1264: \rbrace$.  
1265: 
1266: We shall use the next result to characterize those regular languages
1267: consisting only of $k$-powers.
1268: 
1269: \begin{proposition}
1270: \label{prim} Let $u$, $v$, and $w$ be words, $v \neq
1271: \epsilon$, $uw \neq \epsilon$, and let $f,g \geq 1$ be integers, $f \neq g$.
1272: If $uv^fw$ and $uv^gw$ are non-primitive, then $uv^nw$ is
1273: non-primitive for all integers $n \geq 1$. Further, if
1274: $uvw$ and $uv^2w$ are $k$-powers for some integer $k \geq
1275: 2$, then $v$ and $uv^nw$ are $k$-powers for all integers $n
1276: \geq 1$.
1277: \end{proposition}
1278: 
1279: \begin{proof}
1280: Suppose $uv^fw$ and $uv^gw$ are non-primitive.  Then $v^fwu$ and
1281: $v^gwu$ are non-primitive. Let $x$ and $y$ be the primitive roots of $v$ and
1282: $wu$, respectively, so that $v = x^i$ and $wu = y^j$ for some integers
1283: $i,j \geq 1$.  If $x \neq y$, then by Proposition~\ref{shyr}, one concludes
1284: that at least one of $v^fwu$ or $v^gwu$ is primitive, a contradiction.
1285: 
1286: If $x = y$, then for all integers $n \geq 1$, $v^nwu = x^{ni+j}$ is clearly
1287: non-primitive, and consequently, $uv^nw$ is non-primitive, as required.
1288: Let us now suppose that $uvw$ and $uv^2w$ are $k$-powers for some $k \geq 2$.
1289: Then $vwu = x^{i+j}$ and $v^2wu = x^{2i+j}$ are both $k$-powers as well.
1290: We claim that the following must hold:
1291: \begin{eqnarray*}
1292: i + j & \equiv & 0 \pmod k \\
1293: 2i + j & \equiv & 0 \pmod k.
1294: \end{eqnarray*}
1295: To see this, write $vwu = z^k$ for some word $z$.  Then $z^k = x^{i+j}$,
1296: so by Corollary~\ref{ls_cor} $z$ and $x$ are powers of a common word.
1297: Since $x$ is primitive it follows that $z$ is a power of $x$.
1298: In particular, $|x|$ divides $|z|$ and $i + j$ is a multiple of $k$,
1299: as claimed.  A similar argument applies to $v^2wu$.
1300: 
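Subtracting the first congruence from the second gives $i \equiv 0 \pmod k$,
and then the first congruence gives $j \equiv 0 \pmod k$.
We conclude that $i \equiv j \equiv 0 \pmod k$,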
1302: and hence, $v = x^i$ is a $k$-power.  Moreover, $v^nwu = x^{ni+j}$ is also a
1303: $k$-power for all integers $n \geq 1$, and consequently, $uv^nw$ is a
1304: $k$-power, as required.
1305: \end{proof}
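
As a sanity check on the second claim (with words chosen by us for
illustration), take $k = 2$, $x = ab$, $u = w = ab$, and $v = (ab)^2$,
so that $i = 2$ and $wu = (ab)^2$ gives $j = 2$.  Then
\[
uvw = (ab)^4 = \bigl((ab)^2\bigr)^2, \qquad
uv^2w = (ab)^6 = \bigl((ab)^3\bigr)^2 ,
\]
and indeed $i + j = 4 \equiv 0 \pmod 2$ and $2i + j = 6 \equiv 0 \pmod 2$.
As the proposition predicts, $v$ is a $2$-power and
$uv^nw = (ab)^{2n+2} = \bigl((ab)^{n+1}\bigr)^2$ for all $n \geq 1$.
Both hypotheses are needed: with $u = \epsilon$ and $v = w = ab$ we have
$i = j = 1$, and $uvw = (ab)^2$ is a $2$-power, but $uv^2w = (ab)^3$ is not,
since $2i + j = 3 \not\equiv 0 \pmod 2$.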
1306: 
1307: The characterization due to Ito et al.
1308: \cite[Proposition~10]{Ito&Katsura&Shyr&Yu:1988} (see also D\"om\"osi,
1309: Horv\'ath, and Ito \cite[Theorem~3]{Domosi&Horvath&Ito:2004})
1310: of the regular languages consisting only of powers,
1311: along with Theorem~\ref{slender}, implies
1312: that any such language is slender.  A simple application of the
1313: Myhill--Nerode Theorem gives the following weaker result.
1314: 
1315: \begin{proposition}
1316: \label{my-ner}
1317: Let $L$ be a regular language and let $k \geq 2$ be an integer.  If
1318: all but finitely many words of $L$ are $k$-powers, then $L$ is slender.
1319: In particular, if $L$ is accepted by an $n$-state DFA and all words in $L$ of
1320: length $\geq \ell$ are $k$-powers, then for all $r \geq \ell$,
1321: the number of words in $L$ of length $r$ is at most $n$.
1322: \end{proposition}
1323: 
1324: \begin{proof}
1325: Let $x^k$ and $y^k$ be distinct words in $L$ of length $r \geq \ell$.
1326: Then $x$ and $y$ are inequivalent with respect to the Myhill--Nerode
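equivalence relation, since $y^k \in L$ but $xy^{k-1} \not\in L$
(the word $xy^{k-1}$ has length $r \geq \ell$ and, since $|x| = |y|$ and
$x \neq y$, it is not a $k$-power).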
1328: The Myhill--Nerode equivalence relation on $L$ thus has index at least
1329: as large as the number of distinct words of length $r$ in $L$.  Since
1330: the index of the Myhill--Nerode relation is at most $n$, it follows that
1331: there is a bounded number of words of length $r$ in $L$, so that $L$
1332: is slender, as required.
1333: \end{proof}
1334: 
1335: The following characterization is analogous to the characterization
1336: of palindromic regular languages given in
1337: \cite[Theorem~8]{Horvath&Karhumaki&Kleijn:1987}.
1338: 
1339: \begin{theorem}
1340: Let $L \subseteq \Sigma^*$ be a regular language and let $k \geq 2$ be
1341: an integer.  The language $L$ consists only of $k$-powers if and only if
1342: it can be written as a finite union of languages of the form
1343: $uv^*w$, where $u,v,w \in \Sigma^*$ satisfy the following:
1344: there exists a primitive word $x \in \Sigma^*$ and integers $i,j \geq 0$
1345: such that $v = x^{ik}$ and $wu = x^{jk}$.
1346: \end{theorem}
1347: 
1348: \begin{proof}
1349: The ``if'' direction is clear; we prove the ``only if'' direction.
1350: Let $L$ consist only of $k$-powers.  Then by Proposition~\ref{my-ner},
1351: $L$ is slender.  By Theorem~\ref{slender}, $L$ can be written
1352: as a finite union of languages of the form $uv^*w$.  By examining the proof
1353: of Proposition~\ref{prim}, one concludes that $u$, $v$, and $w$ have the
1354: desired properties.
1355: \end{proof}
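
As an illustration of this characterization (the specific words below are
chosen here for concreteness and are not taken from the cited sources), let
$k = 2$, $x = ab$, $u = b$, $v = (ab)^2$, and $w = (ab)^3 a$, so that
$v = x^{2}$ and $wu = (ab)^4 = x^{4}$.  Then, for every $m \geq 0$,
\[
u v^m w = b\,(ab)^{2m+3}\,a = (ba)^{2m+4} = \left( (ba)^{m+2} \right)^2 ,
\]
so every word of $uv^*w$ is a square, even though neither $u$ nor $w$ is
itself a power of $x$.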
1356: 
1357: We shall need the following lemma for the proof of Theorem~\ref{k-pow}.
1358: 
1359: \begin{lemma}\label{inf_many}
1360: Let $L$ be a regular language accepted by an $n$-state NFA $M$ and let
1361: $k \geq 2$ be an integer.  If $L$ contains a non-$k$-power of length
1362: $\geq n$, then $L$ contains infinitely many non-$k$-powers.
1363: \end{lemma}
1364: 
1365: \begin{proof}
1366: Let $s \in L$ be a non-$k$-power such that $|s| \geq n$.  Consider
1367: an accepting computation of $M$ on $s$.  Such a computation must contain
1368: at least one repeated state.  It follows that there exists a decomposition
1369: $s = uvw$, $v \neq \epsilon$, such that $uv^*w \subseteq L$.
1370: Let $x$ be the primitive root of $v$, so that $v = x^i$ for some positive
1371: integer $i$.
1372: 
1373: Suppose that $wu = \epsilon$.  Since $s = v = x^i$ is not a $k$-power,
1374: it follows that $i \not\equiv 0 \pmod k$.  Moreover, there exist
1375: infinitely many positive integers $\ell$ such that
1376: $\ell i \not\equiv 0 \pmod k$, and so by Corollary~\ref{ls_cor}, there
1377: exist infinitely many
1378: words of the form $v^\ell = x^{\ell i}$ that are
1379: non-$k$-powers in $L$, as required.
1380: 
1381: Suppose then that $wu \neq \epsilon$.
1382: Let $y$ be the primitive root of $wu$, so that $wu = y^j$ for some positive
1383: integer $j$.  We have two cases.
1384: 
1385: Case 1: $x = y$.  Since $uvw$ is not a $k$-power, $vwu$ is also not
1386: a $k$-power, and thus we have $i + j \not\equiv 0 \pmod k$.
1387: Moreover, there are infinitely many
1388: positive integers $\ell$ such that $\ell i + j \not\equiv 0 \pmod k$.
1389: For all such $\ell$, the word $v^\ell wu = x^{\ell i + j}$ is not
1390: a $k$-power, and hence the word $uv^\ell w$ is a non-$k$-power in $L$.
1391: We thus have infinitely many non-$k$-powers in $L$, as required.
1392: 
1393: Case 2: $x \neq y$.  By Theorem~\ref{p+q+}, $v^*wu$ contains
1394: infinitely many primitive words.  Thus, $uv^*w$ contains infinitely
1395: many non-$k$-powers, as required.
1396: \end{proof}
1397: 
1398: We are now ready to prove Theorem~\ref{k-pow}.
1399: 
1400: \begin{proof}[Proof of Theorem~\ref{k-pow}]
1401: The proof is similar to that of \cite[Proposition~7]{Ito&Katsura&Shyr&Yu:1988}.
1402: It suffices to prove statement (2) of the theorem, since statement (1)
1403: follows immediately from (2) and Lemma~\ref{inf_many}.
1404: 
1405: Suppose that $L$ contains infinitely many non-$k$-powers.  Then
1406: $L$ contains a non-$k$-power $s$ with $|s| \geq n$.  Suppose, contrary to
1407: statement (2), that a shortest such $s$ has $|s| > 3n$.
1408: Then any computation of $M$ on
1409: $s$ must repeat some state at least $4$ times.  It follows that
1410: there exists a decomposition $s = u v_1 v_2 v_3 w$,
1411: $v_1,v_2,v_3 \neq \epsilon$, such that $u v_1^* v_2^* v_3^* w \subseteq L$.
1412: We may assume further that $|v_1v_2v_3| \leq 3n$, so that $wu \neq \epsilon$.
1413: 
1414: Let $p_1$, $p_2$, $p_3$, and $q$ be the primitive roots of
1415: $v_1$, $v_2$, $v_3$, and $wu$, respectively.
1416: Let $v_1 = p_1^{i_1}$, $v_2 = p_2^{i_2}$, $v_3 = p_3^{i_3}$, and $wu = q^j$,
1417: for some integers $i_1,i_2,i_3,j>0$.  We consider three cases.
1418: 
1419: Case~1: $p_1 = p_2 = p_3 = q$.
1420: Without loss of generality, suppose that $|v_1| \leq |v_2| \leq |v_3|$.
1421: Since $|s| > 3n$, we must have $|uv_3w| \geq n$, and thus
1422: $|uv_1v_3w| \geq n$ and $|uv_2v_3w| \geq n$.  By assumption, the words
1423: $v_3wu = q^{i_3+j}$, $v_1v_3wu = q^{i_1+i_3+j}$, and
1424: $v_2v_3wu = q^{i_2+i_3+j}$ are $k$-powers, whereas the word
1425: $v_1v_2v_3wu = q^{i_1+i_2+i_3+j}$ is not.  Applying Corollary~\ref{ls_cor},
1426: we deduce that the following system of equations
1427: \begin{eqnarray*}
1428: i_1 + i_2 + i_3 + j & \not\equiv & 0 \pmod k \\
1429: i_3 + j & \equiv & 0 \pmod k \\
1430: i_1 + i_3 + j & \equiv & 0 \pmod k \\
1431: i_2 + i_3 + j & \equiv & 0 \pmod k
1432: \end{eqnarray*}
1433: must be satisfied.  However, it is easy to see that this is impossible.
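
One way to verify this, spelled out here for convenience: subtracting the
second congruence from the third, and from the fourth, gives
\[
i_1 \equiv 0 \pmod k \quad \text{and} \quad i_2 \equiv 0 \pmod k ,
\]
whence
\[
i_1 + i_2 + i_3 + j \equiv i_3 + j \equiv 0 \pmod k ,
\]
which contradicts the first condition of the system.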
1434: 
1435: Case~2: $p_1 \neq q$ and $p_2 = p_3 = q$.  If $|v_1wu| \leq n$,
1436: then let $\ell$ be the smallest positive integer
1437: such that $n \leq |v_1^\ell wu| < |v_1^{\ell+1} wu| \leq |s|$.  Then
1438: by Proposition~\ref{shyr}, one of the words $v_1^\ell wu$ or
1439: $v_1^{\ell+1} wu$ is primitive.  Hence,
1440: at least one of the words $u v_1^\ell w$ or $u v_1^{\ell+1} w$
1441: is a primitive word in $L$, contradicting the minimality of $s$.
1442: 
1443: If, instead, $|v_1wu| > n$, then we have $n < |v_1wu| < |v_1v_2wu| \leq |s|$.
1444: Again, by Proposition~\ref{shyr}, one of the words $v_1wu$ or
1445: $v_1v_2wu$ is primitive.  Hence, at least one of the words
1446: $uv_1w$ or $uv_1v_2w$ is a primitive word in $L$, contradicting
1447: the minimality of $s$.
1448: 
1449: Case~3: $p_1 \neq q$ and $p_2 \neq q$.  In this case we choose the
1450: smaller of $v_1$ and $v_2$ to ``pump'', so without loss of generality,
1451: suppose $|v_1| \leq |v_2|$.  Let $\ell$ be the smallest positive integer
1452: such that $n \leq |v_1^\ell wu| < |v_1^{\ell+1} wu| \leq |s|$.  Note
1453: that $|v_1^2 wu| \leq |v_1v_2wu| < |s|$, so such an $\ell$ must exist.
1454: Then by Proposition~\ref{shyr}, one of the words $v_1^\ell wu$ or
1455: $v_1^{\ell+1} wu$ is primitive.  Hence,
1456: at least one of the words $u v_1^\ell w$ or $u v_1^{\ell+1} w$
1457: is a primitive word in $L$, contradicting the minimality of $s$.
1458: 
1459: All remaining possibilities are symmetric to the cases considered above.
1460: Since in all cases we derive a contradiction, it follows that if $L$
1461: contains infinitely many non-$k$-powers, it contains a non-$k$-power
1462: $s$, where $n \leq |s| \leq 3n$.
1463: 
1464: It remains to consider the situation where $M$ is a DFA over an alphabet
1465: of size $\geq 2$.  Let $a \neq b$ be alphabet symbols of $M$.  If $M$ does
1466: not have a dead state, then for every integer $i \geq n-1$,
1467: there exists a word $x$, $|x| \leq n-1$, such
1468: that $a^ibx \in L$.  These words $a^ibx$ are all distinct and primitive.
1469: Thus, whenever $M$ has no dead state, $M$ always accepts infinitely many
1470: non-$k$-powers, and, in particular, $M$ accepts a non-$k$-power $s$,
1471: where $n \leq |s| \leq 2n-1$.
1472: 
1473: If, on the other hand, $M$ does have a dead state,
1474: then we may delete this dead state and apply
1475: the earlier argument with the bound $3n-3$ in place of $3n$.
1476: 
1477: Finally, the converse of statement (2) follows immediately from
1478: Lemma~\ref{inf_many}.
1479: \end{proof}
1480: 
1481: We can now deduce the following algorithmic result.
1482: 
1483: \begin{theorem}\label{alg_allkpow}
1484: Let $k \geq 2$ be an integer.  Given an NFA $M$ with $n$ states and
1485: $t$ transitions, it is
1486: possible to determine if every word in $L(M)$ is a $k$-power in
1487: $O(n^3 + t n^2)$ time.
1488: \end{theorem}
1489: 
1490: \begin{proof}
1491:      The proof is exactly analogous to that of Theorem~\ref{thm2}, and
1492: we only indicate what needs to be changed.  Recall that $M$ has $n$ states.
1493: We create an NFA, $M'_r$, for $r = 3n$, such that
1494: no word in $L(M'_r)$ is a $k$-power, and $M'_r$ accepts all non-$k$-powers
1495: of length $\leq r$ (and perhaps some other non-$k$-powers).
1496: 
1497: Note that we may assume that $k \leq r$.  If $k > r$, then no word of
1498: length $\leq r$ is a $k$-power.  In this case, to obtain the desired
1499: answer it suffices to test if the set $\{ x \in L(M) : |x| \leq r \}$
1500: is empty.  However, this set is empty if and only if $L(M)$ is empty, and
1501: this is easily verified in linear time.
1502: 
1503: We now form a new NFA $A$ as
1504: the cross product of $M'_r$ with $M$.  From Theorem~\ref{k-pow}, it follows
1505: that $L(A) = \emptyset$ iff
1506: every word in $L(M)$ is a $k$-power.  We can determine if
1507: $L(A) = \emptyset$ by
1508: checking (using depth-first search)
1509: whether any final states of $A$ are reachable from the start state.
1510: 
1511:      It remains to see how $M'_r$ is constructed.
1512: If the length of a word $x$ accepted by $M_r'$ is a multiple of $k$, 
1513: $x$ can be partitioned into $k$ sections of equal length.  In order 
1514: for $M_r'$ to accept $x$, the NFA must `verify' a symbol mismatch between 
1515: two symbols found in different sections but in the same position.
1516: 
1517: If $x$ is a non-$k$-power, then a symbol mismatch will occur between two 
1518: sections of $x$, call them $s_i$ and $s_j$.  This means that $s_i$ and 
1519: $s_j$ differ in at least one position.  Comparing $s_i$ and $s_j$ to 
1520: $s_1$, the first section of $x$, we notice that at least one of $s_i$ or 
1521: $s_j$ must have a symbol mismatch with $s_1$ (otherwise $s_1=s_i=s_j$, 
1522: which would give a contradiction).  Therefore, when checking $x$ for a 
1523: symbol mismatch, it is sufficient to only check $s_1$ against each of the 
1524: remaining $k-1$ sections, as opposed to checking all $k \choose 
1525: 2$ possibilities.
1526: 
1527: In order to construct $M_r'$, we create a series of `lobes', each of which 
1528: is connected to the start state by an $\epsilon$-transition.  Each lobe 
1529: represents three simultaneous `guesses' made by the NFA, which are:
1530: 
1531: \begin{itemize}
1532: \item Which alphabet symbols will conflict and in which order.  The number 
1533: of possible conflict pairs is $| \Sigma | \left( |\Sigma| - 1 \right)$.
1534: 
1535: \item The section in which there will be a symbol mismatch with the first 
1536: section.  There are $k-1$ possible sections.
1537: 
1538: \item The position in which the conflict will occur.  In the worst case 
1539: when the length of the input is $r$, there will be at most
1540: $r/k$ possible positions.
1541: \end{itemize}
1542: 
1543: This gives a total of at most $|\Sigma|\left( |\Sigma| - 1 \right) \cdot (k-1) 
1544: \cdot  r/k $ lobes.  The construction of each lobe is
1545: illustrated in Figure~\ref{fig:module}.
1546: 
1547: \begin{figure}[hbt]%55
1548: \input module.tex
1549: \caption{One lobe of the NFA for $k=3$, $r=12$ and
1550: $0,1$ conflicting symbols.}
1551: \label{fig:module}
1552: \end{figure}
1553: 
1554: Each lobe contains at most $r+1$ states.
1555: In addition to these lobes, we also require a $k$-state submachine to
1556: accept all words whose lengths are not a multiple of $k$.
1557: 
1558: In total, $M_r'$ has at most
1559: $$|\Sigma| \left( |\Sigma| - 1 \right) \cdot (k-1) \cdot 
1560: {r \over k}  \cdot (r+1) + k + 1 \in O(r^2)$$ states
1561: (since $k \leq r$), and similarly, $O(r^2)$ 
1562: transitions.
1563: After constructing the cross-product, this gives an $O(n^3 + tn^2)$ 
1564: bound on the time required to determine if every word in $L(M)$ is a 
1565: $k$-power.
1566: \end{proof}
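
To make the counting in this proof concrete, here is a small worked instance;
the choice $|\Sigma| = 2$ is an assumption made here, while $k = 3$ and
$r = 12$ match Figure~\ref{fig:module}.  The number of lobes is at most
\[
|\Sigma| \left( |\Sigma| - 1 \right) \cdot (k-1) \cdot {r \over k}
= 2 \cdot 1 \cdot 2 \cdot 4 = 16 ,
\]
and each lobe has at most $r + 1 = 13$ states, so $M'_r$ has at most
$16 \cdot 13 + k + 1 = 212$ states.  In general, since $r \in O(n)$ by the
$3n$ bound of Theorem~\ref{k-pow}, the automaton $M'_r$ has $O(n^2)$ states
and transitions.  The cross product $A$ of $M'_r$ with an $n$-state,
$t$-transition NFA $M$ therefore has $O(n \cdot n^2)$ states and
$O(t \cdot n^2)$ transitions, and a depth-first search on $A$ runs in time
linear in its size, which accounts for the $O(n^3 + tn^2)$ bound.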
1567: 
1568:      Theorem~\ref{k-pow} suggests the following question:  if $M$ is an
1569: NFA with $n$ states that accepts at least one non-$k$-power, how long
1570: can a shortest non-$k$-power be?   Theorem~\ref{k-pow} proves an
1571: upper bound of $3n$.   A lower bound of $2n-1$ for infinitely many
1572: $n$ follows easily from the obvious $(n+1)$-state NFA accepting 
1573: ${\tt a}^{n} ({\tt a}^{n+1})^*$, where $n$ is divisible by $k$.  
1574: However, Ito, Katsura, Shyr, and Yu \cite{Ito&Katsura&Shyr&Yu:1988}
1575: gave a very interesting example that improves this lower bound:
1576: if $x = ( (ab)^n a)^2$ and $y = ba x ab$,  then $x$ and $xyx$ are
1577: squares, but $xyxyx$ is not a power.  Hence, the obvious $(8n+8)$-state NFA
1578: that accepts $x(yx)^*$ has the property that the shortest non-$k$-power
1579: accepted is of length $20n+18$.  This improves the lower bound  for
1580: infinitely many $n$.
1581: 
1582:        We now generalize their lower bound.
1583: 
1584: \begin{proposition}
1585:        Let $k \geq 2$ be fixed.  There exist infinitely many NFAs $M$
1586: with the property that if $M$ has $r$ states, then the shortest
1587: non-$k$-power accepted is of length $\left(2+ {1 \over{2k-2}}\right) r - O(1)$.
1588: \end{proposition}
1589: 
1590: \begin{proof}
1591: Let $u = (ab)^n a$, $x = u^k$, and $ y = x^{-1} (x ba u^{-1} x)^k x^{-1}$.
1592: Thus $xyx = (x ba u^{-1} x)^k$.
1593: Hence $x$ and $xyx$ are both $k$-powers.
1594: 
1595: However, $xyxyx$ is not a $k$-power.  To see this,
1596: assume it is, and write $xyxyx = g_1 g_2 \cdots g_k$.
1597: Look at the character in position $2kn-2n+k$ (indexing beginning with 1)
1598: in $g_1$ and $g_k$.  In $g_1$ it is $a$, and in $g_k$ it is $b$, so
1599: $xyxyx$ is not a $k$-power.
1600: 
1601: We can accept $x(yx)^*$ with an NFA using $|xy|$ states.  
1602: The shortest non-$k$-power is $xyxyx$, which is of length $m$.
1603: 
1604: We have $|u| = 2n+1$, $|x| = k(2n+1)$, $|y| = k(4kn - 6n + 2k - 1)$,
1605: $ r = |xy| = 2k(2kn - 2n + k)$, and $ m = |xyxyx| = k(8kn - 6n + 4k + 1)$.
1606: Thus $m = {{4k-3} \over {2k-2}}r - {k \over {k-1}} =
1607: \left(2 + {1 \over {2k-2}}\right)r - O(1)$.
1608: \end{proof}
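
As a sanity check of these formulas (a worked instance added for
illustration), take $k = 2$ and $n = 1$.  Then $u = aba$, $x = abaaba$, and
$y = ba\,x\,ab$, in agreement with the example of Ito et al.\ above, and
\[
|x| = 6, \quad |y| = 10, \quad r = |xy| = 16, \quad
m = |xyxyx| = 38 = {5 \over 2} \cdot 16 - 2 .
\]
Similarly, for $k = 3$ the same formulas give $r = 24n + 18$ and
$m = 54n + 39 = {9 \over 4}\, r - {3 \over 2}$.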
1609: 
1610: Next, we apply part~(2) of Theorem~\ref{k-pow} to obtain an algorithm
1611: to check if an NFA accepts infinitely many non-$k$-powers.
1612: 
1613: \begin{theorem}
1614: Let $k \geq 2$ be an integer.  Given an NFA $M$ with $n$ states and
1615: $t$ transitions, it is possible to determine if all but finitely many words
1616: in $L(M)$ are $k$-powers in $O(n^3 + t n^2)$ time.
1617: \end{theorem}
1618: 
1619: \begin{proof}
1620: The proof is similar to that of Theorem~\ref{alg_allkpow}.
1621: The only difference is that in view of part~(2) of Theorem~\ref{k-pow}
1622: we instead construct $M_r'$ to accept all non-$k$-powers $s$,
1623: where $n \leq |s| \leq 3n$.  We leave the details to the reader.
1624: \end{proof}
1625: 
1626: \section{Automata accepting only powers}
1627: \label{powers}
1628: 
1629: In this section we move from the problem of testing if an automaton
1630: accepts only $k$-powers to the problem of testing if it accepts only
1631: powers (of any kind).  Just as Theorem~\ref{k-pow} was the starting
1632: point for our algorithmic results in Section~\ref{kp}, the
1633: following theorem of Ito, Katsura, Shyr, and Yu
1634: \cite{Ito&Katsura&Shyr&Yu:1988} is the starting point for our
1635: algorithmic results in this section.  We state the theorem in
1636: a stronger form than was originally presented by Ito et al.
1637: 
1638: \begin{theorem}
1639: \label{ito}
1640: Let $L$ be accepted by an $n$-state NFA $M$.
1641: \begin{enumerate}
1642: \item Every word in $L$ is a power if and only if every word in the set
1643: $\lbrace x \in L : |x| \leq 3n \rbrace$ is a power.
1644: \item All but finitely many words in $L$ are powers if and only if
1645: every word in the set $\lbrace x \in L : n \leq |x| \leq 3n \rbrace$
1646: is a power.
1647: \end{enumerate}
1648: Further, if $M$ is a DFA over an alphabet of size $\geq 2$, then the bound $3n$
1649: may be replaced by $3n-3$.
1650: \end{theorem}
1651: 
1652: We next prove an analogue of Proposition~\ref{my-ner}.  We
1653: need the following result, first proved by Birget \cite{Birget:1992},
1654: and later, independently, in a weaker form, by Glaister and Shallit
1655: \cite{Glaister&Shallit:1996}.
1656: 
1657: \begin{theorem}
1658: \label{birget}
1659: Let $L \subseteq \Sigma^*$ be a regular language.  Suppose there exists
1660: a set of pairs
1661: \[
1662: S = \{(x_i,y_i) \in \Sigma^* \times \Sigma^* : 1 \leq i \leq n \}
1663: \]
1664: such that
1665: \begin{itemize}
1666: \item $x_iy_i \in L$ for $1 \leq i \leq n$, and
1667: \item either $x_iy_j \notin L$ or $x_jy_i \notin L$ for $1 \leq i,j \leq n$,
1668: $i \neq j$.
1669: \end{itemize}
1670: Then any NFA accepting $L$ has at least $n$ states.
1671: \end{theorem}
1672: 
1673: \begin{proposition}
1674: \label{slender_7n}
1675: Let $M$ be an $n$-state NFA and let $\ell$ be a non-negative integer
1676: such that every word in $L(M)$ of length $\geq \ell$ is a power.
1677: For all $r \geq \ell$, the number of words in $L(M)$ of length $r$
1678: is at most $7n$.
1679: \end{proposition}
1680: 
1681: \begin{proof}
1682: Let $r \geq \ell$ be an arbitrary integer.  The proof consists of three steps.
1683: 
1684: Step~1.  We consider the set $A$ of words $w$ in $L(M)$ such that
1685: $|w| = r$ and $w$ is a $k$-power for some $k \geq 4$.  For each such $w$,
1686: write $w = x^i$, where $x$ is a primitive word, and define a pair
1687: $(x^2,x^{i-2})$.  Let $S_A$ denote the set of such pairs.
1688: Consider two pairs in $S_A$: $(x^2,x^{i-2})$ and
1689: $(y^2,y^{j-2})$.  The word $x^2y^{j-2}$ is primitive by Theorem~\ref{ls_eqn}
1690: and hence is not in $L(M)$.  The set $S_A$ thus satisfies the conditions
1691: of Theorem~\ref{birget}.  Since $L(M)$ is accepted by an $n$-state
1692: NFA, we must have $|S_A| \leq n$ and thus $|A| \leq n$.
1693: 
1694: Step~2.  Next we consider the set $B$ of cubes of length $r$ in $L(M)$.
1695: For each such cube $w = x^3$, we define a pair $(x,x^2)$.  Let
1696: $S_B$ denote the set of such pairs.  Consider two pairs in $S_B$:
1697: $(x,x^2)$ and $(y,y^2)$.  Suppose that $xy^2$ and $yx^2$ are both in
1698: $L(M)$.  The word $xy^2$ is certainly not a cube; we claim that it
1699: cannot be a square.  Suppose it were.  Then $|x|$ and $|y|$ are even,
1700: so we can write $x = x_1 x_2$ and $y = y_1 y_2$ where
1701: $|x_1| = |x_2| = |y_1| = |y_2|$.  Now if $xy^2 = x_1 x_2 y_1 y_2 y_1 y_2$ 
1702: is a square, then $x_1 x_2 y_1 = y_2 y_1 y_2$, and so $y_1 = y_2$.
1703: Thus $y$ is a square; write $y = z^2$.
1704: By Theorem~\ref{ls_eqn}, $yx^2 = z^2x^2$ is primitive,
1705: contradicting our assumption that $yx^2 \in L(M)$.  It must be
1706: the case then that $xy^2$ is a $k$-power for some $k \geq 4$.
1707: Thus, $xy^2 = u^k$ for some primitive $u$ uniquely determined by $x$ and $y$.
1708: With each pair of cubes $x^3$ and $y^3$ such that both $xy^2$ and $yx^2$
1709: are in $L(M)$ we may therefore associate a $k$-power $u^k \in L(M)$ of length
1710: $r$, where $k \geq 4$.  We have already established in Step~1 that
1711: the number of such $k$-powers is at most $n$.  It follows that
1712: by deleting at most $n$ pairs from the set $S_B$ we obtain
1713: a set of pairs satisfying the conditions of Theorem~\ref{birget}.
1714: We must therefore have $|S_B| \leq 2n$ and thus $|B| \leq 2n$.
1715: 
1716: Step~3.  Finally we consider the set $C$ of squares of length $r$ in
1717: $L(M)$.  For each such square $w = x^2$, we define a pair $(x,x)$.
1718: Let $S_C$ denote the set of such pairs.  Consider two pairs in
1719: $S_C$: $(x,x)$ and $(y,y)$.  Suppose that $xy$ and $yx$ are both
1720: in $L(M)$.  The word $xy$ is not a square and must therefore be
1721: a $k$-power for some $k \geq 3$.  We write $xy = u^k$ for some
1722: primitive $u$ uniquely determined by $x$ and $y$.  In Steps~1 and 2
1723: we established that the number of $k$-powers of length $r$, $k \geq 3$,
1724: is $|A| + |B| \leq 3n$.  It follows that
1725: by deleting at most $3n$ pairs from the set $S_C$ we obtain
1726: a set of pairs satisfying the conditions of Theorem~\ref{birget}.
1727: We must therefore have $|S_C| \leq 4n$ and thus $|C| \leq 4n$.
1728: 
1729: Putting everything together, we see that there are
1730: $|A| + |B| + |C| \leq 7n$ words of length $r$ in $L(M)$,
1731: as required.
1732: \end{proof}
1733: 
1734:      The bound of $7n$ in Proposition~\ref{slender_7n} is almost
1735: certainly not optimal.  
1736: 
1737: We now prove the following algorithmic result.
1738: 
1739: \begin{theorem}
1740: Given an NFA $M$ with $n$ states, it is
1741: possible to determine if every word in $L(M)$ is a power in
1742: $O(n^5)$ time.
1743: \label{kats}
1744: \end{theorem}
1745: 
1746: \begin{proof}
1747:      First, we observe that we can test whether a word $w$ of length
1748: $n$ is a power in $O(n)$ time, using a linear-time string matching
1749: algorithm, such as Knuth--Morris--Pratt \cite{Knuth&Morris&Pratt:1977}.  
1750: To do so, search for $w = a_1 a_2 \cdots a_n$ in the word
1751: $x = a_2 \cdots a_n a_1 \cdots a_{n-1}$.  Then $w$ appears in $x$ iff
1752: $w$ is a power.  Furthermore, if the leftmost occurrence of
1753: $w$ in $x$ appears beginning at $a_i$, then $w$ is an $n/(i-1)$-power, and
1754: this is the largest exponent $e$ for which $w$ is an $e$-power.
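
The following small cases, chosen here purely for illustration, show the
test at work.  For $w = abab$ the leftmost occurrence of $w$ in $x$ begins at
the second symbol of $x$, which is $a_3$; for $w = aaa$ it begins at the first
symbol of $x$, which is $a_2$; and for $w = aab$ there is no occurrence at all:
\[
\begin{array}{lll}
w & x & \text{conclusion} \\
abab & bababa & i = 3, \ w = (ab)^{4/2} = (ab)^2 \\
aaa & aaaa & i = 2, \ w = a^{3/1} = a^3 \\
aab & abaa & w \text{ does not occur in } x, \text{ so } w \text{ is primitive}
\end{array}
\]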
1755: 
1756:      Now, using Theorem~\ref{ito}, it suffices to test all words
1757: in $L(M)$ of length $\leq 3n$;  every word in $L(M)$ is a power iff all
1758: of these words are powers.  On the other hand, by
1759: Proposition~\ref{slender_7n}, if all words are powers, then
1760: the number of words of each length is bounded by $7n$.  Thus, it
1761: suffices to enumerate the words in $L(M)$ of lengths $1,2, \ldots, 3n$,
1762: stopping if the number of such words in any length exceeds $7n$.  If all
1763: these words are powers, then every word is a power.  Otherwise, if we
1764: find a non-power, or if the number of words in any length exceeds $7n$,
1765: then not every word is a power.
1766: 
1767:       By the work of M\"akinen \cite{Makinen:1997} or 
1768: Ackerman \& Shallit \cite{Ackerman&Shallit:2007}, we can enumerate 
1769: these words in $O(n^5)$ time.
1770: \end{proof}
1771: 
1772:       Using part~(2) of Theorem~\ref{ito} along with
1773: Proposition~\ref{slender_7n}, we can prove the following.
1774: 
1775: \begin{theorem}
1776:      Given an NFA $M$ with $n$ states,
1777: we can decide if all but finitely many words in $L(M)$ are
1778: powers in $O(n^5)$ time.
1779: \end{theorem}
1780: 
1781: \begin{proof}
1782:       The proof is analogous to that of Theorem~\ref{kats}.  The only
1783: difference is that here we need only enumerate the words in $L(M)$ of
1784: lengths $n,n+1,\ldots,3n$.
1785: \end{proof}
1786: 
1787: 
1788: \section{Bounding the length of a smallest power}
1789: \label{smallkp}
1790: 
1791: In Section~\ref{kp} we gave an upper bound on the length of
1792: a smallest non-$k$-power accepted by an $n$-state NFA.  In this section
1793: we study the complementary problem of bounding the length of
1794: the smallest $k$-power accepted by an $n$-state NFA.
1795: 
1796: \begin{proposition}
1797: \label{upper_bd}
1798: Let $M$ be an NFA with $n$ states and let $k \geq 2$ be an integer.
1799: If $L(M)$ contains a $k$-power, then $L(M)$ contains a $k$-power
1800: of length $\leq kn^k$.
1801: \end{proposition}
1802: 
1803: \begin{proof}
1804: Consider the NFA-$\epsilon$ $M'$ accepting $L(M)^{1/k}$ defined in the proof of
1805: Proposition~\ref{fixed-k}.  The only transitions from the start
1806: state of $M'$ are $\epsilon$-transitions to submachines whose states are
1807: $(2k-1)$-tuples of the form
1808: $[g_1, g_2, \ldots, g_{k-1}, p_0, p_1, \ldots, p_{k-1}]$,
1809: where the first $k-1$ elements of the tuple are fixed.  Thus we may
1810: consider $L(M')$ as a finite union of languages, each accepted by
1811: an NFA of size $n^k$.  It follows that if $M'$ accepts a non-empty
1812: word $w$, it accepts such a $w$ of length $\leq n^k$.  However,
1813: $M'$ accepts $w$ if and only if $M$ accepts $w^k$.  We conclude that
1814: if $L(M)$ contains a $k$-power, it contains one of length $\leq kn^k$.
1815: \end{proof}
1816: 
1817: We now give a lower bound on the size of the smallest $k$-power
1818: accepted by an $n$-state DFA.
1819: 
1820: \begin{proposition}
1821: Let $k \geq 2$ be an integer.  There exist infinitely many DFAs
1822: $M_n$ such that
1823: 
1824: \begin{itemize}
1825: \item[(a)] $M_n$ has $O(kn)$ states;
1826: \item[(b)] The shortest $k$-power accepted by $M_n$ is of length
1827: $k\cdot\Omega\left({n \choose k}\right)$.
1828: \end{itemize}
1829: \end{proposition}
1830: 
1831: \begin{proof}
1832: For $n \geq k$, let
1833: \[
1834: L_n = ({\tt a}^n)^+ {\tt b} ({\tt a}^{n-1})^+ {\tt b} \cdots
1835: ({\tt a}^{n-k+1})^+ {\tt b}.
1836: \]
1837: Then $L_n$ is accepted by a DFA with $O(kn)$ states,
1838: and the shortest $k$-power in $L_n$ is $({\tt a}^\ell{\tt b})^k$,
1839: where
1840: \[
1841: \ell = \text{lcm}(n,n-1,\ldots,n-k+1) \geq n(n-1)\cdots(n-k+1)/k!
1842: = {n \choose k},
1843: \]
1844: as required.
1845: \end{proof}
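
For a concrete instance of this construction (the parameters are chosen here
for illustration), let $k = 2$ and $n = 3$.  Then
$L_3 = ({\tt a}^3)^+ {\tt b} ({\tt a}^2)^+ {\tt b}$ and
\[
\ell = \text{lcm}(3,2) = 6 \geq {3 \choose 2} = 3 ,
\]
so the shortest square in $L_3$ is $({\tt a}^6{\tt b})^2$, which has
length $14$.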
1846: 
1847: Next we consider the length of a smallest power (rather than $k$-power).
1848: 
1849: \begin{proposition}
1850: \label{exponent_bd}
1851: Let $M$ be an NFA with $n$ states.  If $L(M)$ contains a power,
1852: it contains a $k$-power for some $k$, $2 \leq k \leq n+1$.
1853: \end{proposition}
1854: 
1855: \begin{proof}
1856: Suppose to the contrary
1857: that the smallest $k$ for which $L(M)$ contains a $k$-power $w^k$
1858: satisfies $k > n+1$.  For some accepting computation of $M$ on $w^k$ let
1859: $q_1,q_2,\ldots,q_{k-1}$ be the states reached by $M$ after
1860: reading $w,w^2,\ldots,w^{k-1}$ respectively.  Since $k > n+1$, there
1861: exist $i$ and $j$ where $1 \leq i < j \leq k-1$ and $q_i = q_j$.
1862: It follows that $M$ accepts $w^\ell$ for some $\ell$, $2 \leq \ell < k$,
1863: contradicting the minimality of $k$.  We conclude that if $L(M)$ contains a
1864: $k$-power, we may take $k \leq n+1$.
1865: \end{proof}
1866: 
1867: \begin{proposition}
1868: Let $M$ be an NFA with $n$ states.  If $L(M)$ contains a power,
1869: then $L(M)$ contains a power of length $\leq (n+1)n^{n+1}$.
1870: \end{proposition}
1871: 
1872: \begin{proof}
1873: Apply Propositions~\ref{exponent_bd} and \ref{upper_bd}.
1874: \end{proof}
1875: 
1876: We now give a lower bound.
1877: 
1878: \begin{proposition}
1879: \label{smallest_pow}
1880: There exist infinitely many DFAs $M_n$ such that
1881: 
1882: \begin{itemize}
1883: \item $M_n$ has $O(n)$ states;
1884: \item The shortest power accepted by $M_n$ is of length
1885: $e^{\Omega(\sqrt{n \log n})}$.
1886: \end{itemize}
1887: \end{proposition}
1888: 
1889: \begin{proof}
1890: Let $p_i$ denote the $i$-th prime number.  For any integer
1891: $n \geq 2$, let $P(n) = p_k$ be the largest prime number such that
1892: $p_1 + p_2 + \cdots + p_k \leq n$.  We define
1893: \[
1894: L_n = ({\tt a}^{p_1})^+ {\tt b} ({\tt a}^{p_2})^+ {\tt b} \cdots
1895: ({\tt a}^{p_k})^+ {\tt b}.
1896: \]
1897: Then $L_n$ is accepted by a DFA with $O(n)$ states.
1898: 
1899: If $k$ is itself prime,
1900: the shortest power in $L_n$ is $w = ({\tt a}^\ell{\tt b})^k$,
1901: where $\ell = p_1p_2 \cdots p_k$.  For $n \geq 2$, let
1902: \[
1903: F(n) = \prod_{p \leq P(n)} p,
1904: \]
1905: where the product is over primes $p$.
1906: We have $F(n) \in e^{\Omega(\sqrt{n \log n})}$ \cite[Theorem~1]{Miller:1987}.
1907: This lower bound is valid
1908: for all sufficiently large $n$; in particular, it holds for infinitely
1909: many $n$ such that $n = p_1 + p_2 + \cdots + p_k$, where $k$ is prime.
1910: This gives the desired result.
1911: \end{proof}
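
For a concrete instance (the value of $n$ is chosen here purely for
illustration), take $n = 10$.  Then $p_1 + p_2 + p_3 = 2 + 3 + 5 = 10$, so
$k = 3$ is prime and $P(10) = 5$.  We obtain
\[
L_{10} = ({\tt a}^{2})^+ {\tt b} ({\tt a}^{3})^+ {\tt b} ({\tt a}^{5})^+ {\tt b},
\qquad F(10) = 2 \cdot 3 \cdot 5 = 30 ,
\]
and the shortest power in $L_{10}$ is $({\tt a}^{30}{\tt b})^3$, which has
length $3 \cdot 31 = 93$.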
1912: 
1913: \section{Additional results on powers}
1914: \label{add2pow}
1915: 
1916: D\"om\"osi, Mart\'{\i}n-Vide, and Mitrana
1917: \cite[Theorem~10]{Domosi&Martin-Vide&Mitrana:2004} proved that if $L$
1918: is a slender regular language over $\Sigma$, and $Q_\Sigma$ is the
1919: set of primitive words over $\Sigma$, then $L \cap Q_\Sigma$ is regular.
1920: This result is somewhat surprising, since it is widely believed
1921: that $Q_\Sigma$ is not even context-free for $|\Sigma| \geq 2$.  In this
1922: section we apply a variation of their argument to show that $Q_\Sigma$ may be
1923: replaced by the language of squares (or cubes, etc.) over $\Sigma$.
1924: 
1925: For any integer $k \geq 2$ and alphabet $\Sigma$, let $P(k,\Sigma)$
1926: denote the set of $k$-powers over $\Sigma$.  Clearly, for $|\Sigma| \geq 2$,
1927: $P(k,\Sigma)$ is not context-free.
1928: 
1929: \begin{proposition}
1930: If $L \subseteq \Sigma^*$ is a slender regular language, then for all
1931: integers $k \geq 2$, $L \cap P(k,\Sigma)$ is regular.
1932: \end{proposition}
1933: 
1934: \begin{proof}
1935: If $L$ is slender, then by Theorem~\ref{slender} it
1936: suffices to consider $L = uv^*w$.
1937: The result is clearly true if $v$ is empty, so we suppose $v$ is
1938: non-empty.  Let $x$ and $y$ be the primitive roots of $v$ and $wu$
1939: respectively.  If $x = y$, then the set of $k$-powers in $v^*wu$
1940: is given by $v^*wu \cap (x^k)^*$, so the set of $k$-powers in $uv^*w$
1941: is regular.  If $x \neq y$, then by Theorem~\ref{p+q+},
1942: the set $v^*wu$ contains only finitely many $k$-powers.
1943: The set of $k$-powers in $uv^*w$ is therefore finite, and,
1944: a fortiori, regular.
1945: \end{proof}
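
The two cases of the proof can be illustrated with small examples (ours,
given only for illustration) for $k = 2$ and $\Sigma = \{a, b\}$.  If
$L = (ab)^+$, written as $uv^*w$ with $u = \epsilon$ and $v = w = ab$, then
$x = y = ab$.  If $L = a^*b$, written as $uv^*w$ with $u = \epsilon$, $v = a$,
and $w = b$, then $x = a \neq y = b$, and no word $a^m b$ is a square.  Thus
\[
(ab)^+ \cap P(2,\Sigma) = \left( (ab)^2 \right)^+ , \qquad
a^*b \cap P(2,\Sigma) = \emptyset ,
\]
and both intersections are regular.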
1946: 
1947: \section{Testing if an NFA accepts a bordered word}
1948: \label{bord}
1949: 
1950: In this section we give an efficient algorithm to test if an
1951: NFA accepts a bordered word.  We also give upper and lower
1952: bounds on the length of a shortest bordered word accepted by
1953: an NFA.
1954: 
1955: \begin{proposition}
1956:       Given an NFA $M$ with $n$ states and $t$ transitions,
1957: we can decide if $M$ accepts at least one
1958: bordered word in $O(n^3 t^2)$ time.
1959: \label{border}
1960: \end{proposition}
1961: 
1962: \begin{proof}
1963:       Given an NFA $M = (Q, \Sigma, \delta, q_0, F)$,
1964:       we can easily create an NFA-$\epsilon$ $M'$ that
1965: accepts 
1966: $$\lbrace u \in \Sigma^* \ : \ \text{there exists
1967: $w \in \Sigma^*$ such that } uwu \in L \rbrace$$
1968: by ``guessing'' the state we would be in after reading $uw$, and
1969: then verifying it.   More formally, we let $M' = (Q', \Sigma, 
1970: \delta', q'_0, F')$ where $Q' = \lbrace q'_0 \rbrace
1971: \cup \ \lbrace [p,q,r] \ : \ p, q, r \in Q \rbrace$,
1972: $F' = \lbrace [p,q,r] \ : \ r \in F \text{ and there exists
1973: $w \in \Sigma^*$ such that } q \in \delta(p,w) \rbrace$.
1974: The transitions are defined as follows:
1975: $\delta(q'_0, \epsilon) = \lbrace [q_0, p, p] \ : \ p \in Q \rbrace$
1976: and
1977: $$\delta([p,q,r],a) = \lbrace [p', q, r'] \ : \ p' \in \delta(p,a),
1978: 	r' \in \delta(r,a) \rbrace.$$
1979: If $M$ has $n$ states and $t$ transitions,
1980: then $M'$ has $n^3 + 1$ states and at most $n + n^3 t^2$ transitions.
1981: Now get rid of all useless states and their associated transitions.
1982: We can compute the final states by doing $n$ depth-first searches,
1983: starting at each node, at a cost of $O(n(n+t))$ time.  
1984: Now we just test to see if $M'$ accepts a nonempty
1985: string, which can be
1986: done in linear time in the size of $M'$.
1987: \end{proof}
1988: 
1989: \begin{corollary}
1990:      If $M$ is an NFA with $n$ states, and it accepts at least one
1991: bordered word, it must accept a bordered word of length
1992: $< 2n^2 + n$.
1993: \end{corollary}
1994: 
1995: \begin{proof}
1996:     Consider the NFA-$\epsilon$ $M'$ constructed in the proof of
1997: Proposition~\ref{border}, which accepts
1998: $$L' = \lbrace u \in \Sigma^* \ : \ \text{there exists
1999: $w \in \Sigma^*$ such that } uwu \in L \rbrace.$$
2000: If $M$ accepts a bordered string, then $M'$ accepts a nonempty string.
2001: Although $M'$ has $n^3+1$ states, once a computation 
2002: leaves $q'_0$ and enters a triple of the form $[p,q,r]$, it never
2003: enters a state $[p',q',r']$ with $q \not= q'$.  Thus we may view
2004: the NFA $M'$ as implicitly defining a union of $n$ disjoint languages,
2005: each accepted by an NFA with $n^2$ states.     Therefore, if $M'$
2006: accepts a nonempty string $u$, it accepts one of length at most $n^2$.
2007: Now the corresponding bordered string is $uwu$.  The string $w$
2008: is implicitly defined in the previous proof as a path from a state
2009: $p$ to a state $q$.  If such a path exists, it is of length at most
2010: $n-1$.  Thus there exists $uwu \in L(M)$  with $|uwu| \leq 2n^2 + n-1$.
2011: \end{proof}
2012: 
2013: \begin{proposition}
2014:        For infinitely many $n$ there is a DFA with $n$ states
2015: such that the shortest bordered word accepted is of length
2016: $n^2/2 - 6n +43/2$.  
2017: \end{proposition}
2018: 
2019: \begin{proof}
2020: Consider $a (b^t)^+ c a (b^{t-1})^+ c$.  An obvious DFA can accept
2021: this using $2t+5$ states.  However, the 
2022: shortest bordered word accepted is $a b^{t(t-1)} c 
2023: a b^{t(t-1)} c$, which is of length $2t(t-1)+ 4 = n^2/2 - 6n + 43/2$.
2024: \end{proof}
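
The example can be checked by brute force for small $t$.  The following Python
sketch (the helper names \texttt{is\_bordered} and
\texttt{shortest\_bordered} are introduced here for illustration) enumerates
the words $a b^{it} c \, a b^{j(t-1)} c$ with $1 \le i, j \le t$ and reports
the length of a shortest bordered one.  Restricting to $i, j \le t$ is an
assumption of the sketch; it suffices because the shortest bordered word has
$i = t-1$ and $j = t$.

```python
def is_bordered(w):
    """True if w = u v u for some nonempty word u (v may be empty)."""
    return any(w[:k] == w[-k:] for k in range(1, len(w) // 2 + 1))


def shortest_bordered(t):
    """Length of a shortest bordered word a b^{it} c a b^{j(t-1)} c, 1 <= i, j <= t."""
    best = None
    for i in range(1, t + 1):
        for j in range(1, t + 1):
            w = "a" + "b" * (i * t) + "c" + "a" + "b" * (j * (t - 1)) + "c"
            if is_bordered(w) and (best is None or len(w) < best):
                best = len(w)
    return best


for t in range(2, 7):
    n = 2 * t + 5  # number of states used in the proof
    # n^2/2 - 6n + 43/2, written with integer arithmetic (n is odd)
    assert shortest_bordered(t) == 2 * t * (t - 1) + 4 == (n * n - 12 * n + 43) // 2
```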
2025: 
2026:     We now consider
2027: testing if an NFA accepts infinitely many bordered words.  
2028: 
2029: \begin{corollary}
2030:      If an NFA $M$ has $n$ states and $t$ transitions,
2031: we can test whether $M$ accepts infinitely many bordered words
2032: in $O(n^6 t^2)$ time.
2033: \end{corollary}
2034: 
2035: \begin{proof}
2036:       If an NFA $M$ accepts infinitely many words of the form $uwu$,
2037: there are two possibilities, at least one of which must hold:
2038: 
2039: \begin{itemize}
2040: \item[(a)] there is a single word $u$ such
2041: that there are infinitely many $w$ with $uwu \in L(M)$, or
2042: 
2043: \item[(b)] there
2044: are infinitely many $u$, with possibly different $w$ depending on $u$,
2045: such that $uwu \in L(M)$.  
2046: \end{itemize}
2047: 
2048:       To check these possibilities, we return to the NFA-$\epsilon$ $M'$
2049: constructed in the proof of Theorem~\ref{border}.  First, for each pair
2050: of states $q_i$ and $q_j$, we determine whether there exists a nonempty
2051: path from $q_i$ to $q_j$.  This can be done with
2052: $n$ different depth-first searches, starting at each vertex, at a cost
2053: of $O(n^3(n^3+t^2))$ time.
2054: In particular, for each vertex, we learn whether there
2055: is a nonempty cycle beginning and ending at that vertex.
2056: 
2057: Now let us check whether (a) holds.  After removing all useless states
2058: and their associated transitions, look at the remaining final states
2059: $[p,q,r]$ of $M'$ and determine if there is a path from $p$ to $q$
2060: that goes through a vertex with a cycle.   This can be done by
2061: testing, for each vertex $s$ that has a cycle, whether there is a non-empty
2062: path from $p$ to $s$ and then $s$ to $q$.  If such a vertex exists, then
2063: there are infinitely many $w$ in some $uwu$.  
2064: 
2065: To check whether (b) holds, we just need to know whether $M'$ accepts
2066: infinitely many strings, which we can easily check by looking for a 
2067: directed cycle.
2068: 
2069: The total cost is therefore $O(n^3(n^3 t^2)) = O(n^6 t^2)$.
2070: \end{proof}
2071: 
2072: We now prove the following decomposition theorem for regular languages
2073: consisting only of bordered words.
2074: 
2075: \begin{theorem}
2076: If every word in a regular language $L$ is bordered, then there is a 
2077: decomposition of $L$ as a finite union of regular languages of the
2078: form $JKJ$, where each $J$ and $K$ are regular and $\epsilon \not\in J$.
2079: \end{theorem}
2080: 
2081: \begin{proof}
2082: Let $L$ be accepted by an NFA $M = (Q,\Sigma,\delta,q_0,F)$.
2083: For each $x \in \Sigma^+$, define an automaton $M_x = (Q,\Sigma,\delta,I',F')$
2084: (for $M_x$ we permit multiple initial states), where the set of
2085: initial states is $I' = \delta(q_0, x)$,
2086: and the set of final states is $F' = \{q \in Q : \delta(q,x) \cap F \neq \emptyset\}$.
2087: Then $M_x$ has the property that for every $w \in L(M_x)$, we have
2088: $xwx \in L(M)$.  Note that there are only finitely many distinct automata
2089: $M_x$.  
2090: 
2091: For each automaton $M_x$, define the regular language
2092: \[
2093: L_x = \{y : \delta(q_0,y) = I' \text{ and } \{q \in Q: \delta(q,y) \cap F \neq \emptyset\} = F'\}.
2094: \]
2095: Note that again there are only finitely many distinct languages $L_x$.
2096: 
2097: For every $x \in \Sigma^+$, every word in $L_x L(M_x) L_x$ is in $L$.
2098: Furthermore, if $w \in L$ is bordered, then there exists $x \in \Sigma^+$
2099: such that $w \in L_x L(M_x) L_x$.  Thus, if every word of $L$
2100: is bordered, then $L = \cup_{x \in \Sigma^+} L_x L(M_x) L_x$.
2101: Since there are only finitely many languages $L_x$ and $L(M_x)$,
2102: this union is finite, as required.
2103: \end{proof}
2104: 
2105: \section{Testing if an NFA accepts an unbordered word}
2106: \label{unbord}
2107: 
2108: We present a simple test to determine if all words in a regular language 
2109: are bordered, and to determine if a regular language contains infinitely many 
2110: unbordered words.     
2111: We first need the following well-known result about words, which is due to 
2112: Lyndon and Sch\"utzenberger \cite{Lyndon&Schutzenberger:1962}.
2113: 
2114: \begin{lemma}\label{loft1}
2115: Suppose $x$, $y$ and $z$ are non-empty words, and that $xy = yz$.  Then 
2116: there is a non-empty word $p$, a word $q$ and a non-negative 
2117: integer $k_1$ for which we can write $x = pq$, $z = qp$, and $y = (pq)^{k_1}p$.
2118: \end{lemma}
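
As an illustration of Lemma~\ref{loft1}, the following Python sketch searches
for the words $p$, $q$ and the exponent $k_1$ by trying every way of splitting
$x$.  The function name \texttt{lyndon\_schutzenberger} and the brute-force
search are choices made here; the lemma itself only asserts that such a
decomposition exists.

```python
def lyndon_schutzenberger(x, y, z):
    """Given non-empty x, y, z with x + y == y + z, return (p, q, k1) such that
    x == p + q, z == q + p and y == (p + q) * k1 + p, with p non-empty."""
    if not (x and y and z) or x + y != y + z:
        raise ValueError("need non-empty words with xy = yz")
    for cut in range(1, len(x) + 1):
        p, q = x[:cut], x[cut:]
        if z != q + p:
            continue
        # |y| = k1 * |pq| + |p| pins down the only possible exponent.
        k1, rest = divmod(len(y) - len(p), len(x))
        if len(y) >= len(p) and rest == 0 and y == (p + q) * k1 + p:
            return p, q, k1
    return None  # not reached when the hypotheses of the lemma hold
```

For $x = ab$, $y = aba$ and $z = ba$ we have $xy = yz = ababa$, and the search
returns $p = a$, $q = b$, $k_1 = 1$.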
2119:   
2120: We also need the following result, which is just a variation of the 
2121: pumping lemma. 
2122: 
2123: \begin{lemma}\label{loft2}
2124: Let $M = (Q,\Sigma,\delta,q_0,F)$ be an $n$-state NFA.
2125: Let $L$ be the language accepted by $M$.
2126: Let $d$ be a positive integer. 
2127: Let $(X,y,Z)$ be a $3$-tuple of words 
2128: for which $|y|$ is a multiple of $d$, $|y| \ge nd$ and $XyZ \in L$.  
2129: Then there are words $r$, $s$ and $t$, whose lengths are multiples of $d$,
2130: with $|s| \ge d$, for which we can 
2131: write $y = rst$, and, for all $z \ge 0$, $Xrs^ztZ \in L$.    
2132: \end{lemma}
2133: 
2134: \begin{proof}
2135: Set $l := |X|$ and $m := |y|/d$, $\gamma := XyZ$, and $k := |\gamma|$.    
2136: First, write $\gamma$ as a sequence of letters, that is, 
2137: $\gamma := \gamma_1 \gamma_2 \cdots
2138: \gamma_k$ with each $\gamma_i$ a letter.  By $\gamma[i,j]$ for $1 \le i,j 
2139: \le |\gamma|$ we
2140: mean the subsequence that consists of the $j-i+1$ consecutive letters of $\gamma$ 
2141: starting at position $i$ and ending at position $j$, that is, $\gamma_i 
2142: \gamma_{i+1}\cdots \gamma_j$.
2143: If $i > j$ we take $\gamma[i,j]$ to be the empty word.
2144: Now we have the following sequence of $k$ states
2145: \[q_1 \in \delta(q_0, \gamma_1), q_2 \in \delta(q_1, \gamma_2), \dots,
2146: q_k \in \delta(q_{k-1}, \gamma_k).\]
2147: Since $\gamma \in L$, we may choose this sequence so that $q_k$ is a final state.
2148: 
2149: Note that $y = \gamma[l+1, l+md]$, and consider the following sequence 
2150: of $m+1$ states of $M$:
2151: 
2152: \[q_l, q_{l+d}, q_{l+2d}, \dots, q_{l+md}.\]
2153: 
2154: Since $m \ge n$, these $m+1$ states cannot all be distinct, so there are integers $i$ and $j$, with $0 \le i < j \le m$, for which 
2155: $q_{l+id} = q_{l+jd}$.  Set $r := \gamma[l+1, l+id]$, $s := \gamma[l+id+1, 
2156: l+jd]$, and $t := \gamma[l+jd+1, l+md]$, so $y = rst$.  Note that $|s| \ge d$, 
2157: and the desired conclusion follows immediately.    
2158: \end{proof}
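
The pigeonhole step of this proof can be made concrete.  The sketch below
assumes that an accepting state sequence for $\gamma = XyZ$ is already given as
a list \texttt{path}, with \texttt{path[0]} equal to $q_0$ and
\texttt{path[i]} equal to $q_i$; the function name \texttt{pump\_split} and
this interface are hypothetical.  It returns the factorization $y = rst$
obtained from the first repeated state among
$q_l, q_{l+d}, \ldots, q_{l+md}$.

```python
def pump_split(gamma, path, l, d, m):
    """Split y = gamma[l : l + m*d] as r + s + t following the proof.

    path[i] is the state reached after reading gamma[:i]; the block
    boundaries are the positions l, l + d, ..., l + m*d.
    """
    first_seen = {}  # state -> smallest block index at which it occurs
    for j in range(m + 1):
        state = path[l + j * d]
        if state in first_seen:
            i = first_seen[state]
            r = gamma[l : l + i * d]
            s = gamma[l + i * d : l + j * d]
            t = gamma[l + j * d : l + m * d]
            return r, s, t
        first_seen[state] = j
    return None  # cannot happen when m is at least the number of states


# Two-state automaton over {a} accepting the words of even length:
# the states alternate 0, 1, 0, 1, ... along the run.
r, s, t = pump_split("aaaa", [0, 1, 0, 1, 0], l=0, d=2, m=2)
assert (r, s, t) == ("", "aa", "aa")
```

In this example $|s| = 2 = d$, and pumping $s$ keeps the length even, so every
word $r s^z t$ is again accepted.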
2159: 
2160: \begin{lemma}\label{loft3}
2161: Let $M$ be an $n$-state NFA.  Let $L$ be the language accepted by $M$.
2162: Let $(X,Y,Z)$ be a $3$-tuple of words for which $XYZ \in L$.
2163: Then there is a word $y$ for which $|y| < n$ and $XyZ \in L$.  
2164: \end{lemma}
2165:     
2166: \begin{proof}
2167: Let $S := \{u \in \Sigma^{*} : XuZ \in L \}$.  Let $y$ be an element of 
2168: $S$ of minimal length.  We proceed by contradiction, and suppose $|y| \ge n$.
2169: We apply Lemma~\ref{loft2} to $(X,y,Z)$, with $d = 1$, and write $y = rst$
2170: with $s$ non-empty.  Then $XrtZ \in L$, which violates the minimality of $|y|$.
2171: \end{proof}
2172: 
2173: \begin{lemma}\label{loft4}
2174: Suppose there are words $\Psi_L$, $\Psi_R$, $e$, $f$, $g$ and $h$ with
2175: $|\Psi_L| = |\Psi_R|$, $|e| < |\Psi_L|$, $|g| < |\Psi_L|$, and for which 
2176: \begin{equation}\label{star1}
2177: b_\zeta := \Psi_Le = f\Psi_R,
2178: \end{equation}
2179: and
2180: \begin{equation}\label{star2}
2181: b_\eta := \Psi_Lg = h\Psi_R.  
2182: \end{equation}
2183: Suppose further that $|b_\eta| < |b_\zeta|$.  
2184: Then we can write $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$ 
2185: for $p$ a non-empty word, $q$ a word for which $|g| + |pq| = |f|$,
2186: and $k$ a positive integer. 
2187: \end{lemma}    
2188: 
2189: \begin{proof}
2190: Since $|b_\eta| < |b_\zeta|$, we must have $|g| < |e| < |\Psi_R|$.  
2191: This last observation, together with (\ref{star1}) and (\ref{star2}) 
2192: above allows us to assert that there are non-empty words $s_1$ and $s_2$, with 
2193: $|s_2| > |s_1|$, such that $\Psi_R = s_1e = s_2g$.
2194: This last fact combined again with (\ref{star1}) and (\ref{star2}) yields that
2195: \begin{equation}\label{star3}
2196: \Psi_L = f s_1 = hs_2,
2197: \end{equation}
2198: and
2199: \begin{equation}\label{star4}
2200: \Psi_R = s_1e  = s_2g.
2201: \end{equation}
2202: 
2203: Now we can apply (\ref{star3}) and (\ref{star4}) to assert that there are
2204: non-empty words $r_1$ and $r_2$ for which $s_1 r_1 = s_2 = r_2 s_1$; that is,
2205: \begin{equation}\label{star5}
2206: s_1 r_1 = r_2 s_1.  
2207: \end{equation}
2208: 
2209: Now apply Lemma~\ref{loft1} to (\ref{star5}) to get that there is a non-empty
2210: word $p$, a word $q$ and an integer $k_1 \ge 0$ for which
2211: $s_1 = (pq)^{k_1}p$, $r_1 = qp$, and $r_2 = pq$.  Set $k := k_1 + 1$.
2212: Then $s_2 = (pq)^{k}p$, and (\ref{star3}) gives $\Psi_L = h(pq)^{k}p$,
2213: and (\ref{star4}) gives $\Psi_R = (pq)^{k}pg$.  
2214: Also $s_2 = r_2 s_1$ combined with (\ref{star3}) above gives that $f = hr_2$, 
2215: so $|g| + |pq| = |h| + |pq| = |h| + |r_2| = |f|$. 
2216: \end{proof}
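
For a concrete instance of Lemma~\ref{loft4} (chosen here for illustration),
take $\Psi_L = caba$, $\Psi_R = abad$, $e = bad$, $f = cab$, $g = d$ and
$h = c$.  Then
\[
b_\zeta = \Psi_L e = cababad = f \Psi_R, \qquad
b_\eta = \Psi_L g = cabad = h \Psi_R,
\]
and $|b_\eta| = 5 < 7 = |b_\zeta|$.  Following the proof, $s_1 = a$,
$s_2 = aba$, $r_1 = ba$ and $r_2 = ab$, so Lemma~\ref{loft1} gives $p = a$,
$q = b$ and $k_1 = 0$.  Hence $k = 1$, and indeed
\[
\Psi_L = h(pq)^{k}p = c \, (ab) \, a, \qquad
\Psi_R = (pq)^{k}pg = (ab) \, a \, d, \qquad
|g| + |pq| = 1 + 2 = 3 = |f|.
\]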
2217: 
2218: Theorems~\ref{loft_thm1} and \ref{loft_thm3} below are the main results. 
2219: 
2220: \begin{theorem}\label{loft_thm1}
2221: Let $M$ be an $n$-state NFA. Let $L$ be the language accepted by $M$.  
2222: Let $N$ be a non-negative integer.  
2223: Suppose all words in $L$ of length in the interval $[N, 2N+6n+1]$ are bordered.
2224: Then all words in $L$ of length greater than $2N+6n+1$ are bordered. 
2225: Hence, if all words in $L$ of length at most $6n+1$ are bordered, then all the words 
2226: in $L$ must be bordered. 
2227: \end{theorem}
2228: 
2229: \begin{proof}
2230: We'll prove Theorem~\ref{loft_thm1} by making  the following series
2231: of observations.
2232: Throughout, we'll assume that all words in $L$ of length in the interval 
2233: $[N, 2N + 6n+1]$ are bordered, and we'll assume $w$ is an unbordered word in $L$
2234: for which $|w| > 2N+6n+1$, with $|w|$ minimal.  We write $w$ as $u \theta v$ with 
2235: $\theta$ a word for which $|\theta| \le 1$ and $u$ and $v$ words for 
2236: which $|u| = |v| > 3n + N$. 
2237: 
2238: \begin{claim}\label{claim1}
2239: Write $u$ as $\Psi_L X_L$ and $v$ as
2240: $X_R \Psi_R$, for words $\Psi_L$, $X_L$, 
2241: $\Psi_R$, $X_R$ for which $|X_L| = |X_R| = n$.  
2242: (So that $w$ is  $\Psi_L X_L \theta X_R \Psi_R$.)
2243: Then there are words $x_L$ and $x_R$, both of length less than $n$, for 
2244: which:
2245: \begin{itemize}
2246: \item[(i)] $\zeta := \Psi_L x_L \theta X_R \Psi_R \in L$, and
2247: \item[(ii)] $\eta := \Psi_L X_L \theta x_R \Psi_R \in L$.
2248: \end{itemize}
2249: Further, $N \le |\zeta | < |w|$, and $N \le |\eta| < |w|$.  
2250: \end{claim}
2251: 
2252: To justify (i), apply Lemma~\ref{loft3} to the 3-tuple $(\Psi_L, X_L, 
2253: \theta X_R \Psi_R)$.  Similarly, to arrive at (ii), apply Lemma~\ref{loft3}
2254: again to the 3-tuple $(\Psi_L X_L \theta, X_R, \Psi_R)$.     
2255: 
2256: \begin{claim}\label{claim2}
2257: We can write $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$ 
2258: for $p$ a non-empty word, $g$, $h$ and $q$ words for which $|g| = |h|$, 
2259: $|pq| + |g| \le n$, and $k$ a positive integer. 
2260: Hence $w$ can be written as $h(pq)^{k}p X_L \theta X_R (pq)^{k}pg$. 
2261: \end{claim}
2262: 
2263: To justify Claim~\ref{claim2}, first recall $w = \Psi_L X_L \theta X_R 
2264: \Psi_R$ and $|\Psi_L| = |\Psi_R| > 2n$. 
2265: From Claim~\ref{claim1} above we get that $\zeta$ and $\eta$
2266: are bordered words, so we can assert that there exist non-empty words
2267: $b_{\zeta}$ and  $b_{\eta}$, and words $p_{\zeta}$ and $p_{\eta}$, for which:
2268: \begin{itemize}
2269: \item[(I)]    $\zeta = \Psi_L x_L \theta X_R 
2270: \Psi_R = b_{\zeta} p_{\zeta} b_{\zeta}$, and 
2271: \item[(II)]   $\eta = \Psi_L X_L \theta x_R \Psi_R = 
2272: b_{\eta}p_{\eta} b_{\eta}$.
2273: \end{itemize}
2274: 
2275: Note that, if  $|b_{\zeta}| \le |\Psi_L|$ then by (I) $b_{\zeta}$ 
2276: would be a border for $w$.  So we must have  $|b_{\zeta}| > |\Psi_L|$.
2277: Similarly, (II) gives that  $|b_{\eta}| > |\Psi_L|$.
2278: These latter facts together with (I) and (II) give that there exist 
2279: non-empty words $e$, $f$, $g$, $h$, for which $|e| = |f|$, $|g| = |h|$,
2280: and for which
2281: \begin{equation}\label{2nd_star1}
2282: b_{\zeta} = \Psi_L e = f \Psi_R,
2283: \end{equation}
2284: and
2285: \begin{equation}\label{2nd_star2}
2286: b_{\eta} = \Psi_L g = h \Psi_R.
2287: \end{equation}
2288: Further, $|\zeta| < |w|$ implies that $|f| \le n$, and similarly
2289: $|\eta| < |w|$ implies that $|h| \le n$.
2290: 
2291: Suppose $|b_{\eta}| = |b_{\zeta}|$.  Then from (\ref{2nd_star1}) and
2292: (\ref{2nd_star2}) above, $|e| = |g|$.
2293: But $e$ and $g$ are suffixes of $\Psi_R$, so 
2294: we get that $e = g$.  Hence $b_{\zeta} = \Psi_L e = \Psi_L g 
2295: = b_{\eta}$.  Set $b := b_{\zeta} = b_{\eta}$.   
2296: Then from (II) above, as $|b| \le |\Psi_L| + n$, $b$ is a prefix of 
2297: $\Psi_L X_L$.  And from (I) above, $b$ is a suffix of $X_R \Psi_R$.
2298: So $b$ is a non-empty prefix of $w$, and a suffix of $w$.  Hence, as $|b| 
2299: \le \frac{|w|} {2}$, $b$ is a border for $w$.  
2300: 
2301: So we must have $|b_{\eta}| \neq |b_{\zeta}|$.  Suppose first that 
2302: $|b_{\eta}| < |b_{\zeta}|$.  Now apply Lemma~\ref{loft4} to get that 
2303: there is a positive integer $k$, a non-empty word $p$ and a word $q$ for which 
2304: $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$.  And finally observe that 
2305: $|pq| + |g| = |f| \le n$. 
2306: If $|b_{\eta}| > |b_{\zeta}|$, the argument is similar, so Claim~\ref{claim2}
2307: is established.
2308: 
2309: \begin{claim}\label{claim3}
2310: Let $x := pq$ in the statement of Claim~\ref{claim2}.  There is a 
2311: conjugate $c_L$ of $x$ which is a prefix of 
2312: $\Psi_L$, and there is a conjugate $c_R$ of $x$ which is a suffix of 
2313: $\Psi_R$.
2314: \end{claim}
2315: 
2316: To justify Claim~\ref{claim3}, let $S_L$ be the prefix of length $n$ of
2317: $\Psi_L$.  So there is a word $T_L$ for which we can write
2318: $\Psi_L X_L \theta X_R = S_LT_L$.  (So $w$ is $S_L T_L \Psi_R$.)
2319: Now apply Lemma~\ref{loft3} to $(S_L, T_L, \Psi_R)$, obtaining a word $t_L$, 
2320: with $|t_L| < n$ for which $w_1 := S_L t_L \Psi_R \in L$.  
2321: By supposition, since $N \le |w_1| < |w|$, $w_1$ has a border, say $b_1$.  
2322: Further, if $|b_1| \le n$ then $b_1$ would be a border for $w$.
2323: So we must have $|b_1| >  n$.  And $|b_1| \le \frac{|w_1|} {2}$ implies
2324: $|b_1| \le |\Psi_R|$.  
2325: 
2326: So $b_1$ is a suffix of $\Psi_R$ of length greater than $n$; hence by
2327: Claim~\ref{claim2} above we can write $b_1 = s_x x^{k_2}pg$ for some integer
2328: $k_2 \ge 0$, with $s_x$ a suffix of $x$.  Write $x = p_xs_x$, and recall that
2329: $p$ is a prefix of $x$.  Then  $|s_x x^{k_2}pg| > n$ and $|x| + |g| \le n$
2330: (from Claim~\ref{claim2}) yields that $s_xp_x$ is a prefix of  
2331: $s_xx^{k_2}pg$, that is, $s_xp_x$ is a prefix of $b_1$.  So set $c_L := 
2332: s_xp_x$.  Since $b_1$ is a prefix of $w_1$,
2333: $c_L$ must be a prefix of $w_1$, and $|c_L| \le n = |S_L|$ gives that
2334: $c_L$ is a prefix of $S_L$, and the first statement of Claim~\ref{claim3}
2335: follows. 
2336: 
2337: To get the second statement of Claim~\ref{claim3}, similarly 
2338: let $S_R$ be the suffix of length $n$ of $\Psi_R$.
2339: So there is a word $T_R$ for which we can write $X_L \theta X_R \Psi_R =
2340: T_RS_R$.  (So $w$ is $\Psi_L T_R S_R$.)
2341: Now apply Lemma~\ref{loft3} to $(\Psi_L, T_R, S_R)$, obtaining a word $t_R$, with $|t_R| < n$ for
2342: which $w_2 := \Psi_L t_R S_R \in L$.  
2343: By supposition, since $N \le |w_2| < |w|$,  $w_2$ has a border, say $b_2$.  
2344: Further, if $|b_2|
2345: \le n$ then $b_2$ would be a border for $w$.  
2346: So we can assert that $n < |b_2| \le |\Psi_L|$.  
2347: 
2348: So $b_2$ is a prefix of $\Psi_L$ of length greater than $n$; hence by 
2349: Claim~\ref{claim2} we can write $b_2 = hx^{k_3}\rho_x$ for some integer
2350: $k_3 \ge  0$, with $\rho_x$
2351: a prefix of $x$.  Write $x = \rho_x \sigma_x$.  Then $|hx^{k_3} \rho_x | > n$ 
2352: and $|x| + |h| \le n$ (from Claim~\ref{claim2}) yields that $\sigma_x \rho_x$
2353: is a suffix of $hx^{k_3} \rho_x$, that is, $\sigma_x \rho_x$ is a suffix of
2354: $b_2$.  So  set $c_R := \sigma_x \rho_x$.  Since $b_2$ is a suffix of $w_2$,
2355: $c_R$ must be a suffix of $w_2$, and also $|c_R| \le n = |S_R|$ yields 
2356: that $c_R$ is a suffix of $S_R$, and the second statement of 
2357: Claim~\ref{claim3} follows. 
2358: 
2359: To complete the proof of Theorem~\ref{loft_thm1}, note that,
2360: since $c_L$ and $c_R$ are both conjugates of $x$,
2361: $c_L$ and $c_R$ are non-empty words which are conjugates.
2362: So there is a non-empty word $\alpha$ and 
2363: a word $\beta$ for which we can write $c_L = \alpha \beta$ and 
2364: $c_R = \beta \alpha$.  Then $\alpha$ is a prefix of 
2365: $\Psi_L$, and $\alpha$ is a suffix of $\Psi_R$, which gives that $\alpha$ 
2366: is a border for $w$, and gives a contradiction.
2367: \end{proof}
2368: 
2369: \begin{corollary}
2370: The problem of determining if an NFA accepts an unbordered word
2371: is decidable.
2372: \end{corollary}
2373: 
2374: \begin{proof}
2375: Let $M$ be an NFA with $n$ states.  To determine if $M$ accepts
2376: an unbordered word, it suffices to test whether $M$ accepts
2377: an unbordered word of length at most $6n+1$.
2378: \end{proof}
2379: 
2380: We do not know if there is a polynomial-time algorithm to
2381: test if an NFA accepts an unbordered word or if the problem is
2382: computationally intractable.
2383: 
2384: Theorem~\ref{loft_thm1} gives an upper bound of $6n+1$ on the length
2385: of a shortest unbordered word accepted by an $n$-state NFA.  The best
2386: lower bound we are able to come up with is $2n-3$, as illustrated by the
2387: following example: an NFA of $n$ states accepts
2388: $a b^{n-3} a b^*$, and the shortest unbordered word accepted is
2389: $a b^{n-3} a b^{n-2}$, which is of length $2n-3$.  
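
The bounded search behind the corollary above, and the example just given, can
be tried out with a naive enumeration.  In the Python sketch below the function
names and the dictionary encoding of $\delta$ are assumptions made here.  It
lists, in order of length, the words of length at most $6n+1$ that can still be
read by some computation, and returns the first accepted unbordered word.  The
search takes exponential time in general, so it only illustrates decidability
and says nothing about the open complexity question.

```python
def is_bordered(w):
    """True if w = u v u for some nonempty word u (v may be empty).

    Under this definition the empty word counts as unbordered.
    """
    return any(w[:k] == w[-k:] for k in range(1, len(w) // 2 + 1))


def shortest_unbordered(n, alphabet, delta, q0, finals):
    """Shortest accepted unbordered word of length <= 6n + 1, or None.

    delta maps (state, letter) to a set of states (missing key: no move);
    finals is a set of states.
    """
    level = [("", frozenset([q0]))]  # (word, states reachable on that word)
    for _length in range(6 * n + 2):
        for word, current in level:
            if current & finals and not is_bordered(word):
                return word
        nxt = []
        for word, current in level:
            for a in alphabet:
                step = frozenset(s for q in current for s in delta.get((q, a), ()))
                if step:  # drop words that no computation can read
                    nxt.append((word + a, step))
        level = nxt
    return None


# The example a b^{n-3} a b^* with n = 4, that is, a b a b^*.
delta = {(0, "a"): {1}, (1, "b"): {2}, (2, "a"): {3}, (3, "b"): {3}}
assert shortest_unbordered(4, "ab", delta, 0, {3}) == "ababb"  # length 2n - 3 = 5
```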
2390: 
2391: \begin{theorem}\label{loft_thm2}
2392: Let $M$ be an $n$-state NFA, and let $L$ be the language accepted by $M$.  
2393: Suppose there is an unbordered word in $L$ of length greater than $4n^2 + 6n + 1$.
2394: Then $L$ contains infinitely many unbordered words. 
2395: \end{theorem}
2396:     
2397: \begin{proof}
2398: Suppose $L$ contains only finitely many unbordered words. 
2399: Let $w$ be an unbordered word in $L$ of length greater than $4n^2 + 6n + 1$,
2400: with $|w|$ maximal.  
2401: Write $w$ as $\Psi_L X_L \theta X_R \Psi_R$ for words  $\Psi_L$, $X_L$, 
2402: $\theta$, $\Psi_R$, $X_R$ for which $|X_L| = |X_R| = n$, $|\Psi_L| = 
2403: |\Psi_R| > 2n^2 + 2n$, and $|\theta| \le 1$.    
2404:  We proceed by making the following series of observations.  
2405: 
2406: \begin{claim}\label{2nd_claim1}
2407: There are words $x_L$, $u_L$, $y_L$ and  $x_R$, $u_R$, $y_R$, 
2408: with $u_L$ and $u_R$ both non-empty, $X_L = x_Lu_Ly_L$, $X_R = x_Ru_Ry_R$, and 
2409: for which:  
2410: \begin{itemize}
2411: \item[(i)] $\zeta := \Psi_L x_Lu_Lu_Ly_L\theta X_R \Psi_R \in L$, and
2412: \item[(ii)] $\eta := \Psi_L X_L \theta x_Ru_Ru_Ry_R \Psi_R \in L$.
2413: \end{itemize}
2414: Further, $|\zeta | > |w|$, and $|\eta| > |w|$.  
2415: \end{claim}
2416: 
2417: To justify (i), apply Lemma~\ref{loft2} (with $d = 1$) to the 3-tuple $(\Psi_L, X_L, 
2418: \theta X_R \Psi_R)$.  Similarly, to arrive at (ii), apply Lemma~\ref{loft2} again 
2419: (also with $d = 1$) to the 3-tuple $(\Psi_L X_L \theta, X_R, \Psi_R)$.     
2420: 
2421: \begin{claim}\label{2nd_claim2}
2422: We can write $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$ 
2423: for $p$ a non-empty word, $g$, $h$ and $q$ words for which $|g| = |h|$, 
2424: $|pq| + |g| \le 2n$, and $k$ an integer $\ge n$.    
2425: Hence $w$ can be written as $h(pq)^{k}p X_L \theta X_R (pq)^{k}pg$. 
2426: \end{claim}
2427: 
2428: To justify Claim~\ref{2nd_claim2},
2429: first recall that $w = \Psi_L x_Lu_Ly_L \theta x_Ru_Ry_R  
2430: \Psi_R$, and $X_L = x_Lu_Ly_L$, $X_R = x_Ru_Ry_R$.    
2431: From Claim~\ref{2nd_claim1} above and the maximality of $|w|$ 
2432: we get that $\zeta$ and $\eta$ are bordered words, so  
2433: we can assert that there exist non-empty words $b_{\zeta}$ and 
2434: $b_{\eta}$, and words $p_{\zeta}$ and $p_{\eta}$, for which:
2435: \begin{itemize}
2436: \item[(I)]    $\zeta = \Psi_L x_Lu_Lu_Ly_L \theta X_R  
2437: \Psi_R = b_{\zeta} p_{\zeta} b_{\zeta}$, and 
2438: \item[(II)]   $\eta = \Psi_L X_L \theta x_Ru_Ru_Ry_R \Psi_R = 
2439: b_{\eta}p_{\eta} b_{\eta}$.
2440: \end{itemize}
2441: 
2442: Note that, if  $|b_{\zeta}| \le |\Psi_L|$ then by (I) $b_{\zeta}$ 
2443: would be a border for $w$.  So we must have  $|b_{\zeta}| > |\Psi_L|$.
2444: Similarly, (II) gives that  $|b_{\eta}| > |\Psi_L|$.
2445: These latter facts together with (I) and (II) give that there exist 
2446: non-empty words $e$, $f$, 
2447: $g$, $h$, for which $|e| = |f|$, $|g| = |h|$, and for which
2448: \begin{equation}\label{3rd_star1}
2449: b_{\zeta} = \Psi_L e = f \Psi_R,
2450: \end{equation}
2451: and
2452: \begin{equation}\label{3rd_star2}
2453: b_{\eta} = \Psi_L g = h \Psi_R.
2454: \end{equation}
2455:   
2456: Further, the reader can verify that $|e| \le 2n < |\Psi_R|$, and $|g| \le 2n < |\Psi_R|$. 
2457: 
2458: Suppose $|b_{\eta}| = |b_{\zeta}|$.  Then from (\ref{3rd_star1}) and (\ref{3rd_star2}) above, 
2459: $|e| = |g|$.  But $e$ and $g$ are suffixes of $\Psi_R$, so 
2460: we get that $e = g$.  Hence $b_{\zeta} = \Psi_L e = \Psi_L g 
2461: = b_{\eta}$.  Set $b := b_{\zeta} = b_{\eta}$.   
2462: Now $|u_Ly_L \theta X_R| > |x_Lu_L|$, so from (I) above, we must have 
2463: $|b| \le |u_Ly_L \theta X_R \Psi_R|$, that is, $b$ is a suffix of $u_Ly_L \theta X_R \Psi_R$. 
2464: Similarly, $|X_L \theta x_Ru_R| > |u_Ry_R|$, so from (II) above we get that
2465: $|b| \le |\Psi_L X_L \theta x_Ru_R|$, that is, $b$ is a prefix of $\Psi_L X_L \theta x_Ru_R$. 
2466: So $b$ is a non-empty prefix of $w$, and a suffix of $w$.
2467: Hence $w$ must be bordered, which is a contradiction. 
2468: 
2469: So we must have $|b_{\eta}| \neq |b_{\zeta}|$.  First, suppose 
2470: $|b_{\eta}| < |b_{\zeta}|$.
2471: Now apply Lemma~\ref{loft4} to get that 
2472: there is a positive integer $k$, a non-empty word $p$ and a word $q$ for which 
2473: $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$.  
2474: And finally observe that $|pq| + |g| = |f| \le 2n$, 
2475: and since $|\Psi_L| > 2n^2 + 2n$ and $|pq| \le 2n$, we get that $k \ge n$.
2476: The case $|b_{\eta}| > |b_{\zeta}|$ is symmetric,  so Claim~\ref{2nd_claim2}
2477: is established.     
2478: 
2479: \begin{claim}\label{2nd_claim3}
2480: Let $x := pq$ in the statement of Claim~\ref{2nd_claim2}.  There is a 
2481: conjugate $c_L$ of $x$ which is a prefix of 
2482: $\Psi_L$, and there is a conjugate $c_R$ of $x$ which is a suffix of 
2483: $\Psi_R$.
2484: \end{claim}
2485: 
2486: To justify Claim~\ref{2nd_claim3}, recall from Claim~\ref{2nd_claim2}
2487: that $w$ is $\Psi_L X_L \theta X_R x^{k}pg$. 
2488: And since $k \ge n$, we can apply Lemma~\ref{loft2} to the 3-tuple of words
2489: $(\Psi_LX_L \theta X_R, x^k, pg)$, with $d := |x|$, obtaining a 
2490: positive integer $J_1$ for which, for all $z \ge 0$, we have
2491: $\Psi_LX_L \theta X_R x^{k+J_1z}pg \in L$.  
2492: So choose $z_1 := |\Psi_LX_L \theta X_R|$, and define $w_1 :=
2493: \Psi_LX_L \theta X_R x^{k+J_1z_1}pg$.  By supposition $w_1$ is a bordered word, say 
2494: with border $b_1$.  Further, if $|b_1| \le |\Psi_R|$ then $b_1$ would be a border for $w$.  
2495: So we must have $|b_1| > |\Psi_R|$.  And $|b_1| \le \frac{|w_1|} {2}$ implies 
2496: $|b_1| \le |x^{k+J_1z_1}pg|$.  
2497: 
2498: So $b_1$ is a suffix of $x^{k+J_1z_1}pg$ of length greater than $|\Psi_R| > 2n$, 
2499: hence by Claim~\ref{2nd_claim2} above we can write
2500: $b_1 = s_x x^{k_2}pg$ for some integer $k_2 \ge 0$, 
2501: with $s_x$ a suffix of $x$.  Write $x = p_xs_x$, and recall that $p$ is a 
2502: prefix of $x$.  Then  $|s_x x^{k_2}pg| > 2n$ and $|x| + |g| \le 2n$ (from 
2503: Claim~\ref{2nd_claim2}) yields that $s_xp_x$ is a prefix of  
2504: $s_xx^{k_2}pg$, that is, $s_xp_x$ is a prefix of $b_1$.  So set $c_L := 
2505: s_xp_x$.  Since $b_1$ is a prefix of $w_1$,
2506: $c_L$ must be a prefix of $w_1$, and $|c_L| \le 2n$ gives that
2507: $c_L$ is a prefix of $\Psi_L$, and the first statement of
2508: Claim~\ref{2nd_claim3} follows. 
2509: 
2510: To justify the second statement of Claim~\ref{2nd_claim3},
2511: we proceed similarly; that is, we recall that
2512: $w$ is $hx^kpX_L \theta X_R \Psi_R$, and 
2513: apply Lemma~\ref{loft2} to the 3-tuple of words
2514: $(h, x^k, pX_L \theta X_R \Psi_R)$, with $d := |x|$, allowing us to assert that there is a
2515: positive integer $J_2$ for which, for all $z \ge 0$, we have
2516: $hx^{k+J_2z}pX_L \theta X_R \Psi_R \in L$.   
2517: So choose $z_2 := |pX_L \theta X_R \Psi_R|$, and define 
2518: $w_2 : = hx^{k+J_2z_2}pX_L \theta X_R \Psi_R $.  By supposition $w_2$ is a bordered word, say 
2519: with border $b_2$.   
2520: Further, if $|b_2| \le |\Psi_L|$ then $b_2$ would be a border for $w$.  So we must have $|b_2| > 
2521: |\Psi_L|$.  And $|b_2| \le \frac{|w_2|} {2}$ implies $|b_2| \le |hx^{k+J_2z_2}p|$.  
2522: 
2523: So $b_2$ is a prefix of $hx^{k+J_2z_2}p$ of length greater than $|\Psi_L| > 2n$; 
2524: hence by Claim~\ref{2nd_claim2} we can write
2525: $b_2 = hx^{k_3}\rho_x$ for some integer $k_3 \ge 
2526: 0$, with $\rho_x$ a prefix of $x$.  Write $x = \rho_x \sigma_x$.  
2527: Then $|hx^{k_3} \rho_x | > 2n$ and $|x| + |h| \le 2n$
2528: (from Claim~\ref{2nd_claim2}) yields that $\sigma_x \rho_x$ is a suffix of
2529: $hx^{k_3} \rho_x$, that is, $\sigma_x \rho_x$ is a suffix of $b_2$.  So 
2530: set $c_R := \sigma_x \rho_x$.  Since $b_2$ is a suffix of $w_2$,
2531: $c_R$ must be a suffix of $w_2$, and also $|c_R| \le 2n$ yields 
2532: that $c_R$ is a suffix of $\Psi_R$, and the second statement of 
2533: Claim~\ref{2nd_claim3} follows. 
2534: 
2535: To complete the proof of Theorem~\ref{loft_thm2},
2536: note that, since $c_L$ and $c_R$ are 
2537: both conjugates of $x$, $c_L$ and $c_R$ are non-empty 
2538: words which are conjugates.  So there is a non-empty word $\alpha$ and 
2539: a word $\beta$ for which we can write $c_L = \alpha \beta$ and 
2540: $c_R = \beta \alpha$.  Then $\alpha$ is a prefix of 
2541: $\Psi_L$, and $\alpha$ is a suffix of $\Psi_R$, which gives that $\alpha$ 
2542: is a border for $w$, which is a contradiction.  So we're forced to conclude
2543: that $L$ contains infinitely many unbordered words.   
2544: \end{proof}
2545: 
2546: \begin{theorem}\label{loft_thm3}
2547: Let $M$ be an $n$-state NFA, and let $L$ be the language accepted by $M$.  
2548: Then the following are equivalent:
2549: \begin{enumerate}
2550: \item $L$ contains infinitely many unbordered words. 
2551: \item There is an unbordered word $w$ in $L$, with $4n^2+6n+2 \le |w| \le 8n^2 + 18n + 5$.
2552: \end{enumerate}
2553: \end{theorem}
2554:     
2555: \begin{proof}
2556: (1) $\rightarrow$ (2).  Suppose all words $w \in L$ whose lengths are in 
2557: $[4n^2+6n+2, 8n^2 + 18n + 5]$ are bordered words.
2558: Then by Theorem~\ref{loft_thm1}, (with $N = 4n^2+6n+2$),
2559: we have that any word in $L$ whose length is at least $4n^2+6n+2$ is bordered, i.e., $L$
2560: contains at most finitely many unbordered words.  
2561: 
2562: (2) $\rightarrow$ (1).  This follows immediately from Theorem~\ref{loft_thm2}.
2563: \end{proof}
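
The upper end of the interval in (2) is exactly the bound $2N + 6n + 1$ of
Theorem~\ref{loft_thm1} evaluated at $N = 4n^2 + 6n + 2$:
\[
2(4n^2 + 6n + 2) + 6n + 1 = 8n^2 + 12n + 4 + 6n + 1 = 8n^2 + 18n + 5.
\]
For example, for $n = 2$ the interval is $[30, 73]$.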
2564: 
2565: \begin{corollary}
2566: The problem of determining if an NFA accepts infinitely many unbordered words
2567: is decidable.
2568: \end{corollary}
2569: 
2570: \begin{proof}
2571: Let $M$ be an NFA with $n$ states.  To determine if $M$ accepts
2572: infinitely many unbordered words, it suffices to test whether $M$ accepts
2573: an unbordered word $w$, where $4n^2+6n+2 \le |w| \le 8n^2 + 18n + 5$.
2574: \end{proof}
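
The test in this proof can be sketched as an exhaustive search over the length
window of Theorem~\ref{loft_thm3}.  The function name
\texttt{accepts\_infinitely\_many\_unbordered} and the dictionary encoding of
$\delta$ are assumptions of this sketch, and the enumeration is exponential in
general; it is meant only to make the decision procedure explicit.

```python
def is_bordered(w):
    """True if w = u v u for some nonempty word u (v may be empty)."""
    return any(w[:k] == w[-k:] for k in range(1, len(w) // 2 + 1))


def accepts_infinitely_many_unbordered(n, alphabet, delta, q0, finals):
    """Look for an accepted unbordered word w with lo <= |w| <= hi.

    delta maps (state, letter) to a set of states (missing key: no move);
    finals is a set of states; n is the number of states of the NFA.
    """
    lo = 4 * n * n + 6 * n + 2
    hi = 8 * n * n + 18 * n + 5
    level = [("", frozenset([q0]))]  # (word, states reachable on that word)
    for length in range(hi + 1):
        if length >= lo:
            for word, current in level:
                if current & finals and not is_bordered(word):
                    return True
        nxt = []
        for word, current in level:
            for a in alphabet:
                step = frozenset(s for q in current for s in delta.get((q, a), ()))
                if step:  # drop words that no computation can read
                    nxt.append((word + a, step))
        level = nxt
    return False
```

For the four-state NFA accepting $a b a b^*$ the window is $[90, 205]$, and the
word $a b a b^{87}$ of length $90$ is unbordered, so the function returns
\texttt{True}.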
2575: 
2576: We do not know if there is a polynomial-time algorithm to
2577: test if an NFA accepts infinitely many unbordered words or if the problem is
2578: computationally intractable.
2579: 
2580: \section{Final remarks}\label{concl}
2581: 
2582:       In this paper we examined the complexity of checking various
2583: properties of regular languages, such as consisting only of palindromes,
2584: containing at least one palindrome, consisting only of powers, or containing
2585: at least one power.  In each case (except for the unbordered words),
2586: we were able to provide an efficient algorithm or show that the problem
2587: is likely to be hard.  Our results are summarized in the following table.
2588: Here $M$ is an NFA with $n$ states and $t$ transitions.
2589: When $L$ is the language of unbordered words, it is an open problem
2590: to either find polynomial-time algorithms to test if 
2591: (a) $L(M) \intersect L = \emptyset$, and (b) $L(M) \intersect L$ is infinite,
2592: or to show the intractability of these problems.
2593: 
2594: \bigskip
2595: \begin{figure}[H]
2596: \begin{center}
2597: \begin{tabular}{|c|c|c|c|c|}
2598: \hline
2599:      & decide if & decide if & upper bound on & worst-case  \\
2600: $L$  & $L(M) \intersect L = \emptyset$ & $L(M) \intersect L$ & shortest element  & lower bound  \\
2601:      &      & infinite & of $L(M) \intersect L$ & known  \\
2602: \hline
2603: palindromes & $O(n^2+t^2)$ & $O(n^2+t^2)$ & $2n^2-1$ & ${{n^2}\over 2} - 3n+ 5$  \\
2604: \hline
2605: non-palindromes & $O(n^2+tn)$ & $O(n^2+t^2)$ & $3n-1$ & $3n-1$ \\
2606: \hline
2607: $k$-powers       & $O(n^{2k-1} t^k)$ & $O(n^{2k-1} t^k)$ & $kn^k$ &
2608: 	$\Omega(n^k)$  \\
2609: ($k$ fixed)  & & & &\\
2610: \hline
2611: $k$-powers & PSPACE- & PSPACE- & &  \\
2612: ($k$ part of input) & complete & complete & & \\
2613: \hline
2614: non-$k$-powers & $O(n^3 + t n^2)$ & $O(n^3 + t n^2)$ & $3n$ & $(2+{1 \over {2k-2}}) n - O(1)$ \\
2615: \hline
2616: powers & PSPACE- & PSPACE- & $(n+1)n^{n+1}$ & $e^{\Omega(\sqrt{n\log n})}$ \\
2617: & complete & complete & & \\
2618: \hline
2619: non-powers & $O(n^5)$ & $O(n^5)$ & $3n$  & ${5 \over 2} n - 2$\\
2620: \hline
2621: bordered words & $O(n^3 t^2)$ & $O(n^6 t^2)$ & $2n^2 + n- 1$ & $ {{n^2}\over 2} - 6n+ {{43} \over 2}$ \\
2622: \hline
2623: unbordered & decidable & decidable & $6n+1$ & $2n-3$ \\
2624: words & & & & \\
2625: \hline
2626: \end{tabular}
2627: \end{center}
2628: \end{figure}
2629: 
2630: \section*{Acknowledgments}
2631: 
2632: The algorithm mentioned in Section~\ref{nn} for testing if an NFA-$\epsilon$
2633: accepts infinitely many words was suggested to us by Timothy Chan.
2634: We would like to thank both him and Jack Zhao for their ideas on this subject.
2635: 
2636: \bibliography{abbrevs,pal}
2637: \bibliographystyle{new}
2638: 
2639: \end{document}
2640: