0711:0711.3183/pal.tex

1: \documentclass[12pt]{article}

2: \usepackage{amsmath,amsthm,amsfonts}

3: \usepackage{fullpage}

4: \usepackage{graphicx}

5: \usepackage[dvips]{epsfig}

6: \usepackage{epsf}

7: \usepackage{float}

8: \usepackage{wasysym}

9:

10: \def\divides{{\  | \ }}

11: \def\intersect{ \ \cap \ }

12: \def\union{\ \cup \ }

13:

14: \newtheorem{theorem}{Theorem}

15: \newtheorem{corollary}[theorem]{Corollary}

16: \newtheorem{proposition}[theorem]{Proposition}

17: \newtheorem{lemma}[theorem]{Lemma}

18: \newtheorem{claim}[theorem]{Claim}

19:

20: \title{Detecting Palindromes, Patterns and Borders in Regular Languages}

21:

22: \author{Terry Anderson, Narad Rampersad\footnote{Author's current address:

23: Department of Mathematics and Statistics, University of Winnipeg,

24: 515 Portage Ave., Winnipeg, MB R3B 2E9, Canada.},

25: Nicolae Santean\footnote{Author's current address:

26: Department of Computer and Information Sciences,

27: Indiana University South Bend, 1700 Mishawaka Ave.,

28: P.O. Box 7111, South Bend, IN 46634, U.S.A.}, and Jeffrey Shallit\\

29: School of Computer Science\\

30: University of Waterloo\\

31: Waterloo, ON  N2L 3G1, Canada\\

32: {\tt tanderson@uwaterloo.ca} \\

33: {\tt n.rampersad@uwinnipeg.ca} \\

34: {\tt nsantean@iusb.edu} \\

35: {\tt shallit@graceland.uwaterloo.ca}\medskip\\

36: John Loftus\\

37: Luzerne County Community College\\

38: 1333 South Prospect Street\\

39: Nanticoke, PA  18634, U.S.A.\\

40: {\tt jloftus@luzerne.edu}}

41:

42: \begin{document}

43: \date{\today}

44: \maketitle

45:

46: \begin{abstract}

47: Given a language $L$ and a nondeterministic finite automaton $M$,

48: we consider whether we can determine efficiently (in the size of $M$)

49: if $M$ accepts at least one word in $L$, or infinitely many words.

50: Given that $M$ accepts at least one word in $L$,

51: we consider how long a shortest word can be.

52: The languages $L$ that we examine include the

53: palindromes, the non-palindromes, the $k$-powers, the non-$k$-powers,

54: the powers, the non-powers (also called primitive words), the words matching

55: a general pattern, the bordered words, and the unbordered words.

56:

57: \bigskip\noindent

58: \textbf{Keywords:} palindrome, $k$-power, primitive word, pattern,

59: bordered word.

60: \end{abstract}

61:

62: \section{Introduction}

63:

64: Let $L \subseteq \Sigma^*$ be a fixed language, and let $M$ be a

65: deterministic finite automaton (DFA) or nondeterministic finite

66: automaton (NFA) with input alphabet $\Sigma$.

67: In this paper we are interested in three questions:

68:

69: \begin{enumerate}

70:

71: \item Can we efficiently decide (in terms of the size of $M$)

72: if $L(M)$ contains at least

73: one element of $L$, that is, if $L(M) \intersect L \not= \emptyset$?

74:

75: \item Can we efficiently decide if $L(M)$ contains infinitely

76: many elements of $L$, that is, if $L(M) \intersect L$ is infinite?

77:

78: \item Given that $L(M)$ contains at least one element of $L$, what is

79: a good upper bound on a shortest element of $L(M) \intersect L$?

80:

81: \end{enumerate}

82:

83: We can also ask the same questions about $\overline{L}$, the

84: complement of $L$.

85:

86: As an example, consider the case where $\Sigma = \lbrace {\tt a} \rbrace$,

87: $L$ is the set of primes written in unary, that is,

88: $\lbrace {\tt a}^i \ : \ i \text{ is prime } \rbrace$, and $M$ is an NFA with

89: $n$ states.

90:

91: To answer questions (1) and (2), we first rewrite $M$ in Chrobak

92: normal form \cite{Chrobak:1986}.  Chrobak normal form consists of an

93: NFA $M'$ with a

94: ``tail'' of $O(n^2)$ states, followed by a single nondeterministic

95: choice to a set of disjoint cycles containing at most $n$ states.

96: Computing this normal form can be achieved in $O(n^5)$ steps

97: by a result of Martinez \cite{Martinez:2002}.

98:

99: Now we examine each of the cycles produced by this transformation.

100: Each cycle accepts a finite union of sets of the form $({\tt a}^t)^*

101: {\tt a}^c$, where $t$ is the size of the cycle and $c \leq n^2 + n$;

102: both $t$ and $c$ are given explicitly from $M'$.  Now, by Dirichlet's

103: theorem on primes in arithmetic progressions, $\gcd(t,c) = 1$ for at

104: least one pair $(t,c)$ induced by $M'$ if and only if $M$ accepts

105: infinitely many elements of $L$.  This can be checked in $O(n^2)$

106: steps, and so we get a solution to question (2) in polynomial time.
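
As a small illustration (the pairs below are chosen for concreteness and are
not derived from any particular automaton), suppose one cycle of $M'$ yields
the pair $(t,c) = (4,3)$.  The corresponding accepted lengths are
$$ 3,\ 7,\ 11,\ 15,\ 19,\ \ldots $$
and since $\gcd(4,3) = 1$, Dirichlet's theorem guarantees that infinitely
many of them are prime.  By contrast, the pair $(t,c) = (6,4)$ has
$\gcd(6,4) = 2$ and yields the lengths $4, 10, 16, \ldots$, all of which
are even and greater than $2$, so none of them is prime.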

107:

108: Question (1) requires a little more work.  From our answer to question

109: (2), we may assume that $\gcd(t,c) > 1$ for all pairs $(t,c)$, for

110: otherwise $M$ accepts infinitely many elements of $L$ and hence at

111: least one element.  Each element in such a set is of length $kt+c$ for

112: some $k \geq 0$.   Let $d = \gcd(t,c) \geq 2$.  Then $kt+c = (kt/d +

113: c/d)d$.  If $k > 1$, this quantity is at least $2d$ and hence

114: composite.  Thus it suffices to check the primality of $c$ and $t+c$,

115: both of which are at most $n^2 + 2n$.  We can precompute

116: the primes $< n^2 + 2n$ in

117: $O(n^2)$ time using a modification of the sieve of Eratosthenes

118: \cite{Pritchard:1987}, and check if any of them are accepted.  This

119: gives a solution to question (1) in polynomial time.
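
To see why both $c$ and $t+c$ need to be tested, consider two hypothetical
pairs (chosen only for illustration).  For $(t,c) = (6,3)$ we have $d = 3$,
and the lengths $kt+c$ are $3, 9, 15, \ldots$; only $c = 3$ (the case $k = 0$)
is prime.  For $(t,c) = (5,0)$ we have $d = \gcd(5,0) = 5$, and the lengths
are $0, 5, 10, \ldots$; only $t + c = 5$ (the case $k = 1$) is prime.
In both cases every length with $k \geq 2$ is a proper multiple of $d$, since
$$ kt + c = \left( \frac{kt}{d} + \frac{c}{d} \right) d
\quad \text{ and } \quad \frac{kt}{d} + \frac{c}{d} \geq 2 . $$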

120:

121: On the other hand, answering question (3) essentially amounts to

122: estimating the size of the least prime in an arithmetic progression, an

123: extremely difficult question that is still not fully resolved

124: \cite{Heath-Brown:1992}, although it is known that there is a

125: polynomial upper bound.

126:

127: Even the case where $L$ is regular can be difficult.  Suppose $L$

128: is represented as the complement of a language accepted by an NFA $M'$ with

129: $n$ states.  Then if $L(M) =\Sigma^*$, question (1) amounts to asking

130: if $L(M') \not= \Sigma^*$, which is PSPACE-complete

131: \cite[Section 10.6]{Aho&Hopcroft&Ullman:1974}.  Question (2) amounts to

132: asking if $\overline{L(M')}$ is infinite, which is also

133: PSPACE-complete \cite{Kao&Shallit&Xu:2007}.

134: Question (3) amounts to asking for good bounds on the smallest string not

135: accepted by an NFA.  There is an evident upper bound of $2^n$, and

136: there are examples known that achieve $2^{cn}$ for some constant

137: $c > 0$, but more detailed analysis is still lacking

138: \cite{Ellul&Krawetz&Shallit&Wang:2004}.

139:

140: Thus we see that asking these questions, even for relatively simple

141: languages $L$, can quickly take us to

142: the limits of what is known in formal language theory and number theory.

143:

144: In this paper we examine questions (1)--(3) in the case where $M$

145: is an NFA and $L$ is either the set of palindromes, the set of

146: $k$-powers, the set of powers, the set of words matching a general pattern,

147: the set of bordered words, or their complements.

148:

149:    In some of these cases, there is previous work.

150: For example, Ito et al.\

151: \cite{Ito&Katsura&Shyr&Yu:1988} studied several

152: circumstances in which primitive words (non-powers) may appear in

153: regular languages. As a typical result in

154: \cite{Ito&Katsura&Shyr&Yu:1988}, we mention:

155: ``A DFA over an alphabet of $2$ or more letters accepts a primitive

156: word iff it accepts one of length $\leq 3n-3$, where $n$ is the

157: number of states of the DFA''.

158: Horv\'ath, Karhum\"aki and Kleijn \cite{Horvath&Karhumaki&Kleijn:1987}

159: addressed the decidability problem of whether a language

160: accepted by an NFA is palindromic (i.e., every element is a palindrome).

161: They showed that the

162: language accepted by an NFA with $n$ states is palindromic

163: if and only if all its words of length shorter than $3n$

164: are palindromes.

165:

166:      Here is a summary of the rest of the paper.  In section~\ref{nn},

167: we define the objects of study and our notation.

168:

169: In section~\ref{onepal}, we begin our study of palindromes.  We give

170: efficient algorithms to test if an NFA accepts at least one palindrome,

171: or infinitely many.  We also show that a shortest palindrome accepted

172: is of length at most quadratic, and further, that quadratic examples

173: exist.  In section~\ref{algpal}, we give efficient algorithms to test

174: if an NFA accepts at least one non-palindrome, or infinitely many.

175: Further, we give a tight bound on the length of a shortest

176: non-palindrome accepted.

177:

178: In section~\ref{pow_test}, we begin our study of patterns.  We show that

179: it is PSPACE-complete to test if a given NFA accepts a word matching a

180: given pattern.  As a special case of this problem we consider testing

181: if an NFA accepts a $k$-power.  We give an

182: algorithm to test if a $k$-power is accepted that is polynomial in $k$.

183: If $k$ is not fixed, the problem is PSPACE-complete.

184: We also study the problem of accepting a power of exponent $\geq k$,

185: and of accepting infinitely many $k$-powers.

186:

187: In section~\ref{kp},

188: we give a polynomial-time algorithm to decide if a non-$k$-power

189: is accepted.  We also give upper and lower bounds

190: on the length of a shortest $k$-power accepted.  In

191: section~\ref{powers}, we give an efficient algorithm for

192: determining if an NFA accepts at least one non-power.

193: In section~\ref{smallkp}, we bound the length of the smallest power.

194: Section~\ref{add2pow} gives some additional results on powers.

195:

196: In section~\ref{bord}, we show how to test if an NFA accepts a bordered

197: word, or infinitely many,

198: and show that a shortest bordered word accepted can be of

199: quadratic length.  In section~\ref{unbord} we give an algorithm

200: to test if an NFA accepts an unbordered word, or infinitely many,

201: and we establish a linear upper bound on the length of a shortest

202: unbordered word.

203:

204: \section{Notions and notation}\label{nn}

205:

206: Let $\Sigma$ be an alphabet, i.e., a nonempty, finite set

207: of symbols (letters). By $\Sigma^*$ we denote the set of

208: all finite words (strings of symbols) over $\Sigma$, and by

209: $\epsilon$, the empty word (the word having zero

210: symbols). The operation of concatenation (juxtaposition) of

211: two words $u$ and $v$ is denoted by $u\cdot v$, or simply

212: $uv$.  If $w \in \Sigma^*$ is written in the form $w=xy$ for

213: some $x,y \in \Sigma^*$, then the word $yx$ is said to be a

214: {\it conjugate} of $w$.

215:

216: For $w\in \Sigma^*$, we denote by $w^R$ the word

217: obtained by reversing the order of symbols in $w$. A {\it

218: palindrome} is a word $w$ such that $w = w^R$. If $L$ is a

219: language over $\Sigma$, i.e., $L\subseteq \Sigma^*$, we say

220: that $L$ is {\it palindromic} if every word $w \in L$ is a

221: palindrome.

222:

223: Let $k \geq 2$ be an integer.  A word $y$ is a

224: \emph{$k$-power} if $y$ can be written as $y = x^k$ for

225: some non-empty word $x$.  If $y$ cannot be so written for

226: any $k \geq 2$, then $y$ is \emph{primitive}. A $2$-power

227: is typically referred to as a \emph{square}, and a

228: $3$-power as a \emph{cube}.

229:

230: Patterns are a generalization of powers.  A \emph{pattern}

231: is a non-empty word $p$ over a \emph{pattern alphabet} $\Delta$.  The

232: letters of $\Delta$ are called \emph{variables}.  A pattern

233: $p$ \emph{matches} a word $w \in \Sigma^*$ if there exists a non-erasing

234: morphism $h : \Delta^* \to \Sigma^*$ such that $h(p) = w$.  Thus,

235: a word $w$ is a $k$-power if it matches the pattern $a^k$.
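
For instance (a small example, with the variable names $x, y$ introduced
here only for illustration), let $\Delta = \lbrace x, y \rbrace$ and
$p = xyx$.  The non-erasing morphism $h$ defined by
$h(x) = {\tt ab}$ and $h(y) = {\tt c}$ gives
$$ h(p) = h(x) h(y) h(x) = {\tt ab} \cdot {\tt c} \cdot {\tt ab} = {\tt abcab}, $$
so $p$ matches ${\tt abcab}$.  On the other hand, $p$ matches no word of
length less than $3$, since a non-erasing morphism sends each of the
three variable occurrences in $p$ to a non-empty word.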

236:

237:       Bordered words are generalizations of powers.  We say a

238: word $x$ is {\it bordered}

239: if there exist words $u \in \Sigma^+$, $w \in \Sigma^*$

240: such that $x = uwu$.  In this case, the word $u$ is said to be a

241: {\it border} for $x$.  Otherwise, $x$ is {\it unbordered}.
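
As a brief illustration, the word ${\tt abcab}$ is bordered, with
$u = {\tt ab}$ and $w = {\tt c}$.  Every $k$-power $x^k$ with $k \geq 2$ is
bordered, as one sees by taking $u = x$ and $w = x^{k-2}$; this is the sense
in which bordered words generalize powers.  The word ${\tt aab}$, by
contrast, is unbordered: a border $u$ must satisfy $|u| \leq 1$ here, and
the prefix ${\tt a}$ of length $1$ differs from the suffix ${\tt b}$ of
length $1$.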

242:

243: A nondeterministic finite automaton (NFA) over $\Sigma$

244: is a $5$-tuple $M=(Q, \Sigma, \delta, q_0, F)$ where $Q$

245: is a finite set of states, $\delta : Q\times \Sigma

246: \rightarrow 2^{Q}$ is a next-state

247: function, $q_0$ is an initial state and $F\subseteq Q$ is a

248: set of final states. We sometimes view $\delta$ as a

249: transition table, i.e., as a set consisting of tuples $(p,

250: a, q)$ with $p, q\in Q$ and $a\in\Sigma$.

251: The machine $M$ is deterministic (DFA) if $\delta$ is a function

252: mapping $Q\times \Sigma\rightarrow Q$. We consider only {\em

253: complete} DFAs, that is, those whose transition function

254: is a total function.  Sometimes we use NFA-$\epsilon$, which are

255: NFAs that also allow transitions on the empty word.

256:

257: The size of $M$ is the total number $N$ of its

258: states and transitions. When we want to emphasize the components of $M$,

259: we say $M$ has $n$ states and $t$ transitions, and define $N := n+t$.

260: The language of $M$,

261: denoted by $L(M)$, belongs to the family of {\em regular

262: languages} and consists of those words accepted by $M$ in

263: the usual sense. A {\em successful path}, or {\em

264: successful computation} of $M$ is any computation starting

265: in the initial state and ending in a final state. The label

266: of a computation is the input word that triggered it; thus,

267: the language of $M$ is the set of labels of all successful

268: computations of $M$.

269:

270: A state of $M$ is {\em accessible} if there exists a path in the

271: associated transition graph, starting from $q_0$ and ending

272: in that state. By convention, there exists a path from each

273: state to itself labeled with $\epsilon$. A state $q$ is

274: {\em coaccessible} if there exists a path from $q$ to some

275: final state. A state which is both accessible and

276: coaccessible is called {\em useful}, and if it is not

277: coaccessible it is called {\em dead}.

278:

279: We note that if $M$ is an NFA or NFA-$\epsilon$, we can remove all states that

280: are not useful in linear time (in the number of states and transitions)

281: using depth-first search.  We observe that $L(M) \not= \emptyset$ if

282: and only if any states remain after this process, which can be

283: tested in linear time.  Similarly, if $M$ is an NFA, then $L(M)$ is infinite

284: if and only if the corresponding digraph has a directed cycle.

285: This can also be tested in linear time.

286:

287: If $M$ is an NFA-$\epsilon$, then to check if $L(M)$ is infinite

288: we need to know not only that the corresponding digraph has a cycle, but

289: that it has a cycle labeled by a non-empty word.  This can also be

290: checked in linear time as follows.  Let us suppose that all non-useful

291: states of $M$ have been removed.  We wish to test whether there is

292: some edge of the digraph of $M$ that is part of some cycle and is not

293: labeled by the empty word.  We now observe that an edge of a digraph

294: belongs to a directed cycle if and only if both of its endpoints lie within

295: the same strongly connected component.  It is well known that the strongly

296: connected components of a graph can be computed in linear time

297: (see \cite[Section~22.5]{CLRS01}).  Once the strongly connected components

298: of the NFA-$\epsilon$ are known, we simply check the edges not

299: labeled by $\epsilon$ to determine if there is such an edge with both

300: endpoints in the same strongly connected component.  Thus we can

301: determine if $L(M)$ is infinite in linear time.
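
The following small example (a hypothetical automaton, introduced here only
to illustrate the test) shows why a cycle alone is not enough.  Let $M$ have
states $p, q, r$, with $p$ initial and $r$ final, and edges
$$ p {\buildrel \epsilon \over\longrightarrow} q, \quad
q {\buildrel \epsilon \over\longrightarrow} p, \quad
q {\buildrel {\tt a} \over\longrightarrow} r . $$
All states are useful, and the strongly connected components are
$\lbrace p, q \rbrace$ and $\lbrace r \rbrace$.  The digraph has a cycle, but
the only edge not labeled by $\epsilon$ joins two different components,
so $L(M) = \lbrace {\tt a} \rbrace$ is finite.  If we add the edge
$r {\buildrel {\tt b} \over\longrightarrow} p$, then all three states lie in
a single strongly connected component, the edge labeled ${\tt a}$ has both
endpoints in it, and $L(M) = {\tt a} ({\tt ba})^*$ is infinite.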

302:

303: Although the results of this paper are generally stated as applying

304: to NFAs, by virtue of the preceding algorithm, one sees that the

305: results apply equally well to NFA-$\epsilon$s.

306:

307: We will also need the following well-known results

308: \cite{Hopcroft&Ullman:1979}:

309:

310: \begin{theorem}

311: Let $M$ be an NFA with $n$ states.  Then

312: \begin{itemize}

313: \item[(a)]   $L(M) \not= \emptyset$ if and only if $M$ accepts a word

314: of length $< n$.

315:

316: \item[(b)] $L(M)$ is infinite if and only if $M$ accepts a word

317: of length $\ell$, $n \leq \ell < 2n$.

318: \end{itemize}

319: \label{hopcroft}

320: \end{theorem}

321:

322: If $L \subseteq \Sigma^*$ is a language, the \emph{Myhill--Nerode equivalence

323: relation} $\equiv_L$ is the equivalence relation defined as

324: follows:  for $x,y \in \Sigma^*$, $x \equiv_L y$ if for all $z \in \Sigma^*$,

325: $xz \in L$ if and only if $yz \in L$.  The classical Myhill--Nerode theorem

326: asserts that if $L$ is regular, the equivalence relation $\equiv_L$ has

327: only finitely many equivalence classes.

328:

329: For a background on finite automata and regular languages

330: we refer the reader to Yu \cite{YU97}.

331:

332: \section{Testing if an NFA accepts at least one palindrome}

333: \label{onepal}

334:

335:      Over a unary alphabet, every string is a palindrome, so problems

336: (1)--(3) become trivial.  Let us assume, then, that the alphabet $\Sigma$

337: contains at least two letters.  Although the palindromes over such an

338: alphabet are not regular, the language

339: $$ \lbrace x \in \Sigma^* \ : \ x x^R \in  L(M) \text{ or there exists } a \in \Sigma \text{ such that } x a x^R \in L(M) \rbrace$$

340: is, in fact, regular, as is often shown in a beginning course in formal

341: languages \cite[p.\ 72, Exercise 3.4 (h)]{Hopcroft&Ullman:1979}.  We

342: can take advantage of this as follows:

343:

344: \begin{lemma}

345:       Let $M$ be an NFA with $n$ states and $t$ transitions.  Then there

346: exists an NFA-$\epsilon$ $M'$ with $n^2+1$

347: states and $\leq 2t^2$ transitions such that

348: $$L(M') = \lbrace x \in \Sigma^* \ : \ x x^R \in L(M) \text{ or there

349: 	exists } a \in \Sigma \text{ such that } x a x^R \in L(M) \rbrace.$$

350: \label{pal-con}

351: \end{lemma}

352:

353: \begin{proof}

354: Let $M=(Q, \Sigma, \delta, q_0, F)$ be an NFA

355: with $n$ states. We construct an NFA-$\epsilon$

356: $M' = (Q', \Sigma, \delta', q'_0, F')$ as follows:

357: We let $Q'=Q\times Q\cup \{q_0'\}$, where $q_0'$ is the new initial state,

358: and we define

359: the set of final states by

360: $$F' = \lbrace [p, p] \ : \ p\in Q\}\cup \{[p, q] \ : \ \text{ there exists }

361: a \in \Sigma \text{ such that } q \in \delta(p,a) \rbrace.$$

362: The transition function $\delta'$ is defined as follows:

363: $$ \delta'(q'_0, \epsilon) = \lbrace [q_0,q] \ : \ q \in F \rbrace$$

364: and

365: $$\delta'([p,q], a) = \lbrace [r,s] \ : \ r \in \delta(p,a) \text{ and }

366: q \in \delta(s, a) \rbrace.$$

367:

368: It is clear that $M'$ accepts the desired language and consists of at most

369: $n^2+1$ states and $2t^2$ transitions.

370: \end{proof}
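
To illustrate the construction on a small example (the automaton below is
hypothetical and chosen only for concreteness), let $M$ have states
$q_0, q_1, q_2, q_3$ with $F = \lbrace q_3 \rbrace$ and transitions
$\delta(q_0, {\tt a}) = \lbrace q_1 \rbrace$,
$\delta(q_1, {\tt b}) = \lbrace q_2 \rbrace$,
$\delta(q_2, {\tt a}) = \lbrace q_3 \rbrace$, so that
$L(M) = \lbrace {\tt aba} \rbrace$.  In $M'$ we have
$$ \delta'(q'_0, \epsilon) = \lbrace [q_0, q_3] \rbrace
\quad \text{ and } \quad
\delta'([q_0,q_3], {\tt a}) = \lbrace [r,s] \ : \ r \in \delta(q_0, {\tt a})
\text{ and } q_3 \in \delta(s, {\tt a}) \rbrace = \lbrace [q_1, q_2] \rbrace . $$
The state $[q_1, q_2]$ is final because $q_2 \in \delta(q_1, {\tt b})$.
Hence $M'$ accepts $x = {\tt a}$, which corresponds to the palindrome
$x \, {\tt b} \, x^R = {\tt aba}$ accepted by $M$.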

371:

372: \begin{corollary}

373:     Given an NFA $M$ with $n$ states and $t$ transitions,

374: we can determine if $M$ accepts a palindrome in $O(n^2 + t^2)$ time.

375: \end{corollary}

376:

377: \begin{proof}

378:       We create $M'$ as in the proof of Lemma~\ref{pal-con},

379: and remove all states that are not useful, and

380: their associated transitions.  Now $M$ accepts

381: at least one palindrome if and only if $L(M')\not=\emptyset$, which can

382: be tested in time linear in the number of transitions and states of $M'$.

383: \end{proof}

384:

385:       From Lemma~\ref{pal-con}, we obtain two other interesting

386: corollaries.

387:

388: \begin{corollary}

389:       Given an NFA $M$, we can determine if $L(M)$ contains infinitely

390: many palindromes in quadratic time.

391: \label{inf-pal}

392: \end{corollary}

393:

394: \begin{proof}

395:        We create $M'$ as in the proof of Lemma~\ref{pal-con}, and remove

396: all states that are not useful, and their associated transitions.

397: $M$ accepts infinitely many palindromes if and only if $L(M')$ is infinite,

398: which can be tested in linear time, as described in Section~\ref{nn}.

399: \end{proof}

400:

401: \begin{corollary}

402:      If an NFA $M$ with $n$ states accepts at least one palindrome, it accepts a

403: palindrome of length $\leq 2n^2 -1$.

404: \end{corollary}

405:

406: \begin{proof}

407:       Suppose $M$ accepts at least one palindrome.  Then $M'$, as in

408: Lemma~\ref{pal-con}, accepts a word.  Although $M'$ has $n^2+1$ states,

409: the only transition from the initial state $q'_0$ is

410: an $\epsilon$-transition to one of the other $n^2$ states.  Thus if

411: $M'$ accepts a word, it must accept a word of length $\leq n^2 - 1$.

412: Then $M$ accepts

413: either $w w^R$ or $w a w^R$, and both are palindromes, so $M$

414: accepts a palindrome of length at most $2(n^2 - 1) + 1 = 2n^2 - 1$.

415: \end{proof}

416:

417:      For a different proof of this corollary, see Rosaz \cite{Rosaz:2002}.

418:

419:       We observe that the quadratic bound is tight, up to

420: a multiplicative constant, in the case of alphabets with at

421: least two letters, and even for DFAs:

422:

423: \begin{proposition}

424: For infinitely many $n$ there exists a DFA $M_n$

425: over a $2$-letter alphabet such that

426:     \begin{itemize}

427:     \item[(a)] $M_n$ has $n$ states;

428:     \item[(b)] The shortest palindrome accepted by $M_n$ is

429:     of length $\geq n^2/2 - 3n + 5$.

430:     \end{itemize}

431: \end{proposition}

432:

433: \begin{proof}

434:      For $t \geq 2$,

435: consider the language $L_t = ({\tt a}^t)^+ {\tt b} ({\tt a}^{t-1})^+$.

436: This language evidently can be accepted by a DFA with $n = 2t+2$ states.

437: For a word $w \in L_t$ to be a palindrome, we must have

438: $w = {\tt a}^{c_1 t} {\tt b} {\tt a}^{c_2 (t-1)}$, for some

439: integers $c_1, c_2 \geq 1$, with $c_1 t = c_2 (t-1)$.  Since $t$ and

440: $t-1$ are relatively prime, we must have $t-1 \divides c_1$ and

441: $t \divides c_2$.  Thus the shortest palindrome in $L_t$ is

442: ${\tt a}^{t(t-1)} {\tt b} {\tt a}^{t(t-1)}$, which is of length

443: $2t^2 - 2t + 1 = n^2/2 - 3n + 5$.

444: \end{proof}
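
As a check of the arithmetic in a small case, take $t = 3$, so that
$L_3 = ({\tt a}^3)^+ {\tt b} ({\tt a}^2)^+$ and $n = 2t + 2 = 8$.
The condition $3 c_1 = 2 c_2$ is first satisfied by $c_1 = 2$, $c_2 = 3$,
giving the palindrome ${\tt a}^6 {\tt b} {\tt a}^6$ of length
$$ 2t^2 - 2t + 1 = 18 - 6 + 1 = 13 = \frac{8^2}{2} - 3 \cdot 8 + 5 . $$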

445:

446: \section{Testing if an NFA accepts at least one non-palindrome}

447: \label{algpal}

448:

449:     In this section we consider the problem of deciding if an

450: NFA accepts at least one non-palindrome.  Evidently, if an NFA

451: fails to accept a non-palindrome, it must accept nothing but

452: palindromes, and so we discuss the opposite decision problem,

453:

454: \medskip

455:

456: \centerline{Given an NFA $M$, is $L(M)$ palindromic?}

457:

458: \medskip

459:

460:     Again, the problem is trivial for a unary alphabet, so we

461: assume $|\Sigma| \geq 2$.

462:

463: Horv\'ath, Karhum\"aki, and Kleijn

464: \cite{Horvath&Karhumaki&Kleijn:1987} proved that the

465: question is recursively solvable.

466: In particular, they proved the following theorem:

467:

468: \begin{theorem}

469: $L(M)$ is palindromic if and only if $\lbrace x \in L(M) \

470: : \ |x| < 3n \rbrace$ is palindromic, where $n$ is the

471: number of states of $M$. \label{hkk}

472: \end{theorem}

473:

474: While a naive implementation of Theorem~\ref{hkk} would

475: take exponential time, in this section we show how to

476: test palindromicity in polynomial time. We

477: also show the bound of $3n$ in

478: Theorem~\ref{hkk} is tight for NFAs, and we improve the bound for

479: DFAs.

480:

481: First, we show how to construct a ``small'' NFA $M'_s$, for

482: some integer $s >1$, that has the following properties:

483:

484: \begin{itemize}

485: \item[(a)] no word in $L(M'_s)$ is a palindrome;

486:

487: \item[(b)] $M'_s$ accepts all non-palindromes of length $< s$  (in addition to some

488: other non-palindromes).

489:

490: \end{itemize}

491:      The idea in this construction is the following:  on input

492: $w$ of length $r<s$, we ``guess'' an index $i$, $1 \leq i

493: \leq r/2$, such that $w[i] \not= w[r+1-i]$.  We then

494: ``verify'' that there is indeed a mismatch $i$ characters

495: from each end. We can re-use states, as illustrated in

496: Figure~\ref{fig:pred2} for the case $\Sigma = \lbrace {\tt

497: a,b,c} \rbrace$ and $s = 10$.

498:

499: \begin{figure}[H]

500: \input nonpal.tex

501: \caption{Accepting non-palindromes over $\lbrace {\tt

502: a,b,c} \rbrace$ for $s = 10$.} \label{fig:pred2}

503: \end{figure}
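
To make the guess-and-verify idea concrete, consider the word
$w = {\tt abcaa}$ of length $r = 5$ (an example chosen here for
illustration).  The candidate indices are $i = 1$ and $i = 2$.  For $i = 1$
we have $w[1] = {\tt a} = w[5]$, so this guess fails; for $i = 2$ we have
$$ w[2] = {\tt b} \not= {\tt a} = w[4], $$
so this guess succeeds and $w$ is accepted.  For the palindrome
${\tt abcba}$, no choice of $i$ exhibits a mismatch, and the word is rejected
on every computation path.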

504:

505:       The resulting NFA $M'_s$ has

506: % $(|\Sigma| + 2)(\lfloor (t-1)/2 \rfloor)$

507: $O(|\Sigma| s)$ states

508: and

509: % $2 (\lfloor (t-1)/2 \rfloor) |\Sigma| (|\Sigma| + 1) - 2|\Sigma|$

510: $O(|\Sigma|^2 s)$ transitions.  A similar construction appears

511: in \cite{Shallit&Breitbart:1996}.

512:

513: Given an NFA $M$ with $n$ states, we now construct the

514: cross-product with $M'_{3n}$, and obtain an NFA $A$ that

515: accepts $L(M) \ \cap \ L(M'_{3n})$. We claim that $L(A) =

516: \emptyset$ if and only if $L(M)$ is palindromic. For if

517: $L(A) = \emptyset$, then $M$ accepts no non-palindrome of

518: length $< 3n$, and so by Theorem~\ref{hkk}, $L(M)$ is

519: palindromic. If $L(A) \not= \emptyset$, then since

520: $L(M'_{3n})$ contains only non-palindromes, we see that

521: $L(M)$ is not palindromic.

522:

523: We can determine if $L(A) = \emptyset$ efficiently by

524: adding a new final state $q_f$ and

525: $\epsilon$-transitions from all the final states of $A$

526: to $q_f$, then performing a depth-first search to detect

527: whether there are any paths from $q_0$ to $q_f$.  This can

528: be done in time linear in the number of states and

529: transitions of $A$.  If $M$ has $n$ states and $t$

530: transitions, then $A$ has $O(n^2)$ states and

531: $O(tn)$ transitions.   Hence we have proved the following theorem.

532:

533: \begin{theorem}

534: Let $M$ be an NFA with $n$ states and $t$ transitions.

535: The algorithm sketched above determines whether

536: $M$ accepts a palindromic language in $O(n^2 + tn)$ time.

537: \label{thm2}

538: \end{theorem}

539:

540:       A different method runs slightly slower, but allows us

541: to do a little more.    We can mimic the construction for palindromes

542: in Section~\ref{onepal}, but adapt it for non-palindromes.  Given

543: an NFA $M$, we construct an NFA-$\epsilon$ $M'$ that accepts the language

544: \begin{eqnarray*}

545: \lbrace x \in \Sigma^* &:& \text{there exists } x' \in \Sigma^*,

546: a \in \Sigma \text{ such that } |x| = |x'|, x \not= {x'}^R,\\

547: &&\text{ and } x x' \in L(M) \text{ or } x a x' \in L(M) \rbrace.

548: \end{eqnarray*}

549: The construction is similar to that in Lemma~\ref{pal-con}.  On input

550: $x$, we simulate $M$ on $x x'$ and $x a x'$ symbol-by-symbol, moving

551: forward from the start state and backward from a final state.

552: We need an additional boolean ``flag'' for each state to record whether or not

553: we have processed a character in $x'$ that would mismatch the corresponding

554: character in $x$.   If $M$ has $n$ states and $t$ transitions,

555: this construction produces an NFA-$\epsilon$ $M'$ with

556: $\leq 1+2n^2$ states and $O(t^2)$ transitions.  From this we get,

557: in analogy with Corollary~\ref{inf-pal}, the following proposition.

558:

559: \begin{proposition}

560:      Given an NFA $M$ with $n$ states and $t$ transitions, we can determine in

561: $O(n^2 + t^2)$ time if $M$ accepts infinitely many non-palindromes.

562: \end{proposition}

563:

564: We now turn to the question of the optimality of the $3n$

565: bound given in Theorem~\ref{hkk}. For an NFA over an

566: alphabet of at least $2$ symbols, the bound is indeed

567: optimal, as the following example shows.

568:

569: \begin{proposition}

570: Let $\Sigma$ be an alphabet of at least two symbols, containing the

571: letters $\tt a$ and $\tt b$.

572: For $n \geq 1$

573: define $L_n =  ({\tt a}^{n-1} \Sigma)^* {\tt a}^{n-1}$.

574: Then $L_n$ can be accepted by an NFA with $n$ states and a shortest

575: non-palindrome in $L_n$ is ${\tt a}^{n-1} {\tt a} {\tt a}^{n-1} {\tt b}

576: {\tt a}^{n-1}$.

577: \label{prope}

578: \end{proposition}

579:

580: \begin{proof}

581: The details are straightforward.

582: \end{proof}
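
For the reader who wants to see the smallest nontrivial case, take $n = 2$,
so that $L_2 = ({\tt a} \Sigma)^* {\tt a}$.  The words of $L_2$ of length
at most $3$ are ${\tt a}$ and ${\tt a} x {\tt a}$ for $x \in \Sigma$, all of
which are palindromes.  The words of length $5$ are
${\tt a} x {\tt a} y {\tt a}$ with $x, y \in \Sigma$, and such a word is a
palindrome if and only if $x = y$.  Thus ${\tt aaaba}$ is a shortest
non-palindrome, of length $5 = 3n - 1$.  In general the word in the
proposition has length
$$ 3(n-1) + 2 = 3n - 1 < 3n . $$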

583:

584: For DFAs, however, the bound of $3n$ can be improved to

585: $3n-3$. To show this, we first prove the following lemma. A

586: language $L$ is called {\it slender} if there is a constant

587: $C$ such that, for all $n \geq 0$, the number of words of

588: length $n$ in $L$ is less than $C$. The following

589: characterization of slender regular languages has been

590: independently rediscovered several times

591: \cite{Kunze&Shyr&Thierrin:1981,Shallit:1994,Paun&Salomaa:1995}.

592:

593: \begin{theorem}

594: \label{slender}

595: Let $L \subseteq \Sigma^*$ be a regular language.  Then $L$ is slender

596: if and only if it can be written

597: as a finite union of languages of the form $u v^* w$, where

598: $u,v,w \in \Sigma^*$.

599: \end{theorem}

600:

601: Next we prove the following useful lemma concerning DFAs accepting

602: slender languages.

603:

604: \begin{lemma}

605: Let $L$ be a slender language accepted by a DFA $M$ with

606: $n$ states, over an alphabet of two or more symbols.  Then

607: $M$ must have a dead state. \label{dead-lemma}

608: \end{lemma}

609:

610: \begin{proof}

611: Without loss of generality, assume that every state of $M =

612: (Q, \Sigma, \delta, q_0, F)$ is reachable from $q_0$, and

613: that $\Sigma$ contains the symbols $a$ and $b$. We distinguish two

614: cases:

615: \begin{enumerate}

616: \item $M$ accepts a finite language. Consider the states reached

617: from $q_0$ on $a$, $a^2$, $a^3, \ldots$.  Eventually some

618: state $q$ must be repeated.  This state $q$ must be a dead

619: state, for if not, $M$ would accept an infinite language.

620:

621: \item $M$ accepts an infinite language. Then $M$ has at

622: least one {\em fruitful} cycle, that is, a cycle that

623: produces infinitely many words in $L(M)$ as labels of

624: paths starting at $q_0$, entering the cycle, going around

625: the cycle some number of times,  then exiting and

626: eventually reaching a final state. Let $C_1$ be one

627: fruitful cycle, and consider the following successful path

628: involving $C_1$: $q_0 {\buildrel\alpha\over\longrightarrow}

629: q {\buildrel u \over\longrightarrow} q {\buildrel \beta

630: \over\longrightarrow} f$, where $f\in F$ and the repetition

631: of $q$ denotes the cycle $C_1$, labeled with $u$. Without

632: loss of generality assume the first letter of $u$ is $a$.

633: Since $M$ is complete, denote $p=\delta(q, b)$.

634:

635: We claim that from $p$ one cannot reach a fruitful cycle $C_2$.

636: Indeed, let's assume the contrary; this means that there exists

637: a successful path $q_0 {\buildrel\alpha\over\longrightarrow}

638: q {\buildrel u \over\longrightarrow} q {\buildrel \gamma

639: \over\longrightarrow} r {\buildrel v \over\longrightarrow}

640: r {\buildrel \mu \over\longrightarrow f'}$, with $f'\in F$

641: and the repetition of $r$ denotes the cycle $C_2$ labeled

642: with $v$. Let $n$ be an arbitrary integer, and $0 \leq i \leq n$.

643: There exist two integers $k, l$ such that

644: $k|u|=l|v|=m$. With this notation, observe that the

645: words $\alpha u^{k(n-i)}\gamma v^{l(n+i)}\mu$ are all

646: accepted by $M$ and have the same length $2mn + |\alpha\gamma\mu|$.

647: Since there are $n+1$

648: such words and $n$ is arbitrary, the number of words of a given length in $L(M)$ is unbounded,

649: contradicting the slenderness of $L(M)$.

650:

651: Thus, there exist only finitely many successful paths

652: starting from $p$. However, considering the states reached

653: from $p$ by the words $a$, $a^2$, $a^3, \ldots$, one such

654: state must repeat. This state is dead, for the alternative would

655: contradict the finiteness of successful paths from $p$.

656: \end{enumerate}

657: \end{proof}
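
The length computation in the second case of the proof can be checked on
sample values (chosen here only for illustration).  Suppose $|u| = 2$ and
$|v| = 3$.  Then we may take $k = 3$ and $l = 2$, so that
$m = k|u| = l|v| = 6$.  For $0 \leq i \leq n$, the word
$\alpha u^{3(n-i)} \gamma v^{2(n+i)} \mu$ has length
$$ 6(n-i) + 6(n+i) + |\alpha\gamma\mu| = 12n + |\alpha\gamma\mu|
= 2mn + |\alpha\gamma\mu|, $$
which does not depend on $i$.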

658:

659: \begin{corollary}

660:     If $M$ is a DFA over an alphabet of at least two letters

661: and $L(M)$ is palindromic, then $M$ has a dead state.

662: \label{dead-cor}

663: \end{corollary}

664:

665: \begin{proof}

666:     If $L(M)$ is palindromic, then by

667: \cite[Theorem 8]{Horvath&Karhumaki&Kleijn:1987}

668: it can be written as a finite union of languages of the form

669: $u v (tv)^* u^R$, where $u, v, t \in \Sigma^*$ and $v, t$ are

670: palindromes.  By Theorem~\ref{slender}, this means

671: $L(M)$ is slender.  By Lemma~\ref{dead-lemma}, $M$ has a dead state.

672: \end{proof}

673:

674:    We are now ready to prove the improved bound of $3n-3$ for DFAs.

675:

676: \begin{theorem}

677: Let $M$ be a DFA with $n$ states.  Then $L(M)$ is palindromic if and

678: only if $\lbrace x \in L(M) \ : \ |x| < 3n-3 \rbrace$ is palindromic.

679: \end{theorem}

680:

681: \begin{proof}

682:       One direction is clear.

683:

684:      If $M = (Q, \Sigma, \delta, q_0, F)$ is over a unary alphabet,

685: then $L(M)$ is always palindromic, so the criterion is trivially true.

686:

687:     Otherwise $M$ is over an alphabet of at least two letters.

688: Assume  $\lbrace x \in L(M) \ : \ |x| < 3n-3 \rbrace$ is palindromic.  From

689: Corollary~\ref{dead-cor}, we see that $M$ must have a dead state.

690: But then we can delete such a dead state and all associated transitions,

691: and all states reachable from the deleted dead state, to get a new NFA $M'$

692: with at most $n-1$ states that accepts the same language.

693: We know from Theorem~\ref{hkk} that the palindromicity of

694: $\lbrace x \in L(M') \ : \ |x| < 3n-3 \rbrace$ implies that

695: $L(M') = L(M)$ is palindromic.

696: \end{proof}

697:

698:    Finally, we observe that $3n-3$ is the best possible bound

699: in the case of DFAs.  To do so, we simply use the language $L_n$

700: from Proposition~\ref{prope} and observe it can be accepted by

701: a DFA with $n+1$ states; yet the shortest non-palindrome is of

702: length $3n-1$.

703:

704: We end this section by noting that the related, but fundamentally

705: different, problem of testing if $L = L^R$ was shown by Hunt

706: \cite{Hunt:1973} to be PSPACE-complete.

707:

708: \section{Testing if an NFA accepts a word matching a pattern}

709: \label{pow_test}

710:

711: In this section we consider the computational complexity of testing

712: if an NFA accepts a word matching a given pattern.

713: Specifically, we consider the following decision problem.

714:

715: \begin{quotation}

716: \noindent{\bf NFA PATTERN ACCEPTANCE}

717:

718: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$ and a

719: pattern $p$ over some alphabet $\Delta$.

720:

721: \noindent QUESTION: Does there exist $x \in \Sigma^+$ such that

722: $x \in L(M)$ and $x$ matches $p$?

723: \end{quotation}

724:

725: Since the pattern $p$ is given as part of the input, this problem

726: is actually somewhat more general than the sort of problem

727: formulated as Question~1 of the introduction, where the language

728: $L$ was fixed.

729:

730: We first consider the following result of Restivo and Salemi

731: \cite{Restivo&Salemi:2001} (a more detailed proof appears in

732: \cite{Castiglione&Restivo&Salemi:2004}).  We give here a boolean matrix

733: based proof (see Zhang \cite{Zhang:1999} for a study of this boolean matrix

734: approach to automata theory) that illustrates our general approach to

735: the other problems treated in this section.

736:

737: \begin{theorem}[Restivo and Salemi]

738: \label{res_sal}

739: Let $L$ be a regular language and let $\Delta$ be an alphabet.

740: The set $P_\Delta$ of all non-empty patterns $p \in \Delta^*$

741: such that $p$ matches a word in $L$ is effectively regular.

742: \end{theorem}

743:

744: \begin{proof}

745: Let $M = (Q,\Sigma,\delta,q_0,F)$ be an NFA such that $L(M) = L$.

746: Suppose that $Q = \{0,1,\ldots,n-1\}$.

747: For $a \in \Sigma$, let $B_a$ be the $n \times n$ boolean matrix whose

748: $(i,j)$ entry is $1$ if $j \in \delta(i,a)$ and $0$ otherwise.

749: Let $\mathcal{B}$ denote the semigroup generated by the $B_a$'s

750: along with the identity matrix.

751: For $w = w_0 w_1 \cdots w_s$, where $w_i \in \Sigma$ for $i = 0,\ldots,s$,

752: we write $B_w$ to denote the matrix product $B_{w_0} B_{w_1} \cdots B_{w_s}$.

753:

754: Without loss of generality, let $\Delta = \{1,2,\ldots,k\}$.

755: Observe that there exists a non-empty pattern

756: $p = p_0 p_1 \cdots p_r$, where $p_i \in \Delta$ for $i = 0,\ldots,r$,

757: and a non-erasing morphism $h : \Delta^* \to \Sigma^*$ such that $h(p) \in L$

758: if and only if there exist $k$ boolean matrices

759: $B_1,\ldots,B_k \in \mathcal{B}$ such that $B_i = B_{h(i)}$ for

760: $i \in \Delta$ and $B = B_{p_0} B_{p_1} \cdots B_{p_r}$ describes

761: an accepting computation of $M$.

762:

763: We construct an NFA $M' = (Q',\Delta,\delta',P,F')$ for $P_\Delta$

764: as follows.  For simplicity, we permit $M'$ to have multiple initial states,

765: as specified by the set $P$.  We define $Q' = \mathcal{B}^{k+1}$.

766: The set $P$ of initial states is given by $P = \mathcal{B}^k \times I$,

767: where $I$ denotes the identity matrix.  In other words, the NFA $M'$ uses the

768: first $k$ components of its state to record an initial guess of $k$ boolean

769: matrices $B_1,\ldots,B_k \in \mathcal{B}$.  Let $[B_1,\ldots,B_k,A]$

770: denote some arbitrary state of $M'$.  For $i=1,\ldots,k$, the

771: transition function $\delta'$ maps $[B_1,\ldots,B_k,A]$ to

772: $[B_1,\ldots,B_k,AB_i]$.  In other words, on input

773: $p = p_0 p_1 \cdots p_r\in \Delta^*$, $M'$ uses the last component of

774: its state to compute the product $B = B_{p_0} B_{p_1} \cdots B_{p_r}$.

775: The set $F'$ of final states of $M'$ consists of all states of the form

776: $[B_1,\ldots,B_k,B]$, where the matrix $B$ contains a $1$ in some entry

777: $(0,j)$, where $j \in F$.  In other words, $M'$ accepts if and only if

778: $B$ describes an accepting computation of $M$.

779: \end{proof}
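
As a small illustration of the matrix encoding used in this proof (the
automaton is a toy example of our own, not taken from the cited sources),
let $M$ have states $Q = \{0,1\}$, initial state $0$, final states
$F = \{1\}$, and transitions $\delta(0,a) = \{0,1\}$, $\delta(0,b) = \{0\}$,
$\delta(1,a) = \delta(1,b) = \emptyset$, so that $L(M) = \{a,b\}^* a$.  Then
\[
B_a = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \quad
B_b = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
B_{ba} = B_b B_a = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \quad
B_{ab} = B_a B_b = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.
\]
The $(0,1)$ entry of $B_{ba}$ is $1$, reflecting $ba \in L(M)$, while the
$(0,1)$ entry of $B_{ab}$ is $0$, reflecting $ab \not\in L(M)$.
For the pattern $p = 1\,1$ over $\Delta = \{1\}$ and the morphism
$h(1) = ba$, the guess $B_1 = B_{ba}$ gives $B = B_1 B_1 = B_{ba}$, whose
$(0,1)$ entry is $1$.  Hence $p \in P_\Delta$, as witnessed by
$h(p) = baba \in L(M)$.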

780:

781: By considering unary patterns of the form $a^k$, we obtain the following

782: corollary of Theorem~\ref{res_sal}.

783:

784: \begin{corollary}

785: Let $L \subseteq \Sigma^*$ be a regular language.  The set of exponents $k$

786: such that $L$ contains a $k$-power is the union of a finite set with a finite

787: union of arithmetic progressions.  Further, this set of exponents is

788: effectively computable.

789: \end{corollary}
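
For instance (an example of our own), take $\Sigma = \{a\}$ and
$L = a^2 (a^3)^* = \{ a^{3m+2} : m \geq 0 \}$.  A word $x^k$ with $x = a^j$,
$j \geq 1$, lies in $L$ if and only if $jk \equiv 2 \pmod 3$.  Such a $j$
exists exactly when $k \not\equiv 0 \pmod 3$: take $j = 2$ if
$k \equiv 1 \pmod 3$ and $j = 1$ if $k \equiv 2 \pmod 3$.  The set of
positive exponents $k$ is therefore
\[
\{1, 4, 7, \ldots\} \union \{2, 5, 8, \ldots\},
\]
a union of two arithmetic progressions, with empty finite part.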

790:

791: Observe that Theorem~\ref{res_sal} implies the decidability

792: of the {\bf NFA PATTERN ACCEPTANCE} problem.  We prove the following

793: stronger result.

794:

795: \begin{theorem}

796: \label{pattern}

797: The {\bf NFA PATTERN ACCEPTANCE} problem is PSPACE-complete.

798: \end{theorem}

799:

800: \begin{proof}

801: We first show that the problem is in PSPACE.  By Savitch's theorem

802: \cite{Savitch:1970} it suffices to give an NPSPACE algorithm.

803: Let $M = (Q,\Sigma,\delta,q_0,F)$, where $Q = \{0,1,\ldots,n-1\}$.

804: For $a \in \Sigma$, let $B_a$ be the $n \times n$ boolean matrix whose

805: $(i,j)$ entry is $1$ if $j \in \delta(i,a)$ and $0$ otherwise.

806: Let $\mathcal{B}$ denote the semigroup generated by the $B_a$'s along with

807: the identity matrix.  For $w = w_0 w_1 \cdots w_s \in \Sigma^*$, we write

808: $B_w$ to denote the matrix product $B_{w_0} B_{w_1} \cdots B_{w_s}$.

809:

810: Let $\Delta$ be the set of letters occurring in $p$.  We may suppose that

811: $\Delta = \{1,2,\ldots,k\}$.  First, we non-deterministically guess

812: $k$ boolean matrices $B_1, \ldots, B_k$.  Next, for each $i$, we

813: verify that $B_i$ is in the semigroup $\mathcal{B}$ by

814: non-deterministically guessing a word $w = w_0 w_1 \cdots w_s$

815: such that $B_i = B_w$.  Since there are at most

816: $2^{n^2}$ possible $n \times n$ boolean matrices, we may assume that

817: $s \leq 2^{n^2}$.  We thus guess $w$ symbol-by-symbol and compute a

818: sequence of matrices

819: \[

820: B_{w_0}, B_{w_0w_1}, \ldots, B_{w_0w_1 \cdots w_s},

821: \]

822: reusing space after performing each matrix multiplication.

823: We maintain an $O(n^2)$ bit counter to keep track of the

824: length $s$ of our guessed word $w$.  If $s$ exceeds $2^{n^2}$, we reject

825: on this branch of the non-deterministic computation.

826:

827: Finally, if $p = p_0 p_1 \cdots p_r$, we compute the matrix product

828: $B = B_{p_0} B_{p_1} \cdots B_{p_r}$ and accept if and only if $B$

829: describes an accepting computation of $M$.
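
As a rough accounting of the space used by this procedure (a sketch of our
own, with no attempt to optimize constants): the $k$ guessed matrices occupy
$kn^2$ bits, the running product together with one scratch matrix for each
multiplication occupies $2n^2$ bits, and the counter occupies $n^2 + 1$ bits,
since it must count up to $2^{n^2}$.  The total is
\[
kn^2 + 2n^2 + (n^2 + 1) = (k+3)n^2 + 1
\]
bits, plus a logarithmic number of bits for indices into the input.
This is polynomial in the size of the input, since $k \leq |p|$.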

830:

831: To show hardness we reduce from the following PSPACE-complete problem

832: \cite[Problem AL6]{Garey&Johnson:1979}.

833:

834: \begin{quotation}

835: \noindent{\bf DFA INTERSECTION}

836:

837: \noindent INSTANCE: An integer $k \geq1$ and $k$ DFAs

838: $A_1,A_2,\ldots,A_k$, each over the alphabet $\Sigma$.

839:

840: \noindent QUESTION: Does there exist $x \in \Sigma^*$ such that $x$

841: is accepted by each $A_i$, $1 \leq i \leq k$?

842: \end{quotation}

843:

844: Let $\#$ be a symbol not in $\Sigma$.  We construct, in linear time, a

845: DFA $M$ to accept the language

846: $L(A_1) \,\#\, L(A_2) \,\#\, \cdots L(A_k) \,\#$.

847: Any word in $L(M)$ matching the pattern $a^k$ is of the form $(x\#)^k$.

848: It follows that $M$ accepts a word matching $a^k$ if and only if there

849: exists $x$ such that $x \in L(A_i)$ for $1 \leq i \leq k$.

850: This completes the reduction.

851: \end{proof}
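
To illustrate the reduction on a toy instance (the automata are our own
choice), take $k = 2$, $\Sigma = \{0,1\}$, $L(A_1) = 0\Sigma^*$, and
$L(A_2) = \Sigma^* 1$.  Then
\[
L(M) = 0 \Sigma^* \,\#\, \Sigma^* 1 \,\#.
\]
Every word of $L(M)$ contains exactly two occurrences of $\#$ and ends in
$\#$, so if it equals $y^2$, then $y = x\#$ for some $x \in \Sigma^*$.
The first block forces $x \in L(A_1)$ and the second forces $x \in L(A_2)$.
For example, $x = 01$ gives $(01\#)^2 = 01\#01\# \in L(M)$.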

852:

853: We may define several variations or special cases of the

854: {\bf NFA PATTERN ACCEPTANCE} problem, such as:

855: {\bf NFA ACCEPTS A $k$-POWER},

856: {\bf NFA ACCEPTS A $\geq k$-POWER},

857: {\bf NFA ACCEPTS INFINITELY MANY $k$-POWERS},

858: {\bf NFA ACCEPTS INFINITELY MANY $\geq k$-POWERS}, etc.

859: We define and consider the computational complexity of these variations

860: below.

861:

862: \begin{quotation}

863: \noindent{\bf NFA ACCEPTS A $k$-POWER}.

864:

865: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$ and an

866: integer $k \geq 2$.

867:

868: \noindent QUESTION: Does there exist $x \in \Sigma^+$ such that

869: $M$ accepts $x^k$?

870: \end{quotation}

871:

872: \begin{quotation}

873: \noindent{\bf NFA ACCEPTS A $\geq k$-POWER}.

874:

875: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$.

876:

877: \noindent QUESTION: Do there exist $x \in \Sigma^+$ and an integer

878: $\ell \geq k$ such that $M$ accepts $x^\ell$?

879: \end{quotation}

880:

881:     The {\bf NFA ACCEPTS A $\geq k$-POWER} problem is actually an infinite

882: family of problems, each indexed by an integer $k \geq 2$.

883: If $k$ is fixed, the {\bf NFA ACCEPTS A $k$-POWER} problem can

884: be solved in polynomial time, as we now demonstrate.

885:

886: \begin{proposition}\label{fixed-k}

887: Let $M$ be an NFA with $n$ states and $t$ transitions, and set

888: $N = n+t$, the size of $M$.

889: For any fixed integer $k \geq 2$, there is an algorithm running in

890: $O(n^{2k-1} t^k) = O(N^{3k-1})$ time

891: to determine if $M$ accepts a $k$-power.

892: \end{proposition}

893:

894: \begin{proof}

895: For a language $L \subseteq \Sigma^*$, we define

896: $$L^{1/k} = \{ x \in \Sigma^* : x^k \in L \}.$$

897:

898: Let $M = (Q, \Sigma, \delta, q_0, F)$ be an NFA with $n$ states.

899: We will construct an NFA-$\epsilon$ $M'$ such that $L(M') = L(M)^{1/k}$.

900: To determine whether or not $M$ accepts a $k$-power,

901: it suffices to check whether or not $M'$ accepts a non-empty word.

902:

903: The idea behind the construction of $M'$ is as follows.

904: On input $x$, $M'$ first guesses $k-1$ states

905: $g_1, g_2, \ldots, g_{k-1} \in Q$ and then checks that

906: \begin{itemize}

907: \item

908:   $g_1 \in \delta(q_0,x)$,

909: \item

910:   $g_{i+1} \in \delta(g_i,x)$ for $i = 1,2,\ldots,k-2$, and

911: \item

912:   $\delta(g_{k-1},x) \cap F \neq \emptyset$.

913: \end{itemize}

914: It is clear that such states $g_1, g_2, \ldots, g_{k-1}$ exist

915: if and only if $x^k \in L(M)$.

916:

917: Formally, the construction of $M'$ is as follows.

918: We define the NFA $M' = (Q', \Sigma, \delta', q_0', F')$ such that:

919:

920: \begin{itemize}

921: \item $Q' = \{ q_0' \} \cup Q^{2k-1}$.  That is, except for $q_0'$,

922: each state of $M'$ is a $(2k-1)$-tuple of the form

923: $[g_1, g_2, \ldots, g_{k-1}, p_0, p_1, \ldots, p_{k-1}]$.

924: The state $g_i$ represents the $i$-th state guessed from $M$.

925: The NFA $M'$ will simulate in parallel the computations of $M$ on input

926: $x$ starting from states $q_0, g_1, g_2, \ldots, g_{k-1}$ respectively.

927: The state $p_0$ represents the current state of the simulation beginning

928: from state $q_0$, and the states $p_1, p_2, \ldots, p_{k-1}$ represent the

929: current states of the simulations beginning from states

930: $g_1, g_2, \ldots, g_{k-1}$, respectively.

931:

932: \item $q_0'$ is an additional state not in $Q^{2k-1}$.

933: This state will have outgoing $\epsilon$-transitions for each

934: different combination of guesses $g_i$.

935: The transition function on the start state is defined as

936: $$\delta'( q_0', \epsilon) = \{ [g_1, g_2, \ldots, g_{k-1},

937: q_0, g_1, g_2, \ldots, g_{k-1}] :

938: \forall i \in \{1,2,\ldots,k-1\}, g_i \in Q \}.$$

939:

940: \item We define the transition function $\delta'$ on all other states

941: as:

942: \begin{eqnarray*}

943: \lefteqn{\delta'([g_1, g_2, \ldots, g_{k-1}, p_0, p_1, \ldots, p_{k-1}], a)=}\\

944: & & \{[g_1, g_2, \ldots, g_{k-1}, p_0', p_1', \ldots, p_{k-1}'] :

945: \forall i \in \{0,1,\ldots,k-1\}, p_i' \in \delta (p_i, a)\}

946: \end{eqnarray*}

947: for all $a \in \Sigma$.

948:

949: \item $F' = \{ [g_1, g_2, \ldots, g_{k-1}, g_1, g_2, \ldots, g_{k-1}, t]: t \in F \}$.

950: That is, we reach a state in $F'$ on input $x$ exactly when the guessed

951: states $g_i$ verify the conditions described above.

952: \end{itemize}

953:

954: It should be clear from the construction that $M'$ accepts $L(M)^{1/k}$.

955: The number of states in $M'$ is $n^{2k-1} + 1$, as, except for $q_0'$,

956: each state is a $(2k-1)$-tuple in which each coordinate can take on

957: $|Q|=n$ possible values.  For each state there are at most $t^k$ distinct

958: transitions.  Testing whether or not $M'$ accepts a non-empty

959: word can be done in linear time (since the only $\epsilon$-transitions are

960: transitions outgoing from $q_0'$), so the running time of our algorithm is

961: $O(n^{2k-1} t^k)$.

962: \end{proof}
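
To illustrate the construction in the smallest case $k = 2$ (with a toy
automaton of our own choosing), let $M$ be the DFA with $Q = \{0,1\}$,
$q_0 = 0$, $F = \{0\}$, $\Sigma = \{a\}$, $\delta(0,a) = 1$, and
$\delta(1,a) = 0$, so that $L(M) = (aa)^*$.  The states of $M'$ other than
$q_0'$ are triples $[g_1, p_0, p_1]$, giving $2^3 + 1 = 9$ states in all.
On input $x = a$, $M'$ may take the $\epsilon$-transition from $q_0'$ to
$[1, 0, 1]$ (the guess $g_1 = 1$), and reading $a$ then leads to
\[
[1, \delta(0,a), \delta(1,a)] = [1, 1, 0].
\]
This state has the form $[g_1, g_1, t]$ with $t = 0 \in F$, so $M'$
accepts $a$, in agreement with $a^2 \in L(M)$.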

963:

964:     As before, we can use the same automaton to test if infinitely many

965: $k$-powers are accepted.

966:

967: \begin{corollary}

968:      We can decide if an NFA $M$ with $n$ states and $t$ transitions

969: accepts infinitely many $k$-powers in $O(n^{2k-1} t^k)$ time.

970: \end{corollary}

971:

972: If $k$ is not fixed, we have the following result, which is an immediate

973: consequence of Theorem~\ref{pattern} if $k$ is given in unary.  However,

974: the problem remains in PSPACE even if $k$ is given in binary, as we now

975: demonstrate.

976:

977: \begin{theorem}

978: \label{kpow_alg}

979: The problem {\bf NFA ACCEPTS A $k$-POWER} is PSPACE-complete.

980: \end{theorem}

981:

982: \begin{proof}

983: We first show that the problem is in PSPACE.  By Savitch's theorem

984: \cite{Savitch:1970} it suffices to give an NPSPACE algorithm.

985: Let $M = (Q,\Sigma,\delta,q_0,F)$, where $Q = \{0,1,\ldots,n-1\}$.

986: For $a \in \Sigma$, let $B_a$ be the $n \times n$ boolean matrix whose

987: $(i,j)$ entry is $1$ if $j \in \delta(i,a)$ and $0$ otherwise.

988: Let $\mathcal{B}$ denote the semigroup generated by the $B_a$'s.

989:

990: We non-deterministically guess a boolean matrix $B$ and

991: verify that $B \in \mathcal{B}$ (i.e., $B = B_x$ for some $x \in \Sigma^+$),

992: as illustrated in the proof of Theorem~\ref{pattern}.

993: Finally, we compute $B_x^k$ efficiently by repeated squaring

994: and verify that $B_x^k$ contains a $1$ in  position $(q_0,f)$ for some

995: $f \in F$.

996:

997: The proof for PSPACE-hardness is precisely that given in the proof

998: of Theorem~\ref{pattern}.

999: \end{proof}
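
As an illustration of the repeated squaring step (the value of $k$ is
chosen by us for illustration), let $k = 13 = 8 + 4 + 1$.  Three squarings
produce $B_x^2$, $B_x^4$, and $B_x^8$, and two further multiplications give
\[
B_x^{13} = B_x^{8} \, B_x^{4} \, B_x .
\]
In general at most $2 \lfloor \log_2 k \rfloor$ boolean matrix
multiplications suffice, and only a constant number of $n \times n$ matrices
need to be stored at any time, so the computation is polynomial in $n$ and
in the length of the binary representation of $k$.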

1000:

1001: \begin{theorem}

1002: \label{pow_alg}

1003: For each integer $k \geq 2$, the problem {\bf NFA ACCEPTS A $\geq k$-POWER}

1004: is PSPACE-complete.

1005: \end{theorem}

1006:

1007: \begin{proof}

1008: To show that the problem is in PSPACE, we use the same algorithm

1009: as in the proof of Theorem~\ref{kpow_alg}, with the following modification.

1010: In order to verify that $M$ accepts an $\ell$-power for some $\ell \geq k$,

1011: we first observe that by the same argument as in the proof of

1012: Proposition~\ref{exponent_bd} below, if $M$ accepts such an $\ell$-power,

1013: then $M$ accepts an $\ell$-power for some $\ell$ with $k \leq \ell < k+n$.

1014: Thus, after non-deterministically computing $B_x$, we must compute

1015: $B_x^\ell$ for all $k \leq \ell < k+n$, and verify that at least

1016: one $B_x^\ell$ contains a $1$ in position $(q_0,f)$ for some $f \in F$.

1017:

1018: To show PSPACE-hardness, we again reduce from the {\bf DFA INTERSECTION}

1019: problem.  Suppose that we are given $r$ DFAs $A_1,A_2,\ldots,A_r$ and we wish

1020: to determine if the $A_i$'s accept a common word $x$.  We may suppose

1021: that $r \geq k$, since for any fixed $k$ such a restriction does not affect

1022: the PSPACE-completeness of the {\bf DFA INTERSECTION} problem.

1023: Let $j$ be the smallest non-negative integer such that $r+j$ is prime.

1024: By Bertrand's Postulate \cite[Theorem~418]{Hardy&Wright:1979},

1025: we may take $j \leq r$.  We now construct, in linear time,

1026: a DFA $M$ to accept the language

1027: $L(A_1) \,\#\, L(A_2) \,\#\, \cdots L(A_r) \,\# (\Sigma^* \,\#)^j$.

1028: The DFA $M$ accepts a $\geq k$-power if and only if it accepts an

1029: $(r+j)$-power.  Moreover, $M$ accepts an $(r+j)$-power if and only if

1030: there exists $x$ such that $x \in L(A_i)$ for $1 \leq i \leq r$.

1031: This completes the reduction.

1032: \end{proof}
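
For a concrete instance of the padding step (the value of $r$ is chosen by
us for illustration), suppose $r = 8$.  Since $8$, $9$, and $10$ are not
prime and $11$ is, we get $j = 3$, and $M$ accepts
\[
L(A_1) \,\#\, L(A_2) \,\#\, \cdots L(A_8) \,\# (\Sigma^* \,\#)^3 ,
\]
every word of which contains exactly $11$ occurrences of $\#$ and ends in
$\#$.  If such a word equals $y^\ell$ with $\ell \geq k \geq 2$, then $\ell$
times the number of occurrences of $\#$ in $y$ equals $11$, so $\ell = 11$
and $y = x\#$ for some $x \in \Sigma^*$.  The first eight blocks force
$x \in L(A_i)$ for $1 \leq i \leq 8$, while the last three blocks impose no
condition on $x$.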

1033:

1034: In a similar fashion, we now show that the following decision problems

1035: are PSPACE-complete:

1036:

1037: \begin{quotation}

1038: \noindent{\bf NFA ACCEPTS INFINITELY MANY $k$-POWERS}.

1039:

1040: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$ and an

1041: integer $k \geq 2$.

1042:

1043: \noindent QUESTION: Does $M$ accept $x^k$ for infinitely many words $x$?

1044: \end{quotation}

1045:

1046: \begin{quotation}

1047: \noindent{\bf NFA ACCEPTS INFINITELY MANY $\geq k$-POWERS}.

1048:

1049: \noindent INSTANCE: An NFA $M$ over the alphabet $\Sigma$.

1050:

1051: \noindent QUESTION: Are there infinitely many pairs $(x,i)$ such that

1052: $i \geq k$ and $M$ accepts $x^i$?

1053: \end{quotation}

1054:

1055:       Again, the {\bf NFA ACCEPTS INFINITELY MANY $\geq k$-POWERS}

1056: problem is actually an infinite family of problems, each indexed by

1057: an integer $k \geq 2$.

1058: We will prove that these decision problems are PSPACE-complete

1059: by reducing from the following problem.

1060:

1061: \begin{quotation}

1062: \noindent{\bf INFINITE CARDINALITY DFA INTERSECTION}.

1063:

1064: \noindent INSTANCE: An integer $k \geq 1$ and  $k$ DFAs

1065: $A_1,A_2,\ldots,A_k$, each over the alphabet $\Sigma$.

1066:

1067: \noindent QUESTION: Do there exist infinitely many

1068: $x \in \Sigma^*$ such that $x$

1069: is accepted by each $A_i$, $1 \leq i \leq k$?

1070: \end{quotation}

1071:

1072: \begin{lemma}

1073:      The decision problem {\bf INFINITE CARDINALITY DFA INTERSECTION}

1074: is PSPACE-complete.

1075: \end{lemma}

1076:

1077: \begin{proof}

1078:      First, let's see that the problem is in PSPACE.  If the largest

1079: DFA has $n$ states, then there is a DFA with at most $n^k$ states

1080: that accepts $\bigcap_{1 \leq i \leq k}  L(A_i)$.  Now from

1081: Theorem~\ref{hopcroft} (b), we know that there exist infinitely many

1082: $x$ accepted by each $A_i$ if and only if there is a word $x$

1083: of length $\ell$, $n^k \leq \ell < 2n^k$, accepted by all the $A_i$.

1084: We can simply guess the symbols of $x$, ensuring with a counter that

1085: $n^k \leq |x| < 2n^k$, and checking by simulation that $x$ is accepted

1086: by all the $A_i$.  The counter uses at most $k \log n + \log 2$ bits,

1087: which is polynomial in the size of the input.  This shows the problem

1088: is in nondeterministic polynomial space, and hence, by Savitch's theorem

1089: \cite{Savitch:1970}, in PSPACE.

1090:

1091:      Now, to see that {\bf INFINITE CARDINALITY DFA INTERSECTION}

1092: is PSPACE-hard, we reduce from {\bf DFA INTERSECTION}.  For each

1093: DFA $A_i = (Q_i, \Sigma, \delta_i, q_{0,i}, F_i)$,

1094: we modify it to $B_i$ as follows:  we add a new initial

1095: state $q'_{0,i}$, and add the same transitions from it as from $q_{0,i}$.

1096: We then change all final states to non-final, and we make $q'_{0,i}$

1097: final.

1098: We add a transition from all states that were

1099: previously final to $q'_{0,i}$ on a new letter $\cent$ (the same letter is used for

1100: each $A_i$), and a transition from all other states on $\cent$ to a new

1101: dead state $d$.  Finally, we add transitions on all letters from $d$

1102: to itself.   We claim $B_i$ is a DFA and

1103: $L(B_i) = (L(A_i)\cent)^*$.   Furthermore,

1104: $\bigcap_{1 \leq i \leq k}  L(A_i) \not= \emptyset$ if and only if

1105: $\bigcap_{1 \leq i \leq k} L(B_i)$ is infinite.

1106:

1107: Suppose $\bigcap_{1 \leq i \leq k}  L(A_i) \not= \emptyset$.  Then

1108: there exists $x$ accepted by each of the $A_i$.  Then $(x\cent)^*$

1109: is accepted by each of the $B_i$, so

1110: $\bigcap_{1 \leq i \leq k}  L(B_i) $ is infinite.

1111:

1112: Now suppose $\bigcap_{1 \leq i \leq k}  L(B_i) $ is infinite.  Choose

1113: any nonempty

1114: $x \in \bigcap_{1 \leq i \leq k}  L(B_i)  = \bigcap_{1 \leq i \leq k}

1115: (L(A_i)\cent)^*$.  Thus $x$ must be of the form $y_1 \cent y_2 \cent

1116: \cdots y_j \cent$

1117: for some $j \geq 1$, where each $y_i$ is accepted by all the $A_i$.

1118: Hence, in particular, $y_1$ is accepted by all the $A_i$, and so

1119: $\bigcap_{1 \leq i \leq k}  L(A_i) \not= \emptyset$.

1120: \end{proof}
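
To make the space bound in the first part of this proof concrete (the
numbers are our own, chosen for illustration), suppose we are given $k = 3$
DFAs, each with at most $n = 8$ states.  The product automaton has at most
$8^3 = 512$ states, so we search for a common word of length $\ell$ with
$512 \leq \ell < 1024$.  A counter of
\[
3 \log_2 8 + 1 = 10
\]
bits suffices, and the product automaton is never written down: only the
current triple of states is stored, using $3 \cdot 3 = 9$ bits.
For the hardness reduction, if $L(A_i) = \{ab\}$, then
$L(B_i) = (ab\cent)^*$.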

1121:

1122: We are now ready to prove the following theorem.

1123:

1124: \begin{theorem}

1125: The decision problem {\bf NFA ACCEPTS INFINITELY MANY $k$-POWERS} is

1126: PSPACE-complete.

1127: \end{theorem}

1128:

1129: \begin{proof}

1130:     First, let's see that the problem is in PSPACE.  We claim that

1131: an NFA $M$ with $n$ states accepts infinitely many $k$-powers if and only

1132: if it accepts a $k$-power $x^k$ with $2^{n^2} \leq |x| < 2^{n^2+1}$.

1133:

1134: One direction is clear.  For the other direction,

1135: we use boolean matrices, as in the proof of Theorem~\ref{kpow_alg}.

1136: We can construct a DFA $M' = (Q', \Sigma, \delta', q'_0, F')$

1137: of $2^{n^2}$ states that accepts $L(M)^{1/k} =

1138: \lbrace x \in \Sigma^* \ : \ x^k \in L(M) \rbrace$, as follows:  the states are

1139: $n \times n$ boolean matrices.  The initial state $q'_0$ is the

1140: identity matrix. If $B_a$ is the boolean matrix with

1141: a $1$ in entry $(i,j)$ if $j \in \delta(q_i,a)$ and $0$ otherwise,

1142: then $\delta'(B, a) = B B_a$.  The set of final states is

1143: $F' = \lbrace B \ : \ \text{the $(0,j)$ entry of $B^k$ is $1$ for some }

1144: q_j \in F \rbrace.$

1145:

1146: The idea of this construction is that  if $x = a_1 a_2 \cdots a_i$, then

1147: $\delta'(q'_0, x) = B_{a_1} \cdots B_{a_i}$.    Now we use

1148: Theorem~\ref{hopcroft} (b) to conclude that $M'$ accepts infinitely

1149: many words if and only if it accepts a word $x$ with

1150: $2^{n^2} \leq |x| < 2^{n^2+1}$.  But $L(M') = L(M)^{1/k}$.

1151:

1152: Thus, to check if $M$ accepts infinitely many $k$-powers, we simply

1153: guess the symbols of $x$, stopping when $2^{n^2} \leq |x| < 2^{n^2+1}$,

1154: and verify that $M$ accepts $x^k$.  We can do this by accumulating

1155: $B_{a_1} \cdots B_{a_i}$ and raising the result to the $k$-th power, as before.

1156: We need $n^2+1$ bits to keep track of the counter, so the result is in

1157: NPSPACE, and hence in PSPACE.

1158:

1159: Now we argue that {\bf NFA ACCEPTS INFINITELY MANY $k$-POWERS} is

1160: PSPACE-hard.  To do so, we reduce from

1161: {\bf INFINITE CARDINALITY DFA INTERSECTION}.  Given DFAs

1162: $A_1, A_2, \ldots, A_k$, we can easily construct a DFA $A$ to accept

1163: $L(A_1) \# \cdots L(A_k) \#$.  Clearly $A$ accepts infinitely many

1164: $k$-powers if and only if $\bigcap_{1 \leq i \leq k} L(A_i)$ is infinite.

1165: \end{proof}

1166:

1167: \begin{theorem}

1168: For each integer $k \geq 2$, the problem

1169: {\bf NFA ACCEPTS INFINITELY MANY $\geq k$-POWERS} is PSPACE-complete.

1170: \end{theorem}

1171:

1172: \begin{proof}

1173: Left to the reader.

1174: \end{proof}

1175:

1176:

1177: \section{Testing if an NFA accepts a non-$k$-power}

1178: \label{kp}

1179:

1180: In the previous section we showed that it is computationally hard

1181: to test if an NFA accepts a $k$-power (when $k$ is not fixed).

1182: In this section we show how to test if an NFA accepts a

1183: non-$k$-power.  Again, we find it more congenial to discuss the

1184: opposite problem, which is whether an NFA accepts nothing but

1185: $k$-powers.

1186:

1187: First, we need several classical

1188: results from the theory of combinatorics on words.

1189: The following theorem is due to Lyndon and Sch\"utzenberger

1190: \cite{Lyndon&Schutzenberger:1962}.

1191:

1192: \begin{theorem}

1193: \label{ls_eqn}

1194: If $x$, $y$, and $z$ are words satisfying an equation $x^i y^j = z^k$,

1195: where $i,j,k \geq 2$, then they are all powers of a common word.

1196: \end{theorem}

1197:

1198: The next result is also due to Lyndon and Sch\"utzenberger.

1199:

1200: \begin{theorem}

1201: \label{lyn_schu}

1202: Let $u$ and $v$ be non-empty words.  If $uv = vu$, then there exists

1203: a word $x$ and integers $i,j \geq 1$, such that $u = x^i$ and $v = x^j$.

1204: In other words, $u$ and $v$ are powers of a common word.

1205: \end{theorem}

1206:

1207: The following result can be derived from Theorem~\ref{lyn_schu}.

1208:

1209: \begin{corollary}

1210: \label{ls_cor}

1211: Let $u$ and $v$ be non-empty words.  If $u^r = v^s$ for some $r,s \geq 1$,

1212: then $u$ and $v$ are powers of a common word.

1213: \end{corollary}
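
For example (our own illustration), the words $u = abab$ and $v = ab$
satisfy $uv = ababab = vu$, and indeed $u = (ab)^2$ and $v = (ab)^1$.
Similarly, $u = abab$ and $v = ababab$ satisfy
\[
u^3 = v^2 = (ab)^6,
\]
with $u = (ab)^2$ and $v = (ab)^3$.  By contrast, $ab \cdot ba = abba$
differs from $ba \cdot ab = baab$, and $ab$ and $ba$ are not powers of a
common word.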

1214:

1215: Ito, Katsura, Shyr, and Yu \cite{Ito&Katsura&Shyr&Yu:1988}

1216: gave a proof of the next proposition.

1217:

1218: \begin{proposition}

1219: \label{shyr}

1220: Let $u$ and $v$ be non-empty words.  If $u$ and $v$ are not powers of

1221: a common word, then for any integers $r,s \geq 1$, $r \neq s$,

1222: at least one of $u^rv$ or $u^sv$ is primitive.

1223: \end{proposition}

1224:

1225: The next result is due to Shyr and Yu \cite{Shyr&Yu:1994}.

1226:

1227: \begin{theorem}

1228: \label{p+q+}

1229: Let $p$ and $q$ be primitive words, $p \neq q$.  The set $p^+q^+$

1230: contains at most one non-primitive word.

1231: \end{theorem}
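
The following small example (our own) illustrates both
Proposition~\ref{shyr} and Theorem~\ref{p+q+}.  Let $u = ab$ and
$v = aaba$.  Both words are primitive and they are not powers of a common
word.  We have
\[
uv = abaaba = (aba)^2,
\]
which is non-primitive, so Proposition~\ref{shyr} implies that $u^s v$ is
primitive for every $s \geq 2$.  For instance, $u^2 v = ababaaba$ has
length $8$ and one checks directly that it is not a square, a fourth power,
or an eighth power.  Likewise, taking $p = ab$ and $q = aaba$,
Theorem~\ref{p+q+} shows that $pq = abaaba$ is the only non-primitive word
in $p^+q^+$.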

1232:

1233:

1234: Next we prove the following analogue

1235: of Theorem~\ref{hkk}, from which we will derive an

1236: efficient algorithm for testing if a finite automaton

1237: accepts only $k$-powers.

1238:

1239: \begin{theorem}

1240: \label{k-pow}

1241: Let $L$ be accepted by an $n$-state NFA $M$ and let $k \geq 2$ be an integer.

1242: \begin{enumerate}

1243: \item Every word in $L$ is a $k$-power if and only if every word in the set

1244: $\lbrace x \in L : |x| \leq 3n \rbrace$ is a $k$-power.

1245: \item All but finitely many words in $L$ are $k$-powers if and only if

1246: every word in the set $\lbrace x \in L : n \leq |x| \leq 3n \rbrace$

1247: is a $k$-power.

1248: \end{enumerate}

1249: Further, if $M$ is a DFA over an alphabet of size $\geq 2$, then the bound $3n$

1250: may be replaced by $3n-3$.

1251: \end{theorem}

1252:

1253: Ito, Katsura, Shyr, and Yu \cite{Ito&Katsura&Shyr&Yu:1988}

1254: proved a similar result for primitive words: namely, that

1255: if $L$ is accepted by an $n$-state DFA over an alphabet of

1256: two or more letters and contains a

1257: primitive word, then it contains a primitive word of length

1258: $\leq 3n-3$. In other words, every word in $L$ is a power if and only if

1259: every word in the set $\lbrace x \in L : |x| \leq 3n-3 \rbrace$ is a power.

1260: However, this result does not imply

1261: Theorem~\ref{k-pow}, as one can easily construct a regular

1262: language $L$ where every word in $L$ that is not a

1263: $k$-power is nevertheless non-primitive:  for example, $L = \lbrace a^{k+1}

1264: \rbrace$.

1265:

1266: We shall use the next result to characterize those regular languages

1267: consisting only of $k$-powers.

1268:

1269: \begin{proposition}

1270: \label{prim} Let $u$, $v$, and $w$ be words, $v \neq

1271: \epsilon$, $uw \neq \epsilon$, and let $f,g \geq 1$ be integers, $f \neq g$.

1272: If $uv^fw$ and $uv^gw$ are non-primitive, then $uv^nw$ is

1273: non-primitive for all integers $n \geq 1$. Further, if

1274: $uvw$ and $uv^2w$ are $k$-powers for some integer $k \geq

1275: 2$, then $v$ and $uv^nw$ are $k$-powers for all integers $n

1276: \geq 1$.

1277: \end{proposition}

1278:

1279: \begin{proof}

1280: Suppose $uv^fw$ and $uv^gw$ are non-primitive.  Then $v^fwu$ and

1281: $v^gwu$ are non-primitive. Let $x$ and $y$ be the primitive roots of $v$ and

1282: $wu$, respectively, so that $v = x^i$ and $wu = y^j$ for some integers

1283: $i,j \geq 1$.  If $x \neq y$, then by Proposition~\ref{shyr}, one concludes

1284: that at least one of $v^fwu$ or $v^gwu$ is primitive, a contradiction.

1285:

1286: If $x = y$, then for all integers $n \geq 1$, $v^nwu = x^{ni+j}$ is clearly

1287: non-primitive, and consequently, $uv^nw$ is non-primitive, as required.

1288: Let us now suppose that $uvw$ and $uv^2w$ are $k$-powers for some $k \geq 2$.

1289: Then $vwu = x^{i+j}$ and $v^2wu = x^{2i+j}$ are both $k$-powers as well.

1290: We claim that the following must hold:

1291: \begin{eqnarray*}

1292: i + j & \equiv & 0 \pmod k \\

1293: 2i + j & \equiv & 0 \pmod k.

1294: \end{eqnarray*}

1295: To see this, write $vwu = z^k$ for some word $z$.  Then $z^k = x^{i+j}$,

1296: so by Corollary~\ref{ls_cor} $z$ and $x$ are powers of a common word.

1297: Since $x$ is primitive it follows that $z$ is a power of $x$.

1298: In particular, $|x|$ divides $|z|$ and $i + j$ is a multiple of $k$,

1299: as claimed.  A similar argument applies to $v^2wu$.

1300:

1301: We conclude that $i \equiv j \equiv 0 \pmod k$,

1302: and hence, $v = x^i$ is a $k$-power.  Moreover, $v^nwu = x^{ni+j}$ is also a

1303: $k$-power for all integers $n \geq 1$, and consequently, $uv^nw$ is a

1304: $k$-power, as required.

1305: \end{proof}
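
As an illustration of the second part of this proof (the words are chosen
by us), let $k = 2$, $u = w = ab$, and $v = abab$.  Then $x = ab$,
$v = x^2$, and $wu = x^2$, so $i = j = 2$.  We have $uvw = (ab)^4$ and
$uv^2w = (ab)^6$, both squares, and correspondingly $i + j = 4$ and
$2i + j = 6$ are both $\equiv 0 \pmod 2$.  In general, subtracting the first
congruence from the second gives $i \equiv 0 \pmod k$, and substituting back
into the first gives $j \equiv 0 \pmod k$, whence
$ni + j \equiv 0 \pmod k$ for all $n \geq 1$.  In the example,
\[
uv^nw = (ab)^{2n+2} = \left( (ab)^{n+1} \right)^2 .
\]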

1306:

1307: The characterization due to Ito et al.

1308: \cite[Proposition~10]{Ito&Katsura&Shyr&Yu:1988} (see also D\"om\"osi,

1309: Horv\'ath, and Ito \cite[Theorem~3]{Domosi&Horvath&Ito:2004})

1310: of the regular languages consisting only of powers,

1311: along with Theorem~\ref{slender}, implies

1312: that any such language is slender.  A simple application of the

1313: Myhill--Nerode Theorem gives the following weaker result.

1314:

1315: \begin{proposition}

1316: \label{my-ner}

1317: Let $L$ be a regular language and let $k \geq 2$ be an integer.  If

1318: all but finitely many words of $L$ are $k$-powers, then $L$ is slender.

1319: In particular, if $L$ is accepted by an $n$-state DFA and all words in $L$ of

1320: length $\geq \ell$ are $k$-powers, then for all $r \geq \ell$,

1321: the number of words in $L$ of length $r$ is at most $n$.

1322: \end{proposition}

1323:

1324: \begin{proof}

1325: Let $x^k$ and $y^k$ be distinct words in $L$ of length $r \geq \ell$.

1326: Then $x$ and $y$ are inequivalent with respect to the Myhill--Nerode

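1327: equivalence relation, since $y^k \in L$ but $xy^{k-1} \not\in L$ (the word $xy^{k-1}$ has length $r \geq \ell$ and, because $|x| = |y|$ and $x \neq y$, is not a $k$-power).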

1328: The Myhill--Nerode equivalence relation on $L$ thus has index at least

1329: as large as the number of distinct words of length $r$ in $L$.  Since

1330: the index of the Myhill--Nerode relation is at most $n$, it follows that

1331: there are at most $n$ words of length $r$ in $L$, so that $L$

1332: is slender, as required.

1333: \end{proof}
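
For illustration, take $k = 2$ and $L = ({\tt aa})^+ \cup ({\tt bb})^+$, a language chosen here only as an example.  Every word of $L$ is a square, so the hypothesis of Proposition~\ref{my-ner} holds with $\ell = 0$.  For even $r \geq 2$ the words of length $r$ in $L$ are $x^2$ and $y^2$, where $x = {\tt a}^{r/2}$ and $y = {\tt b}^{r/2}$, and
\[
y \cdot y = {\tt b}^{r} \in L, \qquad x \cdot y = {\tt a}^{r/2}{\tt b}^{r/2} \notin L,
\]
so $x$ and $y$ lie in different Myhill--Nerode classes.  The two words of each even length therefore force at least two classes, in agreement with the bound of $n$ words per length for any $n$-state DFA accepting $L$.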

1334:

1335: The following characterization is analogous to the characterization

1336: of palindromic regular languages given in

1337: \cite[Theorem~8]{Horvath&Karhumaki&Kleijn:1987}.

1338:

1339: \begin{theorem}

1340: Let $L \subseteq \Sigma^*$ be a regular language and let $k \geq 2$ be

1341: an integer.  The language $L$ consists only of $k$-powers if and only if

1342: it can be written as a finite union of languages of the form

1343: $uv^*w$, where $u,v,w \in \Sigma^*$ satisfy the following:

1344: there exists a primitive word $x \in \Sigma^*$ and integers $i,j \geq 0$

1345: such that $v = x^{ik}$ and $wu = x^{jk}$.

1346: \end{theorem}

1347:

1348: \begin{proof}

1349: The ``if'' direction is clear; we prove the ``only if'' direction.

1350: Let $L$ consist only of $k$-powers.  Then by Proposition~\ref{my-ner},

1351: $L$ is slender.  By Theorem~\ref{slender}, $L$ can be written

1352: as a finite union of languages of the form $uv^*w$.  By examining the proof

1353: of Proposition~\ref{prim}, one concludes that $u$, $v$, and $w$ have the

1354: desired properties.

1355: \end{proof}
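
As a small worked instance of the ``if'' direction (the particular words are chosen only for illustration), let $k = 2$, $x = {\tt ab}$, and $i = j = 1$, so that $v = x^{2} = {\tt abab}$.  Taking $w = {\tt a}$ and $u = {\tt bab}$ gives $wu = {\tt abab} = x^{2}$, and for every $m \geq 0$,
\[
uv^mw = {\tt bab}\,({\tt abab})^m\,{\tt a} = ({\tt ba})^{2m+2} = \left(({\tt ba})^{m+1}\right)^2 ,
\]
so every word of $uv^*w$ is a square.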

1356:

1357: We shall need the following lemma for the proof of Theorem~\ref{k-pow}.

1358:

1359: \begin{lemma}\label{inf_many}

1360: Let $L$ be a regular language accepted by an $n$-state NFA $M$ and let

1361: $k \geq 2$ be an integer.  If $L$ contains a non-$k$-power of length

1362: $\geq n$, then $L$ contains infinitely many non-$k$-powers.

1363: \end{lemma}

1364:

1365: \begin{proof}

1366: Let $s \in L$ be a non-$k$-power such that $|s| \geq n$.  Consider

1367: an accepting computation of $M$ on $s$.  Such a computation must contain

1368: at least one repeated state.  It follows that there exists a decomposition

1369: $s = uvw$, $v \neq \epsilon$, such that $uv^*w \subseteq L$.

1370: Let $x$ be the primitive root of $v$, so that $v = x^i$ for some positive

1371: integer $i$.

1372:

1373: Suppose that $wu = \epsilon$.  Since $s = v = x^i$ is not a $k$-power,

1374: it follows that $i \not\equiv 0 \pmod k$.  Moreover, there exist

1375: infinitely many positive integers $\ell$ such that

1376: $\ell i \not\equiv 0 \pmod k$, and so by Corollary~\ref{ls_cor}, there

1377: exist infinitely many

1378: words of the form $v^\ell = x^{\ell i}$ that are

1379: non-$k$-powers in $L$, as required.

1380:

1381: Suppose then that $wu \neq \epsilon$.

1382: Let $y$ be the primitive root of $wu$, so that $wu = y^j$ for some positive

1383: integer $j$.  We have two cases.

1384:

1385: Case 1: $x = y$.  Since $uvw$ is not a $k$-power, $vwu$ is also not

1386: a $k$-power, and thus we have $i + j \not\equiv 0 \pmod k$.

1387: Moreover, there are infinitely many

1388: positive integers $\ell$ such that $\ell i + j \not\equiv 0 \pmod k$.

1389: For all such $\ell$, the word $v^\ell wu = x^{\ell i + j}$ is not

1390: a $k$-power, and hence the word $uv^\ell w$ is a non-$k$-power in $L$.

1391: We thus have infinitely many non-$k$-powers in $L$, as required.

1392:

1393: Case 2: $x \neq y$.  By Theorem~\ref{p+q+}, $v^*wu$ contains

1394: infinitely many primitive words.  Thus, $uv^*w$ contains infinitely

1395: many non-$k$-powers, as required.

1396: \end{proof}
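
The existence of infinitely many suitable $\ell$ in the proof above can be made explicit: if $i + j \not\equiv 0 \pmod k$, then every $\ell \equiv 1 \pmod k$ satisfies
\[
\ell i + j \equiv i + j \not\equiv 0 \pmod k ,
\]
and the same choice of $\ell$ handles the case $wu = \epsilon$ by taking $j = 0$.  For example (with values chosen only for illustration), let $k = 3$, $u = w = {\tt a}$, and $v = {\tt aa}$, so that $x = {\tt a}$ and $i = j = 2$.  Then $uv^\ell w = {\tt a}^{2\ell+2}$ is a cube exactly when $\ell \equiv 2 \pmod 3$, and all other values of $\ell$ give non-cubes in $uv^*w$.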

1397:

1398: We are now ready to prove Theorem~\ref{k-pow}.

1399:

1400: \begin{proof}[Proof of Theorem~\ref{k-pow}]

1401: The proof is similar to that of \cite[Proposition~7]{Ito&Katsura&Shyr&Yu:1988}.

1402: It suffices to prove statement (2) of the theorem, since statement (1)

1403: follows immediately from (2) and Lemma~\ref{inf_many}.

1404:

1405: Suppose that $L$ contains infinitely many non-$k$-powers.  Then

1406: $L$ contains a non-$k$-power $s$ with $|s| \geq n$.  Suppose, contrary to

1407: statement (2), that a shortest such $s$ has $|s| > 3n$.

1408: Then any computation of $M$ on

1409: $s$ must repeat some state at least $4$ times.  It follows that

1410: there exists a decomposition $s = u v_1 v_2 v_3 w$,

1411: $v_1,v_2,v_3 \neq \epsilon$, such that $u v_1^* v_2^* v_3^* w \subseteq L$.

1412: We may assume further that $|v_1v_2v_3| \leq 3n$, so that $wu \neq \epsilon$.

1413:

1414: Let $p_1$, $p_2$, $p_3$, and $q$ be the primitive roots of

1415: $v_1$, $v_2$, $v_3$, and $wu$, respectively.

1416: Let $v_1 = p_1^{i_1}$, $v_2 = p_2^{i_2}$, $v_3 = p_3^{i_3}$, and $wu = q^j$,

1417: for some integers $i_1,i_2,i_3,j>0$.  We consider three cases.

1418:

1419: Case~1: $p_1 = p_2 = p_3 = q$.

1420: Without loss of generality, suppose that $|v_1| \leq |v_2| \leq |v_3|$.

1421: Since $|s| > 3n$, we must have $|uv_3w| \geq n$, and thus

1422: $|uv_1v_3w| \geq n$ and $|uv_2v_3w| \geq n$.  By assumption, the words

1423: $v_3wu = q^{i_3+j}$, $v_1v_3wu = q^{i_1+i_3+j}$, and

1424: $v_2v_3wu = q^{i_2+i_3+j}$ are $k$-powers, whereas the word

1425: $v_1v_2v_3wu = q^{i_1+i_2+i_3+j}$ is not.  Applying Corollary~\ref{ls_cor},

1426: we deduce that the following system of equations

1427: \begin{eqnarray*}

1428: i_1 + i_2 + i_3 + j & \not\equiv & 0 \pmod k \\

1429: i_3 + j & \equiv & 0 \pmod k \\

1430: i_1 + i_3 + j & \equiv & 0 \pmod k \\

1431: i_2 + i_3 + j & \equiv & 0 \pmod k

1432: \end{eqnarray*}

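1433: must be satisfied.  However, this is impossible: subtracting the second congruence from the third and from the fourth gives $i_1 \equiv i_2 \equiv 0 \pmod k$, and hence $i_1 + i_2 + i_3 + j \equiv i_3 + j \equiv 0 \pmod k$, contradicting the first.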

1434:

1435: Case~2: $p_1 \neq q$ and $p_2 = p_3 = q$.  If $|v_1wu| \leq n$,

1436: then let $\ell$ be the smallest positive integer

1437: such that $n \leq |v_1^\ell wu| < |v_1^{\ell+1} wu| \leq |s|$.  Then

1438: by Proposition~\ref{shyr}, one of the words $v_1^\ell wu$ or

1439: $v_1^{\ell+1} wu$ is primitive.  Hence,

1440: at least one of the words $u v_1^\ell w$ or $u v_1^{\ell+1} w$

1441: is a primitive word in $L$, contradicting the minimality of $s$.

1442:

1443: If, instead, $|v_1wu| > n$, then we have $n < |v_1wu| < |v_1v_2wu| \leq |s|$.

1444: Again, by Proposition~\ref{shyr}, one of the words $v_1wu$ or

1445: $v_1v_2wu$ is primitive.  Hence, at least one of the words

1446: $uv_1w$ or $uv_1v_2w$ is a primitive word in $L$, contradicting

1447: the minimality of $s$.

1448:

1449: Case~3: $p_1 \neq q$ and $p_2 \neq q$.  In this case we choose the

1450: smaller of $v_1$ and $v_2$ to ``pump'', so without loss of generality,

1451: suppose $|v_1| \leq |v_2|$.  Let $\ell$ be the smallest positive integer

1452: such that $n \leq |v_1^\ell wu| < |v_1^{\ell+1} wu| \leq |s|$.  Note

1453: that $|v_1^2 wu| \leq |v_1v_2wu| < |s|$, so such an $\ell$ must exist.

1454: Then by Proposition~\ref{shyr}, one of the words $v_1^\ell wu$ or

1455: $v_1^{\ell+1} wu$ is primitive.  Hence,

1456: at least one of the words $u v_1^\ell w$ or $u v_1^{\ell+1} w$

1457: is a primitive word in $L$, contradicting the minimality of $s$.

1458:

1459: All remaining possibilities are symmetric to the cases considered above.

1460: Since in all cases we derive a contradiction, it follows that if $L$

1461: contains infinitely many non-$k$-powers, it contains a non-$k$-power

1462: $s$, where $n \leq |s| \leq 3n$.

1463:

1464: It remains to consider the situation where $M$ is a DFA over an alphabet

1465: of size $\geq 2$.  Let $a \neq b$ be alphabet symbols of $M$.  If $M$ does

1466: not have a dead state, then for every integer $i \geq n-1$,

1467: there exists a word $x$, $|x| \leq n-1$, such

1468: that $a^ibx \in L$.  These words $a^ibx$ are all distinct and primitive.

1469: Thus, whenever $M$ has no dead state, $M$ always accepts infinitely many

1470: non-$k$-powers, and, in particular, $M$ accepts a non-$k$-power $s$,

1471: where $n \leq |s| \leq 2n-1$.

1472:

1473: If, on the other hand, $M$ does have a dead state,

1474: then we may delete this dead state and apply

1475: the earlier argument with the bound $3n-3$ in place of $3n$.

1476:

1477: Finally, the converse of statement (2) follows immediately from

1478: Lemma~\ref{inf_many}.

1479: \end{proof}

1480:

1481: We can now deduce the following algorithmic result.

1482:

1483: \begin{theorem}\label{alg_allkpow}

1484: Let $k \geq 2$ be an integer.  Given an NFA $M$ with $n$ states and

1485: $t$ transitions, it is

1486: possible to determine if every word in $L(M)$ is a $k$-power in

1487: $O(n^3 + t n^2)$ time.

1488: \end{theorem}

1489:

1490: \begin{proof}

1491:      The proof is exactly analogous to that of Theorem~\ref{thm2}, and

1492: we only indicate what needs to be changed.  Suppose $M$ has $n$ states.

1493: We create an NFA, $M'_r$, for $r = 3n$, such that

1494: no word in $L(M'_r)$ is a $k$-power, and $M'_r$ accepts all non-$k$-powers

1495: of length $\leq r$ (and perhaps some other non-$k$-powers).

1496:

1497: Note that we may assume that $k \leq r$.  If $k > r$, then no word of

1498: length $\leq r$ is a $k$-power.  In this case, to obtain the desired

1499: answer it suffices to test if the set $\{ x \in L(M) : |x| \leq r \}$

1500: is empty.  However, this set is empty if and only if $L(M)$ is empty, and

1501: this is easily verified in linear time.

1502:

1503: We now form a new NFA $A$ as

1504: the cross product of $M'_r$ with $M$.  From Theorem~\ref{k-pow}, it follows

1505: that $L(A) = \emptyset$ iff

1506: every word in $L(M)$ is a $k$-power.  We can determine if

1507: $L(A) = \emptyset$ by

1508: checking (using depth-first search)

1509: whether any final states of $A$ are reachable from the start state.

1510:

1511:      It remains to see how $M'_r$ is constructed.

1512: If the length of a word $x$ accepted by $M_r'$ is a multiple of $k$,

1513: $x$ can be partitioned into $k$ sections of equal length.  In order

1514: for $M_r'$ to accept $x$, the NFA must `verify' a symbol mismatch between

1515: two symbols found in different sections but in the same position.

1516:

1517: If $x$ is a non-$k$-power, then a symbol mismatch will occur between two

1518: sections of $x$, call them $s_i$ and $s_j$.  This means that $s_i$ and

1519: $s_j$ differ in at least one position.  Comparing $s_i$ and $s_j$ to

1520: $s_1$, the first section of $x$, we notice that at least one of $s_i$ or

1521: $s_j$ must have a symbol mismatch with $s_1$ (otherwise $s_1=s_i=s_j$,

1522: which would give a contradiction).  Therefore, when checking $x$ for a

1523: symbol mismatch, it is sufficient to only check $s_1$ against each of the

1524: remaining $k-1$ sections, as opposed to checking all

1525: ${k \choose 2}$ possibilities.

1526:

1527: In order to construct $M_r'$, we create a series of `lobes', each of which

1528: is connected to the start state by an $\epsilon$-transition.  Each lobe

1529: represents three simultaneous `guesses' made by the NFA, which are:

1530:

1531: \begin{itemize}

1532: \item Which alphabet symbols will conflict and in which order.  The number

1533: of possible conflict pairs is $| \Sigma | \left( |\Sigma| - 1 \right)$.

1534:

1535: \item The section in which there will be a symbol mismatch with the first

1536: section.  There are $k-1$ possible sections.

1537:

1538: \item The position in which the conflict will occur.  In the worst case

1539: when the length of the input is $r$, there will be at most

1540: $r/k$ possible positions.

1541: \end{itemize}

1542:

1543: This gives a total of at most $|\Sigma|\left( |\Sigma| - 1 \right) \cdot (k-1)

1544: \cdot  r/k $ lobes.  The construction of each lobe is

1545: illustrated in Figure~\ref{fig:module}.

1546:

1547: \begin{figure}[hbt]%55

1548: \input module.tex

1549: \caption{One lobe of the NFA for $k=3$, $r=12$ and

1550: $0,1$ conflicting symbols.}

1551: \label{fig:module}

1552: \end{figure}

1553:

1554: Each lobe contains at most $r+1$ states.

1555: In addition to these lobes, we also require a $k$-state submachine to

1556: accept all words whose lengths are not a multiple of $k$.

1557:

1558: In total, $M_r'$ has at most

1559: $$|\Sigma| \left( |\Sigma| - 1 \right) \cdot (k-1) \cdot

1560: {r \over k}  \cdot (r+1) + k + 1 \in O(r^2)$$ states

1561: (since $k \leq r$), and similarly, $O(r^2)$

1562: transitions.

1563: After constructing the cross-product, this gives an $O(n^3 + tn^2)$

1564: bound on the time required to determine if every word in $L(M)$ is a

1565: $k$-power.

1566: \end{proof}
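
To illustrate the state count, consider the parameters of Figure~\ref{fig:module}, namely $k = 3$ and $r = 12$, and assume a binary alphabet $\Sigma = \{0,1\}$.  The number of lobes is at most
\[
|\Sigma| \left( |\Sigma| - 1 \right) \cdot (k-1) \cdot {r \over k} = 2 \cdot 1 \cdot 2 \cdot 4 = 16,
\]
and so $M_r'$ has at most
\[
16 \cdot (r+1) + k + 1 = 16 \cdot 13 + 4 = 212
\]
states.  The running time can be accounted for as follows, assuming the product automaton is built in the standard way by pairing transitions with equal labels.  Since $r \in O(n)$, the automaton $M_r'$ has $O(n^2)$ states and $O(n^2)$ transitions, so $A$ has $O(n \cdot n^2) = O(n^3)$ states and $O(t \cdot n^2)$ transitions.  A depth-first search on $A$ takes time linear in its size, that is, $O(n^3 + tn^2)$.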

1567:

1568:      Theorem~\ref{k-pow} suggests the following question:  if $M$ is an

1569: NFA with $n$ states that accepts at least one non-$k$-power, how long

1570: can a shortest non-$k$-power be?   Theorem~\ref{k-pow} proves an

1571: upper bound of $3n$.   A lower bound of $2n-1$ for infinitely many

1572: $n$ follows easily from the obvious $(n+1)$-state NFA accepting

1573: ${\tt a}^{n} ({\tt a}^{n+1})^*$, where $n$ is divisible by $k$.

1574: However, Ito, Katsura, Shyr, and Yu \cite{Ito&Katsura&Shyr&Yu:1988}

1575: gave a very interesting example that improves this lower bound:

1576: if $x = ( (ab)^n a)^2$ and $y = ba x ab$,  then $x$ and $xyx$ are

1577: squares, but $xyxyx$ is not a power.  Hence, the obvious $(8n+8)$-state NFA

1578: that accepts $x(yx)^*$ has the property that, for $k=2$, the shortest non-$k$-power

1579: accepted is of length $20n+18$.  This improves the lower bound  for

1580: infinitely many $n$.

1581:

1582:        We now generalize their lower bound.

1583:

1584: \begin{proposition}

1585:        Let $k \geq 2$ be fixed.  There exist infinitely many NFAs $M$

1586: with the property that if $M$ has $r$ states, then the shortest

1587: non-$k$-power accepted is of length $\left(2+ {1 \over{2k-2}}\right) r - O(1)$.

1588: \end{proposition}

1589:

1590: \begin{proof}

1591: Let $u = (ab)^n a$, $x = u^k$, and $ y = x^{-1} (x ba u^{-1} x)^k x^{-1}$.

1592: Thus $xyx = (x ba u^{-1} x)^k$.

1593: Hence $x$ and $xyx$ are both $k$-powers.

1594:

1595: However, $xyxyx$ is not a $k$-power.  To see this,

1596: assume it is, and write $xyxyx = g_1 g_2 \cdots g_k$.

1597: Look at the character in position $2kn-2n+k$ (indexing beginning with 1)

1598: in $g_1$ and $g_k$.  In $g_1$ it is $a$, and in $g_k$ it is $b$, so

1599: $xyxyx$ is not a $k$-power.

1600:

1601: We can accept $x(yx)^*$ with an NFA using $r = |xy|$ states.

1602: The shortest non-$k$-power accepted is $xyxyx$, which is of length $m = |xyxyx|$.

1603:

1604: We have $|u| = 2n+1$, $|x| = k(2n+1)$, $|y| = k(4kn - 6n + 2k - 1)$,

1605: $ r = |xy| = 2k(2kn - 2n + k)$, and $ m = |xyxyx| = k(8kn - 6n + 4k + 1)$.

1606: Thus $m = {{4k-3} \over {2k-2}}r - {k \over {k-1}} =

1607: \left(2 + {1 \over {2k-2}}\right)r - O(1)$.

1608: \end{proof}
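
For a concrete instance of the construction (the values $k = 2$ and $n = 1$ are chosen only for illustration), we have $u = {\tt aba}$, $x = {\tt abaaba}$, and $y = {\tt ba}\,x\,{\tt ab} = {\tt baabaabaab}$, which is the example of Ito et al.  Here $xyx = ({\tt abaababaaba})^2$, $|xy| = 16$, and $|xyxyx| = 38$, in agreement with
\[
{{4k-3} \over {2k-2}} \cdot 16 - {k \over {k-1}} = {5 \over 2} \cdot 16 - 2 = 38.
\]
Splitting $xyxyx$ into two halves of length $19$ gives
\[
g_1 = {\tt abaababaabaabaababa}, \qquad g_2 = {\tt ababaabaabaababaaba},
\]
which differ in position $2kn - 2n + k = 4$, so $xyxyx$ is not a square.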

1609:

1610: Next, we apply part~(2) of Theorem~\ref{k-pow} to obtain an algorithm

1611: to check if an NFA accepts infinitely many non-$k$-powers.

1612:

1613: \begin{theorem}

1614: Let $k \geq 2$ be an integer.  Given an NFA $M$ with $n$ states and

1615: $t$ transitions, it is possible to determine if all but finitely many words

1616: in $L(M)$ are $k$-powers in $O(n^3 + t n^2)$ time.

1617: \end{theorem}

1618:

1619: \begin{proof}

1620: The proof is similar to that of Theorem~\ref{alg_allkpow}.

1621: The only difference is that in view of part~(2) of Theorem~\ref{k-pow}

1622: we instead construct $M_r'$ to accept all non-$k$-powers $s$,

1623: where $n \leq |s| \leq 3n$.  We leave the details to the reader.

1624: \end{proof}

1625:

1626: \section{Automata accepting only powers}

1627: \label{powers}

1628:

1629: In this section we move from the problem of testing if an automaton

1630: accepts only $k$-powers to the problem of testing if it accepts only

1631: powers (of any kind).  Just as Theorem~\ref{k-pow} was the starting

1632: point for our algorithmic results in Section~\ref{kp}, the

1633: following theorem of Ito, Katsura, Shyr, and Yu

1634: \cite{Ito&Katsura&Shyr&Yu:1988} is the starting point for our

1635: algorithmic results in this section.  We state the theorem in

1636: a stronger form than was originally presented by Ito et al.

1637:

1638: \begin{theorem}

1639: \label{ito}

1640: Let $L$ be accepted by an $n$-state NFA $M$.

1641: \begin{enumerate}

1642: \item Every word in $L$ is a power if and only if every word in the set

1643: $\lbrace x \in L : |x| \leq 3n \rbrace$ is a power.

1644: \item All but finitely many words in $L$ are powers if and only if

1645: every word in the set $\lbrace x \in L : n \leq |x| \leq 3n \rbrace$

1646: is a power.

1647: \end{enumerate}

1648: Further, if $M$ is a DFA over an alphabet of size $\geq 2$, then the bound $3n$

1649: may be replaced by $3n-3$.

1650: \end{theorem}

1651:

1652: We next prove an analogue of Proposition~\ref{my-ner}.  We

1653: need the following result, first proved by Birget \cite{Birget:1992},

1654: and later, independently, in a weaker form, by Glaister and Shallit

1655: \cite{Glaister&Shallit:1996}.

1656:

1657: \begin{theorem}

1658: \label{birget}

1659: Let $L \subseteq \Sigma^*$ be a regular language.  Suppose there exists

1660: a set of pairs

1661: \[

1662: S = \{(x_i,y_i) \in \Sigma^* \times \Sigma^* : 1 \leq i \leq n \}

1663: \]

1664: such that

1665: \begin{itemize}

1666: \item $x_iy_i \in L$ for $1 \leq i \leq n$, and

1667: \item either $x_iy_j \notin L$ or $x_jy_i \notin L$ for $1 \leq i,j \leq n$,

1668: $i \neq j$.

1669: \end{itemize}

1670: Then any NFA accepting $L$ has at least $n$ states.

1671: \end{theorem}
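
As a simple illustration of Theorem~\ref{birget} (the language here is chosen only as an example), let $L = \{ {\tt a}^{n-1} \}$ and take the pairs $(x_i, y_i) = ({\tt a}^{i-1}, {\tt a}^{n-i})$ for $1 \leq i \leq n$.  Then
\[
x_i y_i = {\tt a}^{n-1} \in L \qquad \text{and} \qquad x_i y_j = {\tt a}^{n-1+i-j} \notin L \ \text{ for } i \neq j,
\]
so any NFA accepting $L$ has at least $n$ states; the obvious $n$-state NFA shows that this is tight.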

1672:

1673: \begin{proposition}

1674: \label{slender_7n}

1675: Let $M$ be an $n$-state NFA and let $\ell$ be a non-negative integer

1676: such that every word in $L(M)$ of length $\geq \ell$ is a power.

1677: For all $r \geq \ell$, the number of words in $L(M)$ of length $r$

1678: is at most $7n$.

1679: \end{proposition}

1680:

1681: \begin{proof}

1682: Let $r \geq \ell$ be an arbitrary integer.  The proof consists of three steps.

1683:

1684: Step~1.  We consider the set $A$ of words $w$ in $L(M)$ such that

1685: $|w| = r$ and $w$ is a $k$-power for some $k \geq 4$.  For each such $w$,

1686: write $w = x^i$, where $x$ is a primitive word, and define a pair

1687: $(x^2,x^{i-2})$.  Let $S_A$ denote the set of such pairs.

1688: Consider two pairs in $S_A$: $(x^2,x^{i-2})$ and

1689: $(y^2,y^{j-2})$.  The word $x^2y^{j-2}$ is primitive by Theorem~\ref{ls_eqn}

1690: and hence is not in $L(M)$.  The set $S_A$ thus satisfies the conditions

1691: of Theorem~\ref{birget}.  Since $L(M)$ is accepted by an $n$-state

1692: NFA, we must have $|S_A| \leq n$ and thus $|A| \leq n$.

1693:

1694: Step~2.  Next we consider the set $B$ of cubes of length $r$ in $L(M)$.

1695: For each such cube $w = x^3$, we define a pair $(x,x^2)$.  Let

1696: $S_B$ denote the set of such pairs.  Consider two pairs in $S_B$:

1697: $(x,x^2)$ and $(y,y^2)$.  Suppose that $xy^2$ and $yx^2$ are both in

1698: $L(M)$.  The word $xy^2$ is certainly not a cube; we claim that it

1699: cannot be a square.  Suppose it were.  Then $|x|$ and $|y|$ are even,

1700: so we can write $x = x_1 x_2$ and $y = y_1 y_2$ where

1701: $|x_1| = |x_2| = |y_1| = |y_2|$.  Now if $xy^2 = x_1 x_2 y_1 y_2 y_1 y_2$

1702: is a square, then $x_1 x_2 y_1 = y_2 y_1 y_2$, and so $y_1 = y_2$.

1703: Thus $y$ is a square; write $y = z^2$.

1704: By Theorem~\ref{ls_eqn}, $yx^2 = z^2x^2$ is primitive,

1705: contradicting our assumption that $yx^2 \in L(M)$.  It must be

1706: the case then that $xy^2$ is a $k$-power for some $k \geq 4$.

1707: Thus, $xy^2 = u^k$ for some primitive $u$ uniquely determined by $x$ and $y$.

1708: With each pair of cubes $x^3$ and $y^3$ such that both $xy^2$ and $yx^2$

1709: are in $L(M)$ we may therefore associate a $k$-power $u^k \in L(M)$ of length

1710: $r$, where $k \geq 4$.  We have already established in Step~1 that

1711: the number of such $k$-powers is at most $n$.  It follows that

1712: by deleting at most $n$ pairs from the set $S_B$ we obtain

1713: a set of pairs satisfying the conditions of Theorem~\ref{birget}.

1714: We must therefore have $|S_B| \leq 2n$ and thus $|B| \leq 2n$.

1715:

1716: Step~3.  Finally we consider the set $C$ of squares of length $r$ in

1717: $L(M)$.  For each such square $w = x^2$, we define a pair $(x,x)$.

1718: Let $S_C$ denote the set of such pairs.  Consider two pairs in

1719: $S_C$: $(x,x)$ and $(y,y)$.  Suppose that $xy$ and $yx$ are both

1720: in $L(M)$.  The word $xy$ is not a square and must therefore be

1721: a $k$-power for some $k \geq 3$.  We write $xy = u^k$ for some

1722: primitive $u$ uniquely determined by $x$ and $y$.  In Steps~1 and 2

1723: we established that the number of $k$-powers of length $r$, $k \geq 3$,

1724: is $|A| + |B| \leq 3n$.  It follows that

1725: by deleting at most $3n$ pairs from the set $S_C$ we obtain

1726: a set of pairs satisfying the conditions of Theorem~\ref{birget}.

1727: We must therefore have $|S_C| \leq 4n$ and thus $|C| \leq 4n$.

1728:

1729: Putting everything together, we see that there are

1730: $|A| + |B| + |C| \leq 7n$ words of length $r$ in $L(M)$,

1731: as required.

1732: \end{proof}

1733:

1734:      The bound of $7n$ in Proposition~\ref{slender_7n} is almost

1735: certainly not optimal.

1736:

1737: We now prove the following algorithmic result.

1738:

1739: \begin{theorem}

1740: Given an NFA $M$ with $n$ states, it is

1741: possible to determine if every word in $L(M)$ is a power in

1742: $O(n^5)$ time.

1743: \label{kats}

1744: \end{theorem}

1745:

1746: \begin{proof}

1747:      First, we observe that we can test whether a word $w$ of length

1748: $n$ is a power in $O(n)$ time, using a linear-time string matching

1749: algorithm, such as Knuth--Morris--Pratt \cite{Knuth&Morris&Pratt:1977}.

1750: To do so, search for $w = a_1 a_2 \cdots a_n$ in the word

1751: $x = a_2 \cdots a_n a_1 \cdots a_{n-1}$.  Then $w$ appears in $x$ iff

1752: $w$ is a power.  Furthermore, if the leftmost occurrence of

1753: $w$ in $x$ begins at $a_i$, then $w$ is an $n/(i-1)$-power, and

1754: $n/(i-1)$ is the largest exponent $e$ for which $w$ is an $e$-power.

1755:

1756:      Now, using Theorem~\ref{ito}, it suffices to test all words

1757: in $L(M)$ of length $\leq 3n$;  every word in $L(M)$ is a power iff all

1758: of these words are powers.  On the other hand, by

1759: Proposition~\ref{slender_7n}, if all words are powers, then

1760: the number of words of each length is bounded by $7n$.  Thus, it

1761: suffices to enumerate the words in $L(M)$ of lengths $1,2, \ldots, 3n$,

1762: stopping if the number of such words in any length exceeds $7n$.  If all

1763: these words are powers, then every word is a power.  Otherwise, if we

1764: find a non-power, or if the number of words in any length exceeds $7n$,

1765: then not every word is a power.

1766:

1767:       By the work of M\"akinen \cite{Makinen:1997} or

1768: Ackerman \& Shallit \cite{Ackerman&Shallit:2007}, we can enumerate

1769: these words in $O(n^5)$ time.

1770: \end{proof}
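
The power test at the start of the proof can be traced on two short words (chosen only for illustration).  For $w = {\tt abab}$ we search in $x = a_2 a_3 a_4 a_1 a_2 a_3 = {\tt bababa}$; the leftmost occurrence of $w$ begins at $a_3$, so $i = 3$ and $w$ is a $4/(3-1) = 2$-power.  For $w = {\tt aab}$ we have $x = {\tt abaa}$, which contains no occurrence of $w$, so $w$ is not a power.  The cost of the tests themselves is small compared with the enumeration: $O(n)$ words are examined for each of the $3n$ lengths, and each word has length at most $3n$, for a total of
\[
3n \cdot O(n) \cdot O(n) = O(n^3)
\]
time spent on power tests.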

1771:

1772:       Using part~(2) of Theorem~\ref{ito} along with

1773: Proposition~\ref{slender_7n}, we can prove the following.

1774:

1775: \begin{theorem}

1776:      Given an NFA $M$ with $n$ states,

1777: we can decide if all but finitely many words in $L(M)$ are

1778: powers in $O(n^5)$ time.

1779: \end{theorem}

1780:

1781: \begin{proof}

1782:       The proof is analogous to that of Theorem~\ref{kats}.  The only

1783: difference is that here we need only enumerate the words in $L(M)$ of

1784: lengths $n,n+1,\ldots,3n$.

1785: \end{proof}

1786:

1787:

1788: \section{Bounding the length of a smallest power}

1789: \label{smallkp}

1790:

1791: In Section~\ref{kp} we gave an upper bound on the length of

1792: a smallest non-$k$-power accepted by an $n$-state NFA.  In this section

1793: we study the complementary problem of bounding the length of

1794: the smallest $k$-power accepted by an $n$-state NFA.

1795:

1796: \begin{proposition}

1797: \label{upper_bd}

1798: Let $M$ be an NFA with $n$ states and let $k \geq 2$ be an integer.

1799: If $L(M)$ contains a $k$-power, then $L(M)$ contains a $k$-power

1800: of length $\leq kn^k$.

1801: \end{proposition}

1802:

1803: \begin{proof}

1804: Consider the NFA-$\epsilon$ $M'$ accepting $L(M)^{1/k}$ defined in the proof of

1805: Proposition~\ref{fixed-k}.  The only transitions from the start

1806: state of $M'$ are $\epsilon$-transitions to submachines whose states are

1807: $(2k-1)$-tuples of the form

1808: $[g_1, g_2, \ldots, g_{k-1}, p_0, p_1, \ldots, p_{k-1}]$,

1809: where the first $k-1$ elements of the tuple are fixed.  Thus we may

1810: consider $L(M')$ as a finite union of languages, each accepted by

1811: an NFA of size $n^k$.  It follows that if $M'$ accepts a non-empty

1812: word $w$, it accepts such a $w$ of length $\leq n^k$.  However,

1813: $M'$ accepts $w$ if and only if $M$ accepts $w^k$.  We conclude that

1814: if $L(M)$ contains a $k$-power, it contains one of length $\leq kn^k$.

1815: \end{proof}

1816:

1817: We now give a lower bound on the size of the smallest $k$-power

1818: accepted by an $n$-state DFA.

1819:

1820: \begin{proposition}

1821: Let $k \geq 2$ be an integer.  There exist infinitely many DFAs

1822: $M_n$ such that

1823:

1824: \begin{itemize}

1825: \item[(a)] $M_n$ has $O(kn)$ states;

1826: \item[(b)] The shortest $k$-power accepted by $M_n$ is of length

1827: $k\cdot\Omega\left({n \choose k}\right)$.

1828: \end{itemize}

1829: \end{proposition}

1830:

1831: \begin{proof}

1832: For $n \geq k$, let

1833: \[

1834: L_n = ({\tt a}^n)^+ {\tt b} ({\tt a}^{n-1})^+ {\tt b} \cdots

1835: ({\tt a}^{n-k+1})^+ {\tt b}.

1836: \]

1837: Then $L_n$ is accepted by a DFA with $O(kn)$ states,

1838: and the shortest $k$-power in $L_n$ is $({\tt a}^\ell{\tt b})^k$,

1839: where

1840: \[

1841: \ell = \text{lcm}(n,n-1,\ldots,n-k+1) \geq n(n-1)\cdots(n-k+1)/k!

1842: = {n \choose k},

1843: \]

1844: as required.

1845: \end{proof}
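
For example (the values are chosen only for illustration), take $k = 3$ and $n = 6$, so that $L_6 = ({\tt a}^6)^+ {\tt b} ({\tt a}^5)^+ {\tt b} ({\tt a}^4)^+ {\tt b}$.  Every word of $L_6$ contains exactly three ${\tt b}$'s and ends in ${\tt b}$, so a cube in $L_6$ must have the form $({\tt a}^\ell {\tt b})^3$ with $\ell$ divisible by $6$, $5$, and $4$.  Thus
\[
\ell = \text{lcm}(6,5,4) = 60 \geq {6 \choose 3} = 20,
\]
and the shortest cube in $L_6$ has length $3 \cdot 61 = 183$.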

1846:

1847: Next we consider the length of a smallest power (rather than $k$-power).

1848:

1849: \begin{proposition}

1850: \label{exponent_bd}

1851: Let $M$ be an NFA with $n$ states.  If $L(M)$ contains a power,

1852: it contains a $k$-power for some $k$, $2 \leq k \leq n+1$.

1853: \end{proposition}

1854:

1855: \begin{proof}

1856: Suppose to the contrary

1857: that the smallest $k$ for which $L(M)$ contains a $k$-power $w^k$

1858: satisfies $k > n+1$.  For some accepting computation of $M$ on $w^k$ let

1859: $q_1,q_2,\ldots,q_{k-1}$ be the states reached by $M$ after

1860: reading $w,w^2,\ldots,w^{k-1}$ respectively.  Since $k > n+1$, there

1861: exist $i$ and $j$ where $1 \leq i < j \leq k-1$ and $q_i = q_j$.

1862: It follows that $M$ accepts $w^\ell$ for $\ell = k - (j-i)$, where $2 \leq \ell < k$ since $1 \leq j-i \leq k-2$,

1863: contradicting the minimality of $k$.  We conclude that if $L(M)$ contains a

1864: power, it contains a $k$-power with $2 \leq k \leq n+1$.

1865: \end{proof}

1866:

1867: \begin{proposition}

1868: Let $M$ be an NFA with $n$ states.  If $L(M)$ contains a power,

1869: then $L(M)$ contains a power of length $\leq (n+1)n^{n+1}$.

1870: \end{proposition}

1871:

1872: \begin{proof}

1873: Apply Propositions~\ref{exponent_bd} and \ref{upper_bd}.

1874: \end{proof}

1875:

1876: We now give a lower bound.

1877:

1878: \begin{proposition}

1879: \label{smallest_pow}

1880: There exist infinitely many DFAs $M_n$ such that

1881:

1882: \begin{itemize}

1883: \item $M_n$ has $O(n)$ states;

1884: \item The shortest power accepted by $M_n$ is of length

1885: $e^{\Omega(\sqrt{n \log n})}$.

1886: \end{itemize}

1887: \end{proposition}

1888:

1889: \begin{proof}

1890: Let $p_i$ denote the $i$-th prime number.  For any integer

1891: $n \geq 2$, let $P(n) = p_k$ be the largest prime number such that

1892: $p_1 + p_2 + \cdots + p_k \leq n$.  We define

1893: \[

1894: L_n = ({\tt a}^{p_1})^+ {\tt b} ({\tt a}^{p_2})^+ {\tt b} \cdots

1895: ({\tt a}^{p_k})^+ {\tt b}.

1896: \]

1897: Then $L_n$ is accepted by a DFA with $O(n)$ states.

1898:

1899: If $k$ is itself prime,

1900: the shortest power in $L_n$ is $w = ({\tt a}^\ell{\tt b})^k$,

1901: where $\ell = p_1p_2 \cdots p_k$.  For $n \geq 2$, let

1902: \[

1903: F(n) = \prod_{p \leq P(n)} p,

1904: \]

1905: where the product is over primes $p$.

1906: We have $F(n) \in e^{\Omega(\sqrt{n \log n})}$ \cite[Theorem~1]{Miller:1987}.

1907: This lower bound is valid

1908: for all sufficiently large $n$; in particular, it holds for infinitely

1909: many $n$ such that $n = p_1 + p_2 + \cdots + p_k$, where $k$ is prime.

1910: This gives the desired result.

1911: \end{proof}
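
For example (the value of $n$ is chosen only for illustration), take $n = 10$.  Since $2 + 3 + 5 = 10$ and $2 + 3 + 5 + 7 > 10$, we have $k = 3$, which is prime, and $P(10) = 5$.  Then $L_{10} = ({\tt a}^2)^+ {\tt b} ({\tt a}^3)^+ {\tt b} ({\tt a}^5)^+ {\tt b}$, and the shortest power in $L_{10}$ is $({\tt a}^{\ell}{\tt b})^3$ with
\[
\ell = F(10) = 2 \cdot 3 \cdot 5 = 30,
\]
which has length $3 \cdot 31 = 93$.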

1912:

1913: \section{Additional results on powers}

1914: \label{add2pow}

1915:

1916: D\"om\"osi, Mart\'{\i}n-Vide, and Mitrana

1917: \cite[Theorem~10]{Domosi&Martin-Vide&Mitrana:2004} proved that if $L$

1918: is a slender regular language over $\Sigma$, and $Q_\Sigma$ is the

1919: set of primitive words over $\Sigma$, then $L \cap Q_\Sigma$ is regular.

1920: This result is somewhat surprising, since it is widely believed

1921: that $Q_\Sigma$ is not even context-free for $|\Sigma| \geq 2$.  In this

1922: section we apply a variation of their argument to show that $Q_\Sigma$ may be

1923: replaced by the language of squares (cubes, etc.) over $\Sigma$.

1924:

1925: For any integer $k \geq 2$ and alphabet $\Sigma$, let $P(k,\Sigma)$

1926: denote the set of $k$-powers over $\Sigma$.  Clearly, for $|\Sigma| \geq 2$,

1927: $P(k,\Sigma)$ is not context-free.

1928:

1929: \begin{proposition}

1930: If $L \subseteq \Sigma^*$ is a slender regular language, then for all

1931: integers $k \geq 2$, $L \cap P(k,\Sigma)$ is regular.

1932: \end{proposition}

1933:

1934: \begin{proof}

1935: If $L$ is slender, then by Theorem~\ref{slender} it

1936: suffices to consider $L = uv^*w$.

1937: The result is clearly true if $v$ is empty, so we suppose $v$ is

1938: non-empty.  Let $x$ and $y$ be the primitive roots of $v$ and $wu$

1939: respectively.  If $x = y$, then the set of $k$-powers in $v^*wu$

1940: is given by $v^*wu \cap (x^k)^*$, so the set of $k$-powers in $uv^*w$

1941: is regular.  If $x \neq y$, then by Theorem~\ref{p+q+},

1942: the set $v^*wu$ contains only finitely many $k$-powers.

1943: The set of $k$-powers in $uv^*w$ is therefore finite, and,

1944: a fortiori, regular.

1945: \end{proof}
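
To illustrate the case $x = y$ (with words chosen only as an example), let $k = 2$, $u = \epsilon$, $v = ({\tt ab})^3$, and $w = {\tt ab}$, so that $x = y = {\tt ab}$ and $uv^*w = \{ ({\tt ab})^{3m+1} : m \geq 0 \}$.  The word $({\tt ab})^{3m+1}$ is a square exactly when $3m+1$ is even, that is, when $m$ is odd, so
\[
uv^*w \cap P(2,\Sigma) = ({\tt ab})^4 \left( ({\tt ab})^6 \right)^* ,
\]
which is regular.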

1946:

1947: \section{Testing if an NFA accepts a bordered word}

1948: \label{bord}

1949:

1950: In this section we give an efficient algorithm to test if an

1951: NFA accepts a bordered word.  We also give upper and lower

1952: bounds on the length of a shortest bordered word accepted by

1953: an NFA.

1954:

1955: \begin{proposition}

1956:       Given an NFA $M$ with $n$ states and $t$ transitions,

1957: we can decide if $M$ accepts at least one

1958: bordered word in $O(n^3 t^2)$ time.

1959: \label{border}

1960: \end{proposition}

1961:

1962: \begin{proof}

1963:       Given an NFA $M = (Q, \Sigma, \delta, q_0, F)$,

1964:       we can easily create an NFA-$\epsilon$ $M'$ that

1965: accepts

1966: $$\lbrace u \in \Sigma^* \ : \ \text{there exists

1967: $w \in \Sigma^*$ such that } uwu \in L \rbrace$$

1968: by ``guessing'' the state we would be in after reading $uw$, and

1969: then verifying it.   More formally, we let $M' = (Q', \Sigma,

1970: \delta', q'_0, F')$ where $Q' = \lbrace q'_0 \rbrace

1971: \cup \ \lbrace [p,q,r] \ : \ p, q, r \in Q \rbrace$,

1972: $F' = \lbrace [p,q,r] \ : \ r \in F \text{ and there exists

1973: $w \in \Sigma^*$ such that } q \in \delta(p,w) \rbrace$.

1974: The transitions are defined as follows:

1975: $\delta(q'_0, \epsilon) = \lbrace [q_0, p, p] \ : \ p \in Q \rbrace$

1976: and

1977: $$\delta([p,q,r],a) = \lbrace [p', q, r'] \ : \ p' \in \delta(p,a),

1978: 	r' \in \delta(r,a) \rbrace.$$

1979: If $M$ has $n$ states and $t$ transitions,

1980: then $M'$ has $n^3 + 1$ states and at most $n + n^3 t^2$ transitions.

1981: Now get rid of all useless states and their associated transitions.

1982: We can compute the final states by doing $n$ depth-first searches,

1983: starting at each node, at a cost of $O(n(n+t))$ time.

1984: Now we just test whether $M'$ accepts a nonempty

1985: string, which can be

1986: done in linear time in the size of $M'$.

1987: \end{proof}
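
The construction above can be sketched in code. The following Python fragment is a minimal illustration under stated assumptions, not the authors' implementation: the function name \texttt{accepts\_bordered\_word} and the dictionary encoding of $\delta$ are hypothetical, $M$ is assumed to have no $\epsilon$-moves, and the triples $[p,q,r]$ are explored on the fly rather than by building $M'$ explicitly. No claim is made that it attains the stated time bound.

```python
from collections import deque


def accepts_bordered_word(states, delta, start, finals):
    """Decide whether an NFA accepts some word u w u with u nonempty.

    delta maps (state, letter) -> set of successor states (no epsilon moves).
    """
    # out[s][a] = set of states reachable from s on letter a.
    out = {s: {} for s in states}
    for (s, a), targets in delta.items():
        out[s].setdefault(a, set()).update(targets)

    def reachable(p):
        # All states reachable from p by some (possibly empty) word w.
        seen, stack = {p}, [p]
        while stack:
            s = stack.pop()
            for targets in out[s].values():
                for t in targets:
                    if t not in seen:
                        seen.add(t)
                        stack.append(t)
        return seen

    reach = {p: reachable(p) for p in states}

    def step(triple):
        # Advance the first and third components on the same letter;
        # the guessed middle state q never changes.
        p, q, r = triple
        for a, succ_p in out[p].items():
            for p2 in succ_p:
                for r2 in out[r].get(a, ()):
                    yield (p2, q, r2)

    # u must be nonempty, so seed the search with the triples reached
    # after one letter from the initial triples [start, q, q].
    seen, queue = set(), deque()
    for q in states:
        for nxt in step((start, q, q)):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    while queue:
        p, q, r = queue.popleft()
        # Final triple: r is accepting and some w leads from p to q.
        if r in finals and q in reach[p]:
            return True
        for nxt in step((p, q, r)):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

For example, on a four-state chain accepting only $aba$ the function returns \texttt{True}, while on a three-state chain accepting only $ab$ it returns \texttt{False}.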

1988:

1989: \begin{corollary}

1990:      If $M$ is an NFA with $n$ states, and it accepts at least one

1991: bordered word, it must accept a bordered word of length

1992: $< 2n^2 + n$.

1993: \end{corollary}

1994:

1995: \begin{proof}

1996:     Consider the NFA-$\epsilon$ $M'$ constructed in the proof of

1997: Proposition~\ref{border}, which accepts

1998: $$L' = \lbrace u \in \Sigma^* \ : \ \text{there exists

1999: $w \in \Sigma^*$ such that } uwu \in L(M) \rbrace.$$

2000: If $M$ accepts a bordered string, then $M'$ accepts a nonempty string.

2001: Although $M'$ has $n^3+1$ states, once a computation

2002: leaves $q'_0$ and enters a triple of the form $[p,q,r]$, it never

2003: enters a state $[p',q',r']$ with $q \not= q'$.  Thus we may view

2004: the NFA $M'$ as implicitly defining a union of $n$ disjoint languages,

2005: each accepted by an NFA with $n^2$ states.     Therefore, if $M'$

2006: accepts a nonempty string $u$, it accepts one of length at most $n^2$.

2007: Now the corresponding bordered string is $uwu$.  The string $w$

2008: is implicitly defined in the previous proof as a path from a state

2009: $p$ to a state $q$.  If such a path exists, it is of length at most

2010: $n-1$.  Thus there exists $uwu \in L(M)$  with $|uwu| \leq 2n^2 + n-1$.

2011: \end{proof}

2012:

2013: \begin{proposition}

2014:        For infinitely many $n$ there is a DFA with $n$ states

2015: such that the shortest bordered word accepted is of length

2016: $n^2/2 - 6n +43/2$.

2017: \end{proposition}

2018:

2019: \begin{proof}

2020: Consider $a (b^t)^+ c a (b^{t-1})^+ c$.  An obvious DFA can accept

2021: this using $2t+5$ states.  However, the

2022: shortest bordered word accepted is $a b^{t(t-1)} c

2023: a b^{t(t-1)} c$, which is of length $2t(t-1)+ 4 = n^2/2 - 6n + 43/2$.

2024: \end{proof}
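
As a check on the arithmetic (a worked substitution added for illustration): with $n = 2t+5$ we have $t = (n-5)/2$ and $t-1 = (n-7)/2$, so
\[
2t(t-1) + 4 = \frac{(n-5)(n-7)}{2} + 4 = \frac{n^2 - 12n + 35}{2} + 4
= \frac{n^2}{2} - 6n + \frac{43}{2}.
\]
For instance, $t = 3$ gives $n = 11$, and the shortest bordered word $a b^{6} c a b^{6} c$ has length $16 = 121/2 - 66 + 43/2$.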

2025:

2026:     We now consider

2027: testing if an NFA accepts infinitely many bordered words.

2028:

2029: \begin{corollary}

2030:      If an NFA $M$ has $n$ states and $t$ transitions,

2031: we can test whether $M$ accepts infinitely many bordered words

2032: in $O(n^6 t^2)$ time.

2033: \end{corollary}

2034:

2035: \begin{proof}

2036:       If an NFA $M$ accepts infinitely many words of the form $uwu$,

2037: there are two possibilities, at least one of which must hold:

2038:

2039: \begin{itemize}

2040: \item[(a)] there is a single word $u$ such

2041: that there are infinitely many $w$ with $uwu \in L(M)$, or

2042:

2043: \item[(b)] there

2044: are infinitely many $u$, with possibly different $w$ depending on $u$,

2045: such that $uwu \in L(M)$.

2046: \end{itemize}

2047:

2048:       To check these possibilities, we return to the NFA-$\epsilon$ $M'$

2049: constructed in the proof of Proposition~\ref{border}.  First, for each pair

2050: of states $q_i$ and $q_j$, we determine whether there exists a nonempty

2051: path from $q_i$ to $q_j$.  This can be done with

2052: $n$ different depth-first searches, starting at each vertex, at a cost

2053: of $O(n^3(n^3+t^2))$ time.

2054: In particular, for each vertex, we learn whether there

2055: is a nonempty cycle beginning and ending at that vertex.

2056:

2057: Now let us check whether (a) holds.  After removing all useless states

2058: and their associated transitions, look at the remaining final states

2059: $[p,q,r]$ of $M'$ and determine if there is a path from $p$ to $q$

2060: that goes through a vertex with a cycle.   This can be done by

2061: testing, for each vertex $s$ that has a cycle, whether there is a non-empty

2062: path from $p$ to $s$ and then $s$ to $q$.  If such a vertex exists, then

2063: there is a word $u$ with infinitely many $w$ such that $uwu \in L(M)$.

2064:

2065: To check whether (b) holds, we just need to know whether $M'$ accepts

2066: infinitely many strings, which we can easily check by looking for a

2067: directed cycle.

2068:

2069: The total cost is therefore $O(n^3(n^3 t^2))$.

2070: \end{proof}

2071:

2072: We now prove the following decomposition theorem for regular languages

2073: consisting only of bordered words.

2074:

2075: \begin{theorem}

2076: If every word in a regular language $L$ is bordered, then there is a

2077: decomposition of $L$ as a finite union of regular languages of the

2078: form $JKJ$, where each $J$ and $K$ are regular and $\epsilon \not\in J$.

2079: \end{theorem}

2080:

2081: \begin{proof}

2082: Let $L$ be accepted by an NFA $M = (Q,\Sigma,\delta,q_0,F)$.

2083: For each $x \in \Sigma^+$, define an automaton $M_x = (Q,\Sigma,\delta,I',F')$

2084: (for $M_x$ we permit multiple initial states), where the set of

2085: initial states is $I' = \delta(q_0, x)$,

2086: and the set of final states is $F' = \{q \in Q : \delta(q,x) \cap F \neq \emptyset\}$.

2087: Then $M_x$ has the property that for every $w \in L(M_x)$, we have

2088: $xwx \in L(M)$.  Note that there are only finitely many distinct automata

2089: $M_x$.

2090:

2091: For each automaton $M_x$, define the regular language

2092: \[

2093: L_x = \{y : \delta(q_0,y) = I' \text{ and } \{q \in Q: \delta(q,y) \cap F \neq \emptyset\} = F'\}.

2094: \]

2095: Note that again there are only finitely many distinct languages $L_x$.

2096:

2097: For every $x \in \Sigma^+$, every word in $L_x L(M_x) L_x$ is in $L$.

2098: Furthermore, if $w \in L$ is bordered, then there exists $x \in \Sigma^+$

2099: such that $w \in L_x L(M_x) L_x$.  Thus, if every word of $L$

2100: is bordered, then $L = \bigcup_{x \in \Sigma^+} L_x L(M_x) L_x$.

2101: Since there are only finitely many languages $L_x$ and $L(M_x)$,

2102: this union is finite, as required.

2103: \end{proof}

2104:

2105: \section{Testing if an NFA accepts an unbordered word}

2106: \label{unbord}

2107:

2108: We present a simple test to determine if all words in a regular language

2109: are bordered, and to determine if a regular language contains infinitely many

2110: unbordered words.

2111: We first need the following well-known result about words, which is due to

2112: Lyndon and Sch\"utzenberger \cite{Lyndon&Schutzenberger:1962}.

2113:

2114: \begin{lemma}\label{loft1}

2115: Suppose $x$, $y$ and $z$ are non-empty words, and that $xy = yz$.  Then

2116: there is a non-empty word $p$, a word $q$ and a non-negative

2117: integer $k_1$ for which we can write $x = pq$, $z = qp$, and $y = (pq)^{k_1}p$.

2118: \end{lemma}

2119:

2120: We also need the following result, which is just a variation of the

2121: pumping lemma.

2122:

2123: \begin{lemma}\label{loft2}

2124: Let $M = (Q,\Sigma,\delta,q_0,F)$ be an $n$-state NFA.

2125: Let $L$ be the language accepted by $M$.

2126: Let $d$ be a positive integer.

2127: Let $(X,y,Z)$ be a $3$-tuple of words

2128: for which $|y|$ is a multiple of $d$, $|y| \ge nd$ and $XyZ \in L$.

2129: Then there are words $r$, $s$ and $t$, whose lengths are multiples of $d$,

2130: with $|s| \ge d$, for which we can

2131: write $y = rst$, and, for all $z \ge 0$, $Xrs^ztZ \in L$.

2132: \end{lemma}

2133:

2134: \begin{proof}

2135: Set $l := |X|$ and $m := |y|/d$, $\gamma := XyZ$, and $k := |\gamma|$.

2136: First, write $\gamma$ as a sequence of letters, that is,

2137: $\gamma := \gamma_1 \gamma_2 \cdots

2138: \gamma_k$ with each $\gamma_i$ a letter.  By $\gamma[i,j]$ for $1 \le i,j

2139: \le |\gamma|$ we

2140: mean the subsequence that consists of the $j-i+1$ consecutive letters of $\gamma$

2141: starting at position $i$ and ending at position $j$, that is, $\gamma_i

2142: \gamma_{i+1}\cdots \gamma_j$.

2143: If $i > j$ we take $\gamma[i,j]$ to be the empty word.

2144: Now we have the following sequence of $k$ states

2145: \[q_1 \in \delta(q_0, \gamma_1), q_2 \in \delta(q_1, \gamma_2), \dots,

2146: q_k \in \delta(q_{k-1}, \gamma_k).\]

2147: Since $\gamma \in L$, we may choose this sequence so that $q_k$ is a final state.

2148:

2149: Note that $y = \gamma[l+1, l+md]$, and consider the following sequence

2150: of $m+1$ states of $M$:

2151:

2152: \[q_l, q_{l+d}, q_{l+2d}, \dots, q_{l+md}.\]

2153:

2154: Since $m \ge n$, the pigeonhole principle gives integers $i$ and $j$, with $0 \le i < j \le m$, for which

2155: $q_{l+id} = q_{l+jd}$.  Set $r := \gamma[l+1, l+id]$, $s := \gamma[l+id+1,

2156: l+jd]$, and $t := \gamma[l+jd+1, l+md]$, so $y = rst$.  Note that $|s| \ge d$,

2157: and the desired conclusion follows immediately.

2158: \end{proof}
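
The pigeonhole step in this proof can be sketched as follows. This is a minimal illustration, assuming that one accepting computation has already been fixed and recorded; the function name \texttt{stride\_pump\_split} and the list \texttt{run} are hypothetical names introduced here. It samples the states at every $d$-th position of $y$ and splits $y$ at the first repeated state.

```python
def stride_pump_split(y, run, d):
    """Split y = r s t as in the lemma, or return None if no repeat is found.

    run[i] is the state of one fixed accepting computation after the first
    i letters of y have been read, so len(run) == len(y) + 1.
    """
    if d <= 0 or len(y) % d != 0 or len(run) != len(y) + 1:
        raise ValueError("need d > 0, d dividing len(y), and len(y) + 1 states")
    first_index = {}  # state -> first sampled index at which it was seen
    for i in range(len(y) // d + 1):
        state = run[i * d]
        if state in first_index:
            # Same state at sampled positions first_index[state] < i:
            # the factor between them can be repeated or deleted.
            a, b = first_index[state] * d, i * d
            return y[:a], y[a:b], y[b:]
        first_index[state] = i
    return None
```

When $|y| \ge nd$ a repeat must occur, so \texttt{None} is returned only for inputs shorter than the lemma requires. All three returned factors have lengths that are multiples of $d$, and the middle one has length at least $d$.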

2159:

2160: \begin{lemma}\label{loft3}

2161: Let $M$ be an $n$-state NFA.  Let $L$ be the language accepted by $M$.

2162: Let $(X,Y,Z)$ be a $3$-tuple of words for which $XYZ \in L$.

2163: Then there is a word $y$ for which $|y| < n$ and $XyZ \in L$.

2164: \end{lemma}

2165:

2166: \begin{proof}

2167: Let $S := \{u \in \Sigma^{*} : XuZ \in L \}$.  Let $y$ be an element of

2168: $S$ of minimal length.  We proceed by contradiction, and suppose $|y| \ge n$.

2169: We apply Lemma~\ref{loft2} to $(X,y,Z)$, with $d = 1$, and write $y = rst$

2170: with $s$ non-empty.  Then $XrtZ \in L$, which violates the minimality of $|y|$.

2171: \end{proof}

2172:

2173: \begin{lemma}\label{loft4}

2174: Suppose there are words $\Psi_L$, $\Psi_R$, $e$, $f$, $g$ and $h$ with

2175: $|\Psi_L| = |\Psi_R|$, $|e| < |\Psi_L|$, $|g| < |\Psi_L|$, and for which

2176: \begin{equation}\label{star1}

2177: b_\zeta := \Psi_Le = f\Psi_R,

2178: \end{equation}

2179: and

2180: \begin{equation}\label{star2}

2181: b_\eta := \Psi_Lg = h\Psi_R.

2182: \end{equation}

2183: Suppose further that $|b_\eta| < |b_\zeta|$.

2184: Then we can write $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$

2185: for $p$ a non-empty word, $q$ a word for which $|g| + |pq| = |f|$,

2186: and $k$ a positive integer.

2187: \end{lemma}

2188:

2189: \begin{proof}

2190: Since $|b_\eta| < |b_\zeta|$, we must have $|g| < |e| < |\Psi_R|$.

2191: This last observation, together with (\ref{star1}) and (\ref{star2})

2192: above allows us to assert that there are non-empty words $s_1$ and $s_2$, with

2193: $|s_2| > |s_1|$, such that $\Psi_R = s_1e = s_2g$.

2194: This last fact combined again with (\ref{star1}) and (\ref{star2}) yields that

2195: \begin{equation}\label{star3}

2196: \Psi_L = f s_1 = hs_2,

2197: \end{equation}

2198: and

2199: \begin{equation}\label{star4}

2200: \Psi_R = s_1e  = s_2g.

2201: \end{equation}

2202:

2203: Now we can apply (\ref{star3}) and (\ref{star4}) to assert that there are

2204: non-empty words $r_1$ and $r_2$ for which $s_1 r_1 = s_2 = r_2 s_1$; that is,

2205: \begin{equation}\label{star5}

2206: s_1 r_1 = r_2 s_1.

2207: \end{equation}

2208:

2209: Now apply Lemma~\ref{loft1} to (\ref{star5}) to get that there is a non-empty

2210: word $p$, a word $q$ and an integer $k_1 \ge 0$ for which

2211: $s_1 = (pq)^{k_1}p$, $r_1 = qp$, and $r_2 = pq$.  Set $k := k_1 + 1$.

2212: Then $s_2 = (pq)^{k}p$, and (\ref{star3}) gives $\Psi_L = h(pq)^{k}p$,

2213: and (\ref{star4}) gives $\Psi_R = (pq)^{k}pg$.

2214: Also $s_2 = r_2 s_1$ combined with (\ref{star3}) above gives that $f = hr_2$,

2215: so $|g| + |pq| = |h| + |pq| = |h| + |r_2| = |f|$.

2216: \end{proof}
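
A small instance may help in reading the lemma (an illustrative example chosen here, not taken from the argument above).  Take $\Psi_L = caba$, $\Psi_R = abad$, $e = bad$, $f = cab$, $g = d$ and $h = c$.  Then
\[
b_\zeta = \Psi_L e = f \Psi_R = cababad, \qquad
b_\eta = \Psi_L g = h \Psi_R = cabad,
\]
so $|b_\eta| = 5 < 7 = |b_\zeta|$.  In the notation of the proof, $s_1 = a$, $s_2 = aba$, $r_1 = ba$ and $r_2 = ab$, so Lemma~\ref{loft1} gives $p = a$, $q = b$ and $k_1 = 0$.  Hence $k = 1$, $\Psi_L = h(pq)^{k}p = c \cdot ab \cdot a$, $\Psi_R = (pq)^{k}pg = ab \cdot a \cdot d$, and $|g| + |pq| = 1 + 2 = 3 = |f|$.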

2217:

2218: Theorems~\ref{loft_thm1} and \ref{loft_thm3} below are the main results.

2219:

2220: \begin{theorem}\label{loft_thm1}

2221: Let $M$ be an $n$-state NFA. Let $L$ be the language accepted by $M$.

2222: Let $N$ be a non-negative integer.

2223: Suppose all words in $L$ of length in the interval $[N, 2N+6n+1]$ are bordered.

2224: Then all words in $L$ of length greater than $2N+6n+1$ are bordered.

2225: Hence, if all words in $L$ of length at most $6n+1$ are bordered, then all the words

2226: in $L$ must be bordered.

2227: \end{theorem}

2228:

2229: \begin{proof}

2230: We'll prove Theorem~\ref{loft_thm1} by making  the following series

2231: of observations.

2232: Throughout, we'll assume that all words in $L$ of length in the interval

2233: $[N, 2N + 6n+1]$ are bordered, and we'll assume $w$ is an unbordered word in $L$

2234: for which $|w| > 2N+6n+1$, with $|w|$ minimal.  We write $w$ as $u \theta v$ with

2235: $\theta$ a word for which $|\theta| \le 1$ and $u$ and $v$ words for

2236: which $|u| = |v| > 3n + N$.

2237:

2238: \begin{claim}\label{claim1}

2239: Write $u$ as $\Psi_L X_L$ and $v$ as

2240: $X_R \Psi_R$, for words $\Psi_L$, $X_L$,

2241: $\Psi_R$, $X_R$ for which $|X_L| = |X_R| = n$.

2242: (So that $w$ is  $\Psi_L X_L \theta X_R \Psi_R$.)

2243: Then there are words $x_L$ and $x_R$, both of length less than $n$, for

2244: which:

2245: \begin{itemize}

2246: \item[(i)] $\zeta := \Psi_L x_L \theta X_R \Psi_R \in L$, and

2247: \item[(ii)] $\eta := \Psi_L X_L \theta x_R \Psi_R \in L$.

2248: \end{itemize}

2249: Further, $N \le |\zeta | < |w|$, and $N \le |\eta| < |w|$.

2250: \end{claim}

2251:

2252: To justify (i), apply Lemma~\ref{loft3} to the 3-tuple $(\Psi_L, X_L,

2253: \theta X_R \Psi_R)$.  Similarly, to arrive at (ii), apply Lemma~\ref{loft3}

2254: again to the 3-tuple $(\Psi_L X_L \theta, X_R, \Psi_R)$.

2255:

2256: \begin{claim}\label{claim2}

2257: We can write $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$

2258: for $p$ a non-empty word, $g$, $h$ and $q$ words for which $|g| = |h|$,

2259: $|pq| + |g| \le n$, and $k$ a positive integer.

2260: Hence $w$ can be written as $h(pq)^{k}p X_L \theta X_R (pq)^{k}pg$.

2261: \end{claim}

2262:

2263: To justify Claim~\ref{claim2}, first recall $w = \Psi_L X_L \theta X_R

2264: \Psi_R$ and $|\Psi_L| = |\Psi_R| > 2n$.

2265: From Claim~\ref{claim1} above we get that $\zeta$ and $\eta$

2266: are bordered words, so we can assert that there exist non-empty words

2267: $b_{\zeta}$ and  $b_{\eta}$, and words $p_{\zeta}$ and $p_{\eta}$, for which:

2268: \begin{itemize}

2269: \item[(I)]    $\zeta = \Psi_L x_L \theta X_R

2270: \Psi_R = b_{\zeta} p_{\zeta} b_{\zeta}$, and

2271: \item[(II)]   $\eta = \Psi_L X_L \theta x_R \Psi_R =

2272: b_{\eta}p_{\eta} b_{\eta}$.

2273: \end{itemize}

2274:

2275: Note that, if  $|b_{\zeta}| \le |\Psi_L|$ then by (I) $b_{\zeta}$

2276: would be a border for $w$.  So we must have  $|b_{\zeta}| > |\Psi_L|$.

2277: Similarly, (II) gives that  $|b_{\eta}| > |\Psi_L|$.

2278: These latter facts together with (I) and (II) give that there exist

2279: non-empty words $e$, $f$, $g$, $h$, for which $|e| = |f|$, $|g| = |h|$,

2280: and for which

2281: \begin{equation}\label{2nd_star1}

2282: b_{\zeta} = \Psi_L e = f \Psi_R,

2283: \end{equation}

2284: and

2285: \begin{equation}\label{2nd_star2}

2286: b_{\eta} = \Psi_L g = h \Psi_R.

2287: \end{equation}

2288: Further, $|\zeta| < |w|$ implies that $|f| \le n$, and similarly

2289: $|\eta| < |w|$ implies that $|h| \le n$.

2290:

2291: Suppose $|b_{\eta}| = |b_{\zeta}|$.  Then from (\ref{2nd_star1}) and

2292: (\ref{2nd_star2}) above, $|e| = |g|$.

2293: But $e$ and $g$ are suffixes of $\Psi_R$, so

2294: we get that $e = g$.  Hence $b_{\zeta} = \Psi_L e = \Psi_L g

2295: = b_{\eta}$.  Set $b := b_{\zeta} = b_{\eta}$.

2296: Then from (II) above, as $|b| \le |\Psi_L| + n$, $b$ is a prefix of

2297: $\Psi_L X_L$.  And from (I) above, $b$ is a suffix of $X_R \Psi_R$.

2298: So $b$ is a non-empty prefix of $w$, and a suffix of $w$.  Hence, as $|b|

2299: \le \frac{|w|} {2}$, $b$ is a border for $w$.

2300:

2301: So we must have $|b_{\eta}| \neq |b_{\zeta}|$.  Suppose first that

2302: $|b_{\eta}| < |b_{\zeta}|$.  Now apply Lemma~\ref{loft4} to get that

2303: there is a positive integer $k$, a non-empty word $p$ and a word $q$ for which

2304: $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$.  And finally observe that

2305: $|pq| + |g| = |f| \le n$.

2306: If $|b_{\eta}| > |b_{\zeta}|$, the argument is similar, so Claim~\ref{claim2}

2307: is established.

2308:

2309: \begin{claim}\label{claim3}

2310: Let $x := pq$ in the statement of Claim~\ref{claim2}.  There is a

2311: conjugate $c_L$ of $x$ which is a prefix of

2312: $\Psi_L$, and there is a conjugate $c_R$ of $x$ which is a suffix of

2313: $\Psi_R$.

2314: \end{claim}

2315:

2316: To justify Claim~\ref{claim3}, let $S_L$ be the prefix of length $n$ of

2317: $\Psi_L$.  So there is a word $T_L$ for which we can write

2318: $\Psi_L X_L \theta X_R = S_LT_L$.  (So $w$ is $S_L T_L \Psi_R$.)

2319: Now apply Lemma~\ref{loft3} to $(S_L, T_L, \Psi_R)$, obtaining a word $t_L$,

2320: with $|t_L| < n$ for which $w_1 := S_L t_L \Psi_R \in L$.

2321: By supposition, since $N \le |w_1| < |w|$, $w_1$ has a border, say $b_1$.

2322: Further, if $|b_1| \le n$ then $b_1$ would be a border for $w$.

2323: So we must have $|b_1| >  n$.  And $|b_1| \le \frac{|w_1|} {2}$ implies

2324: $|b_1| \le |\Psi_R|$.

2325:

2326: So $b_1$ is a suffix of $\Psi_R$ of length greater than $n$; hence by

2327: Claim~\ref{claim2} above we can write $b_1 = s_x x^{k_2}pg$ for some integer

2328: $k_2 \ge 0$, with $s_x$ a suffix of $x$.  Write $x = p_xs_x$, and recall that

2329: $p$ is a prefix of $x$.  Then  $|s_x x^{k_2}pg| > n$ and $|x| + |g| \le n$

2330: (from Claim~\ref{claim2}) yields that $s_xp_x$ is a prefix of

2331: $s_xx^{k_2}pg$, that is, $s_xp_x$ is a prefix of $b_1$.  So set $c_L :=

2332: s_xp_x$.  Since $b_1$ is a prefix of $w_1$,

2333: $c_L$ must be a prefix of $w_1$, and $|c_L| \le n = |S_L|$ gives that

2334: $c_L$ is a prefix of $S_L$, and the first statement of Claim~\ref{claim3}

2335: follows.

2336:

2337: To get the second statement of Claim~\ref{claim3}, similarly

2338: let $S_R$ be the suffix of length $n$ of $\Psi_R$.

2339: So there is a word $T_R$ for which we can write $X_L \theta X_R \Psi_R =

2340: T_RS_R$.  (So $w$ is $\Psi_L T_R S_R$.)

2341: Now apply Lemma~\ref{loft3} to $(\Psi_L, T_R, S_R)$, obtaining a word $t_R$, with $|t_R| < n$ for

2342: which $w_2 := \Psi_L t_R S_R \in L$.

2343: By supposition, since $N \le |w_2| < |w|$,  $w_2$ has a border, say $b_2$.

2344: Further, if $|b_2|

2345: \le n$ then $b_2$ would be a border for $w$.

2346: So we can assert that $n < |b_2| \le |\Psi_L|$.

2347:

2348: So $b_2$ is a prefix of $\Psi_L$ of length greater than $n$; hence by

2349: Claim~\ref{claim2} we can write $b_2 = hx^{k_3}\rho_x$ for some integer

2350: $k_3 \ge  0$, with $\rho_x$

2351: a prefix of $x$.  Write $x = \rho_x \sigma_x$.  Then $|hx^{k_3} \rho_x | > n$

2352: and $|x| + |h| \le n$ (from Claim~\ref{claim2}) yields that $\sigma_x \rho_x$

2353: is a suffix of $hx^{k_3} \rho_x$, that is, $\sigma_x \rho_x$ is a suffix of

2354: $b_2$.  So  set $c_R := \sigma_x \rho_x$.  Since $b_2$ is a suffix of $w_2$,

2355: $c_R$ must be a suffix of $w_2$, and also $|c_R| \le n = |S_R|$ yields

2356: that $c_R$ is a suffix of $S_R$, and the second statement of

2357: Claim~\ref{claim3} follows.

2358:

2359: To complete the proof of Theorem~\ref{loft_thm1}, note that,

2360: since $c_L$ and $c_R$ are both conjugates of $x$,

2361: $c_L$ and $c_R$ are non-empty words which are conjugates.

2362: So there is a non-empty word $\alpha$ and

2363: a word $\beta$ for which we can write $c_L = \alpha \beta$ and

2364: $c_R = \beta \alpha$.  Then $\alpha$ is a prefix of

2365: $\Psi_L$, and $\alpha$ is a suffix of $\Psi_R$, which gives that $\alpha$

2366: is a border for $w$, and gives a contradiction.

2367: \end{proof}

2368:

2369: \begin{corollary}

2370: The problem of determining if an NFA accepts an unbordered word

2371: is decidable.

2372: \end{corollary}

2373:

2374: \begin{proof}

2375: Let $M$ be an NFA with $n$ states.  To determine if $M$ accepts

2376: an unbordered word, it suffices to test whether $M$ accepts

2377: an unbordered word of length at most $6n+1$.

2378: \end{proof}

2379:

2380: We do not know if there is a polynomial-time algorithm to

2381: test if an NFA accepts an unbordered word or if the problem is

2382: computationally intractable.

2383:

2384: Theorem~\ref{loft_thm1} gives an upper bound of $6n+1$ on the length

2385: of a shortest unbordered word accepted by an $n$-state NFA.  The best

2386: lower bound we are able to come up with is $2n-3$, as illustrated by the

2387: following example: an NFA of $n$ states accepts

2388: $a b^{n-3} a b^*$, and the shortest unbordered word accepted is

2389: $a b^{n-3} a b^{n-2}$, which is of length $2n-3$.
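
This example is easy to check by brute force for small $n$.  The sketch below is only an illustration: the function names \texttt{is\_bordered} and \texttt{shortest\_unbordered} are hypothetical, and a word is taken to be bordered when it has the form $uvu$ with $u$ non-empty and $v$ possibly empty.  Each word of $a b^{n-3} a b^*$ is determined by the number $m$ of trailing $b$'s, so scanning $m = 0, 1, 2, \ldots$ visits the words in order of length.

```python
from itertools import count


def is_bordered(w):
    """True if w = u v u for some nonempty u (v may be empty)."""
    return any(w[:k] == w[-k:] for k in range(1, len(w) // 2 + 1))


def shortest_unbordered(n):
    """Shortest unbordered word of the form a b^(n-3) a b^m, for n >= 3."""
    if n < 3:
        raise ValueError("the example needs n >= 3")
    for m in count(0):
        w = "a" + "b" * (n - 3) + "a" + "b" * m
        if not is_bordered(w):
            return w
```

For every $m \le n-3$ the word has border $a b^{m}$, which is why the search first succeeds at $m = n-2$.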

2390:

2391: \begin{theorem}\label{loft_thm2}

2392: Let $M$ be an $n$-state NFA, and let $L$ be the language accepted by $M$.

2393: Suppose there is an unbordered word in $L$ of length greater than $4n^2 + 6n + 1$.

2394: Then $L$ contains infinitely many unbordered words.

2395: \end{theorem}

2396:

2397: \begin{proof}

2398: Suppose $L$ contains only finitely many unbordered words.

2399: Let $w$ be an unbordered word in $L$ of length greater than $4n^2 + 6n + 1$,

2400: with $|w|$ maximal.

2401: Write $w$ as $\Psi_L X_L \theta X_R \Psi_R$ for words  $\Psi_L$, $X_L$,

2402: $\theta$, $\Psi_R$, $X_R$ for which $|X_L| = |X_R| = n$, $|\Psi_L| =

2403: |\Psi_R| > 2n^2 + 2n$, and $|\theta| \le 1$.

2404:  We proceed by making the following series of observations.

2405:

2406: \begin{claim}\label{2nd_claim1}

2407: There are words $x_L$, $u_L$, $y_L$ and  $x_R$, $u_R$, $y_R$,

2408: with $u_L$ and $u_R$ both non-empty, $X_L = x_Lu_Ly_L$, $X_R = x_Ru_Ry_R$, and

2409: for which:

2410: \begin{itemize}

2411: \item[(i)] $\zeta := \Psi_L x_Lu_Lu_Ly_L\theta X_R \Psi_R \in L$, and

2412: \item[(ii)] $\eta := \Psi_L X_L \theta x_Ru_Ru_Ry_R \Psi_R \in L$.

2413: \end{itemize}

2414: Further, $|\zeta | > |w|$, and $|\eta| > |w|$.

2415: \end{claim}

2416:

2417: To justify (i), apply Lemma~\ref{loft2} (with $d = 1$) to the 3-tuple $(\Psi_L, X_L,

2418: \theta X_R \Psi_R)$.  Similarly, to arrive at (ii), apply Lemma~\ref{loft2} again

2419: (also with $d = 1$) to the 3-tuple $(\Psi_L X_L \theta, X_R, \Psi_R)$.

2420:

2421: \begin{claim}\label{2nd_claim2}

2422: We can write $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$

2423: for $p$ a non-empty word, $g$, $h$ and $q$ words for which $|g| = |h|$,

2424: $|pq| + |g| \le 2n$, and $k$ an integer $\ge n$.

2425: Hence $w$ can be written as $h(pq)^{k}p X_L \theta X_R (pq)^{k}pg$.

2426: \end{claim}

2427:

2428: To justify Claim~\ref{2nd_claim2},

2429: first recall that $w = \Psi_L x_Lu_Ly_L \theta x_Ru_Ry_R

2430: \Psi_R$, and $X_L = x_Lu_Ly_L$, $X_R = x_Ru_Ry_R$.

2431: From Claim~\ref{2nd_claim1} above and the maximality of $|w|$

2432: we get that $\zeta$ and $\eta$ are bordered words, so

2433: we can assert that there exist non-empty words $b_{\zeta}$ and

2434: $b_{\eta}$, and words $p_{\zeta}$ and $p_{\eta}$, for which:

2435: \begin{itemize}

2436: \item[(I)]    $\zeta = \Psi_L x_Lu_Lu_Ly_L \theta X_R

2437: \Psi_R = b_{\zeta} p_{\zeta} b_{\zeta}$, and

2438: \item[(II)]   $\eta = \Psi_L X_L \theta x_Ru_Ru_Ry_R \Psi_R =

2439: b_{\eta}p_{\eta} b_{\eta}$.

2440: \end{itemize}

2441:

2442: Note that, if  $|b_{\zeta}| \le |\Psi_L|$ then by (I) $b_{\zeta}$

2443: would be a border for $w$.  So we must have  $|b_{\zeta}| > |\Psi_L|$.

2444: Similarly, (II) gives that  $|b_{\eta}| > |\Psi_L|$.

2445: These latter facts together with (I) and (II) give that there exist

2446: non-empty words $e$, $f$,

2447: $g$, $h$, for which $|e| = |f|$, $|g| = |h|$, and for which

2448: \begin{equation}\label{3rd_star1}

2449: b_{\zeta} = \Psi_L e = f \Psi_R,

2450: \end{equation}

2451: and

2452: \begin{equation}\label{3rd_star2}

2453: b_{\eta} = \Psi_L g = h \Psi_R.

2454: \end{equation}

2455:

2456: Further, the reader can verify that $|e| \le 2n < |\Psi_R|$, and $|g| \le 2n < |\Psi_R|$.

2457:

2458: Suppose $|b_{\eta}| = |b_{\zeta}|$.  Then from (\ref{3rd_star1}) and (\ref{3rd_star2}) above,

2459: $|e| = |g|$.  But $e$ and $g$ are suffixes of $\Psi_R$, so

2460: we get that $e = g$.  Hence $b_{\zeta} = \Psi_L e = \Psi_L g

2461: = b_{\eta}$.  Set $b := b_{\zeta} = b_{\eta}$.

2462: Now $|u_Ly_L \theta X_R| > |x_Lu_L|$, so from (I) above, we must have

2463: $|b| \le |u_Ly_L \theta X_R \Psi_R|$, that is, $b$ is a suffix of $u_Ly_L \theta X_R \Psi_R$.

2464: Similarly, $|X_L \theta x_Ru_R| > |u_Ry_R|$, so from (II) above we get that

2465: $|b| \le |\Psi_L X_L \theta x_Ru_R|$, that is, $b$ is a prefix of $\Psi_L X_L \theta x_Ru_R$.

2466: So $b$ is a non-empty prefix of $w$, and a suffix of $w$.

2467: Hence $w$ must be bordered, which is a contradiction.

2468:

2469: So we must have $|b_{\eta}| \neq |b_{\zeta}|$.  First, suppose

2470: $|b_{\eta}| < |b_{\zeta}|$.

2471: Now apply Lemma~\ref{loft4} to get that

2472: there is a positive integer $k$, a non-empty word $p$ and a word $q$ for which

2473: $\Psi_L = h(pq)^{k}p$ and $\Psi_R = (pq)^{k}pg$.

2474: And finally observe that $|pq| + |g| = |f| \le 2n$,

2475: and since $|\Psi_L| > 2n^2 + 2n$ and $|pq| \le 2n$, we get that $k \ge n$.

2476: The case $|b_{\eta}| > |b_{\zeta}|$ is symmetric,  so Claim~\ref{2nd_claim2}

2477: is established.

2478:

2479: \begin{claim}\label{2nd_claim3}

2480: Let $x := pq$ in the statement of Claim~\ref{2nd_claim2}.  There is a

2481: conjugate $c_L$ of $x$ which is a prefix of

2482: $\Psi_L$, and there is a conjugate $c_R$ of $x$ which is a suffix of

2483: $\Psi_R$.

2484: \end{claim}

2485:

2486: To justify Claim~\ref{2nd_claim3}, recall from Claim~\ref{2nd_claim2}

2487: that $w$ is $\Psi_L X_L \theta X_R x^{k}pg$.

2488: And since $k \ge n$, we can apply Lemma~\ref{loft2} to the 3-tuple of words

2489: $(\Psi_LX_L \theta X_R, x^k, pg)$, with $d := |x|$, obtaining a

2490: positive integer $J_1$ for which, for all $z \ge 0$, we have

2491: $\Psi_LX_L \theta X_R x^{k+J_1z}pg \in L$.

2492: So choose $z_1 := |\Psi_LX_L \theta X_R|$, and define $w_1 :=

2493: \Psi_LX_L \theta X_R x^{k+J_1z_1}pg$.  By supposition $w_1$ is a bordered word, say

2494: with border $b_1$.  Further, if $|b_1| \le |\Psi_R|$ then $b_1$ would be a border for $w$.

2495: So we must have $|b_1| > |\Psi_R|$.  And $|b_1| \le \frac{|w_1|} {2}$ implies

2496: $|b_1| \le |x^{k+J_1z_1}pg|$.

2497:

2498: So $b_1$ is a suffix of $x^{k+J_1z_1}pg$ of length greater than $|\Psi_R| > 2n$,

2499: hence by Claim~\ref{2nd_claim2} above we can write

2500: $b_1 = s_x x^{k_2}pg$ for some integer $k_2 \ge 0$,

2501: with $s_x$ a suffix of $x$.  Write $x = p_xs_x$, and recall that $p$ is a

2502: prefix of $x$.  Then  $|s_x x^{k_2}pg| > 2n$ and $|x| + |g| \le 2n$ (from

2503: Claim~\ref{2nd_claim2}) yields that $s_xp_x$ is a prefix of

2504: $s_xx^{k_2}pg$, that is, $s_xp_x$ is a prefix of $b_1$.  So set $c_L :=

2505: s_xp_x$.  Since $b_1$ is a prefix of $w_1$,

2506: $c_L$ must be a prefix of $w_1$, and $|c_L| \le 2n$ gives that

2507: $c_L$ is a prefix of $\Psi_L$, and the first statement of

2508: Claim~\ref{2nd_claim3} follows.

2509:

2510: To justify the second statement of Claim~\ref{2nd_claim3},

2511: we proceed similarly; that is, we recall that

2512: $w$ is $hx^kpX_L \theta X_R \Psi_R$, and

2513: apply Lemma~\ref{loft2} to the 3-tuple of words

2514: $(h, x^k, pX_L \theta X_R \Psi_R)$, with $d := |x|$, allowing us to assert that there is a

2515: positive integer $J_2$ for which, for all $z \ge 0$, we have

2516: $hx^{k+J_2z}pX_L \theta X_R \Psi_R \in L$.

2517: So choose $z_2 := |pX_L \theta X_R \Psi_R|$, and define

2518: $w_2 : = hx^{k+J_2z_2}pX_L \theta X_R \Psi_R $.  By supposition $w_2$ is a bordered word, say

2519: with border $b_2$.

2520: Further, if $|b_2| \le |\Psi_L|$ then $b_2$ would be a border for $w$.  So we must have $|b_2| >

2521: |\Psi_L|$.  And $|b_2| \le \frac{|w_2|} {2}$ implies $|b_2| \le |hx^{k+J_2z_2}p|$.

2522:

2523: So $b_2$ is a prefix of $hx^{k+J_2z_2}p$ of length greater than $|\Psi_L| > 2n$;

2524: hence by Claim~\ref{2nd_claim2} we can write

2525: $b_2 = hx^{k_3}\rho_x$ for some integer $k_3 \ge

2526: 0$, with $\rho_x$ a prefix of $x$.  Write $x = \rho_x \sigma_x$.

2527: Then $|hx^{k_3} \rho_x | > 2n$ and $|x| + |h| \le 2n$

2528: (from Claim~\ref{2nd_claim2}) yields that $\sigma_x \rho_x$ is a suffix of

2529: $hx^{k_3} \rho_x$, that is, $\sigma_x \rho_x$ is a suffix of $b_2$.  So

2530: set $c_R := \sigma_x \rho_x$.  Since $b_2$ is a suffix of $w_2$,

2531: $c_R$ must be a suffix of $w_2$, and also $|c_R| \le 2n$ yields

2532: that $c_R$ is a suffix of $\Psi_R$, and the second statement of

2533: Claim~\ref{2nd_claim3} follows.

2534:

2535: To complete the proof of Theorem~\ref{loft_thm2},

2536: note that, since $c_L$ and $c_R$ are

2537: both conjugates of $x$, $c_L$ and $c_R$ are non-empty

2538: words which are conjugates.  So there is a non-empty word $\alpha$ and

2539: a word $\beta$ for which we can write $c_L = \alpha \beta$ and

2540: $c_R = \beta \alpha$.  Then $\alpha$ is a prefix of

2541: $\Psi_L$, and $\alpha$ is a suffix of $\Psi_R$, which gives that $\alpha$

2542: is a border for $w$, which is a contradiction.  So we're forced to conclude

2543: that $L$ contains infinitely many unbordered words.

2544: \end{proof}

2545:

2546: \begin{theorem}\label{loft_thm3}

2547: Let $M$ be an $n$-state NFA, and let $L$ be the language accepted by $M$.

2548: Then the following are equivalent:

2549: \begin{enumerate}

2550: \item $L$ contains infinitely many unbordered words.

2551: \item There is an unbordered word $w$ in $L$, with $4n^2+6n+2 \le |w| \le 8n^2 + 18n + 5$.

2552: \end{enumerate}

2553: \end{theorem}

2554:

2555: \begin{proof}

2556: (1) $\rightarrow$ (2).  Suppose all words $w \in L$ whose lengths are in

2557: $[4n^2+6n+2, 8n^2 + 18n + 5]$ are bordered words.

2558: Then by Theorem~\ref{loft_thm1}, (with $N = 4n^2+6n+2$),

2559: we have that any word in $L$ whose length is at least $4n^2+6n+2$ is bordered, i.e., $L$

2560: contains at most finitely many unbordered words.

2561:

2562: (2) $\rightarrow$ (1).  This follows immediately from Theorem~\ref{loft_thm2}.

2563: \end{proof}
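
The upper endpoint in condition (2) is the bound $2N + 6n + 1$ of Theorem~\ref{loft_thm1} evaluated at $N = 4n^2 + 6n + 2$; spelled out (a worked substitution added for illustration),
\[
2N + 6n + 1 = 2(4n^2 + 6n + 2) + 6n + 1 = 8n^2 + 18n + 5.
\]
For instance, for $n = 2$ the lengths to be examined are those in the interval $[30, 73]$.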

2564:

2565: \begin{corollary}

2566: The problem of determining if an NFA accepts infinitely many unbordered words

2567: is decidable.

2568: \end{corollary}

2569:

2570: \begin{proof}

2571: Let $M$ be an NFA with $n$ states.  To determine if $M$ accepts

2572: infinitely many unbordered words, it suffices to test whether $M$ accepts

2573: an unbordered word $w$, where $4n^2+6n+2 \le |w| \le 8n^2 + 18n + 5$.

2574: \end{proof}

2575:

2576: We do not know if there is a polynomial-time algorithm to

2577: test if an NFA accepts infinitely many unbordered words or if the problem is

2578: computationally intractable.

2579:

2580: \section{Final remarks}\label{concl}

2581:

2582:       In this paper we examined the complexity of checking various

2583: properties of regular languages, such as consisting only of palindromes,

2584: containing at least one palindrome, consisting only of powers, or containing

2585: at least one power.  In each case (except for the unbordered words),

2586: we were able to provide an efficient algorithm or show that the problem

2587: is likely to be hard.  Our results are summarized in the following table.

2588: Here $M$ is an NFA with $n$ states and $t$ transitions.

2589: When $L$ is the language of unbordered words, it is an open problem

2590: to either find polynomial-time algorithms to test if

2591: (a) $L(M) \intersect L = \emptyset$, and (b) $L(M) \intersect L$ is infinite,

2592: or to show the intractability of these problems.

2593:

2594: \bigskip

2595: \begin{figure}[H]

2596: \begin{center}

2597: \begin{tabular}{|c|c|c|c|c|}

2598: \hline

2599:      & decide if & decide if & upper bound on & worst-case  \\

2600: $L$  & $L(M) \intersect L = \emptyset$ & $L(M) \intersect L$ & shortest element  & lower bound  \\

2601:      &      & infinite & of $L(M) \intersect L$ & known  \\

2602: \hline

2603: palindromes & $O(n^2+t^2)$ & $O(n^2+t^2)$ & $2n^2-1$ & ${{n^2}\over 2} - 3n+ 5$  \\

2604: \hline

2605: non-palindromes & $O(n^2+tn)$ & $O(n^2+t^2)$ & $3n-1$ & $3n-1$ \\

2606: \hline

2607: $k$-powers       & $O(n^{2k-1} t^k)$ & $O(n^{2k-1} t^k)$ & $kn^k$ &

2608: 	$\Omega(n^k)$  \\

2609: ($k$ fixed)  & & & &\\

2610: \hline

2611: $k$-powers & PSPACE- & PSPACE- & &  \\

2612: ($k$ part of input) & complete & complete & & \\

2613: \hline

2614: non-$k$-powers & $O(n^3 + t n^2)$ & $O(n^3 + t n^2)$ & $3n$ & $(2+{1 \over {2k-2}}) n - O(1)$ \\

2615: \hline

2616: powers & PSPACE- & PSPACE- & $(n+1)n^{n+1}$ & $e^{\Omega(\sqrt{n\log n})}$ \\

2617: & complete & complete & & \\

2618: \hline

2619: non-powers & $O(n^5)$ & $O(n^5)$ & $3n$  & ${5 \over 2} n - 2$\\

2620: \hline

2621: bordered words & $O(n^3 t^2)$ & $O(n^6 t^2)$ & $2n^2 + n- 1$ & $ {{n^2}\over 2} - 6n+ {{43} \over 2}$ \\

2622: \hline

2623: unbordered & decidable & decidable & $6n+1$ & $2n-3$ \\

2624: words & & & & \\

2625: \hline

2626: \end{tabular}

2627: \end{center}

2628: \end{figure}

2629:

2630: \section*{Acknowledgments}

2631:

2632: The algorithm mentioned in Section~\ref{nn} for testing if an NFA-$\epsilon$

2633: accepts infinitely many words was suggested to us by Timothy Chan.

2634: We would like to thank both him and Jack Zhao for their ideas on this subject.

2635:

2636: \bibliography{abbrevs,pal}

2637: \bibliographystyle{new}

2638:

2639: \end{document}

2640: