1: \begin{thebibliography}{10}
2:
3: \bibitem{bttp01}
4: A.~Bahr, J.~D. Thompson, J.-C. Thierry, and O.~Poch.
5: \newblock {BA}li{BASE} ({B}enchmark {A}lignment data{BASE}): enhancements for
6: repeats, transmembrane sequences and circular permutations.
7: \newblock {\em Nucleic Acids Research}, 29:323--326, 2001.
8:
9: \bibitem{bb01}
10: P.~Baldi and S.~Brunak.
11: \newblock {\em Bioinformatics}.
12: \newblock The MIT Press, Cambridge, MA, 2001.
13:
14: \bibitem{bv01}
15: P.~Bonizzoni and G.~Della Vedova.
16: \newblock The complexity of multiple sequence alignment with {SP}-score that is
17: a metric.
18: \newblock {\em Theoretical Computer Science}, 259:63--79, 2001.
19:
20: \bibitem{bjeg98}
21: A.~Brazma, I.~Jonassen, I.~Eidhammer, and D.~Gilbert.
22: \newblock Approaches to the automatic discovery of patterns in biosequences.
23: \newblock {\em Journal of Computational Biology}, 5:279--305, 1998.
24:
25: \bibitem{ctrv02}
26: N.~Cannata, S.~Toppo, C.~Romualdi, and G.~Valle.
27: \newblock Simplifying amino acid alphabets by means of a branch and bound
28: algorithm and substitution matrices.
29: \newblock {\em Bioinformatics}, 18:1102--1108, 2002.
30:
31: \bibitem{clrs01}
32: T.~H. Cormen, C.~E. Leiserson, R.~L. Rivest, and C.~Stein.
33: \newblock {\em Introduction to Algorithms}.
34: \newblock The MIT Press, Cambridge, MA, second edition, 2001.
35:
36: \bibitem{dso78}
37: M.~O. Dayhoff, R.~M. Schwartz, and B.~C. Orcutt.
38: \newblock A model of evolutionary change in proteins.
39: \newblock In M.~O. Dayhoff, editor, {\em Atlas of Protein Sequence and
40: Structure}, volume 5, supplement\ 3, pages 345--352. National Biomedical
41: Research Foundation, Washington, DC, 1978.
42:
43: \bibitem{d81}
44: R.~F. Doolittle.
45: \newblock Similar amino acid sequences: chance or common ancestry?
46: \newblock {\em Science}, 214:149--159, 1981.
47:
48: \bibitem{dms98}
49: A.~Dress, B.~Morgenstern, and J.~Stoye.
50: \newblock The number of standard and of effective multiple alignments.
51: \newblock {\em Applied Mathematics Letters}, 11:43--49, 1998.
52:
53: \bibitem{dekm98}
54: R.~Durbin, S.~Eddy, A.~Krogh, and G.~Mitchison.
55: \newblock {\em Biological Sequence Analysis}.
56: \newblock Cambridge University Press, Cambridge, UK, 1998.
57:
58: \bibitem{ejt04}
59: I.~Eidhammer, I.~Jonassen, and W.~R. Taylor.
60: \newblock {\em Protein Bioinformatics}.
61: \newblock John Wiley \& Sons, Chichester, UK, 2004.
62:
63: \bibitem{gs97}
64: R.~Giegerich and S.~Kurtz.
65: \newblock From {U}kkonen to {M}c{C}reight and {W}einer: a unifying view of
66: linear-time suffix tree construction.
67: \newblock {\em Algorithmica}, 19:331--353, 1997.
68:
69: \bibitem{g96}
70: O.~Gotoh.
71: \newblock Significant improvement in accuracy of multiple protein sequence
72: alignments by iterative refinement as assessed by reference to structural
73: alignments.
74: \newblock {\em Journal of Molecular Biology}, 264:823--838, 1996.
75:
76: \bibitem{g99}
77: O.~Gotoh.
78: \newblock Multiple sequence alignment: algorithms and applications.
79: \newblock {\em Advances in Biophysics}, 36:159--206, 1999.
80:
81: \bibitem{gb02}
82: R.~E. Green and S.~E. Brenner.
83: \newblock Bootstrapping and normalization for enhanced evaluations of pairwise
84: sequence comparison.
85: \newblock {\em Proceedings of the IEEE}, 90:1834--1847, 2002.
86:
87: \bibitem{g97}
88: D.~Gusfield.
89: \newblock {\em Algorithms on Strings, Trees, and Sequences}.
90: \newblock Cambridge University Press, Cambridge, UK, 1997.
91:
92: \bibitem{hh92}
93: S.~Henikoff and J.~G. Henikoff.
94: \newblock Amino acid substitution matrices from protein blocks.
95: \newblock {\em Proceedings of the National Academy of Sciences USA},
96: 89:10915--10919, 1992.
97:
98: \bibitem{hm91}
99: X.~Huang and W.~Miller.
100: \newblock A time-efficient, linear-space local similarity algorithm.
101: \newblock {\em Advances in Applied Mathematics}, 12:337--357, 1991.
102:
103: \bibitem{i97}
104: T.~R. Ioerger.
105: \newblock The context-dependence of amino acid properties.
106: \newblock In {\em Proceedings of the Fifth International Conference on
107: Intelligent Systems for Molecular Biology}, pages 157--166, 1997.
108:
109: \bibitem{jz81}
110: M.~A. Jim{\'{e}}nez-Monta{\~{n}}o and L.~Zamora-Cortina.
111: \newblock Evolutionary model for the generation of amino acid sequences and its
112: application to the study of fragments of mammal-hemoglobin chains.
113: \newblock In {\em Proceedings of the Seventh International Biophysics
114: Congress}, 1981.
115:
116: \bibitem{j01}
117: W.~Just.
118: \newblock Computational complexity of multiple sequence alignment with
119: {SP}-score.
120: \newblock {\em Journal of Computational Biology}, 8:615--623, 2001.
121:
122: \bibitem{kh01}
123: K.~Karplus and B.~Hu.
124: \newblock Evaluation of protein multiple alignments by {SAM}-{T}99 using the
125: {BA}li{BASE} multiple alignment test set.
126: \newblock {\em Bioinformatics}, 17:713--720, 2001.
127:
128: \bibitem{kmkm02}
129: K.~Katoh, K.~Misawa, K.~Kuma, and T.~Miyata.
130: \newblock {MAFFT}: a novel method for rapid multiple sequence alignment based
131: on fast {F}ourier transform.
132: \newblock {\em Nucleic Acids Research}, 30:3059--3066, 2002.
133:
134: \bibitem{k96b}
135: T.~M. Klingler.
136: \newblock {\em Structural Inference from Correlations in Biological Sequences}.
137: \newblock PhD thesis, Program in Medical Informatics, Stanford University,
138: 1996.
139:
140: \bibitem{ls02}
141: T.~Lassmann and E.~L.~L. Sonnhammer.
142: \newblock Quality assessment of multiple alignment programs.
143: \newblock {\em FEBS Letters}, 529:126--130, 2002.
144:
145: \bibitem{ltptp01}
146: O.~Lecompte, J.~D. Thompson, F.~Plewniak, J.-C. Thierry, and O.~Poch.
147: \newblock Multiple alignment of complete sequences ({MACS}) in the post-genomic
148: era.
149: \newblock {\em Gene}, 270:17--30, 2001.
150:
151: \bibitem{lfww03}
152: T.~P. Li, K.~Fan, J.~Wang, and W.~Wang.
153: \newblock Reduction of protein sequence complexity by residue grouping.
154: \newblock {\em Protein Engineering}, 16:323--330, 2003.
155:
156: \bibitem{lb93}
157: C.~D. Livingstone and G.~J. Barton.
158: \newblock Protein sequence alignments: a strategy for the hierarchical analysis
159: of residue conservation.
160: \newblock {\em Computer Applications in the Biosciences}, 9:745--756, 1993.
161:
162: \bibitem{m03}
163: B.~Manthey.
164: \newblock Non-approximability of weighted multiple sequence alignment.
165: \newblock {\em Theoretical Computer Science}, 296:179--192, 2003.
166:
167: \bibitem{m76}
168: E.~M. McCreight.
169: \newblock A space-economical suffix tree construction algorithm.
170: \newblock {\em Journal of the ACM}, 23:262--272, 1976.
171:
172: \bibitem{mbrzh94}
173: W.~Miller, M.~Boguski, B.~Raghavachari, Z.~Zhang, and R.~C. Hardison.
174: \newblock Constructing aligned sequence blocks.
175: \newblock {\em Journal of Computational Biology}, 1:51--64, 1994.
176:
177: \bibitem{m95}
178: G.~Mocz.
179: \newblock Fuzzy cluster analysis of simple physicochemical properties of amino
180: acids for recognizing secondary structure in proteins.
181: \newblock {\em Protein Science}, 4:1178--1187, 1995.
182:
183: \bibitem{m99}
184: B.~Morgenstern.
185: \newblock {DIALIGN} 2: improvement of the segment-to-segment approach to
186: multiple sequence alignment.
187: \newblock {\em Bioinformatics}, 15:211--218, 1999.
188:
189: \bibitem{mfdw98}
190: B.~Morgenstern, K.~Frech, A.~Dress, and T.~Werner.
191: \newblock {DIALIGN}: finding local similarities by multiple sequence alignment.
192: \newblock {\em Bioinformatics}, 14:290--294, 1998.
193:
194: \bibitem{msv02}
195: T.~M{\"{u}}ller, R.~Spang, and M.~Vingron.
196: \newblock Estimating amino acid substitution models: a comparison of
197: {D}ayhoff's estimator, the resolvent approach and a maximum likelihood
198: method.
199: \newblock {\em Molecular Biology and Evolution}, 19:8--13, 2002.
200:
201: \bibitem{nfjwn96}
202: D.~Naor, D.~Fischer, R.~L. Jernigan, H.~J. Wolfson, and R.~Nussinov.
203: \newblock Amino acid pair interchanges at spatially conserved locations.
204: \newblock {\em Journal of Molecular Biology}, 256:924--938, 1996.
205:
206: \bibitem{nw70}
207: S.~B. Needleman and C.~D. Wunsch.
208: \newblock A general method applicable to the search for similarities in the
209: amino acid sequence of two proteins.
210: \newblock {\em Journal of Molecular Biology}, 48:443--453, 1970.
211:
212: \bibitem{nrd02}
213: H.~B. {Nicholas Jr}., A.~J. Ropelewski, and D.~E. {Deerfield II}.
214: \newblock Strategies for multiple sequence alignment.
215: \newblock {\em Biotechniques}, 32:572--591, 2002.
216:
217: \bibitem{n02}
218: C.~Notredame.
219: \newblock Recent progress in multiple sequence alignment: a survey.
220: \newblock {\em Pharmacogenomics}, 3:131--144, 2002.
221:
222: \bibitem{nhh00}
223: C.~Notredame, D.~G. Higgins, and J.~Heringa.
224: \newblock {T}-{C}offee: a novel method for fast and accurate multiple sequence
225: alignment.
226: \newblock {\em Journal of Molecular Biology}, 302:205--217, 2000.
227:
228: \bibitem{pfr99}
229: L.~Parida, A.~Floratos, and I.~Rigoutsos.
230: \newblock An approximation algorithm for alignment of multiple sequences using
231: motif discovery.
232: \newblock {\em Journal of Combinatorial Optimization}, 3:247--275, 1999.
233:
234: \bibitem{p00}
235: P.~A. Pevzner.
236: \newblock {\em Computational Molecular Biology}.
237: \newblock The MIT Press, Cambridge, MA, 2000.
238:
239: \bibitem{rfpgp00}
240: I.~Rigoutsos, A.~Floratos, L.~Parida, Y.~Gao, and D.~Pratt.
241: \newblock The emergence of pattern discovery techniques in computational
242: biology.
243: \newblock {\em Metabolic Engineering}, 2:159--177, 2000.
244:
245: \bibitem{sw03}
246: M.-F. Sagot and Y.~Wakabayashi.
247: \newblock Pattern inference under many guises.
248: \newblock In B.~A. Reed and C.~L. Sales, editors, {\em Recent Advances in
249: Algorithms and Combinatorics}, pages 245--287. Springer-Verlag, New York, NY,
250: 2003.
251:
252: \bibitem{s75}
253: D.~Sankoff.
254: \newblock Minimum mutation trees of sequences.
255: \newblock {\em SIAM Journal on Applied Mathematics}, 28:35--42, 1975.
256:
257: \bibitem{sm97}
258: J.~Setubal and J.~Meidanis.
259: \newblock {\em Introduction to Computational Molecular Biology}.
260: \newblock PWS Publishing Company, Boston, MA, 1997.
261:
262: \bibitem{s98}
263: J.~B. Slowinski.
264: \newblock The number of multiple alignments.
265: \newblock {\em Molecular Phylogenetics and Evolution}, 10:264--266, 1998.
266:
267: \bibitem{ss90}
268: R.~F. Smith and T.~F. Smith.
269: \newblock Automatic generation of primary sequence patterns from sets of
270: related protein sequences.
271: \newblock {\em Proceedings of the National Academy of Sciences USA},
272: 87:118--122, 1990.
273:
274: \bibitem{sw81}
275: T.~F. Smith and M.~S. Waterman.
276: \newblock Identification of common molecular subsequences.
277: \newblock {\em Journal of Molecular Biology}, 147:195--197, 1981.
278:
279: \bibitem{s66}
280: P.~H. Sneath.
281: \newblock Relations between chemical structure and biological activity in
282: peptides.
283: \newblock {\em Journal of Theoretical Biology}, 12:157--195, 1966.
284:
285: \bibitem{sm86}
286: E.~Sobel and H.~M. Martinez.
287: \newblock A multiple sequence alignment program.
288: \newblock {\em Nucleic Acids Research}, 14:363--374, 1986.
289:
290: \bibitem{s96}
291: L.~E. Stanfel.
292: \newblock A new approach to clustering the amino acids.
293: \newblock {\em Journal of Theoretical Biology}, 183:195--205, 1996.
294:
295: \bibitem{t86}
296: W.~R. Taylor.
297: \newblock The classification of amino acid conservation.
298: \newblock {\em Journal of Theoretical Biology}, 119:205--218, 1986.
299:
300: \bibitem{t99}
301: W.~R. Taylor.
302: \newblock The properties of amino acids in sequences.
303: \newblock In M.~J. Bishop, editor, {\em Genetic Databases}, pages 81--103.
304: Academic Press, London, UK, 1999.
305:
306: \bibitem{thg94}
307: J.~D. Thompson, D.~G. Higgins, and T.~J. Gibson.
308: \newblock {CLUSTAL} {W}: improving the sensitivity of progressive multiple
309: sequence alignment through sequence weighting, position-specific gap
310: penalties, and weight matrix choice.
311: \newblock {\em Nucleic Acids Research}, 22:4673--4680, 1994.
312:
313: \bibitem{tpp99a}
314: J.~D. Thompson, F.~Plewniak, and O.~Poch.
315: \newblock {BA}li{BASE}: a benchmark alignment database for the evaluation of
316: multiple alignment programs.
317: \newblock {\em Bioinformatics}, 15:87--88, 1999.
318:
319: \bibitem{tpp99b}
320: J.~D. Thompson, F.~Plewniak, and O.~Poch.
321: \newblock A comprehensive comparison of multiple sequence alignment programs.
322: \newblock {\em Nucleic Acids Research}, 27:2682--2690, 1999.
323:
324: \bibitem{u95}
325: E.~Ukkonen.
326: \newblock On-line construction of suffix trees.
327: \newblock {\em Algorithmica}, 14:249--260, 1995.
328:
329: \bibitem{vms99}
330: A.~Vanet, L.~Marsan, and M.-F. Sagot.
331: \newblock Promoter sequences and algorithmical methods for identifying them.
332: \newblock {\em Research in Microbiology}, 150:779--799, 1999.
333:
334: \bibitem{vvp01}
335: D.~Voet, J.~G. Voet, and C.~W. Pratt.
336: \newblock {\em Fundamentals of Biochemistry}.
337: \newblock John Wiley \& Sons, New York, NY, 2001.
338:
339: \bibitem{vea95}
340: G.~Vogt, T.~Etzold, and P.~Argos.
341: \newblock An assessment of amino acid exchange matrices in aligning protein
342: sequences: the twilight zone revisited.
343: \newblock {\em Journal of Molecular Biology}, 249:816--831, 1995.
344:
345: \bibitem{wj94}
346: L.~Wang and T.~Jiang.
347: \newblock On the complexity of multiple sequence alignment.
348: \newblock {\em Journal of Computational Biology}, 1:337--348, 1994.
349:
350: \bibitem{wj90}
351: M.~S. Waterman and R.~Jones.
352: \newblock Consensus methods for {DNA} and protein sequence alignment.
353: \newblock In {\em Methods in Enzymology}, volume 183, pages 221--237. Academic
354: Press, 1990.
355:
356: \bibitem{wsb76}
357: M.~S. Waterman, T.~F. Smith, and W.~A. Beyer.
358: \newblock Some biological sequence metrics.
359: \newblock {\em Advances in Mathematics}, 20:367--387, 1976.
360:
361: \bibitem{w73}
362: P.~Weiner.
363: \newblock Linear pattern matching algorithms.
364: \newblock In {\em Proceedings of the Fourteenth Symposium on Switching and
365: Automata Theory}, pages 1--11, 1973.
366:
367: \bibitem{wb96}
368: T.~D. Wu and D.~L. Brutlag.
369: \newblock Discovering empirically conserved amino acid substitution groups in
370: databases of protein families.
371: \newblock In {\em Proceedings of the Fourth International Conference on
372: Intelligent Systems for Molecular Biology}, pages 230--240, 1996.
373:
374: \bibitem{zhm96}
375: Z.~Zhang, B.~He, and W.~Miller.
376: \newblock Local multiple alignment via subgraph enumeration.
377: \newblock {\em Discrete Applied Mathematics}, 71:337--365, 1996.
378:
379: \bibitem{zj01}
380: P.~Zhao and T.~Jiang.
381: \newblock A heuristic algorithm for multiple sequence alignment based on
382: blocks.
383: \newblock {\em Journal of Combinatorial Optimization}, 5:95--115, 2001.
384:
385: \end{thebibliography}
386: