0408:q-bio0408017/evin3.TEX

1: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

2: \documentclass{elsart}

3:

4: %%%%%%%%%%%%%%%%%%%%%%PACKAGES%%%%%%%%%%%%%%%%%%%%%%

5: \usepackage{graphicx}

6: \usepackage{amssymb}

7:

8: %%%%%%%%%%%%%%%%%%%%%%DOCUMENT%%%%%%%%%%%%%%%%%%%%%%

9: \begin{document}

10: %%%%%%%%%%%%%%%%%%%%%%TITLE%%%%%%%%%%%%%%%%%%%%%%%%%

11: \begin{frontmatter}

12:

13: \title{MONTE CARLO SIMULATION AND STATISTICAL ANALYSIS OF GENETIC INFORMATION CODING}

14: \author{E. Gultepe\corauthref{Northeastern}}

15: \author{M.~L. Kurnaz\corauthref{BU}}

16: \corauth[Northeastern]{Present Address: Northeastern University}

17: \address{Department of Physics, Bogazici University, 34342

18: Bebek Istanbul}

19: \ead{kurnaz@boun.edu.tr}

20: \corauth[BU]{Corresponding Author}

21:

22: %%%%%%%%%%%%%%%%%%%%%ABSTRACT%%%%%%%%%%%%%%%%%%%%%%%

23: \begin{abstract}

24: The set of rules that specifies how the information contained in DNA

25: codes for amino acids is called ``the genetic code''. Using a simplified

26: version of the Penna model, we use computer simulations to

27: investigate the importance of the genetic code and of the number of

28: amino acids in Nature for population dynamics. We find that the

29: genetic code is not a random pairing of codons to amino acids and that

30: the number of amino acids in Nature is an optimum under mutations.

31: \end{abstract}

32:

33: \end{frontmatter}

34:

35: \section{INTRODUCTION\protect\\ }

36: \label{sec:level1}

37: Population dynamics is traditionally a matter of interest for biologists;

38: however, it has also attracted the attention of physicists, since it is a

39: subject very closely related to statistical mechanics. Investigating

40: population dynamics in Nature is not a simple task, because to

41: get any idea about the dynamics of population growth one has to

42: follow many generations of a population. Even if one can find

43: fast-reproducing species like the fruit fly, checking all individuals in

44: such a population is not easy either. Therefore, computer

45: modelling has clear advantages, such as a much shorter time

46: requirement and simple monitoring of the population.

47:

48: The most successful computational model for age-structured

49: populations is the Penna model \cite{Penna}. In this model,

50: individuals are represented by bit-strings which are 32 bits long

51: and are initially set to zero. Each bit represents a given age:

52: as the individual gets older we move down on the bit-string.

53: A bit which is set to zero indicates that no bad

54: mutation is stored at that age. However, if a bit is set to one,

55: it means that the individual suffers a disease at that age and its

56: probability of staying alive is decreased.

57:

58: The Penna model has been successfully used to investigate the

59: advantages of sexual reproduction over asexual reproduction

60: \cite{Stauffer13580,Stauffer13600,Stauffer13570,Sousa680,Tuzel2770,Tuzel21970,Orcal3410},

61: certain features of ecology \cite{Penna2510} and population dynamics

62: \cite{Penna13590,Penna13610,Huang3140}.

63:

64: To investigate the importance of \textit{the genetic code} and

65: of the number of amino acids in population dynamics, we have constructed

66: a model based on the Penna model.

67:

68: The genetic information of an individual is stored in its DNA.

69: DNA is a chain of monomers. Each monomer, called a nucleotide,

70: carries a heterocyclic base. In DNA, these bases

71: are adenine (A), guanine (G), cytosine (C) and thymine (T).

72: Proteins are synthesized from amino acids using the information

73: stored in the DNA. As there are four bases in the DNA and 20 amino

74: acids used in proteins, there cannot be a

75: one-to-one correspondence between the nucleotides in the DNA and the

76: amino acids in a protein. Rather, the linear sequence of bases

77: which constitutes the protein-coding information is ``read'' by the

78: cell in blocks of three nucleotide residues, or codons, each of

79: which specifies an amino acid. If we consider a

80: nucleotide on the DNA to be a letter in a four-letter alphabet,

81: codons can be thought of as words with three letters. Hence, there

82: are sixty-four words to code the twenty different amino acids. The

83: set of rules that specifies which nucleic acid codons correspond

84: to which amino acid is known as the genetic code.

85:
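The counting behind this statement can be written out explicitly. With a four-letter alphabet, words of length $k$ give $4^k$ distinct codons, so three is the smallest word length that can accommodate twenty amino acids:
\[
4^1 = 4 < 20, \qquad 4^2 = 16 < 20, \qquad 4^3 = 64 \geq 20 .
\]
Since $64 > 20$, the code is necessarily degenerate: on average roughly three codons are available per amino acid, which is what allows some point mutations to leave the amino acid unchanged.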

86:

87:

88: \section{COMPUTATIONAL METHOD\protect\\ }

89: \label{sec:level2}

90:

91: If a mutation on a gene causes a change in the

92: amino acid chain, we assume that the organism may no longer be able

93: to build a protein which may be crucial for it. In that case

94: it will not function properly, or it may simply die. Hence it is

95: simplistic yet reasonable to represent the organism by a single

96: gene.

97:

98: In our model, to represent a whole individual, we took a real gene

99: from Nature, the ``human cytokine'' gene (LD78 Homo sapiens blood lymphocyte

100: gene on the 17$^{\mathrm{th}}$ chromosome) \cite{gene}. This gene is

101: necessary for activating lymphocytes; therefore, if it is missing,

102: the human body cannot perform immune responses.

103:

104: If all other effects (aging, food restriction, illness etc.) are

105: neglected, mutation will be the only possible cause of death.

106: Also, reproduction is not included in our simplistic model;

107: therefore we have a population which can only decrease as a

108: result of mutations. We use this model to investigate the

109: effect of mutations on the population decrease.

110:

111: A mutation in our model is a process acting at each site

112: independently. We disregard more complicated processes such as

113: deletions or insertions, and we only look at single nucleotide

114: replacements by another nucleotide in the gene. Normally the

115: rates for these replacements depend on the two nucleotides being

116: interchanged. The simplest approach to the problem is to take all

117: mutation rates to be equal, known as the Jukes-Cantor mutation scheme

118: \cite{Jukes21920}.

119:
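Under this equal-rate assumption, a nucleotide $X$ that is hit by a mutation is replaced by each of the three other nucleotides with the same probability,
\[
P(X \to Y) = \frac{1}{3}, \qquad Y \in \{A, G, C, T\} \setminus \{X\},
\]
so every codon has $3 \times 3 = 9$ equally likely single-nucleotide mutants. This is the counting used in the codon example given later in this section.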

120: A mutation is taken to be deleterious if it causes a change in

121: the amino acid chain; hence not all mutations kill the individual.

122: A real gene is composed of two different parts: a coding portion and

123: a noncoding portion. The coding part, the exons, is responsible for coding

124: for proteins, whereas the rest, the introns, does not code for a protein and

125: its purpose is not clearly understood yet. If a mutation

126: takes place in the intron part, it is considered to be harmless.

127: If it takes place in the exon part, it is usually harmful, but there

128: is still a chance of survival: the mutated codon may code the same amino

129: acid, since more than one codon can code one amino acid in Nature.

130:

131: To be more explicit, the codons AAA and AAG code the same amino

132: acid, ``lysine''; hence if AAA turns into AAG as a result of a

133: mutation, the amino acid will not change and the protein can be

134: constructed safely. However, if AAA turns into AGA, which codes

135: the amino acid ``arginine'', the amino acid chain will change and

136: we assume that the protein cannot be built, which means that the

137: represented organism will die.

138:

139: There can be a mutation which converts AAA to AAX, where X $\notin$

140: $\{$A, G, C, T$\}$; then the individual dies automatically. In our model,

141: we consider only the simpler case where a mutation changes A to one

142: of G, C, T, not to X.

143:

144: Since reproduction is not included in the model, the population

145: can only diminish. The decrease in population can be found by

146: calculating the probability of a deleterious mutation. The

147: probability of the mutation changing the amino acid depends on

148: the codon; so one needs to find the probability of hitting each

149: different codon type. First, the probability of hitting a codon

150: type ($P_{\alpha}$) is calculated as the ratio of the number

151: of codons of that type in the gene ($N_{\alpha}$) to the total

152: number of codons. Then we exclude the mutations that

153: do not cause a change in the amino acid and calculate the

154: probability that a change in one nucleotide of the codon

155: changes the amino acid ($P(d/{\alpha})$).

156:

157: As an example, only two codons code the amino acid ``lysine'':

158: AAA and AAG. In the exon part of the human cytokine gene, there are

159: only three ``AAA'' codons and the total number of codons in the gene

160: is 207; hence the probability of the mutation hitting an ``AAA''

161: codon is simply $P_{\alpha} = 3/207 = 0.0145$. By a point

162: mutation of ``AAA'' we can obtain 9 different codons (AAC, AAG, AAU,

163: ACA, AGA, AUA, CAA, GAA, UAA). One of these codons (AAG) still codes

164: the same amino acid. Therefore the probability of a deleterious

165: mutation ($P(d/{\alpha})$) is 8/9 for ``AAA'' in the human cytokine

166: gene.

167:
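As a worked illustration of how a single codon type enters the total, the ``AAA'' figures quoted above can be combined directly (the rounding to four decimals is only for display):
\[
P_{\mathrm{AAA}}\,P(d/\mathrm{AAA}) = \frac{3}{207}\times\frac{8}{9} = \frac{8}{621} \approx 0.0129 .
\]
Each codon type present in the exon contributes one term of this form.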

168: Next, we need to calculate the probability of hitting the exon part

169: of the gene as the ratio of the exon part to the total gene. In the

170: human cytokine gene, there are 621 nucleotides in the exon part and

171: 1447 in the intron part:

172: \begin{equation}

173: P(\mathrm{hitting\ exon}) = \frac{621}{2068} = 0.3003

174: \end{equation}

175:

176: \noindent Hence, the probability of having a deleterious mutation

177: anywhere in the gene is simply

178: \begin{equation}

179: P(\mathrm{deleterious}) = P(\mathrm{hitting\ exon}) \sum_{\alpha = 1}^{64}

180: P_{\alpha} P(d/\alpha) = 0.2344

181: \end{equation}

182:
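The sum above can be evaluated mechanically once a codon table and an exon sequence are given. The following is a minimal sketch in Python, not the code used for the results in this paper: it assumes the standard genetic code written with DNA letters (T in place of U), treats the stop signal as one more ``amino acid'', and uses a short hypothetical exon instead of the LD78 sequence. The names \texttt{p\_del\_given\_codon} and \texttt{p\_deleterious} are illustrative.

\begin{verbatim}
```python
from collections import Counter

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def p_del_given_codon(codon, code=CODE):
    """P(d/alpha): fraction of the 9 single-base mutants that change the amino acid."""
    changed = 0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                changed += code[mutant] != code[codon]
    return changed / 9


def p_deleterious(exon, n_intron, code=CODE):
    """P(hitting exon) * sum over codon types of P_alpha * P(d/alpha)."""
    codons = [exon[i:i + 3] for i in range(0, len(exon), 3)]
    counts = Counter(codons)
    p_exon = len(exon) / (len(exon) + n_intron)
    total = sum(n / len(codons) * p_del_given_codon(c, code)
                for c, n in counts.items())
    return p_exon * total


# Hypothetical toy exon (NOT the LD78 sequence) with 12 intron bases.
toy_exon = "ATGAAAAAGCTGTAA"
print(p_del_given_codon("AAA"))   # the lysine example: 8 of 9 mutants are harmful
print(p_deleterious(toy_exon, n_intron=12))
```
\end{verbatim}

Applied to the real exon and intron lengths, the same two functions would reproduce the structure of the calculation above; the numerical value depends on the actual codon counts, which are not listed here.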

183: \noindent The survival probability is then

184:

185: \begin{equation}

186: P(\mathrm{surviving}) = 1 - P(\mathrm{deleterious}) = 0.7656

187: \end{equation}

188:

189: \noindent If we take a population of $N_0$ genes (individuals),

190: then after $n$ mutation events, to a first approximation, the number of

191: surviving individuals is given by

192:

193: \begin{equation}

194: N_n \approx N_0 \, P(\mathrm{surviving})^n

195: \end{equation}

196:

197: \noindent Hence, we obtain the ``probability of survival'' from

198: the slope of the logarithm of the number of surviving individuals

199: versus time:

200:

201: \begin{equation}

202: \mathrm{slope} \approx \ln[P(\mathrm{surviving})] = -0.2670 \label{eq:slope}

203: \end{equation}

204:
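The step from the decay law to the slope can be spelled out by taking logarithms of both sides, with $n$ the number of mutation events as above:
\[
\ln N_n \approx \ln N_0 + n \ln P(\mathrm{surviving}), \qquad \ln(0.7656) \approx -0.267 .
\]
A plot of $\ln N_n$ against $n$ is therefore a straight line with slope $\ln P(\mathrm{surviving})$, and the population halves roughly every $\ln 2 / 0.267 \approx 2.6$ mutation events.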

205: In this calculation we made the simplifying assumption that after each

206: time step the genome remains the same as the wildtype. A mutation may

207: result in a different nucleotide sequence, but if this sequence codes

208: the same amino acid, we assume that the mutation never happened

209: and go to the next stage. Hence, all living individuals can be represented

210: by the same array, the wildtype, as in the calculations. However, in the

211: less likely event of a harmless mutation the number of codons of

212: each type changes, which in turn slightly changes the probabilities.

213: We have therefore designed a test simulation in which, after each mutation step and

214: the deletion of the dead individuals, all sequences are set back to the wildtype. As

215: this simulation gives the same results (within the error bars) as the

216: original case, we have used the modified sequences in the later stages.

217:

218: \section{SIMULATION\protect\\ }

219: \label{sec:level3}

220:

221: In the simulation, an individual (a gene) is represented by an array

222: which contains the digits 0, 1, 2 and 3 in place of the nucleotides adenine (A),

223: guanine (G), cytosine (C) and thymine (T; uracil (U) in RNA), respectively, together with

224: a sign bit which shows whether the gene has a deleterious mutation (1)

225: or not (0).

226:

227: In each ``cycle'', each individual goes through one mutation event, and

228: then it is determined whether or not the individual should die. In the

229: mutation event, the position of the mutation and the mutant nucleotide are

230: determined randomly. If the nucleotide is not in the exon part, the

231: sign bit remains 0. Otherwise, the changed codon is checked for the

232: amino acid which it codes. If it codes the same amino acid, the

233: protein can still be built; therefore the sign bit is not changed and

234: the individual survives. However, if the amino acid is changed, the

235: sign bit becomes 1, which means that this individual will be deleted. The deletion

236: time is recorded for each individual [Fig.~\ref{flowchart}(a)]. In the

237: control simulation, if the mutation is harmless, the modification of the gene

238: is undone [Fig.~\ref{flowchart}(b)].

239:
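The cycle described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the program used for the results: nucleotides are stored as letters instead of the digits 0 to 3, only the exon is stored (an intron hit never matters in this model), dead individuals are simply dropped instead of carrying a sign bit, and the gene is a short hypothetical sequence. The function name \texttt{run\_cycles} and the flag \texttt{reset\_to\_wildtype} (which selects the control simulation) are our own.

\begin{verbatim}
```python
import random

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def run_cycles(exon, n_intron, n_individuals, n_cycles, rng,
               reset_to_wildtype=False):
    """Return the number of survivors after each cycle.

    Every living individual receives one random point mutation per cycle and
    is deleted if the mutation hits the exon and changes the coded amino acid.
    """
    length = len(exon) + n_intron
    population = [list(exon) for _ in range(n_individuals)]
    survivors = []
    for _ in range(n_cycles):
        alive = []
        for gene in population:
            site = rng.randrange(length)
            if site >= len(exon):          # intron hit: harmless
                alive.append(gene)
                continue
            old = gene[site]
            new = rng.choice([b for b in BASES if b != old])
            start = site - site % 3        # first base of the codon that was hit
            before = CODE["".join(gene[start:start + 3])]
            gene[site] = new
            after = CODE["".join(gene[start:start + 3])]
            if after == before:            # synonymous change: individual survives
                if reset_to_wildtype:      # control run: undo the modification
                    gene[site] = old
                alive.append(gene)
        population = alive
        survivors.append(len(population))
    return survivors


# Hypothetical toy gene (NOT the LD78 sequence): 15 exon and 35 intron bases.
rng = random.Random(0)
toy_exon = "ATGAAAAAGCTGTAA"
counts = run_cycles(toy_exon, n_intron=35, n_individuals=10000,
                    n_cycles=10, rng=rng)
print(counts)
```
\end{verbatim}

Calling \texttt{run\_cycles} with \texttt{reset\_to\_wildtype=True} corresponds to the control simulation of Fig.~\ref{flowchart}(b); the list it returns is the population size per time step that is analysed next.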

240: \begin{figure}[!]

241: \begin{center}

242: \includegraphics[width=14cm]{figure1.eps}

243: \caption{(a) Flowchart of the simulation.

244: (b) Flowchart of the control simulation.}

245: \label{flowchart}

246: \end{center}

247: \end{figure}

248:

249: After all $N$ individuals have been processed, the number of surviving individuals at each time

250: step is calculated. Since the probability of mutation is independent

251: of the number of individuals, this number also gives us the population

252: size. Hence, we have an exponential population decay whose exponent

253: depends on the probability of surviving ($P(\mathrm{surviving})$). The logarithm of

254: the population is fitted to a straight line and the slope of the line

255: is calculated.

256:
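The straight-line fit can be illustrated with an ordinary least-squares slope of $\ln N_n$ against the time step. The sketch below is a plain-Python illustration (the fitting routine actually used is not specified in the text); the helper name \texttt{log\_slope} is hypothetical, and the input is an idealised, noise-free decay built from the survival probability calculated earlier.

\begin{verbatim}
```python
import math


def log_slope(counts):
    """Least-squares slope of ln(count) versus time step (1, 2, ...)."""
    points = [(t, math.log(c)) for t, c in enumerate(counts, start=1) if c > 0]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return sxy / sxx


# Idealised decay N_n = N_0 * P^n with the survival probability from the text.
p_surviving = 0.7656
ideal = [100000 * p_surviving ** n for n in range(1, 21)]
print(log_slope(ideal))
```
\end{verbatim}

For noise-free input the fitted slope equals $\ln P(\mathrm{surviving})$ up to rounding; for simulated populations it fluctuates from run to run, which is why the slopes are averaged over several runs.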

257: We ran all simulations ten times. The average of the slopes of the

258: control simulations is $-0.266 \pm 0.001$, which is very close to

259: the slope derived from the calculation. After the control, we ran the

260: simulation using the genetic code of Nature. One example of such runs is

261: shown in Figure \ref{slope}. The average of the slopes for Nature's

262: simulation is $-0.266 \pm 0.001$.

263:

264: \begin{figure}[!]

265: \begin{center}

266: \includegraphics[width=14cm]{figure2.eps}

267: \caption {Population decay in one of the simulation runs using the

268: amino acid table of Nature}\label{slope}

269: \end{center}

270: \end{figure}

271:

272: \section{ARTIFICIAL TABLES\protect\\ }

273: \label{sec:level4}

274:

275: With a few exceptions, twenty different kinds of amino acids

276: are used to build proteins. Even though in some rare

277: cases certain organisms use selenocysteine and pyrrolysine,

278: the majority of organisms in Nature use the same table. Recently a team of

279: investigators at the Scripps Research Institute modified a

280: form of the bacterium \textit{Escherichia coli} to use a 22-amino acid

281: genetic code instead of 20 \cite{Anderson21930}. They

282: engineered the modified form of \textit{E. coli} to make myoglobin

283: proteins with 22 amino acids, using the unnatural amino acids

284: O-methyl-L-tyrosine and L-homoglutamine in addition to the

285: naturally occurring 20. This work opens up the possibility

286: that the same procedure can be used to expand the amino acid

287: family even further. So the question is: why did life stop

288: at twenty amino acids? To investigate the importance

289: of the number of amino acids, we create amino acid tables

290: based on Nature's table but with different numbers of amino acids,

291: and we use them in the simulation.

292:

293: If we change the number of amino acids in the genetic code, it

294: means that we change the amount of information in the genome.

295: Hence, to conserve the information, the genome length needs to be

296: adjusted also. Moreover, if we want our simulation to represent

297: Nature, the process of extending or shortening the amino acid

298: table needs to obey some rules of biochemistry.

299:

300: The twenty amino acids differ in their side

301: chains, which have different chemical properties. This allows proteins to have

302: such a great variety of structures and properties. There are

303: several classes of side chains, grouped by their dominant chemical

304: features. While developing the tables, we try to make them fit the

305: natural structure of the amino acids by obeying the classification in

306: \cite{Matthews}.

307:

308: We have used amino acid tables with both larger and smaller numbers

309: of amino acids. To shorten the amino acid table, we first randomly

310: choose which amino acids will be removed from the table. The random

311: choice is made such that no two amino acids are removed from the

312: same group (as long as the number of removed amino acids is less than

313: the number of amino acid groups).

314:

315: For example, in the table which has 18 amino acids, Glutamine and

316: Isoleucine are removed. Glutamine is in the acidic group and its

317: frequency of occurrence is $3.9\%$. The codons which formerly coded Glutamine

318: (CAA, CAG) now code Glutamic Acid, which is also in the acidic

319: group and has a frequency of $6.2\%$. Isoleucine is in the aliphatic

320: group and its frequency is $4.6\%$. The codons which formerly coded Isoleucine

321: (AUU, AUC, AUA) now code Glycine, which is also in the aliphatic

322: group and has a frequency of $7.5\%$.

323:
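The reassignment in this example amounts to relabelling entries of the codon table. A minimal Python sketch, assuming the standard genetic code with DNA letters (T in place of U) and one-letter amino acid symbols (Q for Glutamine, E for Glutamic Acid, I for Isoleucine, G for Glycine), is given below; the function names are hypothetical and the sketch covers only the table, not the accompanying change of the gene.

\begin{verbatim}
```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def merge_amino_acids(code, removed_to_kept):
    """New codon table in which the codons of each removed amino acid
    code the amino acid it is merged into."""
    return {codon: removed_to_kept.get(aa, aa) for codon, aa in code.items()}


def n_amino_acids(code):
    """Number of distinct amino acids in a table, not counting the stop signal."""
    return len(set(code.values()) - {"*"})


# Glutamine -> Glutamic Acid and Isoleucine -> Glycine, as in the example above.
table18 = merge_amino_acids(CODE, {"Q": "E", "I": "G"})
print(n_amino_acids(CODE), n_amino_acids(table18))   # 20 18
print(table18["CAA"], table18["ATA"])                # E G
```
\end{verbatim}

Repeating the call with further pairs would give the tables with 16 and 14 amino acids; which amino acids are merged in those tables is not listed in the text, so no specific choice is shown here.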

324: To conserve the information content of the gene, the gene should be

325: lengthened. As an example, by changing the codons CAA and CAG from

326: Glutamine to Glutamic Acid we have lost the information carried by

327: Glutamine. We therefore take Asparagine and Aspartic Acid, which are also

328: in the acidic group, and insert them wherever Glutamine originally occurred.

329: The same procedure was repeated to decrease the number of amino acids

330: to 16 and then to 14.

331:

332: To extend the amino acid table, we first determine which consecutive

333: amino acid pairs in the gene have the highest frequency. To do this, we

334: count the occurrences of each pair and construct the pair matrix.

335: Then, each frequent pair is replaced by a new amino acid. For example, the

336: Leucine--Leucine pair has the highest frequency and is replaced

337: by a new amino acid called New1. Similarly, Threonine--Serine pairs are

338: replaced by New2. Leucine is from the aliphatic group, and New1 is

339: constructed by dividing Alanine, which is also from the aliphatic group

340: and is represented by the highest number of codons. Now the codons GCU

341: and GCC still code Alanine but the remaining ones (GCA, GCG) code New1.

342: Similarly, New2 is formed by dividing Serine: the codons UCU, UCC, UCA

343: and UCG still code Serine but the others (AGU and AGC) code New2. The

344: tables with 24, 26 and 28 amino acids

345: are constructed in the same way.

346:
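The pair statistics and the replacement step can be sketched as follows. This is a minimal Python illustration on a made-up residue sequence, not on the LD78 protein; the helper names are hypothetical, overlapping pairs are counted separately, and replacement proceeds left to right without overlap, which is one possible convention that the text does not specify.

\begin{verbatim}
```python
from collections import Counter


def most_frequent_pair(protein):
    """Return the most common pair of consecutive residues and its count."""
    pairs = Counter(zip(protein, protein[1:]))
    return pairs.most_common(1)[0]


def replace_pair(protein, pair, new_symbol):
    """Replace non-overlapping occurrences of `pair`, scanning left to right."""
    out, i = [], 0
    while i < len(protein):
        if tuple(protein[i:i + 2]) == tuple(pair):
            out.append(new_symbol)
            i += 2
        else:
            out.append(protein[i])
            i += 1
    return out


# Made-up sequence: L = Leucine, T = Threonine, S = Serine, A = Alanine.
toy = list("LLTSLLA")
pair, count = most_frequent_pair(toy)
print(pair, count)
print(replace_pair(toy, pair, "New1"))
```
\end{verbatim}

Each replacement turns two residues into one, so the sequence becomes shorter as the table grows, in line with the shortening of the gene described for the extended tables.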

347: As control cases we have also done simulations with different amino acid

348: tables, both increased and decreased number of amino acids, where we

349: neglected the conservation of information and kept the genome length

350: constant. As expected, the results of these simulations were very close

351: (within error bars) to the results of the original table.

352:

353: Biologists have also been trying to find simplified amino acid alphabets.

354: One of these methods is the MJ matrix constructed using Wang and Wang's

355: method \cite{Wang21950} which is based on Miyazawa-Jernigan's (MJ) residue -

356: residue potentials \cite{Miyazawa21960}. Their reduction algorithm, which

357: connects different representations of a protein, is generally based on

358: the idea that the amino acids can be distributed into different groups,

359: with different interactions. The interactions between amino acids of two

360: different groups should have similar characteristics for a successful

361: reduction.

362:

363: Another method is that of Murphy, Wallqvist and

364: Levy \cite{Murphy21940}, which uses the BLOSUM50 matrix derived by Henikoff and Henikoff

365: \cite{Henikoff7460}. Their reduction scheme is based on the analysis of

366: correlations among similarity matrix elements used for sequence alignment.

367: We have constructed reduced amino acid tables using both the MJ matrix and

368: the BLOSUM50 matrix methods.

369:

370: \section{CONCLUSION\protect\\ }

371: \label{sec:level6}

372:

373: In this paper, we developed a computer simulation which represents

374: a living organism under mutations. Furthermore, we changed the

375: genetic code used in the simulations to analyze its effect on

376: population stability.

377:

378: All the results of different simulations are summarized in

379: Table \ref{results} and plotted in Figure \ref{resultfig}.

380:

381: \begin{center}

382: \begin{table}[!]

383: \caption{Results of the simulations using different genetic code

384: tables.}

385: \label{results}

386: \begin{tabular}{|l|c|}

387: \hline Genetic code table & Slope $\approx \ln P(\mathrm{surviving})$ \\

388: \hline Nature (20 amino acids) & $-0.266 \pm 0.001$\\

389: \hline 18 amino acids & $-0.281 \pm 0.001$\\

390: \hline 16 amino acids & $-0.291 \pm 0.004$\\

391: \hline 14 amino acids & $-0.313 \pm 0.004$\\

392: \hline 18 amino acids (MJ matrix) & $-0.282 \pm 0.002$\\

393: \hline 16 amino acids (MJ matrix) & $-0.287 \pm 0.003$\\

394: \hline 14 amino acids (MJ matrix) & $-0.320 \pm 0.003$\\

395: \hline 18 amino acids (BLOSUM50 matrix) & $-0.288 \pm 0.003$\\

396: \hline 16 amino acids (BLOSUM50 matrix) & $-0.294 \pm 0.003$\\

397: \hline 14 amino acids (BLOSUM50 matrix) & $-0.302 \pm 0.004$\\

398: \hline 22 amino acids & $-0.265 \pm 0.001$\\

399: \hline 24 amino acids & $-0.265 \pm 0.001$\\

400: \hline 26 amino acids & $-0.273 \pm 0.001$\\

401: \hline 28 amino acids & $-0.291 \pm 0.002$\\

402: \hline

403: \end{tabular}%

404: \end{table}

405: \end{center}

406:

407: \begin{figure}[!]

408: \begin{center}

409: \includegraphics[width=14cm]{figure3.eps}

410: \caption{``Probabilities of survival'' (fitted slopes) for the different genetic code

411: tables. The parabolic fit is only a guide to the eye.}

412: \label{resultfig}

413: \end{center}

414: \end{figure}

415:

416: The results for the shorter amino acid tables show that if we

417: conserve the information, the population which is represented by

418: fewer amino acids is affected more severely by mutations. However,

419: if we disregard the information carried by the gene (the

420: simulations with conserved genome length), the population is not

421: affected much.

422:
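To make the size of this effect concrete, the fitted slopes in Table~\ref{results} can be converted back into survival probabilities per mutation event through $P(\mathrm{surviving}) \approx e^{\mathrm{slope}}$. As a rough illustration, using the central values only and neglecting the quoted uncertainties:
\[
e^{-0.266} \approx 0.766 \ \mbox{(Nature)}, \qquad
e^{-0.281} \approx 0.755 \ \mbox{(18)}, \qquad
e^{-0.313} \approx 0.731 \ \mbox{(14)}.
\]
Going from 20 to 14 amino acids thus lowers the chance of surviving a single mutation event by about 3.5 percentage points, a difference that compounds exponentially over successive events.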

423: Even if we use different reducing algorithms (MJ or BLOSUM50) for

424: genetic code, the population is affected by mutations more than

425: the population represented by the genetic code of Nature.

426:

427: When we extend the amino acid table, we shorten the

428: gene in order to conserve the information. The

429: resistance of the population against mutations does not change

430: when the amino acid number is 22 or 24. However, after 24, the

431: resistance starts to decrease.

432:

433: The slopes of the simulations with 20, 22 and 24 amino acids are

434: very close. These results are in line with the results

435: of the investigators from the Scripps Research Institute

436: \cite{Anderson21930} and provide computational support for their

437: belief that genetic codes with even more amino acids are feasible.

438: However, the number of amino acids is restricted to the range 20--24 if

439: we want the resulting life form to be resilient against mutations.

440:

441: \section{ACKNOWLEDGEMENTS}

442: I am grateful to Dr. Isil Aksan Kurnaz and Dr. Muhittin Mungan for

443: their contributions to the model and the calculations.

444:

445:

446:

447: \begin{thebibliography}{10}

448: \expandafter\ifx\csname bibnamefont\endcsname\relax

449:   \def\bibnamefont#1{#1}\fi

450: \expandafter\ifx\csname bibfnamefont\endcsname\relax

451:   \def\bibfnamefont#1{#1}\fi

452: \expandafter\ifx\csname url\endcsname\relax

453:   \def\url#1{\texttt{#1}}\fi

454: \expandafter\ifx\csname urlprefix\endcsname\relax\def\urlprefix{URL }\fi

455: \providecommand{\bibinfo}[2]{#2}

456: \providecommand{\eprint}[2][]{\url{#2}}

457:

458: \bibitem{Penna}

459: \bibinfo{author}{\bibfnamefont{T. J. P.} \bibnamefont{Penna}},

460:   \bibinfo{journal}{J. Stat. Phys.} \textbf{\bibinfo{volume}{78}},

461:   \bibinfo{pages}{1629} (\bibinfo{year}{1995}).

462:

463: \bibitem{Stauffer13580}

464: \bibinfo{author}{\bibfnamefont{D.} \bibnamefont{Stauffer}},

465:   \bibinfo{journal}{Physica A} \textbf{\bibinfo{volume}{273}},

466:   \bibinfo{pages}{132} (\bibinfo{year}{1999}).

467:

468: \bibitem{Stauffer13600}

469: \bibinfo{author}{\bibfnamefont{D.} \bibnamefont{Stauffer}},

470:   \bibinfo{author}{\bibfnamefont{P. M. C.} \bibnamefont{de Oliveira}},

471:   \bibinfo{author}{\bibfnamefont{S. M.} \bibnamefont{de Oliveira}} \bibnamefont{and}

472:   \bibinfo{author}{\bibfnamefont{R. M. Z.} \bibnamefont{dos Santos}},

473:   \bibinfo{journal}{Physica A} \textbf{\bibinfo{volume}{231}},

474:   \bibinfo{pages}{504} (\bibinfo{year}{1996}).

475:

476: \bibitem{Stauffer13570}

477: \bibinfo{author}{\bibfnamefont{D.} \bibnamefont{Stauffer}},

478:   \bibinfo{author}{\bibfnamefont{P. M. C.} \bibnamefont{de Oliveira}},

479:   \bibinfo{author}{\bibfnamefont{S. M.} \bibnamefont{de Oliveira}},

480:   \bibinfo{author}{\bibfnamefont{T. J. P.} \bibnamefont{Penna}} \bibnamefont{and}

481:   \bibinfo{author}{\bibfnamefont{J. S.} \bibnamefont{Sa Martins}},

482:   \bibinfo{journal}{Anais Da Academia Brasileira De Ciencias} \textbf{\bibinfo{volume}{73}},

483:   \bibinfo{pages}{15} (\bibinfo{year}{2001}).

484:

485: \bibitem{Sousa680}

486: \bibinfo{author}{\bibfnamefont{A. O.} \bibnamefont{Sousa}},

487:   \bibinfo{author}{\bibfnamefont{S. M.} \bibnamefont{de Oliveira}} \bibnamefont{and}

488:   \bibinfo{author}{\bibfnamefont{J. S.} \bibnamefont{Sa Martins}},

489:   \bibinfo{journal}{Phys. Rev. E} \textbf{\bibinfo{volume}{67}},

490:   \bibinfo{pages}{32903} (\bibinfo{year}{2003}).

491:

492: \bibitem{Tuzel2770}

493: \bibinfo{author}{\bibfnamefont{E.} \bibnamefont{Tuzel}},

494:   \bibinfo{author}{\bibfnamefont{V.} \bibnamefont{Sevim}} \bibnamefont{and}

495:   \bibinfo{author}{\bibfnamefont{A.} \bibnamefont{Erzan}},

496:   \bibinfo{journal}{Proc. Natl. Acad. Sci.} \textbf{\bibinfo{volume}{98}},

497:   \bibinfo{pages}{13774} (\bibinfo{year}{2001}).

498:

499: \bibitem{Tuzel21970}

500: \bibinfo{author}{\bibfnamefont{E.} \bibnamefont{Tuzel}},

501:   \bibinfo{author}{\bibfnamefont{V.} \bibnamefont{Sevim}} \bibnamefont{and}

502:   \bibinfo{author}{\bibfnamefont{A.} \bibnamefont{Erzan}},

503:   \bibinfo{journal}{Phys. Rev. E} \textbf{\bibinfo{volume}{64}},

504:   \bibinfo{pages}{061908} (\bibinfo{year}{2001}).

505:

506: \bibitem{Orcal3410}

507: \bibinfo{author}{\bibfnamefont{B.} \bibnamefont{Orcal}},

508:   \bibinfo{author}{\bibfnamefont{E.} \bibnamefont{Tuzel}},

509:   \bibinfo{author}{\bibfnamefont{V.} \bibnamefont{Sevim}},

510:   \bibinfo{author}{\bibfnamefont{N.} \bibnamefont{Jan}} \bibnamefont{and}

511:   \bibinfo{author}{\bibfnamefont{A.} \bibnamefont{Erzan}},

512:   \bibinfo{journal}{Int. J. Mod. Phys. C} \textbf{\bibinfo{volume}{11}},

513:   \bibinfo{pages}{973} (\bibinfo{year}{2000}).

514:

515: \bibitem{Penna2510}

516: \bibinfo{author}{\bibfnamefont{T. J. P.} \bibnamefont{Penna}},

517:   \bibinfo{author}{\bibfnamefont{A.} \bibnamefont{Racco}},

518:   \bibinfo{author}{\bibfnamefont{A. O.} \bibnamefont{Sousa}},

519:   \bibinfo{journal}{Physica A} \textbf{\bibinfo{volume}{295}},

520:   \bibinfo{pages}{31} (\bibinfo{year}{2001}).

521:

522: \bibitem{Penna13590}

523: \bibinfo{author}{\bibfnamefont{T. J. P.} \bibnamefont{Penna}} \bibnamefont{and}

524:   \bibinfo{author}{\bibfnamefont{D.} \bibnamefont{Stauffer}},

525:   \bibinfo{journal}{Zeitschrift Fur Physik B-Condensed Matter} \textbf{\bibinfo{volume}{101}},

526:   \bibinfo{pages}{469} (\bibinfo{year}{1996}).

527:

528: \bibitem{Penna13610}

529: \bibinfo{author}{\bibfnamefont{T. J. P.} \bibnamefont{Penna}},

530:   \bibinfo{author}{\bibfnamefont{S. M.} \bibnamefont{de Oliveira}} \bibnamefont{and}

531:   \bibinfo{author}{\bibfnamefont{D.} \bibnamefont{Stauffer}},

532:   \bibinfo{journal}{Phys. Rev. E} \textbf{\bibinfo{volume}{52}},

533:   \bibinfo{pages}{R3309} (\bibinfo{year}{1995}).

534:

535: \bibitem{Huang3140}

536: \bibinfo{author}{\bibfnamefont{Z. F.} \bibnamefont{Huang}} \bibnamefont{and}

537:   \bibinfo{author}{\bibfnamefont{D.} \bibnamefont{Stauffer}},

538:   \bibinfo{journal}{Theory in Biosciences} \textbf{\bibinfo{volume}{120}},

539:   \bibinfo{pages}{21} (\bibinfo{year}{2001}).

540:

541: \bibitem{gene}

542: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=nucleotide\&val=285912.

543:

544: \bibitem{Jukes21920}

545: \bibinfo{author}{\bibfnamefont{T. H.} \bibnamefont{Jukes}} \bibnamefont{and}

546:   \bibinfo{author}{\bibfnamefont{C. R.} \bibnamefont{Cantor}},

547:   \emph{\bibinfo{title}{in Mammalian Protein Metabolism, edited by H. N. Munro}}

548:   (\bibinfo{publisher}{Academic Press, New York},

549:   \bibinfo{pages}{21} \bibinfo{year}{1969}).

550:

551: \bibitem{Anderson21930}

552: \bibinfo{author}{\bibfnamefont{J. C.} \bibnamefont{Anderson}},

553:   \bibinfo{author}{\bibfnamefont{N.} \bibnamefont{Wu}},

554:   \bibinfo{author}{\bibfnamefont{S. W.} \bibnamefont{Santoro}},

555:   \bibinfo{author}{\bibfnamefont{V.} \bibnamefont{Lakshman}},

556:   \bibinfo{author}{\bibfnamefont{D. S.} \bibnamefont{King}} \bibnamefont{and}

557:   \bibinfo{author}{\bibfnamefont{P. G.} \bibnamefont{Schultz}},

558:   \bibinfo{journal}{Proc. Natl. Acad. Sci.} \textbf{\bibinfo{volume}{101}},

559:   \bibinfo{pages}{7566} (\bibinfo{year}{2004}).

560:

561: \bibitem{Matthews}

562: \bibinfo{author}{\bibfnamefont{T. J.}~\bibnamefont{Matthews}},

563:   \emph{\bibinfo{title}{Biochemistry}} (\bibinfo{publisher}{Addison

564:   Wesley Longman, San Francisco}, \bibinfo{year}{2000}).

565:

566: \bibitem{Wang21950}

567: \bibinfo{author}{\bibfnamefont{J.} \bibnamefont{Wang}} \bibnamefont{and}

568:   \bibinfo{author}{\bibfnamefont{W.} \bibnamefont{Wang}},

569:   \bibinfo{journal}{Nature Structural Biology} \textbf{\bibinfo{volume}{6}},

570:   \bibinfo{pages}{1033} (\bibinfo{year}{1999}).

571:

572: \bibitem{Miyazawa21960}

573: \bibinfo{author}{\bibfnamefont{S.} \bibnamefont{Miyazawa}} \bibnamefont{and}

574:   \bibinfo{author}{\bibfnamefont{L. R.} \bibnamefont{Jernigan}},

575:   \bibinfo{journal}{J. Mol. Biol.} \textbf{\bibinfo{volume}{256}},

576:   \bibinfo{pages}{623} (\bibinfo{year}{1996}).

577:

578: \bibitem{Murphy21940}

579: \bibinfo{author}{\bibfnamefont{L. R.} \bibnamefont{Murphy}},

580:   \bibinfo{author}{\bibfnamefont{A.} \bibnamefont{Wallqvist}} \bibnamefont{and}

581:   \bibinfo{author}{\bibfnamefont{R. M.} \bibnamefont{Levy}},

582:   \bibinfo{journal}{Prot. Eng.} \textbf{\bibinfo{volume}{13}},

583:   \bibinfo{pages}{149} (\bibinfo{year}{2000}).

584:

585: \bibitem{Henikoff7460}

586: \bibinfo{author}{\bibfnamefont{S.} \bibnamefont{Henikoff}} \bibnamefont{and}

587:   \bibinfo{author}{\bibfnamefont{J. G.}~\bibnamefont{Henikoff}},

588:   \bibinfo{journal}{Proc. Natl. Acad. Sci.} \textbf{\bibinfo{volume}{89}},

589:   \bibinfo{pages}{10915} (\bibinfo{year}{1992}).

590:

591: \end{thebibliography}

592:

593:

594:

595: \end{document}

596: