% q-bio0604006/Intro.tex
\section{Introduction and Motivation}\label{sec:intro}
The theory of complex networks plays an important role in a wide
variety of disciplines, ranging from communications and power
systems engineering to molecular and population biology
\cite{AlbBar02, BarOlt04, New03, DorMen02, AlmArk03, AlbJeoBar99,
Bra03, Alo03}. While the focus of this article is on biological
applications of the theory of graphs and networks, there are also
several other domains in which networks play a crucial role. For
instance, the Internet and the World Wide Web (WWW) have grown at
a remarkable rate, both in size and importance, in recent years,
leading to a pressing need both for systematic methods of
analysing such networks and for a thorough understanding of
their properties. Moreover, in sociology and ecology, increasing
amounts of data on food webs and the structure of human social
networks are becoming available. Given the critical role that
these networks play in many key questions relating to the
environment and public health, it is hardly surprising that
researchers in ecology and epidemiology have focussed attention on
network analysis in recent years.  In particular, the complex
interplay between the structure of social networks and the spread
of disease is a topic of critical importance.  The threats to
human health posed by new infectious diseases such as the SARS
virus and the Asian bird flu \cite{WanRua04, Meyetal05}, coupled
with modern travel patterns, underline the vital nature of this
issue.

On a more theoretical level, several recent studies have indicated
that networks from a broad range of application areas share common
structural properties. Furthermore, a number of the properties
observed in such real-world networks are incompatible with those
of the random graphs which had been traditionally employed as
modelling tools for complex networks \cite{AlbBar02, New03}. The
latter observation naturally poses the challenge of devising more
accurate models for the topologies observed in biological and
technological networks, while the former further motivates the
development of analysis tools for complex networks. The common
structural properties shared by diverse networks offer the hope
that such tools may prove useful for applications in a wide
variety of disciplines.  Within the fields of Biology and
Medicine, applications include the identification of drug targets,
the determination of the role of proteins or genes of unknown function
\cite{JeoOltBar03, SamLia03}, the design of effective containment
strategies for infectious diseases \cite{Eubetal04}, and the early
diagnosis of neurological disorders through detecting abnormal
patterns of neural synchronization in specific brain regions
\cite{SchGro05}.

Recent advances in the development of high-throughput techniques
in molecular biology have led to an unprecedented amount of data
becoming available on key cellular networks in a variety of simple
organisms \cite{Itoetal01, Cosetal00}.  Broadly speaking, three
classes of bio-molecular networks have attracted the most attention to
date: metabolic networks of biochemical reactions between
metabolic substrates; protein interaction networks consisting of
the physical interactions between an organism's proteins; and the
transcriptional regulatory networks which describe the regulatory
interactions between different genes.  At the time of writing, the
central metabolic networks of numerous bacterial organisms have
been mapped \cite{Ravetal02}. Also, large-scale data sets are
available on the structure of the protein interaction networks of
{\it S. cerevisiae} \cite{Itoetal01, Uetzetal00}, {\it H. pylori}
\cite{Raietal01}, {\it D. melanogaster} \cite{Gioetal03} and {\it
C. elegans} \cite{Lietal04, Cosetal00}, and the transcriptional
regulatory networks of {\it E. coli} and {\it S. cerevisiae} have
been extensively studied \cite{Ihmetal02, Shenetal02}.  The large
amount of data now available on these networks provides the
network research community with both opportunities and challenges.

On the one hand, it is now possible to investigate the structural
properties of networks in living cells, to identify their key
properties and, hopefully, to shed light on how such properties may
have evolved biologically. A major motivation for the study of
biological networks is the need for tailored analysis methods
which can extract meaningful biological information from the data
becoming available through the efforts of experimentalists.  This
is all the more pertinent given that the network structures
emerging from the results of high-throughput techniques are too
complex to analyse in a non-systematic fashion. A knowledge of the
topologies of biological networks, and of their impact on
biological processes, is needed if we are to fully understand, and
develop more sophisticated treatment strategies for, complex
diseases such as cancer \cite{VogLanLev00}.  Also, recent work
suggesting connections between abnormal neural synchronization and
neurological disorders such as {\it Parkinson's disease} and {\it
Schizophrenia} \cite{SchGro05} provides strong motivation for
studying how network structure influences the emergence of
synchronization between interconnected dynamical systems.

The mathematical discipline which underpins the study of complex
networks in Biology and elsewhere, and on which the techniques
discussed throughout this article are based, is {\it graph theory}
\cite{Die00}.  Alongside the potential benefits of applying graph
theoretical methods in molecular biology, it should be emphasized
that the complexity of the networks encountered in cellular
biology and the mechanisms behind their emergence present the
network researcher with numerous challenges and difficulties.  The
inherent variability in biological data, the high likelihood of
data inaccuracy \cite{Meretal02} and the need to incorporate
dynamics and network topology in the analysis of biological
systems are just a sample of the obstacles to be overcome if we
are to successfully understand the fundamental networks involved
in the operation of living cells.  Another important issue, which
we shall discuss at various points throughout the article, is that
the structure of biological and social networks is often inferred
from sampled subnetworks.  The precise impact of sampling on the
results and techniques published in the recent past needs to be
understood if these are to be reliably applied to real biological
data.

Motivated by the considerations outlined above, a substantial
literature dedicated to the analysis of biological networks has
emerged in the last few years, and significant progress has
been made in identifying and interpreting the structure of such
networks. Our primary goal in the present article is to provide as
broad a survey as possible of the major advances made in this
field in the recent past, highlighting what has been achieved as
well as some of the most significant open issues that need to be
addressed.  The material discussed in the article can be divided
naturally into two strands, and this is reflected in the
organisation of the document.  The first part of the article will
primarily be concerned with the properties and analysis of
cellular networks such as protein interaction networks and
transcriptional regulatory networks.  In the second part, we turn
our attention to two important applications of Graph Theory in
Biology: the phenomenon of synchronization and its role in
neurological disorders, and the interaction between network
structure and epidemic dynamics.

In the interests of clarity, we shall now give a brief outline of
the main topics covered throughout the rest of the paper. In
Section~2, we shall fix the principal notations used throughout
the paper, and briefly review the main mathematical and graph
theoretical concepts that are required in the remainder of the
article.  As mentioned above, the body of the article is divided
into two parts.  The first part consists of Sections~3, 4 and~5
and the second part of Sections~6 and~7.  At the end of each major
section, a brief summary of the main points covered in that
section is given.

In Section~3, we shall discuss recent findings on the structure of
bio-molecular networks and review several graph models, including
{\it Scale-Free} graphs and {\it Duplication-Divergence} models,
that have been proposed to account for the properties observed in
real biological networks.  Section~4 is concerned with the
application of graph theoretical {\it measures of centrality or
importance} to biological networks. In particular, we shall
concentrate on the connection between the centrality of a gene or
protein within an interaction network and its likelihood of being
{\it essential} for the organism's survival. In Section~5, we
shall consider the {\it hierarchical structure} of biological
networks.  In particular, we shall discuss {\it motifs in
bio-molecular networks} and the identification of (typically
larger) functional modules.

In the second part of the article, we shall discuss two major
applications of Graph Theory to Biology.  Section~6 is concerned
with a number of issues and results related to the phenomenon of
synchronization in networks of inter-connected dynamical systems
and its relevance in various biological contexts.  Particular
attention will be given to suggested links between {\it patterns
of synchrony} and {\it neurological disorders}. In Section~7, we
shall discuss some recent work on the influence that the structure
of a social network can have on the behaviour of various disease
propagation models, and consider the epidemiological significance
of these findings. Finally, in Section~8 we shall present our
concluding remarks and highlight some possible directions for
future research.