q-bio0502007/spec.tex
1: \documentclass{article}
2: %\documentclass{revtex4}
3: \newcommand\rcsinfo{ \$Source: /home/axel/paper/spec/RCS/spec.tex,v $ $
4:   -- \$Revision: 1.53 $ $ -- \$Date: 2005/05/26 09:10:34 $ $ }
5: 
6: \usepackage[sort]{natbib}
7: \bibpunct{(}{)}{,}{a}{,}{,}
8: %\usepackage{rotating}
9: %% \usepackage{bibendnote}
10: %% \let\footnote=\bibendnote
11: \let\bibendnote=\footnote
12: 
13: \newcommand{\var}{\mathop{\mathrm{var}}}
14: \newcommand{\std}{\mathop{\mathrm{std}}}
15: \newcommand{\cm}[1]{({\small \sf #1})} % use for comments
16: \usepackage{amsfonts}
17: \usepackage{amsmath} 
18: \usepackage{amssymb}
19: \usepackage{url} 
20: %\usepackage{srcltx}
21: \usepackage{graphicx}
22: %\usepackage{showkeys}
23: %\usepackage{hyperref}
24: \usepackage[hypertex]{hyperref}
25: \newlength{\imgwidth}\setlength{\imgwidth}{0.8\textwidth}
26: 
27: \pagestyle{myheadings}
28: %\markright{\footnotesize \rcsinfo}
29: 
30: \usepackage{version}
31: \includeversion{onlysubmission}\excludeversion{onlypreprint}
32: %\excludeversion{onlysubmission}\includeversion{onlypreprint}
33: \excludeversion{motivation}
34: 
35: % The following parameters seem to provide a reasonable page setup.
36: \topmargin 0.0cm
37: \oddsidemargin 0.7cm
38: \textwidth 15cm
39: \textheight 21cm
40: \footskip 1.0cm
41: 
42: \usepackage{fullpage}
43: %% \usepackage{mydoublespace}
44: %% \def\baselinestretch{2}
45: %% \addtolength{\footnotesep}{1ex}
46: 
47: \setlength{\rightskip}{0pt plus 1fil} % Flatterrand
48: 
49: %%%%  automatically count words according to journal rules:
50: \def\ifUnDefinedCs#1{\expandafter\ifx\csname#1\endcsname\relax}
51: \ifUnDefinedCs{FreezeFont} \includeversion{nowordcount} \else
52: \excludeversion{nowordcount}
53: %%this used to argument to eat also interpunctation:
54: \renewcommand\citet[2]{\ignorespaces}
55: \renewcommand\cm[1]{\ignorespaces}
56: \renewcommand{\maketitle}{\ignorespaces} 
57: \fi
58: \renewcommand{\cm}[1]{\ignorespaces}
59: %%%%%%%%%%%%%%%%% END OF LOCAL PREAMBLE %%%%%%%%%%%%%%%%
60: \title{Some Properties of the Speciation Model\\ for Food-Web
61:   Structure --- \\ Mechanisms for Degree Distributions and Intervality}
62: 
63: \author{A. G. Rossberg$^\ast$, H. Matsuda, T. Amemiya, K. Itoh\\
64:   \normalsize{Yokohama National University, Graduate School of Environment}\\
65:   \normalsize{and Information Sciences, Yokohama 240-8501, Japan}\\
66:   \normalsize{$^\ast$Corresponding author. Tel.:
67:     +81-45-339-4369, FAX: +81-45-339-4353}\\
68: \small{E-mail addresses: rossberg@ynu.ac.jp (A.G.R.),
69:   matsuda2@ynu.ac.jp  (H.M.),
70:  }\\\small{amemiyat@ynu.ac.jp (T.A.), itohkimi@ynu.ac.jp (K.I.)}}
71: \begin{onlypreprint}
72:    \date{\small \rcsinfo}
73: \end{onlypreprint}
74: \begin{onlysubmission}
75:   \date{\today}
76: \end{onlysubmission}
77: 
78: \begin{document}
79: 
80: % Make the title.
81: \begin{nowordcount}
82:   \maketitle
83: \end{nowordcount}
84: 
85: \thispagestyle{headings}
86: 
87: \begin{abstract}
88:   \cm{We understand everything, but will the reader?  Will the
89:     referee? }
90:   
91:   We present a mathematical analysis of the speciation model for
92:   food-web structure, which was shown in previous work to yield a
93:   good description of empirical data on food-web topology.  The degree
94:   distributions of the network are derived.  Properties of the
95:   speciation model are compared to those of other models that
96:   successfully describe empirical data.  It is argued that the
97:   speciation model unifies the underlying ideas of previous theories.
98:   In particular, it offers a mechanistic explanation for the success
99:   of the niche model of Williams and Martinez and the frequent
100:   observation of intervality in empirical food webs.
101: \end{abstract}
102: %up to five:
103: \textbf{Keywords:} food-web, evolution, network dynamics, degree
104: distribution, intervality
105: 
106: \newpage{}
107: 
108: 
109: \cm{Everything that looks like this is a comment and will not appear
110:   in the submitted version.}
111: 
112: \tableofcontents{}
113: 
114: \section{Introduction}
115: \label{sec:introduction}
116: 
117: The theoretical study of the topology of food webs, the networks
118: formed by the trophic interactions in ecological communities, has led
119: to increasingly precise descriptions of the empirically observed
120: structures.  In the early work of
121: \citet{cohen78:_food_webs_niche_space},
122: \citet{briand87:_envir_chain_length}, \citet{sugihara92:_niche} and
123: others, several simple food-web models were investigated.  The
124: \emph{cascade model} \citep{cohen90:_commun_food_webs} was identified
125: as a description that reproduced the available data particularly well.
126: In the cascade model a food web consists of a fixed number $S$ of
127: species, and each species consumes any species which precedes it in a
128: given linear ordering with a fixed probability $C_0$.  The analysis of
129: this model led to several predictions
130: \citep{cohen90:_commun_food_webs} which inspired a more systematic and
131: accurate collection of food-web data by empiricists
132: \citep[e.g.,][]{polis91:_compl_des_web,martinez91:_artif_attr,hall91:_food_rich_web,havens92:_scale_webs}.
133: 
134: Based on the new data, \citet{williams00:_simpl} showed that their
135: \emph{niche model} was a significant improvement.  In this model,
136: species are ordered according to their niche value $n$ that is chosen
137: randomly from the interval $[0,1]$.  To determine the diet of a
138: species, an interval of random width $\le n$ is drawn with uniform
139: distribution from within\footnote{The original description of the
140:   model \citep{williams00:_simpl} is inaccurate at this point.}
141: $[0,1]$, restricted by the condition that at least half of the
142: interval is located below the niche value $n$ of this species.  Its
143: diet then consists of all species with a niche value in this interval.
144: %% This model can also account for trophic loops and cannibalism, which
145: %% are frequently observed in newer data.
146: 
147: A mathematical analysis by \citet{camacho02:_analytic_food_webs}
148: revealed the importance of the specific rule for determining the width
149: of the feeding intervals: by choosing it from an approximately
150: exponential distribution, the resulting food webs show a distribution
151: of generality (the number of a species' resources) which is strongly
152: skewed towards low values, in good accordance with observations
153: \citep{camacho02:_robus_patter_food_web_struc,%
154: stouffer05:_quant_patterns_webs}.
155: 
156: By construction, the niche model also reproduces a property called
157: \emph{intervality} \citep{cohen78:_food_webs_niche_space}: Species can
158: be ordered on a line in such a way that the diet of each consumer is a
159: contiguous set.  Intervality is surprisingly often found in small webs
160: \citep{cohen78:_food_webs_niche_space}.  Larger webs exhibit it to
161: some degree \citep{cohen90:_commun_food_webs,cattin04:_phylog}.
162: \citet{cattin04:_phylog} argued that intervality can be a consequence
163: of the fact that similar, evolutionarily related species consume similar
164: resources.  They proposed the \emph{nested hierarchy model}, a
165: modification of the niche model which incorporates this idea and
166: better accounts for the observed degree of intervality.
167: 
168: Apart from these mostly descriptive models of food-web topology there
169: have also been several attempts to explain the structure of food webs
170: by the interaction of population dynamical and evolutionary mechanisms
171: \citep[e.g.,][]{caldarelli98:_model_multispec_commun,%
172: drossel01:_influen_predat_prey_popul_dynam,%
173: yoshida03:_evolut_web_sys,%
174: tokita03:_emerg_complex_stab_net}.  Characteristic of most of these
175: models is their high computational complexity, which makes their
176: quantitative statistical validation difficult.  Therefore it can be
177: advantageous to consider first explanatory models that are explicit in
178: terms of either population dynamics
179: \citep[e.g.,][]{pimm84:_compl_stab,montoya03:_topol_webs_real_to_assemb}
180: or evolutionary mechanisms
181: \citep[e.g.,][]{amaral99:_envir_chang_coext_patter_fossil_recor,drossel98:_extin_event_species_lifet_simpl_ecolog_model,camacho00:_extin}
182: alone.
183: 
184: The recently proposed \emph{speciation model} \citep{rossberg05:_web}
185: is of the purely evolutionary type.  It combines mechanisms
186: corresponding to speciations and extinctions with simple assumptions
187: regarding the evolutionary inheritance of trophic links.  In spirit,
188: the model is similar to the duplication-divergence model of proteome
189: evolution by \citet{vazquez03:_model_protein_net} or the related model
190: by \cite{pastor-satorras03:_evolv_prot_net}, even though in the
191: speciation model directed links and the possibility of extinctions
192: complicate the situation.  
193: 
194: Furthermore, the speciation model takes the tendency of food webs to
195: respect a ``pecking order'', as it is ideally realized in the cascade
196: model, into account.  It is currently unclear if the dominating mechanism
197: imposing this ordering of species is the physical advantage that larger
198: predators have over smaller prey, energy conservation and dissipation,
199: or some other constraint.  The idea that the pecking order is
200: essentially an ordering by body size has often been discussed
201: \citep{cattin04:_phylog,warren87:_pre_pry_triang,cohen93:_body_size,
202:   memmott00:_predat_size_web}.  The speciation model makes this
203: hypothesis explicit by postulating an allometric relationship between
204: body sizes and evolution rates.
205: 
206: The speciation model has been validated by a systematic statistical
207: analysis based on a comparison of twelve model properties---such as
208: the average chain length, the fraction of top predators, the degree of
209: intervality, or the clustering coefficient---with empirical data
210: \citep{rossberg05:_web}.  These numerical results suggest that the
211: speciation model reproduces observed food-web properties even better
212: than the niche model or the nested hierarchy model.
213: %
214: The aim of the current work is to present some analytic results that
215: allow insights into how important food web properties derive from the
216: model specifications.  After stating the model definition in
217: Sec.~\ref{sec:speciation_model}, the steady-state distribution of the
218: number of species $S$ and the expectation value of the directed
219: connectance $C$ (sometimes referred to as the food-web
220: ``complexity'') are derived in Sec.~\ref{sec:analysis}.  These
221: quantities are important because they are used as control parameters
222: in other models.  Section~\ref{sec:analysis} also contains a
223: characterization of the species pool in terms of evolutionary
224: ``clades'' which invites a comparison with empirical data.
225: Section~\ref{sec:distributions} is devoted to a characterization of
226: the model in terms of the distributions of generality and
227: vulnerability (the number of a species' consumers).  Based on these
228: results, the speciation model is compared with \cm{to} the cascade
229: model, the niche model, and the nested hierarchy model in
230: Sec.~\ref{sec:other-models}; common properties and differences are
231: pointed out.  Two variants of the speciation model, which leave the
232: analytic properties derived below unchanged, are introduced in
233: Sec.~\ref{sec:variants}.  A discussion and interpretation of the
234: results is provided in Sec.~\ref{sec:conclusion}.
235: 
236: \section{Definition of the speciation model}
237: \label{sec:speciation_model}
238: 
239: This section restates the definition of the speciation model given
240: elsewhere \citep{rossberg05:_web}, since it will be the starting point
241: for the subsequent analysis.  For a motivation of the model and a
242: discussion of design decisions we refer to the original work.
243: %
244: The speciation model describes an abstract species pool, the set of
245: trophic links between the species, and the evolution of both.  The
246: model is described in terms of a stochastic process characterized by
247: the parameters $r_1$, $r_+$, $r_-$, $R$, $D$, $\lambda$, $C_0$, and
248: $\beta$. 
249: 
250: \subsection{The evolution of the species pool}
251: 
252: Each species $i$ in the pool is associated with a \emph{speed
253:   parameter} $s_i$ in the range $[0,R]$.  The speed parameter
254: characterizes the evolution rate of a species and is thought to be
255: inversely correlated with the logarithm of the species' body mass by
256: an allometric law (see \citet{rossberg05:_web} for discussion).  In
257: any infinitesimal time interval $[t,t+dt]$ three kinds of events can
258: occur: \emph{adaptations} of foreign species to the habitat (i.e.
259: invasions on an evolutionary time scale), \emph{extinctions}, and
260: \emph{speciations}.  The probability for the adaptation of a new
261: species $k$ with speed parameter in the infinitesimal range
262: $s_k\in[s,s+ds]$ is $r_1 \exp(s)\, ds\, dt$.  When a new species is
263: adapting to the habitat, it is added to the species pool.  The
264: probability that some species $i$ of the species pool becomes extinct
265: is $r_- \exp(s_i) \, dt$.  When a species becomes extinct, it is
266: removed from the species pool.  Finally, the probability that some
267: species $i$ from the species pool speciates is $r_+ \exp(s_i) \, dt$.
268: When $i$ speciates, a new species $j$ with speed parameter
269: $s_j=s_i+\delta$ is added to the species pool, where $\delta$ is a
270: zero-mean Gaussian random variable with
271: $\mathop{\mathrm{var}}\delta=D$.  If $s_i+\delta$ falls outside the range
272: $[0,R]$, $s_j=-(s_i+\delta)$ or $s_j=2 R-(s_i+\delta)$, respectively, is used
273: instead (reflecting boundaries).  The probabilities for any of these
274: events to occur are independent.
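
As a compact restatement of the rules above (a sketch only; the
auxiliary symbol $x=s_i+\delta$ is introduced here for illustration
and is not used elsewhere), the reflecting boundaries amount to
\begin{align*}
  s_j=
  \begin{cases}
    -x & \text{if } x<0,\\
    x & \text{if } 0\le x\le R,\\
    2R-x & \text{if } x>R,
  \end{cases}
\end{align*}
assuming that $|\delta|$ is small enough for a single reflection to
bring $s_j$ back into $[0,R]$, which is the typical situation when
$\sqrt{D}\ll R$.  Integrating the adaptation rate over the full range
gives the total rate of adaptations,
\begin{align*}
  r_1\int_0^R e^{s}\,ds=r_1\left(e^{R}-1\right),
\end{align*}
while a pool with speed parameters $s_1,\ldots,s_S$ speciates at the
total rate $r_+\sum_i e^{s_i}$ and loses species at the total rate
$r_-\sum_i e^{s_i}$.
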
275: 
276: 
277: \subsection{The evolution of the food web}
278: 
279: The food web is described by a connectivity (or adjacency) matrix
280: $(m_{ij})$, with connectivity values $m_{ij}=1$ when $j$ eats $i$ and
281: $m_{ij}=0$ otherwise.  \emph{Possible consumers} $l$ of species $i$
282: are defined as species with $s_l<s_i+\lambda\,R$, \emph{possible
283:   resources} $h$ as those with $s_h>s_i-\lambda\,R$.  The connectivity
284: $m_{ij}$ can be $1$ only when $i$ is a possible resource of $j$.  The
285: connectivity of a new species adapting to the habitat to all possible
286: consumers and resources is set to $1$ with probability $C_0$ and to
287: $0$ otherwise.  Upon speciation, the connectivity values of the
288: decedent species $j$ to possible consumers and resources are copied
289: from the corresponding connectivity values of the parent species $i$
290: with probability $1-\beta$ (i.e., links break with probability
291: $\beta$).  The connectivity values to all possible resources and
292: consumers of $j$ which have not been copied are set to $1$ with
293: probability $C_0$ and to $0$ otherwise.
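
The inheritance rule can be summarized by two conditional
probabilities.  The following is an illustrative restatement, assuming
that the link in question is allometrically possible for both the
parent and the descendant; the symbols $m$ (connectivity value of the
parent) and $m'$ (corresponding value of the descendant) are
introduced only for this purpose:
\begin{align*}
  P(m'=1\,|\,m=1)=&(1-\beta)+\beta\,C_0, &
  P(m'=1\,|\,m=0)=&\beta\,C_0.
\end{align*}
If the parent's link is present with probability $C_0$, the
descendant's link is then present with probability
\begin{align*}
  C_0\left[(1-\beta)+\beta\,C_0\right]+(1-C_0)\,\beta\,C_0=C_0,
\end{align*}
so that, under this assumption, the fraction $C_0$ of realized links
is preserved under speciation.
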
294: 
295: \subsection{Typical parameters}
296: 
297: \begin{table}[btp]
298:   \centering
299: \begin{tabular}{lrrrrrrr}
300:   Food web: &BB&Sk&Co&Ch&SM&Yth&LR\\
301:   \hline
302:   model parameters: \\
303:   $r_+$ ($=\rho$)& 0.914&      0.934&    0.961&     0.959&     0.801&     0.949&       0.991\\
304:   $r_1$&           0.17&          0.21&    0.13&     0.21&     0.92&     0.67&      0.13\\
305:   $\lambda$&       0.12&      0.082&   0.006&   0.25&     0&            0.001&   0.025\\
306:   $C_0$&           0.37&      0.53&    0.58&     0.064&    0.23&     0.081&     0.16\\
307:   $\beta$&         0.059&      0.012&    0.014&     0.029&     0.034&
308:   0.040&      0.0063\\
309:   \hline
310:   derived quantities: \\
311:   web size (before lumping) $\left<S\right>$& 18.2&       29.0&     31.4&      47.9&      42.7&      122.0&       137.4\\
312:   $\var S/\left<S\right>^2$& 0.64&      0.53&    0.81&     0.51&     0.12&     0.16&      0.81\\
313:   clade size $\left<n\right>$: Eq.~(\ref{species-per-clade})& 4.3&       5.2&     7.6&      7.3&      2.5&      6.3&       23.5\\
314:   number of clades $\left<c\right>$: Eq.~(\ref{clade-in-web}) & 4.2&       5.5&     4.2&      6.6&      17.1&      19.5&       5.8\\
315:   clade lifetime in gen.: $-\ln(1-\rho)$& 2.5&       2.7&     3.2&      3.2&      1.6&      3.0&       4.7\\
316:   clades in diet: Eq.~(\ref{clade-in-diet}), $\Lambda=R$& 2.3 & 3.2& 2.8& 0.7& 4.5 &2.8&  1.5\\
317:   diet breakout: Eq.~(\ref{breakout})& 0.44&      0.16&    0.31&       0.41&     0.12&     0.43&      0.44\\
318: \end{tabular}
319: \caption{Maximum-likelihood model parameters for the speciation model
320:   obtained for seven empirical food webs and quantities derived
321:   thereof.   The abbreviations stand for BB: 
322:   Bridge Brook Lake \citep{havens92:_scale_webs}, Sk:
323:   Skipwith Pond \citep{warren89:_spatial_freshw_web}, Co: Coachella
324:   Desert \citep{polis91:_compl_des_web}, Ch: Chesapeake Bay
325:   \citep{baird89:_chesap_bay}, SM: St.~Martin Island
326:   \citep{goldwasser93:_const_carib_web}, Yth: Ythan Estuary
327:   \citep{hall91:_food_rich_web}, LR: Little Rock Lake 
328:   \citep{martinez91:_artif_attr}. }
329: \label{tab:parameters}
330: \end{table}
331: 
332: In our previous study \citep{rossberg05:_web} the predictions of the
333: speciation model were compared to empirical data, and maximum
334: likelihood fits of the model to empirical data sets for fixed $R=\ln
335: 10^4$, $D=0.0025$, $r_-=1$ were computed.  For brevity we refer to
336: these parameter sets as ``typical values'' hereafter.  For the
337: convenience of the reader the fitted values are listed in
338: Table~\ref{tab:parameters} together with some derived expressions
339: relevant for the calculations below.
340: 
341: 
342: 
343: \section{Basic statistical properties of the model steady state}
344: %\section{Mathematical characterization of the model steady state}
345: \label{sec:analysis}
346: 
347: The number $S$ of species in a food web and the number $L$ of trophic
348: links connecting them are among the simplest quantities used to
349: characterize food webs.  Often $L$ is expressed in terms of the
350: directed connectance $C=L/S^2$ or related quantities.  In what
351: follows, the steady-state distribution of $S$ and the expectation
352: value of $C$ for the speciation model are derived.
353: %
354: For these calculations, it is helpful to imagine the species pool as
355: being divided into clades.  Following
356: \cite{yoshida02:_long_living_fossils,yoshida03:_evolut_web_sys}, a
357: \emph{clade} is here defined as the group of all currently existing
358: descendant species of a \emph{founder species} that entered the
359: species pool through an adaptation process, in close correspondence with
360: the standard phylogenetic notion.  When $D$ is sufficiently small, the
361: speed parameter $s$ is approximately the same for all species in a
362: clade, and the ranges of $s$ covered by different clades do not
363: overlap.  We can then divide the $s$ axis into small intervals
364: $[s,s+\Delta s]$, and account for the number of species in each
365: interval separately.  The absence of overlap between clades is used
366: only as a trick to simplify accounting.  The final results do not
367: depend on this assumption.  The condition that the spread of $s$
368: within clades is small will be made more precise in the detailed
369: discussion of the clades in Section~\ref{sec:clades} below.
370: 
371: \subsection{The steady-state distribution of the species number $S$}
372: \label{sec:S}
373: 
374: In order to obtain the steady-state distribution of the total number
375: of species, consider first only a small interval $[s,s+\Delta s]$ on
376: the speed-parameter axis.
377: %
378: The master equation for the probability distribution $p_n$ of the
379: number $n$ of species in the interval  is given by
380: \begin{align}
381:   \label{probability_balance}
382:   \frac{d p_n}{dt}=&j_{n-1,n}-j_{n,n+1}\\
383:   \intertext{for $n\ge 1$ and}
384:   \label{dp0}
385:   \frac{d p_0}{dt}=&-j_{0,1},
386: \end{align}
387: with the probability current $j_{n,n+1}$, resulting from the
388: balance of processes incrementing and decrementing~$n$, given by
389: \begin{align}
390:   \label{jnnp}
391:   j_{n,n+1}=& e^s
392:   \left[
393:     (n\,r_+ + r_1 \Delta s) p_n - (n+1) r_- p_{n+1} 
394:   \right].
395: \end{align}
396: The possibility of speciations that cross the boundaries of the range
397: $[s,s+\Delta s]$ is ignored here, because the corresponding
398: corrections would cancel out when summing up the $n$ values from
399: different intervals below.  The reflecting boundary conditions at the
400: endpoints of the full $s$-range $[0,R]$ ensure that~(\ref{jnnp}) holds
401: also for the intervals adjacent to the endpoints.
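
As a reading aid, substituting Eq.~(\ref{jnnp}) into
Eq.~(\ref{probability_balance}) gives the master equation for $n\ge 1$
in explicit form (a direct rearrangement, with no additional
assumptions):
\begin{align*}
  \frac{d p_n}{dt}=e^s
  \left\{
    \left[(n-1)\,r_+ + r_1 \Delta s\right] p_{n-1}
    -\left[n\,(r_+ + r_-) + r_1 \Delta s\right] p_n
    +(n+1)\, r_-\, p_{n+1}
  \right\}.
\end{align*}
The first term describes gains from the state $n-1$ by speciation or
adaptation, the second term losses from the state $n$ by any of the
three processes, and the third term gains from the state $n+1$ by
extinction.
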
402: 
403: For the steady state $j_{n,n+1}=0$ one gets 
404: \begin{align}
405:   \label{p1}
406:    p_1=&\frac{r_1}{r_-}\,p_0\,\Delta s\\
407:    \intertext{and for $n\ge 1$ the recursive relation}
408:    \label{pn}
409:    p_{n+1}=&\frac{n\,r_+}{ (n+1) r_-} p_n +\mathcal{O}(\Delta s^2),\\
410:    \intertext{which is solved by}
411:    \label{pn2}
412:    p_{n}=&\frac{1}{n}
413:    \left(
414:      \frac{r_+}{r_-}
415:    \right)^n \frac{r_1 p_0}{r_+} \Delta s + \mathcal{O}(\Delta s^2).
416: \end{align}
417: With the abbreviations $\rho=r_+/r_-$ and $\kappa=r_1/r_+$, the
418: corresponding moment generating function is
419: \begin{align}
420:   \label{momds}
421:   m(z)=
422:   \left<
423:     z^n
424:   \right>=p_0\,
425:   \left[
426:     1- \kappa\,\Delta s\,\ln
427:       \left(
428:       1-\rho z
429:       \right) \right]+\mathcal{O}(\Delta s^2),
430: \end{align}
431: with
432: \begin{align}
433:   \label{p0}
434:   p_0=1+\kappa\, \Delta s\ln(1-\rho)+\mathcal{O}(\Delta s^2)
435: \end{align}
436: given by the normalization condition $m(1)=1$.  From $m(z)$ one
437: obtains the cumulant generating function
438: \begin{align}
439:   \label{kumds}
440:   \begin{split}
441:     k(z)=\ln m(z)=&\ln p_0 - 
442:     \kappa\,\Delta s\ln\left( 1-\rho z \right)+\mathcal{O}(\Delta s^2)\\
443:     =&\kappa\,\Delta s\,
444:       \ln\!\left( \frac{1-\rho}{1-\rho z}
445:     \right)
446:     +\mathcal{O}(\Delta s^2).
447:     \end{split}
448: \end{align}
449: Cumulant generating functions of this form and the corresponding
450: distributions are discussed in Appendix~\ref{sec:general}.  For
451: example, by Eq.~(\ref{meanAB}), the density of species along the
452: speed-parameter line is
453: \begin{align}
454:   \label{density}
455:   \lim_{\Delta s\to 0}\frac{ \left< n \right>}{\Delta s}=\frac{\kappa
456:     \rho}{1-\rho}=\frac{r_1}{r_--r_+}.
457: \end{align}
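
The logarithm in Eq.~(\ref{momds}) results from the elementary series
$\sum_{n\ge 1} x^n/n=-\ln(1-x)$, valid for $|x|<1$, applied with
$x=\rho z$ to Eq.~(\ref{pn2}); convergence at $z=1$ requires $\rho<1$,
i.e., $r_+<r_-$.  As a cross-check that does not rely on the Appendix,
the density~(\ref{density}) can also be obtained by direct summation
of Eq.~(\ref{pn2}), using $p_0=1+\mathcal{O}(\Delta s)$:
\begin{align*}
  \left< n \right>=\sum_{n\ge 1} n\,p_n
  =\kappa\,p_0\,\Delta s\sum_{n\ge 1}\rho^n+\mathcal{O}(\Delta s^2)
  =\frac{\kappa\,\rho}{1-\rho}\,\Delta s+\mathcal{O}(\Delta s^2).
\end{align*}
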
458: 
459: \begin{figure}[tbp]
460:   \centering
461:   \includegraphics[width=\imgwidth,keepaspectratio,clip]{BBHist}
462:   \caption{Typical steady-state distribution of the number of species
463:     $S$.  The solid line is $P(\kappa R,\rho;S)$ as defined by
464:     Eq.~(\ref{ABdist}); the histogram was obtained by direct
465:     simulations.  Parameters correspond to Bridge Brook Lake
466:     (Tab.~\ref{tab:parameters}).}
467:   \label{fig:S-distribution}
468: \end{figure}
469: 
470: The cumulant generating function of the sum of independent random
471: variables is the sum of their cumulant generating functions.  Thus,
472: the cumulant generating function for the total number of species $S$
473: can be obtained by dividing the range $[0,R]$ into small intervals of
474: width $\Delta s$ and summing the contributions.  With $\Delta s\to 0$
475: corrections $\mathcal{O}(\Delta s^2)$ become negligible and the
476: summation goes over into an integration:
477: \begin{align}
478:   \label{kums}
479:   \sum \frac{k(z)}{\Delta s} \Delta s +\mathcal{O}(\Delta s^2) \to
480:   \int_0^R \kappa
481:       \ln\left( \frac{1-\rho}{1-\rho z}
482:     \right)\,ds=\kappa R\ln
483:   \left(
484:     \frac{1-\rho}{1-\rho z}
485:   \right).
486: \end{align}
487: This is again of the general form Eq.~(\ref{generalK}) discussed in
488: Appendix~\ref{sec:general}.  Hence, the steady-state distribution of
489: the species number $S$ is $P(\kappa R,\rho;S)$ as defined by
490: Eq.~(\ref{ABdist}).  Figure~\ref{fig:S-distribution} shows a typical
491: distribution and corresponding simulation results.  The curves agree
492: well.  Only the probability for $S$ near zero seems to be
493: overestimated by the theory.  By Eq.~(\ref{meanAB}), the mean number
494: of species is
495: \begin{align}
496:   \label{meanS}
497:   \left<
498:     S
499:   \right>=\frac{\kappa R \rho}{1-\rho}
500: \end{align}
501: and by Eq.~(\ref{varAB}) the relative variance is $(\var
502: S)/\left<S\right>^2=1/(\kappa R \rho)$. Typical relative variances
503: (Tab.~\ref{tab:parameters}) can be of order unity.  Thus, in
504: the model, $S$ fluctuates strongly on evolutionary time scales.
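
For orientation, the generating function~(\ref{kums}) is that of a
negative binomial distribution.  Assuming that Eq.~(\ref{ABdist}) of
the Appendix has this standard form, one may write explicitly
\begin{align*}
  P(\kappa R,\rho;S)=\frac{\Gamma(\kappa R+S)}{\Gamma(\kappa R)\,S!}\,
  (1-\rho)^{\kappa R}\,\rho^{S}.
\end{align*}
As a worked example, take the Bridge Brook Lake parameters as rounded
in Tab.~\ref{tab:parameters}, with $r_-=1$ and $R=\ln 10^4\approx
9.21$:
\begin{align*}
  \kappa=&\frac{0.17}{0.914}\approx 0.186, &
  \kappa R\approx& 1.71,\\
  \left<S\right>\approx&\frac{1.71\times 0.914}{0.086}\approx 18.2, &
  \frac{\var S}{\left<S\right>^2}\approx&\frac{1}{1.71\times 0.914}
  \approx 0.64,
\end{align*}
in agreement with the tabulated values.  With the same numbers the
probability of an empty species pool is $(1-\rho)^{\kappa R}\approx
0.015$.
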
505: 
506: 
507: 
508: \subsection{Basic properties of clades}
509: \label{sec:clades}
510: 
511: The division of $S$ into clades can be made more explicit.  For
512: example, the distribution of the number $n$ of species in a single
513: clade is given by Eq.~(\ref{pn2}) conditional on $n\ge 1$:
514: \begin{align}
515:   \label{clade-size-distribution}
516:   p_n=-\frac{\rho^n}{n \ln(1-\rho)}.
517: \end{align}
518: Thus, the mean number of species per clade is 
519: \begin{align}
520:   \label{species-per-clade}
521:   \left<
522:     n
523:   \right>=\sum_n n\,p_n=-\frac{\rho}{(1-\rho)\,\ln(1-\rho)}.
524: \end{align}
525: Further, the expectation value of the number of clades $c$ in the food
526: web can be estimated as
527: \begin{align}
528:   \label{clade-in-web}
529:   \left<
530:     c
531:   \right>=\frac{
532:     \left<
533:       S
534:     \right>}{
535:     \left<
536:       n
537:     \right>}=-\kappa\, R \,\ln(1-\rho).
538: \end{align}
539: (An exact calculation yields the same result.)  Since appearances and
540: extinctions of clades are statistically independent, the number of
541: clades is Poisson distributed.  For typical values of $\left< n
542: \right>$ and $\left< c \right>$ see Tab.~\ref{tab:parameters}.
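
The sum in Eq.~(\ref{species-per-clade}) reduces to a geometric
series, since the factor $n$ cancels the $1/n$ of the clade-size
distribution:
\begin{align*}
  \sum_{n\ge 1} n\,p_n=-\frac{1}{\ln(1-\rho)}\sum_{n\ge 1}\rho^n
  =-\frac{\rho}{(1-\rho)\,\ln(1-\rho)}.
\end{align*}
As an illustration with the rounded Bridge Brook Lake values of
Tab.~\ref{tab:parameters} ($\rho=0.914$, $\kappa R\approx 1.71$), one
has $-\ln(1-\rho)=-\ln 0.086\approx 2.45$ and therefore
\begin{align*}
  \left<n\right>\approx&\frac{0.914}{0.086\times 2.45}\approx 4.3, &
  \left<c\right>\approx& 1.71\times 2.45\approx 4.2.
\end{align*}
Because the number of clades is Poisson distributed, the probability
that no clade is present is $\exp(-\left<c\right>)=(1-\rho)^{\kappa
  R}$, which is also the product of the factors $p_0$ of
Eq.~(\ref{p0}) over all intervals in the limit $\Delta s\to 0$.
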
543: 
544: To obtain the average lifetime $\tau_c$ of a clade founded by a
545: species with speed parameter $s$, notice that the probability that a
546: clade exists in the interval $[s,s+\Delta s]$ is $1-p_0$ with $p_0$
547: given by~Eq.~(\ref{p0}).  On the other hand, new clades are founded at
548: a rate $r_1 \sigma\, \Delta s$ with $\sigma=\exp(s)$.  The fraction of
549: time when there is a clade in the interval is thus $\tau_c\, r_1
550: \sigma\, \Delta s$. (Note that in the limit $\Delta s\to 0$ there is no overlap
551: in the clade lifetimes.)  Thus
552: \begin{align}
553:   \label{tau-clade}
554:   \tau_c=\lim_{\Delta s\to 0} \,\frac{1-p_0}{r_1\sigma\Delta
555:     s}=-\frac{\ln(1-\rho)}{r_+\sigma}.
556: \end{align}
557: The time that it takes for the system to reach the steady state can be
558: estimated by the lifetime of the slowest clade, i.e., by
559: Eq.~(\ref{tau-clade}) with $\sigma=\exp(0)=1$.  This quantity is
560: important for model simulations.  For a detailed discussion of the
561: dynamics of the birth/death process relevant here, including the clade
562: lifetime distribution, see the book of \cite{bailey64:_stoch_proces}.
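
The limit in Eq.~(\ref{tau-clade}) follows from Eq.~(\ref{p0}) in two
steps, spelled out here for convenience:
\begin{align*}
  1-p_0=-\kappa\,\Delta s\,\ln(1-\rho)+\mathcal{O}(\Delta s^2),
  \qquad
  \frac{\kappa}{r_1}=\frac{1}{r_+}.
\end{align*}
As a rough illustration with the rounded Bridge Brook Lake values
($r_+=0.914$, $r_-=1$), the slowest clades ($\sigma=1$) have
\begin{align*}
  \tau_c\approx\frac{2.45}{0.914}\approx 2.7
\end{align*}
in units of $1/r_-$, whereas for the fastest clades
($\sigma=e^R=10^4$) the lifetime is shorter by a factor $10^4$.
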
563: 
564: The typical number of evolutionary ``generations'' that a clade exists
565: is $\tau_c/\text{(generation time)}=\tau_c r_+ \sigma=-\ln(1-\rho)$
566: (see Tab.~\ref{tab:parameters} for typical values).  Since in each
567: generation the variance of the distribution of $s$ over a clade
568: increases by $D$, the width of a clade on the speed-parameter
569: line is of the order
570: \begin{align}
571:   \label{cladwidth}
572:   \std s\approx\sqrt{- D \ln(1-\rho)}.
573: \end{align}
574: The assumption made above that all
575: members of a clade have approximately the same $s$ is justified when
576: $\std s\ll1$.
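
As a numerical illustration of Eq.~(\ref{cladwidth}), using $D=0.0025$
and the rounded values of $\rho$ from Tab.~\ref{tab:parameters}:
\begin{align*}
  \text{Bridge Brook Lake } (\rho=0.914):\quad
  &\std s\approx\sqrt{0.0025\times 2.45}\approx 0.078,\\
  \text{Little Rock Lake } (\rho=0.991):\quad
  &\std s\approx\sqrt{0.0025\times 4.71}\approx 0.11.
\end{align*}
In both cases the clade width is small compared with $1$ and with the
full range $R\approx 9.21$.
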
577: 
578: \subsection{The expected directed connectance}
579: \label{sec:links}
580: 
581: A food-web property that has received much attention in both empirical
582: and theoretical research is the connectance, for example measured in
583: terms of the directed connectance $C=L/S^2$
584: \citep{martinez91:_artif_attr} with $L$ denoting the total number of
585: trophic links.  To compute the expectation value of this quantity,
586: note that from all $S^2$ topologically possible links only some are
587: allometrically possible in the model, namely those from consumers $i$
588: to their possible resources $h$ with $s_h > s_i-\lambda R$ (see
589: Sec.~\ref{sec:speciation_model}).  A fraction $(1-\lambda)^2/2$ of the
590: $s_h$-$s_i$ plane is forbidden.  By construction, exactly a fraction
591: $C_0$ of all allometrically possible links is realized on the average
592: in the model.  Thus, as a simple estimate one gets $S^2
593: [1-(1-\lambda)^2/2]=S^2 (1+2 \lambda-\lambda^2)/2$ allometrically
594: possible links and
595: \begin{align}
596:   \label{simple-C}
597:   C\approx C_0 (1+2\lambda-\lambda^2)/2.
598: \end{align}
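
The forbidden fraction used in this estimate can be obtained
geometrically.  The following sketch assumes that species are spread
uniformly along the $s$ axis, as suggested by the $s$-independent
density~(\ref{density}), and ignores correlations; the rescaled
coordinates $x=s_h/R$ and $y=s_i/R$ are introduced here only for
illustration.  In the unit square the forbidden region $x\le
y-\lambda$ is a right triangle with legs of length $1-\lambda$, so
that its area is
\begin{align*}
  \int_\lambda^1 (y-\lambda)\,dy=\frac{(1-\lambda)^2}{2}.
\end{align*}
In the limiting cases one finds $C\approx C_0/2$ for $\lambda=0$, as
in the cascade model for large $S$, and $C\approx C_0$ for
$\lambda=1$, where no link is allometrically forbidden.
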
599: %
600: The exact value differs due to subtle correlations stemming from
601: intra-clade links.  As an example, we derive $C$ for the case that the
602: typical intra-clade spread of $s$ given by Eq.~(\ref{cladwidth}) is
603: much smaller than $\lambda R$, so that all intra-clade links are
604: allometrically possible.  As in Sec.~\ref{sec:S}, we divide the $s$ axis
605: into small intervals of width $\Delta s$, and again proceed as if each
606: clade were located in its own interval.  Let the $p$-th interval range
607: from $s_p$ to $s_p+\Delta s=s_{p+1}$ and denote the number of species
608: it contains by $n_p$.  We first compute the expected number of
609: allometrically possible links conditional on fixed $S$
610: \begin{align}
611:   \label{L}
612:   \left<
613:     L_\text{al}|S
614:   \right>=\mathop{\sum_{p,q}}_{s_p>s_q-\lambda R}
615:   \left<
616:     n_p\,n_q|S
617:   \right>=\mathop{\sum_{p\ne q}}_{s_p>s_q-\lambda R}
618:   \left<
619:     n_p\,n_q|S
620:   \right>+\sum_{p}
621:   \left<
622:     n_p^2|S
623:   \right>.
624: \end{align}
625: Consider the last term first.  The distribution $p_n$ of $n_p$ is
626: given by Eqs.~(\ref{pn2},\ref{p0}).  Since clades appear and disappear
627: independently, the probability that there are $S-n_p$ species outside
628: the $p$-th interval is, just as for the total number of species,
629: $P(\kappa R,\rho;S-n_p)$, defined by Eq.~(\ref{gen-dist}) to lowest
630: order in $\Delta s$.  The probability for a particular pair $(n_p,S)$
631: is therefore $p_{n_p}\,P(\kappa R,\rho;S-n_p)$.  This can be used to
632: calculate the probability $p(n_p|S)$ of $n_p$ conditional on $S$ in
633: the usual way, giving
634: \begin{align}
635:   \label{np2}
636:   \left<
637:     n_p^2|S
638:   \right>
639:   =\sum_{n=0}^S n^2\,p(n|S)
640:   =\sum_{n=0}^S n^2\, \frac{p_n P(\kappa
641:       R,\rho;S-n)}{P(\kappa R,\rho;S)}
642:   =
643:   \frac{S\,(\kappa
644:     R+S)
645:   }{R\,(1+\kappa R)} \Delta s+\mathcal{O}(\Delta s^2).
646: \end{align}
647: The dependence on $\rho$ drops out.  By a similar argument one obtains
648: to lowest order in $\Delta s$
649: \begin{align}
650:   \label{npnq}
651:   \left<
652:     n_p n_q|S
653:   \right>
654:   =\sum_{m+n\le S} n\,m\,p(m,n|S)
655:   =
656:   \frac{\kappa (S-1) S}{R\,(1+\kappa R)} \Delta s^2.
657: \end{align}
658: Inserting both results into~(\ref{L}) and taking the limit $\Delta
659: s\to 0$ yields
660: \begin{align}
661:   \label{L2}
662:   \left<
663:     L_\text{al}|S
664:   \right>=&S\,\frac{S+\kappa R\,
665:     \left[
666:     1+\frac{1}{2}(1+2\lambda-\lambda^2)\,(S-1)
667:     \right]}{1+\kappa R}\\
668:   \label{L2approx}
669:   = &S^2 \,\left[ \frac{1+\kappa
670:     R\frac{1}{2}(1+2\lambda-\lambda^2)}{1+\kappa R}+\mathcal{O}
671:   \left(
672:   \frac{\kappa R}{S}
673:   \right)\right].
674: \end{align}
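As a consistency check, the coefficients in Eq.~(\ref{L2}) can be
traced back to Eqs.~(\ref{np2}) and~(\ref{npnq}) by counting interval
pairs.  The following is only a sketch: it assumes that the
$R/\Delta s$ intervals tile the range $0\le s\le R$ uniformly and that
$0\le\lambda\le 1$; the symbol $A_\lambda$ is introduced here for
illustration and is not used elsewhere.  The region
$s_p>s_q-\lambda R$ of the square $[0,R]^2$ is the complement of a
right triangle with legs of length $(1-\lambda)R$, so that
\begin{align*}
  A_\lambda&=R^2-\tfrac{1}{2}(1-\lambda)^2R^2
  =\tfrac{1}{2}\,(1+2\lambda-\lambda^2)\,R^2,\\
  \mathop{\sum_{p\ne q}}_{s_p>s_q-\lambda R}
  \left<n_p\,n_q|S\right>
  &\to\frac{A_\lambda}{\Delta s^2}\,
  \frac{\kappa\,(S-1)\,S}{R\,(1+\kappa R)}\,\Delta s^2
  =\frac{\kappa R\,\frac{1}{2}(1+2\lambda-\lambda^2)\,(S-1)\,S}{1+\kappa R},\\
  \sum_p\left<n_p^2|S\right>
  &\to\frac{R}{\Delta s}\,
  \frac{S\,(\kappa R+S)}{R\,(1+\kappa R)}\,\Delta s
  =\frac{S\,(\kappa R+S)}{1+\kappa R}.
\end{align*}
The sum of the last two lines is Eq.~(\ref{L2}).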
675: Expression~(\ref{L2approx}) is often a good approximation
676: of~(\ref{L2}).  The expected directed connectance conditional on $S$
677: is $ \left< C|S \right>=C_0 \left< L_\text{al}|S \right>/S^2$.
678: Dropping the undefined case $S=0$, the expected connectance for
679: freely fluctuating $S$ can be evaluated as
680: \begin{align}
681:   \label{C}
682:   \left<C\right> = C_0 
683:   \left[
684:     1-P(\kappa R,\rho;0)
685:   \right]^{-1}\,
686:   \sum_{S=1}^\infty \frac{
687:     \left<
688:       L_\text{al}|S
689:     \right>}{S^2}\,
690:   P(\kappa R,\rho;S),
691: \end{align}
692: either directly numerically or, for a (complicated) closed-form
693: expression, with the help of symbolic algebra software.  For the
694: parameters of Bridge Brook Lake (Tab.~\ref{tab:parameters}), for which
695: $\lambda R/\std s=14.7$, Eq.~(\ref{C}) yields $\left<C\right>=0.294$
696: while the simple estimate~(\ref{simple-C}) gives
697: $\left<C\right>=0.230$.  Simulations yield $\left<C\right>=0.286$.
698: The case in which $\lambda R>0$ is much smaller than the typical
699: intra-clade spread of $s$ and the case $\lambda=0$ (no cannibalism) can
700: be handled by replacing $n^2$ in Eq.~(\ref{np2}) by $n(n+1)/2$ or
701: $n(n-1)/2$, respectively.  For both cases the approximation
702: $\left<C|S\right>= \frac{1}{2}\,C_0 [1+\kappa R (1+2
703: \lambda-\lambda^2)]/(1+\kappa R)+\mathcal{O}(S^{-1})$ holds.  For the
704: parameters of St.\ Martin Island ($\lambda=0$) this yields
705: $\left<C\right>=0.115$, while numerically $\left<C\right>=0.112$ is
706: obtained.
707: 
708: \section{The distributions of generality and vulnerability}
709: \label{sec:distributions}
710: 
711: In this Section, analytic approximations for the distributions of
712: generality $k$ (the number of resources of a consumer) and
713: vulnerability $m$ (the number of consumers of a resource) are derived.
714: When defining the direction of trophic links in the standard way from
715: the resource to the consumer, these are the distributions of the
716: in-degree and the out-degree of the food web, respectively.  Degree
717: distributions are often thought to belong to the major determinants of
718: the overall network topology.  Due to the inherent randomness of food
719: webs and their finite size, instances of degree distributions of
720: empirical or model webs are also random quantities. Nevertheless, they
721: contain information regarding the probability distributions of
722: generality $P_\text{gen}(k)$ and vulnerability $P_\text{vul}(m)$ in
723: the steady state.  Specifically, if $N(k)$ denotes the number of
724: species with generality $k$ in a web and the total number of species
725: is $S$, then $\left< N(k)/S \right>=P_\text{gen}(k)$ in the steady
726: state.  While this is trivial for fixed $S$, it is worth noting that
727: this relation is valid also when the value of $S$ fluctuates randomly
728: and when the generalities of individual species are strongly
729: correlated with each other and with $S$, as can be seen by a
730: straightforward calculation.
731: %
732: Below it is shown that the conditional probability
733: $P_\text{gen}(k|S)$, i.e. the conditional expectation value
734: $\left<N(k)/S|S \right>$, does in fact strongly depend on $S$.  For a
735: comparison with single instances of empirical distributions $N(k)/S$
736: the conditional distribution $P_\text{gen}(k|S)$ is therefore better
737: suited than $P_\text{gen}(k)$.  Similar considerations hold for the
738: vulnerabilities.  Thus, the conditional distributions are computed
739: below.
740: 
741: Following \cite{camacho02:_analytic_food_webs}, we consider the
742: distinguished limit of large food-web sizes $S$ and small
743: connectances $C$ while keeping the link density $Z:=L/S=CS$ fixed.
744: (Fixing $Z$ for asymptotic expansions is not meant to suggest that $Z$
745: is actually fixed for large food webs.)  For simplicity, we make use
746: of the hypothesis that resources typically evolve faster than their
747: consumers in the extreme form that resources evolve \emph{much} faster
748: than their consumers.  This corresponds to assuming a large spread of
749: time scales $R$ and a small loopiness $\lambda$.  Errors due to
750: intra-clade trophic links, which violate this hierarchy of timescales,
751: are small when the total number of clades~(\ref{clade-in-web}) is
752: large, due either to large $\kappa R$ or to small $1-\rho$.  We note
753: that in the case $\kappa R\gg 1$ the combined effect of these
754: assumptions would reduce the formula for the directed connectance
755: derived above to $\left<C\right>=C_0/2$, which shows that the
756: approximations employed here are much coarser than those used in the
757: foregoing Sections.  Nevertheless, they retain the main effects that
758: determine the general forms of the degree distributions.
759: 
760: 
761: \subsection{Reduction to the dynamics of the actual resources}
762: \label{sec:actual-resources}
763: 
764: When most resource species evolve much faster than their consumers,
765: the distribution of generality for a given consumer can be
766: approximated by the steady-state generality distribution with the
767: consumer assumed fixed while its resources evolve.
768: %
769: We first show that, using a simple mean-field-type approximation, the
770: stochastic dynamics of the actual resources of the fixed consumer can
771: be separated from the dynamics of the possible resources which are not
772: actual resources (called \emph{spurned resources} below) in a
773: self-consistent way.
774: 
775: To derive the dynamics of the actual resources, consider a small
776: interval $[s,s+ds]$ in the range of possible resources.  Let
777: $\sigma=\exp(s)$.  The rate at which actual resource species in the
778: interval speciate in such a way that the descendant species remain actual
779: resources is $r_+^*\,\sigma$ with
780: \begin{align}
781:   \label{rpB}
782:   r_+^*=(1-\beta) r_+ + \beta\,C_0 r_+.
783: \end{align}
784: The first term accounts for trophic links that do not break in the
785: speciation, and the second term for trophic links that break but are
786: immediately reconnected.  The probability that a resource species
787: becomes extinct in a time interval of length $dt$ is simply $r_-^*\,
788: \sigma \,dt$ with
789: \begin{align}
790:   \label{rmB}
791:   r_-^*=r_-.
792: \end{align}
793: %
794: Finally, the consumer can acquire a novel resource species either by
795: an adaptation of a new species to the habitat or by a speciation of a
796: spurned resource in such a way that the descendant species becomes an
797: actual resource.  For the rate at which the latter event occurs, a
798: mean-field type approximation is employed: The number of spurned
799: resources $n^\circ$ in the speed-parameter range $[s,s+\Delta s]$ is
800: approximated by its expectation value $ \left< n^\circ \right>$.  The
801: rate at which a predator acquires novel resources (that did not
802: speciate from an existing resource species) in this range is then
803: given by $C_0 r_1^* \,\sigma\, \Delta s$ with
804: \begin{align}
805:   \label{r1B}
806:   r_1^*=r_1+  \frac{\beta\,
807:   \left<
808:     n^\circ
809:   \right> r_+}{\Delta s}.
810: \end{align}
811: The first term represents new adaptations, the second term mutations of
812: spurned species.  With this approximation, the expectation value for
813: the number $n^*$ of actual resources in the range $[s,s+\Delta s]$ can
814: be calculated as
815: \begin{align}
816:   \label{nB}
817:   \left<
818:     n^*
819:   \right>=\frac{C_0\,r_1^*}{r_-^*-r_+^*}\, \Delta s
820: \end{align}
821: by methods analogous to those used in Section~\ref{sec:S}.  Deviations
822: from this mean-field approximation occur because the expectation value
823: $\left< n^\circ \right>$ is correlated to $n^*$ by the breaking of
824: actual links, which occurs at a rate $\mathcal{O}(\beta)$.  Since the
825: contribution of $\left< n^\circ \right>$ to the dynamics of $n^*$ is
826: also of order $\mathcal{O}(\beta)$, the resulting error in the
827: distribution of $n^*$ is $\mathcal{O}(\beta^2)$.
828: 
829: For the dynamics of the number of spurned resources, a set of equations
830: corresponding to Eqs.~(\ref{rpB}--\ref{nB}) can be set up by replacing
831: $C_0\to 1-C_0$ and interchanging the indices $*$ and
832: $\circ$.  These equations can be used to
833: eliminate $\left< n^\circ \right>$ from Eq.~(\ref{r1B}), yielding
834: \begin{align}
835:   \label{r1Bnew}
836:   r_1^*=r_1+\frac{\beta\,(1-C_0)\, r_+}{r_- - r_+}\,r_1.
837: \end{align}
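The elimination can be spelled out as follows.  This is a sketch
under the assumption that the mirrored equations for the spurned
resources, obtained from the substitution rule just stated, read
$r_+^\circ=(1-\beta)\,r_++\beta\,(1-C_0)\,r_+$, $r_-^\circ=r_-$,
$r_1^\circ=r_1+\beta\left<n^*\right>r_+/\Delta s$, and
$\left<n^\circ\right>=(1-C_0)\,r_1^\circ\,\Delta s/(r_-^\circ-r_+^\circ)$.
Together with Eqs.~(\ref{rpB}--\ref{nB}) these form a linear system
for $\left<n^*\right>$ and $\left<n^\circ\right>$, which is solved by
\begin{align*}
  \left<n^*\right>=\frac{C_0\,r_1}{r_--r_+}\,\Delta s,
  \qquad
  \left<n^\circ\right>=\frac{(1-C_0)\,r_1}{r_--r_+}\,\Delta s.
\end{align*}
Indeed, since $r_-^*-r_+^*=r_--r_++\beta\,(1-C_0)\,r_+$,
\begin{align*}
  \left<n^*\right>\frac{r_-^*-r_+^*}{\Delta s}
  =C_0\left[r_1+\frac{\beta\,(1-C_0)\,r_+}{r_--r_+}\,r_1\right]
  =C_0\left[r_1+\frac{\beta\left<n^\circ\right>r_+}{\Delta s}\right]
  =C_0\,r_1^*,
\end{align*}
and the equation for $\left<n^\circ\right>$ is verified in the same
way with $C_0$ and $1-C_0$ interchanged.  Inserting
$\left<n^\circ\right>$ into Eq.~(\ref{r1B}) gives
Eq.~(\ref{r1Bnew}).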
838: 
839: \subsection{The generality distribution for fluctuating $S$}
840: \label{sec:generality-free-S}
841: 
842: Analogous to the calculations of Section~\ref{sec:S}, the cumulant
843: generating function for the number of actual resources for a species
844: with speed parameter $s$ can now be obtained as
845: \begin{align}
846:   \label{cumulantk}
847:   K_\text{gen}(s,z)=C_0 \,\kappa^* \Lambda(s) \ln \left( \frac{1-\rho^*}{1-\rho^*
848:       z} \right),
849: \end{align}
850: where $\rho^*=r_+^*/r_-^*$ and $\kappa^*=r_1^*/r_+^*$ are given by
851: Eqs.~(\ref{rpB},\ref{rmB},\ref{r1Bnew}), and
852: $\Lambda(s)=\min[(1+\lambda)\,R-s,R]\approx R-s$ is the size of the
853: speed-parameter range of possible resources.  The corresponding
854: distribution function is
855: \begin{align}
856:   \label{Pgen}
857:   P_\text{gen}(s,k)=P(C_0\,\kappa^* \Lambda(s),\rho^*;k)
858: \end{align}
859: as defined by Eq.~(\ref{ABdist}).  In particular, the expected number of a
860: consumer's resources is
861: \begin{align}
862:   \label{averagem}
863:   \left< k \right>=\frac{C_0\,\kappa^* \Lambda(s) \rho^*}{1-\rho^*}=C_0
864:   \Lambda(s)\,\frac{r_1}{r_--r_+}.
865: \end{align}
866: Comparison with Eq.~(\ref{density}) shows that the mean-field
867: approximation preserves the model property that, on the average, a
868: fraction $C_0$ of all allometrically possible links is realized.  Just
869: as for the overall species pool, the diet of a consumer can be divided
870: into several clades, each descending from a single newly acquired
871: resource.  For example, the expected number of resource clades for a
872: consumer is
873: \begin{align}
874:   \label{clade-in-diet}
875:   -C_0\kappa^* \Lambda(s) \ln (1-\rho^*),
876: \end{align}
877: in analogy to Eq.~(\ref{clade-in-web}).  
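The second equality in Eq.~(\ref{averagem}) can be checked directly
from the definitions; the following short calculation is included
only as an illustration.  With $\rho^*=r_+^*/r_-^*$ and
$\kappa^*=r_1^*/r_+^*$,
\begin{align*}
  \frac{\kappa^*\rho^*}{1-\rho^*}
  =\frac{r_1^*}{r_-^*-r_+^*}
  =\frac{r_1\,\left[r_--r_++\beta\,(1-C_0)\,r_+\right]/(r_--r_+)}
  {r_--r_++\beta\,(1-C_0)\,r_+}
  =\frac{r_1}{r_--r_+},
\end{align*}
where Eqs.~(\ref{rpB}), (\ref{rmB}), and (\ref{r1Bnew}) were used.
The dependence on $\beta$ therefore cancels in $\left<k\right>$,
although it remains in $\kappa^*$ and $\rho^*$ separately.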
878: 
879: \begin{figure}[tbp]
880:   \begin{center}
881:     \includegraphics[width=\imgwidth,keepaspectratio,clip]{fluctuating-S-dists}
882:   \end{center}
883:   \caption{Steady-state generality distributions for fluctuating
884:     species number, calculated from Eq.~(\ref{Mgen-overall}) with
885:     $\beta=0$ (solid) and $\beta=0.05$ (dashed) in comparison with
886:     direct numerical simulations (circles, squares).  The other
887:     parameters were $R=\ln 10^{20}$, $D=0.005$, $\rho=0.95$,
888:     $\kappa=10/R$, $C_0=0.1$, $\lambda=10^{-3}$ (no fitting).  The
889:     inset shows the same data on a double-logarithmic scale.}
890:   \label{fig:fluctuating-S-dists}
891: \end{figure}
892: 
893: Since, on the average, species are homogeneously distributed along
894: $s$, the probability distribution $P_\text{gen}(k)$ of the generality
895: of a species chosen arbitrarily from a food web is, to a good
896: approximation, the average of $P_\text{gen}(s,k)$ over $s$.
897: Analytically, this average is more easily calculated in terms of the
898: moment generating function $M_\text{gen}(s,z)=\sum_k
899: P_\text{gen}(s,k)\,z^k=\exp\,K_\text{gen}(s,z)$.  For the simple case
900: $\lambda=0$ one obtains
901: \begin{align}
902:   \label{Mgen-overall}
903:   M_\text{gen}(z)=\frac{1}{R}\int_0^R M_\text{gen}(s,z)\,ds=
904:   \frac{u-1}{\log u}\quad\text{with } u=
905:     \left(
906:       \frac{1-\rho^*}{1-\rho^*\,z}
907:     \right)^{\textstyle C_0 \kappa^* R}.
908: \end{align}
909: The generality distribution $P_\text{gen}(k)$ itself can be calculated
910: by a Taylor expansion of $M_\text{gen}(z)$ in $z$ or numerically from
911: the Fourier transformation of
912: $\mathop{\mathrm{Re}}\{M_\text{gen}(e^{i\phi})\}$.  A comparison with
913: direct numerical simulations shows that the condition that $R$ is
914: large is important for the numerical validity of
915: Eq.~(\ref{Mgen-overall}).  For example,
916: Fig.~\ref{fig:fluctuating-S-dists} shows analytic and numerical
917: results for $R=\ln 10^{20}$ in good agreement.
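For reference, the integral in Eq.~(\ref{Mgen-overall}) can be
evaluated in one step.  This sketch uses $\Lambda(s)=R-s$, which
holds for $\lambda=0$, and the substitution $x=1-s/R$:
\begin{align*}
  M_\text{gen}(s,z)=u^{1-s/R}
  \quad\Rightarrow\quad
  \frac{1}{R}\int_0^R u^{1-s/R}\,ds=\int_0^1 u^x\,dx
  =\frac{u-1}{\log u}.
\end{align*}
Two simple consequences can serve as checks.  Since $u=1$ at $z=1$,
the limit $(u-1)/\log u\to1$ confirms normalization, and
differentiating $\int_0^1 u^x\,dx$ with respect to $z$ at $z=1$ gives
\begin{align*}
  \left<k\right>=\frac{1}{2}\,\frac{C_0\,\kappa^*R\,\rho^*}{1-\rho^*},
\end{align*}
which is the average of Eq.~(\ref{averagem}) over $0\le s\le R$.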
918: 
919: 
920: 
921: \subsection{The generality distribution for fixed $S$}
922: \label{sec:generality-fixed-S}
923: 
924: In order to compute the generality distribution $P_\text{gen}(k|S)$
925: conditional on fixed $S$, we start again from the distribution
926: $P_\text{gen}(s,k|S)$ for a consumer with speed parameter $s$.  In
927: order to simplify the calculations $\beta=0$ is assumed here.  Then
928: $\kappa^*=\kappa^\circ=\kappa$ and $\rho^*=\rho^\circ=\rho$.
929: %% This quantity is usually small,
930: %% and we have no reason to believe that the limit $\beta \to 0$ is
931: %% singular\footnote{The parameter was introduce to control the trophic
932: %%   similarity between species \cite{jaccard08:_nouvel_florale}, which
933: %%   does not play a role here.}.  The mean-field approximation
934: %% introduced in Section~\ref{sec:generality-distribution} does the
935: %% become exact and its parameters simplify to $r_1^*=C_0 r_1$,
936: %% $r_1^\circ=(1-C_0)r_1$, and $r_+^*=r_+^\circ=r_+$, so that
937: %% $\kappa^*=C_0 \kappa$, $\kappa^\circ=(1-C_0) \kappa$, and
938: %% $\rho^*=\rho^\circ=\rho$.
939: 
940: For a given consumer, the species pool can be divided into three
941: subsets: (i) the actual resources of the consumer, (ii) the
942: allometrically possible but spurned resources, and (iii) the
943: allometrically forbidden resources (see Sec.~\ref{sec:links}).  For
944: small enough $D$, each clade is located in a single subset, and the
945: species distributions in the three subsets become independent.  We
946: first calculate the probability distribution for the number of species
947: in the union of the sets (ii) and (iii) for freely fluctuating $S$.
948: As above, denote the width of the range of allometrically possible
949: resources on the $s$ axis by $\Lambda$.  The distribution of the
950: species number in set (ii) can be obtained from Eq.~(\ref{cumulantk})
951: by substituting $C_0 \to 1-C_0$ and is therefore given by
952: $P((1-C_0)\kappa \Lambda,\rho;n)$ as defined in Eq.~(\ref{ABdist}).
953: The distribution of the number of species in (iii) can be obtained in
954: the same way as the distribution of the total number of species
955: (Sec.~\ref{sec:S}), just that the relevant range of $s$ is now
956: $R-\Lambda$, and not $R$.  Hence this distribution is given by
957: $P(\kappa \,(R-\Lambda),\rho;n)$.  The distribution of the number of
958: species in the union of these two sets is given by the convolution
959: \begin{align}
960:   \label{uni-dist}
961:   P_\text{union}(n)=P((1-C_0) \kappa
962:   \Lambda,\rho;n)*P(\kappa \,(R-\Lambda),\rho;n) =
963:   P(\kappa\,(R-C_0\Lambda),\rho;n).
964: \end{align}
965: The second equality is easily verified by comparing the corresponding
966: cumulant generating functions~(\ref{generalK}).
967: 
968: The number of species in set (i) is given by
969: $P_\text{gen}(k)=P(C_0\kappa\Lambda,\rho;k)$ as defined above.  Using the
970: known distribution $P(\kappa R,\rho;S)$ for $S$, the conditional
971: distribution of generality can be obtained as
972: \begin{align}
973:   \label{gen-dist}
974:   \begin{split}
975:     P_\text{gen}(k|S) =&\, \frac{P_\text{gen}(k) \,
976:       P_\text{union}(S-k) }{ P(\kappa R,\rho;S) }\\
977:     =&\, \frac{
978:       \Gamma(C_0 \kappa \Lambda + k)\,
979:       \Gamma(\kappa R)\,
980:       \Gamma(1 + S)\,
981:       \Gamma( \kappa\,(R-C_0\Lambda)-k+S)
982:    }{
983:      \Gamma(C_0\kappa\Lambda)\,
984:      \Gamma(1 + k)\, 
985:      \Gamma(\kappa\,(R-C_0 \Lambda))\,
986:      \Gamma(1 - k + S)\,
987:      \Gamma(\kappa  R + S)}.
988: %% pGeneralityGivenS[n_, S_] = (Gamma[C*kappa*L + n]*Gamma[kappa*R]*Gamma[1 + S]*
989: %%      Gamma[-(C*kappa*L) - n + kappa*R + S])/(Gamma[C*kappa*L]*Gamma[1 + n]*
990: %%      Gamma[kappa*(-(C*L) + R)]*Gamma[1 - n + S]*Gamma[kappa*R + S])
991: \end{split}
992: \end{align}
993: Remarkably, just as the conditional expectations~(\ref{np2}) and
994: (\ref{npnq}), this result is independent of $\rho$.  The parameter
995: $S$ plays a similar role instead (see below).
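The cancellation of $\rho$ can be made explicit.  The sketch below
assumes that $P(A,B;n)$ in Eq.~(\ref{ABdist}) has the
negative-binomial form
$P(A,B;n)=\Gamma(A+n)\,(1-B)^A\,B^n/[\Gamma(A)\,\Gamma(1+n)]$, which
is the distribution generated by $[(1-B)/(1-B z)]^A$; the
abbreviations $A_1$ and $A_2$ are introduced here for illustration
only.  With $A_1=C_0\kappa\Lambda$ and
$A_2=\kappa\,(R-C_0\Lambda)$, so that $A_1+A_2=\kappa R$, the
$\rho$-dependent factors in the first line of Eq.~(\ref{gen-dist})
combine to
\begin{align*}
  \frac{(1-\rho)^{A_1}\rho^{k}\,(1-\rho)^{A_2}\rho^{S-k}}
  {(1-\rho)^{A_1+A_2}\rho^{S}}=1,
\end{align*}
and the remaining Gamma functions are those in the second line.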
996: Equation~(\ref{gen-dist}) is now evaluated for large $S$.
997: Specifically, we assume (i) $S\gg\kappa R$, which is natural when $S$
998: is of the order of its expectation value $\kappa R \rho/(1-\rho)$ and
999: $1-\rho\ll 1$, (ii) $S \gg 1$, (iii) we restrict ourselves to values
1000: of $k\ll S$, and (iv) in order to take the distinguished limit of
1001: fixed link density, we set $C_0=Z_0/S$ with fixed $Z_0$.
1002: %
1003: Expanding the logarithm of Eq.~(\ref{gen-dist}) for large $S$ (e.g.,
1004: using Stirling's formula) then gives
1005: \begin{align}
1006:   \label{large-S-expansion}
1007:   \ln P_\text{gen}(k|S)=&\ln \frac{\kappa \Lambda Z_0}{k\,S}+
1008: %  \\  &
1009:   \frac{
1010:     -(\kappa R-1)\,k+\kappa \Lambda Z_0\,
1011:     \left[
1012:       \gamma+\psi_0(\kappa R)+\psi_0(k) -\ln S
1013:     \right]
1014:   }{S} + \cdots,
1015: \end{align}
1016: where $\gamma\approx 0.57$ is the Euler constant and
1017: $\psi_0(x)=(d/dx)\ln\Gamma(x)$ the digamma function. 
1018: 
1019: A similar expansion can be obtained for a distribution of the form
1020: $P(C_0 \kappa \Lambda,\tilde \rho;k)$ given by Eq.~(\ref{ABdist}),
1021: when the parameter $\tilde \rho$ is assumed to behave such that
1022: $S=b/(1-\tilde \rho)$ with fixed $b$ for large $S$, which is natural
1023: in view of $ \left< S \right>\sim 1/(1-\rho)$.  One obtains
1024: \begin{align}
1025:   \label{Pgen-large-S}
1026:   \ln P(C_0 \kappa \Lambda,\tilde \rho;k)=\ln \frac{\kappa \Lambda
1027:   Z_0}{k\,S}+
1028: %  \\  &
1029: \frac{
1030:     - b\,k+\kappa \Lambda Z_0\,
1031:     \left[\gamma+\ln b+
1032:       \psi_0(k)-\ln S
1033:     \right]
1034:   }{S} + \cdots.
1035: \end{align}
1036: A comparison of the two expansions shows that 
1037: \begin{align}
1038:   \label{compare-dists}
1039:   P_\text{gen}(s,k|S)\approx \mathcal{N}\, P(C_0 \kappa
1040:   \Lambda(s),\tilde \rho;k),
1041: \end{align}
1042: where
1043: \begin{align}
1044:   \label{normalizator}
1045:   \mathcal{N}=\mathcal{N}(s)=\exp
1046:   \left\{
1047:     C_0\kappa \Lambda(s)\,
1048:     \left[
1049:       \psi_0(\kappa R)-\ln (\kappa R-1)
1050:     \right]
1051:   \right\}
1052: \end{align}
1053: and
1054: \begin{align}
1055:   \label{tilde-rho}
1056:   \tilde\rho=1-\frac{\kappa R-1}{S}.
1057: \end{align}
1058: Hence, apart from the new parameters $\mathcal{N}$ and $\tilde \rho$,
1059: the form of the generality distribution for fixed $S$ is approximately
1060: the same as for fluctuating $S$.
1061: 
1062: The additional normalization factor $\mathcal{N}$ enters because $k$
1063: can never exceed $S$, while $P(C_0 \kappa \Lambda,\tilde \rho;k)$ is
1064: nonzero for all $k$.  When the expected number of resources is much
1065: smaller than $S$, i.e., for small connectances $C_0$, the value of
1066: $\mathcal{N}$ approaches 1.  This can be seen by noting that
1067: $\psi_0(x)-\ln (x-1)= 1/(2x)+\mathcal{O}(x^{-2})$, so that we can
1068: write $\mathcal{N}=\exp[\tilde C \Lambda/(2 R)]$ with $\tilde C\approx
1069: C_0$.
1070: %
1071: The dependence on $S$ is fully contained in the new parameter $\tilde
1072: \rho$.  Its relation to $\rho$ can be understood by substituting $S$
1073: in Eq.~(\ref{tilde-rho}) by $\left<S\right>=\kappa R\rho/(1-\rho)$, which
1074: leads to
1075: \begin{align}
1076:   \label{tilde-rho2}
1077:   1-\tilde\rho=\frac{\kappa R-1}{\kappa R\rho}\,(1-\rho)\approx (1-\rho).
1078: \end{align}
1079: %
1080: Of course, the foregoing interpretation of Eq.~(\ref{compare-dists})
1081: makes sense only when $\kappa R>1$.  Yet, Eq.~(\ref{compare-dists}) is
1082: numerically valid also when continued analytically to the region
1083: $0<\kappa R \le 1$ where $\tilde \rho \ge 1$.
1084: 
1085: In Section~\ref{sec:actual-resources} it was shown that the effect of
1086: a small, non-zero $\beta$ can be approximated by a renormalization of
1087: the coefficients $\kappa$ and $\rho$.  Equation~(\ref{compare-dists})
1088: shows that for $\beta=0$ the effect of fixing $S$ is also essentially
1089: a renormalization of $\rho$.  Even though the generality distribution
1090: for fixed $S$ and non-zero $\beta$ is difficult to compute
1091: analytically, it is reasonable to assume that this too can be
1092: approximated by an expression of the form~(\ref{compare-dists}) with
1093: an appropriate pair of parameters $\tilde \kappa$ and $\tilde \rho$.
1094: 
1095: \begin{figure}[tbp]
1096:   \centering
1097:   \includegraphics[width=\imgwidth,keepaspectratio,clip]{fixed-S-dists}
1098:   \caption{Steady-state generality distributions conditional on fixed
1099:     species number $S$ obtained from simulations with $C_0=0.1$,
1100:     $S=100$ ($\bullet$), $C_0=0.1$, $S=300$ ($+$), $C_0=0.5$, $S=100$
1101:     ($\circ$), and $C_0=0.5$, $S=300$ ($\times$) in comparison with
1102:     the corresponding predictions by Eq.~(\ref{Mgen-fixed-overall})
1103:     (solid) and by directly averaging Eq.~(\ref{gen-dist}) over
1104:     $\Lambda=0..R$ (dashed).  The other parameters were $R=\ln
1105:     10^{20}$, $D=0.005$, $\rho=0.95$, $\kappa=10/R$,
1106:     $\lambda=10^{-3}$, $\beta=0$ (no fitting). For all examples
1107:     $\left<S\right>=190$.  The inset shows the same data on a
1108:     double-logarithmic scale.}
1109:   \label{fig:fixed-S-dists}
1110: \end{figure}
1111: 
1112: In order to obtain the overall conditional generality distribution we
1113: go, again, over to the moment-generating function
1114: \begin{align}
1115:   \label{Mgen-fixed}
1116:   M_\text{gen}(s,z|S):=\sum_{k=0}^\infty P_\text{gen}(s,k|S) z^k\approx
1117:   \mathcal{N}\,
1118:   \left(
1119:     \frac{1-\tilde \rho}{1-\tilde \rho z}
1120:   \right)^{\textstyle C_0 \tilde \kappa \Lambda(s)}.
1121: \end{align}
1122: %
1123: The average of this expression over $s$ for the simple case
1124: $\lambda\to0$ is
1125: \begin{align}
1126:   \label{Mgen-fixed-overall}
1127:   M_\text{gen}(z|S)\approx\frac{\tilde u-1}{\log \tilde u}\quad\text{with }
1128:   \tilde u=  \exp\left(\frac{\vphantom{C}\smash{\tilde C}}{2}\right)\,
1129:     \left(
1130:       \frac{1-\tilde \rho}{1-\tilde \rho\,z}
1131:     \right)^{\textstyle C_0 \tilde \kappa R}.
1132: \end{align}
1133: %
1134: This result was verified by comparison with direct numerical
1135: simulations of the model.  Figure~\ref{fig:fixed-S-dists} shows
1136: simulation results in comparison with the predictions of
1137: Eq.~(\ref{Mgen-fixed-overall}) and with the results of numerically
1138: averaging Eq.~(\ref{gen-dist}) directly over $\Lambda=0..R$.  Although
1139: the precision of the approximation Eq.~(\ref{Mgen-fixed-overall})
1140: decreases for increasing $C_0$ and $k$ in comparison with the
1141: prediction using Eq.~(\ref{gen-dist}), it is surprisingly good even
1142: for large values of $C_0$ and $k$.  For large $C_0$ \emph{and} small
1143: $k$ the simulations deviate noticeably also from the prediction using
1144: Eq.~(\ref{gen-dist}), because in this parameter range the effects of
1145: intra-clade consumption, which have been ignored here, become relevant.
1146: %
1147: Even for smaller $R$, $\kappa R=\mathcal{O}(1)$, and $\beta>0$, where
1148: Eq.~(\ref{Mgen-fixed-overall}) does not make quantitative predictions,
1149: the general form of this expression still seems to be valid.
1150: Figure~\ref{fig:fitted-dists} shows some examples of numerical results
1151: in this regime compared with curves obtained by fitting $\tilde C$,
1152: $\tilde \rho$ and $\tilde \kappa$ in Eq.~(\ref{Mgen-fixed-overall}).
1153: The fitted curves describe the distributions about as well as the
1154: quantitative predictions above: deviations occur mainly for very small
1155: and very large $k$.
1156: 
1157: \begin{figure}[tbp]
1158:   \centering
1159:   \includegraphics[width=\imgwidth,keepaspectratio,clip]{fitted}
1160:   \caption{Simulation results for the generality distributions
1161:     conditional on $S=40$ with $C_0=0.1$, $\beta=0$ (squares),
1162:     $C_0=0.1$, $\beta=0.05$ (circles), $C_0=0.3$, $\beta=0.05$
1163:     (triangles) compared to distributions fitted by adjusting the
1164:     parameters $\tilde C$, $\tilde \rho$ and $\tilde \kappa$ in
1165:     Eq.~(\ref{Mgen-fixed-overall}) (dashed, solid, dotted line). The
1166:     other parameters were $R=\ln 10^4$, $D=0.005$, $\rho=0.95$,
1167:     $\kappa=2/R$, $\lambda=10^{-3}$. }
1168:   \label{fig:fitted-dists}
1169: \end{figure}
1170: 
1171: %% More numerical examples for the distribution given by
1172: %% Eq.~(\ref{Mgen-fixed-overall}) can be found in the comparison with the
1173: %% niche model in Section~\ref{sec:niche_model} below.
1174: 
1175: \subsection{The  vulnerability distribution for fixed $S$}
1176: \label{sec:vulnerability-distribution}
1177: 
1178: \begin{figure}[tbp]
1179:   \centering
1180:   \includegraphics[width=\imgwidth,keepaspectratio,clip]{fixed-S-vul}
1181:   \caption{Steady-state vulnerability distributions conditional on fixed
1182:     species number $S$.  Parameters are the same as in
1183:     Fig.~\ref{fig:fixed-S-dists}.  The solid and dashed lines
1184:     correspond to Eqs.~(\ref{vul-dist-int}) and (\ref{vul-dist-sum}),
1185:     respectively.  The inset shows the same data on a
1186:     double-logarithmic scale.}
1187:   \label{fig:fixed-S-vul}
1188: \end{figure}
1189: 
1190: The distribution of the vulnerability $m$ is most easily computed
1191: directly conditional on fixed $S$: Assume species to be indexed in the
1192: order of increasing $s$ starting with $1$.  For $\lambda\to0$ the
1193: number of possible consumers of species $i$ is then simply $i$.  When
1194: assuming again that resources evolve much faster than their consumers,
1195: the consumers of $i$ are determined by (i) the random connection of
1196: consumers with probability $C_0$ when the resource-clade founder
1197: enters the food web and (ii) random re-connections with probability
1198: $C_0$ during speciations of resources.  Neither of these processes
1199: introduces correlations in the connectivities within the set of
1200: possible consumers of $i$.  Thus, links are statistically independent
1201: and the vulnerability of $i$ is given by a binomial distribution.
1202: Averaging over the food web yields
1203: \begin{align}
1204:   \label{vul-dist-sum}
1205:   P_\text{vul}(m|S)=\frac{1}{S} \sum_{i=m}^{S} \binom{i}{m} C_0^m (1-C_0)^{i-m}.
1206: \end{align}
1207: This is exactly the expression that
1208: \cite{camacho02:_analytic_food_webs} obtained in their analysis of the
1209: niche model.  Following their observation that in the limit of large
1210: $S$ with constant $Z_0=C_0 S$ and $i=\mathcal{O}(S)$ the binomial
1211: distribution can be approximated as Poisson and the sum by an
1212: integral, one obtains
1213: \begin{align}
1214:   \label{vul-dist-int}
1215:   P_\text{vul}(m|S)=\frac{1}{Z_0}\int_0^{Z_0} \frac{t^m\,e^{-t}}{m!}  dt.
1216: \end{align}
1217: As is shown in Fig.~\ref{fig:fixed-S-vul}, this result predicts the
1218: vulnerability distribution about as well as
1219: Eq.~(\ref{Mgen-fixed-overall}) predicts the generality distribution.
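The steps leading from Eq.~(\ref{vul-dist-sum}) to
Eq.~(\ref{vul-dist-int}) can be sketched as follows, with the
integration variable $t=C_0\,i$ and $Z_0=C_0S$ held fixed as
$S\to\infty$:
\begin{align*}
  \binom{i}{m}C_0^m(1-C_0)^{i-m}\approx\frac{t^m e^{-t}}{m!},
  \qquad
  \frac{1}{S}\sum_{i=m}^{S}\;\to\;\frac{1}{C_0S}\int_0^{Z_0}dt
  =\frac{1}{Z_0}\int_0^{Z_0}dt.
\end{align*}
The integral is elementary and gives
\begin{align*}
  P_\text{vul}(m|S)=\frac{1}{Z_0}
  \left[1-e^{-Z_0}\sum_{j=0}^{m}\frac{Z_0^j}{j!}\right],
\end{align*}
for example $P_\text{vul}(0|S)=(1-e^{-Z_0})/Z_0$.  Summing the
integrand of Eq.~(\ref{vul-dist-int}) over $m$ before integrating
shows that the distribution is normalized and has mean $Z_0/2$.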
1220: 
1221: Note that the Poisson distribution entering Eq.~(\ref{vul-dist-int})
1222: is the special case $P(t/B,B;n)$, $B\to0$ of the general distribution
1223: $P(A,B;n)$ entering Eq.~(\ref{compare-dists}).  Thus, the
1224: integral~(\ref{vul-dist-int}) is also a limiting case of the general
1225: form Eq.~(\ref{Mgen-fixed-overall}).  In the case of generality
1226: distributions, however, $B$ is typically close to one.
1227: 
1228: 
1229: \section{Comparison with other topological food-web models}
1230: \label{sec:other-models}
1231: 
1232: 
1233: \subsection{Comparison with  the cascade model}
1234: \label{sec:cascade}
1235: 
1236: The main idea upon which the cascade model is based, random
1237: connections restricted by a trophic hierarchy, is retained in the
1238: speciation model, albeit refined in several ways.  The cascade model
1239: is recovered from the speciation model in the limit of no loops
1240: ($\lambda=0$), and no speciations\footnote{Observe that for $r_+\to 0$
1241:   the often encountered combination $-\kappa \ln (1-\rho)$ simplifies
1242:   to $r_1/r_-$.}, i.e. $r_+\to 0$.  Then all species enter the species
1243: pool by adaptations and are independently, randomly connected to their
1244: resources and consumers, just as it was assumed for the consumers
1245: alone in the foregoing section.  However, the limit $r_+\to 0$ does not
1246: describe empirical data particularly well \citep{rossberg05:_web}.
1247: Typical parameter sets for the speciation model have $r_+\approx r_-$
1248: (Tab.~\ref{tab:parameters}).
1249: 
1250: 
1251: \subsection{Comparison with the niche model}
1252: \label{sec:niche_model}
1253: 
1254: %% The niche model is defined by a set of rules for determining trophic links
1255: %% between the members of an abstract species pool.  It has two
1256: %% parameters, the number of species $S$ and the connectance $C$ (?? not
1257: %% exact).  Each species $i$ of the pool of $S$ species is associated
1258: %% with a \emph{niche parameter} $n_i$, a trophic \emph{niche width}
1259: %% $r_i$, and the \emph{niche position} $c_i$.  The niche parameter is
1260: %% drawn uniformly from the range $[0,1]$, the niche width from the range
1261: %% $[0,n_i]$ with the distribution $\beta (1-r_i/n_i)^{\beta-1}$ where
1262: %% $\beta=(1-2\,C)/(2\,C)$, and the niche center uniformly from the
1263: %% range\footnote{The original description \cite{williams00:_simpl} is
1264: %%   inaccurate at this point.}  $[r_i/2,\min(n_i,1-r_i/2)]$.  The range
1265: %% for the species with the lowest niche value is set to zero.  The
1266: %% resource species of a species $i$ is given by all species $j$ with $n_j$ in
1267: %% the range $[c_i-r_i/2,c_i+r_i/2]$.
1268: 
1269: \subsubsection{Degree distributions}
1270: 
1271: It was mentioned already that the distribution of vulnerability in the
1272: niche model is approximately the same as in the speciation model, in
1273: both cases given by Eq.~(\ref{vul-dist-int}).  In the case of the
1274: niche model $Z_0=2 C S=2 Z$ where the targeted connectance $C$ and
1275: the species number $S$ are parameters of the model.  In both cases the
1276: distribution is due to random connections with possible consumers.
1277: 
1278: For the generality distribution the situation is more complex.  As
1279: the analysis of \cite{camacho02:_analytic_food_webs} showed, for the
1280: niche model it is essentially determined by the distribution of the
1281: ``niche width'', i.e., the size of the interval containing the
1282: resources of a species on the niche-parameter scale.
1283: \cite{williams00:_simpl} chose this width for each species as its niche
1284: value $n$ times a random variable $x$ with a beta distribution of the
1285: form
1286: \begin{align}
1287:   \label{beta-dist}
1288:   p_x(x)=b(1-x)^{(b-1)}\approx b\,e^{-b x},
1289: \end{align}
1290: where $b=(1-2\,C)/(2\,C)$ depends on the targeted directed
1291: connectance $C$.  The approximation by an exponential is valid for
1292: $b\gg 1$, i.e. for $C\approx1/(2b) \ll 1$. \cite{williams00:_simpl}
1293: used this particular form for its computational simplicity.  No
1294: ecological arguments to motivate it seem to be known.  Since species
1295: are independently and evenly distributed with density $S$ in the
1296: one-dimensional ``niche space'', the number of species in the niche
1297: interval follows a Poisson distribution with expectation value $S n x$
1298: when $x$ is fixed.  Averaging over all $x$ yields the geometric
1299: distribution
1300: \begin{align}
1301:   \label{niche-generality-distribution}
1302:   P_\text{gen}^{(\text{niche})}(n,k)
1303:   %
1304:   =
1305:   %
1306:   \int_0^{\infty} \frac{(S n x)^k}{k!} e^{-S n x} b\,e^{-b x} dx 
1307:   %
1308:   = 
1309:   %
1310:   \frac{1}{1+n Z_0} \left(
1311:     \frac{n Z_0}{1+n Z_0} \right)^k.
1312: \end{align}
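
As a sketch of the intermediate step, which is not spelled out above,
the integral in Eq.~(\ref{niche-generality-distribution}) can be
evaluated with the elementary identity
$\int_0^\infty x^k e^{-a x}\,dx=k!/a^{k+1}$ for $a>0$, here with
$a=S n+b$:
\begin{align*}
  \int_0^{\infty} \frac{(S n x)^k}{k!}\, e^{-S n x}\, b\,e^{-b x}\, dx
  &= \frac{b\,(S n)^k}{k!}\cdot\frac{k!}{(S n+b)^{k+1}}
  = \frac{b}{S n+b}\left(\frac{S n}{S n+b}\right)^{k}\\
  &= \frac{1}{1+n S/b}\left(\frac{n S/b}{1+n S/b}\right)^{k}.
\end{align*}
The stated form follows on identifying $S/b$ with $Z_0$.  Since
$S/b=2CS/(1-2C)$ exactly, this identification assumes $C\ll 1$, the
same regime in which the exponential approximation in
Eq.~(\ref{beta-dist}) holds.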
1313: %% The overall generality distribution is given by the average of
1314: %% $P_\text{gen}^{(\text{niche})}(n,k)$ over $n=0..1$, which can be expressed
1315: %% in terms of a hypergeometric function.
1316: %% \begin{align}
1317: %%   \label{niche-overall-generality-distribution}
1318: %%   P_\text{gen}^{(\text{niche})}(k)=\int_0^1
1319: %%   P_\text{gen}^{(\text{niche})}(n,k) dn=\frac{Z_0^k}{k+1}
1320: %%   \,{}_2F_1(k+1,k+1;k+2;-Z_0).
1321: %% \end{align}
1322: 
1323: %% Consequently, averaging the moment generating function pertaining to
1324: %% Eq.~(\ref{niche-generality-distribution}),
1325: %% $M_\text{gen}^{(\text{niche})}(n,z)=1/[1+(1-z)n Z_0]$, over $n$ gives
1326: %% \begin{align}
1327: %%   \label{niche-overall-generality-mom}
1328: %%   M_\text{gen}^{(\text{niche})}(z)
1329: %% %
1330: %%   =
1331: %% %
1332: %%   \int_0^1 M_\text{gen}^{(\text{niche})}(n,z)dn
1333: %% %
1334: %%   =
1335: %% %
1336: %%   \frac{\ln\left[1+(1-z)
1337: %%       Z_0\right]}{(1-z) Z_0}.
1338: %% \end{align}
1339: %% This result directly corresponds to Eq.~(\ref{Mgen-fixed-overall}) for
1340: %% the speciation model.  Although these expressions are formally
1341: %% different and there is no obvious way to reduce them to a common form,
1342: %% the distributions they describe are numerically quite similar over a
1343: %% wide parameter range.
1344: 
1345: The overall generality distribution is obtained by averaging
1346: Eq.~(\ref{niche-generality-distribution}) over $n$.  The calculation
1347: is simplified by the approximation $k \approx S n x$, i.e.\ 
1348: \begin{align}
1349:   \label{niche-generality-camacho}
1350:   P_\text{gen}^{(\text{niche})}(n,k)\approx \frac{1}{S n}p_x\!\left(\frac{k}{S n}\right)=\frac{1}{n Z_0}\exp\left(-\frac{k}{n Z_0}\right),
1351: \end{align}
1352: which is valid for $n Z_0\gg 1$ [cf.\ 
1353: Eq.~(\ref{niche-generality-distribution})].  This leads to the result of
1354: \cite{camacho02:_analytic_food_webs}
1355: \begin{align}
1356:   \label{niche-overall-generality-camacho}
1357:   P_\text{gen}^{(\text{niche})}(k)=\int_0^1
1358:   P_\text{gen}^{(\text{niche})}(n,k) dn\approx
  \frac{1}{Z_0}E_1\!\left(\frac{k}{Z_0}\right)
1360: \end{align}
1361: with $E_1(x):=\int_x^\infty t^{-1}\exp(-t)dt$ denoting the exponential
1362: integral function.  \cite{camacho02:_robus_patter_food_web_struc}
1363: concluded that the distribution of the scaled generality $k/(2 Z)$ or,
1364: for single instances of food webs more appropriate, its cumulative
1365: distribution, should have the universal form
1366: \begin{align}
1367:   \label{universal-cumulative-generality}
1368:   P\left(\frac{k}{2 Z} \ge x\right)=\int_x^{\infty}
1369:   E_1(x')dx'=\exp(-x)-x E_1(x),
1370: \end{align}
1371: and verified this impressively by a comparison with empirical data.
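
Both steps can be checked directly.  The following is a sketch, with
$t$ an integration variable introduced here for illustration.
Substituting $t=k/(n Z_0)$ in the average of
Eq.~(\ref{niche-generality-camacho}) over $n$ gives, for $k>0$,
\begin{align*}
  \int_0^1 \frac{1}{n Z_0}\exp\left(-\frac{k}{n Z_0}\right) dn
  = \frac{1}{Z_0}\int_{k/Z_0}^{\infty} \frac{e^{-t}}{t}\, dt
  = \frac{1}{Z_0}\,E_1\!\left(\frac{k}{Z_0}\right),
\end{align*}
so that the argument of $E_1$ is the positive scaled generality
$k/Z_0=k/(2Z)$.  The cumulative form then follows from integration by
parts, using $dE_1(x)/dx=-e^{-x}/x$ and $x\,E_1(x)\to 0$ for
$x\to\infty$:
\begin{align*}
  \int_x^{\infty} E_1(x')\,dx'
  = \Bigl[x' E_1(x')\Bigr]_x^{\infty}+\int_x^{\infty} e^{-x'}\,dx'
  = e^{-x}-x\,E_1(x).
\end{align*}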
1372: 
1373: 
In order to see whether this observed regularity is also reproduced by
the speciation model, cumulative distribution functions for the
speciation model obtained from Eq.~(\ref{Mgen-fixed-overall}) were
compared with
Eq.~(\ref{universal-cumulative-generality}).  The value for $k=0$ was
1378: excluded from the comparison because (i) in many empirical food-webs
1379: the lowest trophic level ($k=0$) is only poorly resolved and (ii) the
1380: approximation (\ref{niche-overall-generality-camacho}) is undefined at
$k=0$ and Eq.~(\ref{Mgen-fixed-overall}) is not accurate at this point
1382: either.  The scaling factor $Z_0^{-1}$ for the generality and the
1383: correction $\mathcal{\tilde N}$ of the normalization constant were
1384: therefore determined directly by transforming the cumulative
1385: speciation-model distributions to $\mathcal{\tilde
  N}\sum_{k'=k}^\infty\,P_\text{gen}(k'/Z_0|S)$ so as to minimize
the mean-square deviation
from~(\ref{universal-cumulative-generality}) for $k\ge 1$.  These
1389: curves match Eq.~(\ref{universal-cumulative-generality}) surprisingly
1390: well over a wide parameter range (Fig.~\ref{fig:curve-fitting}a).  The
1391: empirical data is described well by both distributions
1392: (Fig.~\ref{fig:curve-fitting}b).
1393: 
1394: \begin{figure}[tbp]
1395:   \centering
1396:   \includegraphics[width=0.7\imgwidth,keepaspectratio,clip]{scaledTheory.eps}
1397:   \includegraphics[width=0.7\imgwidth,keepaspectratio,clip]{scaledLittleRock.eps}
1398:   \caption{Comparison of niche-model and speciation-model predictions
1399:     for the cumulative generality distribution.  (a) The
1400:     approximation~(\ref{Mgen-fixed-overall}) for the speciation model
1401:     with $C_0\tilde \kappa R=0.2$, $\tilde \rho=0.75$
1402:     (triangles), $C_0\tilde \kappa R=1.5$, $\tilde \rho=0.75$
1403:     (circles), $C_0\tilde \kappa R=0.2$, $\tilde \rho=0.98$
1404:     (plus), $C_0\tilde \kappa R=1.5$, $\tilde \rho=0.98$ (dotted line)
1405:     in comparison with the
1406:     approximation~(\ref{universal-cumulative-generality}) for the niche
1407:     model (solid line). (b) The empirical distribution for Little Rock
1408:     Lake~\cite{martinez91:_artif_attr} (dots), the speciation model
1409:     prediction from numerical simulations (dashed, shaded area is the
1410:     1-$\sigma$ range of fluctuations before scaling), and again the
1411:     approximation~(\ref{universal-cumulative-generality}) for the niche
1412:     model (solid line).  All distributions have the point $k=0$
1413:     removed and are scaled and normalized to minimize mean-square
1414:     deviations from Eq.~(\ref{universal-cumulative-generality}).}
1415:   \label{fig:curve-fitting}
1416: \end{figure}
1417: 
1418: To understand the reason for this apparent scaling law of
1419: speciation-model food webs, consider the speciation-model generality
distribution~(\ref{Mgen-fixed-overall}) conditional on $k\ge 1$ in the
1421: limit of low connectance $C_0,\tilde C\to 0$ (now at fixed $S$),
1422: i.e.\ the distribution with the moment generating function
1423: \begin{align}
1424:   \label{low-C-limit}
1425:   \lim_{C_0,\tilde
1426:       C\to0}\frac{M_\text{gen}(z|S)-M_\text{gen}(0|S)}{1-M_\text{gen}(0|S)}=\frac{\ln(1-\tilde\rho
1427:       z)}{\ln (1-\tilde\rho)}.
1428: %    =-\sum_{k=1}^\infty \frac{(\tilde\rho z)^k}{k\ln(1-\tilde\rho)}.
1429: \end{align}
This is easily seen to be the distribution of resource-clade sizes
[cf.\ Eq.~(\ref{clade-size-distribution})]
1432: \begin{align}
1433:   \label{lower-animal-spec}
  -\frac{{\tilde \rho}^k}{k\ln(1-\tilde \rho)}.
1435: \end{align}
1436: In this limit of low connectance most species belong to the lowest
1437: trophic level, only a few heterotrophs remain, and the percolation of
1438: the network is lost.  Therefore, this limit does not correspond to the
1439: general situation encountered in the field.  But the
1440: approximate form of the log-series
1441: distribution~(\ref{lower-animal-spec}) is retained also for more
1442: complex networks.  For values of $\tilde \rho\approx 0.8$, this
1443: distribution has a shape quite similar to the exponential integral
distribution~(\ref{niche-overall-generality-camacho}).  When going over to
1445: cumulative distributions, the fit looks even better.  Thus, the
1446: observed generality distributions can be interpreted mechanistically
1447: in terms of the steady-state distributions of evolutionary clade
1448: sizes, corrected for fixing $S$ and trophic link breaking.  This also
suggests that the ``scaling'' distribution~(\ref{lower-animal-spec})
$\sim k^{-1} \exp(k \ln \tilde \rho)$ or the more accurate
result~(\ref{Mgen-fixed-overall}) would be more adequate
functional forms than the exponential integral
function~(\ref{niche-overall-generality-camacho}).
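
The identification with a log-series distribution can be made
explicit.  As a sketch using only the Taylor series
$\ln(1-y)=-\sum_{k\ge 1}y^k/k$ for $|y|<1$, the generating function
in Eq.~(\ref{low-C-limit}) expands as
\begin{align*}
  \frac{\ln(1-\tilde\rho z)}{\ln (1-\tilde\rho)}
  =\sum_{k=1}^{\infty}
  \frac{{\tilde\rho}^{k}}{k\,\left|\ln(1-\tilde\rho)\right|}\,z^{k}.
\end{align*}
The coefficient of $z^k$ is the probability of generality $k$.  It is
positive because $\ln(1-\tilde\rho)<0$ for $0<\tilde\rho<1$, the
coefficients sum to one at $z=1$, and they decay as
$k^{-1}{\tilde\rho}^{k}=k^{-1}\exp(k\ln\tilde\rho)$ with
$\ln\tilde\rho<0$.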
1454: 
1455: In spite of the similarities of the overall generality and
1456: vulnerability distributions, there are marked differences in the
1457: detailed predictions of the two models.  Consider, for example, the
1458: generality distribution for species near the lower end of the trophic
1459: cascade, i.e., species with $\kappa \Lambda(s)\ll 1$ in the speciation
1460: model and $n\ll 1$ in the niche model, that have at least one resource
1461: species ($k\ge 1$).  For the speciation model
1462: Eqs.~(\ref{compare-dists}) and (\ref{ABdist}) lead again to the
1463: clade-size distribution~(\ref{lower-animal-spec}), while for the niche
1464: model Eq.~(\ref{niche-generality-distribution}) predicts
1465: \begin{align}
1466:   \label{lower-animal-niche}
1467:   (1-n Z_0)\,(n Z_0)^{k-1}. 
1468: \end{align}
1469: Thus, for the niche model it is very probable that such a species has
1470: exactly one resource, whereas for the speciation model larger
1471: generalities can also be expected.  An empirical test should be
1472: capable of distinguishing these two predictions.
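
To make the contrast concrete, consider the following sketch.  The
values $n Z_0=0.1$ and $\tilde\rho=0.8$ are chosen here purely for
illustration and are not taken from data, and $q$ is a shorthand
introduced here.  Conditioning the geometric
distribution~(\ref{niche-generality-distribution}) on $k\ge 1$ gives
\begin{align*}
  P_\text{gen}^{(\text{niche})}(n,k\,|\,k\ge 1)=(1-q)\,q^{k-1},
  \qquad q=\frac{n Z_0}{1+n Z_0},
\end{align*}
which reduces to Eq.~(\ref{lower-animal-niche}) to leading order in
$n Z_0$.  The probability of exactly one resource is then
$1-q=1/(1+n Z_0)\approx 0.91$ for $n Z_0=0.1$, whereas the log-series
form of the speciation model gives
$\tilde\rho/\left|\ln(1-\tilde\rho)\right|\approx 0.50$ for
$\tilde\rho=0.8$.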
1473: 
1474: 
1475: \subsubsection{Intervality}
1476: 
1477: A major distinction of the niche model from the cascade model is the
1478: intervality it enforces upon the diets of consumers.  While the degree
1479: of intervality obtained with the cascade model is typically too small
1480: compared with empirical data \citep{cohen90:_commun_food_webs}, it is
1481: too large for the niche model \citep{cattin04:_phylog}.  Under certain
1482: conditions the speciation model can also produce a high degree of
1483: intervality.  Consider some arbitrary ordering of clades, for example
1484: by the speed parameter of the founder species, and an ordering of the
1485: species within each clade given by a traversal of the evolutionary
1486: tree\footnote{For example, the order given by the recursive algorithm
1487:   \texttt{list(\textit{A})}
1488:   defined as \texttt{\\
1489:     \hspace*{4ex}1. if \textit{A} has not become extinct\\
1490:     \hspace*{12ex}print \textit{A};\\
1491:     \hspace*{4ex}2. for all direct descendants \textit{B} of
1492:     \textit{A}
1493:     in order of appearance\\
1494:     \hspace*{12ex}list(\textit{B});
1495:   }\\
1496:   starting with \texttt{list(}clade founder\texttt{)}.
1497: }. %
1498: For this ordering (which differs from an ordering by $s$) diets will
form contiguous sets when (i) the average number of resource
1500: clades~(\ref{clade-in-diet}) is low, i.e., when most consumers have
1501: either one or no resource clade, and (ii) the probability that
1502: resources break out of a resource clade during the clade's lifetime is
1503: low.  Then the set of a consumer's resources is usually simply the
1504: non-extinct part of an evolutionary subtree.  The probability of
resource break-out is small when $\beta\times\text{(resource clade
  size)}\times\text{(clade lifetime in generations)}$ is small, which is,
by arguments analogous to those used in Section~\ref{sec:clades}, the
case when
1509: \begin{align}
1510:   \label{breakout}
1511:   \beta \rho^*/(1-\rho^*)\ll 1.
1512: \end{align}
1513: For typical model parameters we find that these two conditions are
1514: satisfied to some extent but not too well (Tab.~\ref{tab:parameters}),
1515: in accordance with expectations.  Correspondingly, the degree of
1516: intervality $D_\text{diet}$ \citep{cattin04:_phylog} of empirical data
1517: is reproduced well by the model \citep{rossberg05:_web}.
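
Condition~(\ref{breakout}) can be motivated by a rough estimate.  It
rests on two assumptions made here for illustration: that
resource-clade sizes $m$ follow a log-series distribution with
parameter $\rho^*$, and that the clade lifetime is approximately
$-\ln(1-\rho^*)$ generations.  The mean of the log-series distribution
is
\begin{align*}
  \left<m\right>=\sum_{m=1}^{\infty} m\,
  \frac{(\rho^*)^{m}}{m\left|\ln(1-\rho^*)\right|}
  =\frac{\rho^*}{(1-\rho^*)\left|\ln(1-\rho^*)\right|},
\end{align*}
so that, under these assumptions,
\begin{align*}
  \beta\times\left<m\right>\times\left|\ln(1-\rho^*)\right|
  =\frac{\beta\rho^*}{1-\rho^*}.
\end{align*}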
1518: 
1519: %% \begin{align}
1520: %%   \label{break-out-probability}
1521: %%   \beta\times \left<m\right>\times r_+\sigma\tau_c=\beta
1522: %%   \frac{-\rho^*}{(1-\rho^*)\ln(1-\rho^*)}r_+\sigma\frac{-\ln(1-\rho)}{r_+\sigma}=\frac{\beta\rho^*}{1-\rho^*}
1523: %% \end{align}
1524: 
1525: We conclude with \cite{cattin04:_phylog} that the larger-than-random
1526: intervality observed in food webs may not so much result from a low
1527: dimensionality of the niche space, as has been proposed
1528: \citep{cohen78:_food_webs_niche_space}, but rather reflects the
1529: importance of the phylogenetic history for the food-web structure.
1530: 
1531: 
1532: \subsection{Comparison with the nested hierarchy model}
1533: \label{sec:hierarchy}
1534: 
1535: Just as for the niche model, the generality distribution for the
1536: nested hierarchy model is imposed ``by hand'' by specifying the
1537: distribution~(\ref{beta-dist}) and setting $k\approx S n x$.  But the
1538: structure of the set of resources is determined by a more complex
1539: algorithm that has been designed in such a way that consumers and
1540: resources form groups ($\approx$ clades), and consumers and resources
1541: from the same groups share resources and consumers, respectively.  The
1542: algorithm is intended to mimic a structure that would result from a
1543: phylogenetic evolution of the web, without explicitly modeling this
1544: evolution.  The speciation model achieves a similar effect by
1545: explicitly modeling the evolutionary dynamics.
1546: %% <a^2>=c^2/d -> a.r=N(0,r c/d)
1547: %% delta r2=(r+a)^2-r^2=2 r.a+a^2=2 r.a+c^2/d
1548: %% d delta r = c^2 + N(0,2 r c)
1549: 
1550: \section{Variants of the speciation model}
1551: \label{sec:variants}
1552: 
1553: Modeling complex ecological systems often requires difficult decisions
with regard to which kinds of effects ought to be incorporated into a
model and which can be ignored.  Here, two variants of the speciation
model are briefly discussed that include aspects of the real system
1557: that had been left out in the original model.  For both variants, the
1558: analytic results derived in the previous sections remain valid without
1559: change.
1560: 
1561: \subsection{A variant with  asymmetric link persistence}
1562: \label{sec:asymmetric}
1563: 
1564: In the analysis above it was assumed that consumer-resource links are
1565: statistically independent of the phylogenetic history of the
1566: consumers.  If this assumption is valid, one may as well modify the
model so as to choose all resources of a descendant species at
1568: random after its speciation, without affecting the analytic results
1569: obtained above.  More generally, one might incorporate an asymmetry in
1570: the persistence (or reconnection probability) of links between
1571: consumers and resources in the following way:
1572: 
1573: In the original form of the model, the connectivity of the descendant
1574: species was (randomly) re-assigned for a fraction $\beta$ of all
1575: possible trophic links.  In the asymmetric variant of the model, the
1576: connectivity from the descendant species to its consumers is
1577: re-assigned for a fraction $\beta_\text{c}$ of all possible consumers,
1578: and the connectivity to resources is re-assigned for a fraction
1579: $\beta_\text{r}$ of all possible resources, with
1580: $\beta_\text{c}\ne\beta_\text{r}$ in general.
1581: 
1582: In fact, there is no ecological reason to expect
1583: $\beta_\text{c}=\beta_\text{r}$.  A large difference between the
1584: values of $\beta_\text{c}$ and $\beta_\text{r}$ such as considered
1585: above ($\beta_\text{c}=\beta \ll \beta_\text{r}=1$) could be
1586: understood from the assumption that in the competition between
1587: related species their sets of resources are much more important than
1588: their sets of consumers: In order to avoid competitive exclusion,
1589: related species need drastically different sets of resources
($\beta_\text{r}=1$), while there is only little evolutionary pressure
1591: for a descendant species to have a different set of consumers than its
1592: predecessor ($\beta_\text{c}\ll1 $).
1593: 
However, one might also argue on the basis of the direct
resource-consumer interaction alone.  Then one could expect it to be
advantageous for a
descendant species to evade its predecessor's consumers (large
1597: $\beta_\text{c}$), while maintaining its resources (small
1598: $\beta_\text{r}$).  This would lead to the reverse relation between
1599: $\beta_\text{c}$ and $\beta_\text{r}$.  An empirical test to establish
1600: which of these two mechanisms is more relevant might be possible.
1601: 
1602: \subsection{A variant with quantitative link strength}
1603: \label{sec:quantitative-links}
1604: 
1605: Topological food-web models are often criticised for ignoring the fact
1606: that the link strength in food webs, instead of being either $1$ or
1607: $0$, is in reality a continuous quantity
1608: \citep{berlow04:_interac_strength}.  There is a simple way to
1609: incorporate continuously varying link strengths in the speciation
1610: model without affecting its statistical properties.
1611: 
Instead of assigning to each possible trophic link a connectivity of
either $0$ or $1$, quantify the strength of each possible link by a
real number between $0$ and $1$.  Where the connectivity was copied
during speciations in the original model, the link strength is copied
1616: now.  Where the connectivity was set to $1$ with probability $C_0$ and
1617: to $0$ otherwise, set the link strength to an appropriately
1618: distributed random number between $0$ and $1$ now.  For a
1619: characterization of the resulting food webs in terms of topological
1620: food-web statistics, count each link with strength larger than some
1621: threshold as present, and all other links as absent.  That is, the
1622: thresholding of the link strength is just delayed to the time of the
1623: characterization.
1624: %
1625: While this modification is straightforward for the speciation model,
1626: modifications of other topological models to postpone the thresholding
1627: of link strength might be possible, if at all, only at the price of
1628: increasing the model complexity.
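
As a minimal illustration of an ``appropriately distributed'' link
strength, suppose that a newly drawn strength $w$ is uniformly
distributed on $[0,1]$ and that a link counts as present when
$w>1-C_0$.  The uniform distribution and the symbols $w$, $F$ and
$\theta$ are assumptions introduced here.  Then
\begin{align*}
  P(w>1-C_0)=\int_{1-C_0}^{1}dw=C_0,
\end{align*}
so that newly drawn links are present with probability $C_0$, as in
the original model.  More generally, for any continuous distribution
with cumulative distribution function $F$, a threshold $\theta$ with
$1-F(\theta)=C_0$ has the same effect.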
1629: 
Of course, an evolution in which the link strength either does not
change at all or is reset to a completely new random value is quite
artificial.  It would be more natural to vary the link strength by a
small random amount at each evolutionary step.  In such a model, link
breaking and reconnecting events relative to some threshold $(1-C_0)$
would be correlated.  They would be concentrated at certain pairs of
consumer and resource clades with link strength near the threshold.
1637: Further studies are required to understand what effect this would have
1638: on the overall network structure.
1639: 
1640: \section{Discussion and Outlook}
1641: \label{sec:conclusion}
1642: 
1643: Besides improving the general understanding of the properties of the
1644: speciation model and their dependence on model parameters, a purpose
1645: of this work was also to show that the speciation model integrates the
1646: underlying ideas from previous, simpler models (see
1647: Section~\ref{sec:other-models}).
1648: %
1649: The speciation model retains the trophic ordering of the cascade
1650: model.  In fact, it contains the cascade model as a special case.  By
1651: the interplay of speciations, extinctions, and adaptations of new
1652: species to the habitat, the speciation model reproduces three key
1653: features of the niche model and the nested hierarchy model at the same
1654: time: (1) the empirical distributions of generality, which in the
1655: niche model and similarly in the nested hierarchy model are obtained
1656: only by a special, ecologically unmotivated choice of the niche-width
1657: distributions; (2) intervality, to the degree that is actually
1658: observed \citep{cattin04:_phylog, rossberg05:_web}; (3) the
1659: organization of resources into groups of related species that share
1660: consumers and \textit{vice versa}.
1661: %
1662: This unifying character of the speciation model is probably the main
1663: reason for its high accuracy in reproducing empirical data
1664: \citep{rossberg05:_web}.
1665: 
1666: %% Other results of this paper relate to an aspect of the internal
1667: %% structure of the model food webs which is not obvious from the
1668: %% topology at a single instance in time: a characterization of the food
1669: %% webs in terms of ``clades''.  Table~\ref{tab:parameters} lists the
1670: %% expectation value for the number of clades corresponding to some
1671: %% empirical food webs.  It might be interesting to compare these results
1672: %% with the taxonomic structure of the actual empirical webs or the model
1673: %% dynamics with paleontological records.
1674: 
1675: The observed broad, log-series-like generality distributions have been
traced back to, among other things, a condition $1-\rho \ll 1$.  This means
1677: that the rate constant for speciations $r_+$ is numerically close to
1678: the rate constant for extinctions $r_-$.  For any phylogenetically
1679: closed system, a steady state always requires that extinction rates
1680: and speciation rates are equal, independent of the statistical details
1681: of the branching pattern.  For the half-open system considered
1682: here, $1-\rho \ll 1$ implies that the contributions from foreign
1683: adaptations to the species pool are small compared to the contribution
1684: from speciations.  In fact, $1-\rho$ directly equals the fraction of
1685: species in the food web that have entered by foreign adaptations.
1686: However, in order to obtain broad, left-skewed generality
1687: distributions, the independence of the speciation and extinction
1688: probability of a species from the actual size of its clade is also
important.  If, instead, large clades notably favored extinctions
1690: and small clades speciations, clade size distributions would be
1691: dominated by a ``typical'' clade size, which would, in the model, also
1692: lead to a narrower generality distribution.  In an analysis of
1693: paleontological time series \citet{raup91:_phaner_kill} applied a
1694: model for the size of genera identical to the model used here for the
1695: dynamics of clade sizes
[Eqs.~(\ref{probability_balance})--(\ref{jnnp})].  While, on the
1697: average, this model (with $\rho=0.996$) reproduced the data well, the
1698: scatter in the paleontological data was larger than in the model.
1699: \citeauthor{raup91:_phaner_kill} could explain this observation by
1700: assuming that the overall evolution rate varies over time.  Since such
1701: a variation can also be described by a (random) nonlinear
1702: transformation of the time axis, it does not affect statistics that
1703: refer only to a particular moment in time, such as food-web
1704: structures.  Thus our assumption of a simple birth/death process is
1705: supported by paleontological observations.
1706: 
1707: As a direct consequence of this birth/death process, a
1708: characterization of food webs in terms of ``clades'' has been derived.
1709: Table~\ref{tab:parameters} lists expectation values for characteristic
1710: quantities corresponding to some empirical food webs.  It might be
1711: interesting to compare these results with the taxonomic structure of
1712: the actual empirical webs or the model dynamics with paleontological
1713: records.
1714: 
1715: In Section~\ref{sec:distributions} it was shown that a correlation
1716: between the evolution rates and the trophic height leads to the
1717: observed asymmetry between generality and vulnerability distributions.
1718: However, in the present model this requires evolution rates spanning
1719: an unrealistically large range of about 20 orders of magnitude.  We
1720: are currently evaluating a variant of the speciation model that
1721: achieves a similar effect without any differences in evolution rates
by making hereditary not the trophic links themselves but the
properties of species that determine link strengths.  An asymmetry of
the heredity
1724: between species-as-consumers and species-as-resources leads
1725: effectively to an asymmetry of the link persistence as described in
1726: Section~\ref{sec:asymmetric} above.  Numerical results with the new
1727: model are promising, but analytically we understand it only in so far
1728: as it can be approximated by the speciation model, so that the analysis
1729: presented here remains valid.  Details regarding the new model will be
1730: reported elsewhere.
1731: 
1732: Our findings indicate that a food web's population dynamical
1733: stability and persistence are not as important determinants of its
1734: structure as is sometimes assumed.  From a technical point of view,
1735: this is good news.  It appears possible to obtain natural food-web
1736: structures without time-consuming population dynamical simulations.
1737: These food webs could then be investigated also with respect to the
question of how their structure affects population dynamical stability.
1739: 
1740: In the course of this work, analytic approximations for several
1741: empirically testable predictions of the speciation model could be
1742: obtained.  These include the average clade size $\left<n\right>$, the
1743: number of clades $\left<c\right>$ in the web, the age of clades in
1744: generations (speciation times) $-\ln(1-\rho)$, the average number of
resource clades, Eq.~(\ref{clade-in-diet}), and the generality
distribution of consumers at low trophic levels,
Eq.~(\ref{lower-animal-spec}).
1747: %
1748: A careful comparison of the models discussed here and other food-web
1749: models with existing empirical data and new results from ongoing
1750: efforts in the field will reveal discrepancies and, hopefully, suggest
new ideas that bring us another step closer to understanding this
1752: fascinating aspect of life on earth.
1753: 
1754: \section{Acknowledgements}
1755: \label{sec:thanks}
1756: 
1757: The authors acknowledge generous support by The 21st Century COE
1758: Program ``Bio-Eco Environmental Risk Management'' of the Ministry of
1759: Education, Culture, Sports, Science and Technology of Japan.
1760: 
1761: \appendix
1762: 
1763: \section{A family of distribution functions encountered in the
1764:   analysis of the speciation model}
1765: \label{sec:general}
1766: 
1767: \renewcommand{\theequation}{A.\arabic{equation}}
1768: \setcounter{equation}{0}
1769: 
The analysis of the steady state of a simple model of evolutionary
1771: dynamics (Sec.~\ref{sec:S}) naturally leads to probability
1772: distributions $p_n$ for species number $n$ with a cumulant generating
1773: function
1774: \begin{align}
1775:   \label{generalK}
1776:   \ln\sum_n p_n z^n=K_{A,B}(z)=A \ln
1777:   \left(
1778:     \frac{1-B}{1-B z}
1779:   \right),
1780: \end{align}
1781: where $0< A$, $0<B<1$.
1782: 
1783: From this, the mean
1784: \begin{align}
1785:   \label{meanAB}
1786:   \left<
1787:     n
1788:   \right>=\left.
1789:     \frac{dK_{A,B}(e^u)}{du}
1790:   \right|_{u=0}=\frac{A\,B}{1-B},
1791: \end{align}
1792: and variance
1793: \begin{align}
1794:   \label{varAB}
1795:   \var n =\left.
1796:     \frac{d^2K_{A,B}(e^u)}{du^2}
1797:   \right|_{u=0}=\frac{A\,B}{(1-B)^2}
1798: \end{align}
1799: can be calculated directly.  The ratio $(\var n)/\! \left< n \right>$
1800: is $(1-B)^{-1}$ times larger than for  Poisson
1801: distributions.
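
The intermediate steps are short.  As a sketch, write
$K_{A,B}(e^u)=A\ln(1-B)-A\ln(1-B e^u)$ and differentiate:
\begin{align*}
  \frac{dK_{A,B}(e^u)}{du}=\frac{A\,B\,e^u}{1-B e^u},
  \qquad
  \frac{d^2K_{A,B}(e^u)}{du^2}=\frac{A\,B\,e^u}{(1-B e^u)^2}.
\end{align*}
Setting $u=0$ gives Eqs.~(\ref{meanAB}) and (\ref{varAB}).  For a
Poisson distribution mean and variance coincide, so the ratio
$(\var n)/\!\left< n \right>$ equals one, whereas here it is
$(1-B)^{-1}>1$, i.e., the distribution is overdispersed.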
1802: 
1803: The distribution function itself is given by
1804: \begin{align}
1805:   \label{ABdist}
1806:   p_n=P(A,B;n):=\,\frac{(1-B)^A \, B^n \, \Gamma(A+n)}{\Gamma(A)\, \Gamma(1+n)}.%=B^n\,(1-B)^A\,\binom{A+n-1}{n}.
1807: \end{align}
1808: This implies that the ratio of consecutive probabilities is
1809: \begin{align}
1810:   \label{pk1pk}
1811:   \frac{p_{n+1}}{p_{n}}=\frac{B\,(A+n)}{1+n}.
1812: \end{align}
1813: In particular, the most probable value is $n=0$ whenever $A\,B<1$.
1814: Since $B<1$, this is always the case when $A\le 1$.  For $A=1$ one gets
1815: exactly a geometric distribution
1816: \begin{align}
1817:   \label{unitA}
1818:   p_n=(1-B)\,B^n,
1819: \end{align}
1820: and for small $A$ Eq.~(\ref{ABdist}) simplifies to the log-series
1821: distribution
1822: \begin{align}
1823:   \label{smallA}
1824:   p_n=
1825:   \left\{
1826:     \begin{matrix}
1827:       1+A\,\ln(1-B) +\mathcal{O}(A^2) & \text{for $n=0$},\\
1828:       \displaystyle\frac{A B^n}{n}+\mathcal{O}(A^2) & \text{otherwise}.
1829:   \end{matrix}
1830: \right.
1831: \end{align}
1832: For small $B$ a Poisson distribution is obtained:  With fixed $AB$,
1833: \begin{align}
1834:   \label{smallB}
1835:   p_n=\frac{(AB)^n}{n!}\,e^{-AB}+\mathcal{O}(B)
1836: \end{align}
1837: uniformly in $n$.  Finally, when $A B \gg 1$ the distribution $p_n$
1838: can be approximated by a Gaussian with mean and variance given by
1839: Eqs.~(\ref{meanAB},\ref{varAB}).
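
The form~(\ref{ABdist}) and its limits can be verified from the
generalized binomial series.  The following sketch uses only that
series and the identity
$\Gamma(A+n)/\Gamma(A)=A\,(A+1)\cdots(A+n-1)$.  Exponentiating
Eq.~(\ref{generalK}) gives
\begin{align*}
  \sum_n p_n z^n=(1-B)^A\,(1-B z)^{-A}
  =(1-B)^A\sum_{n=0}^{\infty}
  \frac{\Gamma(A+n)}{\Gamma(A)\,\Gamma(1+n)}\,(B z)^n,
\end{align*}
which reproduces Eq.~(\ref{ABdist}) term by term.  For small $A$ and
$n\ge 1$ one has
$\Gamma(A+n)/\Gamma(A)=A\,(n-1)!+\mathcal{O}(A^2)$, so that
$p_n\approx A\,B^n/n$ as in Eq.~(\ref{smallA}).  For small $B$ at
fixed $A B$, $A$ is large, $\Gamma(A+n)/\Gamma(A)\approx A^n$ and
$(1-B)^A\approx e^{-A B}$, which gives the Poisson
form~(\ref{smallB}).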
1840: 
1841: \begin{nowordcount}
1842: \bibliographystyle{elsart-harv} 
1843: \bibliography{/home/axel/bib/bibview}
1844: \end{nowordcount}
1845: 
1846: 
1847: 
1848: 
1849: 
1850: 
1851: 
1852: 
1853: \end{document}
1854: 
1855: 
1856: 
1857: %%% Local Variables: 
1858: %%% mode: latex
1859: %%% mode: flyspell
1860: %%% TeX-master: t
1861: %%% End: 
1862: 
1863: % LocalWords:  intervality cumulatives   loopiness Eqs allometrically clade vul
1864: % LocalWords:  digamma renormalize autotrophs heterotrophs
1865: