1: \documentclass[floatfix,showkeys,twocolumn,showpacs,preprintnumbers]{revtex4}

2: %\usepackage{natbib}

3: \usepackage{graphicx}

4: \usepackage{amssymb}

5:

6: \begin{document}

7:

8: \title{Predicting the species abundance distribution using a model food web}

9:

10: \author{Craig R. Powell}

11: \email{craig.powell@manchester.ac.uk}

12: \affiliation{Theoretical Physics Group, School of Physics and Astronomy, University of Manchester, Manchester, M13 9PL, UK}

13:

14: \author{Alan J. McKane}

15: \email{alan.mckane@manchester.ac.uk}

16: \affiliation{Theoretical Physics Group, School of Physics and Astronomy, University of Manchester, Manchester, M13 9PL, UK}

17:

18:

19: \date{\today}% It is always \today, today,

20:              %  but any date may be explicitly specified

21:

22: \begin{abstract}

23: A large number of models of the species abundance distribution (SAD)

24: have been proposed, many of which are generically similar to the

25: log-normal distribution, from which they are often indistinguishable

26: when describing a given data set.  Ecological data sets are

27: necessarily incomplete samples of an ecosystem, subject to

28: statistical noise, and cannot readily be combined to yield a closer

29: approximation to the underlying distribution.  In this paper we use

30: empirical data obtained from an ecosystem model to study the predicted

31: SAD in detail, resolving features which can distinguish between models

32: but which are poorly seen in field data.  We find that the power-law

33: normal distribution is superior to both the log-normal and

34: logit-normal distributions, and that the data can improve on even this

35: at the high-population cut-off.

36: \end{abstract}

37:

38: \pacs{87.23.Cc}

39:

40: \keywords{ecological diversity; trophic distribution; ecological community model}

41:

42: \maketitle

43:

44: \section{Introduction}

45: \label{intro}

46: The species abundance distribution (SAD) is one of the most widely

47: studied descriptions of an ecological community.  To determine it, the

48: number of species in a given community which have $n$ observed

49: individuals is plotted against $n$.  The shape of this plot has been

50: investigated by a great many empiricists and theorists over the years,

51: beginning with the classic work of \citet{fis43} and \citet{pre48}.

52: Reviews of the subject \citep{whi65,may75,gra87,mar03,mcg07} reveal

53: the large number of mechanisms that have been proposed to explain the

54: observed SAD.  Many of these mechanisms predict the essential aspects

55: of the observations, that is, a few very abundant species and many

56: rare species.  As a consequence it has become very difficult to

57: falsify proposed mechanisms from empirical data, which has led the

58: authors of the most recent multi-author review \citep{mcg07} to

59: contrast the development of the analysis of SADs with ``a healthy

60: scientific field'' in which ``theoretical, empirical and statistical

61: developments [...] advance roughly in parallel''.

62:

63: In this paper we suggest a way forward which is in effect intermediate

64: between the theoretical and empirical approaches.  We measure the SAD

65: in an established model which constructs an ecological community as

66: a set of predator-prey interactions \citep{dro01}.  The model itself

67: was originally created so that many of its key properties were

68: emergent and not put in by hand.  So, for instance, trophic levels

69: emerge from the nature of the predator-prey interactions; species are

70: not labelled as ``plants'', ``herbivores'' or ``carnivores'' a

71: priori.  This contrasts with traditional theoretical approaches which

72: either postulate a mechanism, or if a model community is put forward

73: it is usually rather simple, with the form of the SAD following from

74: one of the fundamental aspects of the theory.  Conversely,

75: measurements taken in the field will of necessity include numerous

76: influences involving climate, terrain, location etc., which are not

77: present in the model we use to measure SADs.  Thus the SADs we measure

78: will be free of these external influences, but still be determined by

79: influences which are too complex to easily characterise.  This

80: approach will also allow us to measure SADs for a multi-trophic level

81: community whereas, so far as we are aware, all other predictions for

82: SADs have been for communities of trophically similar species.

83:

84: The model we will be using (called the Webworld model) has been

85: developed over a number of years \citep{cal98,dro01,dro04,qui05}.  In

86: it, species are defined in terms of traits (phenotypic and behavioural

87: characteristics), and it is the nature of the interactions between

88: these traits which defines the nature of interactions between species.

89: This community is built up from a small number of species through a

90: speciation mechanism which creates a new species with a novel set of

91: features.  Resources are distributed through a quite elaborate set of

92: equations governing population dynamics with adaptive foraging.

93: Overviews of the model are given in review articles

94: \citep{dro03,mck05,mck06}, and more briefly in section~\ref{Sec:model}.

95: In section 3 we outline the method of our analysis, in

96: section 4 we describe the results obtained, and we conclude with a review

97: of the results in section 5.

98:

99: \section{Model description}

100: \label{Sec:model}

101: The Webworld model consists of a set of species, each defined by its

102: unique combination of ten different features.  The features are chosen

103: from a set of $L$ possible features determined at the start of the

104: simulation, at which point two species are created.  One of these is

105: the environment species, which has a fixed population for all time and

106: is the ultimate source of energy for all species in the food web.  The

107: other initial species is the common ancestor of all species encountered

108: during a simulation run.

109:

110: The dynamics of the Webworld model occur on three separated

111: time-scales.  The longest of these is the evolutionary time-scale, on

112: which new species are added as mutated versions of extant species.

113: In the Webworld implementation specifically, a species is selected

114: at random without regard to its population, except that this must be

115: non-zero.  One individual of that species is then used to found a new

116: species identity, sharing all features but one with the parent

117: species.  The remaining feature is selected to avoid repetition either

118: of the same feature within one species or the same set of features

119: between species, but is otherwise selected at random.  The newly introduced

120: species is then subject to the same population dynamics as all other

121: species, which is the dynamical process that occurs on the

122: intermediate time-scale.

123:

124: Population dynamics occurs by balance of energy; energy is gained

125: through ``predation'', which in the case of feeding on the environment

126: species we interpret as autotrophy.  Each species $i$ changes its

127: population $n_i$ according to the balance equation

128: \begin{equation}

129:   \dot n_i=\lambda\sum_jg_{ij}n_j - \sum_jg_{ji}n_i - dn_i,

130:   \label{eq:popdyn}

131: \end{equation}

132: where $g_{ij}$ is the functional response, the dependence of the rate

133: of energy consumption by species $i$ on the population of species

134: $j$.  The factor of $\lambda=0.1$ introduces an ecological efficiency

135: whereby the energy lost to species $j$ is greater than that gained by

136: its predator $i$.  Thus the first term on the right hand side of

137: Eq.~(\ref{eq:popdyn}) is the energy income of species $i$ summed

138: across all prey species, while the second term is the energy loss

139: summed across all predators.  If species $a$ does not feed on species

140: $b$ then $g_{ab}=0$, and hence this does not contribute to either

141: sum.  The final term in Eq.~(\ref{eq:popdyn}) is the loss of energy

142: from species $i$ due to death of its constituent individuals at rate

143: $d$ per individual; the expected lifespan of an individual is

144: therefore $1/d$, which for simplicity we take to be the same across

145: all species.  Death appears in our model purely as an energy

146: loss term and cannot usefully be made an evolvable quantity, since

147: selection would always drive it towards its preferred value of zero.

148:

149: The shortest time-scale in the Webworld model reflects the choice of

150: foraging strategy by individuals of each species.  The functional

151: response for Eq.~(\ref{eq:popdyn}) is given by

152: \begin{equation}

153:   g_{ij} = \frac{f_{ij}S_{ij}}{bn_j+\sum_k\alpha_{ik}f_{kj}S_{kj}n_k}n_i,

154:   \label{eq:functionalResponse}

155: \end{equation}

156: where $f_{ij}$ is the fraction of its time species $i$ spends feeding

157: on prey species $j$, which is the quantity to be optimised in order to

158: maximise $\sum_jg_{ij}n_j$.  $S_{ij}$ and $\alpha_{ij}$ are constants

159: defined by relating the features of species $i$ and $j$, $S$

160: indicating the ability of $i$ to feed on $j$, and $\alpha$ relating to

161: the degree of inter-specific competition.  To prohibit mutual

162: predation the matrix $S$ is made anti-symmetric, thus

163: $S_{ij}=-S_{ji}$, and the shortest possible feeding loop involves

164: three species.  Matrix $\alpha$ is symmetric, with maximum competition

165: $\alpha_{ii}=1$ between members of the same species, and minimum

166: competition 0.5 between highly-different species.  By calculating $S$

167: and $\alpha$ based on a set of features largely conserved during

168: speciation we ensure that each newly-founded species has similar

169: abilities to its parent species, with which it is also in strong

170: competition, and in particular the dynamics of two identical species,

171: were they allowed, would be indistinguishable from the dynamics of

172: pooling them as one species.

173:

174: In \citet{dro01} an evolutionarily stable strategy was shown to exist

175: for foraging, which can be found by iteratively solving

176: Eq.~(\ref{eq:functionalResponse}) with the condition that

177: \begin{equation}

178:   f_{ij}=\frac{g_{ij}}{\sum_kg_{ik}}.

179: \end{equation}
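
As an illustrative special case (a simplification introduced here, not one
of the simulated communities), consider a single species $i$ with no
predators feeding only on the environment, which we label species $0$ with
population $n_0=R$; the label $0$ is our own notation.  The foraging
condition above then gives $f_{i0}=1$, and with $\alpha_{ii}=1$
Eq.~(\ref{eq:functionalResponse}) reduces to
\[
  g_{i0}=\frac{S_{i0}n_i}{bR+S_{i0}n_i}.
\]
Substituting this into Eq.~(\ref{eq:popdyn}) gives
\[
  \dot n_i=\frac{\lambda S_{i0}Rn_i}{bR+S_{i0}n_i}-dn_i,
\]
whose non-zero fixed point is
\[
  n_i^*=R\left(\frac{\lambda}{d}-\frac{b}{S_{i0}}\right).
\]
This is positive only if $S_{i0}>bd/\lambda$, and in this limit the
equilibrium population is proportional to $R$.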

180:

181: The result of the repeated application of these dynamics is the

182: gradual construction of a complex food web.  Species are removed if

183: their population falls below 1, and the fixed population of the

184: environment species, $R$, therefore determines the expected number of

185: species present in the food web at any time, though there is a

186: continual turnover of species and consequent fluctuation in any given

187: food web measure.  After running the model for a large number of

188: evolutionary time steps, there is no systematic change in quantities

189: such as the number of concurrent species, and the food web structure

190: appears to have matured.  It is on such webs that we examine the

191: species abundance distribution.

192:

193: \section{Method}

194: Using the Webworld model discussed in the previous section we generate

195: sets of communities for which the ensemble species abundance

196: distribution (SAD) can be examined in detail.  Because we use the same

197: set of possible features and the same environment species in each

198: case, we assume that the underlying SAD does not alter between model

199: realisations.  In this case it is possible to pool the resultant

200: communities in order to determine the SAD with reduced statistical

201: noise.  Details of the computational approach are given in

202: section~\ref{Sec:Computational}.  In section~\ref{Sec:Fitting} we

203: discuss the functions which we fit to the data, and the optimisation

204: criteria of the fitting.  In section~\ref{Sec:Generalisation} we

205: discuss the problems of generalising fits to include communities

206: differing in size or trophic level.

207:

208: \subsection{Comparative models}

209: \label{Sec:Computational}

210: Although the Webworld model can simulate ecological communities in

211: reasonable time, to create large complex communities takes

212: considerable computation, and to generate enough simulations to get

213: good statistics across a broad range of parameter space is difficult.

214: We therefore perform the first examination on a variant of Webworld in

215: which all species feed exclusively on the environment.  Because all

216: species are basal, the relative populations are determined by the

217: relative ability, $S$, and competition, $\alpha$, terms between

218: existing species, which are selected by evolution in the same way as

219: in the full model.  By avoiding a large part of the computational

220: effort we are able to generate large numbers of webs for comparison,

221: and in the results presented here gather statistics from a set of one

222: hundred model runs for each value of resources, $R$.  In

223: section~\ref{Sec:Results} we focus on the fitting of food webs with

224: resources $10^3$, $10^4$, $10^5$ and $10^6$, but simulations were

225: performed for numerous other values of $R$ within this range to show

226: that interpolation of the results is reasonable.  The minimum value of

227: $R$ results in communities with few species, which become

228: correspondingly harder to characterise in terms of an SAD.  Larger

229: values of $R$ become increasingly computationally expensive.  Rather

230: than attempting to extend the range of $R$ to larger values, we

231: created a total of 900 basal communities at $R=10^6$ for more detailed

232: analysis of the tails of the distribution.  Because the common

233: theoretical SADs have been selected based on reproduction of the modal

234: peak, and are poorly constrained by observations, the tails offer the

235: largest differences between candidate SADs.  Due to the much larger

236: computational complexity of the full Webworld model, we have only a

237: sample of ten comparable food webs for large $R$ from which to deduce

238: trophic SADs.

239:

240: \subsection{Fitting method}

241: \label{Sec:Fitting}

242: As can be seen in Figure~\ref{fig:basalPDF}, the probability

243: distribution function (p.d.f.) of species abundance has a rather noisy

244: histogram even for the largest collection of independent communities

245: we were able to assemble with the available computer time.  Fitting a

246: distribution function to such histograms is problematic for several

247: reasons.  The noise makes it difficult to algorithmically optimise the

248: fitting function, and hence can obscure differences in the strength of

249: different functional forms.  More importantly, the apparently optimal

250: parameters and associated fitness will depend on the arbitrary choice

251: of bin width and position, since changing these parameters can

252: significantly alter the distribution of noise between the bins.

253: Furthermore, the distribution function underlying the observed SAD is

254: likely to have a functional form other than our approximations, and in

255: general may be significantly more complicated than we can extract from

256: data so long as the noise remains.  Rather than obtaining a function

257: which closely matches the value of the p.d.f. for most population

258: sizes, but which omits important features, we prefer to recover a

259: smoothed version of the distribution function which correctly predicts

260: the total number of species.  As a consequence of these considerations

261: we fit the integrated version of the fitting function to the empirical

262: cumulative distribution function (c.d.f.), whose value at a given

263: population $N$ is the measured number of species with $n_i<N$.  This

264: definition matches the type of p.d.f. used by \citet{may75} whose

265: integral is the expected number of species.  P.d.f.s may also be

266: defined such that the area enclosed is unity.  To illustrate the

267: fitting procedure we present plots of the measured and fitted c.d.f.s

268: in addition to the p.d.f.s, and indicate the goodness-of-fit by

269: plotting the residuals of the c.d.f., that is, the difference between

270: the integrated fitting function and the measured c.d.f.

271:

272: The strongest condition that we impose on each fitting function is

273: that it should correctly predict the number of species more abundant

274: than the least abundant species observed.  Below this population the

275: distribution may be terminated by a veil line, but we do not allow any

276: such consideration for populations above the most abundant species

277: observed.  Subject to this condition we optimise the parameters of

278: each theoretical distribution function, $f\!\left(\ln N\right)$, by

279: minimising a quantity analogous to $\chi^2$.  One such statistic is

280: the Cram\'er-von Mises test \citep{bai91}, defined as

281: \begin{equation}

282:  CM=\frac1{12S}+\frac1{S}\sum_{i=1}^S\left(i-0.5-\hat f\!\left(n_i\right)

283: \right)^2,

284: \end{equation}

285: where $\hat f\!\left(n_i\right)$ is the predicted number of species

286: less abundant than $n_i$, subject to the veil line at $n_1$, and $S$

287: is the number of species observed.  Although this is readily

288: generalised to an ensemble of SADs, it attributes most weight to the

289: peak of the distribution at the expense of fitting the tails, and we

290: instead minimise the quantity

291: \begin{equation}

292:   k^2=\int_{\ln n_1}^{\ln N_{\rm max}}\left(

293:   C\!\left(N\right)-\hat f\!\left(N\right)

294:   \right)^2{\rm d}\ln N,

295: \end{equation}

296: where $C\!\left(N\right)$ is the observed number of species less

297: abundant than $N$.  For many distributions $N_{\rm

298: max}\rightarrow\infty$, but functions such as the logit-normal

299: distribution are parametrised by the total number of individuals

300: observed, $J$, in which case $N_{\rm max}=J$.  Unlike the Cram\'er-von

301: Mises statistic, $k^2$ places equal weight in all intervals of $\ln

302: N$.  Given that the theoretical distribution almost certainly differs

303: from the distribution underlying the data, this tends to avoid leaving

304: poorly fitted regions, such as ranges of $N$ in which few species are

305: observed but where the empirical and theoretical c.d.f.s differ.  The

306: tails of the distribution often behave in this manner.  Having

307: identified optimal fitting parameters by minimising $k^2$, we follow

308: the advice of \citet{mcg07} that ``claim[s] of a superior fit must be

309: robust by being superior on multiple measures'' by evaluating the

310: Kolmogorov-Smirnov statistic \citep{hay02} for each theoretical

311: distribution.  Defined for a single realisation as

312: \begin{equation}

313:   d=S^{-1/2}{\rm max}_i\left\{\left|i-1-\hat f\!\left(n_i\right)\right|,

314:                               \left|i-\hat f\!\left(n_i\right)\right|\right\},

315:   \label{Eq:KS}

316: \end{equation}

317: $d$ corresponds to the greatest deviation between the empirical and

318: theoretical c.d.f.s.  This must occur at one of the observed species,

319: which correspond to steps in the empirical c.d.f.  It is necessary to

320: evaluate the difference between the empirical and theoretical

321: c.d.f. both immediately before and after the step, and hence the

322: `maximum' operator in Eq.~(\ref{Eq:KS}) contains two terms for each

323: observation $i$.  Although the values of $d$ obtained imply rejection

324: of the theoretical distributions given the amount of data available,

325: we use $d$ as a measure of the relative goodness-of-fit to distinguish

326: between theoretical distributions.  Other measures of goodness-of-fit

327: tend to relate to binned data rather than the c.d.f., and provide

328: correspondingly weaker evidence \citep{mcg03}.
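
For a single realisation the integral defining $k^2$ can be evaluated
piecewise, which may be convenient when implementing the fit.  As a sketch,
assume that the $S$ observed populations are distinct and sorted so that
$n_1<n_2<\dots<n_S$, and write $n_{S+1}\equiv N_{\rm max}$ (a notation
introduced here for convenience).  The empirical c.d.f. then takes the
constant value $C\!\left(N\right)=i$ for $n_i<N\le n_{i+1}$, so that
\[
  k^2=\sum_{i=1}^{S}\int_{\ln n_i}^{\ln n_{i+1}}
  \left(i-\hat f\!\left(N\right)\right)^2{\rm d}\ln N.
\]
When $N_{\rm max}\rightarrow\infty$ the final term converges provided that
$\hat f\!\left(N\right)$ approaches $S$ sufficiently rapidly at large $N$,
which the condition on the total number of species is intended to ensure
for normal-type distributions.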

329:

330: \begin{figure}

331:   \centering

332:   \includegraphics[width=0.45\textwidth]{fig1.eps}

333:   \caption{The fitted species abundance distribution for basal

334:     communities with resources $R=1000$, 10\,000, 100\,000 and

335:     1\,000\,000. The histogram indicates the data in bins of width 0.1

336:     in $\ln N$.  The solid curves indicate optimal log-normal fits,

337:     the dotted lines optimal logit-normal fits, and the dashed lines

338:     optimal power-law normal fits.  Distributions to the right

339:     correspond to increasing $R$.

340:   }

341:   \label{fig:basalPDF}

342: \end{figure}

343: Although the log-normal distribution has been criticised as

344: inappropriate for application to SADs \citep{wil05}, it is a commonly

345: examined form of the SAD and we therefore adopt it as one of the

346: theoretical SADs we fit to the data.  We also consider the

347: logit-normal distribution preferred by \citet{wil05}.  Whereas the

348: log-normal distribution appears as a normal distribution when plotted

349: against a logarithmic population-axis, the logit-normal has a normal

350: distribution when plotted against a logit population axis.  Our

351: analysis will consistently use the logarithmic axis both for plotting

352: and for the integration of $k^2$, so while the log-normal distribution

353: has the form

354: \begin{equation}

355:   P\!\left(\ln N\right){\rm d}\ln N=A\exp\left\{-\frac{\left(\ln

356:   N-\ln\mu\right)^2}{2\sigma^2}\right\}{\rm d}\ln N,

357: \end{equation}

358: with the fitting parameters $A$, $\mu$ and $\sigma$, the logit-normal

359: distribution includes an extra factor, giving

360: \begin{equation}

361:  P\!\left(\ln N\right)=A\frac{J}{J-N}

362: \exp\left\{-\frac{\left(\ln\frac{N}{J-N}

363:  -\ln\frac{\mu}{J-\mu}\right)^2}{2\sigma^2}\right\}.

364: \end{equation}

365: We also consider a third fitting function, the power-law normal

366: distribution, which appears normal against a power-law population

367: axis.  Transformed to a logarithmic axis, this has the functional form

368: \begin{equation}

369: P\!\left(\ln N\right)=A\alpha

370: N^\alpha\exp\left\{-\frac{\left(N^\alpha-\mu^\alpha\right)^2}

371: {2\sigma^2}\right\},

372: \label{Eq:PowerLawNormaldlnN}

373: \end{equation}

374: where $\alpha$ is the power-law index.  We do not consider the

375: log-series distribution since our data are with few exceptions peaked

376: at large $N$, whereas the p.d.f. of the log-series distribution

377: decreases from $N=1$ even when drawn against a logarithmic

378: population-axis.  The broken stick distribution \citep{mag88} was found to be

379: similar in form to the observed distribution, but inferior to the

380: log-normal in all cases.
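
The prefactors in the logit-normal and power-law normal forms above can be
understood as the Jacobians that arise when a distribution which is normal
in a transformed variable $x\!\left(N\right)$ is expressed per unit $\ln N$.
As a short check, writing $x$ for the transformed variable and
$x_\mu=x\!\left(\mu\right)$ (our notation),
\[
  P\!\left(\ln N\right)=A\exp\left\{-\frac{\left(x-x_\mu\right)^2}
  {2\sigma^2}\right\}\frac{{\rm d}x}{{\rm d}\ln N}.
\]
For the log-normal, $x=\ln N$ and the Jacobian is unity.  For the
logit-normal, $x=\ln\left[N/\left(J-N\right)\right]$ and
\[
  \frac{{\rm d}x}{{\rm d}\ln N}
  =N\left(\frac1N+\frac1{J-N}\right)=\frac{J}{J-N},
\]
while for the power-law normal, $x=N^\alpha$ and
\[
  \frac{{\rm d}x}{{\rm d}\ln N}=\alpha N^\alpha.
\]
When $N\ll J$ and $\mu\ll J$ the logit-normal Jacobian tends to unity and
$x-x_\mu\rightarrow\ln N-\ln\mu$, so that the log-normal form is recovered.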

381:

382: \subsection{Comparison of food webs}

383: \label{Sec:Generalisation}

384: \begin{figure}

385:   \centering

386:   \includegraphics[width=0.45\textwidth]{fig2.eps}

387:   \caption{The fitted cumulative species abundance distribution for

388:     basal communities with resources $R=1000$, 10\,000, 100\,000 and

389:     1\,000\,000.  The solid line shows the data, the dotted line marks

390:     the log-normal distribution, and the dashed line the power-law

391:     normal distribution.

392:     Distributions to the right correspond to increasing $R$.

393:   }

394:   \label{fig:basalCDF}

395: \end{figure}

396: Since we are applying the same distribution function with different

397: parameters to basal food webs of different sizes, and to the SADs of

398: different trophic levels within a single community, in the ideal case

399: a parametrisation of the fitting coefficients in terms of resources,

400: $R$, and trophic level, $l$, would be found.  Because small values of

401: $R$ correspond to food webs with fewer species, complications arise in

402: weighting the contribution to goodness-of-fit from differently sized

403: webs, and we do not in this paper attempt to simultaneously fit webs

404: of different sizes.  By examination of the best-fitting parameters for

405: each web we can determine the dependence of parameters on $R$ except

406: in one case; the power-law index $\alpha$ of the power-law normal

407: distribution.  For most values of $R$ the goodness-of-fit depends

408: quite weakly on this parameter, and the optimal value of $\alpha$ is

409: poorly constrained for any one web.  Since we were unable to identify

410: a systematic trend or strongly constrain the value of $\alpha$, we

411: chose $\alpha=0.2$ as a constant value consistent with the optimised

412: parameters, and fixed this value for all results presented here.

413:

414: \section{Results}

415: \label{Sec:Results}

416: In section~\ref{Sec:Basal} we present the results of the fitting

417: procedure for the basal communities.  These should give the least

418: complicated species abundance distributions (SADs), since all species

419: feed on a single resource and are in direct competition with each

420: other.  In comparison, the trophic communities examined in

421: section~\ref{Sec:Trophic} feed on multiple food sources themselves

422: distributed in abundance, and compete with different subsets of the

423: other species.  In section~\ref{Sec:Tails} we make use of the large

424: number of simulation runs which can be performed to make a detailed

425: examination of the low- and high-population tails of the empirical

426: distribution, and compare this to the behaviour of the fitted

427: distributions.

428:

429: \subsection{Basal community}

430: \label{Sec:Basal}

431: \begin{figure}

432:   \centering

433:   \includegraphics[width=0.45\textwidth]{fig3.eps}

434:   \caption{The same data plotted in Figure~\ref{fig:basalCDF} shown as

435:     residuals; the solid line corresponds to the empirical c.d.f. minus the

436:     log-normal distribution, the dotted line to the data minus the

437:     logit-normal distribution, and the dashed line to the data minus the

438:     power-law normal distribution.  Offsets of -0.5, -1.5 and -2.5 have been

439:     applied to data for resources $R=10\,000$, 100\,000 and

440:     1\,000\,000 respectively.

441:   }

442:   \label{fig:basalCDFr}

443: \end{figure}

444: \begin{figure}

445:   \centering

446:   \includegraphics[width=0.45\textwidth]{fig4.eps}

447:   \caption{Parameters of the power-law normal fit to the basal

448:     community SAD for all values of resources examined.  The solid

449:     line passes through points marking the mean population, $\mu$ in

450:     Eq.~(\ref{Eq:PowerLawNormaldlnN}); the dashed line marks the

451:     standard deviation, $\sigma$.  Squares mark the Kolmogorov-Smirnov

452:     $d$ value, and stars mark the quantity $K$ described in the text.

453:   }

454:   \label{fig:basalFitPar}

455: \end{figure}

456: The results for this version of the model are the most complete in

457: that one hundred simulation runs were examined for each value of

458: resources $R$, and a large number of values of the continuous

459: parameter were examined.  In Figures~\ref{fig:basalPDF} to

460: \ref{fig:basalCDFr} only four of these realisations are plotted,

461: corresponding to $R=10^3$, $10^4$, $10^5$ and $10^6$, which include

462: the two most extreme values of $R$ for which webs were calculated.

463: The general features of the SAD for these four values are typical, as

464: is the goodness of fit achieved by each of the three fitting functions

465: examined.  It is clear from Figure~\ref{fig:basalPDF} that the

466: observed distribution is left-skewed (has an over-abundance of rarer

467: species), a characteristic absent from the log-normal distribution.

468: The logit-normal distribution does not have significantly improved

469: skew over the log-normal distribution, since the most abundant species

470: from any run has less than one quarter of the mean number of

471: individuals $J$, and the logit function is therefore well below its

472: asymptotic cut-off.  \citet{wil05} note that in this limit the

473: logit-normal distribution approaches the log-normal.  The power-law

474: normal distribution much more closely captures the smaller high-$N$

475: tail.  The corresponding cumulative distribution functions (c.d.f.s)

476: are plotted in Figure~\ref{fig:basalCDF}, where the logit-normal

477: distribution has been omitted for clarity.  It can be seen, especially

478: for $R=10^6$, that the log-normal distribution underestimates the

479: cumulative number of species in both tails, which corresponds to the

480: skew of the p.d.f., and that even for one hundred realisations the

481: empirical c.d.f. is far from smooth.  More instructive than the

482: c.d.f. are the residuals of this plot, that is, the difference between

483: the instantaneous value of the empirical c.d.f. and the fitting

484: function.  These are shown in Figure~\ref{fig:basalCDFr} for all three

485: fitting functions.  The integral of the square of this plot is our

486: goodness-of-fit measure $k^2$, and the maximum deviation from zero is

487: the Kolmogorov-Smirnov $d$-measure.  Substantial structure can be seen

488: in the residuals, especially the central peak for each value of $R$

489: when examining the power-law normal distribution, which most closely

490: mimics the tails.  Table~\ref{Tab:Fitting} records the values of $k^2$

491: and the Kolmogorov-Smirnov $d$ value for each fit, for fitting

492: parameters minimising $k^2$.  Basal communities are labelled by the

493: value of resources, $R$, while trophic levels examined in

494: section~\ref{Sec:Trophic} are labelled according to the trophic level,

495: $l$.  For the basal food webs the power-law normal fit always

496: outperforms both the logit-normal and log-normal distributions in

497: terms of $k^2$, and is only in one case inferior to the logit-normal

498: distribution as measured by $d$.  A further comparison of the relative

499: merits of the theoretical distribution functions is given in

500: section~\ref{Sec:Tails}.

501:

502: In Figure~\ref{fig:basalFitPar} we plot the dependence of the

503: parameters of the power-law normal fit on $R$, as well as the two

504: goodness-of-fit indicators used.  The solid line, marking the

505: population at the peak of the distribution, shows that this value

506: varies very nearly linearly with $\ln R$.

507: The standard deviation of the distribution increases more rapidly than

508: linearly, as indicated by the dashed line.  The value of $k^2$

509: increases with $\ln R$ for two reasons.  Firstly, it is measured on

510: the full c.d.f. rather than the normalised distribution, and so tends

511: to increase as the square of the expected number of species, $S$.

512: Secondly, because it is an integrated measure, it tends to increase

513: with the width of the distribution, which we characterise by the

514: standard deviation of the log-normal distribution, $\sigma_{\rm LN}$.

515: It is more appropriate to use this measure than the standard deviation

516: of the power-law normal itself since the former corresponds naturally

517: to the width along the logarithmic population axis.  In

518: Figure~\ref{fig:basalFitPar} we plot the quantity

519: \begin{equation}

520:   K=\frac{1000k^2}{S^2\sigma_{\rm LN}},

521:   \label{Eq:K}

522: \end{equation}

523: which compensates for these effects, and includes a factor of $1000$

524: to scale it appropriately for that plot.  It can be seen that

525: intermediate values of $R$ are the best fitted, as measured by either

526: $K$ or the Kolmogorov-Smirnov $d$, perhaps due to relatively small

527: amounts of additional structure.

528:

529: \subsection{Trophic levels}

530: \label{Sec:Trophic}

531: \begin{figure}

532:   \centering

533:   \includegraphics[width=0.45\textwidth]{fig5.eps}

534:   \caption{Histograms mark the observed species abundance distribution

535:     for the four trophic levels found in the ten Webworld communities

536:     examined.  Trophic levels two and four are marked by dotted and

537:     dashed lines respectively.  Solid curves mark the optimal

538:     log-normal fits to each trophic level, and dashed lines the

539:     optimal power-law normal fits.

540:   }

541:   \label{fig:trophicPDF}

542: \end{figure}

543: \begin{figure}

544:   \centering

545:   \includegraphics[width=0.45\textwidth]{fig6.eps}

546:   \caption{The cumulative species abundance distribution for

547:     Webworld communities corresponding to the four observed trophic

548:     levels.  Higher trophic levels are to the left of lower levels,

549:     having smaller typical populations.  Optimal log-normal fits are

550:     marked by dotted lines, and optimal power-law normal fits by

551:     dashed lines.

552:   }

553:   \label{fig:trophicCDF}

554: \end{figure}

555: Having established that the power-law normal distribution describes

556: the SAD reasonably well for basal communities, we apply it to

557: individual trophic levels of full Webworld communities to determine

558: the relevant fitting parameters.  Due to the small number of food webs

559: available, and the small number of species in each trophic level for

560: any one web, it is inappropriate to seek deviations from this

561: distribution with the data available, although we find that the

562: power-law normal distribution is adequate, and superior in all cases

563: to the log-normal distribution, having smaller values of both $k^2$

564: and $d$.  As indicated by the values given in table~\ref{Tab:Fitting},

565: the logit-normal distribution marginally improves upon the power-law

566: normal distribution for trophic levels 1 and 3 as measured by $d$, but is significantly

567: inferior to the power-law normal for trophic level 2.  For trophic

568: level 1, the typical number of species observed per web in the data

569: examined was only 5.9, the most abundant species being nearly half the

570: total population of its trophic level.  For trophic level 3 the lower

571: tail of the distribution was truncated, and although here the

572: logit-normal distribution performed better than the power-law normal,

573: it is not clear that the logit-normal is able to adequately reproduce

574: the whole SAD.  Although four trophic levels were found in the

575: empirical data, a very small number of species were found in trophic

576: level 4.  It can be seen in Figure~\ref{fig:trophicPDF} that the

577: distribution function of this level is little more than the

578: high-population tail of the distribution function, and no reliable

579: results can be obtained by its analysis.

580:

581: To aid comprehension of the empirical distribution being fitted we

582: reproduce, in Figure~\ref{fig:trophicCDF}, the cumulative distribution

583: function constructed from the simulation data along with the optimal

584: log-normal and power-law normal fits.  It can be seen clearly from

585: this figure that the distribution of the second trophic level, which

586: has the largest number of species in total, is closest in form to

587: those of the basal communities.  The distribution of trophic level

588: three, to its left, passes the veil line before a significant fraction

589: of the low-population tail has been exposed, but is otherwise in good

590: agreement with the basal community distributions.  The lowest trophic

591: level, however, seems relatively truncated, resulting in a much

592: sharper cutoff at large $N$ than is reproduced by either the

593: log-normal or power-law normal distributions.  The cause of this may

594: relate to the presence of predators, who can be expected to

595: preferentially target the most abundant prey, but additional data are

596: required to investigate this hypothesis.  The residuals of the

597: c.d.f. fits are shown in Figure~\ref{fig:trophicCDFr}; it is possible

598: that these contain structure similar to that seen for the basal

599: communities in Figure~\ref{fig:basalCDFr}, but the degree of noise is

600: greater.

601:

602: In Figure~\ref{fig:trophicFit} the mean and standard deviation of the

603: power-law normal distribution are plotted as a function of trophic

604: level.  While the standard deviation appears to decline linearly with

605: trophic level, the distribution mean may decrease more slowly.

606: However, if the results for trophic level four are misleading due to

607: \begin{figure}

608:   \centering

609:   \includegraphics[width=0.45\textwidth]{fig7.eps}

610:   \caption{The same data plotted in Figure~\ref{fig:trophicCDF} shown as

611:     residuals; the solid line corresponds to the empirical c.d.f. minus the

612:     log-normal distribution, the dotted line to the data minus the

613:     logit-normal distribution, and the dashed line to the data minus the

614:     power-law normal distribution.  Offsets of -1.0, -2.0 and -2.5 have been

615:     applied to data for trophic levels 2, 3 and 4 respectively.  No

616:     logit-normal fit was obtained for trophic level 4 due to the

617:     absence of a positive optimal mean.

618:   }

619:   \label{fig:trophicCDFr}

620: \end{figure}

621: the extremely high position of the veil line, and the distribution of

622: basal species is possibly altered through predation as discussed, the

623: reliability of these results is limited.  The quantity $K$, defined in

624: Eq.~(\ref{Eq:K}), is much better for trophic levels two and three

625: than for either the basal or fourth trophic level, although only

626: marginal improvements in the Kolmogorov-Smirnov $d$ value can be seen.

627:

628: \subsection{Distribution tails}

629: \label{Sec:Tails}

630: \begin{figure}

631:   \centering

632:   \includegraphics[width=0.45\textwidth]{fig8.eps}

633:   \caption{

634:     Parameters of the power-law normal fit to the trophic community SAD

635:     for all trophic levels examined.  The solid line passes

636:     through points marking the mean value of $N$.  The lower dashed line

637:     marks the standard deviation, while the upper dashed line multiplies

638:     this quantity by 10 for clarity.  Squares mark the

639:     Kolmogorov-Smirnov $d$ value.  Stars mark the quantity $K$ defined

640:     in the text.

641:   }

642:   \label{fig:trophicFit}

643: \end{figure}

644: An advantage of examining computationally derived communities of

645: species is that extremely large data sets can be constructed with

646: relative ease, subject only to the availability of computer time.  In

647: addition, the Webworld model produces complete ecological communities,

648: and the sampling effects associated with field data are avoidable.  As

649: such it is much more feasible to examine the form taken by the tails

650: of the distribution function, which \citet{mcg07} note are subject to

651: noisy data, but which often contain the main differences between

652: theoretical distributions.

653:

654: To construct a high-quality empirical SAD whose tails could be

655: examined, nine hundred simulation runs were performed for the basal

656: community with $R=10^6$.  The low-population tail of this distribution

657: is plotted in Figure~\ref{fig:lowTail}, where the logarithm of the binned

658: species abundance has been taken to expose the tail.  The fact that a

659: linear regression to these data (not shown) produces a good fit for

660: $\ln N<7$ implies that in this regime a power-law fit,

661: \begin{equation}

662:  P\!\left(\ln N\right)\propto N^a,

663: \end{equation}

664: with $a\sim 4/3$, is obeyed.  The power-law normal distribution is

665: able to reproduce this form reasonably well, while both the log-normal

666: and logit-normal distributions significantly underestimate the

667: number of species present.

668:
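The contrast with the log-normal in this tail can be seen from a brief
sketch, in which $c$, $c'$ and $\mu_{\rm LN}$ are symbols introduced
here for illustration.  The power-law form corresponds to
\begin{equation}
  \ln P\!\left(\ln N\right)=a\ln N+c,
\end{equation}
a straight line of constant slope $a$ against $\ln N$.  A log-normal
distribution with mean $\mu_{\rm LN}$ and standard deviation
$\sigma_{\rm LN}$ in $\ln N$ instead has
\begin{equation}
  \ln P\!\left(\ln N\right)=-\frac{\left(\ln N-\mu_{\rm LN}\right)^2}{2\sigma_{\rm LN}^2}+c',
\end{equation}
whose slope, $\left(\mu_{\rm LN}-\ln N\right)/\sigma_{\rm LN}^2$, grows
without limit as $N$ decreases.  Any log-normal curve must therefore
eventually fall below a power-law tail at sufficiently small $N$,
which is consistent with the underestimate noted above.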

669: The distribution tail for large populations is shown in

670: Figure~\ref{fig:highTail}.  Here bins have been chosen to be uniform

671: in width in population, rather than uniform in $\ln N$, in order to

672: resolve the tail.  The result is that a different version of the

673: distribution is shown,

674: \begin{equation}

675: P\!\left(N\right){\rm d}N=\frac{P\!\left(\ln N\right)}{N}{\rm d}N,

676: \end{equation}

677: which, when integrated with respect to $N$, gives the c.d.f.  Note

678: that in order to highlight the form of the decay, the population axis

679: has been stretched to a power-law.  The regression line, plotted as a

680: dash-dot line, indicates that the high-population tail has the form

681: \begin{equation}

682: P\!\left(N\right){\rm d}N\propto

683: \exp\left\{-\left(\frac{N}{7140}\right)^{1.4116}\right\}{\rm d}N.

684: \end{equation}

685: As can be seen in Figure~\ref{fig:highTail}, this form of the decay

686: declines more rapidly with $N$ than any of the log-normal,

687: logit-normal or power-law normal distributions examined.

688:
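The regression can be sketched explicitly, assuming that it is
performed on the logarithm of the binned $P\!\left(N\right)$; the
symbols $N_0$, $\beta$ and $c$ are introduced here for illustration.
Writing the tail as
$P\!\left(N\right)\propto\exp\left\{-\left(N/N_0\right)^\beta\right\}$
gives
\begin{equation}
  \ln P\!\left(N\right)=c-N_0^{-\beta}N^{\beta},
\end{equation}
which is a straight line when plotted against $N^\beta$.  With the
quoted values $N_0=7140$ and $\beta=1.4116$ the slope $-N_0^{-\beta}$
is approximately $-3.6\times 10^{-6}$.  Because $\beta>1$ this form
falls off faster than a simple exponential in $N$, whereas a
log-normal tail falls off more slowly than any exponential.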

689: Having established probable forms for the low- and high-population

690: tails by regression to Figures~\ref{fig:lowTail} and

691: \ref{fig:highTail}, we combine these into a distribution which has the

692: minimum value of the two tail-fitting functions for all $N$.  In

693: addition to the dashed line marking the empirical c.d.f., identical to

694: that shown in Figure~\ref{fig:basalCDF}, this fit is shown in

695: Figure~\ref{fig:tails} in two forms.  The lower curve is the

696: c.d.f. integrated from zero species at $N=0$, and the upper curve is

697: integrated down from the observed number of species so as to converge

698: with the empirical distribution at large $N$.  The fact that the

699: latter curve is above the former indicates that the combined

700: distribution underestimates the total number of species, implying that

701: it under-predicts the p.d.f. near the peak, to which it was not

702: fitted.  Figure~\ref{fig:tails} therefore also plots the residuals of

703: the tail-fitting distribution as a histogram.  There appear to be at

704: least three peaks in the residuals, making it difficult to identify a

705: plausible general form.  Since we do not have unrelated basal food

706: webs to examine, in particular to establish what parameters of the

707: tail distributions are generic and whether the residuals show a common

708: pattern, it is not appropriate to draw further conclusions about the

709: central part of the distribution.  We are also unable to ascribe a

710: goodness-of-fit to the tail-based distribution due to its inability to

711: reproduce the peak of the distribution.

712: \begin{figure}

713:   \centering

714:   \includegraphics[width=0.45\textwidth]{fig9.eps}

715:   \caption{

716:     The low-population tail of the basal community p.d.f. for resources

717:     $R=10^6$. The p.d.f. is described in the text.  The solid, dotted and

718:     dashed curves mark the log-normal, logit-normal and power-law normal

719:     fits to Figure~\ref{fig:basalCDF} respectively.

720:   }

721:   \label{fig:lowTail}

722: \end{figure}

723:
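The interpretation of the two integrated curves in
Figure~\ref{fig:tails} can be made explicit with a short sketch; the
symbols $P_{\rm T}$, $C_{\uparrow}$ and $C_{\downarrow}$ are notation
introduced here for illustration.  Let $P_{\rm T}\!\left(N\right)$
denote the combined tail-fitting distribution, that is, the smaller of
the two tail functions at each $N$.  The curve integrated up from zero
species at $N=0$ is
\begin{equation}
  C_{\uparrow}\!\left(N\right)=\int_0^N P_{\rm T}\!\left(N'\right){\rm d}N',
\end{equation}
while the curve integrated down from the observed number of species,
$S$, is
\begin{equation}
  C_{\downarrow}\!\left(N\right)=S-\int_N^\infty P_{\rm T}\!\left(N'\right){\rm d}N'.
\end{equation}
Their difference,
\begin{equation}
  C_{\downarrow}\!\left(N\right)-C_{\uparrow}\!\left(N\right)
  =S-\int_0^\infty P_{\rm T}\!\left(N'\right){\rm d}N',
\end{equation}
is independent of $N$, and is positive precisely when the combined
distribution accounts for fewer than $S$ species in total.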

724: \section{Conclusions}

725: \label{Sec:Conclusions}

726: \begin{figure}

727:   \centering

728:   \includegraphics[width=0.45\textwidth]{fig10.eps}

729:   \caption{The high-population tail of the basal community p.d.f. for

730:     resources $R=10^6$.  The $x$-axis is linear in $N^{1.4116}$, which

731:     was found to be the power-law index minimising the $\chi^2$ of the

732:     regression line, but has been marked with corresponding values of

733:     $N$ for clarity.  The histogram marks the value of

734:     $P\!\left(N\right)$, the population density in bins of equal width

735:     in $N$.  The solid, dotted and dashed curves mark the log-normal,

736:     logit-normal and power-law normal fits to

737:     Figure~\ref{fig:basalCDF} respectively.  The dash-dotted line

738:     indicates the best-fitting regression for $N>2000$.

739:   }

740:   \label{fig:highTail}

741: \end{figure}

742: We have investigated the form of the species abundance distribution

743: empirically derived from simulation results of the Webworld food web

744: model.  This model was created to examine patterns of food web

745: assembly, and the form of the species abundance distribution (SAD) was

746: not a factor in its construction.  Rather, the use of population

747: dynamics to establish the success of particular species and feeding

748: strategies within the community leads naturally to variation in the

749: abundance of species which appears similar to the SADs identified from

750: real ecosystems.  By investigating the empirical SAD from the

751: simulations in the same manner as data from real ecosystems we are

752: able to characterise not only the peak of the distribution, which is

753: frequently observed to have a form similar to the log-normal

754: distribution, but also to examine in detail parts of the distribution

755: for which field data are difficult to obtain.  We agree with the

756: conclusion of \citet{wil05} that the logit-normal distribution fits

757: better than the log-normal, but with particular reference to the tails of the distribution

758: find that the power-law normal distribution function is better still.

759: In particular, the log-normal and logit-normal distributions predict

760: that the number of species with population $N$ falls more rapidly with

761: decreasing $N$ than we obtain from our simulation results, which the

762: power-law normal distribution matches very well in this tail.

763:

764: The presence of structure in Figure~\ref{fig:basalCDFr} suggests that

765: a more complicated function is needed to properly reproduce the

766: observed SAD, but we have not been able to examine the reproducibility

767: of this remaining structure.  All the food webs examined were created

768: for the same set of possible features and the same environment

769: species.  To fully explore the results even for a single value of $R$

770: would require the use of food webs constructed for `worlds' with

771: different environment species and feature sets.  In undertaking such a

772: programme it would first be necessary to establish whether such

773: parameters as the mean and variance of the fitted distribution

774: changed, or more generally to construct the meta-distribution of a

775: large number of Webworld `worlds' and test, using the

776: Kolmogorov-Smirnov $d$ value, whether the empirical distribution

777: constructed from webs of a single family was consistent with the

778: meta-distribution.

779:

780: We find that the power-law normal distribution, which describes the

781: SAD of a basal community well, is also successful in

782: describing individual trophic levels of a food web.  It is

783: particularly descriptive of the second trophic level, which can be

784: seen in Figures~\ref{fig:trophicPDF} and \ref{fig:trophicCDF} to be

785: the most completely realised by our empirical data.  The higher

786: trophic levels can also be expected to be well-fitted by the power-law

787: normal distribution, although the truncation of the distribution at

788: low populations results in the log-normal and logit-normal

789: descriptions also being adequate.  The empirical distribution of the

790: lowest trophic level is more sharply truncated at high populations

791: than seen for other communities; establishing the reason would require

792: substantial additional investigation.  Unlike the case of examining

793: basal communities at different values of $R$, only a small number of

794: trophic levels are ever possible, and hence the relation between them

795: is harder to quantify.  While it would be possible to construct

796: meta-distributions from larger numbers of food webs, it is more

797: feasible to first examine the agreement between the meta-distributions

798: of basal communities and the constituent distributions.  If there is

799: good agreement, the agreement between the meta-distribution and the

800: trophic distributions should be examined.  If not, then a very large

801: number of communities need to be evolved in the same environment in

802: order to study the relation between trophic levels, potentially also

803: examining the effect of different values of $R$.  The main problem in

804: investigating the SAD of numerically modelled ecosystems is the

805: extensive computer time required to provide data.

806:

807: The SADs constructed for this paper are complete not only in the sense

808: that they contain all individuals present in the sample area, but also

809: in that they do not feature immigrant or transient species, which can

810: contribute to the low-population tail without representing a viable

811: population.  While features such as immigration from surrounding

812: communities can easily be incorporated into our model, as can finite

813: population effects, their exclusion demonstrates the existence of an

814: extensive low-population tail to the distribution even for a closed

815: ecosystem.  This contrasts with the proposal by \citet{mag03} that the

816: low-population tail is a log-series distribution of ``occasional''

817: species added to a core log-normal distribution.  Although we do not

818: agree with \citet{mcg03a} that left-skew is purely an effect of

819: sampling, it may be the case that the left-skew of incomplete

820: samples does not reflect the underlying distribution.

821: \begin{figure}

822:   \centering

823:   \includegraphics[width=0.45\textwidth]{fig11.eps}

824:   \caption{

825:     The c.d.f. for the basal community with resources $R=10^6$, as shown

826:     in Figure~\ref{fig:basalCDF}, is shown as a dashed line.  The

827:     solid lines mark the c.d.f. constructed from fits to the tails as

828:     described in the text.  The histogram marks the residuals of the

829:     p.d.f. of this fit.

830:   }

831:   \label{fig:tails}

832: \end{figure}

833:

834: \citet{mcg07} observe that most proposed SADs are similar to one

835: another except in the tails, which is precisely the region which field

836: observations are least able to address due to paucity of data.  This

837: issue can be addressed by the use of any model which can produce

838: multiple independent realisations of its dynamics from which a

839: composite SAD can be constructed, but this process can only be used to

840: inform the analyses which should be performed on ecological data,

841: since it is not known a priori that any given model accurately

842: reproduces the real SAD.  A virtue of the Webworld model is that it

843: produces a plausible SAD without any such consideration having been

844: used during the model design, being based rather on plausible

845: ecological rules.

846:

847: \section*{Acknowledgements}

848: The authors thank Carlos A. Lugo for providing additional simulation

849: data.  This work was supported by EPSRC under grant GR/T11784.

850:

851: \begin{thebibliography}{21}

852: \providecommand{\natexlab}[1]{#1}

853:

854: \bibitem[{{Fisher} \textit{et~al.}(1943){Fisher}, {Corbet}, \&

855:   {Williams}}]{fis43}

856: \textsc{{Fisher}, R.~A., {Corbet}, A.~S., {Williams}, C.}, 1943; \textit{The

857:   relationship between the number of species and the number of individuals in a

858:   random sample from an animal population}.

859: \newblock J. Animal Ecology \textbf{12} 42

860:

861: \bibitem[{{Preston}(1948)}]{pre48}

862: \textsc{{Preston}, F.~W.}, 1948; \textit{The commonness and rarity of species}.

863: \newblock Ecology \textbf{29} 254

864:

865: \bibitem[{{Gray}(1987)}]{gra87}

866: \textsc{{Gray}, J.~S.}, 1987; \textit{Species-abundance patterns}.

867: \newblock In {O}rganization of {C}ommunities {P}ast and {P}resent (J.~H.~R.

868:   {Gee}, P.~S. {Giller}, eds.) (Blackwell Science, Oxford), 53--68

869:

870: \bibitem[{{Marquet} \textit{et~al.}(2003){Marquet}, {Fern\'andez}, \&

871:   {Cofre}}]{mar03}

872: \textsc{{Marquet}, P.~A., {Fern\'andez}, J.~A., {Cofre}, H.}, 2003;

873:   \textit{Breaking the stick in space: of niche models, metacommunities and

874:   patterns in the relative abundance of species}.

875: \newblock In {M}acroecology: {C}oncepts and {C}onsequences (T.~M. {Blackburn},

876:   K.~J. {Gaston}, eds.) (Blackwell Science, Oxford), 64--86

877:

878: \bibitem[{{May}(1975)}]{may75}

879: \textsc{{May}, R.~M.}, 1975; \textit{Patterns of species abundance and

880:   diversity}.

881: \newblock In {Ecology and evolution of communities} (M.~L. {Cody}, J.~M.

882:   {Diamond}, eds.) (Belknap Press, Harvard), 81--120

883:

884: \bibitem[{{McGill} \textit{et~al.}(2007){McGill}, {Etienne}, {Gray}, {Alonso},

885:   {Anderson}, {Benecha}, {Dornelas}, {Enquist}, {Green}, {He}, {Hurlbert},

886:   {Magurran}, {Marquet}, {Maurer}, {Ostling}, {Soykan}, {Ugland}, \&

887:   {White}}]{mcg07}

888: \textsc{{McGill}, B.~J., {Etienne}, R.~S., {Gray}, J.~S., {Alonso}, D.,

889:   {Anderson}, M.~J., {Benecha}, H.~K., {Dornelas}, M., {Enquist}, B.~J.,

890:   {Green}, J.~L., {He}, F., {Hurlbert}, A.~H., {Magurran}, A.~E., {Marquet},

891:   P.~A., {Maurer}, B.~A., {Ostling}, A., {Soykan}, C.~U., {Ugland}, K.~I.,

892:   {White}, E.~P.}, 2007; \textit{Species abundance distributions: moving beyond

893:   single prediction theories to integration within an ecological framework}.

894: \newblock Ecology Letters \textbf{10} 995

895:

896: \bibitem[{{Whittaker}(1965)}]{whi65}

897: \textsc{{Whittaker}, R.~H.}, 1965; \textit{Dominance and diversity in land

898:   plant communities}.

899: \newblock Science \textbf{147} 250

900:

901: \bibitem[{{Drossel} \textit{et~al.}(2001){Drossel}, {Higgs}, \&

902:   {McKane}}]{dro01}

903: \textsc{{Drossel}, B., {Higgs}, P.~G., {McKane}, A.~J.}, 2001; \textit{The

904:   influence of predator-prey population dynamics on the long term evolution of

905:   food web structure}.

906: \newblock J. Theor. Biol. \textbf{208} 91

907:

908: \bibitem[{{Caldarelli} \textit{et~al.}(1998){Caldarelli}, {Higgs}, \&

909:   {McKane}}]{cal98}

910: \textsc{{Caldarelli}, G., {Higgs}, P.~G., {McKane}, A.~J.}, 1998;

911:   \textit{Modelling coevolution in multispecies communities}.

912: \newblock J. Theor. Biol. \textbf{193} 345

913:

914: \bibitem[{{Drossel} \textit{et~al.}(2004){Drossel}, {McKane}, \&

915:   {Quince}}]{dro04}

916: \textsc{{Drossel}, B., {McKane}, A.~J., {Quince}, C.}, 2004; \textit{The impact

917:   of nonlinear functional responses on the long-term evolution of food web

918:   structure}.

919: \newblock J. Theor. Biol. \textbf{229} 539

920:

921: \bibitem[{{Quince} \textit{et~al.}(2005){Quince}, {Higgs}, \& {McKane}}]{qui05}

922: \textsc{{Quince}, C., {Higgs}, P.~G., {McKane}, A.~J.}, 2005;

923:   \textit{Topological structure and interaction strengths in model food webs}.

924: \newblock Ecol. Model. \textbf{187} 389

925:

926: \bibitem[{{Drossel} \& {McKane}(2003)}]{dro03}

927: \textsc{{Drossel}, B., {McKane}, A.~J.}, 2003; \textit{Modelling food webs}.

928: \newblock In Handbook of {G}raphs and {N}etworks ({S. Bornholdt and H.G.

929:   Schuster}, eds.) (Wiley-VCH), 218--247

930:

931: \bibitem[{{McKane} \& {Drossel}(2005)}]{mck05}

932: \textsc{{McKane}, A.~J., {Drossel}, B.}, 2005; \textit{Modelling evolving food

933:   webs}.

934: \newblock In {D}ynamical {F}ood {W}ebs (P.~C. de~Ruiter, V.~Wolters, J.~C.

935:   Moore, eds.) (Elsevier, Singapore), 74--88

936:

937: \bibitem[{{McKane} \& {Drossel}(2006)}]{mck06}

938: \textsc{{McKane}, A.~J., {Drossel}, B.}, 2006; \textit{Models of food web

939:   evolution}.

940: \newblock In {E}cological {N}etworks: {L}inking {S}tructure to {D}ynamics in

941:   {F}ood {W}ebs (Oxford University Press), 223--243

942:

943: \bibitem[{{Bain} \& {Engelhardt}(1991)}]{bai91}

944: \textsc{{Bain}, L.~J., {Engelhardt}, M.}, 1991; Introduction to {P}robability

945:   and {M}athematical {S}tatistics (Duxbury)

946:

947: \bibitem[{{Hayter}(2002)}]{hay02}

948: \textsc{{Hayter}, A.~J.}, 2002; Probability and {S}tatistics for {E}ngineers

949:   and {S}cientists (Duxbury), 2nd edn.

950:

951: \bibitem[{{McGill}(2003{\natexlab{\textit{a}}})}]{mcg03}

952: \textsc{{McGill}, B.~J.}, 2003{\natexlab{\textit{a}}}; \textit{Strong and weak

953:   tests of macroecological theory}.

954: \newblock Oikos \textbf{102} 679

955:

956: \bibitem[{{Williamson} \& {Gaston}(2005)}]{wil05}

957: \textsc{{Williamson}, M., {Gaston}, K.~J.}, 2005; \textit{The lognormal

958:   distribution is not an appropriate null hypothesis for the species abundance

959:   distribution}.

960: \newblock J. Animal Ecology \textbf{74} 409

961:

962: \bibitem[{{Magurran}(1988)}]{mag88}

963: \textsc{{Magurran}, A.~E.}, 1988; Ecological diversity and its measurement

964:   (Cambridge University Press)

965:

966: \bibitem[{{Magurran}(2003)}]{mag03}

967: \textsc{{Magurran}, A.~E.}, 2003; \textit{Explaining the excess of rare species

968:   in natural species abundance distributions}.

969: \newblock Nature \textbf{422} 714

970:

971: \bibitem[{{McGill}(2003{\natexlab{\textit{b}}})}]{mcg03a}

972: \textsc{{McGill}, B.~J.}, 2003{\natexlab{\textit{b}}}; \textit{Does Mother

973:   Nature really prefer rare species or are log-left-skewed SADs a sampling

974:   artefact?}

975: \newblock Ecology Letters \textbf{6} 766

976:

977: \end{thebibliography}

978:

979: \newpage

980:

981: \begin{table}

982: \caption{Comparison of measures of goodness-of-fit for the log-normal,

983:   logit-normal and power-law normal distributions to basal community SADs

984:   and trophic SADs.

985: }

986: \centering

987: \label{Tab:Fitting}

988: \begin{tabular}{ccccc}

989: \hline\noalign{\smallskip}

990: Community & Species & \multicolumn{3}{c}{$k^2$}   \\[3pt]

991:           & $S$     & Log    & Logit  & Power-law \\[3pt]

992: $R=10^3$  & 11.22   & 0.0599 & 0.0390 & 0.0314    \\

993: $R=10^4$  & 18.15   & 0.2333 & 0.1571 & 0.1175    \\

994: $R=10^5$  & 25.42   & 0.8820 & 0.6296 & 0.1676    \\

995: $R=10^6$  & 30.16   & 1.5759 & 1.1752 & 0.4708    \\

996: $l=1$     &  5.9    & 0.1532 & 0.0902 & 0.0877    \\

997: $l=2$     & 18.1    & 0.7972 & 0.5328 & 0.1480    \\

998: $l=3$     & 14.0    & 0.2111 & 0.1126 & 0.1093    \\

999: \hline\noalign{\smallskip}

1000: Community & Species & \multicolumn{3}{c}{Kolmogorov-Smirnov $d$} \\[3pt]

1001:           & $S$     & Log    & Logit  & Power-law            \\[3pt]

1002: $R=10^3$  & 11.22   & 1.0648 & 1.1094 & 0.9990               \\

1003: $R=10^4$  & 18.15   & 1.3244 & 1.0409 & 1.1660               \\

1004: $R=10^5$  & 25.42   & 1.4826 & 1.3669 & 0.9919               \\

1005: $R=10^6$  & 30.16   & 2.0255 & 1.7661 & 1.6167               \\

1006: $l=1$     &  5.9    & 1.8391 & 1.5798 & 1.5934               \\

1007: $l=2$     & 18.1    & 2.0051 & 1.7753 & 1.1430               \\

1008: $l=3$     & 14.0    & 1.8270 & 1.4797 & 1.5241               \\

1009: \hline\noalign{\smallskip}

1010: \end{tabular}

1011: \end{table}

1012:

1013: \end{document}

1014:

1015: