
\documentclass{article}
\usepackage{pstricks}

\newcommand{\EcoLab}{{\sffamily\slshape
    \mbox{\raisebox{.5ex}{Eco}\hspace{-.4em}{\makebox[.5em]{L}ab}}}}

\title{Going Stupid with \EcoLab}
\author{Russell K. Standish\\
School of Mathematics and Statistics\\
University of New South Wales}

\begin{document}
\maketitle

\begin{abstract}
  In 2005, Railsback et al.\ proposed a very simple model ({\em Stupid
    Model}) that could be implemented within a couple of hours, and
  later extended to demonstrate the use of common ABM platform
  functionality. They provided implementations of the model in several
  agent based modelling platforms, and compared the platforms for ease
  of implementation of this simple model, and for performance.

  In this paper, I implement Railsback et al.'s Stupid Model in the
  \EcoLab{} simulation platform, a C++ based modelling platform,
  demonstrating that it is a feasible platform for these sorts of
  models, and compare the performance of the implementation with the
  Repast, Mason and Swarm versions.
\end{abstract}

\section{Introduction}

Newcomers to {\em agent based modelling} (ABM) will be confused by the
variety of different software platforms available to assist in
implementing the models. Very few comparative studies between the
different platforms have been done, as implementing all but the most
trivial of models is a time consuming task. Furthermore,
familiarity with one platform and programming language will lend an
automatic advantage in any metrics to that platform over other
platforms that the model implementer is less familiar with.

In 2005, Railsback et al.\cite{Railsback-etal06} proposed a very
simple model that could be implemented within a couple of hours, and
later extended to demonstrate the use of common ABM platform
functionality. They gave it the name ``Stupid Model'', partly for fun,
but also to reiterate the recommendation of Grimm and Railsback
\cite{Grimm-Railsback05} that modelling projects should start with a
``ridiculously simplified model''. Railsback et al.\ implemented their
model across a range of ABM platforms: Objective-C and Java
Swarm\cite{Swarm}, Repast\cite{North-etal06} and
Mason\cite{Luke-etal05} (both pure Java implementations) and Netlogo.
This range of platforms reflects the authors' collective programming
expertise in Objective-C and Java, with Netlogo included for its low barrier
to entry (Logo was a popular language for teaching school children in
the 1980s).

\EcoLab{} grew out of a simulation platform supporting a particular
class of model, into a general purpose simulation environment using
C++\cite{Standish-Leow03}. Other C++ agent-based modelling environments
exist, e.g.\ SymBioSys\cite{McFadzean94}, but none are as general purpose
as \EcoLab{}. Other general purpose agent based platforms can be used
with C++ models. For instance, with Swarm, C++ code can be linked to
Swarm's Objective-C library through the shared C language interface,
and C++ code can be linked to Repast's Java library through the Java
Native Interface. However, the cost of maintaining the interface code quickly
becomes prohibitive in the face of evolving models, negating much of
the benefit of using a simulation platform in the first place.

With \EcoLab{}, it is possible to have a similar level of
functionality to that provided by Swarm or Repast for models implemented in
C++, without the interface maintenance overhead. Additionally,
\EcoLab{} provides features for distributing the computation over
multiple processors in a way that is easier to program than the raw
Message Passing Interface (MPI)\cite{mpiref}. With Railsback et al.'s
Stupid Model specification, the possibility exists for directly
comparing an \EcoLab{} implemented agent based model with other
platforms for both ease of implementation and execution
performance. Furthermore, the exercise illuminates those parts of
\EcoLab{} requiring improvement.

\subsection{Why C++}

C++\cite{Stroustrup97} is a mature object oriented programming
language of more than 20 years standing. It has been widely adopted in
industry; consequently, open source reference compilers, as well as
vendor-tuned optimising compilers, exist for most contemporary computer
architectures. Because of this popularity, and the availability of
compilers, C++ has been extensively deployed for scientific computing
since the mid-1990s. In {\em High Performance Computing} (HPC), the extreme
end of scientific computing, the predominant computing language used
for applications is Fortran, with code written in Fortran 77, or
increasingly written using the newer Fortran 90 features. However,
C/C++ applications also make up a substantial fraction of the deployed
applications, perhaps as high as 30\%, with C++ standing to C in the
same relationship as Fortran 90 does to Fortran 77, i.e.\ typically
used as a ``better C''\footnote{These numbers come from a decade of
  personal experience at managing the resource allocation process at a
  High Performance Computing Centre. These general numbers are backed
  up by anecdotal reports from a number of other people I have
  corresponded with.}. By contrast, Java\cite{GoslingJava} has made negligible impact in
HPC\footnote{Over the ten years of my personal
  experience, only one project used Java, out of several hundred that
  were mostly C/C++ or Fortran.}. There are several possible reasons
for the lack of Java adoption in high performance computing. Firstly,
most implementations compile to a virtual machine,
and early Java Virtual Machines (JVMs) had performance
problems. However, more recent JVMs deploy {\em just in time
  compilation}, which closes the performance gap between JVM executed
code and natively compiled code. Secondly, certain language features
of C++ (and Fortran 90 for that matter) that are missing in Java,
notably operator overloading and to a lesser extent generic
programming, assist in writing
scientific codes that are closer to the mathematical
specification. However, probably the most significant factor is
time, combined with the innate conservatism of scientific programmers. C++ did not
appear significantly in HPC applications until around 15 years after
the language was first developed. With only a decade under its belt,
Java's time as an HPC application language might just be
beginning\cite{JavaGrande}.

However, for agent based simulation, C++ is not a popular choice,
primarily due to its lack of {\em reflection}. Reflection is the
ability to query an object's type information at runtime, and in ABM
systems like Swarm, reflection is used to implement {\em probes}, or
the ability to observe all parts of a running simulation from within a
graphical user interface\cite{Swarm}. However, with Classdesc, an
effective reflection mechanism for C++ is
possible\cite{Madina-Standish01,Standish-Madina06}. \EcoLab{} uses
Classdesc to implement probing, along with automatic checkpointing,
the ability to script the model's initialisation and ongoing
computation, and the distribution of agents to exploit any parallel
computing capability.

\section{Method}\label{method}

In line with Railsback et al.'s\cite{Railsback-etal06} methodology, I
implemented Stupid Model using the current \EcoLab{} release, version
4.D21. Using an unmodified release is important to give a sense of the
maturity of the platform; otherwise, I might have been tempted to fix up any
weaknesses encountered.

I followed the explicit model specification\cite{StupidSpec} step
by step, referring to the Repast Java implementation on the rare
occasions the specification was ambiguous. Stupid Model consists of
agents called ``Stupid Bugs'' moving around a Cartesian lattice. No
two agents can occupy the same location, so movement involves
selecting a cell within a $9\times9$ Moore neighbourhood, testing whether
the cell is occupied and moving into the cell if empty. The search
procedure is repeated until an empty cell is found. Since different
frameworks potentially use different random number algorithms,
initialised with different seeds, this introduces indeterminism into
model runtimes. In order to reduce the impact of this indeterminism,
the density of agents was chosen to be 0.1 (4000 agents in a
$200\times200$ world) so that the standard deviation of runtimes was
less than 10\% of the mean.

For measuring application performance, I did both GUI runs and batch
mode runs. In \EcoLab{}, a non-GUI batch run simply involves replacing
the ``GUI'' command in the experiment script with a call to
``simulate'', and commenting out any graphical calls (plot, histogram and
draw). In Repast, Swarm and Mason, a separate ``BatchSwarm'' needs to
be provided by the programmer, but only the GUI versions of each model
were published by Railsback et al. For batch measurements, I commented out
the call to \verb+addAction+ that added the display actions. For the Repast
implementation, I changed the batch parameter of
\verb+SimInit::loadModel+ to \verb+true+, and timed the run from the
command line. With the Mason implementation, I again commented out the
display action, and recorded the CPU time so as to discount the delays
introduced by having to click the button. In fact for all platforms,
the reported values are the CPU time. For the Objective-C Swarm
version, I modified the code so that the \verb+StupidModelSwarm+ was
directly called from \verb+main()+ rather than indirectly through
\verb+StupidModelObserverSwarm+.

I chose to measure versions 10 and 11 of the Stupid
Model. However, the stopping criterion is specified as the maximum
bug size reaching 100. Since bug growth depends on the availability of
food, which itself is a function of a random number generator call,
and also of the grazing history, this stopping criterion is
indeterministic. For the purposes of inter-framework performance
comparisons, I changed the stopping criterion to be a fixed number of
bug updates (500).

In version 10 of Stupid Model, each bug randomly selects a cell within
its neighbourhood, moving to it if the cell is empty, and otherwise
repeating the selection process. In version 11, all cells in the
neighbourhood are iterated over, and the bug moves to the empty cell
with the most food.
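
The two movement rules can be sketched in C++ as follows. This is a minimal illustration and not the code of the \EcoLab{} implementation: the \verb+Grid+ structure, its flat \verb+occupied+ and \verb+food+ arrays, the wrap-around at the edges, and the function names \verb+moveRandom+ and \verb+moveBestFood+ are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical flat-array grid; not the EcoLab/Graphcode space class.
struct Grid {
    int width, height;
    std::vector<char> occupied;   // 1 if a bug is in the cell
    std::vector<double> food;     // food available in each cell

    Grid(int w, int h)
        : width(w), height(h),
          occupied(static_cast<std::size_t>(w) * h, 0),
          food(static_cast<std::size_t>(w) * h, 0.0) {}

    // Map (x, y) to a flat index, wrapping at the edges (assumed toroidal).
    std::size_t index(int x, int y) const {
        int wx = ((x % width) + width) % width;
        int wy = ((y % height) + height) % height;
        return static_cast<std::size_t>(wy) * width + wx;
    }
};

const int kRadius = 4;  // Moore neighbourhood of radius 4, i.e. 9x9 cells

// Version 10 rule: draw random cells in the neighbourhood until an empty
// one is found. Assumes at least one empty cell exists in the neighbourhood.
template <class Rng>
std::size_t moveRandom(Grid& g, int x, int y, Rng& rng) {
    std::uniform_int_distribution<int> offset(-kRadius, kRadius);
    for (;;) {
        std::size_t target = g.index(x + offset(rng), y + offset(rng));
        if (!g.occupied[target]) {
            g.occupied[g.index(x, y)] = 0;
            g.occupied[target] = 1;
            return target;
        }
    }
}

// Version 11 rule: scan every cell in the neighbourhood and move to the
// empty cell with the most food; stay put if no empty cell exists.
inline std::size_t moveBestFood(Grid& g, int x, int y) {
    std::size_t here = g.index(x, y);
    std::size_t best = here;
    double bestFood = -1.0;  // food is assumed non-negative
    for (int dy = -kRadius; dy <= kRadius; ++dy)
        for (int dx = -kRadius; dx <= kRadius; ++dx) {
            std::size_t c = g.index(x + dx, y + dy);
            if (!g.occupied[c] && g.food[c] > bestFood) {
                bestFood = g.food[c];
                best = c;
            }
        }
    g.occupied[here] = 0;
    g.occupied[best] = 1;
    return best;
}
```

The random rule does an unpredictable number of occupancy tests per move, whereas the scanning rule always visits all 81 cells, which is one reason the two versions stress the platforms differently.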

From version 12, bugs can reproduce and die according to random
dynamics, so the amount of work per update step will depend on the
number of living bugs. Even though these higher version models are
more computationally intensive, run times cannot be compared reliably between
different platforms due to differences in the order that random numbers
are generated. Hence the Stupid Model 16 measurements reported in table
\ref{execution times} should be taken with a grain of
salt. Nevertheless, I verified that all models executed for 1000
steps, and that the number of Stupid Bugs was roughly the same for
each platform (approximately 800--900 after the initial population explosion).

Railsback et al.\ did not do any performance analysis or tuning. For
C++ code, performance tuning can deliver big performance
improvements. \EcoLab{} can be built with performance counters enabled for the
individual TCL commands, and a single run indicated that the initial
approach used for evaluating the stopping criterion (evaluating the
maximum of the vector of bug sizes in TCL) was very expensive.
Implementing a specialised \verb+max_bugsize()+ (all of 4 lines of C++
code) improved performance by about a factor of four. However, for the
inter-platform performance comparison, the stopping condition was
changed to a fixed number of bug update steps, so this optimisation
makes no difference to the performance benchmarks.
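
For illustration, a function of this kind might look like the sketch below. The \verb+Bug+ structure and the free-function signature are hypothetical; the actual \verb+max_bugsize()+ in the \EcoLab{} model is a method of the model class and may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical bug record; the real EcoLab model class will differ.
struct Bug {
    double size;
};

// Largest bug size in the population, or 0 for an empty population.
inline double max_bugsize(const std::vector<Bug>& bugs) {
    double m = 0.0;
    for (const Bug& b : bugs) m = std::max(m, b.size);
    return m;
}
```

The point of the optimisation is that the reduction runs in compiled code and only a single number crosses into the TCL interpreter, instead of the whole vector of bug sizes.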

A more detailed performance profile using the standard GNU/Linux
profiling tool {\tt gprof} indicated that updating the food
availability was a bottleneck, and that cache utilisation could be
improved by laying the data out contiguously in memory, which is not the
case when the data is stored as members of a cell object. This
optimisation, which needed some substantial recoding of the model,
improved overall performance by a factor of two for model version 16,
although it only made about a 10\% improvement for version 11. It
should be noted that this optimisation technique should also be
available for the Java and Objective-C platforms, and may
deliver a similar performance boost there.
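
The difference between the two layouts can be sketched as follows. The structure and member names other than \verb+food_available+ are hypothetical, and the deterministic \verb+fraction+ parameter stands in for the random food production of the real model so that the example stays reproducible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Layout 1: food stored inside each cell object, interleaved with other state.
struct Cell {
    double food_available;
    double max_food_production;
    int bug_id;  // -1 when the cell is empty
    // further per-cell state would sit here, spreading the food values apart
};

inline void growFoodCells(std::vector<Cell>& cells, double fraction) {
    for (std::size_t i = 0; i < cells.size(); ++i)
        cells[i].food_available += fraction * cells[i].max_food_production;
}

// Layout 2 ("field"): each quantity lives in its own contiguous array.
struct FoodField {
    std::vector<double> food_available;
    std::vector<double> max_food_production;
};

inline void growFoodField(FoodField& f, double fraction) {
    for (std::size_t i = 0; i < f.food_available.size(); ++i)
        f.food_available[i] += fraction * f.max_food_production[i];
}
```

Both loops compute the same update. In the second layout every byte fetched into the cache belongs to the quantities being updated, whereas in the first the unrelated cell members are fetched along with them.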

All performance benchmarks were run on a 2GHz Intel Pentium M
processor with 1GB memory running Slackware Linux 10.0. The Java
version used for Repast and Mason was SDK 1.4.2 standard edition. The
compiler used for Swarm and \EcoLab{} was GCC 3.4.3. I also did a
comparison \EcoLab{} run using the Intel C++ compiler 9.0, but this
was more than 50\% slower than the GCC compiled code. This somewhat
surprising result suggests that icc's strength lies in vectorising
loops that access data contiguously to exploit the inbuilt SSE
instructions, but that for more general purpose ABM code, GCC performs
better (at least on Linux!).

The source code for the \EcoLab{} Stupid Model is available from the
\EcoLab{} website\cite{EcoLab}.

\section{Results}

Similar to all the platforms reviewed by Railsback et al., \EcoLab{}
proved capable of implementing all functionality for all versions of
Stupid Model. Implementing the first version took longer than any of
the remaining versions, as \EcoLab{} does not provide a ready-to-use
spatial library. Instead it provides a more general library called
{\em Graphcode}\cite{Standish-Madina06}. Graphcode's abstraction is a
network, or graph of objects, with the links between objects
representing data flow. Graphcode can distribute the objects across
multiple processors using the Classdesc serialisation library. A
cellular space such as found in Swarm or Repast will be a set of
objects, each one wired to its neighbours. In such a way, Graphcode
can easily represent Cartesian and hexagonal topologies by the way the
neighbourhoods are wired. However, the only example using Graphcode
provided in the \EcoLab{} distribution was a continuous space example, each cell
holding objects located within a certain region of space. Examples of
models using different sorts of spatial topologies, as well as a few
common cases being supplied as a library, would improve the beginner's
experience of \EcoLab{}.

In retrospect, it may have been simpler to implement the spatial class
on top of a standard vector of cells. This would have gotten the
initial model up and running quicker, but limited the model to
sequential usage only. By using Graphcode, we enable parallel
processing capability.

One thing that became clear in this exercise is the need for a smart
reference type. Objects like bugs need a reference to the cell
they inhabit, scheduling lists need references to the bugs that
they schedule and so on. Because bugs move from cell to cell, it is
better for each cell to hold a reference to the bug it contains (if
any) rather than to store the bug itself. In C, the only possibility for
references is pointers, which are difficult to serialise properly due
to the fact that C makes no guarantees about whether a pointer is
valid or not. Substantial care is required to ensure that references
remain valid in the event of an object such as a bug being deleted
from the system.  Classdesc accepts a pragma that asserts that a
pointer is either valid or NULL, and whether the pointer chains form
cycles or not to allow serialisation, but it's up to the programmer to
ensure software bugs do not invalidate this assertion.

C++ also supports static references (e.g.\ \verb+int&+), which are
established at the time of the reference's creation, and are then
immutable until the reference is destroyed. These references are
always valid; however, the lack of dynamic control makes them
unsuitable for agent based simulations where agents may be dropped or
moved, and the appropriate references must be updated. Furthermore, static
reference cycles cannot be handled with serialisation at all, since
the serialisation descriptors cannot distinguish an object from its reference.

Whilst it is possible to use \EcoLab{} with a nonserialisable model,
one gives up substantial functionality doing so, including the ability to
checkpoint/restart the model.

What is actually needed is something like Java's reference type, where
objects are created on the heap, and the programmer simply manipulates
references. Once all references to an object have been destroyed,
Java's garbage collector takes care of destroying the object,
reclaiming the memory used.

It is possible to implement something like this in C++, using operator
overloading to give the resulting type the ``look and feel'' of a
pointer. Such types are usually called {\em smart pointers}. The well
known Boost library\cite{Boost} provides a few different versions,
some of which are being considered for inclusion in the C++ standard
library. \EcoLab{} provides the template \verb+ref<T>+, which is
parameterised by the target type of the reference. Unlike the Boost
versions (in which you pass the smart pointer a pointer for it to
control), \verb+ref+ has control over the entire lifecycle of the
object it points to. The first time a \verb+ref+ object is
dereferenced, the target object is created on the heap. The \verb+ref+ keeps
track of the number of references to the target object, so that once
all references to it are destroyed, so is the target object.

The version of \verb+ref+ supplied in the current \EcoLab{} has a
number of deficiencies, however, the most notable of which is that it
doesn't provide any way of testing whether the target object exists or
not. For the purposes of this exercise, I copied the \verb+ref.h+
header file, and added the necessary functionality. This improved
\verb+ref.h+ will be incorporated in future releases of \EcoLab{}.
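
The behaviour described above (creation on first dereference, reference counting, and a test for whether the target exists) can be sketched with standard library components. The class below is a hypothetical illustration named \verb+LazyRef+; it is not \EcoLab{}'s \verb+ref<T>+, which has its own implementation and serialisation support, and it leans on \verb+std::shared_ptr+ for the counting.

```cpp
#include <cassert>
#include <memory>

// Hypothetical illustration only; not part of EcoLab or Boost.
template <class T>
class LazyRef {
    std::shared_ptr<T> target_;  // shared, reference-counted target
public:
    // True if a target object currently exists.
    explicit operator bool() const { return static_cast<bool>(target_); }

    // Dereferencing creates the target on first use.
    T& operator*() {
        if (!target_) target_ = std::make_shared<T>();
        return *target_;
    }
    T* operator->() { return &**this; }

    // Drop this reference; the target is destroyed when the last one goes.
    void nullify() { target_.reset(); }

    // Number of LazyRef objects sharing the target (0 if none exists).
    long refCount() const { return target_.use_count(); }
};
```

One limitation of this sketch is that copying an empty \verb+LazyRef+ and then dereferencing both copies creates two separate targets, so references should be copied only after the target exists.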

Agents usually need to refer to the environment, or world in which
they live. In languages like Java or Objective-C, this is simply
managed by having the agent store a reference to the world, and/or
cell. However, this will set up a reference cycle which will play
havoc with model serialisation if the serialisation algorithm doesn't
explicitly account for cycles. \EcoLab{} provides a routine that
serialises arbitrary graphs constructed with pointer references.
However, it does not currently support the presence of cycles with the
\verb+ref<>+ data type. With C++, however, there is a simple
workaround. The model is a global variable, and agents can refer to
their cell by holding an index into a container of cells stored within
this global model. This is the approach I have taken with Stupid
Model, and indeed this technique is used in other \EcoLab{} models.
However, if the \verb+ref<>+ data type were extended to support
serialisation of cyclic graphs, the method deployed in Java and
Objective-C models could be supported as well.
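
A minimal sketch of the index-based workaround is given below. All names (\verb+Model+, \verb+stupidModel+, \verb+cellOf+, \verb+graze+) and the grazing rule are hypothetical and chosen only for illustration; the real model stores its cells in a Graphcode container rather than a plain vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Cell {
    double food_available;
};

struct Bug {
    std::size_t cell_index;  // index into Model::cells, not a pointer
    double size;
};

struct Model {
    std::vector<Cell> cells;
    std::vector<Bug> bugs;
};

// Single global model instance (the variable name is hypothetical).
Model stupidModel;

// Resolve a bug's cell through the global model.
inline Cell& cellOf(const Bug& b) { return stupidModel.cells[b.cell_index]; }

// Example behaviour: eat up to max_intake from the bug's own cell.
inline void graze(Bug& b, double max_intake) {
    Cell& c = cellOf(b);
    double eaten = c.food_available < max_intake ? c.food_available : max_intake;
    c.food_available -= eaten;
    b.size += eaten;
}
```

Because a bug holds only an integer, serialising it involves no pointer chasing and cannot create a cycle; the price is that every access goes through the global model.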

Line counts are often considered a proxy for the amount of effort a
programmer must expend to implement a problem. Table \ref{line count}
shows the line counts for the 16 different Stupid Model cases for each
of the Railsback implementations, as well as the \EcoLab{}
implementation. The \EcoLab{} implementation also includes two
additional cases, which build upon version 16: the model
parallelised using \EcoLab{}'s MPI-based parallel processing features,
and the ``field'' optimisation whereby the food data is
stored in contiguous memory. \EcoLab{} and the two Java platforms
seem to need a similar number of lines of code, yet the Swarm
implementation needed up to three times the number. Whilst a factor of two or three in
source line count is not particularly significant, it does indicate
that it takes a bit more effort to implement Swarm models.

\begin{table}
\begin{tabular}{r|rrrr}
Version & Repast & Mason & Obj-C Swarm & \EcoLab{} \\
\hline
1 & 158 & 169 & 578 & 253 \\
2 & 158 & 214 & 622 & 259 \\
3 & 250 & 263 & 865 & 281 \\
4 & 256 &     & 896 & 310 \\
5 & 312 & 296 & 968 & 322 \\
6 & 306 & 362 & 1005 & 338 \\
7 & 359 & 316 & 1070 & 337 \\
8 & 258 & 365 & 1144 & 320 \\
9 & 368 & 369 & 1152 & 336 \\
10 & 381 & 383 & 1191 & 352 \\
11 & 391 & 409 & 1253 & 358 \\
12 & 497 & 494 & 1614 & 416 \\
13 & 484 &     & 1636 & 419 \\
14 & 501 &     & 1360 & 432 \\
15 & 646 & 670 & 1761 & 515 \\
16 & 753 & 816 & 2174 & 662 \\
parallel & & & & 753 \\
field & & & & 894 \\
\end{tabular}
\caption{Source code line counts (as reported by the Unix command `wc')
  for the different Stupid Model
  versions. Makefiles are not included (Swarm \& \protect\EcoLab{}), since
  these are largely boilerplate and negligible in size. \protect\EcoLab{}
  counts include the TCL scripts.}
\label{line count}
\end{table}

\begin{table}
\begin{tabular}{r|rrrr}
Version & Repast & Mason & Obj-C Swarm & \EcoLab{} \\
\hline
10 & 3.5 & 3.4 & 71 & 3.9 \\
11 & 32.7 & 21.3 & 165 & 14.9 \\
16 & 44 & 40.5 & 402 & 1014 \\
field & & & & 67 \\
\end{tabular}
\caption{Execution CPU times (in seconds) for several Stupid Model versions for
  different platforms. Versions 10 and 11 were performed in batch mode
  (no graphical output, no GUI control, Mason excepted), version 16 in
  GUI mode with a
  plot and histogram. \protect\EcoLab{}'s field version uses raster rather
  than canvas for display, and omits the expensive histogram
  widget. All these figures need considerable qualification (see text).}
\label{execution times}
\end{table}

In table \ref{execution times}, execution times for various Stupid
Model versions are reported. As described in \S\ref{method}, versions
10 \& 11 were run in batch mode with as much graphical output turned
off as possible. The Java versions performed slightly better for
version 10, and the C++ version did better on version 11. However,
given the possible range of implementation strategies, one should not read
too much into this, except that the myth of Java being slow relative
to C++ should now be firmly laid to rest. The result is broadly in
line with other observations that Java implementations tend to be
within a factor of 2 of natively compiled
applications\cite{Boisvert-etal01,Lewis-Neumann03}. The results for
Swarm, though, confirm Railsback et al.'s observation that Objective-C
performance lags that of the Java (and also now C++)
versions. Unfortunately, my knowledge of Objective-C and Swarm
internals is not up to the task of explaining this result.

In version 16, the full graphical version of the model was run. This
included a display of the space, a plot of the number of bugs and a
histogram of bug sizes. It should be noted that the Mason
implementation lacked the plot and histogram, apparently because this
functionality is absent from the Mason
toolkit\cite{Railsback-etal06} itself, and is instead provided by third party
add-ons. One thing that stands out is the slowness of \EcoLab{}. The
TCL-based plotting widgets used in \EcoLab{} (also used in Swarm) are
slow relative to the equivalent Java offerings. Furthermore, this
benchmark displays the space environment using a canvas, which is a
high level drawing tool with roughly the same sort of functionality as
a standard drawing application (e.g.\ the drawing application in
OpenOffice or Xfig). The bugs, predators and empty cells are rendered
as coloured squares. The other platforms provide dedicated raster
objects for rendering spatial displays. In the ``field'' version of
Stupid Model, instead of representing the model's objects as squares, a
single pixmap object is created on the canvas and manipulated through
low level Tk library calls. This amounts to about 40 lines of code,
and improves the display performance dramatically. The result listed
under the row ``field'' also omits the expensive histogram
functionality (but still displays the plot of bug numbers).

\section{Parallel implementation}

Having put the extra work into building the space class on top of
Graphcode rather than using a simple vector, the question arises
whether Stupid Model can be effectively parallelised.

The first thing that becomes apparent is that Stupid Model as
specified is inherently sequential. Two bugs are not allowed to occupy
the same spatial location, and movement into a location is performed
on a first come first served basis. Since the order in which bugs
perform their update move is randomised, the obvious parallel
generalisation in a shared memory context is to use locks to prevent
two bugs on different processors simultaneously moving to the same
location. However, \EcoLab{} is designed for use with distributed
parallel systems, and obtaining the state of a cell located on a remote
processor is expensive. In fact, in the MPI transport layer used by
\EcoLab{}, such functionality is only supported by the ``one-sided''
communications of MPI-2, a relatively new feature that is not well
supported and typically poorly implemented. Instead, the recommended
approach in \EcoLab{} is to have separate communication and
computation phases, with a snapshot of neighbouring data at the
previous timestep supplied to each processor during the communication
phase.

As Stupid Model is a pedagogical model, there is no one right answer
as to how to respecify the model for parallelisation.  Perhaps the most
obvious approach would be to allow multiple bugs to share a single
location within the space. This would certainly simplify the code, as
additional logic was required to enforce the one-bug-per-location
requirement. However, in the spirit of adventure, I propose the
following protocol for allowing bugs to migrate from one processor to
the next, whilst maintaining the one-bug-per-location property. As in
the sequential algorithm, bugs examine their neighbourhood, and choose
the cell with the highest food resource as a destination. If the
destination lies on the current processor, and the cell is empty, the
bug is free to move. If the destination is remote, however, the bug's
desire to move to a remote cell is lodged with an emigration
register. Then after all bugs have performed their move, the
emigration register is passed to the remote processor, which approves
or denies each request depending on whether the destination is already
occupied, or an immigration request for it has already been allowed. The
immigration approval list is passed back to the requesting processor,
and approved bugs are migrated between processors. The remaining
bugs do not move.
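
The approval step on the receiving processor can be sketched as below. This is an illustrative reconstruction of the rule stated above rather than the \verb+stupid-parallel+ code: the \verb+MoveRequest+ structure, its field names and the function \verb+approveImmigration+ are hypothetical, and the MPI exchange of the register and of the approval list is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// One entry in the emigration register (field names are hypothetical).
struct MoveRequest {
    int bug_id;             // bug wishing to move
    std::size_t dest_cell;  // cell index on the receiving processor
};

// Run on the receiving processor: approve a request only if the destination
// is not occupied and no earlier request for the same cell was approved.
inline std::vector<int> approveImmigration(
        const std::vector<MoveRequest>& requests,
        const std::set<std::size_t>& occupied) {
    std::set<std::size_t> claimed(occupied);
    std::vector<int> approved;
    for (std::size_t i = 0; i < requests.size(); ++i) {
        // insert() reports whether the cell was newly claimed
        if (claimed.insert(requests[i].dest_cell).second)
            approved.push_back(requests[i].bug_id);
    }
    return approved;
}
```

Requests are honoured in register order, so the first come first served rule of the sequential model is preserved among the immigrants, although local bugs on the receiving processor always move before any immigrant is considered.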

I coded this solution into the \verb+stupid-parallel+ version, and
also the field optimised version \verb+stupid-field+. None of the
other versions is parallel aware: building them and running
them in parallel will only result in the model running on processor 0,
with the remaining processors idle.

With the \verb+stupid-parallel+ version, it became immediately clear
that the \verb+Prepare_Neighbours()+ step dominated the
calculation. This highlighted a hitherto unsuspected source of
inefficiency in Graphcode's \verb+Prepare_Neighbours()+ method. To
build the list of neighbours to transmit, Graphcode loops over the
neighbours of local cells, adding to the list any remote neighbour
found. However, this leads to many duplicates, as one cell may be the
neighbour of many other cells: for the Stupid Model case, each cell
in the transfer list will be duplicated 36 times. In a more common von
Neumann neighbourhood of radius 1 there is no duplication, and in the Moore
neighbourhood of radius 1 the duplication is only 3 times. In choosing
a Moore neighbourhood of radius 4 for their Stupid Model, Railsback et
al.\ unwittingly made this inefficiency blatant.
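
These duplication factors can be recovered with a short count, assuming a straight boundary between the regions held by two processors (an assumption for this sketch; the partition shape is not specified here). Take a remote cell at distance $d$ from the boundary, with $d=1$ for a cell adjacent to it, and a Moore neighbourhood of radius $r$. A local cell in the $e$-th column on the other side of the boundary lies at horizontal distance $d+e-1$ from the remote cell, so it sees the remote cell only if $e\le r-d+1$, and only if it lies in one of the $2r+1$ rows centred on the remote cell. The remote cell is therefore encountered
\[
  n(r,d) = (2r+1)(r-d+1)
\]
times. The largest count occurs at $d=1$: $n(4,1)=9\times4=36$ for Stupid Model and $n(1,1)=3\times1=3$ for a Moore neighbourhood of radius 1. Under the same count, cells further from the boundary are encountered less often, for example $n(4,4)=9$. For a von Neumann neighbourhood of radius 1, only the single local cell directly across the boundary sees the remote cell, so no duplication arises.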

However, even with this inefficiency corrected,
\verb+Prepare_Neighbours()+ is still an expensive overhead. The
example problem I tested was the same $200\times200$ spatial grid,
and so $2\times200\times4\times N_p$ cells need to be transferred each
time step ($N_p>1$ being the number of processors). This overhead can
be amortised by increasing the problem size.
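
To put the transfer volume in perspective, the formula can be evaluated directly. The choice of $N_p=4$ and the generalisation to an $L\times L$ grid are only illustrative assumptions:
\[
  2\times200\times4\times N_p = 1600\,N_p,
  \qquad
  \frac{1600\,N_p}{200\times200} = 0.04\,N_p .
\]
With $N_p=4$, 6400 cells are exchanged per time step, which is 16\% of the 40000 cells in the grid. If the same expression is applied to an $L\times L$ grid, the ratio becomes $8N_p/L$, which shrinks as the grid grows and is the sense in which a larger problem amortises the overhead.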

In the \verb+stupid-field+ case, the \verb+food_available+ data is not
stored in the cell, but in the additional field data structure, so is
not transferred with the cell data during the
\verb+Prepare_Neighbours()+ step. In fact, only the food data has any
effect on bug movement, so the per-step \verb+Prepare_Neighbours()+ is eliminated
altogether. In the \verb+stupid-field+ version of the model, we do not
transfer the food data, but duplicate the update calculation on the
overlap area between two processors. A single
\verb+Prepare_Neighbours()+ step is done at the beginning of the model
run to ensure access to the food data.

Figure \ref{speedup} shows the speedup curves for both the
\verb+stupid-parallel+ and \verb+stupid-field+ models, for the same
input script used for the \verb+stupid10+ and \verb+stupid11+
benchmarks reported in table \ref{execution times}.

\begin{figure}

% Define new PST objects, if not already defined
\ifx\PSTloaded\undefined
\def\PSTloaded{t}
\psset{arrowsize=.01 3.2 1.4 .3}
\psset{dotsize=.01}
\catcode`@=11

\newpsobject{PST@Border}{psline}{linewidth=.0015,linestyle=solid}
\newpsobject{PST@Axes}{psline}{linewidth=.0015,linestyle=dotted,dotsep=.004}
\newpsobject{PST@Solid}{psline}{linewidth=.0015,linestyle=solid}
\newpsobject{PST@Dashed}{psline}{linewidth=.0015,linestyle=dashed,dash=.01 .01}
\newpsobject{PST@Dotted}{psline}{linewidth=.0025,linestyle=dotted,dotsep=.008}
\newpsobject{PST@LongDash}{psline}{linewidth=.0015,linestyle=dashed,dash=.02 .01}
\newpsobject{PST@Diamond}{psdots}{linewidth=.001,linestyle=solid,dotstyle=square,dotangle=45}
\newpsobject{PST@Filldiamond}{psdots}{linewidth=.001,linestyle=solid,dotstyle=square*,dotangle=45}
\newpsobject{PST@Cross}{psdots}{linewidth=.001,linestyle=solid,dotstyle=+,dotangle=45}
\newpsobject{PST@Plus}{psdots}{linewidth=.001,linestyle=solid,dotstyle=+}
\newpsobject{PST@Square}{psdots}{linewidth=.001,linestyle=solid,dotstyle=square}
\newpsobject{PST@Circle}{psdots}{linewidth=.001,linestyle=solid,dotstyle=o}
\newpsobject{PST@Triangle}{psdots}{linewidth=.001,linestyle=solid,dotstyle=triangle}
\newpsobject{PST@Pentagon}{psdots}{linewidth=.001,linestyle=solid,dotstyle=pentagon}
\newpsobject{PST@Fillsquare}{psdots}{linewidth=.001,linestyle=solid,dotstyle=square*}
\newpsobject{PST@Fillcircle}{psdots}{linewidth=.001,linestyle=solid,dotstyle=*}
\newpsobject{PST@Filltriangle}{psdots}{linewidth=.001,linestyle=solid,dotstyle=triangle*}
\newpsobject{PST@Fillpentagon}{psdots}{linewidth=.001,linestyle=solid,dotstyle=pentagon*}
\newpsobject{PST@Arrow}{psline}{linewidth=.001,linestyle=solid}
\catcode`@=12

\fi
\psset{unit=5.0in,xunit=5.0in,yunit=3.0in}
\pspicture(0.000000,0.000000)(1.000000,1.000000)
\ifx\nofigs\undefined
\catcode`@=11

\PST@Border(0.1010,0.1260)
(0.1160,0.1260)

\PST@Border(0.9470,0.1260)
(0.9320,0.1260)

\rput[r](0.0850,0.1260){ 0}
\PST@Border(0.1010,0.2313)
(0.1160,0.2313)

\PST@Border(0.9470,0.2313)

574: (0.9320,0.2313)

575:

576: \rput[r](0.0850,0.2313){ 2}

577: \PST@Border(0.1010,0.3365)

578: (0.1160,0.3365)

579:

580: \PST@Border(0.9470,0.3365)

581: (0.9320,0.3365)

582:

583: \rput[r](0.0850,0.3365){ 4}

584: \PST@Border(0.1010,0.4418)

585: (0.1160,0.4418)

586:

587: \PST@Border(0.9470,0.4418)

588: (0.9320,0.4418)

589:

590: \rput[r](0.0850,0.4418){ 6}

591: \PST@Border(0.1010,0.5470)

592: (0.1160,0.5470)

593:

594: \PST@Border(0.9470,0.5470)

595: (0.9320,0.5470)

596:

597: \rput[r](0.0850,0.5470){ 8}

598: \PST@Border(0.1010,0.6523)

599: (0.1160,0.6523)

600:

601: \PST@Border(0.9470,0.6523)

602: (0.9320,0.6523)

603:

604: \rput[r](0.0850,0.6523){ 10}

605: \PST@Border(0.1010,0.7575)

606: (0.1160,0.7575)

607:

608: \PST@Border(0.9470,0.7575)

609: (0.9320,0.7575)

610:

611: \rput[r](0.0850,0.7575){ 12}

612: \PST@Border(0.1010,0.8628)

613: (0.1160,0.8628)

614:

615: \PST@Border(0.9470,0.8628)

616: (0.9320,0.8628)

617:

618: \rput[r](0.0850,0.8628){ 14}

619: \PST@Border(0.1010,0.9680)

620: (0.1160,0.9680)

621:

622: \PST@Border(0.9470,0.9680)

623: (0.9320,0.9680)

624:

625: \rput[r](0.0850,0.9680){ 16}

626: \PST@Border(0.1010,0.1260)

627: (0.1010,0.1460)

628:

629: \PST@Border(0.1010,0.9680)

630: (0.1010,0.9480)

631:

632: \rput(0.1010,0.0840){ 0}

633: \PST@Border(0.2068,0.1260)

634: (0.2068,0.1460)

635:

636: \PST@Border(0.2068,0.9680)

637: (0.2068,0.9480)

638:

639: \rput(0.2068,0.0840){ 2}

640: \PST@Border(0.3125,0.1260)

641: (0.3125,0.1460)

642:

643: \PST@Border(0.3125,0.9680)

644: (0.3125,0.9480)

645:

646: \rput(0.3125,0.0840){ 4}

647: \PST@Border(0.4183,0.1260)

648: (0.4183,0.1460)

649:

650: \PST@Border(0.4183,0.9680)

651: (0.4183,0.9480)

652:

653: \rput(0.4183,0.0840){ 6}

654: \PST@Border(0.5240,0.1260)

655: (0.5240,0.1460)

656:

657: \PST@Border(0.5240,0.9680)

658: (0.5240,0.9480)

659:

660: \rput(0.5240,0.0840){ 8}

661: \PST@Border(0.6298,0.1260)

662: (0.6298,0.1460)

663:

664: \PST@Border(0.6298,0.9680)

665: (0.6298,0.9480)

666:

667: \rput(0.6298,0.0840){ 10}

668: \PST@Border(0.7355,0.1260)

669: (0.7355,0.1460)

670:

671: \PST@Border(0.7355,0.9680)

672: (0.7355,0.9480)

673:

674: \rput(0.7355,0.0840){ 12}

675: \PST@Border(0.8413,0.1260)

676: (0.8413,0.1460)

677:

678: \PST@Border(0.8413,0.9680)

679: (0.8413,0.9480)

680:

681: \rput(0.8413,0.0840){ 14}

682: \PST@Border(0.9470,0.1260)

683: (0.9470,0.1460)

684:

685: \PST@Border(0.9470,0.9680)

686: (0.9470,0.9480)

687:

688: \rput(0.9470,0.0840){ 16}

689: \PST@Border(0.1010,0.1260)

690: (0.9470,0.1260)

691: (0.9470,0.9680)

692: (0.1010,0.9680)

693: (0.1010,0.1260)

694:

695: \rput(0.5240,0.0210){No. processors}

696: \rput[r](0.8200,0.9270){stupid-parallel}

697: \PST@Solid(0.8360,0.9270)

698: (0.9150,0.9270)

699:

700: \PST@Solid(0.1539,0.1786)

701: (0.1539,0.1786)

702: (0.2068,0.1368)

703: (0.3125,0.1380)

704: (0.4183,0.1415)

705: (0.5240,0.1424)

706: (0.6298,0.1450)

707: (0.7355,0.1454)

708: (0.8413,0.1502)

709: (0.9470,0.1522)

710:

711: \rput[r](0.8200,0.8850){stupid-field}

712: \PST@Dashed(0.8360,0.8850)

713: (0.9150,0.8850)

714:

715: \PST@Dashed(0.1539,0.1786)

716: (0.1539,0.1786)

717: (0.2068,0.2261)

718: (0.3125,0.3254)

719: (0.4183,0.4364)

720: (0.5240,0.4814)

721: (0.6298,0.5562)

722: (0.7355,0.5976)

723: (0.8413,0.5887)

724: (0.9470,0.7241)

725:

726: \rput[r](0.8200,0.8430){linear}

727: \PST@Dotted(0.8360,0.8430)

728: (0.9150,0.8430)

729:

730: \PST@Dotted(0.1539,0.1786)

731: (0.1539,0.1786)

732: (0.1619,0.1866)

733: (0.1699,0.1946)

734: (0.1779,0.2025)

735: (0.1859,0.2105)

736: (0.1939,0.2185)

737: (0.2019,0.2265)

738: (0.2100,0.2344)

739: (0.2180,0.2424)

740: (0.2260,0.2504)

741: (0.2340,0.2584)

742: (0.2420,0.2663)

743: (0.2500,0.2743)

744: (0.2580,0.2823)

745: (0.2660,0.2903)

746: (0.2740,0.2982)

747: (0.2821,0.3062)

748: (0.2901,0.3142)

749: (0.2981,0.3221)

750: (0.3061,0.3301)

751: (0.3141,0.3381)

752: (0.3221,0.3461)

753: (0.3301,0.3540)

754: (0.3381,0.3620)

755: (0.3461,0.3700)

756: (0.3542,0.3780)

757: (0.3622,0.3859)

758: (0.3702,0.3939)

759: (0.3782,0.4019)

760: (0.3862,0.4099)

761: (0.3942,0.4178)

762: (0.4022,0.4258)

763: (0.4102,0.4338)

764: (0.4183,0.4418)

765: (0.4263,0.4497)

766: (0.4343,0.4577)

767: (0.4423,0.4657)

768: (0.4503,0.4736)

769: (0.4583,0.4816)

770: (0.4663,0.4896)

771: (0.4743,0.4976)

772: (0.4823,0.5055)

773: (0.4904,0.5135)

774: (0.4984,0.5215)

775: (0.5064,0.5295)

776: (0.5144,0.5374)

777: (0.5224,0.5454)

778: (0.5304,0.5534)

779: (0.5384,0.5614)

780: (0.5464,0.5693)

781: (0.5544,0.5773)

782: (0.5625,0.5853)

783: (0.5705,0.5932)

784: (0.5785,0.6012)

785: (0.5865,0.6092)

786: (0.5945,0.6172)

787: (0.6025,0.6251)

788: (0.6105,0.6331)

789: (0.6185,0.6411)

790: (0.6265,0.6491)

791: (0.6346,0.6570)

792: (0.6426,0.6650)

793: (0.6506,0.6730)

794: (0.6586,0.6810)

795: (0.6666,0.6889)

796: (0.6746,0.6969)

797: (0.6826,0.7049)

798: (0.6906,0.7128)

799: (0.6986,0.7208)

800: (0.7067,0.7288)

801: (0.7147,0.7368)

802: (0.7227,0.7447)

803: (0.7307,0.7527)

804: (0.7387,0.7607)

805: (0.7467,0.7687)

806: (0.7547,0.7766)

807: (0.7627,0.7846)

808: (0.7708,0.7926)

809: (0.7788,0.8006)

810: (0.7868,0.8085)

811: (0.7948,0.8165)

812: (0.8028,0.8245)

813: (0.8108,0.8325)

814: (0.8188,0.8404)

815: (0.8268,0.8484)

816: (0.8348,0.8564)

817: (0.8429,0.8643)

818: (0.8509,0.8723)

819: (0.8589,0.8803)

820: (0.8669,0.8883)

821: (0.8749,0.8962)

822: (0.8829,0.9042)

823: (0.8909,0.9122)

824: (0.8989,0.9202)

825: (0.9069,0.9281)

826: (0.9150,0.9361)

827: (0.9230,0.9441)

828: (0.9310,0.9521)

829: (0.9390,0.9600)

830: (0.9470,0.9680)

831:

832: \PST@Border(0.1010,0.1260)

833: (0.9470,0.1260)

834: (0.9470,0.9680)

835: (0.1010,0.9680)

836: (0.1010,0.1260)

837:

838: \catcode`@=12

839: \fi

840: \endpspicture

841: \caption{Speedup curves for {\tt stupid-parallel} and

842: {\tt stupid-field} for a $200\times200$ grid with 4000 stupid bugs

843: moving and growing. Bug reproduction and mortality as well as

844: predation have been turned off. At no stage does {\tt stupid-parallel}

845: run as fast in parallel as it does sequentially, due to the overheads

846: of the {\tt Prepare\_Neighbours()} step.}

847: \label{speedup}

848: \end{figure}

849:

The parallel computing experiments were performed on a Linux cluster
(Beowulf style) with dual 3\,GHz Pentium 4 Xeon nodes connected via
Gigabit Ethernet. Each node has 2\,GB of memory.

853:

854: \section{Conclusion}

855:

856: The aim of this study was to answer the following questions:

857: \begin{itemize}

\item Is \EcoLab{} suitable for the sorts of agent based models that
  other, better known platforms are used for?
\item What performance advantages, if any, does the use of C++
  provide?
\item What deficiencies are present in \EcoLab{}?

863: \end{itemize}

864:

Stupid Model is a nontrivial, yet fairly simple agent based model that
can be implemented without an excessive amount of programming.
\EcoLab{} has shown itself to be capable of implementing Stupid Model
with about the same effort as reported by developers of the Repast
and Mason versions of the model, and the implementation required around
the same number of lines of code. Furthermore, performance was on a par
with these Java-based platforms.

872:

873: The main deficiencies encountered were:

874: \begin{itemize}

\item a lack of a specialised space library, or of a library of examples of
  the use of Graphcode for implementing spaces;
\item a lack of a simple raster object for displaying spaces (the provided canvas
  functionality is very slow);
\item GUI functionality that is slow compared with that of the Java-based
  platforms;
\item the smart pointer template \verb+ref+, which needs to be improved.

882: \end{itemize}

883:

To address the space library issue, I will start by implementing
a few well known ABM models to build up a library of practice. Where
code is common to several models, it can be refactored into a library.

887:

To address the GUI performance, a possible future strategy is to develop a
Classdesc C++/Java interface to enable C++ coded \EcoLab{} models to
run under a Java framework such as Repast. A similar strategy,
integrating C++ and Objective C using Classdesc, was investigated for
Swarm integration; however, it never found practical use and is no
longer being maintained~\cite{Leow-Standish03}. The feasibility of doing
this with a Java platform will be the subject of future work.

895:

896: %\bibliographystyle{plain}

897: %\bibliography{rus}

898: \begin{thebibliography}{10}

899:

900: \bibitem{Boisvert-etal01}

R.~F. Boisvert, J.~Moreira, M.~Philippsen, and R.~Pozo.
\newblock {Java and numerical computing}.
\newblock {\em Computing in Science \& Engineering}, 3(2):18--24, 2001.

905:

906: \bibitem{Boost}

907: {Boost C++ Libraries}.

908: \newblock http://www.boost.org/.

909:

910: \bibitem{EcoLab}

911: {\EcoLab{}} website.

912: \newblock http://ecolab.sourceforge.net.

913:

914: \bibitem{GoslingJava}

915: James Gosling, Bill Joy, and Guy~L. Steele, Jr.

916: \newblock {\em The {Java} Language Specification}.

917: \newblock Addison-Wesley, 3rd edition, 2005.

918:

919: \bibitem{Grimm-Railsback05}

920: V.~Grimm and S.~F. Railsback.

921: \newblock {\em Individual-based Modeling and Ecology}.

922: \newblock Princeton UP, 2005.

923:

924: \bibitem{JavaGrande}

925: {Java Grande}.

926: \newblock http://www.javagrande.org/.

927:

928: \bibitem{Lewis-Neumann03}

J.~P. Lewis and Ulrich Neumann.
\newblock Performance of {Java} versus {C++}.
\newblock http://www.idiom.com/\~{}zilla/Computer/javaCbenchmark.html, 2003.

932:

933: \bibitem{Leow-Standish03}

934: Richard Leow and Russell~K. Standish.

935: \newblock Running {C++} models under the {Swarm} environment.

936: \newblock In {\em Proceedings SwarmFest 2003}, 2003.

937: \newblock arXiv:cs.MA/0401025.

938:

939: \bibitem{Luke-etal05}

940: Sean Luke, Claudio Cioffi-Revilla, Liviu Panait, Keith Sullivan, and Gabriel

941:   Balan.

942: \newblock {MASON}: A multiagent simulation environment.

943: \newblock {\em Simulation}, 81:517--527, 2005.

944:

945: \bibitem{Madina-Standish01}

946: Duraid Madina and Russell~K. Standish.

947: \newblock A system for reflection in {C++}.

948: \newblock In {\em Proceedings of AUUG2001: Always on and Everywhere}.

949:   Australian Unix Users Group, 2001.

950:

951: \bibitem{McFadzean94}

952: David McFadzean.

953: \newblock {SimBioSys}: A class framework for biological simulations.

954: \newblock Master's thesis, Dept. of Computer Science, Calgary, Alberta, 1994.

955: \newblock http://www.lucifer.com/\~{}david/thesis/.

956:

957: \bibitem{Swarm}

958: Nelson Minar, Roger Burkhart, Christopher~G. Langton, and Manor Askenazi.

959: \newblock The {Swarm} simulation system: A toolkit for building multi-agent

960:   simulations.

961: \newblock Technical Report WP96-06-042, Santa Fe Institute, 1996.

962: \newblock http://www.swarm.org.

963:

964: \bibitem{North-etal06}

M.~J. North, N.~T. Collier, and J.~R. Vos.

966: \newblock Experiences creating three implementations of the {Repast} agent

967:   modeling toolkit.

968: \newblock {\em ACM Transactions on Modeling and Computer Simulation}, 16:1--25,

969:   2006.

970:

971: \bibitem{Railsback-etal06}

972: S.~F. Railsback, S.~L. Lytinen, and S.~K. Jackson.

973: \newblock Agent-based simulation platforms: Review and development

974:   recommendations.

975: \newblock {\em Simulation}, 82:609--623, 2006.

976:

977: \bibitem{StupidSpec}

978: Steve Railsback, Steve Lytinen, and Volker Grimm.

979: \newblock {StupidModel} and extensions: A template and teaching tool for

980:   agent-based modeling platforms.

981: \newblock http://condor.depaul.edu/\~{}slytinen/abm/StupidModel.

982:

983: \bibitem{mpiref}

984: Marc Snir et~al.

985: \newblock {\em MPI: the complete reference}.

986: \newblock MIT Press, Cambridge, MA, 1996.

987:

988: \bibitem{Standish-Leow03}

989: Russell~K. Standish and Richard Leow.

990: \newblock {EcoLab}: Agent based modeling for {C++} programmers.

991: \newblock In {\em Proceedings SwarmFest 2003}, 2003.

992: \newblock arXiv:cs.MA/0401026.

993:

994: \bibitem{Standish-Madina06}

995: Russell~K. Standish and Duraid Madina.

996: \newblock Classdesc and graphcode: support for scientific programming in {C++}.

997: \newblock {\em International Journal for High Performance Computing and

998:   Applications}, 2006.

999: \newblock submitted.

1000:

1001: \bibitem{Stroustrup97}

1002: Bjarne Stroustrup.

1003: \newblock {\em The {C++} Programming Language}.

1004: \newblock Addison-Wesley, Reading, Mass., 3rd edition, 1997.

1005:

1006: \end{thebibliography}

1007:

1008: \end{document}

1009: