cs0402049/pcga.tex
1: %\documentstyle[amstex, subfigure, epsfig, my-apa-uiuc]{llncs}
2: %\documentstyle[amstex, subfigure, epsfig]{llncs}
3: \documentclass{llncs}
4: 
5: \usepackage{epsfig}
6: %\usepackage{subfigure}
7: %\usepackage{fancybox}
8: 
9: \begin{document}
10: 
11: 
12: \title{An architecture for massively parallelization of the 
13: compact genetic algorithm}
14: 
15: \author{Fernando G. Lobo \and Cl\'audio F. Lima \and Hugo M\'artires}
16: \institute{ADEEC-FCT\\
17:    Universidade do Algarve\\
18:    Campus de Gambelas\\
19:    8000-062 Faro, Portugal\\
20:    \{flobo,clima\}@ualg.pt, hmartires@myrealbox.com 
21: }
22: \maketitle
23: 
24: 
25: \begin{abstract}
26: This paper presents an architecture which is suitable for a 
27: massive parallelization of the compact genetic algorithm. The
28: resulting scheme has three major advantages. First, it has 
29: low synchronization costs. Second, it is fault tolerant, and
30: third, it is scalable.
31: 
32: The paper argues that the benefits that can be obtained with
33: the proposed approach are potentially higher than those obtained 
34: with traditional parallel genetic algorithms.
35: In addition, the ideas suggested in the paper may also be 
36: relevant towards parallelizing more complex probabilistic 
37: model building genetic algorithms.
38: \end{abstract}
39: 
40: 
41: \section{Introduction}
42: \label{sec:introduction}
43: 
44: There have been several efforts in the field of evolutionary computation
45: towards improving the genetic algorithm's efficiency. One of 
46: the efficiency enhancement techniques that has been investigated, 
47: both in theory and in practice, is the topic of parallelization
48: \cite{Cantupaz:2000}.
49: 
50: In this paper, parallelization is investigated further, this time
51: in the context of Probabilistic Model Building Genetic Algorithms (PMBGAs),
52: a class of genetic algorithms whose operational mechanics differ
53: somewhat from those of the traditional GAs.
54: 
55: Efficiency is a very important factor in problem solving.
56: When talking about computer algorithms, efficiency is usually addressed
57: along two major axes: {\em Time} and {\em Memory} requirements needed
58: to solve a problem.
59: In the context of genetic and other evolutionary algorithms, there 
60: is another axis, {\em Solution Quality}, that also needs to be addressed. 
61: This third
62: aspect comes into play because many of the problems that genetic and 
63: evolutionary algorithms attempt to solve cannot be solved optimally
64: with 100\% confidence unless a complete enumeration of the search
65: space is performed. Therefore, genetic algorithms, as well
66: as many other methods, use a search bias to try to give good 
67: approximate solutions without doing a complete exploration of the 
68: search space.
69: 
70: Summarizing, efficiency in genetic algorithms translates into a 
71: 3-objective problem: (1) maximize solution quality, (2) minimize
72: execution time, and (3) minimize memory resources. The third is 
73: usually not a great concern because in traditional GAs, memory
74: requirements are constant throughout a run, leaving us with 
75: a tradeoff between solution quality and execution time.
76: 
77: The rest of the paper is organized as follows.
78: The next section presents background material on parallel GAs
79: and on PMBGAs.
80: Then, section~\ref{sec:motivation} raises some issues that 
81: can be explored when parallelizing PMBGAs that were not
82: possible with regular GAs. Section~\ref{sec:architecture}
83: presents an architecture that allows a massive parallelization
84: of the compact GA, and in section~\ref{sec:experiments}
85: computer experiments are conducted and their results are discussed. 
86: Finally, a number of extensions are outlined, and the paper finishes with 
87: a summary and some conclusions.
88: 
89: \section{Background}
90: \label{sec:background}
91: 
92: This section presents background material which is necessary for 
93: understanding the rest of the paper. It starts with a review of the 
94: major issues involved in parallelizing GAs, and then reviews the 
95: basic ideas of PMBGAs giving particular emphasis to the compact GA.
96: 
97: \subsection{Parallel GAs}
98: \label{sec:parallel_gas}
99: 
100: An important efficiency question that people are faced with in problem
101: solving is the following: Given a fixed computational time, what is the 
102: best way to allocate computer resources in order to have as good a 
103: solution as possible?
104: 
105: Under such a challenge, the idea of parallelization stands out naturally
106: as a way of improving the efficiency of the problem solving task. By using
107: multiple computers in parallel, there is an opportunity for delivering
108: better solutions in a shorter period of time.
109: 
110: Many computer algorithms are difficult to parallelize, but that is not
111: the case with GAs because GAs work with a population of solutions which 
112: can be evaluated independently of one another. Moreover, in many problems
113: most of the time is spent on evaluating solutions rather than on the
114: internal mechanisms of the GA operators themselves. Indeed, the time 
115: spent on the GA operators is usually negligible compared to the time spent 
116: on evaluating individual solutions.
117: 
118: Several researchers have investigated the topic of parallel GAs. The
119: major design issues lie in choices such as using one or more populations
120: and, in the case of multiple populations, deciding when, with whom, and
121: how often individuals communicate with individuals of other
122: populations.
123: 
124: Although implementing parallel genetic algorithms is relatively simple,
125: the answers to the questions raised above are not so straightforward and 
126: traditionally have only been answered by means of empirical experimentation. 
127: One exception to that has been the work of Cant\'u-Paz \cite{Cantupaz:2000}
128: who has built theoretical models that lead to rational decisions for 
129: setting the different parameters involved in parallelizing GAs. 
130: There are two major ways of implementing parallel GAs:
131: 
132: \begin{enumerate}
133: \item Using a single population.
134: \item Using multiple populations. 
135: \end{enumerate}
136: 
137: In single population parallel GAs, also called Master-Slave parallel GAs, 
138: one computer (the master) executes the GA operations and distributes 
139: individuals to be evaluated by other computers (the slaves). After
140: evaluating the individuals, the slaves return the results back to the
141: master. There can be significant benefits with such a scheme because
142: the slaves can work in parallel, independently of one another.
143: On the other hand, there is an extra overhead in communication costs
144: that must be paid in order to communicate individuals and fitness values
145: back and forth.
146: 
147: In multiple population parallel GAs, what would be a whole population 
148: in a regular non-parallel GA, becomes several smaller populations 
149: (usually called demes),
150: each of which is located in a different computer.
151: Each computer executes a regular GA and occasionally, individuals
152: may be exchanged with individuals from other populations. Multiple population
153: parallel GAs are much harder to design because there are more degrees of
154: freedom to explore. Specifically, four main things need to be chosen: (1) the 
155: size of each population, (2) the topology of the connection between the 
156: populations, (3) the number of individuals that are exchanged,
157: and (4) how often individuals are exchanged.
158: 
159: Cant\'u-Paz investigated both approaches and
160: concluded that for the case of the Master-Slave architecture, the benefits of 
161: parallelization occur mainly on problems with long function evaluation 
162: times because it needs constant communication. Multiple population
163: parallel GAs have lower communication costs but do not completely avoid
164: the communication scalability problem. In other words, in either approach,
165: communication costs impose a limit on how fast parallel GAs can be.
166: To overcome this limitation, Cant\'u-Paz proposed a combination
167: of the two approaches in what was called {\em Hierarchical Parallel 
168: GAs}, and verified that when using such an approach it is possible 
169: to reduce the execution time more than by using either approach alone.
170: The interested reader is referred to the original source 
171: for the mathematical formulation and for additional information on the
172: design of parallel GAs.
173: 
174: 
175: \subsection{Probabilistic Model Building Genetic Algorithms}
176: \label{sec:pmbgas}
177: 
178: Probabilistic Model Building Genetic Algorithms (PMBGAs), also
179: referred to by some authors as 
180: {\em Estimation of Distribution Algorithms} (EDAs), or
181: {\em Iterated Density Evolutionary Algorithms} (IDEAs),
182: are a class of Evolutionary Algorithms that replace the traditional 
183: variation operators, crossover and mutation, by the construction of a 
184: probabilistic model of the population and subsequent sampling from that 
185: model to obtain a new population of individuals. The 
186: operation of PMBGAs can be summarized by the following 5 steps:
187: 
188: \begin{enumerate}
189: \item Create a random population of individuals.
190: \item Apply selection to obtain a population of ``good'' individuals.
191: \item Build a probabilistic model of those good individuals.
192: \item Generate a new population according to the probabilistic model.
193: \item Return to step 2.
194: \end{enumerate}
195: 
196: Work in this area began with simple probabilistic models that 
197: treated each gene independently, sometimes also called order-1 models.
198: Later, more complex algorithms were developed to allow dependencies
199: among genes. A detailed review of these algorithms can be found 
200: elsewhere \cite{Pelikan:02,larranaga:2001}.
201: 
202: The next subsection reviews in detail the compact GA \cite{Harik:99e},
203: which is an example of an order-1 PMBGA, and whose parallelization is
204: discussed later in the paper.
205: 
206: \subsection{The Compact Genetic Algorithm}
207: \label{sec:cga}
208: 
209: Consider a 5-bit problem with a population of 10 individuals as shown
210: below:
211: 
212: \begin{center}
213: \begin{tabular}{|c c c c c|}
214:   \hline 
215:          1 &  0 &  0 &  0 & 0\\ 
216:          1 &  1 &  0 &  0 & 1\\
217:          0 &  1 &  1 &  1 & 1\\
218:          1 &  1 &  0 &  0 & 0\\
219:          0 &  1 &  1 &  0 & 1\\
220:          0 &  1 &  1 &  1 & 0\\
221:          1 &  1 &  0 &  0 & 0\\
222:          1 &  0 &  0 &  0 & 0\\  
223:          0 &  1 &  1 &  0 & 1\\  
224:          1 &  0 &  0 &  1 & 1\\  
225:   \hline
226: \end{tabular}
227: \end{center}
228: 
229: %\noindent
230: Under the compact GA, the population can be represented 
231: by the following probability vector:
232: 
233: \begin{center}
234: \begin{tabular}{|c|c|c|c|c|}
235:   \hline 
236:   0.6 & 0.7 & 0.4 & 0.3 & 0.5 \\ 
237:   \hline
238: \end{tabular}
239: \end{center}
240: 
241: %\noindent
242: The probabilities are the relative frequency counts of the number
243: of 1's for the different gene positions, and can be interpreted as
244: a compact representation of the population. In other words, the
245: individuals of the population could have been sampled from the 
246: probability vector.
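
The frequency counts in the example above can be checked with a few lines of
code. The following is only an illustrative sketch; the function name
\texttt{probability\_vector} is ours and does not come from the compact GA
literature.

```python
# The 10 individuals of the 5-bit example population, one row per individual.
population = [
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 1, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1],
]


def probability_vector(pop):
    """Relative frequency of 1s at each gene position (column-wise mean)."""
    n = len(pop)
    return [sum(column) / n for column in zip(*pop)]


print(probability_vector(population))  # [0.6, 0.7, 0.4, 0.3, 0.5]
```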
247: 
248: Harik et al. \cite{Harik:99e} noticed that it was possible to mimic the behavior 
249: of a simple GA, without storing the population explicitly.
250: This observation came from the fact that during the course of a regular 
251: GA run, alleles compete with each other at every gene position. At the 
252: beginning, scanning the 
253: population column-wise, we should expect to observe that 
254: roughly 50\% of the alleles have value 0 and  50\% of the alleles have 
255: value 1. As the search progresses, for each column, either the 
256: zeros take over the ones, or vice-versa. 
257: Harik et al. built an algorithm that explicitly simulates 
258: the random walk that takes place on the allele frequency makeup 
259: for every gene position.
260: The resulting algorithm, the compact GA, was shown to be operationally 
261: equivalent to a simple GA that does not assume any linkage between 
262: genes. 
263: 
264: The compact GA does not follow exactly the 5 steps mentioned previously
265: (in section~\ref{sec:pmbgas}) for a typical PMBGA, because the algorithm 
266: does not manipulate the population explicitly. Instead, it does so in an 
267: indirect way through the update step of $1/N$, where $N$ denotes the 
268: population size of a regular GA.
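
To make the update step concrete, the following is a minimal sketch of a
compact GA under stated assumptions: it uses pairwise competition (binary
tournament) between two sampled individuals, stores the model as integer
counts in $0,\ldots,N$ so that each update is exactly $1/N$, and stops when
every position has converged. The function name \texttt{compact\_ga}, its
parameters, and the OneMax fitness used in the demonstration are illustrative
choices, not a reproduction of the original implementation.

```python
import random


def compact_ga(fitness, length, n, max_evals=200_000, rng=None):
    """Pairwise-competition compact GA sketch (binary tournament).

    The model is kept as integer counts in 0..n, so count/n is the
    probability of a 1 and each update is exactly one step of 1/n.
    """
    rng = rng or random.Random()
    counts = [n // 2] * length          # probability 0.5 everywhere (n even)
    evals = 0
    while evals < max_evals and any(0 < c < n for c in counts):
        # Sample two individuals from the current probability vector.
        a = [1 if rng.random() * n < c else 0 for c in counts]
        b = [1 if rng.random() * n < c else 0 for c in counts]
        evals += 2
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        # Shift each differing position one step (1/n) toward the winner.
        for i in range(length):
            if winner[i] != loser[i]:
                counts[i] += 1 if winner[i] == 1 else -1
    return [c / n for c in counts], evals


# Demonstration on OneMax (fitness = number of ones), a hypothetical test case.
vector, used = compact_ga(sum, length=20, n=100, rng=random.Random(1))
print(vector, used)
```

Note that no population is ever stored: memory use is one counter per gene,
regardless of the simulated population size $N$.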
269: 
270: 
271: \section{Motivation for parallelizing PMBGAs}
272: \label{sec:motivation}
273: 
274: The main motivation for parallelizing PMBGAs is the same as the one
275: for parallelizing regular GAs, or any other algorithm: efficiency. 
276: By using multiple computers it is possible to make the algorithm run faster.
277: 
278: In many ways, parallelizing PMBGAs has many similarities with
279: parallelizing regular GAs. On the other hand, the mechanics of PMBGAs
280: are different from those of regular GAs, and it is possible to take advantage
281: of that. Specifically, it is possible to increase efficiency by exploiting
282: the following two things:
283: 
284: \begin{enumerate}
285: \item Parallelize model building.
286: \item Communicate model rather than individuals.
287: \end{enumerate}
288: 
289: In regular GAs, the time spent on the GA operations (selection, crossover,
290: and mutation) is usually negligible compared to the time spent in fitness 
291: function evaluations. When using PMBGAs, and especially when
292: using multivariate models,
293: the model-building phase is much more compute intensive than the usual 
294: crossover and mutation operators of a regular GA. For many problems, such 
295: overhead can contribute to a significant fraction of the overall execution 
296: time. In such cases, it makes a lot of sense to parallelize the 
297: model-building phase. There have been a couple of research efforts 
298: addressing this topic \cite{Ocenasek:03,Lam:2002}.
299: 
300: Another aspect that makes PMBGAs very attractive for parallelization
301: comes from the observation that the model is a compact representation of 
302: the population, and it is possible to communicate the model rather than 
303: individuals themselves. Communication costs can be reduced this way 
304: because the model needs significantly less storage than the whole population.
305: Since communication costs can be drastically reduced, it might make
306: sense to clone the model to several computers, and each computer could 
307: work independently on solving the problem by running a separate PMBGA.
308: Then, the different models would have to be consolidated (or mixed)
309: once in a while.
310: The next section presents an architecture that implements this idea 
311: with the compact GA.
312: 
313: \section{An architecture for building a massively parallel compact GA}
314: \label{sec:architecture}
315: 
316: This section presents an architecture which is suitable for a scalable
317: parallelization of the compact GA. Similar schemes can be done with other
318: order-1 PMBGAs. However, the connection that exists
319: between the population size and the update step, makes the compact GA
320: more suitable when working with very large populations, a topic that is
321: revisited later.
322: 
323: Since the model-building phase of the compact GA is trivial,
324: our study focuses only on the second item mentioned in 
325: section~\ref{sec:motivation}: communicating the model rather than
326: individuals. In the case of the compact GA, the model is represented 
327: by a probability vector of size $\ell$ ($\ell$ is the chromosome length). 
328: Each variable of the probability vector contains a value which has to 
329: be a member of a finite set of $N+1$ values ($N$ denotes the size of 
330: the population that the compact GA is simulating). The $N+1$ numbers 
331: correspond to all possible allele frequency counts for a particular
332: gene ($0$, $1$, $2$, \ldots, $N$), and can be stored with 
333: $\lceil \log_{2} (N+1) \rceil$ bits. Therefore, the probability vector can be represented
334: with $\ell \times \lceil \log_{2} (N+1) \rceil$ bits. This value is of a different 
335: order of magnitude than the $\ell \times N$ bits needed to represent a 
336: population in a regular GA, making it feasible to communicate the model 
337: back and forth between different computers. 
338: 
339: The storage savings are especially important when using large populations.
340: For instance, let us suppose that we are interested
341: in solving a 1000-bit problem using a population of size 1 million.
342: With a regular parallel GA, in order to communicate the whole population
343: it would be necessary to transmit approximately 1 gigabit over a network.
344: Instead, with the compact GA, it would only
345: be necessary to transmit 20 thousand bits. The difference is large
346: and suggests that running multiple compact GAs in parallel with model
347: exchanges once in a while is something that deserves to be explored.
348: We have devised an architecture, which we call {\em manager-worker}, 
349: that implements this idea.
350: Figure~\ref{fig:architecture} shows a schematic of the approach.
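
As a check on the figures quoted for the 1000-bit example ($\ell = 1000$,
$N = 10^{6}$), and rounding the per-gene storage up to a whole number of bits:
\begin{eqnarray*}
\ell \times N & = & 10^{3} \times 10^{6} = 10^{9} \mbox{ bits},\\
\ell \times \lceil \log_{2} (N+1) \rceil & = & 10^{3} \times 20
   = 2 \times 10^{4} \mbox{ bits},
\end{eqnarray*}
since $\log_{2}(10^{6}+1) \approx 19.93$. The model is therefore about
$5 \times 10^{4}$ times smaller than the population it represents.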
351: 
352: \begin{figure}
353: \centering
354: \epsfig{figure=architecture.eps,width=0.9\textwidth}
355: \caption{Manager-worker architecture.}
356: \label{fig:architecture}
357: \end{figure}
358: 
359: Although Figure~\ref{fig:architecture} resembles
360: a master-slave configuration, we decided to give it a different name
361: (manager-worker) to contrast with the usual 
362: master-slave architecture of regular parallel GAs. There, the master 
363: executes and coordinates the GA operations and the slaves just compute 
364: fitness function evaluations. In the case of the 
365: parallel compact GA that we are suggesting, the manager also coordinates 
366: the work of the workers, but each worker runs a compact GA on its own.
367: There can be an arbitrary number of workers and there is no direct 
368: communication among them; the only communication that takes place 
369: occurs between the manager and a worker. 
370: 
371: 
372: \subsection{Operational details}
373: 
374: One could think of different ways of parallelizing the compact GA. 
375: Indeed, some researchers have proposed different schemes
376: \cite{Ahn:2003,Hidalgo:2003}.
377: The
378: way that we are about to propose is particularly attractive because once the
379: manager starts, there can be an arbitrary number of workers, each of which
380: can start and finish at any given point in time making the whole system
381: fault tolerant. The operational details consist of the following seven steps:
382: 
383: \begin{enumerate}
384: \item The manager initializes a probability vector of size $\ell$ 
385: with each variable set to $0.5$. Then it goes to sleep, and waits to 
386: be woken up by some worker computer.
387: 
388: \item When a worker computer comes into action for the first time, 
389: it sends a signal to the manager saying that it is ready to start working.
390: 
391: \item The manager wakes up, sends a copy of its probability vector to 
392: the worker, and goes back to sleep. 
393: 
394: \item Once the worker receives the probability vector, it explores $m$ new
395: individuals with a compact GA. During this period, $m$ fitness function
396: evaluations are performed and the worker's local probability vector
397: (which initially is just a copy of the manager's probability vector)
398: is updated along the way.
399: 
400: \item After $m$ fitness function evaluations have elapsed,
401: the worker wakes up the manager in order to report the results
402: of those $m$ function evaluations. The results can be summarized by
403: sending only the differences that occurred between the vector that was
404: sent from the manager and the worker's vector state after the execution
405: of the $m$ fitness function evaluations.
406: 
407: \item When the manager receives the probability vector differences sent 
408: by the worker, it updates its own probability vector by adding the differences
409: to its current vector. 
410: 
411: \item Then it sends the newly updated probability vector back to the worker.
412: The manager goes back to sleep and the worker starts working for $m$
413: more fitness function evaluations (back to step 4).
414: \end{enumerate}
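
The exchange in steps 3 and 5 to 7 can be sketched as follows. This is only an
illustration under stated assumptions: the class \texttt{Manager}, the helper
\texttt{differences}, and the example numbers are hypothetical, the vector is
stored as integer counts in $0,\ldots,N$, and clamping the sum to that range
is our choice (the steps above only say that the differences are added).

```python
class Manager:
    """Holds the shared model as integer counts in 0..n (count/n = probability)."""

    def __init__(self, length, n):
        self.n = n
        self.counts = [n // 2] * length      # step 1: every probability is 0.5

    def checkout(self):
        """Steps 3 and 7: hand a copy of the current vector to one worker."""
        return list(self.counts)

    def report(self, diff):
        """Step 6: add a worker's differences, then return the updated copy."""
        # Clamping to [0, n] is an assumption; the text only says "adding".
        self.counts = [min(self.n, max(0, c + d))
                       for c, d in zip(self.counts, diff)]
        return self.checkout()


def differences(received, local):
    """Step 5: what changed in the worker's vector since it was received."""
    return [after - before for before, after in zip(received, local)]


manager = Manager(length=3, n=100)
start_a = manager.checkout()                 # worker A joins: [50, 50, 50]
start_b = manager.checkout()                 # worker B joins: [50, 50, 50]

# Pretend each worker ran m evaluations and ended with these local vectors.
local_a = [52, 49, 50]
local_b = [51, 51, 47]

next_a = manager.report(differences(start_a, local_a))
next_b = manager.report(differences(start_b, local_b))
print(next_a)   # [52, 49, 50]
print(next_b)   # [53, 50, 47]
```

Worker B's reply already contains worker A's contribution, whereas worker A
will only see worker B's contribution after its next report.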
415: 
416: 
417: There are a number of subtle points that are worth mentioning.
418: First of all, step number 7 is not a broadcast operation. The manager
419: just sends its newly updated probability vector to one particular worker. 
420: Notice, however, that the manager's probability vector not only incorporates
421: the results of the $m$ function evaluations performed by that particular
422: worker, but it also incorporates the results of the evaluations conducted
423: by the other workers. That is, while a particular worker is working, other
424: workers might be updating the manager's probability vector. Thus, at a given
425: point in time, workers are working with a slightly outdated probability
426: vector. Although this might seem a disadvantage at first sight, the error
427: that is committed by working with a slightly outdated probability vector is
428: likely to be negligible for the overall search because an iteration
429: of the compact GA represents only a small step in the action of the GA
430: (this is especially true for large population sizes). 
431: The proposed parallelization scheme has several advantages, namely:
432: 
433: \begin{itemize}
434: \item Low synchronization costs.
435: \item Fault tolerance.
436: \item Scalability.
437: \end{itemize}
438: 
439: All the communication that takes place consists of short transactions. 
440: Workers do their job independently and only interrupt the manager once 
441: in a while. During the interruption period, the manager communicates
442: with a single worker, and the other workers can continue working non-stop.
443: 
444: The architecture is fault tolerant because workers can go up or down at any
445: given point in time. This makes it suitable for massive parallelization
446: using the Internet. It is scalable because potentially there is no limit 
447: on the number of workers.
448: 
449: \section{Computer simulations}
450: \label{sec:experiments}
451: 
452: This section presents computer simulations that were done to validate
453: the proposed approach. For the purpose of this paper, we are only interested 
454: in checking if the idea is valid. Therefore, and in order to simplify 
455: both the implementation and the interpretation of the results, we decided to 
456: do a serial implementation of the parallel compact GA architecture.
457: Although it might seem strange (after all, we are describing a scheme
458: for doing massive parallelization), doing a serial simulation of the behavior
459: of the algorithm has a number of advantages:
460: 
461: \begin{itemize}
462: \item we can analyze the algorithm's behavior under carefully controlled
463: conditions.
464: \item we can do scalability tests by simulating a parallel compact GA with
465: a large number of computers without having the hardware.
466: \item we can ignore network delays and different execution speeds of different
467: machines.
468: \end{itemize}
469: 
470: The serial implementation that we have developed simulates
471: $P$ worker processors and one manager processor. The 
472: $P$ worker processors start running at the same time and they all 
473: execute at the same speed. In addition, it is assumed that the communication
474: cost associated with a manager-worker transaction takes a constant time 
475: which is proportional to the probability vector's size. 
476: Such a scheme can be implemented by having a collection of $P$
477: regular compact GAs, each one with its own probability vector, and iterating
478: through all of them, doing a small step of the compact GA main loop, 
479: one at a time. After a particular compact GA worker completes $m$ fitness function
480: evaluations, the worker-manager communication is simulated as described
481: in section~\ref{sec:architecture}.
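
A round-robin simulation of this kind might look like the sketch below. It is
not the code used for the experiments: it assumes pairwise competition instead
of a selection rate of $s=8$, an even value of $m$, integer counts in
$0,\ldots,N$ with clamping, and a stopping rule that checks the manager's
vector for convergence after each report. The names \texttt{simulate} and
\texttt{onemax} are illustrative.

```python
import random


def onemax(bits):
    """Hypothetical stand-in fitness: number of ones."""
    return sum(bits)


def simulate(P, m, n, length, fitness=onemax, seed=0, max_rounds=1_000_000):
    """Round-robin serial simulation of P compact GA workers and one manager.

    Returns (evaluations per worker, communication steps per worker,
    final manager counts). Assumes m is even (two evaluations per step).
    """
    rng = random.Random(seed)
    manager = [n // 2] * length
    received = [list(manager) for _ in range(P)]   # vector each worker got
    local = [list(manager) for _ in range(P)]      # vector each worker updates
    evals = [0] * P
    comms = [0] * P

    def sample(counts):
        return [1 if rng.random() * n < c else 0 for c in counts]

    for _ in range(max_rounds):
        for w in range(P):
            # One small step of the compact GA main loop for worker w.
            a, b = sample(local[w]), sample(local[w])
            winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
            for i in range(length):
                if winner[i] != loser[i]:
                    step = 1 if winner[i] == 1 else -1
                    local[w][i] = min(n, max(0, local[w][i] + step))
            evals[w] += 2
            if evals[w] % m == 0:
                # Worker reports differences; manager adds them and replies.
                for i in range(length):
                    delta = local[w][i] - received[w][i]
                    manager[i] = min(n, max(0, manager[i] + delta))
                received[w] = list(manager)
                local[w] = list(manager)
                comms[w] += 1
                if all(c in (0, n) for c in manager):
                    return evals, comms, manager
    return evals, comms, manager


for workers in (1, 2, 4, 8):
    ev, cm, _ = simulate(P=workers, m=8, n=200, length=20)
    print(workers, sum(ev) / workers, sum(cm) / workers)
```

Because all workers advance in lockstep, the evaluation counts of any two
workers differ by at most one step, which mirrors the assumption that all
processors run at the same speed.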
482: 
483: We present experiments on a single problem, a bounded deceptive
484: function consisting of the concatenation of 10 copies of a 3-bit trap
485: function with deceptive-to-optimal ratio of $0.7$ \cite{Deb:93a*}.
486: This same function has been used in the original compact GA work.
487: We simulated a selection rate of $s=8$
488: and ran tests with a population size of $N=100000$ individuals
489: (each worker processor runs a compact GA that simulates
490: a population of size 100000). We chose this population size because we
491: wanted to use a size large enough to solve all the building blocks
492: correctly. We used $s=8$ following the recommendation given by Harik et al.
493: in the original compact GA paper for this type of problem. Finally, we
494: chose this problem as a test function because, even though the compact
495: GA is a poor algorithm for solving the problem, we wanted to use a function
496: that requires a large population size because those are the situations where
497: the benefits from parallelization are more pronounced.
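
For readers unfamiliar with trap functions, the sketch below shows one common
form of the test problem. The exact shape is an assumption on our part: the
all-ones block scores $1$, and otherwise the score falls linearly from the
deceptive value $0.7$ at zero ones to $0$ at two ones. The names
\texttt{trap3} and \texttt{concatenated\_traps} are illustrative.

```python
def trap3(u, deceptive=0.7, optimal=1.0):
    """3-bit trap on u = number of ones in the block (0..3).

    Assumed shape: all ones is optimal; otherwise fitness falls linearly
    from `deceptive` at u = 0 to 0 at u = 2, pulling search toward zeros.
    """
    if u == 3:
        return optimal
    return deceptive * (2 - u) / 2


def concatenated_traps(bits, block=3):
    """Sum of trap3 over consecutive non-overlapping 3-bit blocks."""
    return sum(trap3(sum(bits[i:i + block]))
               for i in range(0, len(bits), block))


print(concatenated_traps([1] * 30))   # global optimum: 10 blocks of 111
print(concatenated_traps([0] * 30))   # deceptive attractor: 10 blocks of 000
```

An order-1 model sees each bit independently, and on average a single 0 looks
better than a single 1 within a block, which is why large populations are
needed to solve the building blocks correctly.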
498: 
499: Having fixed both the population size
500: and the selection rate, we decided to systematically vary the number
501: of worker processors $P$, as well as the $m$ parameter which has an 
502: effect on the rate of communication that occurs between the
503: manager and a worker.
504: We did experiments for $P$ in \{1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024\},
505: and for a particular $P$, we varied the parameter $m$ in 
506: \{8, 80, 800, 8000, 80000\}. This totalled 55 different configurations,
507: each of which was run 30 independent times.
508: 
509: \begin{figure}[htb]
510: \centering
511: \mbox{
512:   %\subfigure[]
513:     {\epsfig{figure=fe_log2_log10.eps, width=.48
514:         \textwidth}}\quad
515:   %\subfigure[]
516:     {\epsfig{figure=cs_log2_log10.eps, width=.48
517:         \textwidth}}
518: }
519: \caption{Both graphs depict a log-log plot. On the left, we see the average number of
520: function evaluations per processor. On the right, we see the average number of communication
521: steps per processor.}
522: \label{fig:experiments}
523: \end{figure}
524: 
525: The $m$ parameter is important because it is the one that affects communication
526: costs. Smaller $m$ values imply an increase in communication costs. 
527: On the other hand, for very large $m$ values, performance degrades because
528: the compact GA workers start sampling individuals from outdated probability
529: vectors.
530: 
531: Figure~\ref{fig:experiments} shows the results. In terms of fitness function
532: evaluations per processor, we observe a linear speedup for low $m$ values. 
533: For instance, for $m=8$
534: we observe a straight line on the log-log plot. Using the data directly, we
535: calculated the slope of the line and obtained an approximate value of $-0.3$.
536: In order to take into account the different logarithm bases, we need to multiply
537: it by $\log_{2}10$ (y-axis is $\log_{10}$, x-axis is $\log_{2}$) yielding a 
538: slope of approximately $-1$.
539: This means that the number of function evaluations per processor is
540: inversely proportional to the number of processors.
541: That is, whenever we
542: double the number of processors, the average number of fitness function
543: evaluations per processor is cut in half. 
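
The base conversion can be written out explicitly. Writing $E$ for the average
number of function evaluations per processor (a symbol introduced here only
for this calculation) and using $\log_{10} E = \log_{2} E / \log_{2} 10$:
\[
\frac{d \log_{2} E}{d \log_{2} P}
  = \log_{2} 10 \times \frac{d \log_{10} E}{d \log_{2} P}
  \approx 3.32 \times (-0.3) \approx -1 ,
\]
so that $E \propto P^{-1}$.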
544: 
545: Likewise, in terms of communication costs, as we raise the parameter $m$,
546: the average number of communication steps between manager and worker decreases
547: in the same proportion as expected. For instance, for $m=80$, communication
548: costs are reduced 10 times when compared with $m=8$. 
549: Notice that there is a degradation in terms of speedup for the larger $m$ values.
550: For instance, 
551: for $m=8000$ and $m=80000$ (which is about the same order of the population size),
552: the speedup obtained departs from the idealized case. This can be explained 
553: by the fact that in this case (and 
554: especially with a large number of processors), the average number of communication
555: steps per processor approaches zero. That means that a large fraction of processors
556: were actually doing some work but never communicated their results back to the
557: manager because the problem was solved before they had a chance to do so.
558: 
559: 
560: \section{Extensions}
561: \label{sec:extensions}
562: 
563: This work has a number of extensions worth exploring.
564: Below, we outline some of them:
565: 
566: \begin{itemize}
567: \item Build theory for analyzing the effect of $m$, $N$, and $P$.
568: \item Compare with traditional parallel GA schemes.
569: \item Extend the approach to multivariate PMBGAs.
570: \item Take advantage of the Internet and build something like SETI@home.
571: \end{itemize}
572: 
573: It would be interesting to carry out a mathematical analysis
574: of the proposed parallel compact GA. A number of questions come
575: to mind. For instance, what is the effect of the $m$ parameter?
576: What about the number of workers $P$? Should $m$ be adjusted
577: automatically as a function of $P$ and $N$? Our experiments suggest
578: that there is an ``optimal'' $m$ that depends on the number of 
579: compact GA workers $P$, and most likely depends on the 
580: population size $N$ as well.
581: 
582: Another extension that could be done is to compare
583: the proposed parallel architecture with those 
584: used more often in traditional parallel GAs, both master-slave
585: and multiple-deme GAs. Again, our experiments suggest that the
586: parallel compact GA is likely to outperform regular parallel
587: GAs due to lower communication costs. 
588: 
589: The model structure of the compact GA never changes: every gene is 
590: always treated independently. There are other PMBGAs 
591: that are able to learn a more complex structure dynamically as
592: the search progresses. One could think of using some of the ideas
593: presented here for parallelizing these more complex PMBGAs. 
594: 
595: Finally, it would be interesting to have a parallel compact GA implementation
596: based on the Internet infrastructure, where computers around the world could 
597: contribute some processing power when they are idle. Similar schemes have
598: been used in other projects; one of the best known is
599: the SETI@home project \cite{Korpela:2001}. Our parallel GA architecture
600: is suitable for a similar kind of project because computers can go up or
601: down at any given point in time.
602: 
603: \section{Summary and conclusions}
604: \label{sec:summary_conclusions}
605: 
606: This paper reviewed the compact GA and presented an architecture that 
607: allows its massive parallelization. The motivation for doing so has
608: been discussed and a serial implementation of the parallel architecture
609: was simulated. Computer experiments were done under idealized conditions 
610: and we have verified an almost linear speedup with a growing number
611: of processors.
612: 
613: The paper presented a novel way of parallelizing GAs. This was possible
614: due to the different operational mechanisms of the compact GA when compared
615: with a more traditional GA. By taking advantage of the compact representation 
616: of the population, it becomes possible do distribute its representation to 
617: different computers without the associated cost of sending it individual
618: by individual.
619: 
620: Additional empirical and theoretical research needs to be done to confirm
621: our preliminary results. Nonetheless, the speedups observed in our experiments 
622: suggest that a massive parallelization of the compact GA may constitute an efficient
623: and practical alternative for solving a variety of problems.
624: 
625: 
626: %\subsubsection*{Acknowledgements} 
627: %PUT after obtaining reviews.
628: 
629: %\begin{thebibliography}{}
630: %\end{thebibliography}
631: 
632: \bibliographystyle{splncs}
633: \bibliography{references}
634: 
635: \end{document}
636: 
637: 
638: 
639: 
640: 
641: 
642: 
643: 
644: