physics0506176/workloadAnalysis.tex
1: %% bigloo scm/r0.o scm/stat.o scm/parameters.o scm/error.o -o bin/r0.bin
2: 
3: \documentclass[a4paper,10pt,twoside,fleqn,twocolumn]{article}
4: \usepackage{epsfig}
5: \usepackage{amssymb, amsmath}
6: \usepackage{graphicx}
7: \usepackage{subfigure}
8: \usepackage{url}
9: \usepackage[english]{babel}
10: 
11: %% \mathindent 0.2cm
12: \pagestyle{headings}
13: \pagenumbering{arabic}
14: \numberwithin{equation}{section}
15: 
16: %% New commands to align figures/tables
17: \makeatletter
18: \newcommand\figcaption{\def\@captype{figure}\caption}
19: \newcommand\tabcaption{\def\@captype{table}\caption}
20: \makeatother
21: 
22: %% Un abstract plus beau
23: \renewenvironment{abstract}{\begin{quote}{\noindent\bf Abstract.}}{\end{quote}}
24: 
25: %% Draft version
26: %%  \newcommand{\reviewtimetoday}[2]{\special{!userdict begin
27: %%      /bop-hook{gsave 20 710 translate 45 rotate 0.8 setgray
28: %%        /Times-Roman findfont 16 scalefont setfont 0 0   moveto (#1) show
29: %% 0 -12 moveto (#2) show grestore}def end}}
30: %% % You can turn on or off this option. 
31: %% \reviewtimetoday{\today}{Draft Version}
32: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
33: \renewcommand{\thefootnote}{\fnsymbol{footnote}}
34: 
35: 
36: %% Pour avoir plus de marges en haut et en bas 
37: %% \setlength{\oddsidemargin}{0in}
38: %% \setlength{\evensidemargin}{0in}
39: %% \setlength{\textwidth}{6.5in}
40: %% \setlength{\textheight}{9.0in}
41: %% \setlength{\topmargin}{0in}
42: 
43: %% \selectlanguage{english}
44: \hyphenation{Ra-dio-the-ra-py}
45: \hyphenation{appli-ca-tions}
46: \hyphenation{bio-me-di-cal}
47: \hyphenation{INSTRUIRE}
48: \hyphenation{Healthgrid}
49: \hyphenation{EGEE}
50: \hyphenation{Emmanuel}
51: \hyphenation{Medernach}
52: \hyphenation{Vincent}
53: \hyphenation{Breton}
54: \hyphenation{Yannick}
55: \hyphenation{Pierre-Louis}
56: \hyphenation{Reichstadt}
57: \hyphenation{la-bo-ra-to-ry}
58: \hyphenation{ne-ce-ssa-ry}
59: \hyphenation{Si-mi-lar-ly}
60: \hyphenation{heavily}
61: \hyphenation{parameters}
62: \hyphenation{probability}
63: \hyphenation{classi-cal}
64: 
65: \begin{document}
66: 
67:   \title{
68:     \vspace*{-2cm}
69:        {\normalsize  Workload characterization and modelling
70:          \hfill \today \\[1mm]}
71:        \LARGE\sc\hrule height0.2pt \vskip 2mm
72:        Workload analysis of a cluster \\
73:        in a Grid environment
74:        \vskip 2mm \hrule height0.2pt}
75: 
76:   \author{ Emmanuel  Medernach\\[1ex]
77:     \small Laboratoire de Physique Corpusculaire, CNRS-IN2P3 \\
78:     \small Campus des C\'ezeaux, 
79:     \small 63177 Aubi\`ere Cedex, France \\
80:     \thanks{This work was supported by EGEE.} 
81:     \normalsize \em e-mail: \tt medernac@clermont.in2p3.fr }
82:   \date{}
83:   \maketitle
84: 
85: \begin{abstract}
86:   \em  With Grids, we  are able  to share  computing resources  and to
87:   provide for  scientific communities  a global transparent  access to
88:   local  facilities.  In  such  an environment  the  problems of  fair
89:   resource sharing and best usage  arise.  In this paper, the analysis
90:   of the LPC cluster usage (Clermont-Ferrand, France) in the EGEE Grid
91:   environment is done, and from the results a model for job arrival is
92:   proposed.
93: \end{abstract}
94: 
95: 
96: \section{Introduction}
97: 
98:  Analysis of a cluster workload  is essential to understand better user behavior
99:  and how  resources are  used~\cite{feitelson02workload}.  We are  interested to
100:  model  and simulate  the  usage of  a Grid  cluster  node in  order to  compare
101:  different scheduling policies and to find the best suited one for our needs.
102: 
103:  The Grid gives new ways to share resources between sites, both as computing and
104:  storage  resources.   Grid  defines   a  global  architecture  for  distributed
105:  scheduling  and  resource  management~\cite{darincosts} that  enable  resources
106:  scaling.  We would like to understand better  such a system so that a model can
107:  be defined.  With such a model, simulation may be done and a quality of service
108:  and fairness could then be proposed to the different users and groups.
109: 
110:  Briefly, we  have some  groups of  users that each  submit jobs  to a
111:  group of  clusters. These jobs are  placed inside a  waiting queue on
112:  some clusters before being  scheduled and then processed.  Each group
113:  of  users  have  their  own  need  and  their  own  strategy  to  job
114:  submittal. We wish:
115:  \begin{enumerate}
116:    \item to  have good metrics  that describes the  group and user usage  of the
117:      site.
118:    \item to model the global behavior (average job waiting time, average waiting
119:      queue  length, system  utilization, etc.)   in order  to know  what  is the
120:      influence of each parameter and to avoid site saturation.
121:    \item  to simulate  jobs arrivals  and  characteristics to  test and  compare
122:      different  scheduling   strategies.   The   goal  is  to   maximize  system
123:      utilization  and  to provide  fairness  between  site  users to  avoid  job
124:      starvation.
125:  \end{enumerate}
126: 
127:  As     parallel     scheduling    for     $p$     machines     is    a     hard
128:  problem~\cite{Garey79:intractability,      mertens04},      heuristics      are
129:  used~\cite{school-parallel,  feitelson95d}.  Moreover  we have  no  exact value
130:  about the duration of jobs, making the problem difficult.  We need a good model
131:  to be  able to compare different  scheduling strategies. We  believe that being
132:  able  to  characterize  users  and  groups  behavior  we  could  better  design
133:  scheduling  strategies that promote  fairness and  maintain a  good throughput.
134:  From this  paper some metrics are  revealed, from the job  submittal protocol a
135:  detailed arrival  model for  single user and  group is proposed  and scheduling
136:  problems are discussed.  We then suggest  a new design based on our observation
137:  and show relationship  between fairness issue and system  utilization as a flow
138:  problem.
139: 
140:  Our  cluster usage  in  the  EGEE (Enabling  Grids  for E-science  in
141:  Europe)  Grid  is presented  in  section~\ref{Environment}, the  Grid
142:  middleware  used is  described.  Corresponding  scheduling  scheme is
143:  shown  in section~\ref{Scheduling}.   Then  the workload  of the  LPC
144:  (Laboratoire  de  Physique   Corpusculaire)  computing  resource,  is
145:  presented (section~\ref{Workload analysis and modelling}) and the logs
146:  are   analyzed  statistically.    A   model  is   then  proposed   in
147:  section~\ref{Model}  that  describes the  job  arrival  rate to  this
148:  cluster.      Simulation    and     validation     are    done     in
149:  section~\ref{Simulation}  with  comparison   with  related  works  in
150:  section~\ref{related}.       Results       are      discussed      in
151:  section~\ref{Discussion}.   Section~\ref{Conclusion}  concludes  this
152:  paper.
153:    
154:  \section{Environment}
155:  \label{Environment}
156: 
157:  %% studied
158:  
159:  \subsection{Local situation}
160:  \label{Local situation}
161:  
162:  The EGEE node at  LPC Clermont-Ferrand is a Linux cluster made  of 140 dual 3.0
163:  GHz CPUs with 1 GB of RAM and  managed by 2 servers with the LCG (LHC Computing
164:  Grid  Project)  middleware.   We  are  currently  using  MAUI  as  our  cluster
165:  scheduler~\cite{jackson01, linux-usenix}.  It is  shared with the regional Grid
166:  INSTRUIRE (\url{http://www.instruire.org}).  Our LPC Cluster role in EGEE is to
167:  be used mostly by Biomedical users\footnote{Our cluster represented 75\% of all
168:  the Biomed Virtual  Organization (VO) jobs in 2004.}  located  in Europe and by
169:  High Energy  Physics Communities.  Biomedical research is  one core application
170:  of the EGEE project.  The approach  is to apply the computing methods and tools
171:  developed in  high energy  physics for biomedical  applications.  Our  team has
172:  been involved  in international research group focused  on deploying biomedical
173:  applications in a Grid environment.
174: 
175:  %% Biomedical  applications currently  used in the  Biomed Virtual Organization  (VO) are
176:  %% listed on table~\ref{table:applications}. 
177: 
178:  One   pilot  application  is   GATE  which   is  based   on  the   Monte  Carlo
179:  GEANT4~\cite{geant4} toolkit  developed by  the high energy  physics community.
180:  Radiotherapy and brachytherapy use ionizing radiations to treat cancer.  Before
181:  each treatment,  physicians and physicists plan the  treatment using analytical
182:  treatment planning systems  and medical images data of the  tumor. By using the
183:  Grid environment  provided by the EGEE project,  we will be able  to reduce the
184:  computing time of Monte Carlo simulations in order to provide a reasonable time
185:  consuming tool for specific cancer treatment requiring Monte-Carlo accuracy.
186: 
187:  Another group is  Dteam, this group is partly  responsible of sending
188:  tests and monitoring  jobs to our site.  Total CPU  time used by this
189:  group is  small relatively to  the other one,  but the jobs  sent are
190:  important for the  site monitoring.  There are also  groups using the
191:  cluster from the  LHC experiments at CERN (\url{http://www.cern.ch}).
192:  There are  different kind  of jobs for  a given group.   For example,
193:  Data Analysis  requires a lot  of I/O whereas  Monte-Carlo Simulation
194:  needs few I/O.
195:  
196:  \subsection{EGEE Grid technology}
197:  \label{Grid technology}
198: 
199:  In Grid world, resources are controlled by their owners. For instance
200:  different kind of scheduling policies could be used for each site.  A
201:  Grid resource  center provides to  the Grid computing  and/or storage
202:  resources and also services that allow jobs to be submitted by guests
203:  users,  security  services, monitoring  tools,  storage facility  and
204:  software management.  The main issue  of submitting a job to a remote
205:  site   is  to  provide   some  warranty   of  security   and  correct
206:  execution. In  fact the  middleware automatically resubmits  job when
207:  there is  a problem with  one site.  Security and  authentication are
208:  also provided as Grid services.
209: 
210:  The Grid principle is to allow user a worldwide transparent access to
211:  computing and storage resources.  In the case of EGEE, this access is
212:  aimed to be  transparent by using LCG middleware built  on top of the
213:  Globus Toolkit~\cite{foster97globus}.  Middleware  acts as a layer of
214:  software that provides homogeneous  access to different Grid resource
215:  centers.
216: 
217:  \subsection{LCG Middleware}
218:  \label{LCG Middleware}
219: 
220:  LCG   is  organized   into  Virtual   Organizations   (VOs):  dynamic
221:  collections of  individuals and  institutions sharing resources  in a
222:  flexible,  secure  and   coordinated  manner.   Resource  sharing  is
223:  facilitated and controlled by a  set of services that allow resources
224:  to be  discovered, accessed, allocated, monitored  and accounted for,
225:  regardless of their physical location. Since these services provide a
226:  layer  between physical  resources and  applications, they  are often
227:  referred to as Grid Middleware~\cite{glite-arch}.
228: 
229:  Bag  of  task  applications  are parallel  applications  composed  of
230:  independent  jobs.  No  communications are  required  between running
231:  jobs.  Since  jobs from  a same task  may execute on  different sites
232:  communications  between jobs  are  avoided.  In  this context,  users
233:  submit their jobs to the Grid one by one through the middleware.  Our
234:  cluster receives jobs  only from the Grid.  This  means that each job
235:  requests  for  one and  only  one  processor.   Users could  directly
236:  specify  the execution site  or let  a Grid  service choose  the best
237:  destination  for them.   Users give  only a  rough estimation  of the
238:  maximum  job  running  time.   In  general  this  estimated  time  is
239:  overestimated  and very  imprecise~\cite{zotkin99joblength}.  Instead
240:  of  speaking about an  estimated time,  it could  be better  to speak
241:  about  an upper bound  for job  duration, so  this value  provided by
242:  users is  more a precision value.   The bigger the value  is the more
243:  imprecise the value of the actual runtime could be.
244: 
245:  Figure~\ref{fig:jobflow} shows  the scenario  of a job  submittal. In
246:  this  figure rounded  boxes are  grid services  and ellipses  are the
247:  different jobs  states. As there  is no communications  between jobs,
248:  jobs  could  run  independently  on multiple  clusters.   Instead  of
249:  communicating  between  job execution,  jobs  write  output files  to
250:  Storage Elements (SE) of the  Grid.  Small output files could also be
251:  sent to  the UI.   Replica Location Service  (RLS) is a  Grid service
252:  that allow location of replicated data.  Other jobs may read and work
253:  on the data generated, forming ``pipelines'' of jobs.
254:  
255:  The  users Grid entry  point is  called an  User Interface  (UI).  This  is the
256:  gateway to Grid services.  From this machine, users are given the capability to
257:  submit   jobs   to   a   Computing   Element   and   to   follow   their   jobs
258:  status~\cite{lcguserguide}. A Computing Element  (CE) is composed of Grid batch
259:  queues.  A Computing Element is built  on a homogeneous farm of computing nodes
260:  called Worker Nodes (WN) and on a node called a GateKeeper acting as a security
261:  front-end to the rest of the Grid.
262: 
263:  %% Ok
264:  \begin{figure*}[ht]
265:    \centering
266:    \includegraphics[height=12.3cm]{wms1.eps}
267:    \caption{Job submittal scenario}
268:    \label{fig:jobflow}
269:  \end{figure*}
270:  
271: 
272:  Users can query the Information System in order  to know both the state of different grid nodes and
273:  where their jobs are able to run depending on job requirements.  This match-making process has been
274:  packaged as a Grid service known as the  Resource Broker (RB). Users could either submit their jobs
275:  directly to  different sites or to  a central Resource Broker  which then dispatches  their jobs to
276:  matching sites.
277: 
278:  The services  of the Workload Management  System (WMS) are  responsible for the
279:  acceptance of job submits and the  dispatching of these jobs to the appropriate
280:  CEs, depending  on job requirements  and on available resources.   The Resource
281:  Broker is the machine where the WMS  services run, there is at least one RB for
282:  each  VO.  The  duty  of the  RB  is to  find the  best  resource matching  the
283:  requirements   of   a   job   (match-making  process).    (For   more   details
284:  see~\cite{dataGrid-wms})
285: 
286:  
287:  Users are then mapped to a  local account on the chosen executing CE.
288:  When a CE receives a job,  it enqueues it inside an appropriate batch
289:  queue,  chosen  depending  on  the  job  requirements,  for  instance
290:  depending on the maximum running time.  A scheduler then proceeds all
291:  these queues to  decide the execution of jobs.   Users could question
292:  about status of their jobs during all the job lifetime.
293: 
294:  \section{Scheduling scheme}
295:  \label{Scheduling}
296: 
297:  The goal  of the scheduler is  first to enable execution  of jobs, to
298:  maximize job  throughput and to  maintain a good  equilibrium between
299:  users in their usage of the cluster~\cite{feitelson96c}.  At the same
300:  time scheduler has to avoid starvation, that is jobs, users or groups
301:  that  access  scarcely to  available  cluster  resources compared  to
302:  others.
303: 
304:  Scheduling is done on-line, i.e  the scheduler has no knowledge about
305:  all the job  input requests but jobs are submitted  to the cluster at
306:  arbitrary  time.    No  preemption  is  done,  the   cluster  uses  a
307:  space-sharing mode for jobs.  In a Grid environment long-time running
308:  jobs are common.  The worst case  is when the cluster is full of jobs
309:  running for days  and at the same time receiving  jobs blocked in the
310:  waiting queue.
311: 
312:  Short jobs  like monitoring jobs  barely delay too much  longer jobs.
313:  For example, a  1 day job could wait 15  minutes before starting, but
314:  it is  unwise if a  5 minutes  job has to  wait the same  15 minutes.
315:  This results  in production of algorithms classes  that encourage the
316:  start  of short  jobs  over  longer jobs.   (Short  jobs have  higher
317:  priority~\cite{chiang02})  Some other solution  proposed is  to split
318:  the cluster in static sub-clusters  but this is not compatible with a
319:  sharing  vision like  Grids.  Ideal  on-line scheduler  will maximize
320:  cluster usage  and fairness  between groups and  users.  Of  course a
321:  good trade-off has to be found between the two.
322: 
323:  \subsection{Local situation}
324:  
325:  We are  using two  servers to  manage our 140  CPUs, on  each machine
326:  there are 5  queues where each group could send  their jobs to.  Each
327:  queue has its own limit in maximum  CPU Time.  A job in a given queue
328:  is killed if it exceeds its  queue time limit.  There are in fact two
329:  limits, one  is the maximum  CPU time, the  other one is  the maximum
330:  total time (or  Wall time) a job could use.  For  each queue there is
331:  also a  limit in the  number of  jobs than can  run at a  given time.
332:  This is  done in order  to avoid that  the cluster is full  with long
333:  running jobs and short jobs  cannot run before days.  Likely there is
334:  the same limit in number of running jobs for a given group.
335:  
336:  \begin{table}[h]
337:    \begin{center}
338:    \begin{tabular}{|l|c|c|c|}
339:      \hline
340:      Queue & Max CPU & Max Wall & Max Jobs \\
341:            & (H:M) & (H:M) & \\
342:      \hline
343:      Test        & 00:05 & 00:15 & 130 \\
344:      Short       & 00:20 & 01:30 & 130 \\
345:      Long        & 08:00 & 24:00 & 130 \\
346:      Day         & 24:00 & 36:00 & 130 \\
347:      Infinite    & 48:00 & 72:00 & 130 \\
348:      \hline
349:    \end{tabular}
350:    \end{center}
351:    \caption{Queue  configuration  (maximum  CPU  time, Wall  time  and
352:    running jobs)}
353:  \end{table}
354: 
355: Maui Scheduler  and the  Portable Batch System  (PBS) run  on multiple
356: hardware and  operating systems.  MAUI  is a scheduling  policy engine
357: that is used together with the  PBS batch system.  PBS manages the job
358: reception  in  queues and  execution  on  cluster  nodes.  MAUI  is  a
359: First-Come-First-Served  backfill  scheduler  with  priorities.   This
360: means  that is checks  periodically the  running queues,  execution of
361: lower priority jobs is allowed  if it is determined that their running
362: will not delay jobs  higher in the queue~\cite{linux-usenix}.  Maui is
363: unfortunately not event  driven, it polls regularly the  PBS queues to
364: decide which jobs to run.  MAUI  allows to add a priority property for
365: each  queue.  Our  site configuration  is that  the shorter  the queue
366: allows jobs to run, the more  priority is given to that job.  Jobs are
367: then  selected  to  run depending  on  a  priority  based on  the  job
368: attributes such as owner, group, queue, waiting time, etc.
369: 
370: %% If a job violates a site policy it is placed temporary in a blocked
371: %% state and not considered for scheduling. 
372: 
373:  \section{Workload data analysis}
374:  \label{Workload analysis and modelling}
375: 
376:  %% - Ce qu'on observe -> modele qui approche le comportement
377:  
378: Workload    analysis    allows    to    obtain    a   model    of    the    user
379: behavior~\cite{calzarossa93workload}.    Such   a   model   is   essential   for
380: understanding  how the different  parameters change  the resource  center usage.
381: Meta-computing workload~\cite{chapin99b}  like Grid environments  is composed of
382: different site workloads.   We are interested in modelling  workload of our site
383: which  is part  of the  EGEE computational  Grid.  Our  site receives  only jobs
384: coming from the EGEE Grid and each requests for only one CPU.
385: 
386: Traces of users activities are obtained from accountings on the server logs.  Logs contain
387: information about users,  resources used, jobs arrival time and  jobs completion time.  It
388: is possible to use directly these traces to obtain a static simulation or to use a dynamic
389: model  instead.  Workload  models  are more  flexible  than logs,  because  they allow  to
390: generate   traces    with   different   parameters   and    better   understand   workload
391: properties~\cite{feitelson02workload}.   Workload analysis  allows  to obtain  a model  of
392: users activity.  Such a model is  essential for understanding how the different parameters
393: change the  resource center usage.  Our workload  data has been converted  to the Standard
394: Workload Format (\url{http://www.cs.huji.ac.il/labs/parallel/workload/}) and made publicly
395: available for further researches.
396: 
397: \begin{figure*}[hp]
398:   \begin{center}
399:     \subfigure[Number of Biomed jobs received per weeks (from August 2004 to May 2005)]{\label{fig:stats:jobWeekBiomed}\includegraphics[totalheight=5.2cm,clip]{NumberJobsPerWeekBiomed.eps}} 
400:     \subfigure[Number of Dteam jobs received per weeks (from August 2004 to May 2005)]{\label{fig:stats:jobWeekDteam}\includegraphics[totalheight=5.2cm,clip]{NumberJobsPerWeekDteam.eps}}
401:     \caption{Number of jobs  received per VO and per  week from August
402:     2004 to May 2005}
403:     \label{fig:stats}
404:   \end{center}
405: \end{figure*}
406: \begin{figure*}[hp]
407:   \begin{center}
408:     \subfigure[System utilization per weeks (from August 2004 to May 2005)]{\label{fig:stats:utilization}\includegraphics[totalheight=5.2cm,clip]{NumberAllCPUPerWeek.eps}}  
409:     \subfigure[CPU consumed by Biomed and Dteam jobs per weeks (from August 2004 to May 2005)]{\label{fig:stats:cpuBiomed}\includegraphics[totalheight=5.2cm,clip]{CPU.eps}} \\
410: %%    \subfigure[CPU consumed by Dteam jobs per weeks (from August 2004 to May 2005)]{\label{fig:stats:cpuDteam}\includegraphics[totalheight=4.6cm,clip]{NumberDteamCPUPerWeek.eps}} 
411: %%    \subfigure[CPU consumed by Atlas jobs per weeks (from August 2004 to May 2005)]{\label{fig:stats:cpuAtlas}\includegraphics[totalheight=4.6cm,clip]{NumberAtlasCPUPerWeek.eps}} \\
412: %%    \subfigure[CPU consumed by LHCb jobs per weeks (from August 2004 to May 2005)]{\label{fig:stats:cpuLHCb}\includegraphics[totalheight=4.6cm,clip]{NumberLHCbCPUPerWeek.eps}} \\
413:     \caption{Cluster utilization as CPU consumed per VO and per week from August 2004 to May 2005}
414:     \label{fig:statscpu}
415:   \end{center}
416: \end{figure*}
417: 
418: Workload is from August 1st 2004  to May 15th 2005.  We have a cluster
419: containing 140 CPUs since September  15th.  This can be visible in the
420: figure~\ref{fig:stats},         \ref{fig:stats:utilization}        and
421: \ref{fig:stats:cpuBiomed},  where we  notice that  the number  of jobs
422: sent increases.  Statistics are obtained  from the PBS log files.  PBS
423: log files  are well  structured for data  analysis.  An AWK  script is
424: used to  extract information  from PBS log  files.  AWK acts  on lines
425: matched  by regular  expressions.  We  do not  have  information about
426: users \emph{Login} time because users send jobs to our cluster from an
427: User Interface (UI) of the EGEE Grid and not directly.
428: 
429: \subsection{Running time}
430: 
431: \begin{figure*}[ht]
432:   \begin{center}
433:     \subfigure[Dteam job runtime]{\label{figure:dteam:jobduration}\includegraphics[totalheight=5.2cm,clip]{dteam.duration.eps}} 
434:     \subfigure[Biomed job runtime]{\label{figure:biomed:jobduration}\includegraphics[totalheight=5.2cm,clip]{biomed.duration.eps}} 
435:     \caption{Dteam and Biomed job runtime distributions (logscale on time axis)}
436:     \label{fig:group:runtime}
437:   \end{center}
438: \end{figure*}
439: 
440: \begin{table}[h]
441:   \begin{center}
442:     \begin{tabular}{|l|c|c|c|}
443:       \hline
444:       Group & Mean & Standard & Number \\
445:       &  & Deviation & of jobs \\
446:       \hline
447:       Biomed & 5417 & 22942.2 & 108197 \\
448:       Dteam & 222 & 3673.6 & 94474 \\
449:       LHCb & 2072 & 7783.4 & 9709 \\
450:       Atlas & 13071 & 28788.8 & 7979 \\
451:       Dzero & 213 & 393.9 & 1332 \\
452:       \hline
453:     \end{tabular}
454:   \end{center}
455:   \caption{Group  running time  in seconds  and total  number  of jobs
456:   submitted}
457:   \label{table:group:runtime}
458: \end{table}
459: 
460: During 280 days,  our site received 230474 jobs  from which 94474 Dteam jobs and  108197 Biomed jobs
461: (table~\ref{table:group:runtime}).  For  all these jobs  there are 23208  jobs that failed  and were
462: dequeued.   It appears  that jobs  are submitted  irregularly and  by bursts,  that is  lot  of jobs
463: submitted in  a short period of  time followed by a  period of relative inactivity.   From the logs,
464: there are not  much differences between CPU time and  total time, so it means that  jobs sent to our
465: cluster are really CPU intensive jobs and not I/O intensive.  Dteam jobs are mainly short monitoring
466: jobs but  all Dteam jobs  are not  regularly sent jobs.   We have 6784.6  days CPU time  consumed by
467: Biomed  for 108197  jobs (Mean  of  one hour  and half  per jobs,  table~\ref{table:group:runtime}).
468: Repartition   of   cumulative   job   duration   distributions   for   Biomed   VO   is   shown   on
469: figure~\ref{fig:group:runtime}.  The duration of about 70\%  of Biomed jobs are less than 15 minutes
470: and 50\% under 10 seconds, there are a dominant number of small running jobs but the distribution is
471: very   wide   as   shown   by   the    high   standard   deviation   compared   to   the   mean   in
472: table~\ref{table:group:runtime}.
473: 
474: \begin{table}[h]
475:   \begin{center}
476:     \begin{tabular}{|l|c|c|c|}
477:       \hline
478:       Queue &  Mean & Standard & CV \\
479:       & & Deviation & \\
480:       \hline
481:       Test  & 31.0 & 373.6 & 12.0 \\
482:       Short  & 149.5 & 1230.5 & 8.2 \\
483:       Long  & 2943.2 & 11881.2 & 4.0 \\
484:       Day  & 6634.8 & 25489.2 & 3.8 \\
485:       Infinite  & 10062.2 & 30824.5 & 3.0 \\
486:       \hline
487:     \end{tabular}
488:   \end{center}
489:   \caption{Queue  mean  running  time  in  seconds,  corresponding
490:   Standard Deviation and Coefficient of Variation}
491:   \label{table:queue:runtime}
492: \end{table}
493: 
494: Users  submit   their  jobs  with   an  estimated  run   length.   For
495: relationships between  execution time  and requested job  duration and
496: its  accuracy see~\cite{cirnecompr-model}.  To  sum up  estimated jobs
497: duration are essentially inaccurate. It  is in fact an upper bound for
498: job duration which could in reality take any value below it.
499: Table~\ref{table:queue:runtime} shows for  each queue the mean running
500: time, its  standard deviation and coefficient of  variation (CV) which
501: is the ratio between standard deviation and the mean.  CV decreases as
502: the queue maximum  runtime increase.  This means that  jobs in shorter
503: queues vary a lot in their duration compared to longer jobs and we can
504: expect that more the upper bound  given is high the more confidence in
505: using the queue mean runtime as a an estimation we could have.
506: 
507: A commonly used  method for modelling duration distribution  is to use
508: log-uniform    distribution.    Figures~\ref{figure:dteam:jobduration}
509: and~\ref{figure:biomed:jobduration}  show the  fraction  of Dteam  and
510: Biomed jobs with duration less than some value.  Job duration has been
511: modelled      with      a      multi-stage      log-uniform      model
512: in~\cite{downey99elusive} which is piecewise  linear in log space.  In
513: this  case  Dteam  and  Biomed  job  duration  could  be  approximated
514: respectively with a 3 and a 6 stages log-uniform distribution.
515: 
516: \subsection{Waiting time}
517: 
518: \begin{table}[h]
519:   \begin{center}
520:     \begin{tabular}{|l|c|c|c|c|}
521:       \hline
522:       Group & Mean & Stretch & Standard & CV \\
523:       &  & & Deviation &  \\
524:       \hline
525:       Biomed & 781.5   & 0.874 & 16398.8 & 20.9  \\
526:       Dteam  & 1424.1  & 0.135 & 26104.5 & 18.3  \\
527:       LHCb   & 217.7   & 0.905 & 2000.7 & 9.1  \\
528:       Atlas  & 2332.8  & 0.848 & 13052.1 & 5.5  \\
529:       Dzero  & 90.7    & 0.701 & 546.3 & 6.0   \\
530:       \hline
531:     \end{tabular}
532:   \end{center}
533:   \caption{Group mean waiting  time in seconds, corresponding Standard
534:     Deviation and Coefficient of Variation}
535:   \label{table:waiting}
536: \end{table}
537:  
538: \begin{table}[h]
539:   \begin{center}
540:     \begin{tabular}{|l|c|c|c|c|}
541:       \hline
542:       Queue & Mean & Standard & CV & Number \\
543:       &  & Deviation &  & of jobs \\
544:       \hline
545:       Test & 33335.9 & 148326.4 & 4.4 & 45760 \\
546:       Short & 1249.7 & 27621.8 & 22.1 & 81963 \\
547:       Long & 535.1 & 5338.8 & 9.9 & 32879 \\
548:       Day & 466.8 & 8170.7 & 17.5 & 19275 \\
549:       Infinite & 1753.9 & 24439.8 & 13.9 & 49060 \\
550:       \hline
551:     \end{tabular}
552:   \end{center}
553:   \caption{Queue mean waiting  time in seconds, corresponding Standard
554:     Deviation, Coefficient of Variation and number of jobs}
555:   \label{fig:queuemean}
556: \end{table}
557: 
558: Table~\ref{table:waiting} shows that jobs  coming from the Dteam group
559: are  the more  unfairly treaten.   Dteam group  sends short  jobs very
560: often, Dteam jobs are then all  placed in queue waiting that long jobs
561: from other groups finished.  Dzero  group sends short jobs more rarely
562: and is  also less  penalized than Dteam  because there are  less Dzero
563: jobs that  are waiting  together in queue  before being  treated.  The
564: best treated group is LHCb with not very long running jobs (average of
565: about  34 minutes)  and  one job  about  every 41  minutes.  The  best
566: behavior to reduce  waiting time per jobs seems to  send jobs that are
567: not too  short compared to the  waiting factor, and send  not too very
568: often in  order to avoid that  they all wait together  inside a queue.
569: Very long jobs is not a  good behavior too as the scheduler delay them
570: to run shorter one if possible.
571: 
572: Table~\ref{fig:queuemean} shows  the mean waiting  time per jobs  on a
573: given  queue.  There is  a problem  with such  a metric,  for example:
574: Consider one job arriving on a cluster with only one free CPU, it will
575: run on it  during a time $T$ with no waiting  time.  Consider now that
576: this job  is splitted  in $N$ shorter  jobs (numbered $0  \ldots N-1$)
577: with  equal total duration  $T$.  Then  the waiting  time for  the job
578: number $i$ will be $iT/N$,  and the total waiting time $(N-1)T/2$.  So
579: the more  a job is splitted the  more it will wait  in total.  Another
580: metric that does not depend on the number of jobs is the total waiting
581: time divided by the number of jobs and by the total job duration.  Let
582: note $\widehat{WT}$ this normalized waiting time, We obtain:
583: \begin{align}
584: \widehat{WT} & = \frac{TotalWaitingTime}{NJobs * TotalDuration} \nonumber\\
585: \widehat{WT} & = \frac{MeanWaitingTime}{NJobs * MeanDuration} 
586: \end{align}
587: 
588:  \begin{table}[h]
589:    \begin{center}
590:    \begin{tabular}{|l|c||l|c|}
591:      \hline
592:      Queue & $\widehat{WT}$ & Group & $\widehat{WT}$ \\
593:      \hline
594:      Test & 2.35e-2 & Biomed & 1.58e-5 \\
595:      Short & 1.02e-4 & Dteam & 6.79e-5 \\
596:      Long & 5.53e-6 & LHCb & 1.08e-5 \\
597:      Day & 3.65e-6 & Atlas & 2.23e-5 \\
598:      Infinite & 3.55e-6 & Dzero & 31.9e-5 \\
599:      \hline
600:    \end{tabular}
601:    \end{center}
602:    \caption{Queue and Group normalized waiting time}
603:  \end{table}
604:  With this  metric, the Test  queue is  still the most  unfairly treated and  the Infinite
605:  queue has  the more  benefits compared  to the other  queues.  Dteam  group is  again bad
606:  treated because their jobs are mainly sent  to the Test queue.  The more unfairly treated
607:  group is Dzero.
608: 
609: %% Todo
610: %% In fact, sending short jobs  rarely is the worst behavior for general
611: %% waiting time as it is unlikely  that two jobs will wait together in a
612: %% queue.
613:  
614:  \subsection{Arrival time}
615: 
616:  \begin{figure}[h]
617:    \centering \includegraphics[totalheight=5.2cm]{arrivalperday.eps}
618:    \caption{Job arrival daily cycle}
619:    \label{fig:arrivalhours}
620:  \end{figure}
621:  
622:  \begin{table}[h]
623:    \begin{center}
624:      \begin{tabular}{|l|c|c|c|}
625:        \hline
626:        Group & Mean & Standard & CV\\
627:        & (seconds) & Deviation & \\
628:        \hline
629:        Biomed & 223.6 & 5194.5 & 23.22 \\
630:        Dteam & 256.2 & 2385.4 & 9.31 \\
631:        LHCb & 2474.6 & 39460.5 & 15.94 \\
632:        Atlas & 2824.1 & 60789.4 & 21.52 \\
633:        Dzero & 5018.7 & 50996.6 & 10.16 \\
634:        \hline
635:      \end{tabular}
636:    \end{center}
637:    \caption{Group  interarrival  time  in  seconds,  corresponding
638:      Standard Deviation and Coefficient of Variation}
639:    \label{table:interarrival}
640:  \end{table}
641: 
642: Job arrival daily cycle is presented in figure~\ref{fig:arrivalhours}.
643: This  figure shows  the number  of  arrival depending  on job  arrival
644: hours, with a 10 minutes sampling.  Clearly users prefer to send their
645: jobs at o'clock.  In fact  we receive regular monitoring jobs from the
646: VO  Dteam.   The  monitoring   jobs  are  submitted  every  hour  from
647: \url{goc.grid-support.ac.uk}.  Users are located in all Europe, so the
648: effect of sending at working hours is summed over all users timezones.
649: However the  shape is  similar compared to  other daily  cycle, during
650: night (before  8am) less jobs are  submitted and there  is an activity
651: peak around midday, 2pm and 4pm.
652: 
653: Table~\ref{table:interarrival} shows the  moments of interarrival time
654: for each group.  CV is much  higher than $1$, this means that arrivals
655: are not  Poisson processes and are very  irregularly distributed.  For
656: instance  we could  receive  $10$  jobs in  $10$  minutes followed  by
657: nothing during  the $50$ next  minutes.  In this  case we have  a mean
658: interarrival time  of $6$ minutes but  in fact when  jobs arrived they
659: arrived every minutes.
660: 
661: Figure~\ref{fig:stats:utilization} shows the system utilization of our
662: cluster  during  each week.   There  are a  maximum  of  980 CPU  days
663: consumed  each week for  140 CPUs.  We have  a highly  varying cluster
664: activity.
665: 
666: \subsection{Frequency analysis}
667: 
668: \begin{figure*}[ht]
669:   \begin{center}
670:     \subfigure[Biomed job arrival rate each 5 minutes]{\label{table:stats:rateBiomed}\includegraphics[totalheight=5.2cm,clip]{freqBiomedlog.eps}} 
671:     \subfigure[Dteam job arrival rate each 5 minutes]{\label{table:stats:rateDteam}\includegraphics[totalheight=5.2cm,clip]{freqDteamlog.eps}}   
672:     \caption{Arrival frequencies for  Biomed and Dteam VOs (Proportion
673:       of  occurrences of  $n$ jobs  received during  an interval  of 5
674:       minutes)}
675:   \label{fig:stats:more}
676:   \end{center}
677: \end{figure*}
678: 
679: Job  arrival  rate  is  a  common   measurement  for  a  site  usage  in  queuing  theory.
680: Figures~\ref{table:stats:rateBiomed}   and~\ref{table:stats:rateDteam}  present   the  job
681: arrival  rate distribution.   It  is the  number of  time  $n$ jobs  are submitted  during
682: interval  length of 5  minutes.  They  show that  most of  the time  the cluster  does not
683: receive  jobs but jobs  arrived grouped.   Users actually  submit groups  of jobs  and not
684: stand-alone jobs.  It explains the shape of  the arrival rate: it fastly decreases but too
685: slowly  compared to  a Poisson  distribution.  Poisson  distribution is  usually  used for
686: modelling the arrival process but evidences are against that fact~\cite{feitelson95e}.
687: 
688:  Dteam monitoring jobs are short and regular jobs, there is no need of a special
689:  arrival model for  such jobs.  What we  observe for other kind of  jobs is that
690:  the job  arrival law is  not a Poisson Law  (see table~\ref{table:interarrival}
691:  where $CV \gg 1$) as for instance a web site traffic~\cite{paxson95wide}.  What
692:  really happens  is that  users come  using the cluster  from an  User Interface
693:  during some  time interval.  During  this time they  send jobs to  the cluster.
694:  Users log to an User Interface machine in order to send their jobs to a RB that
695:  dispatch them  to some CEs.  Note  that one can  send jobs to our  cluster only
696:  from an  User Interface, it means for  instance that jobs running  on a cluster
697:  cannot send secondary jobs.  On a computing site we do not have this user login
698:  information, but only job arrival.
699:  
700:  First  we look  at modelling  user arrival  and  submission behavior.
701:  Secondly we  show that  the model proposed  shows good results  for a
702:  group behavior.
703: 
704:  \section{Model}
705:  \label{Model}
706:  
707:  \subsection{Login model}
708:  \label{Login model}
709: 
710:  In this section we begin to model user \emph{Login}/\emph{Logout} behavior from
711:  the Grid  job flow  (figure~\ref{fig:jobflow}).  We neglect  the case  where an
712:  user has multiple login on different UI  at the same time.  We mean that a user
713:  is in the state  \emph{Login} if he is in the state of  sending jobs from an UI
714:  to our cluster, else he is in the state \emph{Logout}.
715: 
716: \begin{figure}[h]
717:   \centering
718:   \includegraphics[height=5.2cm]{modele3.eps}
719:   \caption{\emph{Login/Logout} cycle}
720:   \label{fig:login}
721: \end{figure}
722: 
723: Markov  chains  are  like  automatons  with  for each  state  a  probability  of
724: transition.  One property of Markov chains  is that future states depend only on
725: the current state and not on the  past history.  So a Markov state must contains
726: all  the  information  needed  for  future  states.  We  decided  to  model  the
727: \emph{Login/Logout} behavior as a continuous  Markov chain.  During each $dt$, a
728: \emph{Logout} user has a probability during  $dt$ of $\lambda dt$ to login and a
729: \emph{Login} user  has a probability during  $dt$ of $\delta dt$  to logout (see
730: figure~\ref{fig:login}).  $\lambda$ is called the \emph{Login} rate and $\delta$
731: is called the \emph{Logout} rate.
732: 
733: All these parameters could vary over time as we see with the variation
734: of the week job arrival (figure~\ref{fig:stats:utilization}) or during
735: day time  (figure~\ref{fig:arrivalhours}) The model  proposed could be
736: used more  accurately with non-constant  parameters at the  expense of
737: more calculation  and more difficult fitting.  For  example, one could
738: numerically use  Fourier series for  the \emph{Login} rate or  for the
739: submittal  rate  to model  this  daily  cycle.   We use  now  constant
740: parameters for calculation, looking for general properties.
741: 
742: We would like  to have the probabilities during time  that the user is
743: logged    or   not    logged.    Let    $\mathcal{P}_{Login}(t)$   and
744: $\mathcal{P}_{Logout}(t)$ be respectively probability that the user is
745: logged or not logged at time t. We have from the modelling:
746: \begin{eqnarray}
747:   \lefteqn{\mathcal{P}_{Logout}(t+dt) = (1 - \lambda dt) \mathcal{P}_{Logout}(t)}\nonumber  \\
748:   & & \mbox{} + \delta dt \mathcal{P}_{Login}(t) \\
749:   \lefteqn{\mathcal{P}_{Login}(t+dt) = (1 - \delta dt) \mathcal{P}_{Login}(t)}\nonumber \\
750:   & & \mbox{} + \lambda dt \mathcal{P}_{Logout}(t) 
751: \end{eqnarray}
752: At equilibrium we have no variation so
753: \begin{eqnarray}
754:   \mathcal{P}_{Logout}(t+dt) = \mathcal{P}_{Logout}(t) = \mathcal{P}_{Logout} \\
755:   \mathcal{P}_{Login}(t+dt) = \mathcal{P}_{Login}(t) = \mathcal{P}_{Login}
756: \end{eqnarray}
757: We obtain: 
758: \begin{align}
759:   \mathcal{P}_{Logout} &= \frac{\delta}{\lambda + \delta} \\
760:   \mathcal{P}_{Login} &= \frac{\lambda}{\lambda + \delta}
761: \end{align}
762: 
763: \subsection{Job submittal model}
764: \label{Job submittal model}
765: 
766: During period  when users are logged they  could submit jobs.  We  model the job
767: submittal rate for one  user as follows: During $dt$ when the  user is logged he
768: has  a probability  of $\mu  dt$ to  submit a  job.  With  $\delta=0$ we  have a
769: delayed Poisson process, with $\mu=0$ no  jobs are submitted.  The full model is
770: shown  at  figure~\ref{fig:markov}, it  shows  all  the  possible outcomes  with
771: corresponding probabilities from  one of the possible state to  the next after a
772: small period $dt$.  Numbers inside circles are the number of jobs submitted from
773: the start.   \emph{Login} states are below  and \emph{Logout} states  are at the
774: top.  We have:
775: 
776: 
777: \begin{figure*}[ht]
778:   \centering
779:   \includegraphics[height=8.6cm]{modele5.eps}
780:   \caption{Markov modelling of jobs submittal}
781:   \label{fig:markov}
782: \end{figure*}
783: 
784: \begin{itemize}
785: \item  $\mathcal{P}_n(t)$  is  the  probability  to be  in  the  state
786:   ``\emph{User is not logged at time  t and n jobs have been submitted
787:   between time 0 and t}.''
788: \item  $\mathcal{Q}_n(t)$  is  the  probability  to be  in  the  state
789:   ``\emph{User  is logged at  time t  and n  jobs have  been submitted
790:   between time 0 and t}.''
791: \item  $\mathcal{R}_n(t)$  is  the  probability  to be  in  the  state
792:   ``\emph{n jobs have been submitted  between time 0 and t}.'' We have
793:   $\mathcal{R}_n = \mathcal{P}_n + \mathcal{Q}_n$.
794: \end{itemize}
795: 
796: From  the  model,  we obtain  with  the  same  method as  before  this
797: recursive differential equation:
798: \begin{eqnarray}
799:   \mathcal{M} & = &
800:     \begin{pmatrix}
801:       - \lambda & \delta \\
802:       \lambda & -(\mu + \delta)
803:     \end{pmatrix} \\  
804:   \begin{pmatrix}
805:     \mathcal{P}_0 \\
806:     \mathcal{Q}_0
807:   \end{pmatrix} ' & = &
808:   \mathcal{M}
809:   \begin{pmatrix}
810:     \mathcal{P}_0 \\
811:     \mathcal{Q}_0
812:   \end{pmatrix} \\
813:   \begin{pmatrix}
814:     \mathcal{P}_n \\
815:     \mathcal{Q}_n
816:   \end{pmatrix} ' & = &
817:   \mathcal{M}
818:   \begin{pmatrix}
819:     \mathcal{P}_n \\
820:     \mathcal{Q}_n
821:   \end{pmatrix} +
822:   \begin{pmatrix}
823:     0 \\
824:     \mu  \mathcal{Q}_{n-1}
825:   \end{pmatrix}
826: \label{recursive}
827: \end{eqnarray}
828: 
829: This  results  to  the  following  recursive  equation  (in  case  the
830: parameters are constants, $\mathcal{M}$ is a constant)
831: \begin{eqnarray}
832:   \begin{pmatrix}
833:     \mathcal{P}_n \\
834:     \mathcal{Q}_n
835:   \end{pmatrix} & = &
836:   e^{\mathcal{M}t} 
837:   \int{
838:     e^{-\mathcal{M}x} 
839:     \begin{pmatrix}
840:       0 \\
841:       \mu  \mathcal{Q}_{n-1}
842:     \end{pmatrix}
843:     dx}
844: \end{eqnarray}
845:     
846: We take a look at the probability of having no job arrival during an interval of
847: time $t$  which is $\mathcal{P}_0$ and $\mathcal{Q}_0$.   $\mathcal{R}_0$ is the
848: the probability that no jobs have been submitted between arbitrary time 0 and t.
849: So from the above model, we have:
850: \begin{equation}\label{eqmatrix2}
851:   \begin{pmatrix}
852:     \mathcal{P}_0 \\
853:     \mathcal{Q}_0
854:   \end{pmatrix} ' =
855:   \begin{pmatrix}
856:     - \lambda & \delta \\
857:     \lambda & -(\mu + \delta)
858:   \end{pmatrix} 
859:   \begin{pmatrix}
860:     \mathcal{P}_0 \\
861:     \mathcal{Q}_0
862:   \end{pmatrix} 
863: \end{equation}
864: 
865: At  arbitrary  time  we  could  be  in  the  state  \emph{Login}  with
866: probability  $\lambda   /  (\lambda  +  \delta)$  and   in  the  state
867: \emph{Logout} with the probability  $\delta / (\lambda + \delta)$.  We
868: have from the results above: 
869: \begin{align}
870:   \begin{pmatrix}
871:     \mathcal{P}_0(0) \\
872:     \mathcal{Q}_0(0)
873:   \end{pmatrix} &=
874:   \begin{pmatrix}
875:     \mathcal{P}_{Logout} \\
876:     \mathcal{P}_{Login}
877:   \end{pmatrix} =
878:   \frac{1}{\lambda + \delta}
879:   \begin{pmatrix}
880:     \delta \\
881:     \lambda
882:   \end{pmatrix}
883:   \\
884:   \mathcal{R}_0 &= \mathcal{P}_0 + \mathcal{Q}_0
885: \end{align}
886: 
887: Finally we obtain the result. 
888: \begin{equation}
889: \begin{split}
890:   \mathcal{R}_0(t) &= m_0 \frac{e^{- m_1t} - e^{- m_2t}}{m_1
891:     - m_2} + \frac{m_1e^{- m_2t} - m_2e^{- m_1t}}{m_1 - m_2} 
892: \end{split}
893: \end{equation}
894: 
895: Where \begin{align}
896:   m_0 &= \frac{\lambda \mu}{\lambda + \delta} = \mu \mathcal{P}_{Login} \\
897:   m_1 + m_2 &= \lambda + \delta + \mu \\
898:   m_1 m_2 &= \lambda \mu \\
899: \end{align}
900: 
901: With $\lambda=0$ or $\mu=0$, we  obtain that no jobs are submitted $(\mathcal{R}_0(t)=1)$.
902: With $\delta=0$, this  is a Poisson process and  $\mathcal{R}_0(t)=e^{-\mu t}$.  Note that
903: during a period of $t$ there are in average $\mu \mathcal{P}_{Login} t$ jobs submitted, we
904: have also for small period $t$,
905: \begin{equation}
906:   \mathcal{R}_0(t) \approx 1 - \mu\mathcal{P}_{Login}t
907: \end{equation} 
908: We have also
909: \begin{align}
910: \mathcal{R}'_0(0) & = - \mu\mathcal{P}_{Login} \\
911: \mathcal{R}'_0(0) & = - \frac{\emph{Number of jobs submitted}}{\emph{Total duration}}
912: \end{align}
913: 
914: $\mathcal{R}_0(t)$  could be  estimated by  splitting the  arrival  processes in
915: intervals  of  duration  $t$ and  estimating  the  ratio  of intervals  with  no
916: arrival. The error of this estimation  is linear with $t$. Another issue is that
917: the logs precision is not below one second. 
918: 
919: \subsection{Model characteristics}
920: 
921: We have also these interesting properties:
922: \begin{eqnarray}
923:   \mathcal{R}'_0(0) & = & - \mu \mathcal{P}_{Login} \\
924:   \frac{\mathcal{R}''_0(0)}{\mathcal{R}'_0(0)} & = & - \mu \\
925:   \frac{\mathcal{R}'_0(0)^{2}}{\mathcal{R}''_0(0)} & = & \mathcal{P}_{Login} \\
926:   \frac{\mathcal{R}'''_0(0)}{\mathcal{R}''_0(0)} & = & - (\mu + \delta)
927: \end{eqnarray}
928: 
929: Probability  distribution  of  the  duration   between  two  jobs  arrival  is  called  an
930: interarrival process.  Interarrival process is a common metric in queuing theory.  We have
931: $\mathcal{A}(t)=\mathcal{P}_0(t)+\mathcal{Q}_0(t)$  with the  initial condition  that user
932: just submits a job.  This implies that user is logged.
933: \begin{equation}
934:   \mathcal{P}_0(0) = 0.0, \quad \mathcal{Q}_0(0) = 1.0 \nonumber
935: \end{equation}
936: \begin{equation}
937: \begin{split}
938: \mathcal{A}(t)  &= \mu \frac{e^{- m_1t} -  e^{- m_2t}}{m_1 -
939:     m_2} + \frac{m_1e^{- m_2t} - m_2e^{- m_1t}}{m_1 - m_2}
940: \end{split}
941: \end{equation}
942: \begin{equation}
943:   p = \frac{\mu - m_2}{m_1 - m_2} 
944: \end{equation}
945: \begin{equation}
946:   \mathcal{A}(t) = p e^{- m_1t} + (1 - p) e^{- m_2t} 
947: \end{equation}
948: We have $\mu \in[m1; m2]$ because
949: \begin{align}
950:   (\mu - m_1)(\mu - m_2) & = \mu^{2} - (\lambda + \delta + \mu)\mu + \delta \mu \nonumber\\
951:   (\mu - m_1)(\mu - m_2) & = -\delta \mu < 0
952: \end{align}
953: So $p \in  [0; 1]$, and we have  an hyper-exponential interarrival law
954: of  order 2 with  parameters $p=(\mu  - m_2)/(m_1  - m_2),  m_1, m_2$.
955: This   result   is    coherent   with   other   experimental   fitting
956: results~\cite{david-workload}  Moreover any  hyper-exponential  law of
957: order  2  may   be  modelled  with  the  Markov   chain  described  in
958: figure~\ref{fig:markov}  with parameters  $\mu =  p m_1  +  (1-p) m_2,
959: \lambda = m_1 m_2 / \mu, \delta = m_1 + m_2 - \mu - \lambda$
960: 
961: Let calculate the  mean interarrival time.  Probability to  have an interarrival
962: time   between   $\theta$  and   $\theta+d\theta$   is  $\mathcal{A}(\theta)   -
963: \mathcal{A}(\theta+d\theta) = -\mathcal{A}'(\theta)d\theta$.  The mean is
964: \begin{eqnarray}
965: \tilde{\mathcal{A}} &=& \int_{0}^{\infty}-\theta \mathcal{A}'(\theta)d\theta = \int_{0}^{\infty}\mathcal{A}(\theta)d\theta \\
966: \tilde{\mathcal{A}} &=& \frac{1}{\mu \mathcal{P}_{Login}} = \frac{\lambda+\delta}{\lambda\mu} 
967: \end{eqnarray}
968: 
969: Let compute the variance of interarrival distribution. 
970: \begin{eqnarray}
971:   \mathit{var} &=& \int_{0}^{\infty}-(\theta - \tilde{\mathcal{A}})^{2}\mathcal{A}'(\theta)d\theta \\
972:   \mathit{var} &=& 2 \int_{0}^{\infty}\theta\mathcal{A}(\theta)d\theta - \tilde{\mathcal{A}}^{2} \\
973:   \frac{\mathit{var}}{\tilde{\mathcal{A}}^{2}} &=& {CV}^{2} = 1 + 2\frac{\delta\mu}{(\lambda+\delta)^{2}} \\
974:   {CV}^{2} &=& 1 + 2~\mathcal{P}_{Logout}^{2} \frac{\mu}{\delta} 
975: \end{eqnarray}
976: 
977: Another  interesting property  is the  number of  jobs submitted  by this  model  during a
978: \emph{Login}  period.   Let  $P_n$  be  the  probability to  receive  $n$  jobs  during  a
979: \emph{Login} period. We have:
980: \begin{eqnarray}
981:   P_n & = & \int_{0}^{\infty} \frac{(\mu t)^{n}}{n!}e^{-\mu t} \delta e^{-\delta t} dt \\
982:   P_n & = & \frac{\delta}{\mu + \delta}(\frac{\mu}{\mu + \delta})^{n}
983: \end{eqnarray}
984: This  is  a  geometric law.  The  mean  number  of jobs  submitted  by
985: \emph{Login} period is $\mu/\delta$.
986: 
987: \subsection{Group model}
988: 
989: Groups are composed of users, either  regular users sending jobs at regular time
990: or users with a \emph{Login/Logout} like behavior.  Metrics defined below as the
991: mean number  of jobs  sent by  \emph{Login} state, the  mean submittal  rate and
992: probability of \emph{Login} could represent an user behavior.
993: 
994: \begin{figure}[h]
995:   \centering
996:   \includegraphics[height=5.2cm]{job-rate.eps}
997:   \caption{Users job submittal rates during their period of activity}
998:   \label{fig:jobrate}
999: \end{figure}
1000: 
1001: Figure~\ref{fig:jobrate} shows  the sorted distribution of  users submittal rate
1002: ($\mu  \mathcal{P}_{Login}$).  Except  for  the  highest values  it  is quite  a
1003: straight line in logspace.  This observation could be included in a group model.
1004: 
1005: \section{Simulation and validation}
1006: \label{Simulation}
1007: 
1008:  %% - Evaluation des parametres: r0(0) connu avec precision = µPlogin = nbjobs/tempstotal
1009:  %% - µPlogin classé par utilisateur, variance: conjecture d'une loi de groupe
1010:  %% - Graphe des R0 de plusieurs utilisateurs
1011:  %% - Norme utilisée, Fitting des fréquences
1012: 
1013: \begin{figure*}[hp]
1014:   \begin{center}
1015:    \subfigure[Biomed user 1]{\label{figure:simu:biomed001.clrce02}\includegraphics[totalheight=5.4cm,clip]{biomed001.clrce02.eps}}
1016:    \subfigure[Biomed user 2]{\label{figure:simu:biomed001.clrce01}\includegraphics[totalheight=5.4cm,clip]{biomed001.clrce01.eps}} \\
1017:    \subfigure[Biomed user 3]{\label{figure:simu:biomed002.clrce01}\includegraphics[totalheight=5.4cm,clip]{biomed002.clrce01.eps}} 
1018:    \subfigure[Biomed user 4]{\label{figure:simu:biomed005.clrce01}\includegraphics[totalheight=5.4cm,clip]{biomed005.2.eps}}    \\
1019:     \begin{tabular}{|l | c | c | c | c |}
1020:       \hline
1021:       Name & $\mu$ & $\delta$ & $\lambda$ & Error \\ 
1022:       \hline
1023:       Biomed user 1 & 0.0837 & 0.02079 & 2.1e-4 & 4.929e-3 \\
1024:       Biomed user 2 & 0.0620 & 0.01188 & 1.2e-4 & 3.534e-3 \\
1025:       Biomed user 3 & 0.0832 & 0.02475 & 2.5e-4 & 1.1078e-2 \\
1026:       Biomed user 4 & 0.0365 & 1.4285e-3 & 1.075e-4 & 8.78e-2\\
1027:       \hline
1028:     \end{tabular}
1029:   \end{center}
1030:   \caption{Biomed simulation results}
1031:   \label{table:simu}
1032: \end{figure*}
1033: 
1034: \begin{figure*}[hp]
1035:   \begin{center}
1036:     \subfigure[LPC cluster Biomed user]{\label{fig:r0Biomed}\includegraphics[totalheight=5.4cm,clip]{AngelR0.eps}}
1037:     \subfigure[NASA Ames most active user]{\label{fig:nasa:nasa}\includegraphics[totalheight=5.4cm,clip]{Nasa-3.eps}} \\
1038:     \subfigure[DAS2 fs0 cluster most active user]{\label{fig:nasa:das2}\includegraphics[totalheight=5.4cm,clip]{DAS2-75.eps}}
1039:     \subfigure[SDSC Blue Horizon most active user]{\label{fig:nasa:sdsc}\includegraphics[totalheight=5.4cm,clip]{SDSC-35.eps}}
1040:     \caption{Hyper-Exponential fitting of $\mathcal{R}_0$ for a Biomed
1041:       LPC user  and for the most  active users at NASA  Ames, DAS2 and
1042:       SDSC Blue Horizon clusters.}
1043:     \label{fig:nasa}
1044:   \end{center}
1045: \end{figure*}
1046: 
1047: We have  done a simulation  in Scheme~\cite{kelsey98revised} directly  using the
1048: Markov model.  We began by fitting  users behavior from the logs with our model.
1049: Like  the frequency  obtained  from the  logs,  the model  shows  a majority  of
1050: intervals with no job arrival, possibly followed by a relatively flat area and a
1051: fast     decreases.     Some     fitting     results    are     presented     in
1052: figures~\ref{table:simu}. Norm used  to fit real data is  the maximum difference
1053: between the two cumulative distributions.  We fitted the frequency data for each
1054: user.
1055: 
1056: During a  period of $t$  there are in  average $\mu \mathcal{P}_{Login}  t$ jobs
1057: submitted.   We evaluate  the value  of $\mu  \mathcal{P}_{Login}$ which  is the
1058: average number of jobs submitted by seconds.  We use that value when doing a set
1059: of simulation  in order to fit  a known real user  probability distribution.  We
1060: have two free  parameters, so we vary $\mathcal{P}_{Login}$  between 0.0 and 1.0
1061: and   $lambda$   which  the   inverse   is  the   average   time   an  user   is
1062: \emph{Logout}. Some results obtained are shown in figures~\ref{table:simu}.
1063: 
1064: $\mu$ parameter decides of the frequency length of the curve.  Without
1065: the  \emph{Login} behavior we  would have  obtained a  classic Poisson
1066: curve of $\mu$ parameter. $1/\mu$ is the mean interarrival time during
1067: \emph{Login} period.  An  idea to evaluate $\mu$ would  be to evaluate
1068: the job  arrival rate  during \emph{Login} periods,  but we  lack that
1069: \emph{Login} information.
1070: 
1071: $\delta$ and $\lambda$ are the  \emph{Logout} and \emph{Login} parameters.  What is really
1072: important is the ratio $\lambda/(\lambda  + \delta)$ which is $\mathcal{P}_{Login}$.  This
1073: is the  ratio between time  user is active  on the cluster  and total time.   $\delta$ and
1074: $\mathcal{P}_{Login}$  are measures  of the  deviation from  a classic  Poisson  law.  For
1075: instance, the mean number of jobs submitted by \emph{Login} period is $\mu/\delta$ and the
1076: mean job submittal rate is  $\mu\mathcal{P}_{Login}$.  For a same $\mathcal{P}_{Login}$ we
1077: could have  very different scenarios.   A user  could be active  for long time  but rarely
1078: logged and another user could be  active for short period with frequent login.  $1/\delta$
1079: is the mean \emph{Login} time, $1/\lambda$ is the mean \emph{Logout} time.
1080: 
1081: The  $\mathcal{R}_0$  probability  is essential  for  studying  job  arrival time.   $1  -
1082: \mathcal{R}_0(t)$ is  the probability that  between time $0$  and $t$ we have  received at
1083: least one job.  It is easier to  fit the $\mathcal{R}_0$ distribution for an user than the
1084: interarrival distribution because we  have more points.  Figure~\ref{fig:r0Biomed} shows a
1085: typical  graph of  $\mathcal{R}_0$ for  a Biomed  user.  It  shows for  instance  that for
1086: intervals of 10000 seconds, this Biomed user has a probability of about 0.2 to submits one
1087: or more  jobs.  We have  fitted this probability  with hyper-exponential curve, that  is a
1088: summation of exponential curves.  There was too  much noises for high interval time to fit
1089: that curve.  In fact  errors on $\mathcal{R}_0$ are linear with $t$.   So we have smoothed
1090: the curve  before fitting  by averaging  near points.  $\mathcal{R}_0$  for this  user was
1091: fitted with a sum of two exponentials.
1092: 
1093: %%% \begin{eqnarray}
1094: %%%   \lefteqn{e^{-0.197-8.12e-06t} + e^{-2.04-4.002e-04t}} \nonumber \\
1095: %%%   & & \mbox{} + e^{-3.234-4.308e-03t}
1096: %%% \end{eqnarray}. 
1097: 
1098: It seems that more than  the \emph{Login}/\emph{Logout} behavior there is also a
1099: notion of user activity.  For example during the preparation of jobs or analysis
1100: phase of the results an user does  not use the Grid and consequently the cluster
1101: at all.  More  than the \emph{Login} and \emph{Logout}  state an \emph{Inactive}
1102: state could be added to the model if needed.
1103: 
1104: \subsection{Other workloads comparison}
1105: 
1106: %% Particularité des jobs ne se trouve pas dans les autres modeles. 
1107: 
1108:  User number 3 is the most  active user from the NASA Ames iPSC/860 workload~\footnote{The
1109:  workload log  from the NASA Ames iPSC/860  was graciously provided by  Bill Nitzberg. The
1110:  workload  logs  from DAS2  were  graciously  provided by  Hui  Li,  David  Groep and  Lex
1111:  Wolters. The  workload log from the SDSC  Blue Horizon was graciously  provided by Travis
1112:  Earheart  and  Nancy   Wilkins-Diehr.   All  are  available  at   the  Parallel  Workload
1113:  Archives~\url{http://www.cs.huji.ac.il/labs/parallel/workload/}\label{foot:pwa}}.
1114:  Figure~\ref{fig:nasa:nasa}  shows  its  $\mathcal{R}_0(t)$  probability,  it  is  clearly
1115:  hyper-exponential of  order 2, as other  users like number  22 and 23.  Other  users like
1116:  number 12 and 15 are more classical Poissonian users.
1117: 
1118:  DAS2 Clusters (see  note \ref{foot:pwa}) used also PBS and  MAUI as their batch
1119:  system and scheduler.   The main difference we have with them  is that they use
1120:  Globus to co-allocate  nodes on different clusters.  We only  have bag of tasks
1121:  applications which interacts together in a pipeline way by files stored on SEs.
1122:  Their     fs0     144     CPUs     cluster    is     quite     similar     with
1123:  ours. Figure~\ref{fig:nasa:das2}  shows the $\mathcal{R}_0(t)$  probability for
1124:  their most active user and corresponding hyper-exponential fitting or order 2.
1125: 
1126:  SDSC Blue Horizon cluster (see note  \ref{foot:pwa}) have a total of 144 nodes.
1127:  The $\mathcal{R}_0(t)$  distribution probability of their most  active user was
1128:  fitted with a hyper-exponential of order 2 in figure~\ref{fig:nasa:sdsc}.
1129:  
1130: %% test.12.r0 ~ exp = Poisson process
1131: %% test.14.r0 ~ droite => envoi regulier
1132: %% test.24.r0 ~ hyper-exp
1133: 
1134: \section{Related works}
1135: \label{related}
1136: 
1137:  Our  Grid environment  is very  particular  and different  from common  cluster
1138:  environment as  parallelism involved requires no  interaction between processes
1139:  and degree of parallelism is one for all jobs.
1140: 
1141:  To be  able to completely  simulate the  node usage we  need not only  the jobs
1142:  submittal  process but also  the job  duration process.   Our runtime  model is
1143:  similar  with  the Downey  model~\cite{downey99elusive}  for  runtime which  is
1144:  composed of linear  pieces in logspace.  There is  a strong correlation between
1145:  successive jobs  running time but  it seems unlikely  that a general  model for
1146:  duration may be  made because it depends highly on algorithms  and data used by
1147:  users.
1148: 
1149:  Most other models use  Poisson distribution for interarrival distribution.  But
1150:  evidences,  like CV  be  much  higher than  one,  demonstrate that  exponential
1151:  distributions does not fit  well the real data~\cite{feitelson01, jann97}.  The
1152:  need  of a  detailed  model was  expressed  in~\cite{talby99b}.  With  constant
1153:  parameters our model exhibits a hyper-exponential distribution for interarrival
1154:  rate and justify  such a distribution choice.  One strong  benefit of our model
1155:  is  that  it  is  general  and  could be  used  numerically  with  non-constant
1156:  parameters at the expense of difficult fitting.
1157:  
1158: \section{Discussion}
1159: \label{Discussion}
1160: 
1161:  What  could be  stated is  that job  maximum run  times provided  by  users are
1162:  essentially inaccurate,  some authors are  even not using this  information for
1163:  scheduling~\cite{darincosts}.  Maybe  a better concept is  the relative urgency
1164:  of a job.   For example on a grid software managers  are people responsible for
1165:  installing software  on cluster nodes  by sending installation  jobs.  Software
1166:  manager jobs may  be regarded as more urgent than other  jobs type.  So sending
1167:  jobs  with an  estimated runtime  could  be replaced  by sending  jobs with  an
1168:  urgency parameter.  That urgency could be established in part as a site policy.
1169:  Each site administrator  could define some users classes  for different kind of
1170:  jobs and  software used  with different jobs  priorities.  For instance  a site
1171:  hosted in some laboratory might wish to promote its scientific domain more than
1172:  other domain, or some specific applications might need quality of services like
1173:  real time interaction.
1174: 
1175:  Another idea for scheduling  is to have some sort of risk  assessment measured during the
1176:  scheduler decision.  This risk assessment may be based on blocking probabilities obtained
1177:  either from the logs or from some user behavior models.  For example, it could be wise to
1178:  forbid that a group or  an user takes all the cluster at a given  time but instead to let
1179:  some few percents of it open for short jobs or low CPU consuming jobs like monitoring.
1180:  
1181:  Information System  shows for a  site the number  of job currently  running and
1182:  waiting. But it is not really the relevant metric in an on-line environment.  A
1183:  better metric for a cluster is  the computing flow rate input and the computing
1184:  flow rate capacity.  A cluster is  able to treat some amount of computation per
1185:  unit of time.   So a cluster is contributing to the  Grid with some computation
1186:  flow rate (in GigaFLOPS or TeraFLOPS).  As with classical queuing theory if the
1187:  input rate is  higher than the capacity, the site is  overloaded and the global
1188:  performances are low due to jobs  waiting to be processed. What happens is that
1189:  the site  receive more jobs that  is is able to  treat in a given  time. So the
1190:  queues begin to grow and jobs have  to wait more and more before being started,
1191:  resulting in performance decay.   Similarly when the computation submitted rate
1192:  is lower  than the site  capacity the site  is under-used.  Job  submittal have
1193:  also to be  fairly distributed according to the site  capacity.  For example, a
1194:  site  that  is twice  bigger  than  another site  have  to  receive twice  more
1195:  computing request  than the  other site.   But there is  a problem  to globally
1196:  enforce  this submittal  scheme on  all the  Grid.  This  is why  a  local site
1197:  migration policy  may be better than  a central migration policy  done with the
1198:  RB.
1199: 
1200:  To be more precise there are  two different kinds of cluster flow rate metrics,
1201:  one is  the local flow  rate and the  other one is  the global flow  rate.  The
1202:  local computing flow rate is the flow  rate that one job sees when reaching the
1203:  site.  The global flow rate is the computing flow rate a group of jobs see when
1204:  reaching the  site.  That  global flow  rate is also  the main  measurement for
1205:  meta-scheduling between  sites.  These two metrics are  different, for instance
1206:  we could have a  site with a lot of slow machines (low  local flow rate and big
1207:  global flow rate) and another site with only few supercomputers (big local flow
1208:  rate and low global flow rate).  But the most interesting metric for one job is
1209:  the local  flow rate.   This means that  if each  job wants individually  to be
1210:  processed at  the best  local flow rate  site, this  site will saturate  and be
1211:  globally slow.
1212: 
1213:  As far as all  users and groups computation total flow rate is  less than the site global
1214:  flow rate or site capacity, there is no real fairness issue because there is no strife to
1215:  access the site  resources, there is enough for  all.  The problem comes when  the sum of
1216:  all  computation flow  rate is  greater  than the  site capacity,  firstly this  globally
1217:  reduces the site  performance, secondly the scheduler must take  decision to share fairly
1218:  these resources.  The Grid is an ideal  tool that would allow to balance the load between
1219:  sites by migrating jobs~\cite{darincosts}.  A site  that share their resources and is not
1220:  saturated could  discharge another  heavily loaded  site.  Some kind  of local  site flow
1221:  control could  maintain a bounded input  rate even with fluctuating  jobs submittal.  For
1222:  instance fairness  between groups and  users could be  maintained by decreasing  the most
1223:  demanding input rate and distributing it to other less saturated sites.
1224: 
1225:  Another  benefits is  that  applications  computing flow  rates  may be  partly
1226:  expressed by users  in their job requirements.  Computing  flow rate takes into
1227:  account both  the jobs sizes and  their time limits. Fairness  between users is
1228:  then ensured if whatever may be flow values asked by each user, part granted to
1229:  each  penalizes no  other one.  Computing flow  rate granted  by a  site  to an
1230:  application may depend  on the applications degree of  parallelism, that is for
1231:  the moment the number of jobs.  For  instance it may be more difficult to serve
1232:  an application composed of only one job asking for a lot of computing flow rate
1233:  than to serve  an application asking the same computing  flow rate but composed
1234:  of many jobs.   Urgency is not totally measured by a  computing flow rate.  For
1235:  example  a critical  medical application  which is  a matter  of life  or death
1236:  arriving on  a full site has to  be treated in priority.  Allocating flow rates
1237:  between users and groups has to be  right and to take under account priority or
1238:  urgency issues.
1239: 
1240:  To use a site  wisely users have to bound their computational  flow rate and to
1241:  negotiate  it with  site managers.   A computing  model has  to be  defined and
1242:  published.  These remarks  are important in the case  of on-line computing like
1243:  Grids  where meta-scheduling strategy  have to  take a  lot of  parameters into
1244:  account.      General     on-line     load     balancing     and     scheduling
1245:  algorithms~\cite{azar97line,   azar94line,  bar-noy00new,  lam02line}   may  be
1246:  applied.  The problem of finding the  best suited scheduling policy is still an
1247:  open problem.  A better understanding of  job running time is necessary to have
1248:  a full model.
1249: 
1250:  The LCG middleware allows  users to send their jobs to different  nodes.  This is done by
1251:  the way of a central element called  a Resource Broker, that collects user's requests and
1252:  distributes  them  to computing  sites.   The  main purpose  is  to  match the  available
1253:  resources and balance the load of  job submittal requests. Jobs are better localized near
1254:  the data they need to use.
1255: 
1256:  We would  like to advise instead a  peer to peer~\cite{andrade-ourgrid} view  of the Grid
1257:  over a centralized one.  In this view computing sites themselves work together with other
1258:  computing  sites to  balance the  average workload.   Not relying  on  dependent services
1259:  greatly improves  the reliability and  adaptability of the  whole systems.  That  kind of
1260:  meta-scheduling have to  be globally distributed as stated by Dmitry  Zotkin and Peter J.
1261:  Keleher~\cite{zotkin99joblength}:
1262:  
1263:  %% \begin{quote}
1264:  \textit{In a distributed system like  Grid, the use of a central Grid
1265:  scheduler}\footnote{like the Resource Broker used in LCG middleware}
1266:  \textit{may result in a performance  bottleneck and lead to a failure
1267:  of  the  whole  system.   It   is  therefore  appropriate  to  use  a
1268:  decentralized scheduler architecture  and distributed algorithm.}  
1269:  %%  \end{quote}  %% %%  \hfill{\small  (Dmitry  Zotkin  and Peter J. Keleher )}
1270: 
1271:  gLite~\cite{glite-design} is the next generation middleware for Grid computing.
1272:  gLite will provide  lightweight middleware for Grid computing.   The gLite Grid
1273:  services  follow   a  Service  Oriented  Architecture   which  will  facilitate
1274:  interoperability among  Grid services.  Architecture details of  gLite could be
1275:  viewed  in~\cite{glite-arch}.   The architecture  constituted  by  this set  of
1276:  services is not bound to  specific implementations of the services and although
1277:  the  services are expected  to work  together in  a concerted  way in  order to
1278:  achieve the goals of the end-user  they can be deployed and used independently,
1279:  allowing  their   exploitation  in  different  contexts.    The  gLite  service
1280:  decomposition  has been largely  influenced by  the work  performed in  the LCG
1281:  project.  Service implementations need to  be inter-operable in such a way that
1282:  a client may talk to different independent implementations of the same service.
1283:  This  can be  achieved in  developing  lightweight services  that only  require
1284:  minimal  support from their  deployment environment  and defining  standard and
1285:  extensible communication protocols between Grid services.
1286:  
1287: \hfill
1288: 
1289: 
1290:  \section{Conclusion}
1291:  \label{Conclusion}
1292: 
1293:  So far  we have analyzed the  workload of a Grid  enabled cluster and
1294:  proposed an infinite Markov-based model that describes the process of
1295:  jobs arrival.   Then a  numerical fitting has  been done  between the
1296:  logs and the  model. We find a very similar  behavior compared to the
1297:  logs, even bursts were observed during the simulation.
1298: 
1299:  \section*{Acknowledgments\markboth{Acknowledgments}{Acknowledgments}}
1300:  The cluster at LPC  Clermont-Ferrand was funded by Conseil R\'egional
1301:  d'Auvergne   within   the   framework   of  the   INSTRUIRE   project
1302:  (\url{http://www.instruire.org})
1303: 
1304: %%%%  \appendix
1305: %%%%  
1306: %%%%  \begin{table*}[hb]
1307: %%%%     \begin{center}
1308: %%%%       \begin{tabular}{l l}
1309: %%%%         \textsc{Pilot applications} & \\
1310: %%%%         \hline \\
1311: %%%%         \textbf{CDSS} & Clinical decision support system \\
1312: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/CDSS.html} \\   
1313: %%%%         \textbf{GATE} & Application for tomographic emission \\
1314: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/gate.html} \\
1315: %%%%         \textbf{GPS@} & Grid genomic web portal \\
1316: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/GPSA.html}\\
1317: %%%%         \\
1318: %%%%         \textsc{External applications} & \\
1319: %%%%         \hline \\
1320: %%%%         \textbf{MammoGrid} & European-wide database of mammograms \\
1321: %%%%         & \url{http://mammoGrid.vitamib.com/} \\
1322: %%%%         \\
1323: %%%%         \textsc{Internal applications} & \\
1324: %%%%         \hline \\
1325: %%%%         \textbf{Docking platform} & Grid-enabled docking platform for in
1326: %%%%         sillico drug discovery for tropical diseases \\
1327: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/tropicaldisease.html} \\
1328: %%%%         \textbf{gPTM 3D} & Interactive radiological image visualization and processing tool \\
1329: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/gPTM3D.html} \\
1330: %%%%         \textbf{GridGRAMM} & Molecular docking web \\
1331: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/GridGRAMM.html} \\
1332: %%%%         \textbf{GROCK} & Mass screenings of molecular interactions web \\
1333: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/GROCK.html} \\
1334: %%%%         \textbf{SiMRI 3D} & Magnetic resonance image simulator \\
1335: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/SiMRI3D.html} \\
1336: %%%%         \textbf{Xmipp MLrefine} & Macromolecular 3D structure analysis \\
1337: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/xmipp_MLrefine.html} \\
1338: %%%%         \textbf{Xmipp multiple CTFs} & Micrographia CTF calculation \\
1339: %%%%         & \url{http://egee-na4.ct.infn.it/biomed/Xmipp_assign_multiple_CTFs.html} \\
1340: %%%%       \end{tabular}
1341: %%%%     \end{center}
1342: %%%%     \caption{Biomedical application list used by the Biomed VO}
1343: %%%%     \label{table:applications}
1344: %%%%   \end{table*}
1345: 
1346: %%\clearpage
1347: \begin{small}
1348: 
1349:   \bibliographystyle{unsrt}
1350:   \bibliography{workloadAnalysis.bib}
1351: 
1352: \end{small}
1353: 
1354: \end{document}
1355: 
1356: 
1357: 
1358: %% R0
1359: %% Ok:
1360: %% atlas001 @ clrce01/2 (~ regulier ?) 
1361: %% atlas002 @ clrce01
1362: %% atlas003 @ clrce01 (tres regulier ?)
1363: %% atlas004 @ clrce01 (!!) 
1364: %% atlassgm @ clrce01 (!!) 
1365: %% atlassgm @ clrce02
1366: %% biomed001 @ clrce01 (!!) 
1367: %% biomed001 @ clrglop195
1368: %% biomed002 @ clrce01 (!!) 
1369: %% biomed002 @ clrce02 (!!) 
1370: %% biomed003 @ clrce01 !
1371: %% biomed003 @ clrce02 !
1372: %% biomed004 @ clrce01 !
1373: %% biomed005 @ clrce01 (!!) 
1374: %% biomed005 @ clrglop195 (!!) 
1375: %% biomed007 @ clrce01 (!!) (tres regulier ?) 
1376: %% biomgrid @ clrce01/2/95 (!!) 
1377: %% cms002 @ clrce01 (!!) tres regulier ? 
1378: %% dteam001 @ clrce01/95 (!!) regulier ? 
1379: %% dteam042 @ clrce02/95 (!!) tres regulier ? 
1380: %% dteam002 @ clrce01/2/95 (!!) tres regulier ? 
1381: %% dteam003 @ clrce01/2/95 regulier ? 
1382: %% dteam004 @ clrce01/2/95 (!!) tres regulier ? 
1383: %% dteam005 @ clrce01/2/95 regulier ? 
1384: %% dteam006/7 ce02/95 tres regulier ? 
1385: %% dteam011/3 (!!) tres regulier 
1386: %% lhcb001/2 @ clrce01/2/95 (!!) hyperexp.. 
1387: %% lhcb003 ce01 (--) hyperexp.. 
1388: %% lhcb003 @ 95 (!!) tres regulier ou hyperexp ?. 
1389: %% lhcbsgm @ 95 (!!) regulier ! 
1390: 
1391: 
1392: 
1393: 
1394: 
1395: 
1396: 
1397: 
1398: 
1399: %% LEs RB sont utilisés comme moyen de pression.