0805.2443/ms.tex
1: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2: %
3: % The scale free character of the cluster mass function and the universality
4: % of the stellar IMF
5: %
6: % Version 1.2: 31 Jan 2008 Many changes and added work.
7: %
8: % Version:  04 May 2007, FSE
9: %		Includes changes requested by referee after
10: %		the answer by the editor.
11: % Version: 15 May 2007, JME
12: %               mostly cosmetical. Proposed re-phrasing of discussion of Elmegreen
13: %
14: % Version: 15 November 2007, JME
15: % Version: 18 January 2008. JME
16: %
17: %
18: \documentclass[12pt,manuscript]{emulateapj}
19: %%\usepackage[sort]{natbib}
20: %%\usepackage{amssymb,amsmath}
21: \bibliographystyle{plainnat} 
22:  
23: \newcommand{\vdag}{(v)^\dagger}
24: \newcommand{\myemail}{jmelnick@eso.org}
25: 
26: %% You can insert a short comment on the title page using the command below.
27: 
28: %\slugcomment{Draft. Please do not circulate (\today).}
29: 
30:  
31: \shorttitle{The cluster mass function and the universality of the IMF}
32: \shortauthors{Selman and Melnick}
33: 
34:  
35: \begin{document}
36: 
37: %% LaTeX will automatically break titles if they run longer than
38: %% one line. However, you may use \\ to force a line break if
39: %% you desire.
40: 
41:     \title{ 
42: 	The scale-free character of the cluster mass function and the universality of the stellar IMF.
43: 	}
44: 
45: 
46: %% Use \author, \affil, and the \and command to format
47: %% author and affiliation information.
48: %% Note that \email has replaced the old \authoremail command
49: %% from AASTeX v4.0. You can use \email to mark an email address
50: %% anywhere in the paper, not just in the front matter.
51: %% As in the title, use \\ to force line breaks.
52: 
53: \author{Fernando J. Selman   and Jorge Melnick }
54: \affil{European Southern Observatory, Santiago, Chile.}
55:    
56: % \email{fselman@eso.org}
57: 
58:  
59: \begin{abstract}
60: Our recent determination of a Salpeter slope   
61: 	for the IMF in the field of 30 Doradus \citep{selman2005}
62: 	appears to be in conflict with simple probabilistic counting 
63: 	arguments advanced in the past to support observational
64: 	claims of a steeper  IMF in the LMC field.
65: 	In this paper we re-examine these arguments and show by explicit 
66: 	construction that, contrary to these claims, the field IMF is expected 
67: 	to be exactly the same as  the stellar IMF of the  clusters out
68: 	of which the field was presumably formed.
69: 	We show that the current data on the mass distribution of clusters themselves
70: 	is in excellent agreement with our model, and is consistent
71: 	with a single spectrum {\it by number of stars} of the type $n^\beta$~with $\beta$\ between $-1.8$ and $-2.2$
72: 	down to the smallest clusters without any preferred mass scale for
73: 	cluster formation.
74: 	We also use the random sampling model to estimate the statistics of the maximal
75: 	mass star in clusters, and confirm the discrepancy with observations found by \cite{weidner2006}.
76: 	We argue that rather than signaling the violation of the random sampling
77: 	model these observations reflect the gravitationally unstable nature of systems with one
78: 	very large mass star. We stress the importance of the random sampling model
79: 	as a \emph{null hypothesis} whose violation would signal the presence of
80: 	interesting physics.
81: \end{abstract}
82: 
83: %% Keywords should appear after the \end{abstract} command. The uncommented
84: %% example has been keyed in ApJ style. See the instructions to authors
85: %% for the journal to which you are submitting your paper to determine
86: %% what keyword punctuation is appropriate.
87:  \keywords{	galaxies: evolution --
88: 		galaxies: star clusters --
89: 		galaxies: stellar content --
90: 		Galaxy:	stellar content --
91: 		stars: formation --
92: 		stars: mass function --
93: 		star formation --
94:                 initial mass function --
95:                 IMF --
96: 		Star clusters
97:                }
98:  
99:  \section{Introduction}
100: 
101: In a recent paper \citep{selman2005} we measured
102: the Initial Mass Function (IMF) of the field in the 30 Doradus
103: super-association and found that for $7 \leq m/M_\odot \leq 40$\
104: the field IMF can be characterized as a power law with the \citet{salpeter1955} slope.
105: This result contradicts  claims of a steep
106: IMF for the LMC field \citep{massey1995, massey2002, gouliermis2005},
107: and lends support to the hypothesis of a universal IMF.
108: However, the observation of an initial mass spectrum of the same slope
109: in clusters and the field goes against
110: the probabilistic counting arguments of \cite{vanbeveren1982} as
111: interpreted by \citet[][henceforth KW2003]{kroupa2003}. Following
112: Vanbeveren, KW2003 posit that if the field population is entirely formed
113: out of disrupted clusters, then the field IMF 
114: must be steeper because there are many more low mass clusters
115: than massive ones, and low mass clusters cannot contain 
116: stars more massive than the clusters themselves.
117:   
118: Although an extensive review of the literature is beyond the scope
119: of the present work, a brief tour will place it in its proper context.
120: \cite{vanalbada1968b} built groups of stars by randomly sampling
121: an IMF $f(m)dm$\ and gave the formulas for the general order statistics,
122: of which the distribution function of the maximum mass star is one
123: particular case\footnote{Recently \cite{oey2005} have shown using the random sampling
124: model that the statistics of the maximal mass star in a number
125: of OB associations show evidence of an upper mass limit in the range 100-200~M$_\odot$.}.
126: \cite{reddish1978} gives the formulation used
127: by Vanbeveren, which appears to be one of the first references
128: that gives the formula for the maximal stellar mass
129: as an integral of the IMF (Equation~\ref{mmax} below).
130: \citet{larson1982} studied the correlation between the maximum stellar
131: mass and the mass of the parent molecular clouds in star-forming regions.
132: He noted that the observed correlation, $M_{max*} = 0.33M_{cloud}^{0.43}$,
133: could be explained by stochastic sampling of an IMF with Salpeter slope.
134: To study dynamical biasing \citep{vanalbada1968a} in binary star formation \cite{mcdonald1993} used a two-step
135: process in which they sample stars assuming that a certain fraction $g(N)$\ of them come
136: from groups of size N, and then sampled a stellar mass spectrum to build several statistics
137: of binary stars. The method  was extended by \cite{sterzik1998} to study the decay of
138: gravitational few body systems. To build
139: their clusters they introduced what they called a ``two-step'' approach which later led
140: them to the ``two-step initial mass function'' \citep{durisen2001}: first draw a cluster
141: mass from a cluster mass-function, then draw enough stars from a stellar IMF to
142: add to the cluster mass. The method presented here is similar, but with the
143: important difference that we do not censor by mass, but we rather work with a cluster
144: spectrum by number, and draw stars from a stellar mass function. This has the important
145: consequence that we know by construction that there are no preferred mass scales other
146: than those present in the stellar mass spectrum or the spectrum of clusters by number.
147: A similar model was used by \cite{oey2004}\ to study the distribution of clusters
148: by numbers in the SMC to conclude that the data for the high mass groupings studied
149: is consistent with an $n^{-2}$\ distribution.
150: 
151: \cite{vanbeveren1982} using the then assumed Salpeter slope
152: for the field stellar IMF concludes that \emph{``massive aggregates would
153: contain more OB type stars than predicted by the Salpeter IMF."}
154: Interestingly, KW2003 turn the argument around and use the well established Salpeter
155: form for the cluster stellar IMF for $m>1 M_\odot$~to infer a slope
156: steeper than Salpeter for the field stellar IMF in the same mass range.
157: With the exception of the LMC work mentioned above, the evidence
158: for a Salpeter slope for the field stellar population is overwhelming,
159: from the original \citet{salpeter1955} work on the Milky Way
160: to more recent work such as that of \cite{scalo1986}
161: which slightly steepens the slope of the high mass end from 2.35 to 2.7
162: \citep[for recent reviews on this topic the reader is referred to][]{kroupa2002,elmegreen2006}.
163: Recently, Weidner and Kroupa (2006; henceforth WK2006) used an extensive
164: set of Monte Carlo simulations to investigate the question of whether clusters
165: could be constructed by sampling stellar IMFs using different sampling prescriptions.
166: Their strong conclusion is that the model in which clusters are 
167: built by random sampling of a (Salpeter) stellar IMF is falsified by the statistics
168: of the maximal mass star in clusters, stating: \emph{``With this contribution
169: we demonstrate conclusively that the purely statistical notion is false,
170: and that the stellar IMF is sampled to a maximum stellar
171: mass that correlates with the cluster mass."} 
172: 
173: The purpose of this paper is to revisit this issue and to examine the question of whether clusters
174: and the field sample a universal stellar mass distribution.
175: We show that what really matters
176: is the (poorly determined) cluster mass spectrum in the range of single star masses ($M<150M_{\odot}$), and that
177: it is more natural to work with the cluster ``number of stars spectrum", $P(n)$, 
178: the probability of a cluster having $n$ stars.
179: We address this question in two different ways: by studying it from first principles, and by actually
180: doing Monte Carlo experiments building clusters  randomly
181: sampling a universal stellar IMF and comparing the results with the observations. 
182: The random sampling model has no other physics in it than that input from the stellar mass
183: spectrum and the cluster number spectrum. It should be considered as a \emph{null hypothesis}
184: for interesting physical processes: its violation signals the presence of interesting physics.
185: Occam's razor should be used with all models that violate the \emph{null hypothesis} until
186: strong observational evidence renders the model untenable.
187: 
188: In Section~\ref{formalism} we present the formal framework for the subsequent analysis,
189: and give an analytical parametrization of the stellar IMF
190: that agrees reasonably well with observations at all masses. In that section we also
191: present an analytical relationship between the cluster mass function and the stellar IMF. We use this
192: relation to conduct Monte Carlo experiments to simulate
193: the mass distribution of  clusters.   In Section~\ref{obs} we compare our simulations with the observed 
194: distribution of embedded clusters presented by \citet[][~henceforth LL2003]{lada2003}. The claim by LL2003
195: that there is a preferred mass scale for cluster formation is not borne out by
196: our analysis, and a critical discussion to uncover the sources of this
197: discrepancy is presented. In Sections~\ref{compare} and~\ref{disc} 
198: we challenge the view that all stars 
199: form in clusters and argue that our results favor a view where stars form, or at least acquire their
200: final properties, before cluster formation.
201: Section~\ref{summary} summarizes our results and ends with the usual plea for more observations.
202: 
203:  
204:  \section{Building a field population from clusters: the formalism.}
205: \label{formalism}
206: 
207: We will use the term population  in
208: the statistical sense: a
209: set with infinitely many elements \citep{brandt1998}.
210: Consider a population of stars with a Salpeter frequency
211: distribution of masses $f(m)$.  The mass $m$\ 
212: of the stars therefore is a random variable with a frequency
213: distribution $f(m)$. Let us draw samples from such population
214: with a fixed number of stars $n$\ and frequency distribution $P(n)$\footnote{
215: In this paper the symbol $P(x)$\ stands for probability when
216: the random variable $x$ is a number (i.e. $n$), and for the frequency
217: distribution when $x$ is a mass (i.e. $m; M; m|M$). The meaning should be clear from the context.
218: }.
219: Each of the samples will be called a \emph{cluster} although such ``clusters'' can contain a single star. 
220: This construction is analogous to those used in previous
221: work studying the properties of HII regions in galaxies
222: \citep{oey1998}, the more general study of Poissonian
223: fluctuations in population synthesis models by \citet{cervino2002},
224: and the analysis of the isolated massive stars in the Milky Way by
225: \cite{dewit2005}.
226: 
227: The frequency distribution function of cluster masses will be given by,
228: \begin{eqnarray}
229: \xi_{cl}(M) = \sum_{\small n=1 \atop M = m_1 + m_2 \cdots + m_n}^\infty \mathcal{F}_n(m_1,m_2,\cdots,m_n)P(n) ,
230: \end{eqnarray} where $\mathcal{F}_n$\ is the multivariate frequency distribution of
231: masses for a sample of size $n$, and the summation is understood
232: also as a multiple integral over all masses $m_1,\cdots,m_n$\ satisfying the
233: constraint that they add up to $M$ (which implies quite a complex domain of integration).
234: If the sample is  random then the following two
235: conditions are satisfied: 
236: 
237: \begin{itemize}
238: \item[(a)] the individual $m_i$\ must be independent, that is,
239: \begin{eqnarray}
240: \mathcal{F}_n(m_1,m_2,\cdots,m_n) = f_1(m_1)f_2(m_2)\cdots f_n(m_n),
241: \end{eqnarray}
242: 
243: \item[(b)] the individual marginal distributions
244: must be identical and equal to the frequency distribution
245: of the parent population, that is,
246: \begin{eqnarray}
247: f_1(m_1) = f_2(m_2) = \cdots = f_n(m_n) = f(m).
248: \end{eqnarray}
249: \end{itemize} We can write an explicit expression for the cluster
250: mass function (we will use lowercase for stellar quantities and
251: uppercase for cluster quantities). Because we consider only random
252: samples, the variable $M = m_1 + m_2 + \cdots + m_n$\ is also a random
253: variable.  Thus, the distribution function of $M$, $F_n(M)$, can be written as
254: \begin{eqnarray}
255: F_n(M) = \int\limits_{\small -\infty\atop {\large M < m_1 + m_2 + \cdots + m_n < M + dM} }^{+\infty} f(m_1)f(m_2)\cdots f(m_n)dm_1\cdots dm_n.
256: \end{eqnarray}
257: We have used a somewhat unusual notation under the integral sign to indicate that  the domain of integration is restricted to total masses
258: between $M$ and $M+dM$ only. We can write this condition on the total masses as a Dirac delta-function in terms of its Fourier expression \citep{morse1953}
259: \begin{eqnarray}
260: \delta(M-\sum m_j) = {1\over 2\pi} \int_{-\infty}^{+\infty} e^{-i(M-\sum m_j)t}dt.
261: \end{eqnarray}This allows us to integrate over all positive $m_j$\ and thus to avoid the problem
262: posed by the difficult domain of integration:
263: \begin{eqnarray}
264: F_n(M) & = & {1\over 2\pi} \int_{-\infty}^{+\infty} e^{-i(M-\sum m_j)t}dt f(m_1)f(m_2)\cdots f(m_n)dm_1\cdots dm_n \\
265: 	& =  & {1\over 2\pi} \int_{-\infty}^{+\infty} e^{-iMt} \prod_{j=1}^n \left[\int_{-\infty}^{+\infty} e^{im_jt}f(m_j)dm_j\right]dt\\
266: 	&  = & {1\over 2\pi} \int_{-\infty}^{+\infty} e^{-iMt} \phi^n(t)dt
267: \end{eqnarray} where $\phi(t)$\ is the characteristic function of $f(m)$, that is,
268: the Fourier transform of the probability density:
269: \begin{eqnarray}
270: \phi(t) & = & \int_{-\infty}^{+\infty} e^{imt}f(m)dm.
271: \end{eqnarray}Thus,
272: \begin{eqnarray}
273: F_n(M) & = & {1\over 2\pi} \int_{-\infty}^{+\infty} e^{-iMt}\phi^n(t)dt.
274: \end{eqnarray} Finally, the cluster mass function (Equation 1) can be written as,
275: \begin{eqnarray}
276: \label{eqMC}
277: \xi_{cl}(M) & = & \sum_{n=1}^\infty F_n(M)P(n),\\
278: \xi_{cl}(M) & = & {1\over 2\pi} \int_{-\infty}^{+\infty} e^{-iMt}\underbrace{\sum_{n=1}^\infty P(n)\phi^n(t)}_{\Phi(t)}dt
279: \end{eqnarray}where $P(n)$\ is an arbitrary probability distribution of the number of stars in clusters. We see above that
280: the characteristic function of the cluster mass function, $\Phi(t)$, and the characteristic function of the stellar
281: mass function, $\phi(t)$, are related as,
282: \begin{eqnarray}
283: \label{charfunc}
284: \Phi(t) = \sum_{n=1}^\infty P(n)\phi^n(t).
285: \end{eqnarray} 
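
As a consistency check on Equation~\ref{charfunc} (a short sketch, assuming only that $f(m)$\ and $P(n)$\ are normalized and that the mean number of stars per cluster, $\langle n\rangle$, and the mean stellar mass, $\langle m\rangle$, are finite), note that $\phi(0)=1$\ and $\phi'(0)=i\langle m\rangle$, so that
\[
\Phi(0) = \sum_{n=1}^\infty P(n) = 1, \qquad
\Phi'(0) = \sum_{n=1}^\infty nP(n)\phi^{n-1}(0)\phi'(0) = i\langle n\rangle\langle m\rangle .
\]
The cluster mass function is therefore normalized, and its mean is $\langle M\rangle = \langle n\rangle\langle m\rangle$. For a pure power law $P(n)\propto n^\beta$\ with no upper cut-off in $n$, the mean $\langle n\rangle$\ is finite only if $\beta<-2$; otherwise this last relation holds only once a cut-off is imposed.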
286: 
287: Equations~\ref{eqMC}-\ref{charfunc} form the basis for either an
288: analytical, or a Monte Carlo approach to the statistical
289: simulation of clusters. Their importance resides in that they relate
290: the cluster ``number of stars''  distribution function, $P(n)$, the (universal)
291: stellar IMF, and the actual cluster mass function. We will study a simple
292: analytical case to illustrate their properties 
293: and proceed with full Monte Carlo simulations.
294: Consider for example the case in which the cluster stellar mass function is simply
295: $f(m) = \delta(m-m_*)$, that is, a cluster
296: with a single stellar mass species of mass $m_*$. In this case $\phi(t) = e^{im_*t}$,
297: and $\phi^n(t) = e^{inm_*t}$\ from which we
298: obtain $F_n(M) = \delta(M-nm_*)$\ as expected.
299: More generally, we can use the relationship between
300: cumulants and moments of a distribution \citep[][p.69]{kendall1977} to determine that
301: the mean mass of $F_n(M)$\ scales with $n$, and its width
302: scales with $\sqrt{n}$.
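
The scaling quoted above follows directly from the characteristic function. The following is a sketch in which $\kappa_r$\ denotes the $r$-th cumulant, a symbol introduced here only for illustration. Since the characteristic function of $F_n(M)$\ is $\phi^n(t)$,
\[
\ln \phi^n(t) = n\ln\phi(t) \quad\Longrightarrow\quad \kappa_r(M) = n\,\kappa_r(m),
\]
and in particular, whenever the first two moments of $f(m)$\ exist,
\[
\langle M\rangle = n\langle m\rangle, \qquad \sigma_M = \sqrt{n}\,\sigma_m, \qquad {\sigma_M\over\langle M\rangle} \propto {1\over\sqrt{n}} .
\]
The relative width of $F_n(M)$\ thus shrinks as $n$\ grows.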
303: 
304: We should notice that  we have constructed a set of clusters
305: with strictly the same mass spectrum as that of a field
306: built by their total destruction.
307: Since we can set $P(n)$\ to be any function, and in particular $P(n)=0$\ for $n<N$\ for an arbitrary $N$,
308: this important result  holds independently of the lower cut-off in the cluster
309: number spectrum:  \emph{the stellar mass spectrum of clusters and of a field built entirely out 
310: of disrupted clusters can be strictly the same.} At first sight this may seem to be an almost trivial result, but
311: notice the subtlety revealed by the following {\it gedanken} experiment:  sample a universal stellar IMF to create a sample
312: of $n$\  clusters with different numbers of stars according to $P(n)$ and partition this sample 
313: according to their mass $M$ to determine the stellar IMF conditioned to the parent cluster mass, $P(m|M)$
314: (the probability that a star has mass $m$ if the parent cluster mass is $M$).  For $M$ in the range of
315: single star masses, this probability is 
316: not independent of $M$ and one would get the impression
317: that the stellar IMF does depend on the cluster mass, that is, it is not universal. However, we know  by construction that the
318: clusters have their stars drawn from exactly the same stellar IMF; what depends on
319: cluster mass is the {\it conditional probability}. The purpose of this paper is to
320: investigate whether the observations are in agreement with the $P(m|M)$\ that derives from the random
321: sampling model, or whether they falsify it. Furthermore, we know from probability theory
322: \begin{eqnarray}
323: \label{conditionalP}
324: P(m) = \int_0^{+\infty}P(m|M)P(M)dM
325: \end{eqnarray}that in the above construction we should recover the input stellar mass function. Thus, for given $P(n)$ and $P(m)$,
326: $P(m|M)$\ and $P(M)$\ must  {\it conspire} so that Equation 14 is satisfied.
327: Using this relation one finds for the simple case of the single mass species
328: clusters with $n$ stars that $P(m|M)=\delta(M-nm_*-m+m_*)=\delta(m-m_*)$, independent of 
329: $M=nm_*$, as it should be. 
330: 
331: As shown in Section~\ref{compare}, this ``conspiracy'' is not present
332: in other treatments of this problem, where $P(m|M)$\ and $P(M)$\ are taken
333: to be totally independent. Because of this it makes more sense to
334: work with the cluster ``number function", $P(n)$.
335: 
336: Equations~\ref{eqMC}-\ref{conditionalP} are the fundamental relations relating
337: the stellar and cluster mass spectra in the
338: random sampling model. Given the stellar mass function $f(m)$\ and $P(n)$\ they fix the form
339: of the cluster mass function $\xi_{cl}(M)$\ and of $P(m|M)$, which can then be compared with observations.
340:  
341: \subsection{Monte Carlo simulations}
342: \label{mc}
343: 
344: Using the formalism described in the previous section we build clusters by randomly sampling the following ``universal'' stellar IMF,
345: \begin{eqnarray}
346: dN\propto {m^\alpha e^{-(m/m_2)^q}\over (m_1^2+m^2)^{\gamma/2}}dm
347: \end{eqnarray}
348: \noindent where $\gamma$, $\alpha$, $m_1$, $m_2$~and q~are chosen to give the appropriate
349: behavior at low and high masses: $-1+\Gamma = \alpha-\gamma=-2.35$, $m_1=0.3\/M_\odot$,
350: $m_2=150\/M_\odot$, $q=3$. Figure~1 shows this analytical
351: stellar IMF together with the IMF of the Trapezium cluster \citep{hillenbrand2000,luhman2000,muench2002}.
352: Our analytical formula departs from the observations at the higher masses  because we have chosen
353: to preserve the Salpeter slope for $1 \leq m/M_\odot \leq\ \sim\!\!120$.
354: Henceforth we will call this function the Salpeter IMF.
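
The limiting behavior of this parametrization can be read off directly (a sketch that ignores the transition region around $m_1$):
\[
{dN\over dm} \propto \left\{
\begin{array}{ll}
m^{\alpha}\,m_1^{-\gamma}, & m \ll m_1,\\
m^{\alpha-\gamma}\,e^{-(m/m_2)^q}, & m \gg m_1 .
\end{array}\right.
\]
The low mass slope is therefore set by $\alpha$\ alone, while for $m_1 \ll m \ll m_2$\ the exponential factor is close to unity and the slope is $\alpha-\gamma=-2.35$. The exponential factor truncates the distribution as $m$\ approaches $m_2$.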
355: 
356: 
357: % INSERT HERE SALPETER IMF
358: \begin{figure}
359:    \includegraphics[width=6cm,angle=-90]{f1.eps}
360:       \caption{
361:         The analytical stellar IMF that we use for our MC
362:         experiments compared with the IMF of the Trapezium
363:         cluster by \citet{hillenbrand2000}, steps; \citet{luhman2000},
364: 	segmented solid lines;
365: 	and \citet{muench2002}, asterisks.\label{stellarIMF}}
366: \end{figure}
367: 
368: We sample the Salpeter IMF to create clusters with \emph{n} stars assuming 
369: a scale-free frequency distribution $P(n)\propto n^\beta$,
370: and we build the cluster mass spectrum using Equation~\ref{eqMC},
371: \[
372: \xi_{cl}(M) = \sum_{n=1}^{\infty} F_n(M)P(n),
373: \] where $F_n(M)$~is the mass distribution function of clusters
374: with exactly $n$\ stars. Notice that this process is scale-free only if the sum starts from $n=1$,
375: in which case the only mass scales of the problem are $m_l$\ and $m_u$, the lower and upper
376: mass cut-offs of the stellar IMF.
377: Our Monte Carlo simulations consist of
378: repeatedly drawing \emph{n} stars from the Salpeter IMF,
379: calculating $M=m_1+\cdots+m_n$\ as the mass of a cluster
380: with \emph{n} stars, and then obtaining $F_n(M)$.
381: 
382: For the value of $\beta$\ there have been a multitude of studies of massive clusters
383: in galaxies, which give mass function exponents ranging from $\beta = -1.85$
384: \citep{degrijs2006} to $\beta = -2.4$\ \citep{hunter2003}. More extensive references
385: are given in \cite{elmegreen2006}. For smaller clusters \cite{dewit2004,dewit2005} claim
386: that their data on isolated massive star formation can be understood if $\beta=-1.7$. 
387: For massive clusters one can directly use the same exponent for the mass function as
388: for $P(n)$ because, as discussed above, the total mass scales with $n$ and the width 
389: for fixed $n$ scales with $\sqrt{n}$. In this work we will explore $\beta = -1.8, -2.0$, and $-2.2$.
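
The step from $P(n)$\ to the cluster mass function at large $n$\ can be made explicit. The following is a sketch that neglects the $\sqrt{n}$\ width of $F_n(M)$\ and treats $n$\ as a continuous variable, both approximations adopted here only for illustration. Replacing $F_n(M)$\ by $\delta(M-n\langle m\rangle)$, with $\langle m\rangle$\ the mean stellar mass of the IMF, and the sum over $n$\ by an integral,
\[
\xi_{cl}(M) \approx \int P(n)\,\delta(M-n\langle m\rangle)\,dn = {1\over\langle m\rangle}\,P\left({M\over\langle m\rangle}\right) \propto M^{\beta},
\]
valid for $M \gg \langle m\rangle$.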
390: 
391: \section{Comparison with observations}
392: \label{obs}  
393: 
394: We have identified two observational tests that can be performed
395: to check the validity of our \emph{null hypothesis.}
396: First, we will see if we can reproduce
397: the form of the embedded cluster mass function; second, we will see
398: if we can reproduce the statistics of the most massive star
399: in clusters. We are aware that we are leaving out tests regarding
400: the characteristics of small $n$\  multiple systems, which could
401: falsify it\footnote{The study of the statistics of small n multiple
402: systems is beyond the scope of the present work, but even here where
403: observations of the frequency of high mass doubles appear to violate
404: the simple random sampling model, there are physical mechanisms which
405: explain them while preserving the model, namely,
406: dynamical biasing \citep[see][and references therein]{sterzik1998}.}.
407: But multiple systems, although numerous, are not the
408: main source of stars in the field, so they will not affect the
409: main conclusions of the present work, namely that the stellar IMF
410: of clusters and of the field can be the same.
411:  
412: \subsection{The embedded cluster mass function}
413: 
414: Due to the difficulty defining unbiased complete samples,
415: the important range of cluster masses in the regime of
416: stellar masses is not well studied. There are nevertheless
417: two relatively recent sources based on extensive surveys of the
418: literature at the time of publication: \citet{porras2003} and \citet{lada2003}.
419: We prefer to use LL2003 for our analysis because they give estimates of the masses of the clusters, although 
420: only 4 of the clusters in the Porras et al. list that satisfy the
421: constraint on minimum number of stars of LL2003 are not included in this catalog. The cluster masses given in LL2003 were obtained by
422: modeling source counts as a function of limiting
423: magnitudes for two model clusters with ages of 0.8~Myr and 2~Myr,
424: corresponding to the ages of the Trapezium and IC~348 clusters
425: respectively. They assumed a universal IMF and used the average
426: of the mass determined for the two assumed ages. 
427: 
428: Figure~2 shows the empirical data of LL2003
429: together with the results of six runs of our MC experiments
430: in which we built clusters with the above $P(n)$\ for $n\geq35$, drawing
431: 72 clusters at a time (the parameters of the observations of LL2003).
432: We note the excellent agreement between the simulations
433: and LL2003 except for the mass bins at $\log M\sim0.95$
434: and $\log M\sim3.5$\ which are totally de-populated in LL2003.
435: For $\beta=-1.8$\ the smallest mass bin is de-populated in 15\% of our
436: simulations while in 70\% of the simulations it contains 2 clusters or fewer.
437: The highest mass bin is populated in only $\sim40\%$
438: of the simulations and in almost 100\% of the simulations
439: contains fewer than 2 clusters.
440: LL2003 proposed that the downturn at smaller masses was
441: evidence for a favored cluster formation mass scale at around $M\sim 50M_\odot$.
442: However, our simulations indicate that this downturn is naturally explained
443: by the cutoff in \emph{n} they introduced in an otherwise scale-free spectrum, 
444: without the need
445: to invoke a special cluster formation scale. The figure shows that the data is
446: best modeled if the cutoff in $n$\ is a bit larger than the LL2003
447: criterion of $n>35$\ to select clusters. This is probably the effect of
448: having a sample with an inhomogeneous magnitude limit so that $n>35$\ 
449: becomes only a lower limit to the actual cutoff.
450: 
451: % INSERT HERE LADA & LADA
452: \begin{figure}
453:    \epsscale{1.10}
454:    \plotone{f2.eps}
455:     \caption{\label{clusterIMF}
456:         The Lada and Lada (2003) spectrum of masses of embedded
457:         clusters together with the results of Monte Carlo
458:         simulations in which we sample the stellar IMF
459:         with a number probability distribution $\beta={-1.8}$,
460: 	top row; $\beta={-2.0}$, middle row; $\beta={-2.2}$, bottom row.
461:         For $n\geq35$, left column; and $n\geq70$, right column. 
462:            }
463: \end{figure}
464: 
465: \subsection{The statistics of the maximal mass star in clusters}
466: \label{compare}
467: 
468: 
469: WK2006 argued that their observed correlation between the maximal star mass
470: and the total mass of clusters is not consistent with the hypothesis that
471: clusters are formed by random sampling of a universal stellar IMF.
472: WK2006's sample of clusters is strongly affected by a
473: size-of-sample effect (see Appendix~\ref{ap:sizeofsample}).
474: Because of the impracticality of finding a large ensemble of
475: small clusters and thus avoiding the problems introduced by the
476: size-of-sample effect, WK2006 performed Monte Carlo experiments to determine
477: the statistical properties of each of their  three
478: sampling methods. We were puzzled by their Figure~3, which shows that
479: for their random sampling method (that corresponds to our Monte Carlo models),
480: the curves of maximal mass star versus cluster mass ($M_{ecl}$) have two maxima
481: in the range between  $\sim 25M_{\odot}$\ and $\sim 250M_{\odot}$. For example, for $M_{ecl}=100M_\odot$\ the curve peaks
482: at $\sim10~M_\odot$~and then again at $\sim100~M_\odot$.  Since it is precisely in this mass range that the model curve departs most
483: strongly from the data points, we thought that the double peaks could be the result of a computational error.
484: We therefore decided to repeat the calculations using our independent algorithm
485: to build the bivariate (maximal stellar mass -- cluster mass) probability
486: distribution. The results of our simulations are shown in Figure~\ref{bivariate}, where,
487: much to our surprise, we reproduce the double peaks obtained by WK2006!
488: 
489: Figure~3 shows the results of many MC experiments of the random sampling model
490: from which we have calculated the bivariate probability distribution to have a
491: cluster in a given log-mass bin of width 0.5, with a star of maximal mass in a
492: log-mass bin of width 0.1. Lighter areas correspond to a higher probability.
493: Figure~3a corresponds to MC experiments in which anything is considered a cluster, even
494: systems with $n=1$. Figure~3b considers only clusters with $n>50$. The vertical line
495: in Figure~3a marks the position of $\log M = 1.8$. Notice that as one moves from
496: bottom to top along this line one will cross contour levels that at first increase
497: until the bivariate distribution reaches a maximum at $\log m^*_{max}\approx 0.9$.
498: Continuing upward, one reaches a minimum at
499: $\log m^*_{max}\approx 1.4$, after which the distribution increases again, reaching a maximum
500: at the point at which all the cluster mass is in a single star.
501: Notice that this double (local) maxima feature comes from the nature
502: of the probability distribution of star masses conditioned to cluster
503: mass (Fig.~\ref{fig:conditionalP}, see below).
504: 
505: Interestingly, the clusters from the compilation
506: of \cite{weidner2006} (crosses in Fig.~\ref{bivariate}) all concentrate in the ridge of the
507: distribution defined by the first (lower mass) peak described above and do not cover
508: the full mass range allowed by our models. To a lesser extent this is also true of the models by
509: \cite{weidner2006}, whose Figures~4 and 5 show the data to have a much smaller
510: dispersion around the mean than the models. Nevertheless, the random sampling method
511: deviates most from the data due to its ``double peaked'' mass distribution.
512: With our preferred random sampling method, in clusters with masses in the stellar
513: mass range the most massive stars can have masses similar to the total cluster mass.
514: Moreover, the conditional probability distribution of stellar masses, $P(m|M)$,
515: for clusters in the stellar mass range shows that some clusters can be dominated
516: by one or at most a few high mass stars (Fig~\ref{fig:conditionalP}).
517: This increase in the probability distribution of stellar masses near the cluster mass
518: is also visible in Figure~2 of \cite{durisen2001}, so the critical
519: questions are whether this effect is real, and if so, whether it is significant.
520: The fact that the effect is seen in three independent investigations argues strongly
521: for the reality of the peak in the random sampling model, but
522: its significance is debatable. On the one hand, the effect arises
523: from partitioning the data into mass bins, and it is forced into existence
524: by the need to satisfy Equation~\ref{conditionalP}, so it is of no physical significance.
525: On the other hand, it biases (toward large values)
526: the mean maximal stellar mass versus cluster mass curve used
527: by \cite{weidner2006} to falsify the random sampling model, so it is
528: highly significant.
529: 
Is this a real violation of our \emph{null hypothesis}, signaling the presence of
some interesting physical effect, or is it the effect of improper data, improper analysis,
or both?  Although the conclusions of WK2006 are based on a very limited data set\footnote{For example,
their favored sorted sampling model predicts no single-star clusters at all, while
\citet{dewit2004, dewit2005} find truly isolated massive stars of spectral
types ranging between O5 and O9 \citep[see also][]{zinnecker2007}. None of these ``single star clusters'' are included in WK2006.},
it is unlikely that this alone can explain the difference between the distribution of
the data points and that predicted by the models: the number of clusters in each mass
bin is small, but the total sample is not that small, and at all masses the data
points delineate the lowest maximal stellar mass \emph{ridge} of the distribution.
540: 
541: \begin{figure}
542:    \epsscale{1.2}
543:     \plotone{f3a.eps}
544:     \plotone{f3b.eps}
545:     \caption{\label{bivariate}
	(a, top) The bivariate probability distribution function of
	maximum stellar mass and cluster mass. This panel shows the result for a cluster $P(n)\sim 1/n^2$
	starting at $n=1$. The grey levels correspond to the bivariate probability of finding
	a cluster in a log-mass bin of size $0.5\times0.1$. The overlaid points are the data
	of \cite{weidner2006} (WK2006), with a few additions from Weidner (2007). The white circle represents WR20a
	taken as a system. The vertical line is drawn at $\log M = 1.8$.
	(b, bottom) Same as (a) but with $n>50$. For details see main text.
554: }
555: \end{figure}
556: 
557: 
558: \begin{figure}[t]
559:    \epsscale{1.2}
560:     \plotone{f4.eps}
561:     \caption{\label{fig:conditionalP}
        Conditional probability distributions of stellar masses conditioned
        on cluster mass. The topmost curve corresponds to $P(m)$, where the
	cluster mass has been marginalized. The other curves correspond, from top to bottom,
	to $P(m|M)$ for $\log M$ equal to $-1.25$, $-0.75$, $-0.25$, 0.25, 0.75, 1.25, 1.75, 2.25,
	2.75, 3.25, 3.75, and 4.25, respectively.
567:     }
568: \end{figure}
569: 
570: One possible explanation could come from the highly hierarchical nature of
571: young stellar systems and the somewhat arbitrary way in which the parent
object of the maximal star is chosen. For example, the cluster Westerlund~2 hosts the well-known
massive binary WR20a, both components of which have masses $\sim 80 M_\odot$\ \citep{rauw2004, rauw2005}.
Considering that the ratio of the separation of its components to the cluster size is smaller
than the ratio of the cluster size to that of the Milky Way, there is no objective reason
not to plot such a binary star as a single point in Figure~\ref{bivariate}, in which
case we would have a data point in the area devoid of points in that Figure. An objective
algorithm to identify clusters with different numbers of stars is needed. A step in that
direction is the work of \cite{oey2004}, who used the algorithm of \cite{battinelli1991} to
identify groupings of stars and showed that the distribution of clusters by number of stars
is consistent with $n^{-2}$\ down to single stars. What is needed then is to determine
582: the maximal mass star and total mass of those clusters to build the bivariate probability
583: distribution.
584: 
Finally, another explanation is that these data do indeed falsify our \emph{null hypothesis} and
that they signal the presence of some interesting physical effect that makes
the mass of the most massive star depend on the mass of its parent cluster.
One possibility could be that stars form in an ordered fashion, with less massive stars
forming first. Once a high-mass star is formed, it clears the cluster
of its placental material and the cluster rapidly disintegrates \citep{elmegreen1983}. This is the explanation
favoured by WK2006.  But there is another possibility: clusters whose maximal star is
too massive can be more gravitationally unstable. A cluster with a maximal star of very
large mass is also characterized by a smaller total number of stars, and its conditional stellar mass function
is flatter (see Figure~\ref{fig:conditionalP}). \cite{terlevich1987} found that a flatter
mass spectrum results in a considerably smaller half-life: her model XV evolves an order
of magnitude faster than a cluster with a normal Salpeter IMF. This explanation has
the feature that it does not violate the \emph{null hypothesis,} because clusters are
formed according to the random sampling model but their lifetimes, and thus the probability
of observing them, depend on the maximal stellar mass in them.
600: 
601: \section{Discussion}
602: \label{disc}
603: 
As mentioned in the Introduction, our results depart radically from those of KW2003.
Their conclusion that the field built by clusters must have a steeper stellar IMF
can be criticized on several counts.
To begin, as noted by \citet{elmegreen2006}, for the observed range
of cluster masses the predicted steepening in the stellar IMF is rather small, going
from $\Gamma = -1.35$\ in clusters
to $\Gamma \sim -1.5$~in the field (Figure~2 in KW2003).
611: This is well within the observed variance in the
612: stellar IMF of clusters, and can be comfortably ascribed
613: to systematic errors in the determination of the slope \citep[see e.g.][]{kroupa2002}.
614: 
A second problem is that KW2003 implicitly assume that the cluster mass spectrum
is well determined down to masses comparable to those of the
smallest stars and is approximately given by
$\xi_{cl}\sim M^{-2}$\ for $5 \leq M/M_\odot \leq 10^7$.
As mentioned in the previous section, however, LL2003 claim that the cluster mass spectrum turns
abruptly down for masses below $M \sim50M_\odot$, that is, it appears to have a
preferred mass-scale. This reduces the number of low-mass
clusters and thus the predicted difference between
the stellar IMF of clusters and that of the field.
624: Almost paradoxically, our model invoking a universal stellar IMF
625: shows that this preferred mass-scale is most likely not a physically significant
626: feature of cluster formation!
627: 
628: Another problem is that KW2003 use a Procrustean approach in their modeling of clusters
629: where all clusters are forced to have one \emph{maximal mass star}, that
630: is, a star of the maximum mass allowed by the stellar IMF.  This assumption,
631: \begin{eqnarray}
632: \label{mmax}
633: 1 = \int_{m_{max*}}^{m_u} f(m)dm,
634: \end{eqnarray}
forces the upper mass cut-off of the IMF to be an increasing function of cluster mass,
varying as $M^{1/x}$. From the discussion leading to Equation~\ref{conditionalP} it is clear
that we cannot recover the input IMF unless we include in our sums clusters for which $m=M$.
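
The $M^{1/x}$ scaling can be illustrated with a short calculation. As a sketch only, assume a pure power-law IMF, $f(m)=k\,m^{-(1+x)}$ with $x>1$, whose normalization $k$ is approximately proportional to the cluster mass $M$, and take $m_{max*}\ll m_u$ (the constant $k$ and these simplifications are assumptions made here for illustration). Equation~\ref{mmax} then gives
\begin{displaymath}
1 = k\int_{m_{max*}}^{m_u} m^{-(1+x)}\,dm
  = {k\over x}\left(m_{max*}^{-x}-m_u^{-x}\right)
  \approx {k\over x}\,m_{max*}^{-x},
\end{displaymath}
so that $m_{max*}\approx (k/x)^{1/x}\propto M^{1/x}$ under these assumptions.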
638: 
639: \citet{elmegreen2006} expanded the formalism of Vanbeveren and showed analytically and
640: numerically that for a power-law mass distribution of clusters of slope $\beta\leq 2$,
641: the summed IMF  of a population of clusters  is indistinguishable from  the cluster IMF.
642: However, the functional form of the conditional probability $P(m|M)$ adopted by Elmegreen
643: {\it does not} satisfy Equation~\ref{conditionalP} for any value of $\beta$; it just happens
644: that for $\beta\leq 2$ the summed IMF is {\it almost} the same as the cluster IMF (they differ
by a logarithmic multiplicative factor). This clearly shows that the finding of KW2003,
that for $\beta>2$ the summed IMF becomes significantly steeper than the individual cluster IMF,
arises from the assumption that, apart from a normalization factor that depends on the cluster mass
through Equation~\ref{mmax}, $P(m)$ and $P(m|M)$\ have the same functional form (simple power laws
of the same slope in the case of Elmegreen).  While this is a very good assumption for very massive
clusters, it clearly does not apply to clusters in the stellar mass range, which are the ones responsible
for ``tilting'' the summed IMF for $\beta>2$. Thus, even within the Vanbeveren formalism the IMF of clusters
and the field can be strictly the same for any cluster mass distribution.
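
The origin of the logarithmic factor, and of the steepening for $\beta>2$, can be illustrated with a simplified calculation. Assume, for this sketch only, that the number of stars of mass $m$ contributed by a cluster of mass $M$ is proportional to $M\,m^{-(1+x)}$ for $m\leq m_{max*}(M)$ and zero otherwise, with $m_{max*}\propto M^{1/x}$, and that the cluster mass function is $\xi_{cl}\propto M^{-\beta}$ up to some largest cluster mass $M_{up}$. Only clusters more massive than $M_{min}(m)\propto m^{x}$ then contribute stars of mass $m$, and the summed IMF is
\begin{displaymath}
f_{sum}(m) \propto m^{-(1+x)}\int_{M_{min}(m)}^{M_{up}} M^{1-\beta}\,dM .
\end{displaymath}
For $\beta=2$ the integral equals $\ln[M_{up}/M_{min}(m)]$, a slowly varying logarithmic factor. For $\beta<2$ it is dominated by the upper limit and is nearly independent of $m$. For $\beta>2$ it is dominated by the lower limit and scales as $m^{-x(\beta-2)}$, which steepens the summed IMF. The symbols $f_{sum}$, $M_{min}$, and $M_{up}$ are introduced here only for this illustration.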
653: 
The results of WK2006 are confirmed in this work in the sense that we also find that
real clusters cover a significantly smaller part of the maximal stellar mass and cluster mass
space than permitted by the random sampling model. WK2006 then go on to modify their sampling
algorithm in such a way as to reduce the probability of having very large mass stars in their
clusters, arguing that \emph{``Star clusters appear to form in an ordered fashion, starting
with the lowest-mass stars until feedback is able to outweigh the gravitationally
induced formation process.''} Although this is a possible scenario, we favour a different
one in which the area of the bivariate distribution allowed by the random sampling
model is rendered unstable by the extreme nature of the mass spectrum therein,
in accordance with the simulations of \cite{terlevich1987}.
How much ``trimming'' of the bivariate distribution can actually be accomplished this
way will be the subject of future work.
666: 
667: Our strong conclusion is that the observations of the stellar IMF and the mass spectrum of young 
668: clusters are consistent with the hypothesis that clusters form by random sampling of a universal 
669: stellar IMF.  This conclusion leads us to challenge the received hypothesis that clusters are the fundamental 
building blocks of the stellar populations in galaxies \citep{dewit2004,dewit2005}.  In this view clusters are
given an independent existence from before the time that stars form. In our view, following the ideas of
Elmegreen (1997), stars form in giant molecular
clouds in a hierarchy of structures with different numbers
and masses.  Some of these structures end up forming large clusters
(which will later dissolve) and some do not: they become part of small associations of stars formed in neighboring
regions almost by chance. In fact, the observations of \citet{dewit2004, dewit2005}, which
are used as standard references for the view that clusters form first, find that 30\% of young massive Galactic
field stars are not members of clusters or OB associations. Of these, about 50\% are runaway
star \emph{candidates}. They conclude that $4\pm2\%$\ of the stars in their sample result from
680: truly isolated high-mass star formation, a number that 
681: can be reproduced \emph{``assuming that all stars are formed
682: in clusters that follow a universal cluster distribution
683: (by $N_*$) with slope $\beta\sim-1.7$\ down to clusters
684: with a single member.''} If we consider single star clusters
685: the statement that \emph{all stars are formed in clusters}
686: becomes a tautology.  
687: 
688: Our results hint at a {\it strong universality} hypothesis for the IMF where not only the power-law part, but the full function may be universal.   Clearly this claim has profound implications for understanding how stars form and therefore its foundations require considerably more observational work than was available for the tests presented in this paper.
689: 
690: \section{Summary}
691: \label{summary}
692: 
This paper combines four separate results within a single
unified view. The unification follows from Equation~\ref{eqMC}
(derived formally in the first section of the paper),
which relates the cluster mass function to the probability
distribution of the number of stars in clusters and the (universal)
stellar IMF. These results are:
699: \begin{itemize}
700: 	\item the IMF of a field stellar population and that
701: 	of the clusters out of which the field was built \emph{can} be
702: 	strictly the same. This result contradicts \citet{kroupa2003},
703: 	and was implicit in the models by \citet{larson1982},
704: 	\citet{oey1998, oey2005}, and \citet{dewit2004};
705: 
706: 	\item the observations of the lower end of the
707: 	cluster mass function, as given by \citet{lada2003}, agree
708: 	with the random sampling model presented here if: (a) the
	distribution of clusters by the number of stars they contain
	is a scale-free power law, $n^\beta$, with $\beta$\ between $-1.8$
	and $-2.2$; (b) the stellar IMF is independent of $n$ and is
	given by the Salpeter form;
713: 
714: 	\item the observed special mass scale for cluster formation claimed
715: 	by Lada and Lada arises from the arbitrary cut-off in the
716: 	number of stars imposed by them;
717: 
	\item  the interpretation of the statistics of the most massive star in clusters
	is a valuable tool to study cluster formation processes, as the observations,
	taken at face value, violate the \emph{null hypothesis} represented by the
	random sampling model. Although a proper observational study requires a sample
	including many clusters with only a few members, we believe that the observations
	presented by WK2006 are at the very least compelling. Nevertheless, it is argued in the
	present work that the discrepancy is due to systems that are rendered gravitationally
	unstable by the presence of one or more very massive stars.
726: \end{itemize}
727: 
728: \acknowledgments 
729: We would like to thank our anonymous referee whose
730: comments helped us to improve this work.
731: 
732: \appendix
733: 
734: \section{Size-of-sample effect}
735: \label{ap:sizeofsample}
736: 
The size-of-sample effect arises in many areas of astronomy,
and we have studied it in the context of the distribution
of sizes of super-associations in galaxies (Selman and Melnick, 2000).
The formalism developed there can be translated
\emph{mutatis mutandis} to the present context as follows.
Let the whole set of star clusters to be analyzed be denoted by ${\cal C} = \{C_i\}_{i=1}^N$,
where $C_i$\ is the $i$-th cluster with mass  $M_i$\ and maximum
stellar mass $m^{max*}_i$.
We will assume ${\cal C}$\ to be ordered from the most massive to the least massive cluster,
that is,  $i<j \Rightarrow M_i > M_j$.  Let $M_l=\sum_{i=1}^{l} M_i/l$\ be the
average mass of the $l$\ most massive clusters. From ${\cal C}$\ we will draw $N_S$\
sub-samples, ${\cal S}_j$, of equal mass, $M_l$, defined by
749: \begin{displaymath}
750: {\cal S}_j = \{C_i\}_{i=j}^{n_j},
751: \end{displaymath}
where $n_j$\ is defined by the expression
\begin{displaymath}
\sum_{i=j}^{n_j} M_i \approx M_l.
\end{displaymath}
Thus, the sub-sample ${\cal S}_j$\ contains the cluster $C_j$\ and the next
$n_j-j$\ less massive clusters, enough to add up to a total mass approximately equal to $M_l$.
We will assign to each sub-sample~$j$\ two numbers,
$\tilde m^{max*}_j$ and $\tilde M^{avg}_j$, defined as
\begin{eqnarray}
\tilde m^{max*}_j &=& \max_{C_i\in {\cal S}_j} m^{max*}_i,\nonumber\\
\tilde M^{avg}_j &=& {1\over{n_{j}-j+1}}\sum_{i=j}^{n_j}M_i.\nonumber
\end{eqnarray}
$\tilde m^{max*}_j$\ is the maximum stellar mass among all the members of ${\cal S}_j$,
and $\tilde M^{avg}_j$\ is their average cluster mass. We will refer to sub-sample~$j$
as ``super-cluster''~$j$.
Because all ``super-clusters'' thus defined
have approximately equal total mass ($\approx M_l$), we can compare
the masses of their maximal stars without incurring a size-of-sample effect.
769: 
770: 
771: % INSERT HERE SIZE OF SAMPLE
772: \begin{figure}
773:    \epsscale{.80}
774:    \plotone{f5.eps}
775:    \caption{\label{WK2006subsample}
        The maximum stellar mass in a sub-sample of the
        data of WK2006 analyzed as detailed in the main
        text. The x-axis shows the average mass of the clusters
	in the 800~M$_\odot$~``super-clusters'',
        while the y-axis shows the mass of the maximal mass
        star in each super-cluster.
782:            }
783: \end{figure}
784: 
Regrettably, the data set and the method of analysis used by WK2006 are far from what is needed for this
kind of analysis. Their Table~1 lists 17 clusters with
masses ranging from $25 M_\odot$~to $10^5 M_\odot$. We have seen that we should actually work with the
number of stars instead of the mass of the clusters, as conditioning on cluster mass introduces
unphysical mass scales. If we use an average stellar mass of $0.3M_\odot$\ then the smallest
mass cluster corresponds to a cluster with $\approx$~75 stars while the largest mass
cluster corresponds to $\approx3\times10^5$\ stars.
The sample should include at least 4000 clusters with 75 stars
to meaningfully compare the maximal mass of this artificial ``super-cluster'' with the maximal
mass of a cluster of $10^5 M_\odot$ (such as R136).    Figure~\ref{WK2006subsample}  plots the maximal stellar mass against
the average cluster mass for the best sub-sample of ``super-clusters'' that can be constructed from the data of WK2006 using the algorithm described above. This sample consists of NGC~6530 as the first ``super-cluster'';
NGC~2264, Mon R2, and $\sigma$~Ori in the second;   Mon R2, $\sigma$~Ori, NGC~2024, and IC~348 in the third; and $\sigma$~Ori, NGC~2024, IC~348, $\rho$~Oph, NGC~1333,
Ser SVS2, and Taurus-Auriga in the fourth. Each of these four ``super-clusters'' has a
total mass of approximately 800~M$_\odot$.  The ``best-set'' shows no correlation between maximal star mass and cluster richness: the claimed correlation was a size-of-sample effect. (It is possible to construct other
sub-samples having  a more massive star in the upper mass bin, but these sub-samples contain only  2 or  3 super-clusters.)
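
The figure of 4000 clusters quoted above follows from the star counts adopted in this Appendix. As a rough check, taking $\approx75$ stars for the smallest cluster and $\approx3\times10^5$ stars for the largest,
\begin{displaymath}
N_{cl} \approx {3\times10^{5}\over 75} = 4000,
\end{displaymath}
where $N_{cl}$ (a symbol introduced here only for this check) is the number of 75-star clusters whose combined population matches that of a single cluster of $\approx3\times10^5$ stars.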
800: 
801: \bibliography{ms}
802: 
803: \clearpage
804: \end{document}
805: 
806: %__oOo__
807: