\begin{abstract}

We consider a service system model primarily
motivated by the problem of efficiently assigning virtual machines to physical host machines in
a network cloud, so that the number of occupied hosts is minimized.

There are multiple types of arriving
customers, where a customer's mean service time depends
on its type.
There is an infinite number of servers.
Multiple customers can be placed for service into
one server, subject to general ``packing'' constraints.
Service times of different customers are independent, even if they are served simultaneously
by the same server.
Each arriving customer is placed for service immediately, either
into a server already serving other customers
(as long as packing constraints are not violated)
or into an idle server.
Upon service completion, each customer leaves its server and the system.

We propose an extremely simple and easily implementable
customer placement algorithm, called {\em Greedy-Random} (GRAND).
It places each arriving customer uniformly at random into either one of the already occupied servers (subject to packing constraints)
or one of the so-called {\em zero-servers}, which are empty servers designated to be available to new arrivals.
One instance of GRAND, called GRAND($aZ$), where $a\ge 0$ is a parameter, is such that the number of zero-servers
at any given time $t$ is $aZ(t)$, where $Z(t)$ is the current total number of customers in the system.
We prove that GRAND($aZ$) with $a>0$
is asymptotically optimal, as the customer arrival rates grow to infinity and $a\to 0$,
in the sense of minimizing the total number of occupied servers in steady state.
In addition, we study various versions of GRAND by simulation
and observe how the convergence speed and steady-state performance depend
on the number of zero-servers.

\end{abstract}

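
As an informal illustration of the GRAND($aZ$) placement rule described in the abstract, the following sketch may be helpful. The notation $N_i(t)$ and $X_0(t)$ is introduced here only for this illustration, and any integer rounding of $aZ(t)$ is deliberately left unspecified. Suppose a type-$i$ customer arrives at time $t$. Let $N_i(t)$ denote the number of occupied servers that can accept this customer without violating the packing constraints, and let $X_0(t)=aZ(t)$ denote the number of zero-servers. Assuming $N_i(t)+X_0(t)>0$, the customer is placed uniformly at random among these $N_i(t)+X_0(t)$ candidate servers, so that
\[
\Pr\{\mbox{placed into some zero-server}\} = \frac{X_0(t)}{N_i(t)+X_0(t)},
\qquad
\Pr\{\mbox{placed into a given occupied candidate}\} = \frac{1}{N_i(t)+X_0(t)}.
\]
Under these assumptions, for a fixed system state a smaller value of $a$ makes it less likely that an arrival opens a previously empty server.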