cs0703019/MST.tex
\documentclass[letterpaper, 11pt]{article}
2: 
3: \usepackage{geometry}
4: %\geometry{letterpaper,tmargin=2.5cm,bmargin=2.5cm,lmargin=2.5cm,rmargin=2.5cm}%
5: 
6: \usepackage{amsmath, amsthm, amssymb, verbatim}
7: \usepackage{epsfig}
8: \usepackage{graphics}
9: 
10: \newtheorem{proposition}{Proposition}
11: \newtheorem{cor}{Corollary}
12: \newtheorem{theorem}{Theorem}
13: \newtheorem{lemma}{Lemma}
14: \newtheorem{claim}{Claim}
15: 
16: \sloppy
17: \renewcommand{\baselinestretch}{0.95}
18: \newcommand{\StackMST}{\textsc{StackMST} }
19: \newcommand{\StackMSTnospace}{\textsc{StackMST}}
20: \newcommand{\scover}{\textsc{SetCover} }
21: \newcommand{\scovernospace}{\textsc{SetCover}}
22: \newcommand{\vcover}{\textsc{VertexCover} }
23: \newcommand{\ds}{\displaystyle}
24: \newcommand{\OPT}{\mathrm{OPT}}
25: \newcommand{\OPTVC}{\mathrm{OPT}_{\mathrm{VC}}}
26: \newcommand{\IP}{\mathrm{IP}}
27: \newcommand{\LP}{\mathrm{LP}}
28: 
29: \begin{document}
30: 
31: \title{The Stackelberg Minimum Spanning Tree Game%
32: \thanks{A preliminary version of this article appeared in the Proceedings of 
33: the 10th Workshop on Algorithms and Data Structures (WADS 2007), see~\cite{CDFJLNW07}.
34: This work was partially supported by the {\em Actions de Recherche Concert\'ees (ARC)\,} 
35: fund of the {\em Communaut\'e fran\c{c}aise de Belgique}.}}
36: 
37: \author{
38: Jean Cardinal\thanks{Universit\'e Libre de Bruxelles, D\'epartement d'Informatique, c.p.~212, B-1050 Brussels, Belgium, jcardin@ulb.ac.be.}
39: \and 
40: Erik D. Demaine\thanks{MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA, edemaine@mit.edu.}
41: \and
42: Samuel Fiorini\thanks{Universit\'e Libre de Bruxelles, D\'epartement de Math\'ematique, c.p.~216,  B-1050 Brussels, Belgium,  sfiorini@ulb.ac.be.}
43: \and 
44: Gwena\"el Joret\thanks{Universit\'e Libre de Bruxelles, D\'epartement d'Informatique, c.p.~212,  B-1050 Brussels, Belgium, gjoret@ulb.ac.be. G. Joret is a 
45: Postdoctoral Researcher of the Fonds National de la Recherche Scientifique (F.R.S.--FNRS).}
46: \and
47: Stefan Langerman\thanks{Universit\'e Libre de Bruxelles, D\'epartement d'Informatique, c.p.~212, B-1050 Brussels, Belgium, slanger@ulb.ac.be. S. Langerman is a Research Associate of the Fonds National de la Recherche Scientifique (F.R.S.--FNRS).}
48: \and
49: Ilan Newman\thanks{Department of Computer Science, University of Haifa, Haifa 31905, Israel, ilan@cs.haifa.ac.il.} 
50: \and 
51: Oren Weimann\thanks{MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA, oweimann@mit.edu.}
52: }
53: 
54: \date{}
55: 
56: \pagestyle{plain} \maketitle
57: 
58: \begin{abstract}
59: 
60: We consider a one-round two-player network pricing game,
61: the {\em Stackelberg Minimum Spanning Tree\/} game or \StackMSTnospace.
62: 
63: The game is played on a graph (representing a network),
64: whose edges are colored either red or blue,
65: and where the red edges have a given fixed cost
66: (representing the competitor's prices).
67: The first player chooses an assignment of prices to the blue edges,
and the second player then buys the cheapest possible spanning tree,
69: using any combination of red and blue edges. The goal of the first player is to 
70: maximize the total price of purchased blue edges. 
71: This game is the minimum spanning tree analog of the well-studied
72: Stackelberg shortest-path game.
73: 
74: {}\quad
75: We analyze the complexity and approximability of the first player's
76: best strategy in \StackMSTnospace.
77: In particular, we prove that the problem is APX-hard even if there are
78: only two different red costs, and give an approximation algorithm whose
79: approximation ratio is at most $\min \{k,1+\ln b,1+\ln W\}$,
80: where $k$ is the number of distinct red costs, $b$ is the number of
81: blue edges, and $W$ is the maximum ratio between red costs.
82: We also give a natural integer linear programming formulation of the problem,
83: and show that the integrality gap of the fractional relaxation
84: asymptotically matches the approximation guarantee of our algorithm.
85: \end{abstract}
86: 
87: \section{Introduction}
88: 
89: Suppose that you work for a networking company that owns many
90: point-to-point connections between several locations,
91: and your job is to sell these connections.  A customer
92: wants to construct a network connecting all pairs of locations in the
93: form of a spanning tree. The customer can buy connections that you are
94: selling, but can also buy connections offered by your competitors.
95: The customer will always buy the cheapest possible spanning tree.
96: Your company has researched the price of each connection
97: offered by the competitors.
98: The problem considered in this paper is how to set the price of each
99: of your connections in order to maximize your revenue, that is,
100: the sum of the prices of the connections that the customer buys from you.
101: 
102: This problem can be cast as a {\em Stackelberg game}, a type of two-player
103: game introduced by the German economist Heinrich Freiherr von
104: Stackelberg~\cite{Stack34}. In a Stackelberg game, there are two players:
105: the {\em leader\,} moves first, then the {\em follower\,} moves,
106: and then the game is over.
107: The follower thus optimizes its own objective function, knowing the
108: leader's move. The leader has to optimize its own objective function
109: by anticipating the optimal response of the follower.
110: In the scenario depicted in the preceding paragraph,
111: you were the leader and the customer was the follower:
112: you decided how to set the prices for the connections that you own,
113: and then the customer selected a minimum spanning tree.
In this situation, there is an obvious tradeoff: the
leader should not put too high a price on the connections, since otherwise
the customer will not buy them, but on the other hand the leader needs to
put sufficiently high prices to optimize revenue.
118: 
119: Formally, the problem we consider is defined as follows.
120: We are given an undirected graph\footnote{All graphs in this paper are finite 
121: and may have loops and multiple edges.} 
122: $G=(V,E)$ whose edge set
123: is partitioned into a {\em red edge set\,} $R$ and a {\em blue edge
124: set\,} $B$. Each red edge $e \in R$ has a nonnegative fixed {\em cost\,}
125: $c(e)$ (the best competitor's price).
126: The leader owns every blue edge $e \in B$ and has to set a
127: {\em price\,} $p(e)$ for each of these edges. The cost function $c$ and
128: price function $p$ together define a {\em weight\,} function $w$ on
129: the whole edge set. By ``weight of edge $e$'' we mean either
130: ``cost of edge $e$'' if $e$ is red or ``price of edge $e$'' if
131: $e$ is blue. A spanning tree $T$ is a {\em minimum
132: spanning tree\,} (MST) if its {\em total weight}
133: %
134: \begin{equation}
135: \sum_{e \in E(T)} w(e) = \sum_{e \in E(T) \cap R} c(e)
136: + \sum_{e \in E(T) \cap B} p(e)
137: \end{equation}
138: %
139: is minimum. The {\em revenue\,} of $T$ is then
140: \begin{equation}
141: \sum_{e \in E(T) \cap B} p(e).
142: \end{equation}
143: %
144: The Stackelberg Minimum Spanning Tree problem, \StackMSTnospace, asks for a
145: price function $p$ that maximizes the revenue of an MST.
146: Throughout, we assume that the graph contains a spanning tree
147: whose edges are all red; otherwise, there is a cut consisting only of
148: blue edges and the optimum value is unbounded. Moreover, to
149: avoid being distracted by epsilons, we assume that among all edges
150: of the same weight, blue edges are always preferred to red edges;
151: this is a standard assumption. As a consequence, all minimum spanning trees
152: for a given price function $p$ have the same revenue;
153: see Section \ref{prelim} for details.
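
\medskip
\noindent
As a small illustration (a toy instance of our own, introduced only to fix ideas, with $\mathrm{rev}(p)$ as ad hoc notation for the revenue), let $G$ be the triangle on vertices $x,y,z$ with red edges $xy$ and $yz$ of costs $c(xy)=1$ and $c(yz)=3$, and a single blue edge $xz$. The red edges form a spanning tree, and the only cycle containing $xz$ also contains the red edge $yz$ of cost $3$. Since blue edges win ties, the follower buys $xz$ exactly when $p(xz)\le 3$, so
\begin{equation*}
\mathrm{rev}(p) =
\begin{cases}
p(xz) & \text{if } p(xz) \le 3,\\
0 & \text{if } p(xz) > 3,
\end{cases}
\end{equation*}
and the optimal price is $p(xz)=3$, with revenue $3$.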
154: 
155: \paragraph{Related work.}
156: 
157: A similar pricing problem, where one wants to price the edges in $B$ and
158: the customer wants to construct a shortest path between two vertices instead 
159: of a spanning tree, has been studied in the literature; see van Hoesel \cite{vH06} for a
160: survey. Complexity and approximability results have recently been obtained
161: by Roch, Savard and Marcotte~\cite{RSM05}, and by 
162: Bouhtou, Grigoriev,  van Hoesel, van der Kraaij, Spieksma, 
163: and Uetz~\cite{BGvHvdKSU07}: the problem is strongly NP-hard
164: and $O(\log |B|)$-approximable. A generalization of the problem
165: to more than one customer has been tackled using mathematical
166: programming tools, in particular bilevel programming;
167: see Labb\'e, Marcotte, and Savard \cite{LMS98}.
168: This generalization was motivated by the problem of setting tolls on
169: highway networks. Note that the \StackMST problem is only 
170: interesting in the single-customer case, since otherwise all customers
171: purchase the same tree. Cardinal, Labb\'e, Langerman, and Palop~\cite{CLLP05}
172: give a geometric version of the shortest path problem.
173: 
174: Recently, part of the results of the current paper have been generalized to other problems by Briest, 
175: Hoefer and Krysta~\cite{BHK08}. They also exhibit a polynomial-time algorithm for a special case of a 
176: Stackelberg vertex cover problem, in which the follower's problem is to find a minimum vertex cover in a
177: bipartite graph.
178: 
179: Other pricing problems have been studied, in which 
180: the goal is to find the best prices for a set of items, after bidders have announced their
181: preferences in the form of subset valuations. {\em Envy-free} pricing, in particular, 
182: can be viewed as a simple Stackelberg game.
183: APX-hardness and approximability of such problems have been established 
by Hartline and Koltun~\cite{HK05}, and by Guruswami, Hartline, Karlin, Kempe, Kenyon, 
185: and McSherry~\cite{GHKKKS05}. Balcan and Blum~\cite{BB06} gave improved approximation results.
186: Approximability within a logarithmic factor has also been recently established for more general cases by 
187: Balcan, Blum and Mansour~\cite{BBM08}. The case in which items are edges of a graph has been 
188: studied by Grigoriev, van Loon, Sitters and Uetz~\cite{GLSU06}, and Briest and Krysta~\cite{BK06}.
189: A semi-logarithmic inapproximability result for a special case of the unlimited supply pricing problem
190: has been given by Demaine, Feige, Hajiaghayi, and Salavatipour~\cite{DFHStoappear}.
191: 
192: \paragraph{Our results.}
193: 
194: We analyze the complexity and approximability of the
195: \StackMST problem.  Specifically, we prove the following:
196: %
197: \begin{enumerate}
198: \item \StackMST is APX-hard, even if there are only two red costs,
199:   $1$ and~$2$ (Section~\ref{Hard}).
200:   This result is also the first NP-hardness proof for this problem, and, to our knowledge, the first APX-hardness proof 
201:   for a Stackelberg pricing game with a single customer. The reduction is from \scovernospace.
202: \item \StackMST is $O(\log n)$-approximable, and is
203:   $O(1)$-approximable when the red costs either fall in a
204:   constant-size range or have a constant number of distinct
205:   values (Section~\ref{sec-best}). More precisely, we analyze
206:   the following simple approximation algorithm, called
207:   \emph{Best-out-of-$k$}: for all $i$ between $1$ and~$k$,
208:   consider the price function for which all blue edges have
209:   price~$c_i$, and output the best of these $k$ price functions.
210:   Here, and throughout the paper, $c_i$ denotes the $i$th
211:   smallest cost of a red edge and $k$ the number of distinct red
212:   costs. We prove that the approximation ratio of this algorithm
213:   is bounded above by $\min \{k , 1+\ln b , 1+\ln (c_k/c_1)\}$,
214:   where $b := |B|$ is the number of blue edges.
215: \item The integrality gap of a natural integer linear programming formulation
216: asymptotically matches the approximation guarantee of
217:   Best-out-of-$k$ (Section~\ref{LP}). Thus, effectively, any
218:   approximation algorithm based on the linear programming
219:   relaxation of our integer program (or any weaker relaxation)
220:   cannot do better than Best-out-of-$k$. Of course, this result
221:   does not imply that Best-out-of-$k$ is optimal. In fact, a
222:   central open question about \StackMST is to determine if it
223:   admits a constant factor approximation algorithm.
224: \end{enumerate}
225: 
226: \section{Basic Results}
227: \label{prelim}
228: 
229: Before we proceed to our main results, we prove a few basic lemmas
230: about \StackMSTnospace.
231: 
232: We claimed in the introduction that the revenue of the leader depends
233: on the price function $p$ only, and not on the particular MST picked by
234: the follower. To see this, let $w_1 < w_2 < \cdots < w_\ell$ denote
235: the different edge weights. The greedy algorithm (a.k.a.\ Kruskal's
236: algorithm) will work in $\ell$ phases: in its $i$th phase, it will
237: consider all blue edges of price $w_i$ (if any) and then all red edges
238: of cost $w_i$ (if any). The number of blue edges selected in the $i$th
239: phase will not depend on the order in which blue or red edges of weight
240: $w_i$ are considered. This shows the claim. Moreover, if there is no red
241: edge of cost $w_i$ then $p$ is not an optimal price function because the
242: leader can raise the price of every blue edge of price $w_i$ to the next
243: weight $w_{i+1}$ and thus increase his/her revenue. This implies
244: the following lemma.
245: 
246: \begin{lemma}
247: \label{AssignedCostEqualGivenCost}
248: In every optimal price function, the
249: prices assigned to the blue edges appearing in some MST
250: belong to the set $\{c(e) : e \in R\}$. 
251: \end{lemma}
252: 
253: Notice that for optimal price functions, the prices given to the blue 
254: edges that are in no MST do not really matter, as long as they are high enough. 
255: We find it convenient to see them as equaling $\infty$. This has the same
256: effect as deleting those blue edges. A direct consequence of
257: Lemma~\ref{AssignedCostEqualGivenCost} is that the decision version
258: of \StackMST belongs to NP, using some price function $p$ with $p(e)
259: \in \{c(e) : e \in R\} \cup \{\infty\}$ for all $e \in B$
260: as a certificate. Another possibility for a certificate is
261: an acyclic set of blue edges $F$, interpreted as
262: the set of blue edges in any MST. Given $F$, we can easily compute an
263: optimal price function such that $F$ is the set of blue edges in any
264: MST, with the help of Lemma \ref{lemma-cuts} below. In the lemma, we
265: use the notation $\mathcal{C}(B',e)$ for the set of cycles of $G=(V,R \cup B')$ that
266: include the edge $e$, where $B'$ is an acyclic subset of blue edges and $e\in B'$.
267: (Notice that $\mathcal{C}(B',e)$ is nonempty because $(V,R)$ is connected.)
268: 
269: \begin{lemma}
270: \label{lemma-cuts} Consider a price function $p$, a corresponding
271: minimum spanning tree $T$, and let $F= E(T) \cap B$. Then for every
272: $e\in F$, we have
273: %
274: \begin{equation}
275: \label{eq:min-max}
276: p(e) \le \min_{C \in \mathcal{C}(F,e)} \max_{e'\in E(C) \cap R}c(e').
277: \end{equation}
278: %
279: Moreover, whenever $F$ is any acyclic set of blue edges and we set
280: $p(e)$ equal to the right hand side of \eqref{eq:min-max} for $e \in F$
281: and $p(e) = \infty$ for $e \in B - F$, we have $E(T') \cap B = F$ for
282: any corresponding MST~$T'$.
283: \end{lemma}
284: \begin{proof}
285: The first part of the lemma is straightforward. Indeed, if \eqref{eq:min-max}
286: fails for some edge $e \in F$, then there exists a red edge $e'$
287: with $c(e') < p(e)$ that links the two components of $T - e$, and so $T$
288: cannot be an MST. We now turn to the second part of the lemma. First note
289: that $E(T') \cap B$ is clearly contained in $F$ because no MST can use
290: any edge with an infinite price. By contradiction, suppose there is some
291: edge $e$ in $F$ that is not used by $T'$ and let $e'$ be a red edge with
292: maximum cost on the unique cycle of $T' + e$. Because the price function $p$
293: we have chosen satisfies \eqref{eq:min-max} (with equality), the weight
294: of edge $e$ is at most the weight of $e'$, and thus $T'$ is not an MST
295: because of our assumption that blue edges have priority over the red edges
296: of the same weight. 
297: \end{proof}
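
To illustrate the second part of the lemma on a hypothetical instance (ours, not one of the constructions used later), take vertices $a,b,c$, red edges $ab$ and $bc$ of costs $2$ and $5$, a red edge $ac$ of cost $4$, and one blue edge $e$ parallel to the red edge $ac$, with $F=\{e\}$. The set $\mathcal{C}(F,e)$ consists of two cycles: $C_1$, formed by $e$, $ab$ and $bc$, and $C_2$, formed by $e$ and the red edge $ac$. Hence
\begin{equation*}
\min_{C \in \mathcal{C}(F,e)} \max_{e' \in E(C)\cap R} c(e')
= \min\bigl\{\max\{2,5\},\,4\bigr\} = \min\{5,4\} = 4 .
\end{equation*}
With $p(e)=4$, Kruskal's algorithm picks $ab$ and then $e$ (which is preferred to the red edge $ac$ of the same weight), for a revenue of $4$; with any price $p(e)>4$ the red edge $ac$ is picked instead of $e$ and the revenue drops to $0$.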
298: 
It follows from the above lemma that \StackMST is fixed-parameter
300: tractable with respect to the number of blue edges. Indeed, to solve
301: the problem, one could try all acyclic subsets $F$ of $B$, and for
302: each of them put the prices as above (this can easily be done in
303: polynomial time), and finally take the solution yielding the highest
304: revenue.
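
As a rough count supporting this (a sketch only; the bound is not optimized), an acyclic subset of $B$ has at most $|V|-1$ edges, so the number of candidate sets $F$ is at most
\begin{equation*}
\sum_{j=0}^{\min\{b,\,|V|-1\}} \binom{b}{j} \;\le\; 2^{b},
\end{equation*}
where $b=|B|$, and the whole procedure runs in time $2^{b}$ times a polynomial in the size of $G$.
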
305: We conclude this section by stating a useful property satisfied
306: by all optimal solutions of \StackMSTnospace.
307: 
308: \begin{lemma}
309: \label{obstruction}
310: Let $p$ be an optimal price function and $T$ be a corresponding MST.
311: Suppose that there exists a red edge $e$ in $T$ and a blue edge $f$
312: not in $T$ such that $e$ belongs to the unique cycle $C$ in $T + f$.
313: Then there exists a blue edge $f'$ distinct from $f$ in $C$ such that
314: $c(e) < p(f') \le p(f)$.
315: \end{lemma}
316: \begin{proof}
317: The inequality $c(e) < p(f)$ follows from the optimality of $T$ and
318: from our assumption on the priority of blue edges versus red edges
319: of the same weight. If all blue edges $f'$ distinct from $f$ in $C$
320: satisfied $p(f') \le c(e)$ or $p(f) < p(f')$ then by decreasing
321: the price of $f$ by some amount we would be able to find a new
322: price function $p'$ such that $T' = T-e'+f$ is an MST with respect to
323: $p'$, where $e'$ is some red edge on $C$. This contradicts the
324: optimality of $p$ because the revenue of $T'$ is bigger than that
325: of $T$. 
326: \end{proof}
327: 
328: \section{Complexity and Inapproximability}
329: \label{Hard}
330: 
331: By Lemma~\ref{AssignedCostEqualGivenCost}, \StackMST is trivially
332: solved when the cost of every red edge is exactly $1$, i.e.,
333: when $c(e) = 1$ for all $e\in R$.
334: In this section, we show that the problem is APX-hard
335: even when the costs of the red edges are only $1$ and $2$,
336: i.e., when $c(e)\in \{1,2\}$ for all $e\in R$.
337: We start with NP-hardness:
338: 
339: \begin{theorem}
340: \label{Hardness} \StackMST is NP-hard even when $c(e)\in \{1,2\}$
341: for all $e\in R$.
342: \end{theorem}
343: %
344: \begin{proof}
345: We present a reduction from \scover (in its decision version).
346: Let $(\mathcal{U,S})$ and the integer $t$ be an instance
347: of \scovernospace, where $\mathcal{U}=\{u_1,u_2,\ldots,u_n\}$,
348: and $\mathcal{S} = \{S_1,S_2,\ldots,S_m\}$. Without loss of
349: generality, we assume that $u_n \in S_i$ for every $i=1,2,\ldots,m$
350: (we can always add one element to $\mathcal{U}$ and to every $S_i$
351: to make sure this holds).
352: 
353: We construct a graph $G=(V,E)$ with edge set $E= R \cup B$ and a cost
354: function $c : R \rightarrow \{1,2\}$ as follows. The vertex set of
355: $G$ is $\mathcal{U} \cup \mathcal{S} = \{u_1,u_2,\ldots,u_n\} \cup
356: \{S_1,S_2,\ldots,S_m\}$. The edge set of $G$ and cost function $c$
357: are defined as follows:
358: %
359: \begin{itemize}
360: \item there is a red edge of cost $1$ linking $u_i$ and $u_{i+1}$
361:       for every $1 \leq i < n$;
362: \item there is a red edge of cost $2$ linking $u_n$ and $S_1$, and
363:       linking $S_j$ and $S_{j+1}$ for every $1 \leq j < m$;
364: \item whenever $u_i \in S_j$ we link $u_i$ and $S_j$ by a blue edge.
365: \end{itemize}
366: %
367: \begin{figure}[h!]
368: \begin{center}
369: \includegraphics[scale=0.48]{Construction.eps}
370: \caption{\label{Figure:Construction}(a) The graph $G$ constructed
371: for $n=6$, $m=3$ 
372: with $S_1=\{u_1,u_2,u_3,u_4,u_6\}$,
373: $S_2=\{u_3,u_4,u_6\}$ and $S_3=\{u_5,u_6\}$. The red edges of
374: cost $2$ are omitted for clarity. The red edges of cost $1$
375: are dashed, and the blue edges are solid. (b) An optimal price
376: function $p$ on the blue edges that yields a revenue of $9$, an
377: example MST is depicted in bold. 
378: }
379: \end{center}
380: \end{figure}
381: %
382: We illustrate such a construction in Fig.~\ref{Figure:Construction}.
383: We claim that $(\mathcal{U,S})$ has a set cover of size $t$ if and only if
384: there exists a price function $p : B \rightarrow \{1,2,\infty\}$ for
385: the blue edges of $G$ whose revenue is $n+2m-t-1$.
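
\medskip

\noindent As a sanity check on the instance of Fig.~\ref{Figure:Construction}, we have $n=6$ and $m=3$. No single set covers $\mathcal{U}$ (for example, $u_5 \notin S_1$ and $u_1 \notin S_2 \cup S_3$), while $S_1 \cup S_3 = \mathcal{U}$, so the minimum set cover has size $t=2$. The claimed revenue is then
\begin{equation*}
n+2m-t-1 = (n+t-1) + 2(m-t) = 7 + 2 = 9,
\end{equation*}
in agreement with the value stated in the caption of the figure.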
386: 
387: \medskip
388: 
389: \noindent ($\Rightarrow$) Suppose $(\mathcal{U,S})$ has a set cover of size
390: $t$. We construct $p$ as follows: for every blue edge $e=u_iS_j$,
391: we set $p(e)$ to be $1$ if $S_j$ is in the set cover, and $2$
392: otherwise. We show that the revenue of $p$ equals $n+2m-t-1$ by
393: running Kruskal's MST algorithm starting with an empty tree, $T$.
394: Because the blue edges of weight $1$ are the lightest, we start with
395: adding them one by one to $T$ such that we add an edge only if it
does not close a cycle in $T$. After going over all blue edges of
397: weight $1$, we are guaranteed that $T$ is a tree that spans all the
398: vertices $u_i$ for every $i = 1,\ldots ,n$, and every vertex $S_j$
399: such that $S_j$ is in the set cover. This is because these vertices are
400: connected through $u_n$ with only blue edges of weight $1$. So the
401: current weight of $T$ is $|T|-1=n+t-1$. We next try to add the red
402: edges of weight $1$, but every such edge connects two vertices,
403: $u_i$ and $u_{i+1}$, already spanned by $T$ and therefore closes
404: a cycle, so we add none of them. Next we add the blue edges of weight
405: $2$. For every $S_j$ not in the set cover, we connect $S_j$ to $T$
with one blue edge of weight $2$ (any further blue edge incident to $S_j$ would close a cycle).
407: Therefore, after going over all the blue edges of weight $2$, we
408: added a weight of $2(m-t)$ to $T$. Furthermore, $T$ spans the entire
409: graph so there is no need to add any red edges of weight $2$. All the
410: edges in $T$ are blue and the revenue of $T$ is $(n+t-1)+2(m-t)=n+2m-t-1$.
411: 
412: \medskip
413: 
414: \noindent ($\Leftarrow$) Suppose that there exists a price function $p : B
415: \rightarrow \{1,2,\infty\}$ for the blue edges of $G$ whose revenue is
416: $n+2m-t-1$ for some $t$. By Lemma~\ref{AssignedCostEqualGivenCost},
417: there exists such a function $p$ that is optimal. Choose then $p : B
418: \rightarrow \{1,2,\infty\}$ as an optimal price function that minimizes
419: the number of red edges in an MST $T$.
420: 
421: Assume first that $T$ contains only blue edges. Then every vertex $u_{i}$ is
422: incident to some blue edge in $T$ with price 1. To see this, observe that $u_i$ is
423: adjacent to a vertex $S_j$ that is not a leaf, thus $S_j$ has a neighbor $u_k$, and
the red edges in the cycle $S_j,u_i,\ldots ,u_k,S_j$ all have cost 1.
425: Thus the set $\mathcal{S}'$ of those $S_{j}$'s that are linked to some blue 
426: edge in $T$ with price 1 is a set cover of $(\mathcal{U,S})$. On the other hand, notice that
427: any $S_j \in \mathcal{S} \setminus \mathcal{S}'$ is a leaf of $T$, because
428: if there were two blue edges $u_{i}S_{j}, u_{i+\ell}S_{j}$ in $T$ then none of
429: them could have a price of 2 because of the cycle $S_{j}u_{i}u_{i+1}\dots u_{i+\ell}S_j$. 
430: Therefore, the revenue of $p$ equals $(n + |\mathcal{S}'| - 1) 
431: + 2(m - |\mathcal{S}'|) =  n + 2m - |\mathcal{S}'| - 1$. As by hypothesis
432: this is at least $n + 2m - t - 1$, we deduce that the set cover $\mathcal{S}'$
433: has size at most $t$.
434: 
435: Suppose now that $T$ contains some red edge $e$ and denote by $X_1$
436: and $X_2$ the two components of $T-e$. There exists some blue edge
437: $f=u_i S_j$ in $G$ that connects $X_1$ and $X_2$ because
438: the graph $(V,B)$ induced by the blue edges is connected (because $u_n$
439: is linked with blue edges to every $S_j$). By Lemma \ref{obstruction},
440: there exists a blue edge $f'=u_{i'}S_{j'}$ distinct from $f$ in the unique
441: cycle $C$ in $T+f$ such that $c(e) < p(f') \le p(f)$. In particular, we
442: have $c(e) = 1$ and $p(f') = 2$. By an argument given in the preceding
443: paragraph, $S_{j'}$ is a leaf of $T$, hence we have $j' = j$. Also,
444: every blue edge distinct from $f$ and $f'$ in $C$ has price $1$.
445: But then the price function $p'$ obtained from $p$ by setting the price of
446: both $f$ and $f'$ to 1 is also optimal and has a corresponding MST that uses
fewer red edges than $T$, namely $T-e+f$, a contradiction. This completes
448: the proof. 
449: \end{proof}
450: The reduction used in Theorem~\ref{Hardness} implies a stronger
451: hardness result.
452: \begin{theorem}
453: \label{APX_Hardness} \StackMST is APX-hard even when $c(e) \in \{1,2\}$
454: for all $e\in R$.
455: \end{theorem}
456: \begin{proof}
457: We will show that, for any $\varepsilon > 0$, a $(1-\varepsilon)$-approximation
458: for \StackMST implies a $(1+8 \varepsilon)$-approximation for
\vcover in graphs of maximum degree at most $3$. The claim will then follow
460: from the APX-hardness of the latter problem~\cite{VertexCoverOnBoundedDegree3,VertexCoverOnBoundedDegree}.
461: 
462: Let $H$ denote any given graph with maximum degree at most $3$.
463: We can assume that $H$ is connected because otherwise we process each
464: connected component separately. Moreover, we can assume that $H$ has at
465: least as many edges as vertices because \vcover can be solved exactly in
466: polynomial time if $H$ is a tree.
467: 
468: Clearly, the \vcover instance we consider is equivalent to a \scover
469: instance with $|V(H)|$ sets and $|E(H)|$ elements in the ground set.
470: Let $(\mathcal{U,S})$ be the \scover instance obtained from the latter
471: one by adding a new dummy element $d$ in the ground set, and adding $d$
472: to every subset of the instance. Hence, we have $n = |\mathcal{U}| =
473: |E(H)| + 1$ and $m = |\mathcal{S}| = |V(H)|$. Any vertex cover of $H$
474: yields a set cover of $(\mathcal{U,S})$ with the same size, and vice-versa.
475: Thus the reduction used in the proof of Theorem~\ref{Hardness} provides a
476: way to convert in polynomial time a vertex cover of size $s$ into a feasible
477: solution of the \StackMST instance corresponding to $(\mathcal{U,S})$
478: with revenue $n + 2m - s - 1$, and vice-versa. In particular, we have
479: $\OPT = n + 2m - \OPTVC - 1$, where $\OPT$ and $\OPTVC$ denote the
480: value of the optimum for the \StackMST and \vcover instances, respectively.
481: 
482: Now consider the vertex cover found by running the
483: $(1-\varepsilon)$-approximation algorithm on the \StackMST instance
484: and then converting the result into a vertex cover of $H$.
485: Denoting by $s$ its size and letting $r = n+2m-s-1$, we obtain:
486: $$
487: \begin{array}{r@{\quad}c@{\quad}l}
488: s = n + 2m - r-1 &\le& n + 2m - (1-\varepsilon) \, \OPT - 1\\[1ex]
489: &=& n + 2m - (1-\varepsilon) \, (n + 2m - \OPTVC - 1) - 1\\[1ex]
490: &=& \varepsilon \, (n - 1 + 2m) + (1-\varepsilon)  \, \OPTVC \\[1ex]
491: &\le& \varepsilon \, (3 \, \OPTVC + 6 \, \OPTVC) + (1-\varepsilon) \, \OPTVC \\[1ex]
492: &=& (1 + 8 \varepsilon)  \, \OPTVC.
493: \end{array}
494: $$
495: Above we have used the fact that $n - 1 = |E(H)| \ge |V(H)| = m$ and
496: that $\OPTVC \ge |E(H)|/3 = (n-1)/3$ because $H$ has maximum degree at
497: most~$3$.
498: \end{proof}
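
For a concrete instance of this correspondence (an example of ours), let $H$ be a triangle, so that $|V(H)| = |E(H)| = 3$ and $\OPTVC = 2$. The associated \scover instance has $n = |E(H)|+1 = 4$ and $m = |V(H)| = 3$, and therefore
\begin{equation*}
\OPT = n + 2m - \OPTVC - 1 = 4 + 6 - 2 - 1 = 7 .
\end{equation*}
The two facts used at the end of the proof hold here, since $n-1 = 3 \ge 3 = m$ and $\OPTVC = 2 \ge 1 = (n-1)/3$.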
499: 
500: \section{The Best-Out-Of-$k$ Algorithm}
501: \label{sec-best}
502: 
503: As before, let $k$ denote the number of distinct red costs, and
504: let $c_1 < c_2 < \cdots < c_k$ denote those costs. Without loss
505: of generality, we assume that all red costs are positive (otherwise
506: we contract all red edges of cost $0$). Recall that the Best-out-of-$k$
507: algorithm is as follows. For each $i$ between $1$ and $k$, set $p(e) =
508: c_i$ for all blue edges $e \in B$ and compute an MST $T_i$. Then pick
509: $i$ such that the revenue of $T_i$ is maximum and output the
510: corresponding feasible solution. In this section, we analyze the approximation 
511: ratio ensured by this algorithm. 
512: 
513: \begin{theorem}\label{best-out-of-k}
514: Best-out-of-$k$ is a
515: $\min \{k , 1+\ln b, 1+\ln W\}$-approximation algorithm,  where $b$ denotes the
516: number of blue edges, and $W = c_k / c_1$ is the maximum ratio between red costs.
517: \end{theorem}
518: \begin{proof}
519: We let $p^*$ be an optimal price function, $T^*$ be an MST of $G$ with respect to $p^*$, and $n_i$ be the number of blue edges of price $c_i$ in $T^*$. We also define $N_i$ as the number of blue edges of price at least $c_i$ in $T^*$, that is, $N_i = \sum_{j=i}^k n_j$.
520: 
521: We first prove the following claim: for all $i = 1, \ldots, k$, the $i$th MST $T_i$ computed by Best-out-of-$k$ contains at least $N_i$ blue edges. 
522: For $S \subseteq E$, let $r(S)$ denote the maximum cardinality of an acyclic subset of $S$ (that is, the rank function of the graphic matroid of $G$).
523: We also let $R_{i}$ be the set of red edges with cost at most $c_i$, and $B^*_{i}$ be the set of blue edges $e$ such that $p^*(e) \le c_i$. 
524: 
525: Now consider an execution of Kruskal's algorithm on $G$ with respect to $p^*$, up to the point where all edges of weight at most $c_{i-1}$ have been processed. The total number of edges included up to that point in the MST $T^*$ equals $r(R_{i-1}\cup B^*_{i-1})$. Next, resume the execution of Kruskal's algorithm, process all blue edges of price $c_i$ and stop before processing any red edge of cost $c_i$. In order to maximize the number of blue edges $N_i$ of price at least $c_i$ included in $T^*$, we could lower to $c_i$ the price of all blue edges whose current price is at least $c_i$. Then, the total number of edges included up to now in $T^*$ would be exactly $r(R_{i-1}\cup B)$. This implies
526: $$
527: N_i \leq r(R_{i-1}\cup B) - r(R_{i-1}\cup B^*_{i-1}) \leq  r(R_{i-1}\cup B) - r(R_{i-1}).
528: $$
529: Because the latter expression gives the number of blue edges in $T_i$, this proves the claim.
530: 
531: Using this claim, we can bound the revenue $q$ of the solution returned by Best-out-of-$k$:
532: $$
533: q \geq \max_{i=1,\ldots,k} N_i\cdot c_i.
534: $$
We also know that $\OPT = \sum_{i=1}^k n_i\cdot c_i$.
536: 
537: Since $n_i\leq N_i$, we have 
538: $$
\OPT = \sum_{i=1}^k n_i\cdot c_i\leq \sum_{i=1}^k N_i\cdot c_i\leq k\cdot q,$$ 
540: proving the first approximation factor.
541: 
542: Also, we have (letting $N_{k+1}=0$):
543: \begin{eqnarray*}
\OPT & = & \sum_{i=1}^k n_i\cdot c_i \\
545: 	& = & \sum_{i=1}^k N_i \cdot c_i \cdot \frac{n_i}{N_i} \\
546: 	& = & \sum_{i=1}^k N_i \cdot c_i \cdot \frac{N_i - N_{i+1}}{N_i} \\
547: 	& \leq & (\max_{i=1,\ldots,k} N_i \cdot c_i) \cdot \sum_{i=1}^k  \frac{N_i - N_{i+1}}{N_i} \\
548: 	& \leq & q \cdot \sum_{i=1}^k  \frac{N_i - N_{i+1}}{N_i}, 
549: \end{eqnarray*}
550: and
551: $$
\sum_{i=1}^k  \frac{N_i - N_{i+1}}{N_i} \leq 1+\int_{N_k}^{N_1} \frac{dt}{t} = 1+\ln\frac{N_1}{N_k}\leq 1+\ln b,
553: $$
554: which proves the second approximation  factor.
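
For completeness, here is a sketch of the step behind the first inequality in the last display. We may assume $N_k \ge 1$: indices $i$ with $N_i = 0$ have $n_i = 0$, contribute nothing to the optimum, and can be dropped from all sums. The term for $i=k$ equals $(N_k - N_{k+1})/N_k = 1$. For $i<k$, since $1/t \ge 1/N_i$ on the interval $[N_{i+1},N_i]$,
\begin{equation*}
\frac{N_i - N_{i+1}}{N_i} \;\le\; \int_{N_{i+1}}^{N_i} \frac{dt}{t},
\end{equation*}
and summing over $i=1,\ldots,k-1$ gives the integral from $N_k$ to $N_1$. The final bound uses $1 \le N_k \le N_1 \le b$.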
555: 
556: Finally, we also have the following (letting $c_0=0$):
557: \begin{eqnarray*}
\OPT & = & \sum_{i=1}^k n_i\cdot c_i \\
559: 	& = & \sum_{i=1}^k n_i \sum_{j=1}^i (c_j - c_{j-1}) \\
560: 	& = & \sum_{j=1}^k N_j\cdot (c_j - c_{j-1}) \\
561: 	&\leq & q\cdot \sum_{j=1}^k \frac{c_j - c_{j-1}}{c_j},
562: \end{eqnarray*}
563: and
564: $$
565: \sum_{j=1}^k \frac{c_j - c_{j-1}}{c_j} \leq 1+\ln W,
566: $$
567: establishing the third approximation factor.
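
In more detail (a sketch of the omitted steps), the third equality above exchanges the order of summation,
\begin{equation*}
\sum_{i=1}^k n_i \sum_{j=1}^i (c_j - c_{j-1})
= \sum_{j=1}^k (c_j - c_{j-1}) \sum_{i=j}^k n_i
= \sum_{j=1}^k N_j \cdot (c_j - c_{j-1}),
\end{equation*}
and the subsequent inequality uses $N_j \cdot c_j \le q$, that is, $N_j \le q / c_j$. For the last bound, the term for $j=1$ equals $1$ because $c_0 = 0$, and for $j \ge 2$ we have
\begin{equation*}
\frac{c_j - c_{j-1}}{c_j} \;\le\; \int_{c_{j-1}}^{c_j} \frac{dt}{t},
\qquad\text{so that}\qquad
\sum_{j=1}^k \frac{c_j - c_{j-1}}{c_j} \;\le\; 1 + \int_{c_1}^{c_k} \frac{dt}{t} = 1 + \ln \frac{c_k}{c_1} .
\end{equation*}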
568: \end{proof}
569: 
570: The three approximation factors are tight for the following examples. Consider
571: a graph with $k+1$ vertices $v_1,v_2,\ldots ,v_{k+1}$, in which the red edges 
572: are of the form $v_iv_{i+1}$, and there is a blue edge parallel to every red edge.
573: The cost of the red edge $v_iv_{i+1}$ is $1/i$. The optimal solution involves setting a
574: price of $1/i$ for every blue edge $v_iv_{i+1}$, yielding a revenue of $\sum_{i=1}^k 1/i$.
575: On the other hand, the Best-out-of-$k$ algorithm sets the price of every blue edge to $1/i$
576: for some $i$, always yielding a revenue of 1. This proves that the ratios $1+\ln b$ and 
577: $1+\ln W$ are asymptotically tight.
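
For concreteness, here is the case $k=3$ worked out (an illustrative instance; we assume that ties between a blue and a red edge of equal weight are broken in favor of the blue edge). The red costs are $1$, $1/2$, $1/3$ on the edges $v_1v_2$, $v_2v_3$, $v_3v_4$, so that $b=3$ and $W=3$. The optimal revenue is $1+\frac12+\frac13 = \frac{11}{6}$. Setting all blue prices to $1$, $1/2$ or $1/3$ puts respectively $1$, $2$ or $3$ blue edges in the MST, and the revenue is
$$
1\cdot 1 = 2\cdot \frac12 = 3\cdot\frac13 = 1
$$
in each case. The ratio $\frac{11}{6}\approx 1.83$ is indeed below $1+\ln 3 \approx 2.10$. For general $k$ the ratio is the harmonic number $H_k = \sum_{i=1}^k 1/i \geq \ln (k+1)$, while $b=W=k$.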
578: 
579: The factor $k$ can be proven tight as well by considering a similar example.
580: The graph is composed of $1+\sum_{i=1}^k a^{i-1}$ vertices for some large integer $a$.
581: The red edges form a path connecting these vertices using $a^{k-i}$ edges of cost
582: $c_i=a^{i-1}$ for every $i$ between 1 and $k$. Every red edge is doubled by a blue edge.
583: The optimal solution again involves setting the prices of the blue edges equal to that of the 
584: parallel red edge, yielding a revenue of $k\cdot a^{k-1}$. The Best-out-of-$k$ algorithm
585: setting the prices to $c_i$ yields an MST containing $\sum_{j=i}^k a^{k-j}$ blue edges, with a revenue
586: of
587: % 
588: $$
589: a^{i-1}\cdot \sum_{j=i}^k a^{k-j} = a^{i-1}\cdot \frac{a^{k-i+1} - 1}{a-1} 
590: < a^{k-1}\cdot \frac{a}{a-1}.
591: $$
592: %
The ratio between the two revenues tends to $k$ as $a$ tends to infinity.\medskip
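
As a sanity check of these formulas, take $k=2$ and $a=10$ (illustrative values). The red path has $a^{k-1}=10$ edges of cost $c_1 = 1$ and one edge of cost $c_2 = 10$, hence $12$ vertices. The optimal revenue is $10\cdot 1 + 1\cdot 10 = 20 = k\cdot a^{k-1}$. Best-out-of-$k$ obtains $1\cdot(10+1) = 11$ with the uniform price $c_1$ and $10\cdot 1 = 10$ with the uniform price $c_2$, so it returns a revenue of $11$ and the ratio is $20/11\approx 1.82$. With $a=100$ the same computation gives $200/101\approx 1.98$, approaching $k=2$.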
594: 
A natural generalization of \StackMST to matroids is
as follows. Given a matroid $(S, \mathcal{I})$ whose ground set $S$ is
partitioned into two sets $\mathcal{R}$ and $\mathcal{B}$, and
nonnegative costs on the elements of $\mathcal{R}$, assign prices to
the elements of $\mathcal{B}$ in such a way that the revenue given
by a minimum weight basis of $(S, \mathcal{I})$ is maximized. We mention
that the analysis of Best-out-of-$k$
given in the proof of Theorem~\ref{best-out-of-k} extends readily
to the case of matroids, yielding the same approximation guarantees for
this more general case.
605: 
606: \section{Linear Programming Relaxation}
607: \label{LP} In this section, we give an integer programming
608: formulation for the problem and study its linear
609: programming relaxation. All red costs $c_{i}$ are assumed to be positive
610: throughout the section. 
611: For each $j=1, \ldots, k$, and
612: each blue edge $e \in B$ we define a variable $x_{j,e}$. The
613: interpretation of these variables is as follows: think of a feasible
614: solution $p : B \to \{c_1,c_2,\ldots,c_k\}$ and an MST $T$
615: with respect to $p$.
616: Then $x_{j,e} = 1$ means that the blue edge $e$ appears
617: in $T$, with a price $p(e)$ of at least $c_j$.
618: 
619: We let $c_0 = 0$ and, as in the previous section, denote by $R_j$ the set of red
620: edges of cost at most $c_j$.
621: For $t$ pairwise disjoint sets of vertices $C_1, \dots, C_t$, we denote by
622: $\delta_B(C_1:C_2:\cdots:C_t)$ the set of blue edges that are in the
623: cut defined by these sets.
624: The integer programming
625: formulation then reads:
626: \begin{align}
627: \nonumber
628: (\IP) \quad \textrm{max}
629: &\ds \sum_{e \in B \atop 1 \le j \le k} (c_j - c_{j-1})\,x_{j,e}\\[4ex]
630: \textrm{s.t.}
631: &\ds \sum_{e \in \delta_B(C_1:C_2:\cdots:C_t)} \!\!\!\!\!\!\!\!\!\!\! \!\! x_{j,e} \le t - 1
632:   & &\forall j \in \{1,2,\dots,k\},\label{IP-forest}\\[-2ex]
\nonumber & & &\forall C_1, \dots, C_t \textrm{ components of } (V,R_{j-1});\\[1ex]
634: &\ds \sum_{e \in P \cap B} x_{1,e} + x_{j,f} \le |P \cap B|
635:   & &\forall f=ab \in B, \forall j \in \{2,3,\dots,k\}, \label{IP-cycle}\\[-2ex]
636: \nonumber & & &\forall P \textrm{ $ab$-path in } (B \cup R_{j-1}) - f;\\[1ex]
637: &\ds x_{1,e} \ge x_{2,e} \ge \cdots \ge x_{k,e} \ge 0
638:   & &\forall e \in B; \label{IP-equality} \\[1ex]
639: &x_{j,e} \in \{0,1\} & & \forall j \in \{1,2,\dots,k\}, \forall e \in B. \label{IP-integer}
640: \end{align}
641: 
642: Let us first give some intuition on this integer program.
643: Consider a minimum spanning tree $T$ with respect to a feasible solution $p$, let $F$ be the set of blue edges appearing in $T$, and let $F_{j} =\{e\in F: p(e) \geq c_{j}\}$.
644: Then $F$ ($=F_{1}$) must obviously be a forest. 
645: Also, $F_{j}$ ($j \in \{2,\dots,k\}$) must be a forest in the
646: graph where every component of $(V, R_{j-1})$ has been contracted, since otherwise
647: we could swap in $T$ some edge of $F_{j}$ with an edge in $R_{j-1}$.
648: This is encoded by constraints~\eqref{IP-forest}.
649: Similarly, if a cycle $C$ of the graph is such that every red edge in $C$
650: has cost at most $c_{j-1}$ and some blue edge $f$ of $C$ appears in $T$ with a
651: price at least $c_{j}$, then there must be another blue edge of $C$ that is not included in $T$.
652: This is ensured by constraints~\eqref{IP-cycle}. 
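
To illustrate the formulation on a toy instance (hypothetical, chosen only for illustration), take $V=\{u,v,w\}$ with red edges $uv$ of cost $c_1=1$ and $vw$ of cost $c_2=2$, and two blue edges $e$ and $f$ parallel to $uv$ and $vw$, respectively. Then $R_0=\emptyset$, $R_1=\{uv\}$, and the objective function is $x_{1,e}+x_{1,f}+x_{2,e}+x_{2,f}$ since $c_1-c_0=c_2-c_1=1$. Here constraints~\eqref{IP-forest} amount to $x_{1,e}\le 1$, $x_{1,f}\le 1$ and $x_{2,f}\le 1$. Constraint~\eqref{IP-cycle} for $j=2$, the blue edge $e$, and the path $P$ consisting of the red edge $uv$ alone reads
$$
\sum_{e' \in P\cap B} x_{1,e'} + x_{2,e} = x_{2,e} \le |P\cap B| = 0.
$$
No constraint of type~\eqref{IP-cycle} applies to $f$, because $(B\cup R_1)-f$ contains no $vw$-path. The optimum is thus $x_{1,e}=x_{1,f}=x_{2,f}=1$ and $x_{2,e}=0$, of value $3$, which corresponds to the prices $p(e)=c_1=1$ and $p(f)=c_2=2$.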
653: 
654: \begin{proposition}
655: The integer program above is a formulation of \StackMSTnospace.
656: \end{proposition}
657: \begin{proof}
658: Consider a feasible solution $x$ of the integer program (IP) and
659: let $F = \{e \in B : x_{1,e} = 1\}$. Inequality~\eqref{IP-forest} ensures
660: that $F$ is a forest. For $e \in F$, let $p(e) = c_j$ if $j$ is
661: the last index for which $x_{j,e} = 1$ and, for $e \in B-F$,
662: let $p(e) = \infty$. Now consider a minimum spanning tree $T$
663: with respect to $p$. We claim $E(T) \cap B = F$ and that the revenue of $T$ is exactly the
664: objective value for $x$.
665: 
666: It suffices to prove that all edges of $F$ belong to $T$. All
667: edges $e \in F$ of price $c_1$ are necessarily in $T$. Assume
668: that all edges $e \in F$ of price less than $c_j$ are in $T$, for
669: some $j \ge 2$. We show that this holds too for edges of price
670: $c_j$. Consider some edge $f$ with $p(f) = c_j$. Suppose that $f$
671: is not in $T$. This means that there exists a cycle in $G$
consisting of blue edges of price at most $c_j$ and of red
edges of cost at most $c_{j-1}$. But then~\eqref{IP-cycle} is violated,
674: a contradiction. So the claim holds.
675: 
676: Conversely, consider any optimal solution to the \StackMST problem
677: with price function $p(\cdot)$ and a corresponding MST $T$.
678: Let $F = E(T) \cap B$. We define a vector $x$ as follows: for $e\in B$,
679: $x_{i,e}=1$ if $e\in F$ and $p(e) \ge c_i$, otherwise $x_{i,e}=0$. It is
680: easily checked that the revenue given by $p$ equals the objective function
681: of the IP for $x$. Moreover, constraints~\eqref{IP-forest}, \eqref{IP-equality}
682: and~\eqref{IP-integer} are clearly satisfied by $x$. Finally, note that if
683: $x$ violates~\eqref{IP-cycle} for some $e\in F$, then $e$ also violates
684: the min-max formula given in Lemma~\ref{lemma-cuts}. This completes the proof.
685: \end{proof}
686: 
687: The rest of this section is devoted
688: to the LP relaxation of the above IP, obtained by dropping constraint
689: \eqref{IP-integer}. We show that the LP is tractable and that its integrality gap 
essentially matches the guarantee given by the Best-out-of-$k$ algorithm.
691: (Let us recall that the integrality gap of the LP on a specified set of instances $\mathcal{I}$
692: is defined as the supremum of the ratio $(\LP) / (\IP)$ over all instances in $\mathcal{I}$.)
693: 
694: 
695: \begin{proposition}
696: \label{prop-separation}
697: The LP can be separated in polynomial time.
698: \end{proposition}
699: \begin{proof}
700: For fixed $j$, \eqref{IP-forest} can be separated in polynomial time using
standard techniques for the forest polytope, as described e.g.\ in Schrijver~\cite[pp.~880--881]{S03B}.
702: Inequality \eqref{IP-cycle} can be rewritten as
703: $$
704: \sum_{e \in P \cap B} (1 - x_{1,e}) \ge x_{j,f}.
705: $$
706: Thus, for each fixed $j$ and $f=ab$, \eqref{IP-cycle} can be separated by finding
707: a shortest $ab$-path in the graph $(V, (B \cup R_{j-1}) - f)$ where every
708: red edge has weight 0 and every blue edge $e$ has weight $1-x_{1,e}$.
709: Finally,  \eqref{IP-equality} can obviously be separated in polynomial time.
710: \end{proof}
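
As an illustration of the separation routine for~\eqref{IP-cycle} (with hypothetical fractional values), suppose that $(B\cup R_{j-1})-f$ contains an $ab$-path $P$ made of two blue edges $e_1, e_2$ with $x_{1,e_1}=x_{1,e_2}=0.75$, and that $x_{j,f}=0.6$. The weight of $P$ is $(1-0.75)+(1-0.75)=0.5 < 0.6 = x_{j,f}$, so a shortest $ab$-path has weight less than $x_{j,f}$ and yields a violated inequality. For $P$ itself,
$$
x_{1,e_1}+x_{1,e_2}+x_{j,f} = 0.75+0.75+0.6 = 2.1 > 2 = |P\cap B|.
$$
Conversely, if the shortest path has weight at least $x_{j,f}$, then no inequality~\eqref{IP-cycle} is violated for this pair $(j,f)$. Provided that $x_{1,e}\le 1$ for every $e\in B$ (which can be checked beforehand), all weights are nonnegative and Dijkstra's algorithm can be used for the shortest-path computation.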
711: 
712: We first bound the integrality gap from above:
713: 
714: \begin{proposition}
715: \label{prop-LP-bound}
716: We have $(\LP) \le \min \{k , 1+\ln b, 1+\ln W\} \cdot (\IP)$,  where $b$ denotes the
717: number of blue edges, and $W = c_k / c_1$ is the maximum ratio between red costs.
718: \end{proposition}
719: %
720: \begin{proof}
721: Let $x$ be any feasible vector for the LP. The value
722: of the objective function for $x$ is thus 
723: %
724: $$
725: \sum_{e \in B \atop 1 \le i \le k} (c_i - c_{i-1})\,x_{i,e}.
726: $$
727: %
728: 
729: Let $i\in \{1,\dots,k\}$, 
let $C^{1}, \dots, C^{\ell}$ be the components of the graph $(V, R_{i-1} \cup B)$,
731: and denote by $C^{j}_{1}, \dots, C^{j}_{\ell_{j}}$ the components of the subgraph of
732: $(V, R_{i-1})$ induced by $C^{j}$. For every $j\in \{1,\dots, \ell\}$, we have
733: %
734: $$
735: \sum_{e \in B[C^{j}_{1} \cup \cdots \cup C^{j}_{\ell_{j}}]} x_{i,e}
736: = \sum_{e \in \delta_B(C^{j}_{1}:C^{j}_{2}:\cdots:C^{j}_{\ell_{j}})} x_{i,e}.
737: $$
738: %
739: (Here, for $S\subseteq V$, the notation $B[S]$ means the set of blue edges with both endpoints
740: in $S$.)
741: Indeed, this holds trivially if $i=1$, since then each $C^{j}_{p}$ is a vertex of $C^{j}$.
742: For $i\geq 2$, for any blue edge $f=ab$ that is internal to a component
743: $C^{j}_{p}$ of $C^{j}$ (that is, $f \in B[C^{j}_{p}]$), there exists an $ab$-path
744: consisting of edges of $R_{i-1}$, and so~\eqref{IP-cycle} enforces that $x_{i,f} \leq 0$.
745: 
746: Also, constraints~\eqref{IP-forest} imply 
747: %
748: $$
749: \sum_{e \in \delta_B(C^{j}_{1}:C^{j}_{2}:\cdots:C^{j}_{\ell_{j}})} x_{i,e} \le \ell_{j} - 1,
750: $$
751: %
752: for every $j\in \{1,\dots, \ell\}$.
753: We thus obtain
754: %
755: $$
756: \sum_{e \in B} x_{i,e} 
757: = \sum_{j=1}^{\ell} \sum_{e \in \delta_B(C^{j}_{1}:C^{j}_{2}:\cdots:C^{j}_{\ell_{j}})} x_{i,e}
758:  \le \sum_{j=1}^{\ell} (\ell_{j} - 1) = r(R_{i-1}\cup B) - r(R_{i-1}).
759: $$
760: %
Since the number of blue edges in the $i$th MST computed by Best-out-of-$k$ is exactly
 $r(R_{i-1}\cup B) - r(R_{i-1}) =: A_{i}$, it follows that
763: %
764: $$
765: \sum_{e \in B \atop 1 \le i \le k} (c_i - c_{i-1})\,x_{i,e} \leq \sum_{i=1}^{k} (c_i - c_{i-1})\,A_{i}.
766: $$
767: %
768: Letting $q=\max_{i=1,\ldots,k} A_{i}\cdot c_{i}$
769:  denote the revenue given by the Best-out-of-$k$ algorithm, we deduce
770: %
771: $$
772:  \sum_{i=1}^{k} (c_i - c_{i-1}) A_{i}
773:  =  \sum_{i=1}^{k} \frac{c_i - c_{i-1}}{c_{i}} A_{i}\cdot c_{i}
774:  \leq q\cdot \sum_{i=1}^{k} \frac{c_i - c_{i-1}}{c_{i}},
775: $$
776: %
777: and, letting $A_{k+1}=0$,
778: %
779: $$
780:  \sum_{i=1}^{k} (c_i - c_{i-1}) A_{i}
781:  =  \sum_{i=1}^{k} c_i (A_{i} - A_{i+1})
782: =  \sum_{i=1}^{k} A_{i}\cdot c_i \frac{A_{i} - A_{i+1}}{A_{i}}
783: \leq q\cdot \sum_{i=1}^{k} \frac{A_{i} - A_{i+1}}{A_{i}}. 
784: $$
785: %
786: As in the proof of Theorem~\ref{best-out-of-k}, we have 
787: %
788: $$
789: \sum_{i=1}^{k} \frac{c_i - c_{i-1}}{c_{i}} \leq \min\{k, 1 + \ln W\}
790: $$
791: %
792: and
793: %
794: $$
795: \sum_{i=1}^{k} \frac{A_{i} - A_{i+1}}{A_{i}} \leq 1 + \ln b.
796: $$
797: %
798: Therefore, 
799: %
800: \begin{align*}
801: \sum_{e \in B \atop 1 \le i \le k} (c_i - c_{i-1})\,x_{i,e} 
802: &\leq \min \{k , 1+\ln b, 1+\ln W\} \cdot q \\
803: &\leq \min \{k , 1+\ln b, 1+\ln W\} \cdot (\IP),
804: \end{align*}
805: %
806: as claimed.
807: \end{proof}
808: 
809: \begin{proposition}
810: \label{prop-gap}
811: The integrality gap of the LP is 
812: \begin{itemize}
813: \item $k$ on instances with $k$ distinct costs;
814: \item $\Theta(\ln W)$ on instances with maximum ratio between red costs $W$, and
815: \item $\Theta(\ln b)$ on instances with $b$ blue edges.
816: \end{itemize}
817: \end{proposition}
818: %
819: \begin{proof}
820: We already know from Proposition~\ref{prop-LP-bound} that
821: the integrality gap of the LP is at most $\min \{k , 1+\ln b, 1+\ln W\}$.
We first prove that the integrality gap is at least $k$ on instances with $k$
823: distinct costs. To this aim, we define an instance of \StackMST as follows:
824: Choose an integer $a\ge 2$ and let the vertex set of the graph be
825: $V=\{0,1,2,\dots,a^{k-1}\}$. The graph has $a^{k-1}$ blue edges, linking vertex $0$
826: to every other vertex. The $i$th red cost is $c_i = a^{i-1}$.
827: For $i\in \{1,2,\dots,k-1\}$, the subgraph spanned by the
828: red edges with cost $c_{i}$ is a disjoint union of $a^{k-i-1}$ cliques, each
829: of cardinality $a^{i}$; the vertex sets of these cliques are
830: $\{1,\dots,a^{i}\}, \{a^{i} + 1,\dots,2a^{i}\}, \dots, \{a^{k-1} - a^{i} + 1,\dots,a^{k-1}\}$.
831: Finally, there is a unique red edge with cost $c_{k}$, linking vertex $0$ to vertex $1$.
832: 
833: Consider an optimal solution of the \StackMST problem for the instance defined above,
834: and let $T$ be a corresponding MST. Consider any blue edge $e$
835: in $T$, of price $c_i$, and let $C_e$ be the unique component of $(V - \{0\},R_{i-1})$
836: that contains an endpoint of $e$.
837: No other blue edge of $T$ has an endpoint in $C_e$, because
838: otherwise one could replace the edge $e$ in $T$ with an appropriate red edge of
839: $R_{i-1}$ and obtain a new spanning tree with weight strictly less than that of $T$, a
840: contradiction.
841: Thus, if $e$ and $f$ are two distinct blue edges of $T$, then
842: $C_e \cap C_f = \emptyset$. Noticing that the price given to $e$ is $c_i = a^{i-1} = |C_e|$, we deduce that
843: the revenue given by $T$ is
844: %
845: $$
846: \sum_{e\in B\cap E(T)} |C_e| \le a^{k-1}.
847: $$
Moreover, a revenue of $a^{k-1}$ is easily achieved: for instance, set all blue
edges of the graph to the same price $c_i$ for some $i\in \{1,\dots,k\}$. Hence,
850: $(\IP)=a^{k-1}$.
851: 
852: We now define a feasible solution $x^*$ for the LP. The point $x^*$ will have the property that
853: $x^*_{i,e}=x^*_{i,f}$ for $1\le i \le k$ and all $e,f \in B$.
854: We thus let $y_i=x^*_{i,e}$ for $e\in B$. The constraints on the $y_i$'s imposed by the LP are then:
855: %
856: \begin{align*}
857: &a^{i-1}y_i \le 1 &  \textrm{ for } 1\le i\le k;\\
858: &y_1 + y_i \le 1 &  \textrm{ for } 2\le i\le k;\\
859: &y_1 \ge y_2 \ge \cdots \ge y_k \ge 0.
860: \end{align*}
861: %
862: 
863: Set $y_1 = (a-1)/a$ and $y_i = 1/a^{i-1}$ for $2 \le i \le k$, which satisfies
864: the above constraints. The value of the objective function of the LP for the point $x^*$ is
865: %
866: \begin{align*}
867: \LP(x^*)&=\sum_{e \in B \atop 1 \le i \le k} (c_i - c_{i-1}) x^*_{i,e}\\
868: &= a^{k-1}\left(\frac{a-1}{a} + \sum_{2 \le i \le k} (a^{i-1} - a^{i-2}) \frac{1}{a^{i-1}}\right)
869: = ka^{k-1} - ka^{k-2}.
870: \end{align*}
871: Therefore, the ratio $\LP(x^*) / (\IP)$ tends to $k$ as $a \to \infty$.
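
For a small numerical instance of this construction (illustrative values), take $k=2$ and $a=4$. Then $V=\{0,1,2,3,4\}$, there are four blue edges, the red edges of cost $c_1=1$ form a clique on $\{1,2,3,4\}$, and the red edge linking $0$ to $1$ has cost $c_2=4$. We get $(\IP)=a^{k-1}=4$, while $y_1=3/4$ and $y_2=1/4$ give
$$
\LP(x^*) = 4\left(\frac{3}{4} + (4-1)\cdot\frac{1}{4}\right) = 6 = ka^{k-1}-ka^{k-2},
$$
that is, a ratio of $3/2$. In general the ratio equals $k(a-1)/a$.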
872: 
873: Now, the same construction can be used to show that the integrality gap
874: is $\Omega(\ln W)$ and $\Omega(\ln b)$ on instances with $c_{k}/c_{1} = W$
875: and $b$ blue edges, respectively. We explain it in the case where the number
of blue edges is fixed to some value $b$; the case
where the ratio $c_{k}/c_{1}$ is fixed is handled similarly.
878: 
879: Take an instance as above, with $a=2$ and $k$ being the greatest integer 
880: such that $2^{k-1} \leq b$. Choose an arbitrary blue edge and add $b - 2^{k-1}$ 
881: parallel blue edges to it (so that the number of blue edges is exactly $b$). 
These extra blue edges clearly have no influence on the value
883: of $(\IP)$ and $\LP(x^*)$ (where $x^*$ is defined as before). Using
884: $b < 2^{k}$, we deduce
885: %
886: $$
887: \frac{\LP(x^*)}{(\IP)} = \frac{k2^{k-1} - k2^{k-2}}{2^{k-1}} = \frac{k}{2} > \frac{\log_{2}b}{2},
888: $$
889: %
890: and thus that the integrality gap is $\Omega(\ln b)$, as claimed.
891: \end{proof}
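
For example (illustrative values), with $b=10$ the greatest integer $k$ such that $2^{k-1}\le b$ is $k=4$, since $2^3 = 8\le 10 < 16 = 2^4$. The instance then has $8$ blue edges to which $2$ parallel copies are added, $(\IP)=2^{3}=8$ and $\LP(x^*)=4\cdot 8 - 4\cdot 4=16$, so that the ratio is $k/2 = 2 > (\log_2 10)/2 \approx 1.66$.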
892: 
893: To conclude this section, let us mention that
894: we know of additional families of valid
895: inequalities that
896: cut the fractional point used in the above proof.
897: We leave the study of those for future research.
898: 
899: \section*{Acknowledgments}
900: We thank Martine Labb\'e and Gilles Savard for preliminary
discussions concerning this problem, and Martin Hoefer for his comments,
which led us to refine our approximability result.
903: We are also most grateful to the second 
904: anonymous referee  for providing us with a much shorter proof of Theorem~\ref{best-out-of-k},
905: and for her or his many insightful remarks which led to an improved version of the paper.
906: 
907: \bibliographystyle{plain}
908: \bibliography{MST}
909: \end{document}
910: