1: 
2: \section{Performance Analysis}
3: \label{sec_performance_analysis}
4: 
5: 
In this section, we present a performance analysis of the proposed
classification algorithm. We show that the optimality loss due to
relaxation and random rounding is negligible if the total number of
samples $N$ is sufficiently large. Therefore, our algorithm is
near-optimal at reduced computational complexity.
11: 
12: 
Our analysis relies on the concentration inequality stated in Lemma
\ref{azuma_inequality}, a variant of the Azuma inequality
proven by Janson \cite{azuma67}\cite{janson98}.
16: 
17: 
18: 
19: 
20: 
21: \begin{lemma} \cite{janson98}
22: \label{azuma_inequality}\emph{(Azuma Inequality)} Let
23: $Z_1,\dots,Z_N$ be independent random variables, with $Z_k$ taking
24: values in a set $\Lambda_k$. Assume that a (measurable) function
25: $f:\Lambda_1\times \Lambda_2\times \cdots \times
26: \Lambda_N\rightarrow {\mathbb R}$ satisfies the following Lipschitz
27: condition (L).
28: \begin{itemize}
\item (L) If the vectors $z,z'\in\prod_{i=1}^{N}\Lambda_i$ differ only
in the $k$th coordinate, then $|f(z)-f(z')|\leq c_k$, $k=1,\ldots,N$.
31: \end{itemize}
32: Then, the random variable $X=f(Z_1,\ldots,Z_N)$ satisfies, for any
33: $t\geq 0$,
34: \begin{align}
35: {\mathbb P}(X\geq {\mathbb E}X+t)\leq
\exp\left(\frac{-2t^2}{\sum_{k=1}^{N}c_k^2}\right),
37: \end{align}
38: \begin{align}
39: {\mathbb P}(X\leq {\mathbb E}X-t)\leq
\exp\left(\frac{-2t^2}{\sum_{k=1}^{N}c_k^2}\right).
41: \end{align}
42: \end{lemma}
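
\begin{remark}
As an illustrative special case (a specialization added here for
orientation, not a statement quoted from \cite{janson98}), suppose
that all Lipschitz constants are equal, $c_k=c$ for every $k$, and
choose $t=\epsilon N$ for some $\epsilon>0$. Then
$\sum_{k=1}^{N}c_k^2=Nc^2$, and combining the two tail bounds of Lemma
\ref{azuma_inequality} gives
\begin{align*}
{\mathbb P}\left(\left|X-{\mathbb E}X\right|\geq \epsilon N\right)
\leq 2\exp\left(\frac{-2\epsilon^2N^2}{Nc^2}\right)
=2\exp\left(\frac{-2\epsilon^2N}{c^2}\right),
\end{align*}
which decays exponentially in $N$ for fixed $\epsilon$ and $c$.
\end{remark}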
43: 
44: 
45: 
As in the previous sections, we use $a_{ni}^{\ast}$ to denote the
solution of the relaxed program. We use $p_i^{\ast}$,
$\left(\sigma_i^{\ast}\right)^2$, $\mu_i^{\ast}$ to denote the
corresponding occurrence probability, variance, and mean. That is,
50: \begin{align}
51: & \mu_i^{\ast}=\frac{\sum_{n=1}^{N}a_{ni}^{\ast}x_n}{\sum_{n=1}^{N}a_{ni}^{\ast}}, \\
52: & (\sigma_i^\ast)^2= \left(\frac{1}{\sum_{n=1}^{N}a_{ni}^\ast}\right)\sum_{n=1}^{N}a_{ni}^\ast\left(x_n-\mu_i^\ast\right)^2,\\
53: & p_i^\ast=\frac{\sum_{n=1}^{N}a_{ni}^\ast}{N}.
54: \end{align}
We use $z_1,\ldots,z_N$ to denote the classification scheme
obtained from Algorithm \ref{relaxation_algorithm}. In the
following, with a slight abuse of notation, we use $a_{ni}$ to denote
the randomly rounded version of the variable $a_{ni}^\ast$, i.e.,
59: \begin{align}
60: a_{ni}=\left\{\begin{array}{ll}
61: 1, & \mbox{if }z_n=i \\
62: 0, & \mbox{otherwise}
63: \end{array}\right.
64: \end{align}
65: Similarly, we use  $p_i$, $\sigma_i^2$, $\mu_i$ to denote the
66: corresponding occurrence probability, variance, and mean. That is,
67: \begin{align}
68: & \mu_i=\frac{\sum_{n=1}^{N}a_{ni}x_n}{\sum_{n=1}^{N}a_{ni}}, \\
69: & \sigma_i^2= \left(\frac{1}{\sum_{n=1}^{N}a_{ni}}\right)\sum_{n=1}^{N}a_{ni}\left(x_n-\mu_i\right)^2,\\
70: & p_i=\frac{\sum_{n=1}^{N}a_{ni}}{N}.
71: \end{align}
72: 
73: 
74: \begin{definition}
Let $\epsilon_1,\epsilon_2,\epsilon_3$ be arbitrary positive real
numbers. We say that a classification scheme is
$(\epsilon_1,\epsilon_2,\epsilon_3)$-typical if the following
conditions hold for all $i$, $1\leq i\leq J$,
79: \begin{align}
80: \left|\sum_{n=1}^{N}a_{ni}-\sum_{n=1}^{N}a_{ni}^\ast\right|\leq
81: \epsilon_1 N,
82: \end{align}
83: \begin{align}
84: \left|\sum_{n=1}^{N}a_{ni}x_n-\sum_{n=1}^{N}a_{ni}^\ast
85: x_n\right|\leq \epsilon_2 N,
86: \end{align}
87: \begin{align}
\left|\sum_{n=1}^{N}a_{ni}\left(x_n-\mu_i^\ast\right)^2-\sum_{n=1}^{N}a_{ni}^\ast\left(x_n-\mu_i^\ast\right)^2\right|\leq
\epsilon_3 N.
90: \end{align}
91: \end{definition}
92: 
93: 
94: 
95: \begin{lemma}
96: If $\epsilon_1,\epsilon_2,\epsilon_3$ all go to zero, then for
97: $(\epsilon_1,\epsilon_2,\epsilon_3)$-typical classification schemes,
98: $\mu_i$, $p_i$, $\sigma_i^2$ go to  $\mu_i^\ast$, $p_i^\ast$,
99: $\left(\sigma_i^\ast\right)^2$ respectively.
100: \end{lemma}
101: \begin{proof}
102: It can be easily checked that $\mu_i$ goes to $\mu_i^\ast$, and
103: $p_i$ goes to $p_i^\ast$. For $\sigma_i^2$, we notice that
104: \begin{align}
105: & \sum_{n=1}^{N}a_{ni}(x_n-\mu_i)^2 \\
106: & =\sum_{n=1}^{N}a_{ni}(x_n-\mu_i^{\ast}+\mu_i^{\ast}-\mu_i)^2\\
107: & =\sum_{n=1}^{N}a_{ni}(x_n-\mu_i^{\ast})^2+
108: \sum_{n=1}^{N}a_{ni}(\mu_i^{\ast}-\mu_i)^2 \\
109: & \hspace{0.5in} +2
110: \sum_{n=1}^{N}a_{ni}(x_n-\mu_i^{\ast})(\mu_i^{\ast}-\mu_i) \\
111: & =\sum_{n=1}^{N}a_{ni}(x_n-\mu_i^{\ast})^2+
112: p_iN(\mu_i^{\ast}-\mu_i)^2 \\
113: & \hspace{0.5in} +2
114: (\mu_i^{\ast}-\mu_i)\sum_{n=1}^{N}a_{ni}(x_n-\mu_i^{\ast}) \\
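& =\sum_{n=1}^{N}a_{ni}(x_n-\mu_i^{\ast})^2-
p_iN(\mu_i^{\ast}-\mu_i)^2,
\end{align}
where the last step uses
$\sum_{n=1}^{N}a_{ni}(x_n-\mu_i^{\ast})=p_iN(\mu_i-\mu_i^{\ast})$,
which follows from the definitions of $\mu_i$ and $p_i$. Therefore,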
119: \begin{align}
120: \sigma_i^2 & =\frac{\sum_{n=1}^{N}a_{ni}(x_n-\mu_i)^2}{\sum_{n=1}^{N}a_{ni}}\\
121: &
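=\frac{\sum_{n=1}^{N}a_{ni}(x_n-\mu_i^{\ast})^2}{\sum_{n=1}^{N}a_{ni}}-(\mu_i^{\ast}-\mu_i)^2.
\end{align}
By the first and third typicality conditions, the first term goes to
$\left(\sigma_i^\ast\right)^2$, and the second term goes to zero
because $\mu_i$ goes to $\mu_i^\ast$. It follows that $\sigma_i^2$
goes to $\left(\sigma_i^\ast\right)^2$.\qed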
126: \end{proof}
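
\begin{remark}
The claims for $p_i$ and $\mu_i$ in the proof above can be made
explicit as follows. This is a sketch under the additional assumption
$p_i^\ast>\epsilon_1$, which is not stated in the lemma. The shorthand
$A=\sum_{n=1}^{N}a_{ni}$, $A^\ast=\sum_{n=1}^{N}a_{ni}^\ast$,
$S=\sum_{n=1}^{N}a_{ni}x_n$, $S^\ast=\sum_{n=1}^{N}a_{ni}^\ast x_n$ is
introduced only for this remark. The first typicality condition gives
$|p_i-p_i^\ast|=|A-A^\ast|/N\leq\epsilon_1$. For the mean,
\begin{align*}
\mu_i-\mu_i^\ast
=\frac{S}{A}-\frac{S^\ast}{A^\ast}
=\frac{S-S^\ast}{A}+\mu_i^\ast\,\frac{A^\ast-A}{A},
\end{align*}
so that, using $A=p_iN\geq(p_i^\ast-\epsilon_1)N$ together with the
first two typicality conditions,
\begin{align*}
\left|\mu_i-\mu_i^\ast\right|
\leq\frac{\epsilon_2+|\mu_i^\ast|\,\epsilon_1}{p_i^\ast-\epsilon_1},
\end{align*}
which goes to zero as $\epsilon_1,\epsilon_2$ go to zero.
\end{remark}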
127: 
128: 
129: \begin{theorem}
130: \label{main_theorem} Let $\epsilon_1,\epsilon_2,\epsilon_3$ be
131: arbitrary positive real numbers. Let $V=\max_n x_n-\min_n x_n$.
132: Then, the probability that the classification scheme obtained from
133: Algorithm \ref{relaxation_algorithm} is not
134: $(\epsilon_1,\epsilon_2,\epsilon_3)$-typical is upper bounded as
135: follows.
136: \begin{align}
137: & {\mathbb P}\left(\mbox{the classification scheme is not
138: }(\epsilon_1,\epsilon_2,\epsilon_3)\mbox{-typical}\right) \\
139: & \leq 2J
140: \exp\left(-2\epsilon_1^2N\right)+2J\exp\left(\frac{-2\epsilon_2^2N}{V^2}\right)+
2J\exp\left(\frac{-2\epsilon_3^2N}{V^4}\right).
142: \end{align}
143: \end{theorem}
144: \begin{proof}
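Fix a class $i$. Applying Lemma \ref{azuma_inequality} with
$t=\epsilon_1N$, $t=\epsilon_2N$, and $t=\epsilon_3N$ to the three
sums appearing in the definition of typicality, we can show that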
146: \begin{align}
147: {\mathbb
148: P}\left(\left|\sum_{n=1}^{N}a_{ni}-\sum_{n=1}^{N}a_{ni}^\ast\right|\geq
149: \epsilon_1 N\right)\leq 2\exp\left(-2\epsilon_1^2N\right),
150: \end{align}
151: \begin{align}
152: {\mathbb
153: P}\left(\left|\sum_{n=1}^{N}a_{ni}x_n-\sum_{n=1}^{N}a_{ni}^\ast
154: x_n\right|\geq \epsilon_2 N\right)\leq
155: 2\exp\left(\frac{-2\epsilon_2^2N}{V^2}\right),
156: \end{align}
157: \begin{align}
158: & {\mathbb
P}\left(\left|\sum_{n=1}^{N}a_{ni}\left(x_n-\mu_i^\ast\right)^2-\sum_{n=1}^{N}a_{ni}^\ast\left(x_n-\mu_i^\ast\right)^2\right|\geq
160: \epsilon_3 N\right) \\
161: & \leq 2\exp\left(\frac{-2\epsilon_3^2N}{V^4}\right).
162: \end{align}
The theorem follows from a union bound over the three conditions and
the $J$ classes.\qed
164: \end{proof}
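
\begin{remark}
To illustrate how the first of the three bounds can be obtained,
assume that the rounding step of Algorithm \ref{relaxation_algorithm}
draws $z_1,\ldots,z_N$ independently with
${\mathbb P}(z_n=i)=a_{ni}^\ast$. This is our reading of the random
rounding and should be checked against the definition of the
algorithm. For a fixed $i$, let $X=\sum_{n=1}^{N}a_{ni}$. Then
\begin{align*}
{\mathbb E}X=\sum_{n=1}^{N}{\mathbb P}(z_n=i)=\sum_{n=1}^{N}a_{ni}^\ast,
\end{align*}
and changing a single $z_k$ changes $X$ by at most $c_k=1$, so that
$\sum_{k=1}^{N}c_k^2=N$. Lemma \ref{azuma_inequality} with
$t=\epsilon_1N$ then gives
\begin{align*}
{\mathbb P}\left(\left|X-{\mathbb E}X\right|\geq\epsilon_1N\right)
\leq 2\exp\left(\frac{-2\epsilon_1^2N^2}{N}\right)
=2\exp\left(-2\epsilon_1^2N\right).
\end{align*}
\end{remark}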
165: 
166: 
167: \begin{corollary}
\label{main_corollary}  For fixed
$\epsilon_1,\epsilon_2,\epsilon_3$, if the number of samples $N$ is
sufficiently large, then the classification scheme obtained from
Algorithm \ref{relaxation_algorithm} is
$(\epsilon_1,\epsilon_2,\epsilon_3)$-typical with probability close
to one.
173: \end{corollary}
174: \begin{proof}
175: The upper bound in Theorem \ref{main_theorem} is close to zero for
176: sufficiently large $N$.\qed
177: \end{proof}
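
\begin{remark}
The phrase ``sufficiently large'' can be quantified by inverting the
bound in Theorem \ref{main_theorem}. The target failure probability
$\delta\in(0,1)$ below is a symbol introduced here for illustration
only. Requiring each of the three terms of the bound to be at most
$\delta/3$ gives, for the first term,
\begin{align*}
2J\exp\left(-2\epsilon_1^2N\right)\leq\frac{\delta}{3}
\quad\Longleftrightarrow\quad
N\geq\frac{1}{2\epsilon_1^2}\ln\frac{6J}{\delta},
\end{align*}
and similarly for the other two terms. Hence the bound in Theorem
\ref{main_theorem} is at most $\delta$ whenever
\begin{align*}
N\geq\frac{1}{2}\ln\left(\frac{6J}{\delta}\right)
\max\left\{\frac{1}{\epsilon_1^2},\frac{V^2}{\epsilon_2^2},\frac{V^4}{\epsilon_3^2}\right\}.
\end{align*}
Since $V$ is computed from the samples, this is a condition on the
pair $(N,V)$; it is most informative when the range $V$ stays bounded
as $N$ grows.
\end{remark}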
178: 
179: 
180: \begin{corollary}
181: \label{existence_corollary}  If the sample number $N$ is
182: sufficiently large, then there exists at least one
183: $(\epsilon_1,\epsilon_2,\epsilon_3)$-typical classification scheme.
184: \end{corollary}
185: \begin{proof}
By Corollary \ref{main_corollary}, Algorithm
\ref{relaxation_algorithm} constructs such a classification scheme
with probability close to one. Since this probability is strictly
positive, at least one such scheme must exist.\qed
188: \end{proof}
189: 
190: 
191: \begin{remark}
192: Theorem \ref{main_theorem} and Corollary \ref{existence_corollary}
193: imply that the gap between the optimal classification gain achieved
194: in the relaxation optimization and the optimal classification gain
195: achieved in the integer optimization goes to zero asymptotically. In
196: other words, the continuous relaxation incurs an asymptotically
197: vanishing optimality loss.
198: \end{remark}
248: