1: \documentclass[10pt,twocolumn, a4paper]{IEEEtran}
2: 
3: %-------------------------------------------------------
4: \usepackage{epsfig}
5: \usepackage{amsmath}
6: \usepackage{latexsym}
7: \usepackage{amsfonts,amsmath,color,amssymb,amsxtra, graphicx, times}
8: \usepackage[ansinew]{inputenc}
9: \usepackage{subfigure}
10: %-------------------------------------------------------
11: 
12: \date{}
13: \begin{document}
14: %-------------------------------------------------------
15: \newcommand{\edgesin}{\Gamma_{I}(v)}
16: \newcommand{\edgesout}{\Gamma_{O}(v)}
17: \newcommand{\indegree}{\delta_{I}(v)}
18: \newcommand{\outdegree}{\delta_{O}(v)}
19: \newcommand{\field}{{\mathbb F}_{q}}
20: \newcommand{\comment}[1]{}
21: \newcommand{\fig}[1]{{\itshape Figure~\ref{#1}}}
22: \newcommand{\alg}[1]{{\itshape Algorithm~\ref{#1}}}
23: \newcommand{\sect}[1]{{\itshape Section~\ref{#1}}}
24: \newcommand{\tabl}[1]{{\itshape Table~\ref{#1}}}
25: \newcommand{\eq}[1]{\eqref{#1}}
26: \newcommand{\secref}[1]{Section~\ref{#1}}
27: \newcommand{\ie}{i.e.~}
28: \newcommand{\eg}{e.g.~}
29: \newcommand{\theorref}[1]{{\itshape Theorem~\ref{#1}}}
30: \newcommand{\proprref}[1]{{\itshape Proposition~\ref{#1}}}
31: \newcommand{\lemmarref}[1]{{\itshape Lemma~\ref{#1}}}
32: \newcommand{\boxend}{\hfill$\blacksquare$} 
33: \newtheorem{theorem}{Theorem}
34: \newtheorem{lemma}{Lemma}
35: \newtheorem{proposition}{Proposition}
36: \newtheorem{corollary}{Corollary}
37: \newtheorem{example}{Example}
38: \newtheorem{definition}{Definition}
39: \newtheorem{remark}{Remark}
40: 
41: %-------------------------------------------------------
42: \title{Random Linear Network Coding: \\ A free cipher?}
43: 
44: 
45: \author{Lu\'isa Lima \qquad Muriel M\'edard \qquad Jo\~{a}o Barros
46: \thanks{L. Lima (luisalima@ieee.org) and J. Barros (barros@dcc.fc.up.pt) are with the Instituto de Telecomunica\c{c}\~oes (IT) and the
47: Department of Computer Science, Faculdade de Ci\^{e}ncias da
48: Universidade do Porto, Portugal.
49: M. M\'edard (medard@mit.edu) is with the Laboratory for Information and Decision Systems at the Massachusetts Institute of Technology.
50: This work was partly supported by the Funda\c{c}\~{a}o
51: para a Ci\^{e}ncia e Tecnologia
52: (Portuguese Foundation for Science and Technology)
under grant SFRH/BD/24718/2005 and by AFOSR under grant ``Robust Self-Authenticating Network Coding'' AFOSR 000106.
54:  Part of this work was done while the first author was
55: a visiting student at the Laboratory for Information and Decision Systems at the Massachusetts Institute of Technology.
56: }
57: }
58: 
59: \maketitle
60: 
61: %-------------------------------------------------------
62: \begin{abstract}
63: We consider the level of information security provided by
64: random linear network coding in network scenarios in which all nodes
65: comply with the communication protocols yet are assumed
to be potential eavesdroppers (i.e.~``nice but curious'').
67: For this setup, which differs from  wiretapping
68: scenarios considered previously, we develop a natural {\it algebraic security}
69: criterion, and prove several of its key properties.
70: A preliminary analysis of the impact of
71:  network topology on the overall network coding security,
72:  in particular for complete directed acyclic graphs, is also included.
73: \end{abstract}
74: 
75: %-------------------------------------------------------
76: \begin{keywords}
77: security, information theory, graph theory, network coding.
78: \end{keywords}
79: 
80: %-------------------------------------------------------
81: \section{Introduction}\label{sect:Introduction}
82: 
83: Under the classical networking paradigm, in which intermediate nodes are only allowed to store and forward packets,
84: information security is usually viewed as an independent feature with little or no relation to other communication tasks.
85: In fact, since intermediate nodes receive exact copies of the sent packets, data confidentiality is commonly ensured by
86: cryptographic means at higher layers of the protocol stack. Breaking with the ruling paradigm,
87: network coding allows intermediate nodes to mix information from different data flows
88: ~\cite{ahlswede2000nif, koetter2003aan} and thus provides an intrinsic level of data security
89: --- arguably one of the least well understood benefits of network coding.
90: 
\par Previous work on this issue has been mostly concerned with constructing codes capable of splitting the data among different links, such that reconstruction by a wiretapper is either very difficult or impossible. In~\cite{cai2002snc}, the authors present a secure linear network code that achieves perfect secrecy against an attacker with access to a limited number of links. A similar problem is considered in~\cite{feldman2004csn}, featuring a random coding approach in which only the input vector is modified. The work in~\cite{bhattad2005wsn} introduces a different information-theoretic security model, in which a system is deemed to be secure if an eavesdropper is unable to get any
92: decoded or decodable (also called {\it meaningful}) source data. Still focusing on wiretapping attacks, \cite{jain2004sbn} provides a simple security protocol exploiting the network topology: an attacker is shown to be unable to get any meaningful information unless it can access those links that are necessary for the communication between the legitimate sender and the receiver, who are assumed to be using network coding.
93: As a distributed capacity-achieving approach for the multicast case, randomized network coding \cite{ho2003bco,ho2003rnc} has been
94: shown to extend naturally to packet networks with losses \cite{lun2005crc} and
95: Byzantine modifications (both detection and correction \cite{ho2004bmd,jaggi2005cae,jaggi2006rnc,jaggiThesis}).
Finally, the work in~\cite{tan2006snc} adds a cost criterion to the secure network coding problem, providing heuristic solutions for a coding scheme that minimizes both the network cost and the probability that the wiretapper is able to retrieve all the messages of interest.
97: 
98: \begin{figure}[t!]
99: \centering
100: \includegraphics[height=4cm]{butterfly1.pdf}
\caption{Canonical network coding example. Intermediate nodes are represented by squares. With this code, node 4 is a vulnerability for the network, since it can decode all the information sent through it. The complete opposite happens for node 5, which receives no meaningful information whatsoever.
102: }
103: \label{fig:BU}
104: \end{figure}
In this work, we approach network coding security from a different angle: our focus is {\it not} on the threat posed by external wiretappers but on the more general threat posed by intermediate nodes. We assume that the network consists entirely of ``nice but curious'' nodes, i.e.~they comply with the communication protocols (in that sense, they are well-behaved) but may try to acquire as much information as possible from the data that passes through them (in which case, they are potentially malicious). This notion is highlighted in the following example.
106: 
107: \begin{example}
Consider the canonical network coding example with $7$ nodes, shown in~\fig{fig:BU}. Node $1$ sends a flow to sinks $6$ and $7$ through intermediate nodes $2$, $3$, $4$ and $5$. From the point of view of security, we can distinguish between three types of intermediate nodes in this setting: (1) those that only get a non-meaningful part of the information, such as node $5$; (2) those that obtain all of the information, such as node $4$; and (3) those that get partial yet meaningful information, such as nodes $2$ and $3$. Although this network code could be considered {\em secure} against single-edge external wiretapping (i.e., the wiretapper is not able to retrieve the whole data simply by eavesdropping on a single edge), it is clearly insecure against internal eavesdropping by an intermediate node.
109: \end{example}
110: 
111: Motivated by this example, we set out to investigate the security potential of network coding.
112: Our main contributions are as follows:
113:  \begin{itemize}
114:  \item {\it Problem Formulation}: We formulate a secure network coding problem, in which all intermediate nodes are viewed as potential eavesdroppers and the goal is to characterize the intrinsic level of security provided by random linear network coding.
115:  \item {\it Algebraic Security Criterion}: Based on the notion that the number of decodable bits available to each intermediate node is limited by the degrees of freedom it receives, we are able to provide a natural secrecy constraint for network coding and to prove some of its most fundamental properties.
116:  \item {\it Security Analysis for Complete Directed Acyclic Graphs}: As a preliminary step towards understanding the interplay between network topology and security against eavesdropping nodes, we present a rigorous characterization of the achievable level of algebraic security for this class of complete graphs.
117:  \end{itemize}
 The remainder of this paper is organized as follows. First, a formal problem statement is given
119:  in \sect{sect:ProblemFormulation}, followed by a detailed analysis of the algebraic security
120:  of Randomized Linear Network Coding in \sect{sect:secRLNC}. In \sect{sect:DAG}, this analysis is carried out
121:  specifically for complete directed acyclic graphs.
122: The paper concludes with \sect{sect:ConcludingRemarks}.
123: 
124: %-------------------------------------------------------
125: \section{Problem Setup}\label{sect:ProblemFormulation}
126: 
We adopt the network model of~\cite{koetter2003aan}: we represent the network as an acyclic directed graph $G = (V, E)$, where $V$ is the set of nodes and $E$ is the set of edges.
Edges are denoted by round brackets $e=(v,v') \in E$, in which $v=\textrm{tail}(e)$ is the node at which the edge originates and $v'=\textrm{head}(e)$ is the node at which it ends. The set of edges that end at a vertex $v \in V$ is denoted by $\edgesin = \{ e \in E: \textrm{head}(e) = v \} $, and the in-degree of the vertex is $\indegree = |\edgesin|$; similarly, the set of edges originating at a vertex $v \in V$ is denoted by $\edgesout = \{ e \in E: \textrm{tail}(e) = v \} $, the out-degree being represented by $\outdegree = |\edgesout|$.
129: 
Discrete random processes $X_{1}, \ldots, X_{K}$ are observable at one or more source nodes. To simplify the analysis, we shall consider that each network link is free of delays and that there are no losses. Moreover, the capacity of each link is one bit per unit time, and the random processes $X_{i}$ have a constant entropy rate of one bit per unit time. Edges with larger capacities are modelled as parallel edges and sources of larger entropy rate are modelled as multiple sources at the same node.
131: { We shall consider multicast connections as it is the most general type of single
132: connection; there are $d\ge1$ receiver nodes. The objective is to transmit all the source processes to each of the receiver nodes.}
133: 
134: In linear network coding, edge $e=(v,u)$ carries the process $Y(e)$, which is defined below:
135: 
$$Y(e) = \sum_{l:\, X_{l} \textrm{ generated at } v}\alpha_{l,e}X(v,l)+\sum_{e':\,\textrm{head}(e')=\textrm{tail}(e)}\beta_{e',e}Y(e').$$
137: 
138: The {\em transfer matrix $M$} describes the relationship between an input vector $\underline{x}$ and an output vector $\underline{z}$, $\underline{z} = \underline{x}M$; $M=A(I-F)^{-1}B^{T}$, where $A$ and $B$ represent, respectively, the linear mixings of the input vector and of the output vector, and have sizes $K\times|E|$ and $\nu\times|E|$. $F$ is the adjacency matrix of the directed labelled line graph corresponding to the graph $G$.
139: In this paper we shall not consider matrix $B$, which only refers to the decoding at the receivers. Thus, we shall mainly analyse parts of the matrix $AG$, such that $G=(I-F)^{-1}$; $\underline{a}_{i}$ and $\underline{c}_{i}$ denote column $i$ of $A$ and $AG$, respectively. 
140: We define the {\em partial transfer matrix} $M'_{\edgesin}$ (also called {\em auxiliary encoding vector}~\cite{lun2005crc}) as the observable matrix at a given node $v$, \ie the observed matrix formed by the symbols received at a node $v$. This is equivalent to the fraction of the data that an intermediate node has access to in a multicast transmission.
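\par As a small illustration of these definitions, consider a hypothetical three-node line network (our own toy example, not one of the networks studied in this paper): a source $s$ observes $X_{1}$ and $X_{2}$ (so $K=2$), two parallel edges $e_{1},e_{2}$ go from $s$ to an intermediate node $v$, and a single edge $e_{3}$ goes from $v$ to a node $t$. Assuming that no source process is generated at $v$, the coding rule above gives
\begin{align*}
Y(e_{1}) &= \alpha_{1,e_{1}}X_{1}+\alpha_{2,e_{1}}X_{2},\\
Y(e_{2}) &= \alpha_{1,e_{2}}X_{1}+\alpha_{2,e_{2}}X_{2},\\
Y(e_{3}) &= \beta_{e_{1},e_{3}}Y(e_{1})+\beta_{e_{2},e_{3}}Y(e_{2}).
\end{align*}
The corresponding columns of $A(I-F)^{-1}$ are $\underline{c}_{e_{1}}=(\alpha_{1,e_{1}},\alpha_{2,e_{1}})^{T}$, $\underline{c}_{e_{2}}=(\alpha_{1,e_{2}},\alpha_{2,e_{2}})^{T}$ and $\underline{c}_{e_{3}}=\beta_{e_{1},e_{3}}\underline{c}_{e_{1}}+\beta_{e_{2},e_{3}}\underline{c}_{e_{2}}$. Node $v$ therefore observes the $2\times2$ matrix $[\underline{c}_{e_{1}}\ \underline{c}_{e_{2}}]$, whereas node $t$ observes only the $2\times1$ matrix $[\underline{c}_{e_{3}}]$, whose rank is at most $1<K$.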
141: 
Regarding the coding scheme, we consider the random linear network coding scheme introduced in~\cite{ho2003bco}: each coefficient of the matrices described above is chosen independently and uniformly over all elements of a finite field $\field$, $q=2^m$.
143: 
144: Our goal is to evaluate the {\em intrinsic security} of random linear network coding, in multicast scenarios where all the intermediate nodes in the network are potentially malicious eavesdroppers.
Specifically, our threat model assumes that intermediate nodes perform the coding operations as outlined above, and will try to decode as much data as possible.
146: 
147:  %-------------------------------------------------------
148: \section{Algebraic Security of Random Linear Network Coding}\label{sect:secRLNC}
149: 
150: \subsection{Algebraic security}
151: 
The Shannon criterion for information-theoretic security~\cite{shannon1949cta} corresponds in general terms to a zero mutual information between the ciphertext ($C$) and the original message ($M$), i.e.~$I(M;C)=0$.
153: This condition implies that an attacker must guess $\leq H(M)$ symbols to be able to compromise the data.
With network coding, on the other hand,
 if the attacker is capable of guessing $M$ symbols, $K-M$ additional observed symbols are required for decoding. Indeed, since each received symbol is a linear combination of the $K$ message symbols from the source,
a receiver must receive $K$ coded symbols in order to recover one message symbol. Thus, as will be shown later, restricted rank sets of individual symbols do not translate
157: into immediately decodable data with high probability.
158: This notion is illustrated in \fig{fig:SC1}. In the scheme shown on top, each intermediate node can recover half of the
159: transmitted symbols, whereas in the bottom scheme none of the nodes can recover any portion of the sent data.
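\par To make the counting argument above concrete, consider a minimal sketch with $K=2$ (the coefficients $c_{1},c_{2}$ are illustrative symbols of ours and are not taken from the figure). A node that observes the single coded symbol
$$y = c_{1}x_{1}+c_{2}x_{2},\qquad c_{1},c_{2}\in\field\setminus\{0\},$$
cannot recover $x_{1}$ or $x_{2}$ from $y$ alone, since every candidate value of $x_{2}$ is consistent with exactly one value of $x_{1}$. If the node guesses $x_{2}$ correctly, however, it obtains
$$x_{1}=c_{1}^{-1}\left(y-c_{2}x_{2}\right),$$
i.e.~one guessed symbol together with $K-1=1$ observed symbol suffices for decoding.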
160: 
161: \begin{figure}[h!]
162: \centering
163: \includegraphics[width=6cm]{securityCriteria1.pdf}
164: \caption{Example of algebraic security. In the upper scheme data is not protected, whereas in the lower scheme nodes 2 and 3 are
165: unable to recover any data symbols.}
166: \label{fig:SC1}
167: \end{figure}
168: 
169: 
170: \definition[Algebraic Security Criterion]{The level of security provided by random linear network coding is measured by the
171:  number of symbols that an intermediate node $v$ has to guess in order to decode {\it one} of the transmitted symbols.
172:  From a formal point of view,
$$\Delta_{S}(v)= \frac{K - \left(\textrm{rank}( M'_{\edgesin}) + l_{d}\right)}{K}, $$
174:   where $l_{d}$ represents the number of partially diagonalizable lines of the matrix (i.e.~the number of message symbols that can be recovered by Gaussian elimination).}
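\par As an illustration of the criterion, consider again the network of \fig{fig:BU} with $K=2$, and assume (this is our reading of the figure, following the standard butterfly code) that node $1$ sends $x_{1}$ to node $2$ and $x_{2}$ to node $3$, that both symbols are forwarded to node $4$, and that node $4$ sends $x_{1}+x_{2}$ to node $5$. Under this assumption the partial transfer matrices are
$$M'_{\Gamma_{I}(2)}=\begin{pmatrix}1\\0\end{pmatrix},\quad M'_{\Gamma_{I}(4)}=\begin{pmatrix}1&0\\0&1\end{pmatrix},\quad M'_{\Gamma_{I}(5)}=\begin{pmatrix}1\\1\end{pmatrix}.$$
Node $5$ has rank $1$ and $l_{d}=0$, so that $\Delta_{S}(5)=(2-(1+0))/2=1/2$. Node $2$ also has rank $1$ but reads $x_{1}$ directly ($l_{d}=1$), so that $\Delta_{S}(2)=(2-(1+1))/2=0$. Node $4$ observes a matrix of full rank $K=2$ and can therefore decode both symbols.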
175: 
Notice that the previous definition is essentially equivalent to computing the (normalized) difference between the global rank of the code and the local rank at each intermediate node $v$, since, as we shall show in this section, with high probability the number of individually decodable symbols $l_{d}$ goes to zero
as the size of the field goes to infinity. Moreover, as more and more symbols become compromised, the level of security tends to $0$.
178: 
179: 
180: \subsection{Security Characterization}
181: 
182: We are now ready to solve the problem of characterizing the algebraic security of random linear network coding. The key to our proofs is to analyze the properties of the partial transfer matrix at each intermediate node. 
Recall that there are two cases in which the intermediate node can gain access to relevant information: (1) when the partial transfer matrix has full rank and (2) when the partial transfer matrix has diagonalizable parts. Thus, we shall carry out independent analyses in terms of rank and in terms of partially diagonalizable matrices.
184: 
185: The following lemmas will be useful.
186: \lemma{In the random linear network coding scheme, $$ P(\Delta_{S}>0) \leq P(\exists v: \indegree > K). $$ }\label{prop:rank}
187: \begin{proof} See the {\it Appendix}. \end{proof}
188: It follows from this lemma that it is only necessary to consider the case in which  $K \leq \indegree $.
189: 
190: \begin{lemma}
191: The probability that a linear combination of independent and uniformly distributed values in $\field$
192: yields the zero result is bounded by $$P(X_{lin}=0) \leq \frac{2q+h(q)}{q^2},$$ where $h(q)$ is a function
193: such that $O(h(q))<O(q^2)$. Moreover, $P(X_{lin}=0)$ tends to $0$ when $q\rightarrow\infty$.
194: \label{prop:linCombFq}
195: \end{lemma}
196: \begin{proof} See the {\it Appendix}. \end{proof}
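As a sanity check of this bound, the count can be made explicit for a single product of two independent, uniformly distributed elements $a,b\in\field$ (this special case is our own illustration and is not part of the proof). Since a field has no zero divisors, $ab=0$ if and only if $a=0$ or $b=0$, which accounts for $2q-1$ of the $q^{2}$ pairs $(a,b)$. Hence
$$P(ab=0)=\frac{2q-1}{q^{2}}\leq\frac{2q}{q^{2}},$$
which gives $7/16=0.4375$ for $q=4$ and $511/65536\approx0.0078$ for $q=2^{8}$, in agreement with the claim that the probability vanishes as $q\rightarrow\infty$.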
197: 
198: \lemma{The probability of obtaining $y$ zeros in one line of the $\xi\times\xi$ transfer matrix $M$ is bounded by
199: $$P(Y=y) \leq \dbinom{\xi}{\xi-y}\left(\frac{2q+h(q)}{q^2}\right)^{y}\left(1-\frac{2q+h(q)}{q^2}\right)^{\xi-y}.$$
200: }\label{prop:lineTransferMatrix}
201: \begin{proof} See the {\it Appendix}. \end{proof}
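To give a feeling for the order of magnitude of this bound, take $h(q)=0$ purely for illustration (an assumption on our part; the lemma only requires $h(q)$ to grow more slowly than $q^{2}$) and $q=2^{8}$, so that $(2q+h(q))/q^{2}=1/128$. For a line of length $\xi=4$, the probability of observing $y=3$ zeros is then bounded by
$$P(Y=3)\leq\dbinom{4}{1}\left(\frac{1}{128}\right)^{3}\frac{127}{128}=\frac{508}{2^{28}}\approx1.9\times10^{-6}.$$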
202: 
203: \theorem{Let $P(l_{d}>0)$ be the probability  of recovering a strictly positive number of symbols $l_{d}$ at the intermediate nodes  with $\indegree \leq K-1$ by Gaussian elimination. Then, $P(l_{d}>0)\rightarrow0$ with $q\rightarrow \infty$ and $K\rightarrow \infty$.}\label{theorem:diag}
204: 
205: \begin{proof}
206: Let $M'$ be the transpose of the partial transfer matrix at some vertex $v$, $M'=M_{\edgesin}^T$. We consider the process of Gaussian elimination of $M'$.
It is unnecessary to consider rank $K$, since in that case the matrix is, w.h.p., invertible and hence diagonalizable~\cite{ho2003rnc}. Thus, $M'$ is a
208: $\indegree\times K$ matrix, $\indegree < K$.
209: 
210: We prove the theorem constructively by analysing the probability of having $K-1$ zeros in one or more lines of $M'$. Let $p$ be the probability of having $K-1$ zeros in a line of $M'$, and let $X$ be a random variable representing the recoverable number of symbols when an intermediate node has $\indegree$ degrees of freedom. It follows from \lemmarref{prop:lineTransferMatrix} that
211: 
$$p\leq\dbinom{K}{K-1}\left(\frac{2q+h(q)}{q^2}\right)^{K-1}\left(1-\frac{2q+h(q)}{q^2}\right)^{1}.$$
213: 
214: 
In the base case with $\indegree=1$, at most $X=1$ symbol can be recovered, since there are not enough degrees of freedom to perform Gaussian elimination and the only chance for recovering a symbol is that the line of the matrix $M'$ already has $K-1$ zeros. The probability of this event is $p$.
216: 
217: In the case that $1<\indegree<K$, we can obtain directly a number $L=l$ of lines with $K-1$ zeros, and a number $\indegree-l$ of lines in the opposite situation. Since we have $\indegree$ degrees of freedom to perform Gaussian elimination, we can obtain at most $\indegree$ symbols by successive elimination. At each step the probability of obtaining a line with $K-1$ zeros is bounded by $p$.
218: 
219: By analysing the different possibilities of combinations for the lines that already have $K-1$ zeros and the ones that can be obtained by Gaussian elimination, we get
220: $$P(X=x) \leq \sum_{l=0}^{x}\dbinom{\indegree}{l}p^l(1-p)^{\indegree-l}P_{l}(X=x)$$
221: $$P_{l}(X=x) \leq \dbinom{\indegree-l}{x-(\indegree-l)}p^{x-\indegree+l}(1-p)^{2\indegree-2l-x},$$
222: where $P_{l}(X=x)$ represents $P(X=x|L=l)$.
223: 
224: Approximating the binomial distribution by a normal distribution yields
225: \begin{align*}
226: P_{l}(X=x) \approx \frac{e'}{\sqrt{2\pi(\indegree-l)p(1-p)}},
227: \end{align*}
228: where
229: \begin{align*}
230: e'=\exp\left(-\frac{1}{2}\frac{(x-(\indegree-l)p)^2}{(\indegree-l)p(1-p)}\right)
231: \end{align*}
Since $p\rightarrow p^{*}<1$, we can state that, when $q\rightarrow \infty$ and $p\rightarrow 0$, $e'\approx \exp(-x^2)$. When $K$ goes to $\infty$, so does $x$, and hence
$$\exp(-x^2)_{x\rightarrow\infty}\rightarrow 0,$$
234: and
235: $$P_{l}(X=K-1)_{q\rightarrow\infty, K\rightarrow\infty}\rightarrow0.$$
236: Since
237: $$P(X=K-1) = \sum_{l=0}^{K-1}\dbinom{\indegree}{l}p^l(1-p)^{\indegree-l}P_{l}(X=K-1),$$
238: and $P_{l}(X=K-1)$ decreases exponentially, and $l$ only increases linearly,
$$P(X=K-1)_{q\rightarrow \infty, K\rightarrow \infty}\rightarrow0.$$
240: The probability of obtaining $X < K-1$ symbols is bounded by $P(X=K-1)$; it follows that the probability of decoding $X$ symbols
241: with any $\indegree<K$ goes to zero as $q$ and $K$ tend to infinity.
242: \end{proof}
243: 
244: 
245: %-------------------------------------------------------
246: 
247: \section{Algebraic Security of the Complete Graph}\label{sect:DAG}
248: 
249: 
Notice that, as a consequence of the property outlined in Lemma~\ref{prop:rank}, the algebraic security of a graph is topology-dependent. A node with $\indegree\ge K$ will not necessarily receive a full-rank partial transfer matrix. The rank depends on the available paths between sources and each intermediate node. More specifically, depending on the topology of the graph, some nodes may receive only combinations of symbols derived from
251: matrices with restricted rank, i.e.~less than $K$. This includes, for example, trees, where a node connected directly to the source by a link of capacity $C$  can only have children that receive at most rank $C$.
252: 
253: As a first step towards general network models, we consider the case of complete acyclic directed graphs $G=(V,E)$, $n=|V|$, which can
254: be generated as follows.
255: \begin{itemize}
\item Generate random labels for the $n$ vertices. These have some ordering $\{e_{1},e_{2},\ldots,e_{n}\}$ associated to them;
257: \item Make an outgoing (directed) edge from the vertex with the minimum label to every vertex with a higher label;
258: \item Continue until we reach a vertex where there are no more possibilities for connections.
259: \end{itemize}
260: 
261: This algorithm generates a complete acyclic directed graph with one source, one sink and $|E|=n(n-1)/2$ edges, since the total degree of each vertex is $n-1= \indegree+\outdegree$. The source and the sink are naturally determined as those nodes that have only outgoing edges or only incoming edges, respectively.
The ordering ensures that this algorithm always generates an acyclic directed graph, and it confers on the graphs generated in this way specific properties, such as the distribution of the in- and out-degrees. These properties can be determined directly from the order of the vertex using $\outdegree=n-\textrm{order}(v)$ and $\indegree=n-\outdegree-1=\textrm{order}(v)-1$.
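\par For instance, for $n=4$ with vertices ordered $v_{1},\ldots,v_{4}$ (the labels $v_{i}$ are ours, chosen for illustration), the procedure yields the edges $(v_{1},v_{2})$, $(v_{1},v_{3})$, $(v_{1},v_{4})$, $(v_{2},v_{3})$, $(v_{2},v_{4})$ and $(v_{3},v_{4})$, i.e.~$|E|=6=4\cdot3/2$, with source $v_{1}$, sink $v_{4}$ and the following degrees:
\begin{center}
\begin{tabular}{c|cccc}
$\textrm{order}(v)$ & 1 & 2 & 3 & 4\\ \hline
$\outdegree$ & 3 & 2 & 1 & 0\\
$\indegree$ & 0 & 1 & 2 & 3
\end{tabular}
\end{center}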
263: 
264: Before proving our next theorem, we introduce the following lemmas.
265: 
266: \lemma{In complete acyclic directed graphs, a node that receives $R$ symbols,  receives w.h.p.~a partial transfer matrix with rank equal to min($R,K$).}\label{prop:deltaDAG2}
267: 
268: \begin{proof} See the {\it Appendix}. \end{proof}
269: 
270: 
271: \lemma{ For the complete directed acyclic graph, w.h.p.,
$$\Delta_{S}(v) = \frac{K - \min(K, \textrm{order}(v)-1)}{K}.$$}\label{prop:deltaDAG1}
273: 
274: \begin{proof} See the {\it Appendix}. \end{proof}
275: 
\theorem{Let $\phi_{S}$ be the {\em secure max-flow}, defined as the maximum number of symbols that may be secured in a transmission by using random linear network coding. For a complete acyclic directed graph with $n$ nodes, the secure max-flow equals the max-flow min-cut capacity of the network and is $n-1$. Conversely, the minimum number of symbols required for secure transmission is $n-1$.
277: }
278: 
279: \begin{proof}
280: 
Suppose that $K=n-1$, which is the max-flow min-cut capacity of the complete directed acyclic graph. The maximum in-degree of an intermediate node $v$ is $n-2$ (attained by the node of order $n-1$), thus by \lemmarref{prop:deltaDAG1} we have $\Delta_{S}(v) \geq 1/(n-1)>0$ for every intermediate node.
282: It follows that the secure max-flow of the complete acyclic directed graph equals the capacity of the graph.
283: 
284: 
285: By contradiction, let the minimum number of required symbols for secured transmission be $m_{s}\leq n-2$. There exists an intermediate node $v$ such that $\textrm{order}(v)=n-1$, and consequently, $\Delta_{S}(v) = 0$. Then the minimum number of required symbols for secure transmission is $m_{s}=n-1$.
286: 
287: \end{proof}
288: 
289: It follows that the way to secure this class of complete graphs is to transmit at the max-flow min-cut capacity, if necessary
290: by adding ``dummy'' symbols.
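\par As a numerical illustration (our own example, using the in-degree form $\Delta_{S}(v)=(K-\min(K,\indegree))/K$ that appears in the proof of \lemmarref{prop:deltaDAG1}), take $n=5$, so that the intermediate nodes have in-degrees $1$, $2$ and $3$. Transmitting at capacity, $K=n-1=4$, gives w.h.p.
$$\Delta_{S}(v)=\tfrac{3}{4},\ \tfrac{1}{2},\ \tfrac{1}{4}$$
for these three nodes, so none of them can decode. With only $K=2$ source symbols, on the other hand,
$$\Delta_{S}(v)=\tfrac{1}{2},\ 0,\ 0,$$
and the two intermediate nodes with in-degree at least $2$ can decode w.h.p.; padding the transmission with two dummy symbols restores the case $K=4$.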
291: 
292: %-------------------------------------------------------
293: \section{Conclusions}\label{sect:ConcludingRemarks}
294: 
295: Intrigued by the security potential inherent to random linear network coding,
296: we developed a specific algebraic security criterion, for which we proved
297: a set of key properties.
298: Perhaps one of the most striking conclusions of our analysis is that
299: algebraic security with network coding is very dependent on the
300: topology of the network. As an example, we focused on complete acyclic directed graphs,
301: and determined the secure max-flow, as well as the minimum number of symbols required
302: for algebraic security. As part of our ongoing work, we are extending this analysis
303: to other more general network models. Ultimately, we would like to develop secure
304: communication protocols capable of exploiting random linear network coding as 
an {\it almost free} cipher.
306: 
307: %-------------------------------------------------------
308: \section*{Acknowledgements}
309: The authors gratefully acknowledge insightful discussions with Rui A. Costa (Univ. of Porto).
310: %-------------------------------------------------------
311: \vspace{-0.2cm}
312: 
313: \begin{thebibliography}{10}
314: 
315: \bibitem{ahlswede2000nif}
316: R.~Ahlswede, N.~Cai, S.Y.R. Li, and RW~Yeung,
317: \newblock ``{Network information flow},''
318: \newblock {\em IEEE Transactions on Information Theory}, vol. 46, no. 4, pp.
319:   1204--1216, 2000.
320: 
321: \bibitem{koetter2003aan}
322: R.~Koetter and M.~Medard,
323: \newblock ``{An algebraic approach to network coding},''
324: \newblock {\em IEEE/ACM Transactions on Networking}, vol. 11, no. 5, pp.
325:   782--795, 2003.
326: 
327: \bibitem{cai2002snc}
328: N.~Cai and RW~Yeung,
329: \newblock ``{Secure network coding},''
330: \newblock {\em Proceedings of the IEEE International
331:   Symposium on Information Theory}, Lausanne, Switzerland, 2002.
332: 
333: \bibitem{feldman2004csn}
334: J.~Feldman, T.~Malkin, C.~Stein, and RA~Servedio,
335: \newblock ``{On the capacity of secure network coding},''
336: \newblock {\em Proc. of the 42nd Annual Allerton Conference on Communication, Control,
337:   and Computing}, 2004.
338: 
339: \bibitem{bhattad2005wsn}
340: K.~Bhattad and K.R. Narayanan,
341: \newblock ``{Weakly secure network coding},''
342: \newblock {\em Proc. of the First Workshop on Network Coding, Theory, and Applications
343:   (NetCod)}, Riva del Garda, Italy, 2005.
344: 
345: \bibitem{jain2004sbn}
346: K.~Jain,
347: \newblock ``{Security based on network topology against the wiretapping
348:   attack},''
349: \newblock {\em IEEE Wireless Communications}, vol. 11, no. 1, pp. 68--71, 2004.
350: 
351: \bibitem{ho2003bco}
352: T.~Ho, R.~Koetter, M.~Medard, D.R. Karger, and M.~Effros,
353: \newblock ``{The benefits of coding over routing in a randomized setting},''
354: \newblock {\em Proc. of the IEEE International Symposium on Information Theory (ISIT)},
355: Yokohama, Japan, June/July 2003.
356: 
357: \bibitem{ho2003rnc}
358: T.~Ho, M.~Medard, J.~Shi, M.~Effros, and D.R. Karger,
359: \newblock ``{On randomized network coding},''
360: \newblock {\em Proceedings of the 41st Annual Allerton Conference on Communication,
361:   Control, and Computing}, 2003.
362: 
363: \bibitem{lun2005crc}
364: D.S. Lun, M.~Medard, R.~Koetter, and M.~Effros,
365: \newblock ``{On Coding for Reliable Communication over Packet Networks},''
366: \newblock {\em Arxiv preprint cs.IT/0510070}, 2005.
367: 
368: \bibitem{ho2004bmd}
369: T.~Ho, B.~Leong, R.~Koetter, M.~Medard, M.~Effros, and DR~Karger,
370: \newblock ``{Byzantine modification detection in multicast networks using
371:   randomized network coding},''
372: \newblock {\em Proceedings of the International
  Symposium on Information Theory}, Chicago, IL, USA, June/July 2004.
374: 
375: \bibitem{jaggi2005cae}
376: S.~Jaggi, M.~Langberg, T.~Ho, and M.~Effros,
377: \newblock ``{Correction of adversarial errors in networks},''
378: \newblock {\em Proceedings of the International Symposium on 
379: Information Theory}, Adelaide, Australia, September 2005.
380: 
381: \bibitem{jaggi2006rnc}
382: S.~Jaggi, M.~Langberg, S.~Katti, T.~Ho, D.~Katabi, and M.~Medard,
383: \newblock ``{Resilient Network Coding In the Presence of Byzantine
384:   Adversaries},'' IEEE Infocom,
385: \newblock 2006.
386: 
387: \bibitem{jaggiThesis}
388: Sidharth Jaggi,
389: \newblock {\em Design and Analysis of Network Codes},
390: \newblock Ph.D. thesis, California Institute of Technology, 2005.
391: 
392: \bibitem{tan2006snc}
393: J.~Tan and M.~Medard,
394: \newblock ``{Secure Network Coding with a Cost Criterion},''
395: \newblock {\em Proc. 4th International Symposium on Modeling and Optimization
396:   in Mobile, Ad Hoc and Wireless Networks (WiOpt'06)}, Boston MA, April, 2006.
397: 
398: \bibitem{shannon1949cta}
399: C.E. Shannon,
\newblock ``{Communication theory of secrecy systems},''
\newblock {\em Bell System Technical Journal}, vol. 28, pp. 656--715, October 1949.
402: 
403: \end{thebibliography}
404: 
405: 
406: \vspace{-0.2cm}
407: 
408: 
409: \appendix
410: 
411: \subsection*{Proof of \lemmarref{prop:rank}}
412: We will prove this constructively in terms of the ranks of parts of the transfer matrix. The auxiliary encoding vector in each intermediate node $v$ is
413: given by
414: $$M'_{\edgesin} = ( A(I-F)^{-1} )_{\edgesin},$$
415: where $M'_{\edgesin}$ denotes the columns of the matrix corresponding to the incoming edges of $v$. The dimension of  $M'_{\edgesin}$ is $K\times\indegree$, with $\indegree<|E|$.
416: 
417: To determine the rank of the partial transfer matrix, we note that the transfer matrix $M=A(I-F)^{-1}B^T$ for the network must be invertible, and hence, $\textrm{rank}(M)=K$. On the other hand, to determine the rank of $A(I-F)^{-1}$ we use the fact that $(I-F)^{-1}$ is invertible and thus $ \textrm{rank}((I-F)^{-1}) = |E| $. 
We also have $$\textrm{rank}(A(I-F)^{-1}) \leq \min(K,|E|),$$ because the dimension of $A(I-F)^{-1}$ is $K\times|E|$.
But, since $$ K = \textrm{rank}(A(I-F)^{-1}B^T) \leq \min(\textrm{rank}(A(I-F)^{-1}),\textrm{rank}(B)) $$ holds and $K<|E|$ (true because $K$ must be less than the minimum cut in the network) we conclude that $$\textrm{rank}(A(I-F)^{-1}) = K.$$
420: 
We now consider $\Delta_{S}(v)$ at some vertex $v$. For that, we can consider two distinct cases. The first one is $K < \indegree$: in this case, we cannot assume anything about $\Delta_{S}(v)$, since the rank of the matrix $M'_{\edgesin}$ will be dependent on the topology of the network. As for the second case, $\textrm{rank}(M'_{\edgesin})<K \Rightarrow \Delta_{S}(v)>0$, since $\Delta_{S}(v)$ is proportional to $K-\textrm{rank}(M'_{\edgesin})$ when $l_{d}=0$.
422: \boxend
423: \subsection*{Proof of Lemma \ref{prop:linCombFq}}
424: Contrary to the sum, the product of independent and uniformly distributed values in $\field$ is {\it not} independent and uniformly distributed. In fact, there are two ways to obtain a zero in a multiplication in $\field$: (1) by multiplication between an element $a\in \field$ and $0$, and (2) by multiplication over two elements $a \in \field$ and $b \in \field$, such that $a \neq 0$ and $b \neq 0$, but $ab=0$.
Now, the total number of entries of the multiplication table of the $q$ elements of $\field$ is $q^2$, and
there are at most $2q$ instances of the first case: $q$ instances of $ab=0$ with $a=0$, and $q$ instances of $ab=0$ with $b=0$. As for the second case, it is possible to prove by contradiction that the number of zeros obtained this way is strictly less than $q^2$: if this were not the case, all products of elements of $\field$ would be zero, which is absurd. Since this is true for any $q$, the number of zeros grows as $O(h(q))<O(q^2)$.
427: Thus, we have
428: $$P(X_{lin}=0) \leq \frac{2q+h(q)}{q^2}.$$
Since $h(q)$ grows strictly slower than $q^2$, we have $(2q+h(q))/q^2 \rightarrow 0$ as $q\rightarrow\infty$, and it follows that
$$P(X_{lin}=0)_{q\rightarrow\infty} \rightarrow 0.$$ \boxend
431: \vspace{-0.25cm}
432: \subsection*{Proof of Lemma \ref{prop:lineTransferMatrix}}
Each position of a line of the transfer matrix $M$ is a linear combination of independently and uniformly chosen values in $\field$, and thus, the probability of obtaining a zero in a position is given by \lemmarref{prop:linCombFq}. The result follows by considering all the combinations of the possible positions in which the $y$ zeros may occur. \boxend
434: 
435: \vspace{-0.25cm}
436: \subsection*{Proof of Lemma \ref{prop:deltaDAG2}}
437: 
438: Suppose that a given intermediate node receives $R=K+\theta$ symbols, $\theta\ge0$.
439: It is clear that the maximum possible rank is $K$ and thus there is a way to remove $\theta$ columns s.t. the rank of the resulting set will still be at maximum $K$.
440: Now consider the case in which vertex $v$ receives at most $K$ symbols. If the columns are linearly dependent, the condition
$$ x_{h_{1}}\underline{c}_{h_{1}}+x_{h_{2}}\underline{c}_{h_{2}}+\ldots+x_{h_{n}}\underline{c}_{h_{n}}=(0 \ldots 0)^T,$$
such that $x_{h_{1}}, x_{h_{2}}, \ldots, x_{h_{n}} \in \field$ are not all $0$ and $h_{1},h_{2},\ldots,h_{n}$ index the columns in $\edgesin$, will be satisfied. Since the linear combination of lines of the transfer matrix is again a linear combination of independent and uniformly distributed values in $\field$, it follows from \lemmarref{prop:lineTransferMatrix} that the probability of obtaining $(0 \ldots 0)^T$ tends to $0$ when $q\rightarrow\infty$ and $K\rightarrow\infty$, and thus, the columns $h_{1},h_{2},\ldots,h_{n} \in \edgesin$ are linearly independent w.h.p.
443: \boxend\vspace{-0.25cm}
444: \subsection*{Proof of Lemma \ref{prop:deltaDAG1}}
445: 
446: It follows from \lemmarref{prop:deltaDAG2} that w.h.p., the number of symbols received by a vertex is the rank of the partial transfer matrix received (and at most $K$) and thus
447: \begin{align*}
\Delta_{S}(v)
&= \frac{K - \min(K,\indegree)}{K}\\
&= \frac{K - \min(K, \textrm{order}(v)-1)}{K}.
451: \end{align*}
452: \boxend
453: 
454: \end{document}