cond-mat0012487/wcs.tex
1: %% \documentstyle[eclepsf,twocolumn]{article}
2: \documentstyle[epsbox,twocolumn]{article}
3: \setlength{\textwidth}{7.067716in}
4: \setlength{\textheight}{220.8mm}
5: \setlength{\oddsidemargin}{-0.4in}
6: \setlength{\evensidemargin}{-0.4in}
7: \setlength{\columnsep}{0.25in}
8: \setlength{\topmargin}{0in}
9: \setlength{\headheight}{0in}
10: \setlength{\headsep}{0in}
11: \pagestyle{empty}
12: \title{Self-Similar Traffic Originating in the Transport Layer}
13: \author{Kensuke Fukuda\\[0.2cm]
14: Network Innovation Laboratories,\\ 
15: Nippon Telegraph and Telephone Corp.\\
16: 3-9-11 Midori-cho Musashino, \\180-8585, Japan\\
17: fukuda@t.onlab.ntt.co.jp
18: \and Misako Takayasu\\[0.2cm]
19: Faculty of Complex Systems, \\Future University-Hakodate\\
20: 116-2 Kameda-Nakano, Hakodate,\\
21: 041-0803, Japan\\
22: takayasu@fun.ac.jp
23: \and Hideki Takayasu\\[0.2cm]
24: Sony Computer Science Laboratories\\
25: 3-14-13 Higashi-Gotanda, Shinagawa\\
26: 141-0022, Japan\\
27: takayasu@csl.sony.co.jp}
28: \date{}
29: \begin{document}
30: \thispagestyle{empty}
31: \maketitle
32: \thispagestyle{empty}
33: \noindent {\bf Keywords:} \ self-similar traffic, phase transition, TCP
34: \begin{abstract}
35: We performed a network traffic simulation  to clarify the  
36: mechanism producing self-similar traffic originating in the 
37: transport layer level.
38: Self-similar behavior could be observed without assuming a 
39: long-tailed distribution of the input file size.
40: By repeating simulations with modified TCP we 
41: found that the feedback mechanism from the network, such as  
42: packet transmission driven by acknowledgement packets, 
43: plays an essential role in explaining the self-similarity 
44: observed in the actual traffic. 
45: \end{abstract}
46: \section{INTRODUCTION}
47: %Recent traffic measurement analysis have been clarified that 
48: %the network traffic fluctuation indicates the 
49: %self-similarity (long-range dependency)
50: %\cite{Leland94,Csabai94,Paxson95,Crovella97,Willinger97}. 
51: 
52: Internet traffic fluctuation is known to show  
53: self-similarity or long-range dependency
54: \cite{Leland94,Csabai94,Paxson95,Crovella97,Willinger97}. 
55: This self-similarity is the scale invariant property that 
56: the burst size of the flow density fluctuation 
57: seems to have the same tendency at various observation time scales. 
58: Recently, it was pointed out that 
59: network traffic behavior can be regarded as phase transition 
60: phenomena in statistical physics\cite{Takayasu96a,Takayasu99a,Fukudaphd}, 
61: which naturally involves the self-similar model.
62: The phase transition\cite{Stanley71} is characterized by dynamical 
63: phase changes between non-congested and congested phases, and 
64: self-similarity can be observed at the critical point between these two phases. 
65: 
66: Alongside these observations, several studies have investigated the mechanism 
67: behind the self-similarity observed in network traffic. 
68: In the application layer level, 
69: Crovella et al. explained self-similar traffic from the viewpoint that 
70: the sizes of files on the web server have a power-law 
71: distribution\cite{Crovella97}. 
72: For the datalink layer, Fukuda et al. showed that 
73: self-similar Ethernet traffic can be reproduced by the 
74: effects of the contention between the nodes and of  
75: the exponential backoff mechanism at the packet collisions\cite{Fukuda2000a}. 
76: Also, 
77: Park et al.\cite{Park97} and Feldmann et al.\cite{Feldmann99} 
78: pointed out that the transport layer functionality (especially TCP) strengthens 
79: the long-range dependency. 
80: Although these studies aimed to show the transport layer effect, 
81: their approaches implicitly assume long-range dependency at the 
82: application level, for example in the file size distribution. 
83: Thus, there is still little physical understanding of how the 
84: transport layer functionality by itself contributes to the 
85: generation of self-similar traffic. 
86:  
87: In this paper, we focus on the transport layer effect 
88: independent of the application level causality. 
89: In other words, we investigate essential factors for generating  
90: self-similar traffic in TCP. 
91: To clarify these factors, 
92: we performed a simulation on a simple topology using the ns-2 simulator. 
93: The results show that phase transition phenomena (and 
94: self-similarity) quite similar to those observed 
95: in the actual traffic can occur in 
96: aggregated TCP traffic even when   
97: input traffic sources have no temporal correlation or  
98: long-tailed distribution of file size. 
99: We also demonstrate that the feedback mechanism (especially 
100: acknowledgement packet driven events) plays  
101: an important role in generating the self-similarity observed in  
102: actual traffic, whereas the rate control and retransmission mechanisms have less 
103: impact on it.
104: 
105: \section{SIMULATION SETUP AND ANALYSIS METHOD}
106: \subsection{Simulation Setup}
107: In this simulation, we used the VINT network simulator ns-2, 
108: extended with some Tcl scripts and C++ code. 
109: The scenario was file transfer between a sender and a receiver located on the 
110: leaf nodes. 
111: Figure 1 shows our simple simulation topology consisting of  
112: three leaf nodes and one router. 
113: \begin{figure}[htbp]
114: \begin{center}
115: \epsfile{file=fig/topology.ps,scale=0.9}
116: \end{center}
117: \caption{Network topology.}
118: \end{figure}
119: Each connection used TCP Reno as the basic transport protocol, and 
120: a TCP connection was established between a pair of randomly selected 
121: leaf nodes. 
122: The connection inter-arrival times were exponentially distributed. 
123: It is important to note that the number of packets in a connection  
124: followed an exponential distribution (mean size = 100 packets) 
125: throughout this simulation. 
126: Namely, the number of packets per connection 
127: had no temporal correlation, which means that there was no 
128: application-level causality. 
129: This condition is needed to isolate the contribution of the transport mechanism 
130: to the self-similarity. 
131: The buffer sizes in the leaf nodes and the router were set to 800 packets in 
132: most simulations, and the packet size was set to 576 bytes. 
133: Also, the bandwidth and transmission delay of each half-duplex link were  
134: $500$ kbyte/sec and $50$ msec., respectively. 
135: The large buffer size was chosen for easy analysis of the fluctuation of  
136: packets in the buffer. 
137: Our results were obtained from several runs of the simulation, 
138: each lasting $4800$ seconds. 
139: We were interested in the statistical behaviors of the aggregated traffic 
140: streams when the connection arrival rate (to be denoted as ``r'') 
141: to the leaf nodes varied. 
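
For orientation, a few derived quantities follow directly from these 
parameters, assuming $1$ kbyte $= 1000$ bytes and ignoring protocol overhead 
(both are assumptions made here, not statements about the simulator 
configuration). The time needed to transmit one packet on a link is 
\[
\frac{576\ \mbox{bytes}}{500\,000\ \mbox{bytes/sec}} \approx 1.15\ \mbox{msec},
\]
so a link carries at most about $868$ packets per second. 
A full buffer holds $800 \times 576 = 460\,800$ bytes, and draining it 
takes about $800 \times 1.15\ \mbox{msec} \approx 0.92$ sec. 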
142: 
143: \subsection{Analysis Method}
144: In order to examine the self-similar nature of network traffic fluctuation, 
145: we focused on two empirical distributions, 
146: namely the congestion duration and the recurrent time of the queue length.
147: 
148: The congestion duration length distribution is a well-known distribution  
149: for judging the self-similarity of a given time series\cite{Takayasu93b,Willinger97}. 
150: The congestion state is defined by the condition that 
151: the output flow density in the observed link is larger than a certain 
152: threshold flow density. 
153: Then, the congestion duration length ($L$) is calculated as  
154: the number of consecutive congestion states multiplied by 
155: the bin size.
156: The cumulative distribution of this duration ($P(>L)$)  
157: is a power-law distribution with exponent $-1.0$ 
158: ($P(>L) \propto L^{-1.0}$) when the original flow fluctuation is 
159: self-similar\cite{Takayasu93b}, 
160: which is characterized by the $1/f$ type power spectrum.
161: Theoretically, the power-law distribution is observed independent of 
162: the value of the threshold if the original fluctuation is self-similar. 
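
As a compact restatement of this definition (the symbols $x_i$, 
$x_{\rm th}$, $\Delta t$, $n$, and $p(L)$ are notation assumed here for 
illustration), let $x_i$ denote the flow density in the $i$-th bin of width 
$\Delta t$, and call bin $i$ congested when $x_i > x_{\rm th}$. 
A maximal run of $n$ consecutive congested bins then gives 
\[
L = n\,\Delta t ,
\]
and the self-similar case corresponds to 
\[
P(>L) \propto L^{-1.0}, \qquad \mbox{or equivalently} \qquad 
p(L) = -\frac{dP(>L)}{dL} \propto L^{-2.0},
\]
where $p(L)$ is the probability density of $L$. 
On a log-log plot of $P(>L)$ against $L$, this appears as a straight line 
with slope $-1.0$. 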
163: 
164: 
165: The recurrent time of the queue length is 
166: defined as the time that elapses until the queue length returns to zero. 
167: The cumulative distribution of this recurrent time obeys the same 
168: power-law distribution with exponent $-1.0$ as the congestion duration length.
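
One way to write this down, under the assumption that the recurrent time is 
measured from the moment the queue becomes non-empty (the symbols $q(t)$, 
$t_0$, and $T$ are introduced here only for illustration), is 
\[
T = \min\{\, t > t_0 : q(t) = 0 \,\} - t_0 ,
\]
where $q(t)$ is the queue length and $t_0$ is a time at which the queue 
leaves the empty state. 
The self-similar case then corresponds to $P(>T) \propto T^{-1.0}$. 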
169: 
170: We measured the congestion duration length on the link from 
171: the router to leaf node 2 in Figure 1 (denoted by A). 
172: Also, we observed the queue length at the router's output queue to leaf node 3.  
173: 
174: 
175: \section{TRANSPORT LAYER EFFECT}
176: \subsection{Real Traffic Flow}
177: In this subsection, we review the traffic fluctuation 
178: in an actual network focusing on the self-similarity and the phase 
179: transition phenomena. 
180: 
181: \begin{figure}[htbp]
182: \begin{center}
183: \epsfile{file=fig/onoff3.ps,scale=0.6}
184: \end{center}
185: \caption{Congestion duration length in actual network traffic. 
186: The straight line indicates the slope $-1.0$.}
187: \end{figure}
188: Figure 2 shows the cumulative distribution of the 
189: congestion duration length in an actual traffic flow\footnote{
190: More detailed analysis is shown in  \cite{Fukudaphd}.}. 
191: This traffic trace was collected in the 
192: Ethernet link connecting the WIDE backbone in Japan and 
193: the US west coast for 4 hours; 
194: 80\% of the traffic was due to web applications. 
195: The three curves in the figure correspond to different mean flow 
196: densities of the traffic. 
197: This figure shows that for medium flow 
198: the distribution is approximately the power-law distribution with 
199: exponent $-1.0$ representing  self-similarity.
200: However, away from the critical point the distributions deviate from the 
201: power-law distribution. 
202: When the total amount of traffic is small, 
203: the congestion duration length obeys an exponential distribution 
204: characterized by the short-time dependency.  
205: On the other hand, 
206: the larger traffic density case denoted by the high flow in this figure 
207: demonstrates the existence of the large-cluster congestion. 
208: Consequently, this result clearly shows that self-similarity occurs in  
209: a special case in the actual network traffic, and the phase transition view 
210: can capture all the properties\cite{Fukudaphd,Takayasu99a}. 
211: 
212: 
213: \subsection{TCP Traffic Behavior}
214: We are interested in the mechanism of generating the self-similar traffic 
215: observed in the previous subsection from the standpoint of the transport 
216: layer functionality.
217: This subsection describes a numerical simulation with the ordinary TCP Reno 
218: algorithm. 
219: 
220: \begin{figure}[htbp]
221: \begin{center}
222: \epsfile{file=fig/normal_tcp/onoff/on.ps,scale=0.6}
223: \end{center}
224: \caption{Congestion duration length of TCP Reno. 
225: The straight line indicates the slope $-1.0$.}
226: \end{figure}
227: Figure 3  shows the cumulative distribution of the 
228: congestion duration length of the aggregated TCP traffic flows at  link A 
229: in Figure 1. 
230: The threshold value of congestion level was empirically set to 
231: 5000 bytes throughout this simulation. 
232: The three lines correspond to three different connection arrival rates
233: (r = 0.5, 2.5, 4.0 connections/sec). 
234: The mean connection duration times were 1.53, 13.4, and 412.64 sec., 
235: respectively. 
236: We found that the slope of the distribution at r = 2.5 was approximately 
237: $-1.0$ (the slope of the straight line in the figure), 
238: which is recognized as the critical point behavior.
239: For a medium connection arrival rate, 
240: the traffic flow was self-similar, like the actual traffic. 
241: Also, the plot decayed exponentially below the critical point (r = 0.5), 
242: and it was characterized by a stretched curve above the critical point
243: (r = 4.0) representing the existence of coarse-grained congestion. 
244: These curves are completely consistent with the actual traffic 
245: behavior. 
246: The most significant point in this simulation is  that 
247: this power-law distribution with exponent $-1.0$ could be observed 
248: even when the input traffic 
249: followed an exponential distribution, not assuming a fat-tail distribution. 
250: Namely, the self-similarity appeared in the traffic fluctuation  
251: independent of the input file size distribution at the critical point.
252: 
253: \begin{figure}[htbp]
254: \begin{center}
255: \epsfile{file=fig/normal_tcp/rec/on.ps,scale=0.6}
256: \end{center}
257: \caption{Recurrent time of queue length. 
258: The straight line indicates the slope $-1.0$.}
259: \end{figure}
260: Next, we show the distribution of the recurrent time of the queue length 
261: at the router's queue (Figure 4). 
262: Again, the plotted curve  followed the 
263: power-law distribution with exponent $-1.0$ at the critical point (r = 2.5), 
264: which is the same connection arrival rate as in the congestion duration 
265: length analysis.
266: Also, when the connection arrival rate was smaller, the 
267: plotted curve exhibited a quick decay, and 
268: a larger rate led to a more stretched distribution due to 
269: the large size congestion.  
270: 
271: Additionally, we confirmed that the congestion duration length distribution 
272: and the recurrent time distribution had the same 
273: phase transition behavior in all  leaf nodes (links).
274: 
275: \subsection{Effect of TCP Component}
276: Section 3.2 clarified that 
277: transport functionality (TCP) itself plays a role in producing 
278: self-similar traffic. Now we 
279: focus on the generation mechanism of the self-similarity in 
280: the aggregated traffic fluctuation.
281: This subsection explains 
282: the results of additional simulations based on modified TCP 
283: in order to clarify the mechanism of the 
284: power-law distribution with exponent $-1.0$ observed in the previous subsection.
285: 
286: The first modification removes the slow start algorithm, which increases 
287: the transmission rate exponentially. 
288: The modified algorithm employs a linear rate increment even at 
289: connection start, instead of the original exponential rate increment.
290: 
291: Figure 5  shows the congestion duration length distribution of 
292: the linear rate increment case with feedback control. 
293: The aggregated traffic also exhibited a phase transition similar to the normal 
294: TCP simulation. 
295: Thus, the phase transition and corresponding self-similarity are 
296: found to be independent of the details of the increment algorithm of 
297: the transmission rate.
298: \begin{figure}[htbp]
299: \begin{center}
300: \epsfile{file=fig/linear_tcp/onoff/on.ps,scale=0.6}
301: \end{center}
302: \caption{Congestion duration length of linear start TCP. 
303: The straight line indicates the slope $-1.0$.}
304: \end{figure}
305: 
306: Next, we modified the feedback control algorithm. 
307: We checked two non-feedback control schemes, CBR over UDP, and 
308: a linear rate increment algorithm over UDP. 
309: In the CBR-over-UDP simulation, the packet inter-arrival time  
310: was set to 20 msec. 
311: In the linear rate increment algorithm over UDP,  
312: the transmission rate from the sender increases by 1 
313: at every fixed interval (150 msec).
314: The latter method is similar to the previous linear increment algorithm 
315: modified from  the original TCP. 
316: The difference is in the trigger of the packet transmission event; 
317: namely, packet transmission is driven by the fixed-time interval 
318: event, independent of the reception of the acknowledgement packets.
319: The distributions of the number of 
320: packets to be sent and connection arrival duration are exponential 
321: as in the previous simulations.
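
In terms of raw numbers, and assuming the 576-byte packets of the setup also 
apply to each CBR source (an assumption made here), the 20 msec interval 
corresponds to 
\[
\frac{1}{20\ \mbox{msec}} = 50\ \mbox{packets/sec}, \qquad 
50 \times 576 = 28\,800\ \mbox{bytes/sec}
\]
per source. 
The contrast between the two triggers can be summarized as follows, with 
$t_k$ and $t_0$ being notation introduced here for illustration: 
in the timer-driven scheme the rate updates occur at 
\[
t_k = t_0 + k \times 150\ \mbox{msec}, \qquad k = 1, 2, \ldots,
\]
regardless of the network state, whereas in TCP transmissions are triggered 
by arriving acknowledgements, whose arrival times include the queueing delay 
experienced in the network. 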
322: 
323: The distribution of congestion duration length for the non-feedback 
324: control algorithm is shown in Figure 6. 
325: The two non-feedback algorithms reproduce similar statistical tendencies of 
326: the congestion duration length; therefore, we only show the result of the 
327: linear rate increment algorithm over UDP. 
328: We found that the exponent of the power-law distribution had a value, $-0.5$, 
329: clearly different from the exponent, $-1.0$, obtained in the previous 
330: subsection, 
331: although phase transition behavior was observed qualitatively as in the 
332: previous simulations (the straight line in the figure indicates slope $-0.5$).
333: This type of exponent is known for the single buffer system with 
334: Poisson input\cite{Takayasu96a}. 
335: Thus, from this simulation, we can conclude that the feedback mechanism is 
336: important in generating a self-similar fluctuation with  exponent $-1.0$,  
337: which is observed in  actual Internet traffic.
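
The value $-0.5$ is what a standard random-walk argument would suggest. 
The following is only a sketch, under the assumption that the queue length at 
the critical point behaves like an unbiased random walk; it is offered for 
illustration and is not derived from the simulations. 
For such a walk started just above zero, the probability density $p(T)$ of 
the first return time $T$ to zero decays as $T^{-3/2}$ for large $T$, so that 
\[
P(>T) = \int_T^{\infty} p(T')\,dT' \propto \int_T^{\infty} T'^{-3/2}\,dT' 
\propto T^{-1/2},
\]
which corresponds to a slope of $-0.5$ in the cumulative distribution. 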
338: \begin{figure}[htbp]
339: \begin{center}
340: \epsfile{file=fig/linear_udp/onoff/on.ps,scale=0.6}
341: \end{center}
342: \caption{Congestion duration length of non-feedback algorithm. 
343: The straight line indicates the slope $-0.5$.}
344: \end{figure}
345: 
346: \subsection{Effect of Buffer Capacity}
347: \begin{figure}[htbp]
348: \begin{center}
349: \epsfile{file=fig/phase/phase.ps,scale=0.6}
350: \end{center}
351: \caption{Packet loss and critical point vs. buffer capacity.}
352: \end{figure}
353: Figure 7 plots the connection arrival rate at which  
354: packet loss is first observed in the system 
355: as a function of the buffer capacity in the nodes 
356: together with the critical connection arrival rate showing the power-law 
357: distribution.
358: 
359: This figure shows that both the critical point and the packet 
360: loss point become larger as  
361: the buffer capacity increases.  
362: However, it should be emphasized that self-similarity was observed without 
363: any packet loss for buffer capacities larger than 400; 
364: that is, retransmission is not directly 
365: involved in the generation of the phase transition. 
366: Moreover, we confirmed that no retransmission timer 
367: timeout occurred in the above range. 
368: This is evidence that the exponent of the power-law distribution 
369: is independent of the retransmission method.
370: Also, a larger buffer capacity leads to a larger critical point value;  
371: however, the buffer capacity itself does not 
372: affect the generation of the phase transition phenomena. 
373: 
374: %\section{Discussion}
375: \section{CONCLUDING REMARKS}
376: In this paper we  focused on a simple mechanism for generating  
377: the self-similarity observed in actual network traffic at 
378: the transport layer. 
379: In order to clarify the essence of the mechanism, 
380: we performed simulations with simple settings such as 
381: exponential file size, fixed-size packets, and the simple topology.
382: 
383: We showed that the self-similar traffic is observable in these simple 
384: settings when the traffic sources send packets using the normal TCP algorithm.
385: In addition, the reproduced traffic behavior is consistent with the 
386: phase transition model. 
387: The most significant result is that the self-similarity appears even with the 
388: exponential input file size. 
389: This indicates that the transport protocol itself includes the 
390: mechanism generating self-similar traffic. 
391: 
392: Moreover, from the results for the modified algorithms, we clarified that the 
393: feedback mechanism in TCP, especially the packet transmission triggered by 
394: the acknowledgement packet, 
395: is an essential factor in generating the 
396: self-similarity.
397: Traffic generated without feedback by the linear rate increment algorithm 
398:  exhibits similar phenomena; 
399: however, the exponent of the power-law distribution, $-0.5$, is 
400: inconsistent with that of the TCP with linear rate increment algorithm, $-1.0$. 
401: This indicates that an essential factor is the acknowledgement-driven packet 
402: transmission rather than the timer-driven one.
403: We also confirmed in both the normal and modified TCP simulations that 
404: the inter-arrival time of acknowledgement packets in a TCP 
405: connection follows a power-law distribution at the critical point
406: (over the range 0.01 -- 1.0 seconds).
407: Namely, the origin of the self-similar traffic is likely 
408: the feedback mechanism from the network, such as the delay of  
409: acknowledgement packets.
410: 
411: Finally, we showed that the retransmission mechanism and 
412: buffer capacity have less impact on the generation of self-similarity 
413: (1/f type fluctuation).
414: We conclude that these effects only work to stretch the time spent 
415: in the self-similar state. 
416: The network state seems to have been heavily congested at the 
417: critical connection arrival rate in our simulation. However, 
418: there are congested routers in actual wide area networks, 
419: and our results indicate that if a flow passes through the router in the 
420: critical state once, the flow can be self-similar. 
421: 
422: Our simulation results have extracted the essence of the origin of  
423: self-similar traffic from an actual complex network system 
424: (both topologically and algorithmically). 
425: The future direction of this research will be to support the development of  a 
426: more effective congestion control algorithm based on the knowledge 
427: obtained from these simulations. 
428: 
429: \section*{ACKNOWLEDGEMENTS}
430: We wish to thank Kenjiro Cho and Toshio Hirotsu for helpful discussions. 
431: 
432: %\footnotesize
433: \bibliographystyle{plain}
434: %\bibliography{./bib/acm,./bib/ton,./bib/ieee-conf,./bib/ieee-journal,./bib/fractal,./bib/misc,./bib/cs-fractal,./bib/rfc}
435: \input{wcs-bib.tex}
436: 
437: \section*{AUTHOR BIOGRAPHIES}
438: \noindent{\bf KENSUKE FUKUDA} is a research scientist in Network Innovation 
439: Laboratories at Nippon Telegraph and Telephone Corporation (NTT). 
440: He received a B.S. in Electrical Engineering, and an M.S.  
441: and Ph.D. in Computer Science, from Keio University, Japan. 
442: His research interests include network traffic analysis and modeling, 
443: traffic control, and network protocol design. 
444: 
445: \noindent{\bf MISAKO TAKAYASU} is an associate professor in the 
446: Department of Complex Systems, Future University - Hakodate, Japan.  
447: She received a B.S. in Physics from Nagoya University,
448: and a Ph.D. in Science from Kobe University, Japan.
449: Her research interests include non-equilibrium physics, especially 
450: dynamic phase transitions, and information physics.
451: 
452: \noindent{\bf HIDEKI TAKAYASU} is a senior researcher 
453: at Sony Computer Science Laboratories Inc.
454: He received a B.S. and Ph.D. in Physics from Nagoya University. He was a
455: professor in the Graduate School of Information Sciences at 
456: Tohoku University, Japan. His research covers a wide range of
457: topics relating to fractals, such as stable distributions, earthquakes, and
458: econophysics.
459: 
460: 
461: 
462: \end{document}
463: