1: \documentclass[12pt]{article}
2: %
3: % Vladimir Batagelj
4: % Efficient Algorithms for Citation Network Analysis
5: % ----------------------------------------------------------------------------
6: % version : Jan 21, 1991 Pittsburgh, Some Mathematics of Network Analysis
7: % version : Aug 28, 1994 slides
8: % version : May 5, 1996 LaTeX
9: % version : Aug 31, 1997 corrections
10: % version : Sep 15, 2001 used in Layouts for GD01 Graph-Drawing Competition
11: % version : Sep 1, 2002 extensions, real-life example
12: % version : Aug-Sep 2003 extensions, islands, US patents
13: %
14: % Pictures:
15: % networkc.eps, preprint.eps, mainP.eps, CPM.eps, som07LH.eps, main.eps,
16: % CPMpath.eps, size.eps, islandMa.eps, island50.eps, island38.eps
17: % ----------------------------------------------------------------------------
18: %
19:
20: \usepackage{latexsym}
21: \usepackage{times}
22: \usepackage[dvips]{graphicx}
23:
24: \newcommand{\Qed}{\hspace*{1mm}\hfill$\Box$\endgraf}
25: \def\RR{\hbox{\sf I\kern-.14em\hbox{R}}}
26: \def\NN{\hbox{\sf I\kern-.13em\hbox{N}}}
27: \def\Min{\mathop{\rm Min}\nolimits}
28: \def\Max{\mathop{\rm Max}\nolimits}
29: \newcommand{\Units}{\mathbf{U}}
30: \newcommand{\Net}{\mathbf{N}}
31: \newcommand{\DK}[1]{\stackrel{\rightharpoonup}{\mathbf{K}}_{#1}}
32: \newcommand{\inv}[1]{#1^\mathrm{inv}}
33: \newcommand{\trecl}[1]{{#1}^\star}
34: \newcommand{\tracl}[1]{\overline{#1}}
35: \newcommand{\url}[1]{{\textbf{\texttt{\small #1}}}}
36: \renewcommand{\textfraction}{.05}
37: \renewcommand{\topfraction}{.95}
38:
39: \oddsidemargin 5pt \evensidemargin 5pt \marginparwidth 20pt
40: \marginparsep 10pt \topmargin -12 true mm \headheight 12pt \headsep 25pt
41: \textheight 23 true cm \textwidth 16 true cm
42: \columnsep 10pt \columnseprule 0pt
43:
44:
45: \title{Efficient Algorithms for Citation Network Analysis}
46: \author{Vladimir Batagelj \\
47: University of Ljubljana, Department of Mathematics,\\
48: Jadranska 19, 1\,111 Ljubljana, Slovenia \\
49: e-mail: \texttt{vladimir.batagelj@uni-lj.si}}
50: \date{}
51: %\date{\today}
52: %\date{September 14, 2003}
53:
54: \begin{document}
55: \maketitle
56:
57: \begin{abstract}
This paper proposes very efficient algorithms, linear in the number of arcs,
for determining Hummon and Doreian's arc weights SPLC and SPNP in a
citation network, and presents some theoretical properties
of these weights. The nonacyclicity problem in
citation networks is discussed. An approach for identifying
an important small subnetwork on the basis of arc weights is proposed
and illustrated on the citation networks of the SOM (self-organizing maps)
literature and US patents.
66: \\[4pt]
67: \textbf{Keywords:} large network, acyclic, citation network,
68: main path, CPM path, arc weight,
69: algorithm, self organizing maps, patent
70: \end{abstract}
71:
72: \section{Introduction}
73:
Citation network analysis started with the
paper of Garfield et al. (1964) \cite{Gar64}, in which the introduction
of the notion of citation network is attributed to Gordon Allen.
In that paper, using the example of Asimov's history of DNA \cite{Asimov},
it was shown that the analysis ``\textit{demonstrated a high
degree of coincidence between an historian's account of events and the citational
relationship between these events}''. An early overview of possible
applications of graph theory in citation network analysis was given
in 1965 by Garner \cite{3D}.
83:
The next important step was made by Hummon and Doreian (1989)
\cite{HumDor89,HumDor90,HuDoFr90}. They proposed three indices
(NPPC, SPLC, SPNP) --
weights of arcs that provide an automatic way to identify
the (most) important part of the citation network -- the main path
analysis.

In this paper we go a step further. We show how to compute
Hummon and Doreian's weights efficiently, so that they can also be used
for the analysis of very large citation networks with several
thousands of vertices. In addition, some theoretical properties
of Hummon and Doreian's weights are presented.
96:
The proposed methods are implemented in \texttt{\textbf{Pajek}} --
a program for Windows (32 bit) for the analysis of \emph{large networks}.
It is freely available for noncommercial use at its homepage
\cite{pajek}.
101:
102:
103:
104: For basic notions of graph theory see Wilson and Watkins \cite{GT}.
105:
106: \section{Citation Networks}
107:
In a given set of units $\Units $ (articles, books, works,
\ldots) we introduce a \emph{citing} relation
110: $R \subseteq \Units \times \Units $
111: \[ u R v \equiv v \mbox{ cites } u \]
112: which determines a \emph{citation network} $\Net = (\Units ,R)$.
113:
Table~\ref{citnets} presents some characteristics of real-life citation networks.
Most of these networks were obtained from Eugene Garfield's
collection of citation data \cite{Gar64,Gar02} produced using
117: \textit{\textbf{HistCite}} Software (formerly called \textit{\textbf{HistComp}}
118: -- \textit{comp}iled \textit{Hist}oriography program)
119: \cite{Gar01}. All of these networks are the result of searches in the Web of Science
120: and are used with the permission of ISI of Philadelphia,
121: \texttt{\textbf{www.isinet.com}}. These networks in \texttt{\textbf{Pajek}}'s format
122: are available from \texttt{\textbf{Pajek}}'s web site \cite{data}.
123:
124: In Table~\ref{citnets}: $n = |\Units|$ is the number of vertices;
125: $m = |R|$ is the number of arcs;
126: $m_0$ is the number of loops;
127: $n_0$ is the number of isolated vertices;
128: $n_C$ is the size of the largest weakly connected component;
129: $k_C$ is the number of nontrivial weakly connected components;
130: $h$ is the depth of network (minimum number of levels);
131: $\Delta_{in}$ is the maximum input degree;
132: and $\Delta_{out}$ is the maximum output degree.
133: The last three columns contain the numbers of strongly connected components
134: (cyclic parts) of size 2, 3 and 4.
135:
136:
137: \begin{table}
138: \caption{Citation network characteristics\label{citnets}}
139: \begin{center}\footnotesize
140: \begin{tabular}{l|r|r|r|r|r|r|r|r|r|r|r|r|}
141: network & $n$ & $m$ & $m_0$ & $n_0$ & $n_C$ & $k_C$ & $h$&$\Delta_{in}$&$\Delta_{out}$&$2$& $3$ & $4$ \\ \hline
142: DNA & 40 & 60 & 0 & 1 & 35 & 3 & 11 & 7 & 5 & 0 & 0 & 0 \\
143: Coupling & 223 & 657 & 1 & 5 & 218 & 1 & 16 & 19 & 134 & 0 & 0 & 0 \\
144: Small world & 396 & 1988 & 0 & 163 & 233 & 1 & 16 & 60 & 294 & 0 & 0 & 0 \\
145: Small \& Griffith & 1059 & 4922 & 1 & 35 & 1024 & 1 & 28 & 89 & 232 & 2 & 0 & 0 \\
146: Cocitation & 1059 & 4929 & 1 & 35 & 1024 & 1 & 28 & 90 & 232 & 2 & 0 & 0 \\
147: Scientometrics & 3084 & 10416 & 1 & 355 & 2678 & 21 & 32 & 121 & 105 & 5 & 2 & 1 \\
148: Kroto & 3244 & 31950 & 1 & 0 & 3244 & 1 & 32 & 166 & 3243 & 6 & 0 & 0 \\
149: SOM & 4470 & 12731 & 2 & 698 & 3704 & 27 & 24 & 51 & 735 & 11 & 0 & 0 \\
150: Zewail & 6752 & 54253 & 1 & 101 & 6640 & 5 & 75 & 166 & 227 & 38 & 1 & 2 \\
151: Lederberg & 8843 & 41609 & 7 & 519 & 8212 & 35 & 63 & 135 & 1098 & 54 & 4 & 0 \\
152: Desalination & 8851 & 25751 & 7 & 1411 & 7143 & 115 & 27 & 73 & 137 & 12 & 0 & 1 \\
153: US patents & 3774768 & 16522438 & 1 & 0 & 3764117 & 3627 & 32 & 779 & 770 & 0 & 0 & 0 \\ \hline
154: \end{tabular}
155: \end{center}
156: \end{table}
157:
158: A citing relation is usually \emph{irreflexive},
159: $\forall u \in \Units : \lnot u R u$,
160: and (almost) \emph{acyclic} -- no vertex is reachable from
161: itself by a nontrivial path, or formally
162: $\forall u \in \Units \forall k \in \NN^+ : \lnot u R^k u$.
163: In the following we shall assume that it has this property.
We shall postpone the question of how to deal with nonacyclic
citation networks until the end of the theoretical part of
the paper.
167:
168: For a relation $Q \subseteq \Units \times \Units $ we denote by
169: $\inv{Q}$ its \emph{inverse} relation,
170: $u \inv{Q} v \equiv v Q u$,
171: and by
172: \[ Q(u) = \{ v \in \Units : u Q v \} \]
173: the set of successors of unit $u \in \Units $.
174: If $Q$ is acyclic then also $\inv{Q}$ is acyclic.
175: This means that the network $\inv{\Net} = (\Units, \inv{R})$,
176: $u \inv{R} v \equiv u \mbox{ cites } v$, is a network of the same
177: type as the original citation network $\Net = (\Units,R)$.
Therefore it is just a matter of `taste' which relation
to select.
180:
181: Let $I = \{ (u,u) : u \in \Units \}$ be the \emph{identity} relation
182: on $\Units $ and $\tracl{Q} = \bigcup_{k \in \NN^+} Q^k$
183: the \emph{transitive closure} of relation $Q$. Then
184: $Q$ is acyclic iff $\tracl{Q} \cap I = \emptyset$.
185: The relation $\trecl{Q} = \tracl{Q} \cup I$ is the
186: \emph{transitive and reflexive closure} of relation $Q$.
187:
188:
189: Since the set of units $\Units $ is finite and $R$ is acyclic we
190: know from the theory of relations that:
191: \begin{itemize}
192: \item The set of units $\Units $ can be \emph{topologically ordered} --
193: there exists a surjective mapping (permutation) $i : \Units \to 1 .. |\Units |$
194: with the property
195: \[ u R v \Rightarrow i(u) < i(v) \]
196: \item Let $\Min R = \{ u \in \Units : \inv{R}(u) = \emptyset \}$ be the set
197: of \emph{minimal} elements and
198: $\Max R = \{ u \in \Units : R(u) = \emptyset \}$
199: the set of \emph{maximal} elements. Then $\Min R \ne \emptyset$ and
200: $\Max R \ne \emptyset$.
201: \item Every unit $u \in \Units$ and every arc $(u,v) \in R$ belong
202: to at least one path from $\Min R $ to $\Max R$:\\
203: $\forall u \in \Units : R^\star (u) \cap \Max R \ne \emptyset$ \\
204: $\forall u \in \Units : {\inv{R}}^\star (u) \cap \Min R \ne \emptyset$
205: \end{itemize}
206:
207: \begin{figure}
208: \begin{center}
209: \includegraphics[width=60mm,viewport=10 4 217 280]{./pics/networkc.eps}
210: \caption{Citation Network in Standard Form\label{net}}
211: \end{center}
212: \end{figure}
213:
214:
215: To simplify the presentation we transform a citation network $\Net = (\Units ,R)$ to its
216: \emph{standard form} $\Net' = (\Units',R')$ (see Figure~\ref{net})
217: by extending the set of units
218: $\Units ' := \Units \cup \{ s, t \}$, $s, t \notin \Units $ with a common
219: \emph{source} (initial unit) $s$ and a common \emph{sink}
220: (terminal unit) $t$, and by adding the corresponding arcs to relation $R$
221: \[ R' := R \ \cup\ \{s\} \times \Min R\ \cup\ \Max R \times \{t\}
222: \ \cup\ \{ (t,s) \} \]
223: This eliminates problems with networks with several connected components
224: and/or several initial/terminal units.
225: In the following we shall assume that the citation network
226: $\Net = (\Units,R)$ is in the standard form.
Note that, to make the theory smoother, we also added to $R'$ the
`feedback' arc $(t,s)$, thus destroying its acyclicity.
229:
230:
231: \section{Analysis of Citation Networks}
232:
One approach to the analysis of a citation network is to determine
for each unit / arc its \emph{importance} or \emph{weight}. These values
are used afterwards to determine the essential substructures in the
network.
237: In this paper we shall focus on the methods of assigning weights
238: $w : R \to \RR^+_0$
239: to arcs proposed by Hummon and Doreian \cite{HumDor89,HumDor90}:
240: \begin{itemize}
241: \item \emph{node pair projection count} (NPPC) method:
242: $w_d(u,v) = |\trecl{\inv{R}}(u)|\cdot|\trecl{R}(v)|$
\item \emph{search path link count} (SPLC) method: $w_l(u,v)$ equals
the number of ``\textit{all possible search paths through the network
emanating from an origin node}'' through the arc $(u,v) \in R$
\cite[p. 50]{HumDor89}.
\item \emph{search path node pair} (SPNP) method:
$w_p(u,v)$ ``\textit{accounts for
all connected vertex pairs along the paths through the arc $(u,v) \in R$}''
\cite[p. 51]{HumDor89}.
251: \end{itemize}
252:
253: \subsection{Computing NPPC weights}
254:
To compute $w_d$ for sets of units of moderate size (up to some thousands of units)
the matrix representation of $R$ can be used and its transitive
closure computed by the Roy-Warshall algorithm \cite{algo}. The quantities
$|\trecl{R}(v)|$ and $|\trecl{\inv{R}}(u)|$ can be obtained
from the closure matrix as row/column sums.
An $O(nm)$ algorithm for computing $w_d$ can be constructed using
Breadth First Search from each $u \in \Units$ to determine
$|\trecl{\inv{R}}(u)|$ and $|\trecl{R}(v)|$.
Since its running time grows at least quadratically in $n$, this algorithm is not suitable
for larger networks (several tens of thousands of vertices).
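
As a small illustration (the four-unit network used here is a hypothetical
example, not one of the networks from Table~\ref{citnets}), take
$\Units = \{a,b,c,d\}$ and $R = \{(a,b),(a,c),(b,c),(b,d),(c,d)\}$,
considered without the added units $s$ and $t$. Then
\[ \trecl{\inv{R}}(a) = \{a\}, \quad \trecl{\inv{R}}(b) = \{a,b\}, \quad
   \trecl{\inv{R}}(c) = \{a,b,c\}, \quad \trecl{\inv{R}}(d) = \{a,b,c,d\} \]
\[ \trecl{R}(a) = \{a,b,c,d\}, \quad \trecl{R}(b) = \{b,c,d\}, \quad
   \trecl{R}(c) = \{c,d\}, \quad \trecl{R}(d) = \{d\} \]
and the definition of $w_d$ gives
\begin{eqnarray*}
w_d(a,b) & = & 1 \cdot 3 = 3 \\
w_d(a,c) & = & 1 \cdot 2 = 2 \\
w_d(b,c) & = & 2 \cdot 2 = 4 \\
w_d(b,d) & = & 2 \cdot 1 = 2 \\
w_d(c,d) & = & 3 \cdot 1 = 3
\end{eqnarray*}
In this example the largest weight is obtained on the arc $(b,c)$ in the middle
of the network.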
265:
266: \subsection{Search path count method}
267:
268: To compute the SPLC and SPNP weights we introduce a related
269: \textit{search path count} (SPC) method for which the
270: weights $N(u,v)$, $u R v$ count the number of
271: different paths from $s$ to $t$ (or from $\Min R$ to $\Max R$)
272: through the arc $(u,v)$.
273:
To compute $N(u,v)$ we introduce two auxiliary quantities:
let $N^-(v)$ denote the number of different $s$-$v$ paths,
and $N^+(v)$ the number of different $v$-$t$ paths.
277:
278: Every $s$-$t$ path $\pi$ containing the arc $(u,v) \in R$
279: can be uniquely expressed in the form
280: \[ \pi = \sigma \circ (u,v) \circ \tau \]
where $\sigma$ is an $s$-$u$ path and $\tau$ is a $v$-$t$ path.
Since every pair $(\sigma,\tau)$ of
$s$-$u$ / $v$-$t$ paths gives a corresponding
$s$-$t$ path, it follows that
285: \[ N(u,v) = N^-(u)\cdot N^+(v), \qquad (u,v) \in R \]
286: where
287: \[
288: N^-(u) =
289: \cases{
290: 1 & $u = s$ \cr
291: \sum_{v : v R u} N^-(v) \quad & otherwise}
292: \]
293: and
294: \[
295: N^+(u) =
296: \cases{
297: 1 & $u = t$ \cr
298: \sum_{v : u R v} N^+(v) \quad & otherwise}
299: \]
This is the basis of an efficient algorithm for computing
the weights $N(u,v)$:
after a topological sort of the network \cite{algo}
we can compute the weights in time of order $O(m)$ by evaluating the above
relations for $N^-$ in topological order and for $N^+$ in reverse
topological order.
These orders ensure that all the quantities in
the right-hand sides of the above equalities are already
computed when needed. The counters $N(u,v)$ are used as SPC
weights $w_c(u,v) = N(u,v)$.
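
A worked illustration on a hypothetical four-unit network (assumed here for
illustration only, not taken from the data): let $\Units = \{a,b,c,d\}$ and
$R = \{(a,b),(a,c),(b,c),(b,d),(c,d)\}$, so that $\Min R = \{a\}$,
$\Max R = \{d\}$ and the standard form adds the arcs $(s,a)$, $(d,t)$ and $(t,s)$.
Processing the units in the topological order $s,a,b,c,d,t$ gives
\begin{eqnarray*}
N^-(s) = N^-(a) = N^-(b) & = & 1 \\
N^-(c) = N^-(a) + N^-(b) & = & 2 \\
N^-(d) = N^-(b) + N^-(c) & = & 3 \\
N^-(t) = N^-(d) & = & 3
\end{eqnarray*}
and processing them in the reverse order gives
\begin{eqnarray*}
N^+(t) = N^+(d) = N^+(c) & = & 1 \\
N^+(b) = N^+(c) + N^+(d) & = & 2 \\
N^+(a) = N^+(b) + N^+(c) & = & 3 \\
N^+(s) = N^+(a) & = & 3
\end{eqnarray*}
The products $N^-(u)\cdot N^+(v)$ then yield
\[ N(s,a) = 3, \quad N(a,b) = 2, \quad N(a,c) = 1, \quad N(b,c) = 1, \quad
   N(b,d) = 1, \quad N(c,d) = 2, \quad N(d,t) = 3 \]
These values agree with a direct enumeration of the three $s$-$t$ paths
$s\,a\,b\,c\,d\,t$, $s\,a\,b\,d\,t$ and $s\,a\,c\,d\,t$: for example, the arc
$(a,b)$ lies on the first two of them.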
309:
310: \subsection{Computing SPLC and SPNP weights}
311:
The description of the SPLC method in \cite{HumDor89} is not very
precise. Analyzing the table of SPLC weights from
\cite[p. 50]{HumDor89} we see that we have to consider
\textbf{each} vertex as an origin of search paths.
This is equivalent to applying the SPC method to the
extended network
$\Net_l = (\Units',R_l)$
\[ R_l := R'\ \cup\ \{ s \} \times (\Units \setminus R(s) ) \]
320:
It seems that there are some errors in the table of SPNP
weights in \cite[p. 51]{HumDor89}. Using the definition
of the SPNP weights we can again reduce their computation
to the SPC method applied to the extended network
$\Net_p = (\Units',R_p)$
\[ R_p := R \ \cup\ \{ s \} \times \Units \ \cup \ \Units
\times \{ t \} \ \cup\ \{ (t,s) \} \]
in which every unit $u \in \Units$ is additionally linked
from the source $s$ and to the sink $t$.
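
To illustrate both reductions, consider a hypothetical network (chosen here
only as an example) with $\Units = \{a,b,c,d\}$,
$R = \{(a,b),(a,c),(b,c),(b,d),(c,d)\}$ and the standard-form arcs
$(s,a)$, $(d,t)$, $(t,s)$. In $\Net_l$ the arcs $(s,b)$, $(s,c)$, $(s,d)$
are added, so the numbers of paths from $s$ become
\[ N^-(a) = 1, \quad N^-(b) = 1 + 1 = 2, \quad N^-(c) = 1 + 1 + 2 = 4, \quad
   N^-(d) = 1 + 2 + 4 = 7 \]
while the numbers of paths to $t$ stay
$N^+(a) = 3$, $N^+(b) = 2$, $N^+(c) = 1$, $N^+(d) = 1$. Hence
\[ w_l(a,b) = 2, \quad w_l(a,c) = 1, \quad w_l(b,c) = 2, \quad
   w_l(b,d) = 2, \quad w_l(c,d) = 4 \]
In $\Net_p$ the arcs $(a,t)$, $(b,t)$, $(c,t)$ are added as well. The values
$N^-$ are the same as in $\Net_l$, and
\[ N^+(d) = 1, \quad N^+(c) = 1 + 1 = 2, \quad N^+(b) = 2 + 1 + 1 = 4, \quad
   N^+(a) = 4 + 2 + 1 = 7 \]
so that
\[ w_p(a,b) = 4, \quad w_p(a,c) = 2, \quad w_p(b,c) = 4, \quad
   w_p(b,d) = 2, \quad w_p(c,d) = 4 \]
As a check, $w_l(c,d) = 4$ counts the search paths $a\,c\,d$, $a\,b\,c\,d$,
$b\,c\,d$ and $c\,d$, and $w_p(b,c) = 4$ counts the paths $a\,b\,c$,
$a\,b\,c\,d$, $b\,c$ and $b\,c\,d$.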
330:
331:
332: \subsection{Computing the numbers of paths of length $k$}
333:
We could also use a direct approach to determine the
weights $w_p$. Let $L^-(u)$ be the number of different
paths terminating in $u$
and $L^+(u)$ the number of different
paths originating in $u$.
Then for $uRv$ we have $ w_p(u,v) = L^-(u)\cdot L^+(v)$.
340:
The procedure to determine $L^-(u)$ and $L^+(u)$ can be compactly described using two
families of polynomial generating functions
\[ P^-(u;x) = \sum_{k=0}^{h(u)} p^-(u,k) x^k \qquad
\mbox{and} \qquad P^+(u;x) = \sum_{k=0}^{h^-(u)} p^+(u,k) x^k, \quad u \in \Units \]
where $h(u)$ is the depth of vertex $u$ in network $(\Units,R)$, and
$h^-(u)$ is the depth of vertex $u$ in network $(\Units,\inv{R})$.
The coefficient $p^-(u,k)$ counts the number of paths of length $k$ to $u$,
and $p^+(u,k)$ counts the number of paths of length $k$ from $u$.
349:
350: Again, by the basic principles of combinatorics
351: \[
352: P^-(u;x) =
353: \cases{
354: 0 & $u=s$ \cr
355: 1 + x \cdot \sum_{v : v R u} P^-(v;x) \quad & otherwise}
356: \]
357: and
358: \[
359: P^+(u;x) =
360: \cases{
361: 0 & $u=t$ \cr
362: 1 + x \cdot \sum_{v : u R v} P^+(v;x) \quad & otherwise}
363: \]
364: and both families can be determined using the definitions and
365: computing the polynomials in the (reverse for $P^+$) topological
366: ordering of $\Units$. The complexity of this procedure is at most
367: $O(hm)$. Finally
368: \[ L^-(u) = P^-(u;1) \qquad \mathrm{and} \qquad
369: L^+(v) = P^+(v;1) \]
In real-life citation networks the depth $h$ is relatively small, as can be seen
from Table~\ref{citnets}.

The complexity of this approach is higher than the complexity of the
method proposed in subsection 3.3, but we get more detailed information about paths.
It might also make sense to model the `aging' of references by taking
$ L^-(u) = P^-(u;\alpha)$, for a selected $\alpha$, $0 < \alpha \leq 1$.
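
For a hypothetical network with $\Units = \{a,b,c,d\}$,
$R = \{(a,b),(a,c),(b,c),(b,d),(c,d)\}$ and the added arc $(s,a)$
(an example assumed here only for illustration), the recurrence for $P^-$ gives
\begin{eqnarray*}
P^-(a;x) & = & 1 + x \cdot P^-(s;x) = 1 \\
P^-(b;x) & = & 1 + x \cdot P^-(a;x) = 1 + x \\
P^-(c;x) & = & 1 + x \cdot ( P^-(a;x) + P^-(b;x) ) = 1 + 2x + x^2 \\
P^-(d;x) & = & 1 + x \cdot ( P^-(b;x) + P^-(c;x) ) = 1 + 2x + 3x^2 + x^3
\end{eqnarray*}
The coefficients of $P^-(d;x)$ say that $d$ is the end of one path of length 0,
two paths of length 1 ($b\,d$ and $c\,d$), three paths of length 2
($a\,b\,d$, $a\,c\,d$ and $b\,c\,d$) and one path of length 3 ($a\,b\,c\,d$),
so $L^-(d) = P^-(d;1) = 7$. With the value $\alpha = \frac{1}{2}$, picked
arbitrarily for this example, longer paths count less:
\[ P^-(d;{\textstyle\frac{1}{2}}) = 1 + 1 + \frac{3}{4} + \frac{1}{8} = \frac{23}{8} \]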
377:
378: \subsection{Vertex weights}
379:
The quantities used to compute the arc weights $w$ can also be used
to define the corresponding vertex weights $t$
382: \begin{eqnarray*}
383: t_d(u) & = & |\trecl{\inv{R}}(u)|\cdot|\trecl{R}(u)| \\
384: t_c(u) & = & N^-(u)\cdot N^+(u) \\
385: t_l(u) & = & N'^-(u)\cdot N'^+(u) \\
386: t_p(u) & = & L^-(u)\cdot L^+(u)
387: \end{eqnarray*}
where $N'^-$ and $N'^+$ denote the path counts $N^-$ and $N^+$ computed in
the extended network $\Net_l$.
They count the number of paths of the selected type through
the vertex $u$.
390:
391: \subsection{Implementation details}
392:
393: In our first implementation of the SPNP method the values of
394: $L^-(u)$ and $L^+(u)$ for some large networks (Zewail and Lederberg)
395: exceeded the range of Delphi's \texttt{LargeInt} (20 decimal places).
We decided to use the \texttt{Extended} real numbers
(range $= 3.6 \times 10^{-4951}\ ..\ 1.1 \times 10^{4932}$,
19--20 significant digits) for counters. This
range is also safe for very large citation networks.
400:
401:
402: To see this, let us denote $N^*(k) = \max_{u: h(u)=k} N^-(u)$.
403: Note that $h(s) = 0$ and $u R v \Rightarrow h(u) < h(v)$.
Let $u^* \in \Units$ be a unit on which the maximum is attained,
$N^*(k) = N^-(u^*)$. Then
406: \begin{eqnarray*}
407: N^*(k) & = & \sum_{v:v R u^*} N^-(v) \leq \sum_{v:v R u^*} N^*(h(v)) \leq \sum_{v:v R u^*} N^*(k-1) = \\
408: & = & \deg_{in}(u^*) \cdot N^*(k-1) \leq \Delta_{in}(k) \cdot N^*(k-1)
409: \end{eqnarray*}
410: where $\Delta_{in}(k)$ is the maximal input degree at depth $k$. Therefore
411: $N^*(h) \leq \prod_{k=1}^h \Delta_{in}(k) \leq \Delta_{in}^h$. A similar inequality
412: holds also for $N^+(u)$. From both it follows
413: \[ N(u,v) \leq \Delta_{in}^{h(u)} \cdot \Delta_{out}^{h^-(v)} \leq \Delta^{H-1} \]
414: where $H = h(t)$ and $\Delta = \max(\Delta_{in}, \Delta_{out})$.
415: Therefore for $H \leq 1000$ and $\Delta \leq 10000$ we get
416: $N(u,v) \leq \Delta^{H-1} \leq 10^{4000}$ which is still in the range of
417: \texttt{Extended} reals. Note also that in the derivation of this inequality
418: we were very generous -- in real-life networks $N(u,v)$ will be much smaller
419: than $\Delta^{H-1}$.
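
The arithmetic behind this estimate can be written out explicitly:
\[ \Delta^{H-1} \leq (10^4)^{999} = 10^{3996} < 10^{4000} < 1.1 \times 10^{4932} \]
To see how generous the bound is, consider a hypothetical network (assumed
only for illustration) with $\Units = \{a,b,c,d\}$,
$R = \{(a,b),(a,c),(b,c),(b,d),(c,d)\}$ and the standard-form arcs
$(s,a)$, $(d,t)$. Here $h(s) = 0$, $h(a) = 1$, $h(b) = 2$, $h(c) = 3$,
$h(d) = 4$ and $H = h(t) = 5$, while $\Delta_{in} = \Delta_{out} = 2$, so
\[ \Delta^{H-1} = 2^4 = 16 \]
whereas the largest count on these arcs is only $N(s,a) = N(d,t) = 3$, the number
of different $s$-$t$ paths.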
420:
The very large/small numbers that result as weights in large networks are
not easy to use. One possibility to overcome this problem is to use the
logarithms of the obtained weights -- the logarithmic transformation
is monotone and therefore preserves the ordering of weights (importance
of vertices and arcs). The transformed values are also more convenient
for visualization with line thickness of arcs.
427:
428: \section{Properties of weights}
429:
430: \subsection{General properties of weights}
431:
432:
433: Directly from the definitions of weights we get
434: \[ w_k(u,v;R) = w_k(v,u;\inv{R}), \qquad k=d,c,p \]
435: and
436: \[ w_c(u,v) \leq w_l(u,v) \leq w_p(u,v) \]
437:
438: % 5. avgust 2002
439: Let $\Net_A = (\Units_A, R_A)$ and $\Net_B = (\Units_B, R_B)$,
440: $\Units_A \cap \Units_B = \emptyset$ be two citation networks,
441: and $\Net_1 = (\Units'_A, R'_A)$
442: and $\Net_2 = ((\Units_A \cup \Units_B)', (R_A \cup R_B)')$
443: the corresponding standardized networks of the first network
444: and of the union of both networks. Then it holds for all
445: $u,v \in \Units_A$ and for all $p,q \in R_A$
446: \[ \frac{t_k^{(1)}(u)}{t_k^{(1)}(v)} = \frac{t_k^{(2)}(u)}{t_k^{(2)}(v)}, \qquad
447: \mbox{and} \qquad
448: \frac{w_k^{(1)}(p)}{w_k^{(1)}(q)} = \frac{w_k^{(2)}(p)}{w_k^{(2)}(q)}, \qquad k=d,c,l,p \]
where $t^{(1)}$ and $w^{(1)}$ are weights on network $\Net_1$, and
$t^{(2)}$ and $w^{(2)}$ are weights on network $\Net_2$.
This means that adding or removing components in a network
does not change the ratios (ordering) of the weights inside components.
453:
Let $\Net_1 = (\Units,R_1)$ and $\Net_2 = (\Units,R_2)$ be two citation networks over the same
set of units $\Units$ with $R_1 \subseteq R_2$. Then
456: \[ w_k(u,v;R_1) \leq w_k(u,v;R_2), \qquad k=d,c,p \]
457:
458: \subsection{NPPC weights}
459:
In an acyclic network, for every arc $(u,v) \in R$ we have
461: \[ \trecl{\inv{R}}(u) \cap \trecl{R}(v) = \emptyset \quad \mathrm{and} \quad
462: \trecl{\inv{R}}(u) \cup \trecl{R}(v) \subseteq \Units \]
463: therefore $|\trecl{\inv{R}}(u)| + |\trecl{R}(v)| \leq n$ and,
464: using the inequality $\sqrt{ab} \leq \frac{1}{2} (a+b)$, also
465: \[ w_d(u,v) = |\trecl{\inv{R}}(u)| \cdot |\trecl{R}(v)| \leq \frac{1}{4} n^2 \]
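
The bound is attained. As an example constructed here for illustration, let
$n = 2k$ and let the network be the directed path
$1 \to 2 \to \cdots \to 2k$. For its middle arc $(k,k+1)$ we have
$\trecl{\inv{R}}(k) = \{1,\ldots,k\}$ and
$\trecl{R}(k+1) = \{k+1,\ldots,2k\}$, therefore
\[ w_d(k,k+1) = k \cdot k = \frac{1}{4} n^2 \]
while for the first arc
\[ w_d(1,2) = 1 \cdot (n-1) = n - 1 \]
For $n = 4$ this gives $w_d(2,3) = 4$ and $w_d(1,2) = w_d(3,4) = 3$.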
466:
Close to the source or sink the weights $w_d$ are small,
since the sets $\trecl{R}(u)$ (and $\trecl{\inv{R}}(u)$) are monotonic
along the paths in the sense
\[ u \tracl{R} v \Rightarrow \trecl{R}(v) \subset \trecl{R}(u) \]
The weights $w_d$ are larger in the `middle' of the network.
472:
473: A more uniform (but less sensitive)
474: weight would be $w_s(u,v) = |\trecl{\inv{R}}(u)| + |\trecl{R}(v)|$
475: or in the normalized form $w'_s(u,v) = \frac{1}{n} w_s(u,v)$.
476:
477: \subsection{SPC weights}
478:
479:
For the flow $N(u,v)$ \emph{Kirchhoff's node law} holds:
481:
482: For every node $v$ in a citation network in standard
483: form it holds
484: \[ \mbox{incoming flow} = \mbox{outgoing flow} = t_c(v)\]
485:
486: \noindent\textbf{Proof:}
487: \[ \sum_{x:xRv} N(x,v) = \sum_{x:xRv} N^-(x)\cdot N^+(v) =
488: (\sum_{x:xRv} N^-(x))\cdot N^+(v) = N^-(v)\cdot N^+(v) \]
489: \[ \sum_{y:vRy} N(v,y) = \sum_{y:vRy} N^-(v)\cdot N^+(y) =
490: N^-(v)\cdot\sum_{y:vRy} N^+(y) = N^-(v)\cdot N^+(v) \]
491: \Qed
492:
493:
From Kirchhoff's node law it follows that
the \emph{total flow} through the citation network equals
$N(t,s)$. This gives us a natural way to normalize the weights
\[ w(u,v) = \frac{N(u,v)}{N(t,s)} \quad \Rightarrow \quad
0 \leq w(u,v) \leq 1 \]
If $C$ is a minimal arc-cut-set, then
\[ \sum_{(u,v) \in C} w(u,v) = 1 \]
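
These properties can be checked on a hypothetical network (assumed here only
as an example) with $\Units = \{a,b,c,d\}$ and
$R = \{(a,b),(a,c),(b,c),(b,d),(c,d)\}$ in standard form. It has three
different $s$-$t$ paths, so the total flow is $3$, and the path counts on its arcs are
\[ N(s,a) = 3, \quad N(a,b) = 2, \quad N(a,c) = 1, \quad N(b,c) = 1, \quad
   N(b,d) = 1, \quad N(c,d) = 2, \quad N(d,t) = 3 \]
At the node $c$, where $N^-(c) = 2$ and $N^+(c) = 1$, the node law reads
\[ N(a,c) + N(b,c) = 1 + 1 = 2 = N(c,d) = t_c(c) \]
The normalized weights of the arcs in the cut-set $C_1 = \{(a,b),(a,c)\}$ sum to
\[ \frac{2}{3} + \frac{1}{3} = 1 \]
and those of the cut-set $C_2 = \{(a,c),(b,c),(b,d)\}$ to
\[ \frac{1}{3} + \frac{1}{3} + \frac{1}{3} = 1 \]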
501:
Let $\DK{n} = \{ (u,v): u,v \in 1..n \land u < v \}$ be the
complete acyclic directed graph on $n$ vertices. Then the value
of $N(u,v;\DK{n})$ is maximum over all citation networks on $n$
units. It is easy to verify that
\[ N(1,n;\DK{n}) = 2^{n-2} \]
and in general
\[ N(i,j;\DK{n}) = 2^{j-i-1}, \qquad i < j \]
509: From this result we see that the exhaustive search algorithm proposed
510: in Hummon and Doreian \cite{HumDor89,HumDor90} can require
511: exponential time to compute the arc weights $w$.
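
A small check of these formulas, reading $N(i,j;\DK{n})$ as the number of
different paths from $i$ to $j$ in $\DK{n}$ (an interpretation assumed here,
under which the formulas hold): for $n = 4$ the paths from $1$ to $4$ are
\[ 1\,4, \qquad 1\,2\,4, \qquad 1\,3\,4, \qquad 1\,2\,3\,4 \]
so that
\[ N(1,4;\DK{4}) = 4 = 2^{4-2} \]
In general every subset of the intermediate vertices $\{ i+1, \ldots, j-1 \}$,
taken in increasing order, determines exactly one path from $i$ to $j$, and
there are $2^{j-i-1}$ such subsets.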
512:
513: % oceni, za koliko se vrednost razlikujejo - ali res ni bistvene razlike ?
514:
515: \section{Nonacyclic citation networks}
516:
517: The problem with cycles is that if there is a cycle in a network
518: then there is also an infinite number of trails between some
519: units. There are some standard approaches to overcome the problem:
520: \begin{itemize}
\item to introduce some `aging' factor which makes the total weight of all trails
522: converge to some finite value;
523: \item to restrict the definition of a weight to some finite subset of
524: trails -- for example paths or geodesics.
525: \end{itemize}
But new problems arise: What is the right value of the `aging' factor?
527: Is there an efficient algorithm to count the restricted trails?
528:
529: \begin{figure}
530: \begin{center}
531: \includegraphics[width=140mm,viewport=0 0 376 186,clip]{./pics/preprint.eps}
532: \caption{Preprint transformation\label{preprint}}
533: \end{center}
534: \end{figure}
535:
536:
537: The other possibility, since a citation network is usually almost acyclic,
538: is to transform it into an acyclic network
539: \begin{itemize}
540: \item by identification (shrinking) of cyclic groups (nontrivial strong
541: components), or
542: \item by deleting some arcs, or
\item by transformations such as the `preprint' transformation
(see Figure~\ref{preprint}), which is based on the following idea:
each paper from a strong component is duplicated with its `preprint'
version. The papers inside the strong component cite preprints.
547: \end{itemize}
548:
Large strong components in a citation network are unlikely --
their presence usually indicates an error in the data.
An exception to this rule is the
citation network of High Energy Particle Physics literature \cite{HEP}
from \textbf{arXiv}. In it different versions of the same paper
are treated as a unit. This leads to large strongly connected
components. The idea of the preprint transformation can also be used in
this case to eliminate cycles.
557:
558:
559: \section{First Example: SOM citation network}
560:
The purpose of this example is not to analyze the selected
citation network of the SOM (self-organizing maps) literature \cite{Gar02,SOM,SOMLVQ},
but to present typical steps and results in citation network analysis.
We made our analysis using the program \texttt{\textbf{Pajek}}.
565:
566:
567: First we test the network for acyclicity.
568: Since in the SOM network there are 11 nontrivial strong components
569: of size 2, see Table~\ref{citnets},
we have to transform the network into an acyclic one. We decided to do
571: this by shrinking each component into a single vertex. This operation
572: produces some loops that should be removed.
573:
574: Now, we can compute the citation weights. We selected the
575: SPC (search path count) method. It returns the following results:
576: the network with citation weights on arcs, the main path
577: network and the vector with vertex weights.
578:
579:
580: \begin{figure}[!]
581: \begin{center}
582: \includegraphics[height=140mm,viewport=150 20 710 775,clip=]{./pics/mainP.eps}\quad
583: \includegraphics[height=140mm,viewport=320 20 580 775,clip=]{./pics/CPM.eps}
584: \caption{Main path and CPM path in SOM network with SPC weights\label{main}}
585: \end{center}
586: \end{figure}
587:
588:
In a citation network, a \emph{main path} (sub)network is
constructed starting from the source vertex
and selecting at each step, in the current end vertex/vertices, the outgoing arc(s)
with the highest weight, until a sink vertex is reached.

Another possibility is to apply to the network $\Net = (\Units,R,w)$
the critical path method (CPM) from operations research.
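
Both constructions can be traced on a hypothetical network (an example assumed
here, not taken from the SOM data) with $\Units = \{a,b,c,d\}$,
$R = \{(a,b),(a,c),(b,c),(b,d),(c,d)\}$, the added arcs $(s,a)$ and $(d,t)$, and
the SPC weights
\[ w(s,a) = 3, \quad w(a,b) = 2, \quad w(a,c) = 1, \quad w(b,c) = 1, \quad
   w(b,d) = 1, \quad w(c,d) = 2, \quad w(d,t) = 3 \]
Starting in $s$ we take $(s,a)$; in $a$ the arc $(a,b)$ wins over $(a,c)$
since $2 > 1$; in $b$ the arcs $(b,c)$ and $(b,d)$ are tied, so both are
selected; from $c$ we continue with $(c,d)$ and from $d$ with $(d,t)$. The main
path subnetwork therefore consists of the arcs
\[ (s,a),\ (a,b),\ (b,c),\ (b,d),\ (c,d),\ (d,t) \]
For the CPM path we assume, as is usual for the critical path method, that the
$s$-$t$ path with the largest total weight is sought. The three $s$-$t$ paths have
the totals
\begin{eqnarray*}
s\,a\,b\,c\,d\,t & : & 3 + 2 + 1 + 2 + 3 = 11 \\
s\,a\,b\,d\,t & : & 3 + 2 + 1 + 3 = 9 \\
s\,a\,c\,d\,t & : & 3 + 1 + 2 + 3 = 9
\end{eqnarray*}
so under this assumption the CPM path is $s\,a\,b\,c\,d\,t$.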
596:
First we draw the main path network. The arc weights are represented
by the thickness of arcs. To produce a nice picture
of it we apply Pajek's macro \texttt{Layers}, which contains a
sequence of operations for determining a layered layout of an
acyclic network (used also in the analysis of genealogies represented
by p-graphs). Some experiments with the settings of
different options are needed to obtain a good picture,
see the left part of Figure~\ref{main}. In its right part the
CPM path is presented.
606:
607: We see that the upper parts of both paths are identical, but
608: they differ in the continuation. The arcs in the CPM path are
609: thicker.
610:
We could also display the complete SOM network using
essentially the same procedure as for displaying the
main path. But the obtained picture would be too complicated
(too many vertices and arcs). We have to identify some
simpler and important subnetworks inside it.
616:
Inspecting the distribution of values of weights on arcs (lines)
we select a threshold of 0.007 and determine the corresponding
\emph{arc-cut}: we delete all arcs with weights
lower than the selected threshold and afterwards also delete all
isolated vertices (degree $= 0$).
622:
623: Now, we are ready to draw the reduced network. We first produce
624: an automatic layout.
625: We notice some small unimportant components. We preserve only
626: the large main component, draw it and improve the obtained layout
627: manually. To preserve the level structure we use the option
628: that allows only the horizontal movement of vertices.
629:
630: \begin{figure}[!]
631: \begin{center}
632: \includegraphics[width=160mm,viewport=70 15 795 765,clip=]{./pics/som07LH.eps}
633: \caption{Main subnetwork at level 0.007\label{maina}}
634: \end{center}
635: \end{figure}
636:
637:
Finally we label the `most important vertices'.
A vertex is considered important if it is an
endpoint of an arc with weight above the selected
threshold (in our case 0.05).
642:
The obtained picture of the SOM `main subnetwork'
is presented in Figure~\ref{maina}.
We see that the SOM field evolved in
two main branches. From CARPENTER-1987 the strongest (main path)
arc leads to the right branch, which disappears after some steps.
The left, more vital branch is detected by the CPM path.
Further investigation of this is left
to readers with additional knowledge about the SOM field.
651:
652:
653: \begin{table}[!]
654: \caption{15 Hubs and Authorities \label{huau}}
655: \begin{center}\small
656: \begin{tabular}{r|l|l|l|l|}
657: Rank & $h$ & Hub Id & $a$ & Authority Id \\ \hline
658: 1 & 0.06442 & CLARK-JW-1991-V36-P1259 & 0.85214 & HOPFIELD-JJ-1982-V79-P2554 \\
659: 2 & 0.06366 & \#GARDNER-E-1988-V21-P257 & 0.33427 & KOHONEN-T-1982-V43-P59 \\
660: 3 & 0.05794 & HUANG-SH-1994-V17-P212 & 0.14531 & KOHONEN-T-1990-V78-P1464 \\
661: 4 & 0.05721 & GULATI-S-1991-V33-P173 & 0.12398 & CARPENTER-GA-1987-V37-P54 \\
662: 5 & 0.05513 & SHUBNIKOV-EI-1997-V64-P989 & 0.10376 & \#GARDNER-E-1988-V21-P257 \\
663: 6 & 0.05496 & MARSHALL-JA-1995-V8-P335 & 0.09353 & HOPFIELD-JJ-1986-V233-P625 \\
664: 7 & 0.05488 & VEMURI-V-1993-V36-P203 & 0.07882 & MCELIECE-RJ-1987-V33-P461 \\
665: 8 & 0.05409 & CHENG-B-1994-V9-P2 & 0.07656 & KOHONEN-T-1988-V1-P3 \\
666: 9 & 0.05360 & BUSCEMA-M-1998-V33-P17 & 0.07372 & RUMELHART-DE-1985-V9-P75 \\
667: 10 & 0.05258 & XU-L-1993-V6-P627 & 0.07271 & KOSKO-B-1988-V18-P49 \\
668: 11 & 0.05249 & WELLS-DM-1998-V41-P173 & 0.07246 & ANDERSON-JA-1977-V84-P413 \\
669: 12 & 0.05233 & SCHYNS-PG-1991-V15-P461 & 0.07033 & AMARI-SI-1977-V26-P175 \\
670: 13 & 0.05173 & SMITH-KA-1999-V11-P15 & 0.06709 & KOSKO-B-1987-V26-P4947 \\
671: 14 & 0.05149 & BONABEAU-E-1998-V9-P1107 & 0.05802 & PERSONNAZ-L-1985-V46-PL359 \\
672: 15 & 0.05126 & KOHONEN-T-1990-V78-P1464 & 0.05702 & GROSSBERG-S-1987-V11-P23 \\ \hline
673: \end{tabular}
674: \end{center}
675: \end{table}
676:
677:
As complementary information we can determine
Kleinberg's hubs and authorities vertex weights \cite{ha}.
Papers that are cited by many other papers are called authorities;
papers that cite many other documents are
called hubs.
Good authorities are those that are cited by good hubs
and good hubs cite good authorities. The 15 highest
ranked hubs and authorities are presented in Table~\ref{huau}.
We see that the main authorities are located in the eighties
and the main hubs in the nineties.
688: Note that, since we are using the relation
689: $u R v \equiv u \mbox{ is cited by } v$, we have to
690: interchange the roles of hubs and authorities produced by
691: \texttt{\textbf{Pajek}}.
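
In the notation used here, Kleinberg's mutual reinforcement can be sketched
as the following iteration. This is the standard formulation from \cite{ha}
rewritten for the relation $R$, not a description of the
\texttt{\textbf{Pajek}} implementation. Let $a(u)$ denote the authority
weight and $h(v)$ the hub weight of a paper. Since $u R v$ means that
$v$ cites $u$, one step of the iteration is
\[ a(u) \leftarrow \sum_{v \,:\, u R v} h(v), \qquad
   h(v) \leftarrow \sum_{u \,:\, u R v} a(u), \]
after which both weight vectors are rescaled to unit Euclidean length.
The steps are repeated until the weights stabilize.
A procedure that reads each arc $(u,v)$ of $R$ as `$u$ points to $v$'
treats the cited paper $u$ as the source of the arc, which is why its
hub and authority labels have to be exchanged.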
692:
693: An elaboration of the hubs and authorities approach to the analysis
694: of citation networks complemented with visualization can be found in
695: Brandes and Willhalm (2002) \cite{BW}.
696:
697: \section{Second Example: US patents}
698:
The network of US patents from 1963 to 1999 \cite{patents} is an
example of a very large network (3774768 vertices and 16522438 arcs)
that, using some special options in \texttt{\textbf{Pajek}},
can still be analyzed on a PC with at least 1 GB of memory.
The SPC weights are determined in about one minute.
This shows that the proposed approach can also be used for
very large networks.
706:
The obtained main path and CPM path are presented in Figure~\ref{mainpat}.
Collecting the basic data about the patents on both paths from the
\textbf{\textit{United States Patent and Trademark Office}} \cite{uspto},
see Tables~\ref{patinfo}--\ref{patinfoD}, we see that they deal with
`liquid crystal displays'.
713:
714: \begin{figure}[!]
715: \begin{center}
716: \includegraphics[height=175mm,viewport=65 20 375 635,clip=]{./pics/main.eps}\quad
717: \includegraphics[height=175mm,viewport=0 20 280 785,clip=]{./pics/CPMpath.eps}
718: \caption{Main path and CPM path subnetwork of Patents\label{mainpat}}
719: \end{center}
720: \end{figure}
721:
722: \begin{table}
723: \caption{Patents on the liquid-crystal display\label{patinfo}}
724: \begin{center}
725: %\scriptsize
726: \renewcommand{\arraystretch}{0.83}
727: \begin{tabular}{|r|r|l|}
728: \hline
729: patent & date & author(s) and title \\
730: \hline
731: 2544659 & Mar 13, 1951 & Dreyer.
732: Dichroic light-polarizing sheet and the like and the\\
733: & & formation and use thereof\\
734:
735: 2682562 & Jun 29, 1954 & Wender, et al.
736: Reduction of aromatic carbinols\\
737:
738: 3322485 & May 30, 1967 & Williams.
Electro-optical elements utilizing an organic\\
740: & & nematic compound\\
741:
742: 3512876 & May 19, 1970 & Marks.
743: Dipolar electro-optic structures\\
744:
745: 3636168 & Jan 18, 1972 & Josephson.
746: Preparation of polynuclear aromatic compounds\\
747:
748: 3666948 & May 30, 1972 & Mechlowitz, et al.
Liquid crystal thermal imaging system\\
750: & & having an undisturbed image on a disturbed background\\
751:
752: 3675987 & Jul 11, 1972 & Rafuse.
753: Liquid crystal compositions and devices \\
754:
755: 3691755 & Sep 19, 1972 & Girard.
756: Clock with digital display\\
757:
3697150 & Oct 10, 1972 & Wysocki.
759: Electro-optic systems in which an electrophoretic-\\
760: & & like or dipolar material is dispersed throughout a liquid\\
761: & & crystal to reduce the turn-off time\\
762:
763: 3731986 & May 8, 1973 & Fergason.
764: Display devices utilizing liquid crystal light\\
765: & & modulation \\
766:
767: 3740717 & Jun 19, 1973 & Huener, et al.
768: Liquid crystal display \\
769:
770: 3767289 & Oct 23, 1973 & Aviram, et al.
771: Class of stable trans-stilbene compounds,\\
772: & & some displaying nematic mesophases at or near room\\
773: & & temperature and others in a range up to 100$^\circ$C\\
774:
775: 3773747 & Nov 20, 1973 & Steinstrasser.
776: Substituted azoxy benzene compounds\\
777:
778: 3795436 & Mar 5, 1974 & Boller, et al.
779: Nematogenic material which exhibit the Kerr\\
780: & & effect at isotropic temperatures \\
781:
782: 3796479 & Mar 12, 1974 & Helfrich, et al.
783: Electro-optical light-modulation cell\\
784: & & utilizing a nematogenic material which exhibits the Kerr\\
785: & & effect at isotropic temperatures\\
786:
787: 3806230 & Apr 23, 1974 & Haas.
788: Liquid crystal imaging system having optical storage\\
789: & & capabilities\\
790:
791: 3809458 & May 7, 1974 & Huener, et al.
792: Liquid crystal display\\
793:
794: 3872140 & Mar 18, 1975 & Klanderman, et al.
795: Liquid crystalline compositions and\\
796: & & method \\
797:
798: 3876286 & Apr 8, 1975 & Deutscher, et al.
799: Use of nematic liquid crystalline substances\\
800:
801: 3881806 & May 6, 1975 & Suzuki.
802: Electro-optical display device \\
803:
804: 3891307 & Jun 24, 1975 & Tsukamoto, et al.
805: Phase control of the voltages applied to\\
806: & & opposite electrodes for a cholesteric to nematic phase\\
807: & & transition display \\
808:
809: 3947375 & Mar 30, 1976 & Gray, et al.
810: Liquid crystal materials and devices \\
811:
812: 3954653 & May 4, 1976 & Yamazaki.
813: Liquid crystal composition having high dielectric\\
814: & & anisotropy and display device incorporating same \\
815:
816: 3960752 & Jun 1, 1976 & Klanderman, et al.
817: Liquid crystal compositions \\
818:
819: 3975286 & Aug 17, 1976 & Oh.
820: Low voltage actuated field effect liquid crystals\\
821: & & compositions and method of synthesis \\
822:
823: 4000084 & Dec 28, 1976 & Hsieh, et al.
824: Liquid crystal mixtures for electro-optical\\
825: & & display devices \\
826:
827: 4011173 & Mar 8, 1977 & Steinstrasser.
828: Modified nematic mixtures with\\
829: & & positive dielectric anisotropy \\
830:
831: 4013582 & Mar 22, 1977 & Gavrilovic.
832: Liquid crystal compounds and electro-optic\\
833: & & devices incorporating them \\
834:
835: 4017416 & Apr 12, 1977 & Inukai, et al.
836: P-cyanophenyl 4-alkyl-4'-biphenylcarboxylate,\\
837: & & method for preparing same and liquid crystal compositions\\
838: & & using same \\
839:
840: \hline
841: \end{tabular}
842: \end{center}
843: \end{table}
844:
845: \begin{table}
\caption{Patents on the liquid-crystal display (continued)\label{patinfoB}}
847: \begin{center}
848: \renewcommand{\arraystretch}{0.83}
849: \begin{tabular}{|r|r|l|}
850: \hline
851: patent & date & author(s) and title \\
852: \hline
853:
854: 4029595 & Jun 14, 1977 & Ross, et al.
855: Novel liquid crystal compounds and electro-optic\\
856: & & devices incorporating them \\
857:
858: 4032470 & Jun 28, 1977 & Bloom, et al.
859: Electro-optic device \\
860:
861: 4077260 & Mar 7, 1978 & Gray, et al.
862: Optically active cyano-biphenyl compounds and\\
863: & & liquid crystal materials containing them \\
864:
865: 4082428 & Apr 4, 1978 & Hsu.
866: Liquid crystal composition and method \\
867:
868: 4083797 & Apr 11, 1978 & Oh.
869: Nematic liquid crystal compositions \\
870:
871: 4113647 & Sep 12, 1978 & Coates, et al.
872: Liquid crystalline materials \\
873:
874: 4118335 & Oct 3, 1978 & Krause, et al.
875: Liquid crystalline materials of reduced viscosity \\
876:
877: 4130502 & Dec 19, 1978 & Eidenschink, et al.
878: Liquid crystalline cyclohexane derivatives \\
879:
880: 4149413 & Apr 17, 1979 & Gray, et al.
881: Optically active liquid crystal mixtures and\\
882: & & liquid crystal devices containing them \\
883:
884: 4154697 & May 15, 1979 & Eidenschink, et al.
885: Liquid crystalline hexahydroterphenyl\\
886: & & derivatives \\
887:
888: 4195916 & Apr 1, 1980 & Coates, et al.
889: Liquid crystal compounds \\
890:
891: 4198130 & Apr 15, 1980 & Boller, et al.
892: Liquid crystal mixtures \\
893:
894: 4202791 & May 13, 1980 & Sato, et al.
895: Nematic liquid crystalline materials \\
896:
897: 4229315 & Oct 21, 1980 & Krause, et al.
898: Liquid crystalline cyclohexane derivatives \\
899:
900: 4261652 & Apr 14, 1981 & Gray, et al.
901: Liquid crystal compounds and materials and \\
902: & & devices containing them \\
903:
904: 4290905 & Sep 22, 1981 & Kanbe.
905: Ester compound \\
906:
907: 4293434 & Oct 6, 1981 & Deutscher, et al.
908: Liquid crystal compounds \\
909:
910: 4302352 & Nov 24, 1981 & Eidenschink, et al.
911: Fluorophenylcyclohexanes, the preparation\\
912: & & thereof and their use as components of liquid crystal dielectrics \\
913:
914: 4330426 & May 18, 1982 & Eidenschink, et al.
915: Cyclohexylbiphenyls, their preparation and\\
916: & & use in dielectrics and electrooptical display elements \\
917:
918: 4340498 & Jul 20, 1982 & Sugimori.
919: Halogenated ester derivatives \\
920:
921: 4349452 & Sep 14, 1982 & Osman, et al.
922: Cyclohexylcyclohexanoates\\
923:
924: 4357078 & Nov 2, 1982 & Carr, et al.
925: Liquid crystal compounds containing an alicyclic \\
926: & & ring and exhibiting a low dielectric anisotropy and liquid\\
927: & & crystal materials and devices incorporating such compounds \\
928:
929: 4361494 & Nov 30, 1982 & Osman, et al.
930: Anisotropic cyclohexyl cyclohexylmethyl ethers \\
931:
932: 4368135 & Jan 11, 1983 & Osman.
933: Anisotropic compounds with negative or positive\\
934: & & DC-anisotropy and low optical anisotropy \\
935:
936: 4386007 & May 31, 1983 & Krause, et al.
937: Liquid crystalline naphthalene derivatives \\
938:
939: 4387038 & Jun 7, 1983 & Fukui, et al.
940: 4-(Trans-4'-alkylcyclohexyl) benzoic acid \\
941: & & 4'"-cyano-4"-biphenylyl esters \\
942:
943: 4387039 & Jun 7, 1983 & Sugimori, et al.
944: Trans-4-(trans-4'-alkylcyclohexyl)-cyclohexane\\
945: & & carboxylic acid 4'"-cyanobiphenyl ester \\
946:
947: 4400293 & Aug 23, 1983 & Romer, et al.
948: Liquid crystalline cyclohexylphenyl derivatives \\
949:
950: 4415470 & Nov 15, 1983 & Eidenschink, et al.
951: Liquid crystalline fluorine-containing \\
952: & & cyclohexylbiphenyls and dielectrics and electro-optical display\\
953: & & elements based thereon \\
954:
955: 4419263 & Dec 6, 1983 & Praefcke, et al.
956: Liquid crystalline cyclohexylcarbonitrile\\
957: & & derivatives \\
958:
959: 4422951 & Dec 27, 1983 & Sugimori, et al.
960: Liquid crystal benzene derivatives \\
961:
962: 4455443 & Jun 19, 1984 & Takatsu, et al.
963: Nematic halogen Compound \\
964:
965: 4456712 & Jun 26, 1984 & Christie, et al.
966: Bismaleimide triazine composition \\
967:
968: 4460770 & Jul 17, 1984 & Petrzilka, et al.
969: Liquid crystal mixture \\
970:
971: 4472293 & Sep 18, 1984 & Sugimori, et al.
972: High temperature liquid crystal substances of\\
973: & & four rings and liquid crystal compositions containing the same \\
974:
975: \hline
976: \end{tabular}
977: \end{center}
978: \end{table}
979:
980: \begin{table}
\caption{Patents on the liquid-crystal display (continued)\label{patinfoC}}
982: \begin{center}
983: \renewcommand{\arraystretch}{0.83}
984: \begin{tabular}{|r|r|l|}
985: \hline
986: patent & date & author(s) and title \\
987: \hline
988:
989: 4472592 & Sep 18, 1984 & Takatsu, et al.
990: Nematic liquid crystalline compounds \\
991:
992: 4480117 & Oct 30, 1984 & Takatsu, et al.
993: Nematic liquid crystalline compounds \\
994:
995: 4502974 & Mar 5, 1985 & Sugimori, et al.
996: High temperature liquid-crystalline ester\\
997: & & compounds \\
998:
999: 4510069 & Apr 9, 1985 & Eidenschink, et al.
1000: Cyclohexane derivatives \\
1001:
1002: 4514044 & Apr 30, 1985 & Gunjima, et al.
1003: 1-(Trans-4-alkylcyclohexyl)-2-(trans-4'-(p-sub\-\\
1004: & & stituted phenyl) cyclohexyl)ethane and liquid crystal mixture \\
1005:
1006: 4526704 & Jul 2, 1985 & Petrzilka, et al.
1007: Multiring liquid crystal esters \\
1008:
1009: 4550981 & Nov 5, 1985 & Petrzilka, et al.
1010: Liquid crystalline esters and mixtures \\
1011:
1012: 4558151 & Dec 10, 1985 & Takatsu, et al.
1013: Nematic liquid crystalline compounds \\
1014:
1015: 4583826 & Apr 22, 1986 & Petrzilka, et al.
1016: Phenylethanes \\
1017:
1018: 4621901 & Nov 11, 1986 & Petrzilka, et al.
1019: Novel liquid crystal mixtures \\
1020:
1021: 4630896 & Dec 23, 1986 & Petrzilka, et al.
1022: Benzonitriles \\
1023:
1024: 4657695 & Apr 14, 1987 & Saito, et al.
1025: Substituted pyridazines \\
1026:
1027: 4659502 & Apr 21, 1987 & Fearon, et al.
1028: Ethane derivatives \\
1029:
1030: 4695131 & Sep 22, 1987 & Balkwill, et al.
1031: Disubstituted ethanes and their use in liquid\\
1032: & & crystal materials and devices \\
1033:
1034: 4704227 & Nov 3, 1987 & Krause, et al.
1035: Liquid crystal compounds \\
1036:
1037: 4709030 & Nov 24, 1987 & Petrzilka, et al.
1038: Novel liquid crystal mixtures \\
1039:
1040: 4710315 & Dec 1, 1987 & Schad, et al.
1041: Anisotropic compounds and liquid crystal\\
1042: & & mixtures therewith \\
1043:
1044: 4713197 & Dec 15, 1987 & Eidenschink, et al.
1045: Nitrogen-containing heterocyclic compounds \\
1046:
1047: 4719032 & Jan 12, 1988 & Wachtler, et al.
1048: Cyclohexane derivatives \\
1049:
1050: 4721367 & Jan 26, 1988 & Yoshinaga, et al.
1051: Liquid crystal device \\
1052:
1053: 4752414 & Jun 21, 1988 & Eidenschink, et al.
1054: Nitrogen-containing heterocyclic compounds \\
1055:
1056: 4770503 & Sep 13, 1988 & Buchecker, et al.
1057: Liquid crystalline compounds \\
1058:
1059: 4795579 & Jan 3, 1989 & Vauchier, et al.
1060: 2,2'-difluoro-4-alkoxy-4'-hydroxydiphenyls and\\
1061: & & their derivatives, their production process and\\
1062: & & their use in liquid crystal display devices \\
1063:
1064: 4797228 & Jan 10, 1989 & Goto, et al.
1065: Cyclohexane derivative and liquid crystal\\
1066: & & composition containing same \\
1067:
1068: 4820839 & Apr 11, 1989 & Krause, et al.
1069: Nitrogen-containing heterocyclic esters \\
1070:
1071: 4832462 & May 23, 1989 & Clark, et al.
1072: Liquid crystal devices \\
1073:
1074: 4877547 & Oct 31, 1989 & Weber, et al.
1075: Liquid crystal display element \\
1076:
1077: 4957349 & Sep 18, 1990 & Clerc, et al.
1078: Active matrix screen for the color display of\\
1079: & & television pictures, control system and process for producing\\
1080: & & said screen \\
1081:
1082: 5016988 & May 21, 1991 & Iimura.
1083: Liquid crystal display device with a birefringent\\
1084: & & compensator \\
1085:
1086: 5016989 & May 21, 1991 & Okada.
1087: Liquid crystal element with improved contrast and\\
1088: & & brightness \\
1089:
1090: 5122295 & Jun 16, 1992 & Weber, et al.
1091: Matrix liquid crystal display \\
1092:
1093: 5124824 & Jun 23, 1992 & Kozaki, et al.
1094: Liquid crystal display device comprising a \\
1095: & & retardation compensation layer having a maximum principal\\
1096: & & refractive index in the thickness direction \\
1097:
1098: 5171469 & Dec 15, 1992 & Hittich, et al.
1099: Liquid-crystal matrix display \\
1100:
1101: 5175638 & Dec 29, 1992 & Kanemoto, et al.
1102: ECB type liquid crystal display device having\\
1103: & & birefringent layer with equal refractive indexes in the thickness\\
1104: & & and plane directions\\
1105:
1106: \hline
1107: \end{tabular}
1108: \end{center}
1109: \end{table}
1110:
1111: \begin{table}
\caption{Patents on the liquid-crystal display (continued)\label{patinfoD}}
1113: \begin{center}
1114: \renewcommand{\arraystretch}{0.83}
1115: \begin{tabular}{|r|r|l|}
1116: \hline
1117: patent & date & author(s) and title \\
1118: \hline
1119:
1120: 5243451 & Sep 7, 1993 & Kanemoto, et al.
1121: DAP type liquid crystal device with cholesteric\\
1122: & & liquid crystal birefringent layer\\
1123:
1124: 5283677 & Feb 1, 1994 & Sagawa, et al.
1125: Liquid crystal display with ground regions \\
1126: & & between terminal groups\\
1127:
1128: 5308538 & May 3, 1994 & Weber, et al.
1129: Supertwist liquid-crystal display \\
1130:
5319478 & Jun 7, 1994 & Funfschilling, et al.
Light control systems with a circular polarizer\\
  & & and a twisted nematic liquid crystal having a minimum path\\
  & & difference of $\lambda/2$\\
1135:
1136: 5374374 & Dec 20, 1994 & Weber, et al.
1137: Supertwist liquid-crystal display \\
1138:
1139: 5408346 & Apr 18, 1995 & Trissel, et al.
1140: Optical collimating device employing cholesteric\\
1141: & & liquid crystal and a non-transmissive reflector\\
1142:
1143: 5539578 & Jul 23, 1996 & Togino, et al.
1144: Image display apparatus\\
1145:
1146: 5543077 & Aug 6, 1996 & Rieger, et al.
1147: Nematic liquid-crystal composition \\
1148:
1149: 5555116 & Sep 10, 1996 & Ishikawa, et al.
1150: Liquid crystal display having adjacent\\
1151: & & electrode terminals set equal in length \\
1152:
1153: 5683624 & Nov 4, 1997 & Sekiguchi, et al.
1154: Liquid crystal composition \\
1155:
1156: 5771124 & Jun 23, 1998 & Kintz, et al.
1157: Compact display system with two stage magnification\\
1158: & & and immersed beam splitter\\
1159:
1160: 5855814 & Jan 5, 1999 & Matsui, et al.
1161: Liquid crystal compositions and liquid crystal\\
1162: & & display elements\\
1163:
1164: 5991084 & Nov 23, 1999 & Hildebrand, et al.
1165: Compact compound magnified virtual image\\
1166: & & display with a reflective/transmissive optic\\
1167:
1168: 6005720 & Dec 21, 1999 & Watters, et al.
1169: Reflective micro-display system \\ \hline
1170:
1171:
1172: \end{tabular}
1173: \end{center}
1174: \end{table}
1175:
1176:
1177:
But a network of this size should contain thousands of `themes'.
How can we identify them?
Using the arc weights we can define a \emph{theme} as a small
connected subnetwork of size in the interval $k \,..\, K$
(for example, between $k = \frac{1}{3}h$ and $K = 3h$)
with stronger internal cohesion relative to its
neighborhood.
1185:
To find such subnetworks we again use arc-cuts.
We select a threshold $t$ and delete all arcs with weight
lower than $t$. In the reduced network we determine the (weakly)
connected components. The components of size in the range $k \,..\, K$,
which we call $(k,K)$-\emph{islands},
represent the themes since:
\begin{itemize}
\item they are connected and of the selected size,
\item all arcs linking them to their outside neighbors have weight lower
      than $t$, and
\item each vertex of an island is linked with some other
      vertex in the same island by an arc with weight
      at least $t$.
\end{itemize}
We discard components of size smaller than $k$ as `uninteresting'.
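
The construction can be written down compactly. The symbol $R_t$, the use
of $\Units$ for the vertex set, and the five-vertex example below are
introduced here only for illustration. For a threshold $t$ the arc-cut
keeps the arcs
\[ R_t = \{ (u,v) \in R : w(u,v) \geq t \} , \]
and a weakly connected component $C$ of $(\Units, R_t)$ is a
$(k,K)$-island when $k \leq |C| \leq K$.
For example, take vertices $a, b, c, d, e$ with arc weights
$w(a,b) = 5$, $w(b,c) = 4$, $w(c,d) = 1$ and $w(d,e) = 3$.
For $t = 3$ only the arc $(c,d)$ is deleted and the components are
$\{a,b,c\}$ and $\{d,e\}$; both are $(2,3)$-islands.
For $t = 4$ the arc $(d,e)$ is deleted as well, leaving the island
$\{a,b,c\}$ and two isolated vertices $d$ and $e$, which are discarded
as smaller than $k = 2$.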
1201:
The components of size larger than $K$ are too large: they contain
several themes. To identify them we repeat the procedure on the
network of these components with a higher threshold value $t'$.
Recently we developed an algorithm, named \emph{Islands} \cite{Islands},
that identifies all maximal $(k,K)$-islands by `continuously' changing
the threshold.
1208:
For SPC weights we determined all $(2,90)$-islands in the US Patents
network. The reduced network of islands has 470137 vertices and 307472 arcs.
Writing $C_k$ for the number of islands with at least $k$ vertices, we get
$C_2 = 187610$, $C_5 = 8859$, $C_{30} = 101$ and
$C_{50} = 30$. The detailed island size frequency distribution
is given in Table~\ref{fris} and presented on a log-log scale
in Figure~\ref{power}, which shows that it follows a power law.
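
The counts $C_k$ can be checked against Table~\ref{fris}. Writing $f_s$
for the number of islands with exactly $s$ vertices (a symbol used only
in this check) and reading $C_k$ as the number of islands with at least
$k$ vertices, we have
\[ C_k = \sum_{s=k}^{90} f_s . \]
For instance, $f_2 + f_3 + f_4 = 139793 + 29670 + 9288 = 178751$, which
agrees with $C_2 - C_5 = 187610 - 8859 = 178751$.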
1215:
1216: \begin{table}
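\caption{Island size frequency distribution; the entry at position $s$ is the number of islands with $s$ vertices, and the bracketed number gives the position of the first entry in each row\label{fris}}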
1218: \begin{center}
1219: {\renewcommand{\baselinestretch}{0.7}\small
1220: \begin{verbatim}
1221: [1] 0 139793 29670 9288 3966 1827 997 578 362 250
1222: [11] 190 125 104 71 47 37 36 33 21 23
1223: [21] 17 16 8 7 13 10 10 5 5 5
1224: [31] 12 3 7 3 3 3 2 6 6 2
1225: [41] 1 3 4 1 5 2 1 1 1 1
1226: [51] 2 3 3 2 0 0 0 0 0 1
1227: [61] 0 0 0 0 1 0 0 2 0 0
1228: [71] 0 0 1 1 0 0 0 1 0 0
1229: [81] 2 0 0 0 0 1 2 0 0 7
1230: \end{verbatim}
1231: }
1232: \end{center}
1233: \end{table}
1234:
1235: \begin{figure}[!]
1236: \begin{center}
1237: \includegraphics[width=140mm,viewport=0 10 500 470,clip=]{./pics/size.eps}
1238: \caption{Island size frequency distribution \label{power}}
1239: \end{center}
1240: \end{figure}
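
A power law $f_s \propto s^{-\gamma}$ for the number $f_s$ of islands of
size $s$ appears as a straight line on a log-log scale, since
\[ \log f_s = c - \gamma \log s . \]
The symbols $f_s$, $c$ and $\gamma$ are notation introduced here. As a
rough reading of the first entries of Table~\ref{fris}, and not a fitted
estimate, doubling the size from 2 to 4 and from 4 to 8 reduces the
frequency by the factors
\[ \frac{f_2}{f_4} = \frac{139793}{9288} \approx 15.1, \qquad
   \frac{f_4}{f_8} = \frac{9288}{578} \approx 16.1, \]
both close to $2^4$, which would correspond to $\gamma \approx 4$ in
this range.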
1241:
1242: \begin{figure}[!]
1243: \begin{center}
1244: \includegraphics[width=160mm,viewport=45 25 620 780,clip=]{./pics/islandMa.eps}
\caption{Main island `liquid-crystal display' \label{M}}
1246: \end{center}
1247: \end{figure}
1248:
1249:
1250: \begin{figure}[!]
1251: \begin{center}
1252: \includegraphics[width=160mm,viewport=65 20 660 690,clip=]{./pics/island50.eps}
\caption{Island `producing a foam' \label{A}}
1254: \end{center}
1255: \end{figure}
1256:
1257:
1258: \begin{table}
\caption{Some patents from the `foam' island\label{Ainfo}}
1260: \begin{center}
1261: \begin{tabular}{|r|r|l|}
1262: \hline
1263: patent & date & author(s) and title \\
1264: \hline
1265: 4060439 & Nov 29, 1977 & Rosemund, et al.
1266: Polyurethane foam composition and method of\\
1267: & & making same\\
1268:
1269: 4292369 & Sep 29, 1981 & Ohashi, et al.
1270: Fireproof laminates\\
1271:
1272: 4357430 & Nov 2, 1982 & VanCleve.
1273: Polymer/polyols, methods for making same and\\
1274: & & polyurethanes based thereon\\
1275:
1276: 4459334 & Jul 10, 1984 & Blanpied, et al.
1277: Composite building panel\\
1278:
4496625 & Jan 29, 1985 & Snider, et al.
1280: Alkoxylated aromatic amine-aromatic polyester\\
1281: & & polyol blend and polyisocyanurate foam therefrom\\
1282:
1283: 4544679 & Oct 1, 1985 & Tideswell, et al.
1284: Polyol blend and polyisocyanurate foam\\
1285: & & produced therefrom\\
1286:
1287: 4714717 & Dec 22, 1987 & Londrigan, et al.
1288: Polyester polyols modified by low molecular\\
1289: & & weight glycols and cellular foams therefrom\\
1290:
1291: 4927863 & May 22, 1990 & Bartlett, et al.
1292: Process for producing closed-cell polyurethane \\
1293: & & foam compositions expanded with mixtures of blowing agents\\
1294:
1295: 4996242 & Feb 26, 1991 & Lin.
1296: Polyurethane foams manufactured with mixed\\
1297: & & gas/liquid blowing agents\\
1298:
1299: 5169873 & Dec 8, 1992 & Behme, et al.
1300: Process for the manufacture of foams with the aid\\
1301: & & of blowing agents containing fluoroalkanes and fluorinated\\
1302: & & ethers, and foams obtained by this process\\
1303:
1304: 5187206 & Feb 16, 1993 & Volkert, et al.
1305: Production of cellular plastics by the\\
1306: & & polyisocyanate polyaddition process, and low-boiling,\\
1307: & & fluorinated or perfluorinated, tertiary alkylamines\\
1308: & & as blowing agent-containing emulsions for this purpose\\
1309:
1310: 5308881 & May 3, 1994 & Londrigan, et al.
1311: Surfactant for polyisocyanurate foams\\
1312: & & made with alternative blowing agents\\
1313:
1314: 5558810 & Sep 24, 1996 & Minor, et al.
1315: Pentafluoropropane compositions\\
1316:
1317: \hline
1318: \end{tabular}
1319: \end{center}
1320: \end{table}
1321:
1322: \begin{figure}[!]
1323: \begin{center}
1324: \includegraphics[width=160mm,viewport=45 20 560 680,clip=]{./pics/island38.eps}
\caption{Island `fiber optics and bags' \label{C}}
1326: \end{center}
1327: \end{figure}
1328:
1329: \begin{table}
\caption{Some patents from the `fiber optics and bags' island\label{Cinfo}}
1331: \begin{center}
1332: \begin{tabular}{|r|r|l|}
1333: \hline
1334: patent & date & author(s) and title \\
1335: \hline
1336: 4461536 & Jul 24, 1984 & Shaw, et al.
1337: Fiber coupler displacement transducer\\
1338:
1339: 4511582 & Apr 16, 1985 & Bair.
1340: Phenanthrene derivatives\\
1341:
1342: 4530800 & Jul 23, 1985 & Bair.
1343: Perylene derivatives\\
1344:
1345: 4589728 & May 20, 1986 & Dyott, et al.
1346: Optical fiber polarizer\\
1347:
1348: 4676378 & Jun 30, 1987 & Baxley, et al.
1349: Bag pack\\
1350:
1351: 4719047 & Jan 12, 1988 & Bair.
1352: Anthracene derivatives\\
1353:
1354: 4784453 & Nov 15, 1988 & Shaw, et al.
1355: Backward-flow ladder architecture and method\\
1356:
1357: 4785938 & Nov 22, 1988 & Benoit, Jr., et al.
1358: Thermoplastic bag pack\\
1359:
1360: 4810052 & Mar 7, 1989 & Fling.
1361: Fiber optic bidirectional data bus tap\\
1362:
1363: 4811417 & Mar 7, 1989 & Prince, et al.
1364: Handled bag with supporting slits in handle\\
1365:
1366: 4829090 & May 9, 1989 & Bair.
1367: Chrysene derivatives\\
1368:
1369: 4981216 & Jan 1, 1991 & Wilfong, Jr.
1370: Easy opening bag pack and supporting rack\\
1371: & & system and fabricating method\\
1372:
1373: 4997249 & Mar 5, 1991 & Berry, et al.
1374: Variable weight fiber optic transversal filter\\
1375:
1376: 5188235 & Feb 23, 1993 & Pierce, et al.
1377: Bag pack\\
1378:
1379: 5307935 & May 3, 1994 & Kemanjian.
1380: Packs of self opening plastic bags and method of\\
1381: & & fabricating the same\\
1382:
1383: 5363965 & Nov 15, 1994 & Nguyen.
1384: Self-opening thermoplastic bag system\\
1385:
1386: \hline
1387: \end{tabular}
1388: \end{center}
1389: \end{table}
1390:
The main island (Figure~\ref{M}) has 90 vertices and contains the middle
parts of the main path and of the CPM path, which also share a short
common part.
Again, the greedy strategy of the main path leads to a less vital branch.
Considering the basic data about the patents from
Tables~\ref{patinfo}--\ref{patinfoC}, we see that the main island also
deals with `liquid crystal displays'.
1397:
To further illustrate the results obtained by the Islands
algorithm we selected two smaller islands at lower levels;
see Figure~\ref{A} (50 vertices) and Figure~\ref{C} (38 vertices).
Retrieving the basic data about some patents in these islands from the
\textbf{\textit{United States Patent and Trademark Office}},
see Table~\ref{Ainfo} and Table~\ref{Cinfo}, we can label the
theme of the first island as `producing a foam'.
The second island deals initially with `fiber optics',
but in the upper part it switches to `bag pack system'.
1407:
1408:
1409: \section{Conclusions}
1410:
In this paper we proposed an approach to the analysis of citation
networks that can also be used for very large networks with millions
of vertices and arcs.

On test cases, the methods SPC, SPLC and NPPC produced almost the
same results. Since the SPC method has additional
`nice' properties it could be considered a `first choice'.
To make a grounded recommendation, however,
more experience should be gained from analyses of large real-life
citation networks.

The granularity of the results strongly depends on the range
$k \,..\, K$ chosen for `interesting themes': by varying these two
parameters we get larger or smaller sets of themes.
1425:
Instead of arc-cuts we could also consider vertex-cuts
with respect to $p$-cores on SPC weights \cite{pCores}
with the $p$-function
\[ p(v,W) = \max\Bigl( \sum_{u \in W : u R v} w(u,v),
                  \sum_{u \in W : v R u} w(v,u) \Bigr) . \]
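
As a reminder of how such a $p$-function is used (paraphrasing the
definition in \cite{pCores}; the symbol $t$ is reused here for the core
level), a set of vertices $C$ induces a $p$-core at level $t$ when it is
a maximal set with
\[ p(v,C) \geq t \quad \mbox{for all } v \in C . \]
With the $p$-function given above, every vertex of such a core has a
total weight of at least $t$ either on its incoming arcs from $C$ or on
its outgoing arcs into $C$.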
1431:
The subnetworks approach only filters out the structurally important
subnetworks, thus providing a researcher with smaller, manageable
structures that can be further analyzed using more sophisticated
and/or substantive methods.
1436:
1437: \section{Acknowledgments}
1438:
The search path count algorithm was developed during my visit to
Pittsburgh in 1991 and presented at the Network seminar
\cite{Bat91}. It was presented to a broader audience
at EASST'94 in Budapest \cite{Bat94}. In 1997 it was
included in the program \texttt{\textbf{Pajek}} \cite{pajek}.
The `preprint' transformation was developed as a part of the
contribution for the Graph Drawing Contest 2001 \cite{GD01}.
The algorithm for the path length counts was developed in August 2002
and the Islands algorithm in August 2003.
1448:
The author would like to thank Patrick Doreian and Norm Hummon
for introducing him to the field of citation network analysis,
Eugene Garfield for making available the data on real-life
networks and providing some relevant references,
and Andrej Mrvar and Matja\v{z} Zaver\v{s}nik
for implementing the algorithms in \texttt{\textbf{Pajek}}.
1455:
1456: This work was supported by the Ministry of Education, Science and Sport of
1457: Slovenia, Project 0512-0101.
1458:
1459: \newpage
1460:
1461: \begin{thebibliography}{99}
1462:
1463: \bibitem{Asimov} Asimov I.: The Genetic Code,
1464: New American Library, New York, 1963.
1465:
1466: \bibitem{Bat91} Batagelj V.: Some Mathematics of Network Analysis.
1467: Network Seminar, Department of Sociology,
1468: University of Pittsburgh, January 21, 1991.
1469:
1470: \bibitem{Bat94} Batagelj V.: An Efficient Algorithm for Citation Networks
1471: Analysis. Paper presented at EASST'94, Budapest, Hungary,
  August 28--31, 1994.
1473:
1474: \bibitem{pajek} Batagelj V., Mrvar A.: \texttt{\textbf{Pajek}} -- program for
1475: analysis and visualization of large networks. \\
1476: \url{http://vlado.fmf.uni-lj.si/pub/networks/pajek/}\\
1477: \url{http://vlado.fmf.uni-lj.si/pub/networks/pajek/howto/extreme.htm}
1478:
1479: \bibitem{GD01} Batagelj V., Mrvar A.:
1480: Graph Drawing Contest 2001 Layouts \\
1481: \url{http://vlado.fmf.uni-lj.si/pub/GD/GD01.htm}
1482:
1483: \bibitem{pCores} Batagelj V., Zaver\v{s}nik M.:
1484: Generalized Cores. Submitted, 2002.\\
1485: \url{http://arxiv.org/abs/cs.DS/0202039}
1486:
1487: \bibitem{Islands} Batagelj V., Zaver\v{s}nik M.:
1488: Islands -- identifying themes in large networks. In preparation, August 2003.
1490:
1491: \bibitem{BW} Brandes U., Willhalm T.:
1492: Visualization of
1493: bibliographic networks with a reshaped landscape metaphor.
1494: Joint Eurographics -- IEEE TCVG Symposium on Visualization,
1495: D. Ebert, P. Brunet, I. Navazo (Editors), 2002.\\
1496: \url{http://algo.fmi.uni-passau.de/\symbol{126}brandes/}\\
1497: \url{\strut\qquad publications/bw-vbnrl-02.pdf}
1498:
1499: \bibitem{algo} Cormen T.H., Leiserson C.E., Rivest R.L., Stein C.:
1500: Introduction to Algorithms, Second Edition. MIT Press, 2001.
1501:
\bibitem{Gar64} Garfield E., Sher I.H., Torpie R.J.:
1503: The Use of Citation Data in Writing the History of Science.
1504: Philadelphia: The Institute for Scientific Information, December 1964.\\
1505: \url{http://www.garfield.library.upenn.edu/papers/}\\
1506: \url{\strut\qquad useofcitdatawritinghistofsci.pdf}
1507:
1508: \bibitem{Gar01} Garfield E.:
1509: From Computational Linguistics to Algorithmic Historiography,
1510: paper presented at the Symposium in Honor of Casimir Borkowski
1511: at the University of Pittsburgh School of Information Sciences,
1512: September 19, 2001.\\
1513: \url{http://garfield.library.upenn.edu/papers/pittsburgh92001.pdf}
1514:
\bibitem{Gar02} Garfield E., Pudovkin A.I., Istomin V.S.:
1516: \textit{\textbf{Histcomp}} -- (\textit{comp}iled \textit{Hist}oriography program)\\
1517: \url{http://garfield.library.upenn.edu/histcomp/guide.html}\\
1518: \url{http://www.garfield.library.upenn.edu/histcomp/index.html}
1519:
1520: \bibitem{3D} Garner R.:
1521: A computer oriented, graph theoretic analysis of citation index structures.
1522: Flood B. (Editor), Three Drexel information science studies, Philadelphia:
1523: Drexel University Press 1967.\\
1524: \url{http://www.garfield.library.upenn.edu/rgarner.pdf}
1525:
\bibitem{HumDor89} Hummon N.P., Doreian P.:
  Connectivity in a Citation Network: The Development of DNA Theory.
  Social Networks, {\bf 11}(1989) 39--63.

\bibitem{HumDor90} Hummon N.P., Doreian P.:
  Computational Methods for Social Network Analysis.
  Social Networks, {\bf 12}(1990) 273--288.

\bibitem{HuDoFr90} Hummon N.P., Doreian P., Freeman L.C.:
  Analyzing the Structure of the Centrality-Productivity Literature
  Created Between 1948 and 1979.
  Knowledge: Creation, Diffusion, Utilization, {\bf 11}(1990)4, 459--480.
1538:
1539: \bibitem{ha} Kleinberg J.:
1540: Authoritative sources in a hyperlinked environment.
  In Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998, pp. 668--677.\\
1542: \url{http://www.cs.cornell.edu/home/kleinber/auth.ps}\\
1543: \url{http://citeseer.nj.nec.com/kleinberg97authoritative.html}
1544:
\bibitem{GT} Wilson R.J., Watkins J.J.:
1546: \emph{Graphs: An Introductory Approach}.
1547: New York: John Wiley and Sons, 1990.
1548:
1549: \bibitem{data} Pajek's datasets -- citation networks:\\
1550: \url{http://vlado.fmf.uni-lj.si/pub/networks/data/cite/}
1551:
1552: \bibitem{HEP} KDD Cup 2003:\\
1553: \url{http://www.cs.cornell.edu/projects/kddcup/index.html}\\
1554: \url{http://arxiv.org/}
1555:
\bibitem{patents} Hall B.H., Jaffe A.B., Trajtenberg M.:
1557: The NBER U.S. Patent Citations Data File. NBER Working Paper 8498 (2001).\\
1558: \url{http://www.nber.org/patents/}
1559:
1560: \bibitem{uspto} The United States Patent and Trademark Office. \\
1561: \url{http://patft.uspto.gov/netahtml/srchnum.htm}
1562:
1563: \bibitem{SOMLVQ}
1564: Bibliography on the Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ)\\
1565: \url{http://liinwww.ira.uka.de/bibliography/Neural/SOM.LVQ.html}
1566:
1567: \bibitem{SOM}
1568: Neural Networks Research Centre: Bibliography of SOM papers.\\
1569: \url{http://www.cis.hut.fi/research/refs/}
1570:
1571: \end{thebibliography}
1572: \end{document}
1573:
1574: