1: \documentclass[twocolumn,showpacs,preprintnumbers,amsmath,amssymb]{revtex4}
2:
3:
4: \usepackage{graphicx}% Include figure files
5: \usepackage{dcolumn}% Align table columns on decimal point
6: \usepackage{bm}% bold math
7:
8: \begin{document}
9:
10: \title{Effect of initial configuration on network-based recommendation}
11: \author{Tao Zhou$^{1,2,3}$}
12: \email{zhutou@ustc.edu}
13: \author{Luo-Luo Jiang$^2$}
14: \author{Ri-Qi Su$^{2,3}$}
15: \author{Yi-Cheng Zhang$^{1,3}$}
16: \email{yi-cheng.zhang@unifr.ch}
17: \affiliation{%
$^1$Department of Physics, University of Fribourg, Chemin du Mus\'ee
19: 3, CH-1700 Fribourg, Switzerland \\
20: $^2$Department of Modern Physics and Nonlinear Science Center,
21: University of Science and Technology of China, Hefei Anhui,
22: 230026, PR China \\
23: $^3$Information Economy and Internet Research Laboratory, University
24: of Electronic Science and Technology of China, Chengdu Sichuan,
25: 610054, PR China
26: }%
27:
28: \date{\today}
29:
30: \begin{abstract}
31: In this paper, based on a weighted object network, we propose a
32: recommendation algorithm, which is sensitive to the configuration of
33: initial resource distribution. Even under the simplest case with
34: binary resource, the current algorithm has remarkably higher
35: accuracy than the widely applied global ranking method and
36: collaborative filtering. Furthermore, we introduce a free parameter
37: $\beta$ to regulate the initial configuration of resource. The
38: numerical results indicate that decreasing the initial resource
39: located on popular objects can further improve the algorithmic
40: accuracy. More significantly, we argue that a better algorithm
41: should simultaneously have higher accuracy and be more personal.
According to a newly proposed measure of the degree of
43: personalization, we demonstrate that a degree-dependent initial
44: configuration can outperform the uniform case for both accuracy and
45: personalization strength.
46: \end{abstract}
47:
48: \pacs{89.75.Hc, 87.23.Ge, 05.70.Ln}
49:
50: \maketitle
51:
52: \emph{Introduction}. --- The exponential growth of the Internet
53: \cite{Faloutsos1999} and World-Wide-Web \cite{Broder2000} confronts
people with an information overload: they face too much data
and too many sources to find those most relevant to them.
Thus far, the most promising way to cope with this
overload is to provide personal recommendations, that is,
to use the personal information of a user (i.e., the
historical record of this user's activities) to uncover his/her habits
and to take them into account in the recommendation. For instance,
61: Amazon.com uses one's purchase history to provide individual
62: suggestions. If you have bought a textbook on statistical physics,
63: Amazon may recommend you some other statistical physics books. Based
64: on the well-developed \emph{Web 2.0} technology, recommendation
65: systems are frequently used in web-based movie-sharing
66: (music-sharing, book-sharing, etc.) systems, web-based selling
67: systems, and so on. Motivated by the significance to the economy and
68: society, recommendation algorithms are being extensively
69: investigated in the engineering community \cite{Adomavicius2005}.
70: Various kinds of algorithms have been proposed, including
71: correlation-based methods \cite{Konstan1997,Herlocker2004},
content-based methods \cite{Balab97,Pazzani99}, spectral
analysis \cite{Maslov00}, principal component analysis
74: \cite{Goldberg2001}, and so on.
75:
Very recently, some physical dynamics, including the heat conduction
77: process \cite{Zhang2007a} and mass diffusion
78: \cite{Zhou2007,Zhang2007b}, have found applications in personal
79: recommendation. These physical approaches have been demonstrated to
80: be both highly efficient and of low computational complexity
81: \cite{Zhang2007a,Zhou2007,Zhang2007b}. In this paper, we introduce a
82: network-based recommendation algorithm with degree-dependent initial
83: configuration. Compared with uniform initial configuration, the
84: prediction accuracy can be remarkably enhanced by using the
85: degree-dependent configuration. More significantly, besides the
prediction accuracy, we present novel measures to judge how
personal the recommendation results are. An algorithm providing
more personal recommendations has, in principle, a greater ability to
uncover individual habits. Since mainstream interests are more
easily uncovered, a user may appreciate a system more if it can
recommend the unpopular objects he/she enjoys. Therefore, we argue
that these two kinds of measures, accuracy and degree of
93: personalization, are complementary to each other in evaluating a
94: recommendation algorithm. Numerical simulations show that the
95: optimal initial configuration subject to accuracy can also generate
96: more personal recommendations.
97:
98:
99: \emph{Method}. --- A recommendation system consists of users and
100: objects, and each user has collected some objects. Denoting the
101: object-set as $O=\{o_1,o_2,\cdots,o_n\}$ and user-set as
102: $U=\{u_1,u_2,\cdots,u_m\}$, the recommendation system can be fully
described by an $n\times m$ adjacency matrix $A=\{a_{ij}\}$, where
104: $a_{ij}=1$ if $o_i$ is collected by $u_j$, and $a_{ij}=0$ otherwise.
105: A reasonable assumption is that the objects you have collected are
106: what you like, and a recommendation algorithm aims at predicting
107: your personal opinions (to what extent you like or hate them) on
108: those objects you have not yet collected. Mathematically speaking,
109: for a given user, a recommendation algorithm generates a ranking of
110: all the objects he/she has not collected before. The top $L$ objects
111: are recommended to this user, with $L$ the length of the
112: recommendation list.
113:
114: \begin{figure}
115: \scalebox{0.8}[0.8]{\includegraphics{graph1}} \caption{(Color
116: online) The ranking score $\langle r\rangle$ vs. $\beta$. The
117: optimal $\beta$, corresponding to the minimal $\langle r\rangle
118: \approx 0.098$, is $\beta_{\texttt{opt}} \approx -0.8$. All the data
points shown in the main plot are obtained by averaging over five
120: independent runs with different data-set divisions. The inset shows
121: the numerical results of every separate run, where each curve
122: represents one random division of data-set.}
123: \end{figure}
124:
125: Based on the user-object relations $A$, an object network can be
126: constructed, where each node represents an object, and two objects
127: are connected if and only if they have been collected simultaneously
128: by at least one user. We assume a certain amount of resource (e.g.
129: recommendation power) is associated with each object, and the weight
$w_{ij}$ represents the proportion of its resource that $o_j$ would like
to distribute to $o_i$. For example, in a book-selling system, the
weight $w_{ij}$ contributes to the strength with which book $o_i$ is
recommended to a customer who has bought book $o_j$.
134: Following a network-based resource-allocation process where each
135: object distributes its initial resource equally to all the users who
136: have collected it, and then each user sends back what he/she has
137: received to all the objects he/she has collected (also equally), the
138: weight $w_{ij}$ (the fraction of initial resource $o_j$ eventually
139: gives to $o_i$) can be expressed as:
140: \begin{equation}
141: w_{ij}=\frac{1}{k(o_j)}\sum^m_{l=1}\frac{a_{il}a_{jl}}{k(u_l)},
142: \end{equation}
where $k(o_j)=\sum^m_{i=1}a_{ji}$ and $k(u_l)=\sum^n_{i=1}a_{il}$
denote the \emph{degrees} of object $o_j$ and user $u_l$, respectively.
Clearly, the weight between two unconnected objects is zero.
According to the definition of the weight matrix $W=\{w_{ij}\}$,
147: if the initial resource vector is $\mathbf{f}$, the final resource
148: distribution is $\mathbf{f}'=W\mathbf{f}$.
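
As a minimal illustration of Eq. (1), consider a hypothetical toy
system (not taken from the data studied here) with $n=3$ objects and
$m=2$ users, where $u_1$ has collected $o_1$ and $o_2$, and $u_2$ has
collected $o_2$ and $o_3$. The degrees are $k(o_1)=k(o_3)=1$ and
$k(o_2)=k(u_1)=k(u_2)=2$, and Eq. (1) gives
\[
A=\left(\begin{array}{cc}
1 & 0 \\
1 & 1 \\
0 & 1
\end{array}\right),
\quad
W=\left(\begin{array}{ccc}
1/2 & 1/4 & 0 \\
1/2 & 1/2 & 1/2 \\
0 & 1/4 & 1/2
\end{array}\right).
\]
For instance, only $u_1$ has collected both $o_1$ and $o_2$, so that
$w_{12}=\frac{1}{k(o_2)}\frac{a_{11}a_{21}}{k(u_1)}=\frac{1}{4}$,
whereas
$w_{21}=\frac{1}{k(o_1)}\frac{a_{21}a_{11}}{k(u_1)}=\frac{1}{2}$;
the matrix $W$ is thus in general not symmetric. Each column of $W$
sums to one, since
$\sum_i w_{ij}=\frac{1}{k(o_j)}\sum_l a_{jl}\frac{\sum_i a_{il}}{k(u_l)}=1$,
which means that the total amount of resource is conserved. In this
toy system, an initial resource $\mathbf{f}=(1,1,0)^T$ located on
$o_1$ and $o_2$ yields $\mathbf{f}'=W\mathbf{f}=(3/4,1,1/4)^T$.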
149:
150: The general framework of the proposed network-based recommendation
151: is as follows: (i) construct the weighted object network (i.e.
152: determine the matrix $W$) from the known user-object relations; (ii)
153: determine the initial resource vector $\mathbf{f}$ for each user;
154: (iii) get the final resource distribution via
155: $\mathbf{f}'=W\mathbf{f}$; (iv) recommend those uncollected objects
156: with highest final resource. Note that the initial configuration
157: $\mathbf{f}$ is determined by the user's personal information, thus
158: for different users, the initial configuration is different. From
159: now on, for a given user $u_i$, we use $\mathbf{f}^i$ to emphasize
160: this personal configuration.
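
Steps (ii)--(iv) do not require the matrix $W$ to be stored
explicitly. As a sketch, substituting Eq. (1) into
$\mathbf{f}'=W\mathbf{f}^i$ gives, for the final resource located on
object $o_j$,
\[
(W\mathbf{f}^i)_j=\sum_{s=1}^n w_{js}f^i_s
=\sum_{l=1}^m\frac{a_{jl}}{k(u_l)}
\sum_{s=1}^n\frac{a_{sl}f^i_s}{k(o_s)},
\]
where the index $s$, introduced here only for this illustration, runs
over the source objects. The inner sum is the resource that user
$u_l$ receives in the first step, and the outer sum sends it back
equally to all the objects collected by $u_l$.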
161:
162:
163:
164: \emph{Numerical results}. --- For a given user $u_i$, the $j$th
165: element of $\mathbf{f}^i$ should be zero if $a_{ji}=0$. That is to
166: say, one should not put any recommendation power (i.e. resource)
167: onto an uncollected object. The simplest case is to set a uniform
168: initial configuration as
169: \begin{equation}
170: f^i_j=a_{ji}.
171: \end{equation}
172: Under this configuration, all the objects collected by $u_i$ have
the same recommendation power. Despite its simplicity, it can
outperform the two most widely applied recommendation algorithms,
the \emph{global ranking method} (GRM) \cite{ex1} and
176: \emph{collaborative filtering} (CF) \cite{ex2}.
177:
178: To test the algorithmic accuracy, we use a benchmark data-set,
namely \emph{MovieLens} \cite{ex3}. The data consist of 1682 movies
(objects) and 943 users, and users rate movies on a discrete
scale from 1 to 5. We therefore apply a coarse-graining method similar
to that used in Ref. \cite{Blattner2007}: a movie is considered collected
by a user if and only if the given rating is at least 3 (i.e. the
user at least likes this movie). The original data contain $10^5$
ratings, 85.25\% of which are $\geq 3$; thus, after coarse graining,
the data contain 85250 user-object pairs. To test the
recommendation algorithms, the data set is randomly divided into two
parts: the training set contains 90\% of the data, and the remaining
10\% constitutes the probe set. The training set is treated as
190: known information, while no information in the probe set is allowed
191: to be used for prediction.
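
In numbers, and assuming that the 90\%/10\% division is applied
exactly to the 85250 user-object pairs, the training set and the
probe contain
\[
0.9\times 85250=76725
\quad\mbox{and}\quad
0.1\times 85250=8525
\]
pairs, respectively. Both numbers are small compared with the
$1682\times 943\approx 1.59\times 10^{6}$ possible user-object pairs:
the coarse-grained data fill only about
$85250/(1682\times 943)\approx 5.4\%$ of the matrix $A$.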
192:
193: \begin{figure}
194: \scalebox{0.8}[0.8]{\includegraphics{graph2}} \caption{(Color
195: online) The average degree of all recommended movies vs. $\beta$.
196: The black solid, red dash and blue dot curves represent the cases
197: with typical lengths $L=10$, 50 and 100, respectively. All the data
198: points are obtained by averaging over five independent runs with
199: different data-set divisions. }
200: \end{figure}
201:
202: A recommendation algorithm should provide each user with an ordered
queue of all his/her uncollected objects. For an arbitrary user $u_i$,
204: if the relation $u_i-o_j$ is in the probe set (according to the
205: training set, $o_j$ is an uncollected object for $u_i$), we measure
206: the position of $o_j$ in the ordered queue. For example, if there
207: are 1000 uncollected movies for $u_i$, and $o_j$ is the 10th from
208: the top, we say the position of $o_j$ is 10/1000, denoted by
$r_{ij}=0.01$. Since the probe entries are actually collected by
users, a good algorithm is expected to rank them
high, thus leading to small $r$. Therefore, the mean value of the
position $\langle r\rangle$ (called the \emph{ranking score}
\cite{Zhou2007}), averaged over all the entries in the probe, can be
used to evaluate the algorithmic accuracy: the smaller the ranking
score, the higher the algorithmic accuracy, and vice versa.
For the three algorithms mentioned above, the average
values of the ranking score over five independent runs (one run here
means an independent random division of the data set) are 0.107,
0.122, and 0.140 for network-based recommendation, collaborative
filtering, and the global ranking method, respectively. Clearly, even
under the simplest initial configuration, the network-based
recommendation outperforms the other two algorithms in terms of
algorithmic accuracy.
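
Written out, with $P$ denoting the set of user-object pairs in the
probe (a symbol introduced here only for illustration), the ranking
score reads
\[
\langle r\rangle=\frac{1}{|P|}\sum_{(u_i,o_j)\in P}r_{ij}.
\]
As a hypothetical example, a probe consisting of three entries with
positions $r_{ij}=0.01$, 0.05 and 0.30 yields
$\langle r\rangle=(0.01+0.05+0.30)/3=0.12$. A purely random ordering
of the uncollected objects would give $\langle r\rangle\approx 0.5$,
which sets the scale for the values reported above.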
224:
225: \begin{figure}
226: \scalebox{0.8}[0.8]{\includegraphics{graph3}} \caption{(Color
227: online) $S$ vs. $\beta$. The black solid, red dash and blue dot
228: curves represent the cases with typical lengths $L=10$, 50 and 100,
229: respectively. All the data points are obtained by averaging over
230: five independent runs with different data-set divisions. }
231: \end{figure}
232:
233: Consider the initial resource located on object $o_i$ as its
234: assigned recommendation power. In the whole recommendation process,
235: the total power given to $o_i$ is $p_i=\sum_jf^j_i$, where the
236: superscript $j$ runs over all the users $u_j$. Under uniform initial
237: configuration (see Eq. (2)), the total power of $o_i$ is
238: $p_i=\sum_jf^j_i=\sum_ja_{ij}=k(o_i)$. That is to say, the total
239: recommendation power assigned to an object is proportional to its
240: degree, thus the impact of high-degree objects (e.g. popular movies)
is enhanced. Although it already achieves good algorithmic accuracy,
this uniform configuration may be oversimplified, and suppressing the
impact of high-degree objects in an appropriate way could
further improve the accuracy. Motivated by this, we propose a more
general distribution of initial resource to replace Eq. (2):
246: \begin{equation}
247: f^i_j=a_{ji}k^\beta(o_j),
248: \end{equation}
249: where $\beta$ is a tunable parameter. Compared with the uniform
250: case, $\beta=0$, a positive $\beta$ strengthens the influence of
251: large-degree objects, while a negative $\beta$ weakens the influence
252: of large-degree objects. In particular, the case $\beta=-1$
253: corresponds to an identical allocation of recommendation power
254: ($p_i=1$) for each object $o_i$.
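
Indeed, summing Eq. (3) over all users gives the total power of
object $o_i$ explicitly,
\[
p_i=\sum_j f^j_i=\sum_j a_{ij}k^\beta(o_i)=k^{1+\beta}(o_i),
\]
which reduces to $p_i=k(o_i)$ for $\beta=0$ and to $p_i=1$ for
$\beta=-1$. As a hypothetical example, take two objects with degrees
100 and 10: their total powers differ by a factor of 10 for
$\beta=0$, by a factor of $10^{0.2}\approx 1.6$ for $\beta=-0.8$, and
are equal for $\beta=-1$.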
255:
256: Fig. 1 reports the algorithmic accuracy as a function of $\beta$.
257: The curve has a clear minimum around $\beta=-0.8$. Compared with the
258: uniform case, the ranking score can be further reduced by 9\% at the
optimal value, which is a substantial improvement for recommendation
algorithms. Note that $\beta_{\texttt{opt}}$ is close to $-1$, which
indicates that a more homogeneous distribution of recommendation
power among objects may lead to a more accurate prediction.
263:
Besides accuracy, another significant ingredient one should take
into account for a personal recommendation algorithm is how
personal this algorithm is. For example, suppose there are 10
perfect movies not yet known to user $u_i$, 8 of which are widely
popular, while the other two fit a specific taste of $u_i$.
An algorithm recommending the 8 popular movies is very nice for
$u_i$, but he/she may feel even better about a recommendation list
containing those two unpopular movies. Since there are countless
channels to obtain information on popular movies (TV, the Internet,
newspapers, radio, etc.), uncovering very specific preferences,
corresponding to unpopular objects, is much more significant than
simply picking out what a user likes from the top of the list. To
measure this factor, we use two complementary measures.
277: Firstly, given the length $L$ of recommendation list, the popularity
278: can be measured directly by averaging the degree $\langle k\rangle$
279: over all the recommended objects. One can see from Fig. 2 that the
average degree is positively correlated with $\beta$; thus,
suppressing the recommendation power of high-degree objects gives
more opportunity to unpopular objects. For comparison, for $L=10$, 50 and 100,
the corresponding $\langle k\rangle$ are 353.50, 258.00 and 214.09
(GRM), as well as 84.62, 87.95 and 83.79 (CF). Since GRM always
recommends the most popular objects, it is clear that $\langle
k\rangle_{\texttt{GRM}}$ is the largest. On the other hand, CF
mainly depends on the similarity between users. Thus a user may be
recommended an object collected by another user with very similar
habits, even though this object may be very unpopular. This
is the reason why $\langle k\rangle_{\texttt{CF}}$ is the smallest.
291: Secondly, one can measure the strength of personalization via the
Hamming distance. If the number of objects shared by the
recommendation lists of $u_i$ and $u_j$ is $Q$, their Hamming distance is
$H_{ij}=1-Q/L$. Generally speaking, a more personal recommendation
list should have larger Hamming distances to other lists.
Accordingly, we use the mean value of the Hamming distance, $S=\langle
H_{ij}\rangle$, averaged over all the user-user pairs, to measure
the strength of personalization. Fig. 3 plots $S$ vs. $\beta$ and,
in accordance with the numerical results shown in Fig. 2, suppressing
the influence of high-degree objects makes the recommendations more
personal. For $L=10$, 50 and 100, the corresponding $S$ are 0.508,
0.397 and 0.337 (GRM), as well as 0.654, 0.501 and 0.421 (CF). Note
that $S_{\texttt{GRM}}$ is clearly larger than zero, because
collected objects do not appear in the recommendation list, so
different users have different recommendation lists. Since CF has
the potential to enhance the user-user similarity, $S_{\texttt{CF}}$
is remarkably smaller than that corresponding to negative $\beta$ in
network-based recommendation.
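
As a hypothetical example of the Hamming distance, for $L=10$ two
lists sharing $Q=3$ objects have $H_{ij}=1-3/10=0.7$; identical lists
give $H_{ij}=0$, and lists without any common object give $H_{ij}=1$.
Reading ``averaged over all the user-user pairs'' as an average over
the $m(m-1)/2$ distinct pairs (an assumption of this sketch), the
strength of personalization can be written as
\[
S=\frac{2}{m(m-1)}\sum_{i<j}H_{ij}
=1-\frac{2}{m(m-1)L}\sum_{i<j}Q_{ij},
\]
where $Q_{ij}$ denotes the number of objects shared by the lists of
$u_i$ and $u_j$.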
309:
In short, without any increase in the algorithmic complexity, using
an appropriate negative $\beta$ in our algorithm outperforms the
uniform case (i.e. $\beta=0$) on all three criteria: the
recommendations are more accurate, less popular, and more personalized.
314:
315: \emph{Conclusions}. --- In this paper, we propose a recommendation
316: algorithm based on a weighted object network. This algorithm is
317: sensitive to the configuration of initial resource distribution.
318: Even under the simplest case with binary resource, the current
319: algorithm has remarkably higher accuracy than the widely applied GRM
320: and CF. Since the computational complexity of this algorithm is much
321: less than that of CF \cite{ex4}, it has great potential significance
322: in practice. Furthermore, we introduce a free parameter $\beta$ to
323: regulate the initial configuration of resource. Numerical results
324: indicate that decreasing the initial resource located on popular
325: objects further improves the algorithmic accuracy: In the optimal
326: case ($\beta_{\texttt{opt}} \approx -0.8$), the distribution of
327: total initial resource located on each object is very homogeneous
328: ($p_i\sim k^{0.2}(o_i)$). Besides the ranking score, there have been
329: many measures suggested to evaluate the accuracy of personal
330: recommendation algorithms
331: \cite{Zhou2007,Billsus1998,Sarwar2000,Huang2004}, including
332: \emph{hitting rate}, \emph{precision}, \emph{recall},
333: \emph{F-measure}, and so on. However, thus far, there has been no
334: consideration of the degree of personalization. In this paper, we
335: suggest two measures, $\langle k\rangle$ and $S$, to address this
336: issue. We argue that to evaluate the performance of a recommendation
337: algorithm, one should take into account not only the accuracy, but
338: also the degree of personalization and popularity of recommended
objects. Even under this stricter criterion, the case with
$\beta_{\texttt{opt}} \approx -0.8$ outperforms the uniform case.
Theoretical physics provides us with some beautiful and powerful tools for
dealing with this long-standing challenge in modern information
science: how to make personal recommendations. We believe the current
work can enlighten readers in this interesting direction.
345:
346:
We acknowledge Runran Liu for very valuable discussion and comments
on this work. This work is partially supported by SBF (Switzerland)
through project C05.0148 (Physics of Risk)
and by the Swiss National Science Foundation (205120-113842). T. Zhou
acknowledges NNSFC under Grant No. 10635040.
352:
353:
354: \begin{thebibliography} {1}
355:
356: \bibitem{Faloutsos1999} M. Faloutsos, P. Faloutsos, and C. Faloutsos, Comput. Comm. Rev. {\bf 29}, 251 (1999).
\bibitem{Broder2000} A. Broder \emph{et al.}, Comput. Netw. {\bf 33}, 309 (2000).
\bibitem{Adomavicius2005} G. Adomavicius and A. Tuzhilin, IEEE
359: Trans. Know. \& Data Eng. {\bf 17}, 734 (2005).
360: \bibitem{Konstan1997} J. A. Konstan, B. N. Miller, D. Maltz, J. L. Herlocker, L. R. Gordon, and J. Riedl, Commun. ACM {\bf 40}, 77 (1997).
\bibitem{Herlocker2004} J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl, ACM Trans. Inform. Syst. {\bf 22}, 5 (2004).
362: \bibitem{Balab97} M. Balabanovi\'c and Y. Shoham, Commun. ACM {\bf 40}, 66 (1997).
363: \bibitem{Pazzani99} M. J. Pazzani, Artif. Intell. Rev. {\bf 13}, 393 (1999).
\bibitem{Maslov00} S. Maslov and Y.-C. Zhang,
365: Phys. Rev. Lett. {\bf 87}, 248701 (2001).
366: \bibitem{Goldberg2001} K. Goldberg, T. Roeder, D. Gupta, and C.
367: Perkins, Inf. Ret. {\bf 4}, 133 (2001).
368: \bibitem{Zhang2007a} Y.-C. Zhang, M. Blattner, and Y.-K. Yu, Phys.
369: Rev. Lett. {\bf 99}, 154301 (2007).
370: \bibitem{Zhou2007} T. Zhou, J. Ren, M. Medo, and Y.-C. Zhang, Phys. Rev. E {\bf 76}, 046115 (2007).
371: \bibitem{Zhang2007b} Y.-C. Zhang, M. Medo, J. Ren, T. Zhou, T. Li, and F. Yang, EPL {\bf 80}, 68003 (2007).
372: \bibitem{ex1} The global ranking method sorts all the objects in the descending order
373: of degree and recommends those with highest degrees.
\bibitem{ex2} Collaborative filtering is based on measuring the
375: similarity between users. For two users $u_i$ and $u_j$, their
376: similarity can be simply determined by
377: $s_{ij}=\sum^n_{l=1}a_{li}a_{lj} / \texttt{min}\{k(u_i),k(u_j)\}$.
378: For any user-object pair $u_i-o_j$, if $u_i$ has not yet collected
379: $o_j$ (i.e., $a_{ji}=0$), the predicted score, $v_{ij}$ (to what
380: extent $u_i$ likes $o_j$), is given as $v_{ij}=\sum^m_{l=1,l\neq
381: i}s_{li}a_{jl} / \sum^m_{l=1,l\neq i}s_{li}$. For any user $u_i$,
382: all the nonzero $v_{ij}$ with $a_{ji}=0$ are sorted in descending
383: order, and those objects in the top are recommended.
384: \bibitem{ex3} The MovieLens data can be downloaded from the web-site of
385: \emph{GroupLens Research} (http://www.grouplens.org).
\bibitem{Blattner2007} M. Blattner, Y.-C. Zhang, and S. Maslov, Physica A {\bf 373}, 753 (2007).
387: \bibitem{ex4} Instead of calculating all the elements in $W$, one
388: can implement the current algorithm by directly diffusing the
389: resource of each user. Ignoring the degree-degree correlation in
390: user-object relations, the algorithmic complexity is
391: $\mathbb{O}(m\langle k_u\rangle\langle k_o\rangle)$, where $\langle
392: k_u\rangle$ and $\langle k_o\rangle$ denote the average degree of
393: users and objects. Correspondingly, the algorithmic complexity of
394: collaborative filtering is $\mathbb{O}(m^2\langle
395: k_u\rangle+mn\langle k_o\rangle)$, where the first term accounts for
396: the calculation of similarity between users, and the second term
397: accounts for the calculation of the predictions.
398: \bibitem{Billsus1998} D. Billsus and M. J. Pazzani, \emph{Proc. 15th Int.
399: Conf. Machine Learning}, pp. 46-54 (1998).
400: \bibitem{Sarwar2000} B. Sarwar, G. Karypis, J. Konstan, and J.
401: Riedl, \emph{Proc. ACM Conf. Electronic Commerce}, pp. 158-167
402: (2000).
403: \bibitem{Huang2004} Z. Huang, H. Chen, and D. Zeng, ACM Trans. Inf.
404: Syst. {\bf 22}, 116 (2004).
405:
406:
407:
408:
409: \end{thebibliography}
410:
411:
412:
413:
414:
415:
416:
417:
418: \end{document}
419: