
\documentclass[12pt]{article}
\usepackage{amsthm}

\theoremstyle{definition}
\newtheorem{definition}{Definition}

\title{A Data Mining Framework for Optimal Product Selection in Retail Supermarket Data: The Generalized PROFSET Model}

\author{Tom Brijs\footnote{Tom Brijs is a research fellow of the Fund for Scientific Research Flanders.}
\quad Bart Goethals \\
Gilbert Swinnen \quad Koen Vanhoof \quad Geert Wets \\
University of Limburg
}

\date{}

\begin{document}
\maketitle

\begin{abstract}
In recent years, data mining researchers have developed efficient
association rule algorithms for retail market basket analysis.
Still, retailers often find it difficult to apply association
rules to optimize concrete retail marketing-mix decisions. It is
in this context that, in a previous paper, the authors
introduced a product selection model called
PROFSET.\footnote{PROFSET stands for PROFitability per SET because
the optimization model is based on the calculation of the
profitability per frequent set in order to determine the
cross-selling potential between products.} This model selects the
most interesting products from a product assortment based on
their cross-selling potential, given some retailer-defined
constraints. However, this model suffered from two important
deficiencies: it could not deal effectively with supermarket data,
and no provisions were made to include retail category
management principles. Therefore, in this paper, the authors
present an important generalization of the existing model in
order to make it suitable for supermarket data as well, and to
enable retailers to add category restrictions to the model.
Experiments on real-world data obtained from a Belgian
supermarket chain produce very promising results and demonstrate
the effectiveness of the generalized PROFSET model.
\end{abstract}

\section{Introduction}

Since almost all mid- to large-size retailers today possess
electronic sales transaction systems, retailers realize that
competitive advantage will no longer be achieved by the mere use
of these systems for purposes of inventory management or
facilitating customer check-out. In contrast, competitive
advantage will be gained by those retailers who are able to
extract the knowledge hidden in the data generated by those
systems and use it to optimize their marketing decision making.
In this context, knowledge about how customers are using the
retail store is of critical importance, and distinctive
competencies will be built by those retailers who best succeed in
extracting actionable knowledge from these data. Association rule
mining \cite{ais} can help retailers to efficiently extract this
knowledge from large retail databases. We assume some
familiarity with the basic notions of association rule mining.

In recent years, a lot of effort in the area of retail market
basket analysis has been invested in the development of
techniques to increase the interestingness of association rules.
Currently, three different research tracks that study
the interestingness of association rules can be distinguished.

First, a number of objective measures of interestingness have
been developed in order to filter out non-interesting association
rules based on a number of statistical properties of the rules,
such as support and confidence \cite{ais}, interest
\cite{correlation}, intensity of implication \cite{implic},
J-measure \cite{nar}, and correlation \cite{prune}. Other
measures are based on the syntactical properties of the rules
\cite{p_analysis}, or they are used to discover the
least-redundant set of rules \cite{redundancy}. Second, it was
recognized that domain knowledge may also play an important role
in determining the interestingness of association rules.
Therefore, a number of subjective measures of interestingness
have been put forward, such as unexpectedness
\cite{unexpectedness}, actionability \cite{actipat} and rule
templates \cite{interest}. Finally, the most recent stream of
research advocates the evaluation of the interestingness of
associations in the light of the micro-economic framework of the
retailer \cite{papadimitriou}. More specifically, a pattern in
the data is considered interesting only to the extent to which it
can be used in the decision-making process of the enterprise to
increase its utility.

It is in this latter stream of research that the authors have
previously developed a model for product selection called PROFSET
\cite{profset}, which takes into account both quantitative and
qualitative elements of retail domain knowledge in order to
determine the set of products that yields maximum cross-selling
profits. The key idea of the model is that products should not be
selected based on their individual profitability, but rather on
the \emph{total} profitability that they generate, including
profits from cross-selling. However, in its previous form, one
major drawback of the model was its inability to deal with
supermarket data (i.e., large baskets). To overcome this
limitation, in this paper we propose an important
generalization of the existing PROFSET model that
effectively deals with large baskets. Furthermore, we generalize
the model to include category management principles specified by
the retailer in order to make the output of the model even more
realistic.

The remainder of the paper is organized as follows. In
Section~\ref{overview} we focus on the limitations of the
previous PROFSET model for product selection. In
Section~\ref{general}, we introduce the generalized PROFSET
model. Section~\ref{impl} is devoted to the empirical
implementation of the model and its results on real-world
supermarket data. Finally, Section~\ref{concl} is reserved
for conclusions and further research.

\section{The PROFSET Model} \label{overview}

The key idea of the PROFSET model is that when evaluating the
business value of a product, one should not only look at the
individual profits generated by that product (the na\"{i}ve
approach), but one must also take into account the profits due to
cross-selling effects with other products in the assortment.
Therefore, to evaluate product profitability, it is essential to
look at frequent sets rather than at individual product items,
since the former represent frequently co-occurring product
combinations in the market baskets of the customer. As was also
stressed by Cabena et al.\ \cite{cabena}, one disadvantage of
association discovery is that there is no provision for taking
into account the business value of an association. Indeed, in terms
of the associations discovered, the sale of an expensive bottle
of wine with oysters counts for as much as the sale of a carton
of milk with cereal. This example illustrates that, when
evaluating the interestingness of associations, the
micro-economic framework of the retailer should be incorporated.
The PROFSET model was a first attempt to solve this problem: it
was developed to maximize cross-selling opportunities by
evaluating the profit margin generated per frequent set of
products, rather than per product. In the next subsection we
discuss the limitations of the previous PROFSET model. More
details can be found elsewhere \cite{profset}.

\subsection{Limitations} \label{limit}

The previous PROFSET model was specifically developed for market
basket data from automated convenience stores. Data sets of this
origin are characterized by small market baskets (size 2 or 3)
because customers typically do not purchase many items during a
single shopping visit. Therefore, the profit margin generated per
frequent purchase combination $(X)$ could accurately be
approximated by adding the profit margins of the market baskets
$(T_j)$ containing the same set of items, i.e., $X = T_j$.
However, for supermarket data, the existing formulation of the
PROFSET model poses significant problems since the size of market
baskets typically exceeds the size of frequent itemsets. Indeed,
in supermarket data, frequent itemsets mostly do not contain more
than 7 different products, whereas the size of the average market
basket is typically 10 to 15. As a result, the existing profit
allocation heuristic can no longer be used since it would cause
the model to heavily underestimate the profit potential from
cross-selling effects between products. However, getting rid of
this heuristic is not trivial, and it will be discussed in detail
in Section~\ref{profalloc}.

A second limitation of the existing PROFSET model relates to
principles of category management. Indeed, there is an increasing
trend in retailing to manage product categories as separate
strategic business units \cite{sagit}. In other words, because
of the trend to offer more products, retailers can no longer
evaluate and manage each product individually. Instead, they
define product categories and define marketing actions (such as
promotions or store layout) on the level of these categories. The
generalized PROFSET model takes this domain knowledge into
account and therefore offers the retailer the ability to specify
product categories and place restrictions on them.

\section{The Generalized PROFSET Model} \label{general}

In this section, we highlight the improvements made to
the previous PROFSET model \cite{profset}.

\subsection{Profit Allocation} \label{profalloc}

Dropping the equality constraint $X = T_j$ results in different
possible profit allocation systems. Indeed, it is important to
recognize that the margin of transaction $T_j$ can potentially be
allocated to different frequent subsets of that transaction. In
other words, how should the margin $m(T_j)$ be allocated to one
or more different frequent subsets of $T_j$?

The idea here is that we would like to know the purchase
intentions of the customer who bought $T_j$. Unfortunately, since
the customer has already left the store, we do not possess this
information. However, if we can assume that some items occur
more frequently together than others because they are considered
complementary by customers, then frequent itemsets may be
interpreted as purchase intentions of customers. Consequently,
there is the additional problem of finding out which and how many
purchase intentions are represented in a particular transaction
$T_j$. Indeed, a transaction may contain several frequent subsets
of different sizes, so it is not straightforward to determine
which frequent sets represent the underlying purchase intentions
of the customer at the time of shopping. Before proposing a
solution to this problem, we first define the concept of a
maximal frequent subset of a transaction.

\begin{definition}
Let $F$ be the collection of all frequent subsets of a sales
transaction $T_j$. Then $X \in F$ is called \emph{maximal},
denoted as $X_{\it max}$, if and only if $\forall Y \in F : |Y|
\leq |X|$.
\end{definition}

Using this definition, we adopt the following rationale to
allocate the margin $m(T_j)$ of a sales transaction $T_j$.

If there exists a frequent set $X = T_j$, then we allocate
$m(T_j)$ to $M(X)$, just as in the previous PROFSET model.
However, if there is no such frequent set, then one maximal
frequent subset $X$ is drawn from all maximal frequent
subsets according to the probability distribution $\Theta_{T_j}$,
with
\[
\Theta_{T_j}(X_{\it max}) = \frac{\mbox{support}(X_{\it max})}{\sum_{Y_{\it max} \subseteq T_j} \mbox{support}(Y_{\it max})},
\]
where the sum runs over all maximal frequent subsets $Y_{\it max}$ of $T_j$.
After this, the margin $m(X)$ is added to $M(X)$ and the
process is repeated for $T_j \setminus X$. In summary:
\begin{center}
\parbox{\columnwidth}{
\begin{tabbing}
\quad \= \quad \= \kill
\textbf{for} every transaction $T_j$ \textbf{do} \{ \+ \\
 \textbf{while} ($T_j$ contains frequent sets) \textbf{do} \{ \+ \\
 Draw $X$ from all maximal frequent subsets \\
 using probability distribution $\Theta_{T_j}$; \\[\medskipamount]
 $M(X) := M(X) + m(X)$ \\
 with $m(X)$ the profit margin of $X$ in $T_j$; \\[\medskipamount]
 $T_j := T_j \setminus X$; \- \\
\} \- \\
\} \\
\textbf{return} all $M(X)$;
\end{tabbing}
}
\end{center}
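
The allocation loop above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the names \texttt{maximal\_frequent\_subsets} and \texttt{allocate\_margins} are hypothetical, each transaction is assumed to be a mapping from item to the profit margin of that item in the transaction, and $m(X)$ is assumed to be the sum of the margins of the items in $X$.

```python
import random
from collections import defaultdict


def maximal_frequent_subsets(items, supports):
    """Return the frequent itemsets contained in `items` with the largest size.

    `supports` maps each frequent itemset (a frozenset) to its support.
    """
    contained = [x for x in supports if x and x <= items]
    if not contained:
        return []
    top = max(len(x) for x in contained)
    return [x for x in contained if len(x) == top]


def allocate_margins(transactions, supports, rng=None):
    """Distribute transaction margins over frequent itemsets.

    Each transaction is a dict {item: margin of that item in the basket}
    (an assumption of this sketch). Returns a dict {itemset: M(itemset)}.
    """
    rng = rng or random.Random()
    M = defaultdict(float)
    for basket in transactions:
        remaining = dict(basket)  # work on a copy; the input is not modified
        while True:
            candidates = maximal_frequent_subsets(frozenset(remaining), supports)
            if not candidates:
                break  # no frequent set is left in this transaction
            # Draw one maximal subset with probability proportional to support.
            weights = [supports[x] for x in candidates]
            chosen = rng.choices(candidates, weights=weights, k=1)[0]
            # m(X): margin of the chosen items within this transaction.
            M[chosen] += sum(remaining[i] for i in chosen)
            for i in chosen:
                del remaining[i]  # T_j := T_j \ X
    return dict(M)
```

Margins of items that belong to no frequent set remain unallocated in this sketch, which mirrors the termination condition of the loop above.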

Suppose that, during profit allocation, we are given a transaction
\[
T = \{\mbox{cola}, \mbox{peanuts}, \mbox{cheese}\}.
\]
Table~\ref{subsets} contains all frequent subsets of $T$ for a
particular transaction database.
\begin{table}
\centering \caption{Frequent Subsets of $T$} \label{subsets}
\begin{tabular}{|lccc|}
  \hline
  \textbf{Frequent Sets} & \textbf{Support} & \textbf{Maximal} & \textbf{Unique} \\
  \hline
  \{cola\} & $10\%$ & No & No \\
  \{peanuts\} & $5\%$ & No & No \\
  \{cheese\} & $8\%$ & No & No \\
  \{cola, peanuts\} & $2\%$ & Yes & No \\
  \{peanuts, cheese\} & $1\%$ & Yes & No \\ \hline
\end{tabular}
\end{table}
In this example, there is no \emph{unique} maximal frequent subset
of $T$. Indeed, there are two maximal frequent subsets of $T$,
namely \{cola, peanuts\} and \{peanuts, cheese\}. Consequently,
it is not obvious to which maximal frequent subset the profit
margin $m(T)$ should be allocated. Moreover, we would not
allocate the entire profit margin $m(T)$ to the selected itemset,
but rather the proportion $m(X)$ that corresponds to the items
contained in the selected maximal subset.

How can one determine to which of the two maximal frequent subsets of
$T$ this margin should be allocated? As we have already
discussed, the crucial idea here is that it depends on
the purchase intentions of the customer who
purchased $T$. Unfortunately, one can never know these exactly since we
have not asked the customer at the time of purchase. However, the
support of the frequent subsets of $T$ may provide some
probabilistic estimation. Indeed, if the support of a frequent
subset is an indicator for the probability of occurrence of this
purchase combination, then according to the data, customers buy
the maximal subset \{cola, peanuts\} twice as frequently
as the maximal subset \{peanuts, cheese\}. Consequently, we can
say that it is more likely that the customer's purchase intention
has been \{cola, peanuts\} instead of \{peanuts, cheese\}. This
information is used to construct the probability distribution
$\Theta_{T}$, reflecting the relative frequencies of the
maximal frequent subsets of $T$. Now, each time a sales transaction
\{cola, peanuts, cheese\} is encountered in the data, a random
draw from the probability distribution $\Theta_{T}$ will
provide the \emph{most probable} purchase intention (i.e.,
frequent subset) for that transaction. Consequently, on average
in two out of three times this transaction is encountered,
maximal subset \{cola, peanuts\} will be selected and
$m(\{\mbox{cola}, \mbox{peanuts}\})$ will be allocated to
$M(\{\mbox{cola}, \mbox{peanuts}\})$. After this, $T$ is split up
as follows: $T := T \setminus \{\mbox{cola}, \mbox{peanuts}\}$,
and the process of assigning the remaining margin is repeated as
if the new $T$ were a separate transaction, until $T$ no longer
contains a frequent set.
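
As a worked illustration of the draw, using only the supports listed in Table~\ref{subsets}, the two maximal frequent subsets of $T$ receive the probabilities
\[
\Theta_{T}(\{\mbox{cola}, \mbox{peanuts}\}) = \frac{2\%}{2\% + 1\%} = \frac{2}{3},
\qquad
\Theta_{T}(\{\mbox{peanuts}, \mbox{cheese}\}) = \frac{1\%}{2\% + 1\%} = \frac{1}{3}.
\]
If \{cola, peanuts\} is drawn, the remainder $T \setminus \{\mbox{cola}, \mbox{peanuts}\} = \{\mbox{cheese}\}$ has \{cheese\} as its only frequent subset, which is therefore selected with probability 1. Its margin $m(\{\mbox{cheese}\})$ is added to $M(\{\mbox{cheese}\})$, after which the remaining transaction is empty and the loop stops.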

\subsection{Category Management Restrictions} \label{category}

As pointed out in Section~\ref{limit}, a second limitation of the
previous PROFSET model is its inability to include category
management restrictions. This sometimes causes the model to
exclude all products from one or more categories because
they do not contribute enough to the overall profitability of the
optimal set. This often contradicts the mission of
retailers to offer customers a wide range of products, even if
some of those categories or products are not profitable enough.
Indeed, customers expect supermarkets to carry a wide variety of
products, and cutting away categories or departments would go against
the customers' expectations about the supermarket and would harm
the store's image. Therefore, we want to offer the retailer the
ability to include category restrictions in the generalized
PROFSET model.

This can be accomplished by adding an additional index to the
product variable $Q_i$ to account for category membership, and by
adding constraints on the category level. Several kinds of
category restrictions can be introduced: which and how many
categories should be included in the optimal set, or how many
products from each category should be included. The relevance of
these restrictions can be illustrated by the following common
practices in retailing. First, when composing a promotion
leaflet, there is only limited space to display products and
therefore it is important to optimize the product composition in
order to maximize cross-selling effects between products and
avoid product cannibalization. Moreover, depending on the
particular retail environment, the retailer will include or
exclude specific products or product categories in the leaflet.
For example, the supermarket in this study attempts to
differentiate itself from the competition by the following image
components: \emph{fresh}, \emph{profitable} and \emph{friendly}.
Therefore, the promotion leaflet of the retailer emphasizes
product categories that support this image, such as fresh
vegetables and meat, freshly-baked bread, ready-made meals, and
others. Second, product category constraints may reflect shelf
space allocations to products. For instance, large categories
have more product facings than smaller categories. These kinds of
constraints can easily be included in the generalized PROFSET
model, as will be discussed hereafter.

\subsection{The Generalized PROFSET Model}

Bundling the improvements suggested in Sections~\ref{profalloc}
and~\ref{category} results in the generalized PROFSET model as
presented below.

Let categories $C_1, \ldots, C_n$ be sets of items, $L$ the set
of frequent itemsets, and let $P_X$, $Q_i \in \{0,1\}$ be the
decision variables for which the optimization routine must find
the optimal values. $P_X$ specifies whether an itemset $X$
contributes to the value of the objective function, and
$Q_i$ equals 1 as soon as any itemset $X$ that contains item $i$
is selected ($P_X = 1$) by the optimization routine. Let ${\rm
Cost}_i$ be the inventory and handling cost of item $i$. The
objective is to maximize all profits
from cross-selling effects between products:
\[
\max \left( \sum_{X \in L} M(X) P_X - \sum_{c=1}^{n}\sum_{i \in C_c} {\rm Cost}_i Q_i\right)
\]
subject to the following constraints:
\begin{eqnarray}
\label{c1}
\sum_{c=1}^{n} \sum_{i \in C_c} Q_i & = & \mbox{ItemMax} \\
\label{c2}
\forall X \in L,\:\forall i \in X: \quad Q_i & \geq & P_X \\
\label{c3}
\forall C_c: \quad \sum_{i \in C_c} Q_i & \geq & \mbox{ItemMin}_{C_c}
\end{eqnarray}

Constraint~(\ref{c1}) determines how many items are allowed to be
included in the optimal set. The $\mbox{ItemMax}$ parameter,
specified by the retailer, depends on the retail environment
in which the model is being used. For instance, it may be the
number of eye-catchers (products obtaining special display space)
in the supermarket or the number of facings in a promotion
leaflet. Constraint~(\ref{c2}) is analogous to the one in the
previous PROFSET model and specifies the relationship between the
frequent sets and the products contained in them. Finally,
constraint~(\ref{c3}) specifies the number of categories and the
number of products that are allowed, within each category, to
enter the optimal set.
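
To make the formulation concrete, the following Python sketch solves a toy instance of the model by exhaustive enumeration. It is an illustration under stated assumptions rather than the optimization routine used in this paper: the function name \texttt{solve\_profset} and all input names are hypothetical, categories are assumed to be disjoint, and enumeration is only feasible for very small assortments. For a fixed item selection, every frequent set with positive $M(X)$ whose items are all selected is switched on, which is optimal given the second constraint.

```python
from itertools import combinations


def solve_profset(categories, M, cost, item_max, item_min):
    """Brute-force a tiny instance of the 0-1 selection model.

    categories: {category name: set of items} (assumed disjoint)
    M:          {frozenset itemset: allocated margin M(X)}
    cost:       {item: inventory and handling cost}
    item_max:   exact number of items to select
    item_min:   {category name: minimum number of selected items}
    Returns (best objective value, selected item set), or (None, None)
    if no selection satisfies the constraints.
    """
    items = sorted(i for members in categories.values() for i in members)
    best_value, best_set = None, None
    for chosen in combinations(items, item_max):  # enforces sum Q_i = ItemMax
        selected = set(chosen)
        # Category constraint: enough items from every category.
        if any(len(selected & members) < item_min[c]
               for c, members in categories.items()):
            continue
        # P_X = 1 only if every item of X is selected (Q_i >= P_X).
        gain = sum(m for x, m in M.items() if m > 0 and x <= selected)
        value = gain - sum(cost[i] for i in selected)
        if best_value is None or value > best_value:
            best_value, best_set = value, selected
    return best_value, best_set
```

In a realistic setting with thousands of items, a mixed-integer programming solver would replace the enumeration; the sketch is only meant to show how the objective and the three constraints interact.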

\section{Empirical Study} \label{impl}

The empirical study is based on a data set of $18\,182$ market
baskets obtained from a sales outlet of a Belgian supermarket
chain over a period of one month. The store carries $9\,965$
different products grouped in 218 product categories. The
average market basket contains $10.6$ different product items. In
total, $3\,381$ customers own a loyalty card of the supermarket under
study.

First, frequent sets and association rules were discovered from
the market baskets with a minimum absolute support threshold of
30 transactions. The motivation behind this is that a product or
set of products should have been sold approximately
once a day to be called frequent. Slightly more than $87\%$ of
the products are sold less than once a day.
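
Expressed as a relative support, the absolute threshold of 30 transactions corresponds to
\[
\frac{30}{18\,182} \approx 0.00165 = 0.165\%,
\]
so an itemset must occur in roughly one out of every 606 market baskets to be considered frequent.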

The retailer in question is interested in finding the optimal set
of eye-catchers such that the profit from cross-selling these
eye-catchers is maximized. This goal is represented by
the objective function described in the previous section.
However, because of limited shelf space for each product
category, the retailer specified that each product category can
only delegate one product to the optimal set, which is represented by the
category constraint (i.e., constraint~(\ref{c3})). Consequently, it
is the goal of the generalized PROFSET model to select the most
profitable set of products in terms of cross-selling
opportunities between the delegates of each category.

For 54 $(24.7\%)$ of the 218 product categories, the generalized
PROFSET model selects a different product than the one with the
highest individual profit ranking within each category. This
suggests that for these products, there must be some
cross-selling opportunity with eye-catchers from other categories
which causes these products to get \emph{promoted} in the
profitability ranking.

Due to space limitations, Table~\ref{profit} shows the relative
improvements in cross-selling profit for only some categories,
expressed as the percentage of improvement in cross-selling
profits obtained by choosing the optimal products from the generalized
PROFSET model instead of selecting the product with the highest
individual profitability within each category.

\begin{table}
\caption{Cross-selling profit improvements} \label{profit}
\centering
\begin{tabular}{|lc|}
  \hline
  \textbf{Category} & \textbf{Improvement} \\
  \hline
Washing-up liquid & 21\% \\
Baby food & 49\% \\
Margarine 1 & 189\% \\
Coffee biscuits & 14\% \\
Sandwich filling & 43\% \\
Candy bars & 588\% \\
Canned fish & N/A \\
Canned fruit & 3\% \\
Packed-up bread & 8\% \\
Newspapers and magazines & 55\% \\
$\ldots$ & $\ldots$ \\
\hline
\end{tabular}
\end{table}
\begin{table*}[t]
\centering \caption{Own and cross-selling profit figures (in BEF)
per product} \label{cross}
\begin{tabular}{|lccc|}
  \hline
  & \textbf{Own} & \textbf{Cross-selling} & \textbf{Total} \\
  \textbf{Product} & \textbf{profit} & \textbf{profit} & \textbf{profit} \\
  \hline
  1. {\sc milky way mini} & $37\,808$ & $2\,350$ & $40\,158$ \\
  2. {\sc melo cakes} & $34\,333$ & 0 & $34\,333$ \\
  3. {\sc Leo 3-pack} & $28\,728$ & 0 & $28\,728$ \\
  4. {\sc Leo 10-pack} 10+2 & $12\,028$ & $264\,228$ & $276\,256$ \\
\hline
\end{tabular}
\end{table*}

In Table~\ref{profit}, N/A means that there is no alternative product
in that category with enough support to be frequent, so that a
comparison with the product selected by the generalized PROFSET
model is not applicable. It would lead us too far to discuss the profit improvements in
detail for all categories. Therefore, we highlight one of
the most striking results to illustrate the power of the model.
Analogous conclusions can be obtained for other categories.
For instance, for the category candy
bars, selecting the eye-catcher chosen by the generalized PROFSET
model, rather than the product with the highest individual profit, would increase
cross-selling profits by $588\%$. This can be observed in
Table~\ref{cross} (only relevant products are included).
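
One way to reproduce the reported figure from Table~\ref{cross}, under the assumption (not stated explicitly in the text) that the percentage compares the total profit of product 4 with that of product 1, the product with the highest own profit, is
\[
\frac{276\,256 - 40\,158}{40\,158} = \frac{236\,098}{40\,158} \approx 5.88,
\]
that is, an improvement of approximately $588\%$.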

Table~\ref{cross} illustrates that product 4 in the candy bars
category is ranked last when looking at its own profit. However,
due to large cross-selling effects with eye-catchers of other
product categories, this product becomes much more important when
looking at the total profit. This illustrates that for the
eye-catchers application, it is better to display product `Leo
10-pack 10+2' than to display one of its competing products in
the same category. In contrast, if the objective were the
selling volume of the individual product, then it would be better
to select product 1 as eye-catcher, but since the retailer wants
the customer to buy other products with it, product 4 is
the best choice. The association rules discovered
during the mining phase support these conclusions. \\[\medskipamount]
{\sc milky way} $\Rightarrow$ {\sc vegetable/fruit} \\
(sup=$0.17\%$, conf=$50.82\%$) \\
{\sc meat product and Leo 10-pack} $\Rightarrow$ {\sc cheese
product} \\
(sup=$0.396\%$, conf=$55\%$)
\\[\medskipamount] Note that the products included in the rules
are all eye-catchers as determined by the generalized
PROFSET model. The other items contained in the
association rules carry a rather abstract name, such as ``cheese
product'', because this is a collective name for products that
do not have their own barcode, for instance different cheese
products that are weighed at the check-out, after which they are
grouped under an abstract product name such as ``cheese product''.

Finally, for those product categories that do not contain
frequent products, the generalized PROFSET model will choose the
product with the highest individual profit in order to maximize
the overall profitability of the eye-catcher set.

\section{Further Research} \label{concl}

The authors plan to test the proposed model in practice and
externally validate its performance based on a real-world
experiment in cooperation with the Belgian supermarket chain.
Furthermore, additional improvements to the model will be
considered. More specifically, it will be studied how promotion
coupons affect the composition of the optimal set of products and
whether it is possible to measure the effect of the value of the price
reduction on the cross-selling profitability of products.

\bibliographystyle{plain}
\begin{thebibliography}{10}

\bibitem{actipat}
G.~Adomavicius and A.~Tuzhilin.
\newblock Discovery of actionable patterns in databases: the action hierarchy
  approach.
\newblock In D.~Heckerman, H.~Mannila, and D.~Pregibon, editors, {\em
  Proceedings of the Third International Conference on Knowledge Discovery \&
  Data Mining}, pages 111--114. AAAI Press, 1997.

\bibitem{ais}
R.~Agrawal, T.~Imielinski, and A.N. Swami.
\newblock Mining association rules between sets of items in large databases.
\newblock In {\em Proceedings of the 1993 {ACM SIGMOD} International Conference
  on Management of Data}, volume 22:2 of {\em SIGMOD Record}, pages 207--216.
  ACM Press, 1993.

\bibitem{profset}
T.~Brijs, G.~Swinnen, K.~Vanhoof, and G.~Wets.
\newblock Using association rules for product assortment decisions: a case
  study.
\newblock In Heckerman et~al. \cite{kdd99}, pages 254--260.

\bibitem{redundancy}
T.~Brijs, K.~Vanhoof, and G.~Wets.
\newblock Reducing redundancy in characteristic rule discovery by using integer
  programming techniques.
\newblock In {\em Intelligent Data Analysis Journal}, volume 4:3. Elsevier,
  2000.
\newblock To appear.

\bibitem{cabena}
P.~Cabena, P.~Hadjinian, R.~Stadler, J.~Verhees, and A.~Zanasi.
\newblock {\em Discovering Data Mining: From Concept to Implementation}.
\newblock Prentice Hall, 1997.

\bibitem{sagit}
G.~Cuomo and A.~Pastore.
\newblock A category management application in the frozen food sector in Italy:
  the Unilever-Sagit case.
\newblock In A.~Broadbridge, editor, {\em Proceedings of the 10th International
  Conference on Research in the Distributive Trades}, pages 225--233. Institute
  for Retail Studies: University of Stirling, 1999.

\bibitem{implic}
S.~Guillaume, F.~Guillet, and J.~Philipp\'e.
\newblock Improving the discovery of association rules with intensity of
  implication.
\newblock In {\em Principles of Data Mining and Knowledge Discovery}, volume
  1510 of {\em Lecture Notes in Artificial Intelligence}, pages 318--327.
  Springer, 1998.

\bibitem{kdd99}
D.~Heckerman, H.~Mannila, and D.~Pregibon, editors.
\newblock {\em Proceedings of the Fifth International Conference on Knowledge
  Discovery \& Data Mining}. AAAI Press, 1997.

\bibitem{papadimitriou}
J.~Kleinberg, C.~Papadimitriou, and P.~Raghavan.
\newblock A microeconomic view of data mining.
\newblock In {\em Knowledge Discovery and Data Mining}, volume 2:4, pages
  254--260. Kluwer Academic Publishers, 1998.

\bibitem{interest}
M.~Klemettinen, H.~Mannila, P.~Ronkainen, H.~Toivonen, and A.I. Verkamo.
\newblock Finding interesting rules from large sets of discovered association
  rules.
\newblock In Nabil~R. Adam, Bharat~K. Bhargava, and Yelena Yesha, editors, {\em
  Proceedings of the Third International Conference on Information and
  Knowledge Management}, pages 401--407. ACM Press, 1994.

\bibitem{p_analysis}
B.~Liu and W.~Hsu.
\newblock Post-analysis of learned rules.
\newblock In {\em Proceedings of the Thirteenth National Conference on
  Artificial Intelligence}, Lecture Notes in Artificial Intelligence, pages
  828--834. AAAI Press/MIT Press, 1996.

\bibitem{prune}
B.~Liu, W.~Hsu, and Y.~Ma.
\newblock Pruning and summarizing the discovered associations.
\newblock In Heckerman et~al. \cite{kdd99}, pages 125--134.

\bibitem{unexpectedness}
B.~Padmanabhan and A.~Tuzhilin.
\newblock Unexpectedness as a measure of interestingness in knowledge
  discovery.
\newblock In {\em Decision Support Systems}, volume~27, pages 303--318.
  Elsevier Science, 1999.

\bibitem{correlation}
C.~Silverstein, S.~Brin, and R.~Motwani.
\newblock Beyond market baskets: generalizing association rules to dependence
  rules.
\newblock In {\em Knowledge Discovery and Data Mining}, volume 2:1, pages
  39--68. Kluwer Academic Publishers, 1998.

\bibitem{nar}
K.~Wang, S.H.W. Tay, and B.~Liu.
\newblock Interestingness-based interval merger for numeric association rules.
\newblock In R.~Agrawal, P.~Stolorz, and G.~Piatetsky-Shapiro, editors, {\em
  Proceedings of the Fourth International Conference on Knowledge Discovery \&
  Data Mining}, pages 121--127. AAAI Press, 1998.

\end{thebibliography}

\end{document}