0303:cs0303025/music.tex

1: \documentclass{article}

2:

3: \usepackage{fullpage,latexsym}

4: \usepackage{epsfig}

5: \usepackage{pslatex}

6:

7:

8: \bibliographystyle{plain}

9:

10: \begin{document}

11:

12: \title{Algorithmic Clustering of Music}

13: \author{Rudi Cilibrasi\thanks{Supported in part by NWO.

14: Address:

15: CWI, Kruislaan 413,

16: 1098 SJ Amsterdam, The Netherlands.

17: Email: {\tt Rudi.Cilibrasi@cwi.nl}.}

18: \\CWI

19: \and

20: Paul Vitanyi\thanks{Supported in part by the

21: EU project RESQ, IST--2001--37559, the NoE QUIPROCONE

22: IST--1999--29064,

23: the ESF QiT Programme, and the EU Fourth Framework BRA

24:  NeuroCOLT II Working Group

25: EP 27150.

26: Address:

27: CWI, Kruislaan 413,

28: 1098 SJ Amsterdam, The Netherlands.

29: Email: {\tt Paul.Vitanyi@cwi.nl}.}\\CWI and University of Amsterdam

30: \and

31: Ronald de Wolf\thanks{Supported in part by EU project RESQ, IST--2001--37559.

32: Address:

33: CWI, Kruislaan 413,

34: 1098 SJ Amsterdam, The Netherlands.

35: Email: {\tt Ronald.de.Wolf@cwi.nl}.}\\

36: CWI}

37: \date{}

38: \maketitle

39:

40:

41: \begin{abstract}

42: We present a fully automatic method for music classification,

43: based only on compression of strings that represent the music

44: pieces. The method uses no background knowledge

45: about music whatsoever: it is completely

46: general and can, without change, be used in different areas

47: like linguistic classification and genomics. It is based on an ideal

48: theory of the information content in individual objects

49: (Kolmogorov complexity), information distance, and a

50: universal similarity metric. Experiments show that the method

51: distinguishes reasonably well between various musical genres

52: and can even cluster pieces by composer.

53: \end{abstract}

54:

55:

56:

57: \section{Introduction}

58:

59: All musical pieces are similar, but some are more similar than others.

60: Apart from being an infinite source of discussion (``Haydn is just

61: like Mozart --- no, he's not!''), such similarities are also

62: crucial for the design of efficient music information retrieval systems.

63: The amount of digitized music available on the internet has grown

64: dramatically in recent years, both in the public domain

65: and on commercial sites. Napster and its clones are prime examples.

66: Websites offering musical content in some form or other

67: (MP3, MIDI, \ldots) need a way to organize their wealth of material;

68: they need to somehow classify their files according to

69: musical genres and subgenres, putting similar pieces together.

70: The purpose of such organization is to enable users

71: to navigate to pieces of music they already know and like,

72: but also to give them advice and recommendations

73: (``If you like this, you might also like\ldots'').

74: Currently, such organization is mostly done manually by humans,

75: but some recent research has been looking into the possibilities

76: of automating music classification.

77:

78: A human expert, comparing different pieces of music with the aim of clustering

79: like pieces together, will generally look for certain specific similarities.

80: Previous attempts to automate this process do the same.

81: Generally speaking, they take a file containing a piece of music

82: and extract from it various specific numerical features,

83: related to pitch, rhythm, harmony, etc.

84: One can extract such features using for instance

85: Fourier transforms~\cite{TC02} or wavelet transforms~\cite{GKCwavelet}.

86: The feature vectors corresponding to the various files are then

87: classified or clustered using existing classification software, based on

88: various standard statistical pattern recognition classifiers~\cite{TC02},

89: Bayesian classifiers~\cite{DTWml},

90: hidden Markov models~\cite{CVfolk},

91: ensembles of nearest-neighbor classifiers~\cite{GKCwavelet}

92: or neural networks~\cite{DTWml,Sneural}.

93: For example, one feature would be to look for rhythm in the sense

94: of beats per minute. One can make a histogram where each histogram

95: bin corresponds to a particular tempo in beats-per-minute and

96: the associated peak shows how frequent and strong that

97: particular periodicity was over the entire piece. In \cite{TC02}

98: we see a gradual change from a few high peaks to many low and spread-out

99: ones going from hip-hop, rock, jazz, to classical. One can use this

100: similarity type to try to cluster pieces in these categories.

101: However, such a method requires specific and detailed knowledge of

102: the problem area, since one needs to know what features to look for.

103:

104: Our aim is much more general.

105: We do not look for similarity in specific features known to

106: be relevant for classifying music;

107: instead we apply a general mathematical theory of similarity.

108: The aim is to capture, in a single similarity metric,

109: {\em every effective metric\/}:

110: effective versions of Hamming distance, Euclidean distance,

111: edit distances, Lempel-Ziv distance, and so on.

112: Such a metric would be able to simultaneously detect {\em all\/}

113: similarities between pieces that other effective metrics can detect.

114: Rather surprisingly, such a ``universal'' metric indeed exists.

115: It was developed in \cite{LBCKKZ01,Li01,Li03}, based on the

116: ``information distance'' of \cite{LiVi97,BGLVZ98}.

117: Roughly speaking, two objects are deemed close if

118: we can significantly ``compress'' one given the information

119: in the other, the idea being that if two pieces are more similar,

120: then we can more succinctly describe one given the other.

121: Here compression is based on the ideal mathematical notion of Kolmogorov

122: complexity, which unfortunately is not effectively computable.

123: It is well known that when a pure mathematical theory

124: is applied to the real world, for example in hydrodynamics

125: or in physics in general, we can in applications only approximate

126: the theoretical ideal. But still the theory gives a framework and foundation

127: for the applied science. Similarly here. We replace the ideal but

128: noncomputable Kolmogorov-based version by standard compression techniques.

129: We lose theoretical optimality in some cases, but gain an efficiently

130: computable similarity metric intended to

131:  approximate the theoretical ideal.

132: In contrast, a later and partially independent

133: compression-based approach of

134: \cite{BCL02a,BCL02b} for building language-trees---while

135: citing \cite{LiVi97,BGLVZ98}---rests on {\em ad hoc\/} arguments

136: about empirical Shannon entropy and Kullback-Leibler distance

137: and results in non-metric distances.

138:

139: Earlier research has demonstrated that this new universal similarity

140: metric works well on concrete examples in very different application

141: fields---the first completely automatic construction

142: of the phylogeny tree based on whole mitochondrial genomes

143: \cite{LBCKKZ01,Li01,Li03}, and

144: a completely automatic construction of a language tree for over 50

145: Euro-Asian languages \cite{Li03}.

146: Other applications, not reported in print,

147: are detecting plagiarism in student programming assignments

148: \cite{SID}, and phylogeny of chain letters.

149:

150: In this paper we apply this compression-based method to the classification of

151: pieces of music. We perform various experiments on sets of

152: mostly classical pieces given as MIDI (Musical Instrument Digital

153: Interface) files. This contrasts with most earlier research,

154: where the music was digitized in some wave format or other

155: (the only other research based on MIDI that we are aware

156: of is~\cite{DTWml}).

157: We compute the distances between all pairs of pieces,

158: and then build a tree containing those pieces in a way that

159: is consistent with those distances.

160: First, as proof of principle, we run the program on three

161: artificially generated data sets, where we know what

162: the final answer should be.

163: The program indeed classifies these perfectly.

164: Secondly, we show that our program can distinguish between various

165: musical genres (classical, jazz, rock) quite well.

166: Thirdly, we experiment with various sets of classical pieces.

167: The results are quite good (in the sense of conforming

168: to our expectations) for small sets of data,

169: but tend to get a bit worse for large sets.

170: Considering the fact that the method knows nothing

171: about music, or, indeed, about any of the other areas

172: we have applied it to elsewhere, one is reminded of Dr Johnson's

173: remark

174: %Boswell: I told him I had been that morning at a meeting of the people

175: %called Quakers, where I had heard a woman preach.

176: %Johnson: "Sir, a woman's preaching is like

177: about a dog's walking on his hind legs:

178: ``It is not done well; but you are surprised to find it done at all.''

179:

180: The paper is organized as follows.

181: We first give a  domain-independent overview of compression-based

182: clustering: the ideal distance metric based on Kolmogorov complexity,

183: and the quartet method that turns the matrix of distances into a tree.

184: In Section~\ref{secdetails} we give the details of the current application

185: to music, the specific file formats used etc.

186: In Section~\ref{secresults} we report the results of our experiments.

187: We end with some directions for future research.

188:

189:

190:

191: \section{Algorithmic Clustering}

192:

193: \subsection{Kolmogorov complexity}

194: Each object (in the application of this paper: each piece of music) is

195: coded as a string $x$ over a finite alphabet, say the binary

196: alphabet.

197: The integer $K(x)$ gives

198: the length of the shortest compressed binary version from which

199: $x$ can be fully reproduced,

200: also known as the {\em Kolmogorov complexity\/} of $x$.

201: ``Shortest'' means the minimum taken over every

202: possible decompression program, the

203: ones that are currently known as well as the ones that are possible

204: but currently unknown. We explicitly write only ``decompression''

205: because we do not even require that there is also a program that

206: compresses the original file to this compressed version---if there

207: is such a program then so much the better.

208: Technically, the definition of Kolmogorov complexity is as follows.

209: First, we fix a syntax for expressing all and only computations (computable

210: functions). This can be in the form of an enumeration of all

211: Turing machines, but also an enumeration of all syntactically correct

212: programs in some universal programming language like Java, Lisp, or C.

213: We then define the Kolmogorov complexity of a finite binary string

214: as the length of the shortest Turing machine, Java program, etc.

215: in our chosen syntax. Which syntax we take is unimportant, but

216: we have to stick to our choice. This choice attaches a definite positive

217: integer as the Kolmogorov complexity to each finite string.

218:

219: Though defined in terms of a

220: particular machine model, the Kolmogorov complexity

221: is machine-independent up to an additive

222: constant

223:  and acquires an asymptotically universal and absolute character

224: through Church's thesis, and from the ability of universal machines to

225: simulate one another and execute any effective process.

226:   The Kolmogorov complexity of an object can be viewed as an absolute

227: and objective quantification of the amount of information in it.

228:    This leads to a theory of {\em absolute} information {\em contents}

229: of {\em individual} objects in contrast to classic information theory

230: which deals with {\em average} information {\em to communicate}

231: objects produced by a {\em random source}.

232:

233: So $K(x)$ gives the length of the ultimate

234: compressed version, say $x^*$, of $x$.

235: This can be considered as the amount of information, number of bits,

236: contained in the string. Similarly, $K(x|y)$ is the minimal number of

237: bits (which we may think of as constituting a computer program)

238: required to reconstruct $x$ from $y$.

239: In a way $K(x)$ expresses the individual ``entropy'' of $x$---the

240: minimal number of bits to communicate $x$ when sender and

241: receiver have no knowledge where $x$ comes from. For example,

242: to communicate Mozart's ``Zauberfl\"ote'' from a library of a

243: million items requires at most 20 bits ($2^{20}\approx 1{,}000{,}000$),

244: but to communicate it from scratch requires megabits.

245: For more details on this pristine notion of individual

246: information content we refer to the textbook

247: \cite{LiVi97}.
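Since $K(x)$ itself cannot be computed, any real compressor only yields an upper bound on it. The following minimal sketch illustrates this with Python's standard {\tt bz2} module; the function name {\tt approx\_K} and the two test strings are our own illustrative choices, not part of the method described here.

```python
import bz2
import random

def approx_K(data: bytes) -> int:
    """Upper-bound proxy for K(data): bit length of the bzip2-compressed string."""
    return 8 * len(bz2.compress(data))

rng = random.Random(0)
regular = b"ab" * 5000                                   # highly regular, 10000 bytes
noisy = bytes(rng.randrange(256) for _ in range(10000))  # pseudo-random, 10000 bytes

# The regular string compresses to far fewer bits than the pseudo-random one.
print(approx_K(regular), approx_K(noisy))
```

Note that the pseudo-random string actually has low Kolmogorov complexity, since a short program with a fixed seed generates it; the compressor simply cannot discover that description, which is why the compressed length is only an upper bound.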

248:

249:

250: \subsection{Distance-based classification}

251:

252: As mentioned, our approach is based on a new

253: very general similarity distance, classifying the objects in

254: clusters of objects that are close together according to this distance.

255: In mathematics, lots of different distances arise in all sorts of contexts,

256: and one usually requires these to be a `metric', since otherwise

257: undesirable effects may occur.

258: A metric is a distance function $D(\cdot,\cdot)$ that assigns

259: a non-negative distance $D(a,b)$ to any two objects $a$ and $b$, in such a way that

260: \begin{enumerate}

261: \item $D(a,b)=0$ if and only if $a=b$

262: \item $D(a,b)=D(b,a)$ (symmetry)

263: \item $D(a,b)\leq D(a,c)+D(c,b)$ (triangle inequality)

264: \end{enumerate}

265: A familiar example of a metric is the Euclidean metric,

266: the everyday distance $e(a,b)$ between two objects $a,b$

267: expressed in, say, meters.

268: Clearly, this distance satisfies the properties

269: $e(a,a)=0$, $e(a,b)=e(b,a)$, and $e(a,b) \leq e(a,c) + e(c,b)$.

270: (Substitute $a=$ Amsterdam, $b=$ Brussels, and $c=$ Chicago.)

271: We are interested in ``similarity metrics''.

272: For example, if the objects are classical music pieces

273: then the function $D(a,b)=0$ if $a$ and $b$ are by the same composer

274: and $D(a,b)=1$ otherwise, is a similarity metric, albeit a somewhat elusive one.

275: This captures only one, but quite a significant, similarity aspect

276: between music pieces.
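For a finite collection of objects, the three metric requirements can be checked by brute force. The sketch below is only an illustration; the function {\tt is\_metric}, its tolerance parameter, and the two example distances on integers are assumptions of ours.

```python
from itertools import product

def is_metric(points, D, tol=1e-12):
    """Check the three metric axioms for D on a finite collection of distinct points."""
    for a, b in product(points, repeat=2):
        if D(a, b) < 0:
            return False
        if (D(a, b) == 0) != (a == b):         # zero distance exactly when a = b
            return False
        if abs(D(a, b) - D(b, a)) > tol:       # symmetry
            return False
    for a, b, c in product(points, repeat=3):
        if D(a, b) > D(a, c) + D(c, b) + tol:  # triangle inequality
            return False
    return True

# Absolute difference on a line is a metric; squared difference is not,
# because it violates the triangle inequality (4 > 1 + 1 for 0, 1, 2).
print(is_metric([0, 1, 2, 5], lambda a, b: abs(a - b)))
print(is_metric([0, 1, 2], lambda a, b: (a - b) ** 2))
```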

277:

278: In \cite{Li03}, a new theoretical approach

279: to a wide class of similarity metrics was proposed:

280: the ``normalized information distance'' is a metric, and it is

281: universal in the sense that this single metric uncovers all similarities

282: simultaneously that the metrics in the class uncover separately.

283: This should be understood in the sense that if two pieces of music

284: are similar (that is, close) according to the particular feature described by

285: a particular metric, then they are also similar (that is, close)

286: in the sense of the normalized information distance metric. This justifies

287: calling the latter {\em the\/} similarity metric.

288: Oblivious to the problem area concerned, simply using the distances

289: according to the similarity metric, our method fully automatically

290: classifies the objects concerned, be they music pieces,

291: text corpora, or genomic data.

292: %Here we apply this miracle method,

293: %the Kolmogorov Similarity Estimator (KSE), to the problem of clustering

294: %classic music pieces according to their composers.

295: %By the example metric above, therefore,

296: %the universal metric should capture whether two

297: %pieces are by the same composer.

298:

299: More precisely, the approach is as follows.

300: Each pair of such strings $x$ and $y$ is assigned a distance

301: \begin{equation}\label{eq.distance}

302: d(x,y) = \frac{\max\{K(x|y),K(y|x)\}}{\max\{K(x),K(y) \}} .

303: \end{equation}

304: There is a natural interpretation to $d(x,y)$: If, say, $K(y) \geq K(x)$

305: then we can rewrite

306: \[d(x,y) = \frac{K(y)-I(x:y)}{K(y)} , \]

307: where $I(x:y)$ is the information in $y$ about $x$ satisfying

308: the symmetry property $I(x:y)=I(y:x)$ up to a logarithmic additive error

309: \cite{LiVi97}.

310: That is, the distance $d(x,y)$ between $x$ and $y$ is the

311: number of bits of information that is not shared between the two strings

312: per bit of information that could be maximally shared between the two strings.
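The rewriting above can be checked in two steps. This is a sketch in which every (in)equality is read up to the logarithmic additive error mentioned above, and in which we assume the standard definition $I(x:y) = K(y) - K(y|x)$. By symmetry of information, both $K(x)+K(y|x)$ and $K(y)+K(x|y)$ approximate $K(xy)$, so that
\[ K(y|x) - K(x|y) \approx K(y) - K(x) \geq 0 . \]
Hence the maximum in the numerator of (\ref{eq.distance}) is $K(y|x)$, the maximum in the denominator is $K(y)$, and
\[ d(x,y) \approx \frac{K(y|x)}{K(y)} = \frac{K(y)-I(x:y)}{K(y)} = 1 - \frac{I(x:y)}{K(y)} . \]
In particular, $d(x,y)$ is close to $0$ when $x$ contains almost all the information in $y$, and close to $1$ when the two strings share almost no information.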

313:

314: It is clear that $d(x,y)$ is symmetric, and in \cite{Li03} it

315: is shown that it is indeed a metric. Moreover, it is universal

316: in the sense that every metric expressing some similarity

317: that can be computed from the objects concerned is comprised

318: (in the sense of minorized) by $d(x,y)$. It is these distances that we

319: will use, albeit in the form of a rough approximation: for

320: $K(x)$ we simply use standard compression software like `gzip', `bzip2', or

321: `compress'. To compute the conditional version $K(x|y)$, we use

322: a sophisticated theorem, known as ``symmetry of algorithmic information''

323: in \cite{LiVi97}. This says

324: \begin{equation}\label{eq.condition}

325: K(y|x) \approx K(xy)-K(x),

326: \end{equation}

327: so, exchanging the roles of $x$ and $y$ and using $K(yx) \approx K(xy)$, we can compute the conditional complexity $K(x|y)$ as

328: the difference of the unconditional complexities $K(xy)$ and $K(y)$.

329: This allows us to approximate $d(x,y)$ for every pair $x,y$.
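A minimal sketch of this approximation, using Python's standard {\tt bz2} module in place of $K$, might look as follows. The names {\tt C} and {\tt approx\_distance} and the pseudo-random test strings are illustrative assumptions; this is not the program used for the experiments.

```python
import bz2
import random

def C(data: bytes) -> int:
    """Compressed length in bytes, standing in for the uncomputable K."""
    return len(bz2.compress(data))

def approx_distance(x: bytes, y: bytes) -> float:
    """Approximate d(x,y) using K(x|y) ~ C(xy) - C(y) and K(y|x) ~ C(xy) - C(x)."""
    cx, cy, cxy = C(x), C(y), C(x + y)
    return max(cxy - cy, cxy - cx) / max(cx, cy)

rng = random.Random(1)
x = bytes(rng.randrange(256) for _ in range(4000))
z = bytes(rng.randrange(256) for _ in range(4000))

# A string is much closer to itself than to an unrelated string.
print(approx_distance(x, x), approx_distance(x, z))
```

Because a real compressor may give slightly different lengths for $xy$ and $yx$, this approximation need not be exactly symmetric, and it need not be exactly $0$ for identical inputs.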

330:

331:

332: Our actual practice falls short of the ideal theory in at least

333: three respects:

334:

335: (i) The claimed universality of the similarity distance $d(x,y)$

336: holds only for indefinitely long sequences $x,y$. Once we consider

337: strings $x,y$ of definite length $n$, the similarity distance

338: is only universal with respect to ``simple'' computable normalized information

339: distances, where ``simple'' means that they are computable by programs

340: of length, say, logarithmic or polylogarithmic in $n$.

341: This reflects the fact that, technically speaking, the universality

342: is achieved by summing the weighted contribution of all

343: similarity distances in the class considered with respect

344: to the objects considered. Only similarity distances of which

345: the complexity is small (which means that the weight is large)

346: with respect to the size of the data concerned kick in.

347:

348: (ii) The Kolmogorov complexity is not computable, and it is

349: in principle impossible to compute how far off our approximation

350: is from the target value in any useful sense.

351:

352: (iii) To approximate the information distance in a practical sense

353: we use the standard compression program bzip2. While better compression

354: of a string will always  approximate the Kolmogorov complexity better,

355: this is, regrettably, not true for the (normalized) information distance.

356: Namely, using (\ref{eq.condition}) we consider the difference of

357: two compressed quantities. Different compressors may compress

358: the two quantities differently, causing an increase in the

359: difference even when both quantities are compressed better (but

360: not both as well). In the normalized information distance we

361: also have to deal with a ratio that causes the same problem.

362: Thus, a better compression program may not necessarily mean

363: that we also approximate the (normalized) information distance

364: better. This was borne out by the results of our experiments using

365: different compressors.

366:

367: Despite these caveats it turns out that the practice inspired by

368: the rigorous ideal theory performs quite well.

369: We feel this is an example that an {\em ad hoc\/}

370: approximation guided by a good theory is preferable to

371: {\em ad hoc\/} approaches without underlying theoretical foundation.

372:

373:

374: \subsection{The quartet method}

375:

376: The above approach allows us to compute the distance between

377: any pair of objects (any two pieces of music).

378: We now need to cluster the objects, so that objects that are similar

379: according to our metric are placed close together.

380: We do this by computing a phylogeny tree

381: based on these distances. Such a phylogeny tree can represent

382: evolution of species but more widely simply accounts for

383: closeness of objects from a set with a distance

384: metric. Such a tree will group objects in subtrees:

385: the clusters. To find the phylogeny tree there are many methods.

386: One of the most popular is the quartet method. The idea is as

387: follows: we consider every group of four elements from our set

388: of $n$ elements (in this case, musical pieces);

389: there are ${n \choose 4}$ such groups.

390: From each group $u,v,w,x$ we construct a tree of arity 3,

391: which implies that the tree consists of two subtrees of two

392: leaves each. Let us call such a tree a {\em quartet}.  There are

393: three possibilities denoted (i) $uv | wx$, (ii) $uw | vx$,

394: and (iii)  $ux | vw$, where a vertical bar divides the two pairs of leaf nodes

395: into two disjoint subtrees (Figure~\ref{figquart}).

396:

397: \begin{figure}[htb]

398: \begin{center}

399: \epsfig{file=quartet.eps,width=8cm}

400: \end{center}

401: \caption{The three possible quartets for the set of leaf labels {\em u,v,w,x} }\label{figquart}

402: \end{figure}

403:

404: The cost of a quartet is defined as the sum

405: of the distances between each pair of neighbors; that

406: is, $C_{uv|wx} = d(u,v) + d(w,x)$.  For any given tree $T$ and any group

407: of four leaf labels $u,v,w,x$, we say $T$ is {\em consistent\/} with $uv | wx$

408: if and only if the path from $u$ to $v$ does not cross

409: the path from $w$ to $x$.  Note that exactly one of the three possible

410: quartets for any set of 4 labels must be consistent for any given tree.

411: \begin{figure}[htb]

412: \begin{center}

413: \epsfig{file=quartex.eps,width=5cm}

414: \end{center}

415: \caption{An example tree consistent with quartet $uv | wx$ }\label{figquartex}

416: \end{figure}

417: We may think of a large tree having many smaller quartet trees embedded

418: within its structure  (Figure~\ref{figquartex}).  The total cost of a large tree is defined to be the

419: sum of the costs of all consistent quartets.

420: To put this cost on a common scale, we first generate a list of all possible quartets for all groups of labels

421: under consideration.  For each group of three possible quartets for a given

422: set of four labels, we calculate a best (minimal) cost and a worst (maximal)

423: cost.  Summing all best quartets yields the best (minimal) cost.

424: Conversely, summing all worst quartets yields the worst (maximal) cost.

425: The minimal and maximal values need not be attained by actual trees;

426: however, the score of any tree will lie between these two values.

427: In order to be able to compare tree scores in a more uniform way,

428: we now rescale the score linearly such that the worst score maps to 0,

429: and the best score maps to 1, and term this the

430: {\em normalized tree benefit score} $S(T)$.
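The scoring just described can be sketched as follows. The representation is an assumption of ours: the tree is an adjacency dictionary in which leaves have exactly one neighbor and internal nodes three, and the distances are given as a nested dictionary {\tt d}. The consistent quartet is found as the pairing with the smallest summed path length, which in such a tree is the pairing whose two paths do not cross.

```python
from collections import deque
from itertools import combinations

def path_lengths(tree, source):
    """Breadth-first search: number of edges from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in tree[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def tree_benefit_score(tree, d):
    """Normalized tree benefit score S(T): 0 for the worst cost, 1 for the best."""
    leaves = sorted(node for node, nbs in tree.items() if len(nbs) == 1)
    L = {leaf: path_lengths(tree, leaf) for leaf in leaves}
    total = best = worst = 0.0
    for u, v, w, x in combinations(leaves, 4):
        # The three possible quartets: uv|wx, uw|vx, ux|vw.
        pairings = [((u, v), (w, x)), ((u, w), (v, x)), ((u, x), (v, w))]
        costs = [d[a][b] + d[c][e] for (a, b), (c, e) in pairings]
        lengths = [L[a][b] + L[c][e] for (a, b), (c, e) in pairings]
        total += costs[lengths.index(min(lengths))]  # cost of the consistent quartet
        best += min(costs)
        worst += max(costs)
    if worst == best:
        return 1.0  # all trees have the same cost
    return (worst - total) / (worst - best)

names = "abcd"
d = {p: {q: 0.0 if p == q else 0.9 for q in names} for p in names}
d["a"]["b"] = d["b"]["a"] = 0.1
d["c"]["d"] = d["d"]["c"] = 0.2
good = {"a": ["n0"], "b": ["n0"], "c": ["n1"], "d": ["n1"],
        "n0": ["a", "b", "n1"], "n1": ["c", "d", "n0"]}
bad = {"a": ["n0"], "c": ["n0"], "b": ["n1"], "d": ["n1"],
       "n0": ["a", "c", "n1"], "n1": ["b", "d", "n0"]}
print(tree_benefit_score(good, d), tree_benefit_score(bad, d))
```

In this toy example the first tree pairs the close items and attains the best possible cost, while the second tree pairs distant items and attains the worst.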

431: The goal of the quartet method is to find a full tree with a maximum value

432: of $S(T)$, which is to say, the lowest total cost.

433: This optimization problem is known to be NP-hard \cite{Ji01} (which means that

434: it is infeasible in practice) but we can sometimes solve it, and

435: always approximate it. The current

436: methods in \cite{Br00} are far too computationally intensive;

437: they run many months or years on moderate-sized problems

438: of 30 objects. We have designed a simple method based

439: on randomization and hill-climbing.  First, a random tree with $2n-2$ nodes

440: is created, consisting of $n$ leaf nodes (with 1 connecting edge) labeled

441: with the names of musical pieces, and $n-2$ non-leaf or {\em internal} nodes

442: labeled with the lowercase letter ``n'' followed by a unique integer identifier.  Each internal node has exactly three connecting edges.  For this

443: tree $T$, we calculate the total cost of all consistent quartets,

444: and invert and scale this value to

445: find $S(T)$.  Typically, a random tree will be consistent with around

446: $\frac{1}{3}$ of all quartets.

447: Now, this tree is denoted the currently best known tree, and is used as

448: the basis for further searching.  We define a simple mutation on a tree

449: as one of the three possible transformations:

450: \begin{enumerate}

451: \item A {\em leaf swap}, which consists of randomly choosing two leaf nodes

452: and swapping them.

453: \item A {\em subtree swap}, which consists of randomly choosing two internal

454: nodes and swapping the subtrees rooted at those nodes.

455: \item A {\em subtree transfer}, whereby a randomly chosen subtree (possibly a leaf) is detached and reattached in another place, maintaining arity invariants.

456: \end{enumerate}

457: Each of these simple mutations keeps invariant the

458: number of leaf and internal nodes in the tree; only the structure and placements

459: change.  Define a full mutation as a sequence of at least one but potentially

460: many simple mutations, picked according to the following distribution.

461: First we pick the number $k$ of simple mutations that we will perform with

462: probability $2^{-k}$.  For each such simple mutation, we choose one of

463: the three types listed above with equal probability.  Finally, for each of

464: these simple mutations, we pick leaves or internal nodes, as necessary.  Notice

465: that trees which are close to the original tree (in terms of number of

466: simple mutation steps in between) are examined often, while trees that are

467: far away from the original tree will eventually be examined, but not very

468: frequently.

469: So in order to search for a better tree,

470: we simply apply a full mutation on $T$ to arrive at $T'$, and then

471: calculate $S(T')$.  If $S(T') > S(T)$, then keep $T'$ as the new best tree.

472: Otherwise, discard $T'$ and try a different mutation of $T$.  If $S(T')$ ever reaches

473: $1$, then halt, outputting the best tree.  Otherwise, run until it seems

474: no better trees are being found in a reasonable amount of time, in which

475: case the approximation is complete.
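The search loop can be sketched generically as below. This is a simplified illustration under stated assumptions: the simple mutations are passed in as functions that return a new state without modifying the old one, a fixed budget {\tt max\_tries} stands in for the informal stopping rule, and the usage example is a toy problem on integers rather than on trees.

```python
import random

def sample_mutation_count(rng):
    """Draw k >= 1 with probability 2**-k by counting fair coin flips."""
    k = 1
    while rng.random() < 0.5:
        k += 1
    return k

def full_mutation(state, simple_mutations, rng):
    """Apply k simple mutations, each type chosen with equal probability."""
    for _ in range(sample_mutation_count(rng)):
        state = rng.choice(simple_mutations)(state, rng)
    return state

def hill_climb(initial, score, simple_mutations, rng, max_tries=10000):
    """Keep the best state seen so far; stop early if its score reaches 1."""
    best, best_score = initial, score(initial)
    for _ in range(max_tries):
        if best_score >= 1:
            break
        candidate = full_mutation(best, simple_mutations, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score

# Toy usage: states are integers, the optimum is 10, mutations step up or down.
up = lambda x, rng: x + 1
down = lambda x, rng: x - 1
best, best_score = hill_climb(0, lambda x: 1 - abs(x - 10) / 100,
                              [up, down], random.Random(0))
print(best, best_score)
```

For trees, the three simple mutations listed above would take the place of {\tt up} and {\tt down}, and $S(T)$ would take the place of the toy score.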

476:

477: \begin{figure}[htb]

478: \begin{center}

479: \epsfig{file=large-graph.eps,width=8cm,angle=270}

480: \end{center}

481: \caption{Progress of the 60-piece experiment over time}\label{figprogress}

482: \end{figure}

483:

484: Note that if a tree is ever found such that $S(T) = 1$, then we can stop

485: because we can be certain that this tree is optimal, as no tree could

486: have a lower cost.  In fact, this perfect tree result is achieved in our

487: artificial tree reconstruction experiment (Section~\ref{sect.artificial})

488: reliably in less than ten minutes.  For real-world data, $S(T)$ reaches

489: a maximum somewhat

490: less than $1$, presumably reflecting inconsistency in the distance matrix

491: data fed as input to the algorithm, or indicating a search space too large

492: to solve exactly.

493: On many typical problems of up to 40 objects this tree-search gives a tree

494: with $S(T) \geq 0.9$ within half an hour.  For large numbers of objects,

495: tree scoring itself can be slow (as this takes order $n^4$ computation steps),

496: and the space of

497: trees is also large, so the algorithm may slow down substantially.

498: For larger experiments, we use a C++/Ruby implementation with MPI (Message

499: Passing Interface, a common standard used on massively parallel computers) on a

500: cluster of workstations in parallel to find trees more rapidly. We can

501: consider the graph of Figure~\ref{figprogress},

502: mapping the achieved $S(T)$ score as a function

503: of the number of trees examined.  Progress

504: occurs typically in a sigmoidal fashion towards a maximal value $\leq 1$.

505:

506: A problem with the outcomes is as follows: For natural

507: data sets we often see

508: some leaf nodes (data items) placed near the center of the tree as singleton

509: leaves attached to internal nodes, without sibling leaf

510: nodes.  This results in a more linear, stretched out, and less

511: balanced, tree. Such trees, even if they represent the underlying

512: distance matrix faithfully, are hard to fully understand

513: and may cause misunderstanding of represented relations and clusters.

514: To counteract this effect, and to bring out the clusters of

515: related items more visibly, we have added a penalty term of

516: the following form: For each internal node with exactly one leaf

517: node attached, the tree's score is reduced by 0.005.  This induces a

518: tendency in the

519: algorithm to avoid producing degenerate mostly-linear trees in the

520: face of data that is somewhat inconsistent, and creates balanced and

521: more illuminating clusters. It should be noted that the penalty term

522: causes the algorithm in some cases to settle for a slightly lower

523: $S(T)$ score than it would have without penalty term. Also the

524: value of the penalty term is heuristically chosen. The largest

525: experiment used 60 items, and we typically had only

526: a couple of orphans causing a penalty of only a few percent.

527: This should be set off against the final $S(T)$ score of above 0.85.
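The penalty term can be sketched as follows, assuming the tree is given as an adjacency dictionary in which leaves have exactly one neighbor. The function names and the five-leaf example tree are illustrative assumptions; only the value 0.005 is taken from the text.

```python
PENALTY = 0.005  # per-node penalty quoted in the text

def orphan_count(tree):
    """Count internal nodes that have exactly one leaf among their neighbors."""
    is_leaf = {node: len(nbs) == 1 for node, nbs in tree.items()}
    return sum(
        1
        for node, nbs in tree.items()
        if not is_leaf[node] and sum(is_leaf[nb] for nb in nbs) == 1
    )

def penalized_score(raw_score, tree):
    """Subtract the penalty once for every internal node with a single leaf."""
    return raw_score - PENALTY * orphan_count(tree)

# Five leaves: c hangs alone from the middle internal node n1.
tree = {"a": ["n0"], "b": ["n0"], "c": ["n1"], "d": ["n2"], "e": ["n2"],
        "n0": ["a", "b", "n1"], "n1": ["n0", "c", "n2"], "n2": ["n1", "d", "e"]}
print(orphan_count(tree), penalized_score(0.9, tree))
```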

528:

529: Another practical matter is the stopping criterion: at which $S(T)$

530: value do we stop? Essentially, we stopped when the $S(T)$

531: value did not change after examining a large number of mutated trees.

532: An example is the progress shown in Figure~\ref{figprogress}.

533:

534:

535:

536: \section{Details of Our Implementation}\label{secdetails}

537:

538: Initially, we downloaded 118 separate MIDI (Musical Instrument Digital

539: Interface, a versatile digital music format

540: available on the world-wide-web)

541: files selected from a range of classical composers, as well as some

542: popular music.   Each of these files was run through a preprocessor

543: to extract just MIDI Note-On

544: and Note-Off events.  These events were then converted to a player-piano

545: style representation, with time quantized in $0.05$ second intervals.

546: All instrument indicators, MIDI Control signals, and tempo variations were

547: ignored.  For each track in the MIDI file, we calculate two quantities:

548: An {\em average volume} and a {\em modal note}.

549: The average volume is calculated by averaging the volume (MIDI Note velocity)

550: of all notes in the track.  The modal note is defined to be the note

551: pitch that sounds most often in that track.  If this is not unique,

552: then the lowest such note is chosen.  The modal note is used as a

553: key-invariant reference point from which to represent all notes.

554: It is denoted by $0$, higher notes are denoted by positive numbers, and

555: lower notes are denoted by negative numbers.  A value of $1$ indicates

556: a half-step above the modal note, and a value of $-2$ indicates

557: a whole-step below the modal note.  The tracks are sorted according to

558: decreasing average volume, and then output in succession.  For each track,

559: we iterate through each time sample in order, outputting a single signed

560: 8-bit value for each currently sounding note.  Two special values are

561: reserved to represent the end of a time step and the end of a track.  This

562: file is then used as input to the compression stage for distance

563: matrix calculation and subsequent tree search.
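The encoding step can be sketched as below. Several details are assumptions of ours, since the text does not fix them: each track is given as a list of hypothetical {\tt (start, end, pitch, velocity)} tuples with times in seconds (the extraction of Note-On and Note-Off events is not shown), the modal note is counted over time steps, the two reserved values are taken to be $-128$ and $127$, and relative pitches are clamped so that they never collide with the reserved values.

```python
from collections import Counter

STEP = 0.05          # seconds per time sample, as in the text
END_OF_STEP = -128   # assumed reserved value; the text does not specify it
END_OF_TRACK = 127   # assumed reserved value; the text does not specify it

def sounding(track):
    """Map each time-step index to the sorted pitches sounding during it."""
    steps = {}
    for start, end, pitch, _velocity in track:
        first = round(start / STEP)
        last = max(first + 1, round(end / STEP))  # every note fills at least one step
        for t in range(first, last):
            steps.setdefault(t, set()).add(pitch)
    return {t: sorted(pitches) for t, pitches in steps.items()}

def modal_note(steps):
    """Pitch sounding in the most time steps; ties go to the lowest pitch."""
    counts = Counter(p for pitches in steps.values() for p in pitches)
    top = max(counts.values())
    return min(p for p, c in counts.items() if c == top)

def average_volume(track):
    return sum(velocity for _, _, _, velocity in track) / len(track)

def encode(tracks):
    """Return the signed 8-bit values for all non-empty tracks, loudest first."""
    out = []
    loudest_first = sorted((t for t in tracks if t), key=average_volume, reverse=True)
    for track in loudest_first:
        steps = sounding(track)
        ref = modal_note(steps)
        for t in range(max(steps) + 1):
            # Pitches are written relative to the modal note of the track.
            out.extend(max(-127, min(126, p - ref)) for p in steps.get(t, []))
            out.append(END_OF_STEP)
        out.append(END_OF_TRACK)
    return out

loud = [(0.0, 0.1, 60, 100), (0.05, 0.1, 64, 100)]
quiet = [(0.0, 0.05, 72, 30)]
encoded = encode([quiet, loud])
print(encoded)
```

The resulting list can be turned into a byte string for the compressor with {\tt bytes(v \& 0xFF for v in encoded)}.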

564:

565: \section{Results}\label{secresults}

566:

567: \subsection{Three controlled experiments}\label{sect.artificial}

568:

569: With the natural data sets of music pieces that we use, one may have the preconception

570: (or prejudice) that music by Bach should be clustered together,

571: music by Chopin should be clustered together, and so should music by

572: rock stars. However, the preprocessed music files of a piece by Bach and

573: a piece by Chopin, or the Beatles, may resemble one another

574: more than two different

575: pieces by Bach---by accident or indeed by design and copying. Thus, natural

576: data sets may have ambiguous, conflicting, or counterintuitive

577: outcomes. In other words, the experiments on actual pieces have

578: the drawback of not having one clear ``correct'' answer that can

579: function as a benchmark for assessing our experimental outcomes.

580: Before describing the experiments we did with MIDI files of actual

581: music, we discuss three experiments that show that our

582: program indeed does what it is supposed to do---at least in

583: artificial situations where we know in advance what the correct answer is.

584: The similarity machine consists of two parts: (i) extracting a distance matrix

585: from the data, and (ii) constructing a tree

586: from the distance matrix using our novel quartet-based heuristic.

587:

588: \begin{figure}[htb]

589: \begin{center}

590: \epsfig{file=arttreereal.eps,width=13cm,height=10cm}

591: \end{center}

592: \caption{The tree that our algorithm reconstructed}\label{figarttreereal}

593: \end{figure}

594:

595: {\bf Testing the quartet-based tree construction:}

596: We first test whether the quartet-based tree construction

597: heuristic is trustworthy:

598: We generated a random ternary tree $T$ with 18 leaves, and derived

599: a distance metric from it by defining the distance between

600: two nodes as follows:

601: Writing $L(a,b)$ for the length of the path from $a$ to $b$, counted as an

602: integer number of edges, let

603: \[d(a,b) = { {L(a,b)+1} \over 18},

604: \]

605:   except when

606: $a = b$, in which case $d(a,b) = 0$.  It is easy to verify that this

607: simple formula always gives a number between 0 and 1, and is monotonic

608: with path length.

609: Given only the $18\times 18$ matrix of these normalized distances,

610: our quartet method exactly reconstructed $T$ represented in

611: Figure~\ref{figarttreereal}, with $S(T)=1$.
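
The claimed range of $d$ can be checked with a short calculation. As a sketch, assume that every internal node of $T$ has exactly three neighbors (an assumption made here for the count, in line with the trees the quartet method produces). A tree with $18$ leaves then has $16$ internal nodes, so a path between two distinct leaves uses at least $2$ and at most $17$ edges:
\[
{2+1 \over 18} = {1 \over 6} \;\le\; d(a,b) = {L(a,b)+1 \over 18} \;\le\; {17+1 \over 18} = 1 .
\]
The lower bound is attained by two leaves attached to the same internal node; the upper bound is attained only if the path passes through all $16$ internal nodes.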

612: %TODO: Rudi, Paul wants the distance matrix included here as well

613:

614: \begin{figure}[htb]

615: \begin{center}

616: \epsfig{file=taggedfiles.eps,width=15cm}

617: \end{center}

618: \caption{Classification of artificial files with repeated 1-kilobyte tags}\label{figtaggedfiles}

619: \end{figure}

620:

621: {\bf Testing the similarity machine on artificial data:}

622: Given that the tree reconstruction method is accurate

623: on clean, consistent data, we tested whether the full procedure

624: works in an acceptable manner when we know what the outcome should

625: be like:

626: \begin{figure}[htb]

627: \begin{center}

628: \epsfig{file=filetypes.eps,width=15cm}

629: \end{center}

630: \caption{Classification of different file types}\label{figfiletypes}

631: \end{figure}

632: We randomly generated 22 separate 1-kilobyte blocks of data where

633: each byte was equally probable and called these {\em tags}.  Each tag

634: was associated with a different lowercase letter of the alphabet.  Next,

635: we generated 80-kilobyte files by starting with a block of purely random

636: bytes and applying one, two, three, or four different tags on it.

637: Applying a tag consists of ten repetitions of picking a random location

638: in the 80-kilobyte file, and overwriting that location with the universally

639: consistent tag that is indicated.  So, for instance, to create the file

640: referred to in the diagram by ``a'', we start with 80 kilobytes of random data,

641: then pick ten places to copy over this random data with the arbitrary

642: 1-kilobyte sequence identified as tag {\em a}.  Similarly, to create file ``ab'',

643: we start with 80 kilobytes of random data, then pick ten places to put

644: copies of tag {\em a}, then pick ten more places to put copies of tag {\em b} (perhaps

645: overwriting some of the {\em a} tags).  Because we never use more than four

646: different tags, and therefore never place more than 40 copies of tags, we

647: can expect that at least half of the data in each file is random and

648: uncorrelated with the rest of the files.  The rest of the file is

649: correlated with other files that also contain tags in common; the more

650: tags in common, the more related the files are.

651: The resulting tree is given in Figure~\ref{figtaggedfiles}; it can be

652: seen that clustering occurs exactly as we would expect.

653: The $S(T)$ score is 0.905.
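
The bound on the random portion of each file follows from a short count. With at most four tags per file, ten copies per tag, and $1$ kilobyte per copy, the overwritten part of an $80$-kilobyte file is at most
\[
4 \times 10 \times 1\ \mbox{kilobyte} = 40\ \mbox{kilobytes} = {1 \over 2} \times 80\ \mbox{kilobytes},
\]
with equality only when no two copies overlap. Overlapping copies shrink the tagged portion further, so the untouched random portion is always at least half of the file.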

654:

655: {\bf Testing the similarity machine on natural data:}

656: We test gross classification of files

657: based on markedly different file types.  Here, we chose several files:

658: \begin{enumerate}

659: \item Four mitochondrial gene sequences, from a black bear, polar bear,

660: fox, and rat.

661: \item Four excerpts from the novel {\em The Zeppelin's Passenger} by

662: E.~Phillips Oppenheim.

663: \item Four MIDI files without further processing; two from Jimi Hendrix and

664: two movements from Debussy's Suite bergamasque.

665: \item Two Linux x86 ELF executables (the {\em cp} and {\em rm} commands).

666: \item Two compiled Java class files.

667: \end{enumerate}

668: As expected, the program correctly classifies each of the different types

669: of files together with like near like. The result is reported

670: in Figure~\ref{figfiletypes} with $S(T)$ equal to 0.984.

671:

672:

673: \subsection{Genres: rock vs.~jazz vs.~classical}

674:

675: Before testing whether our program can see the distinctions

676: between various classical composers, we first

677: show that it can distinguish between three broader musical genres:

678: classical music, rock, and jazz. This should be easier than

679: making distinctions ``within'' classical music.

680: All musical pieces we used are listed in the tables in the appendix.

681: For the genre-experiment we used 12 classical pieces (the small set

682: from Table~\ref{tableclassicalpieces}, consisting of Bach, Chopin, and Debussy),

683: 12 jazz pieces (Table~\ref{tablejazzpieces}), and

684: 12 rock pieces (Table~\ref{tablerockpieces}).

685: The tree that our program came up with is given in Figure~\ref{figgenres}.

686: The $S(T)$ score is 0.858.

687:

688: \begin{figure}[htb]

689: \begin{center}

690: \epsfig{file=genres.eps,width=15cm,height=12cm}

691: \end{center}

692: \caption{Output for the 36 pieces from 3 genres}\label{figgenres}

693: \end{figure}

694:

695: The discrimination between the 3 genres is good but not perfect.

696: The upper branch of the tree contains 10 of the 12 jazz pieces,

697: but also Chopin's Pr\'elude no.~15 and a Bach Prelude.

698: The two other jazz pieces, Miles Davis' ``So what'' and John

699: Coltrane's ``Giant steps'' are placed elsewhere in the tree,

700: perhaps according to some kinship that now escapes us but can be

701: identified by closer study of the objects concerned.

702: Of the rock pieces, 9 are placed close together in the rightmost branch,

703: while Hendrix's ``Voodoo chile'', Rush's ``Yyz'',

704: and Dire Straits' ``Money for nothing'' are further away.

705: In the case of the Hendrix piece this may be explained by the fact

706: that it does not fit well in a specific genre.

707: Most of the classical pieces are in the lower left part of the tree.

708: Surprisingly, 2 of the 4 Bach pieces are placed elsewhere.

709: It is not clear why this happens, and it may be considered an error

710: of our program, since we perceive the 4 Bach pieces to be very close,

711: both structurally and melodically (as they all come from the mono-thematic

712: ``Wohltemperierte Klavier'').

713: However, Bach's music is seminal and has been copied and cannibalized

714: in all kinds of recognizable or hidden manners; closer scrutiny could

715: reveal likenesses in its present company that are not now apparent to us.

716: In effect our similarity engine aims at the ideal of a perfect

717: data mining process, discovering unknown features in which the

718: data can be similar.

719:

720: \subsection{Classical piano music (small set)}

721:

722: \begin{figure}

723: \begin{center}

724: \epsfig{file=small.eps,width=15cm,height=10cm}

725: \end{center}

726: \caption{Output for the 12-piece set}\label{figsmallset}

727: \end{figure}

728:

729: In Table~\ref{tableclassicalpieces} we list all 60 classical piano pieces used,

730: together with their abbreviations. Some of these are complete

731: compositions, others are individual movements from larger compositions.

732: They all are piano pieces, but experiments on 34 movements of symphonies

733: gave very similar results (Section~\ref{secsymphonies}).

734: Apart from running our program on the whole set of 60 piano

735: pieces, we also tried it on two smaller sets: a small 12-piece set,

736: indicated by `(s)' in the table, and a medium-size 32-piece set,

737: indicated by `(s)' or `(m)'.

738:

739: The small set encompasses the 4 movements from Debussy's Suite bergamasque,

740: 4 movements of book 2 of Bach's Wohltemperierte Klavier, and 4 preludes from

741: Chopin's opus~28. As one can see in Figure~\ref{figsmallset},

742: our program does a pretty good job at clustering these pieces.

743: The $S(T)$ score is also high: 0.958.

744: The 4 Debussy movements form one cluster, as do the 4 Bach pieces.

745: The only imperfection in the tree, judged by what one would

746: intuitively expect, is that Chopin's Pr\'elude no.~15 lies a bit closer

747: to Bach than to the other 3 Chopin pieces.

748: This Pr\'elude no.~15, in fact, consistently forms an odd-one-out

749: in our other experiments as well. This is an example of pure data mining,

750: since there is some musical truth

751: to this, as no.~15 is perceived as by far the most eccentric

752: among the 24 Pr\'eludes of Chopin's opus~28.

753:

754: \subsection{Classical piano music (medium set)}

755: \begin{figure}[hbt]

756: \begin{center}

757: \epsfig{file=medium.eps,width=15cm,height=10cm}

758: \end{center}

759: \caption{Output for the 32-piece set}\label{figmediumset}

760: \end{figure}

761:

762: The medium set adds 20 pieces to the small set:

763: 6 additional Bach pieces, 6 additional Chopins, 1 more Debussy piece,

764: and 7 pieces by Haydn. The experimental results are given in

765: Figure~\ref{figmediumset}. The $S(T)$ score is slightly lower than

766: in the small set experiment: 0.895.

767: Again, there is a lot of structure

768: and expected clustering. Most of the Bach pieces are together,

769: as are the four Debussy pieces from the Suite bergamasque.

770: These four should be together because they are movements from the same piece;

771: the fifth Debussy item is somewhat apart since it comes from another piece.

772: Both the Haydn and the Chopin

773: pieces are clustered in little sub-clusters of two or three pieces,

774: but those sub-clusters are scattered throughout the tree instead

775: of being close together in a larger cluster.

776: These small clusters may be an imperfection of the method,

777: or, alternatively, point at musical similarities between the

778: clustered pieces that transcend the similarities induced by

779: the same composer. Indeed, this may point the way for further

780: musicological investigation.

781:

782:

783: \subsection{Classical piano music (large set)}

784:

785:

786: \begin{figure}

787: \begin{center}

788: \epsfig{file=large.eps,width=17cm,height=10cm}

789: \end{center}

790: \caption{Output for the 60-piece set}\label{figlargeset}

791: \end{figure}

792:

793: Figure~\ref{figlargeset} gives the output of a run of our program

794: on the full set of 60 pieces. This adds 10 pieces by Beethoven,

795: 8 by Buxtehude, and 10 by Mozart to the medium set.

797: The results are still far from random, but leave more to

798: be desired than the smaller-scale experiments.

799: Indeed, the $S(T)$ score has dropped further from that of the

800: medium-sized set to 0.844.

801: This may be an artifact of the interplay between the relatively small

802: size, and large number, of the files compared: (i) the distances

803: estimated are less accurate; (ii) the number of quartets

804: with conflicting requirements increases; and (iii) the computation

805: time rises to such an extent that the correctness score of the

806: displayed cluster graph within the set time limit

807: is lower than in the smaller samples.

808: Nonetheless, Bach and Debussy are still reasonably well clustered,

809: but other pieces (notably the Beethoven and Chopin ones)

810: are scattered throughout the tree. Maybe this means

811: that individual music pieces by these composers are more similar

812: to pieces of other composers than they are to each other?

813: The placement of the pieces is closer to intuition on a small level

814: (for example, most pairings of siblings correspond to musical similarity in

815: the sense of the same composer) than on the larger level.

816: This is similar to the phenomenon of little sub-clusters

817: of Haydn or Chopin pieces that we saw in the medium-size experiment.
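
To give a sense of the growth referred to in point (ii) above, the following count is a sketch that assumes the quartet method scores every group of four pieces; $n$ denotes the number of pieces. The numbers of pairwise distances and of quartets are
\[
{n \choose 2} = {n(n-1) \over 2}, \qquad {n \choose 4} = {n(n-1)(n-2)(n-3) \over 24},
\]
which evaluate as follows for the three piano sets:
\begin{center}
\begin{tabular}{|r|r|r|} \hline
$n$ & ${n \choose 2}$ & ${n \choose 4}$ \\ \hline\hline
12 & 66 & 495 \\
32 & 496 & 35960 \\
60 & 1770 & 487635 \\ \hline
\end{tabular}
\end{center}
Going from $12$ to $60$ pieces multiplies the number of quartets by almost a thousand, which leaves far more room for mutually inconsistent quartet preferences.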

818:

819: \subsection{Clustering symphonies}\label{secsymphonies}

820:

821: Finally, we tested whether the method worked for more complicated

822: music, namely 34 symphonic pieces.  We took

823: two Haydn symphonies (no.~95 in one file, and the four movements of~104),

824: three Mozart symphonies (39, 40, 41),

825: three Beethoven symphonies (3, 4, 5),

826: Schubert's Unfinished symphony, and Saint-Sa\"ens's Symphony no.~3.

827: The results are reported in Figure~\ref{figsymphonies},

828: with a quite reasonable $S(T)$ score of 0.860.

829:

830: \begin{figure}

831: \begin{center}

832: \epsfig{file=symphonies.eps,width=14cm,height=10cm}

833: \end{center}

834: \caption{Output for the set of 34 movements of symphonies}\label{figsymphonies}

835: \end{figure}

836:

837:

838:

839: \section{Future Work and Conclusion}

840:

841: Our research raises many questions worth looking into further:

842: \begin{itemize}

843: \item The program can be used as a data mining machine to discover

844: hitherto unknown similarities between music pieces of different

845: composers or indeed different genres. In this manner we can discover

846: plagiarism or indeed honest influences between music pieces and

847: composers. Indeed, it is thinkable that we can use the method

848: to discover seminality of composers, or separate music eras

849: and fads.

850: \item A very interesting application of our program

851: would be to select a plausible composer for a newly

852: discovered piece of music of which the composer is not known.

853: In addition to such a piece, this experiment would

854: require a number of pieces from known composers that

855: are plausible candidates.  We would just run our program

856: on the set of all those pieces, and see where the new

857: piece is placed.  If it lies squarely within a cluster

858: of pieces by composer such-and-such, then that would be

859: a plausible candidate composer for the new piece.

860: \item Each run of our program is different---even on

861: the same set of data---because of our use of randomness for

862: choosing mutations in the quartet method.

863: It would be interesting to investigate more precisely

864: how stable the outcomes are over different such runs.

865: \item At various points in our program, somewhat

866: arbitrary choices were made.

867: Examples are the compression algorithms we use

868: (all practical compression algorithms will fall short

869: of Kolmogorov complexity, but some less so than others);

870: the way we transform the MIDI files (choice of length

871: of time interval, choice of note-representation);

872: the cost function in the quartet method.

873: Other choices are possible and may or may not lead

874: to better clustering.\footnote{We compared the quartet-based

875: approach to the tree reconstruction

876: with alternatives.  One such alternative that we tried is to

877: compute the Minimum Spanning Tree (MST) from the matrix

878: of distances. MST has the advantage of being very efficiently

879: computable, but resulted in trees that were much worse than

880: the quartet method. It appears that the quartet method is extremely sensitive

881: in extracting information even from small differences in the entries

882: of the distance matrix, where other methods would be led to error.}

883: Ideally, one would like to have well-founded theoretical

884: reasons to decide such choices in an optimal way.

885: Lacking those, trial-and-error seems the only way

886: to deal with them.

887: \item The experimental results got decidedly worse when

888: the number of pieces grew.

889: Better compression methods may improve this situation, but the effect

890: is probably due to unknown scaling problems with the quartet

891: method or nonlinear scaling of possible similarities in a larger

892: group of objects (akin to the phenomenon described in the so-called

893: ``birthday paradox'': in a group of about two dozen people there

894: is a better-than-even chance that at least two of the people have the

895: same birthday). Inspection of the underlying distance matrices

896: makes us suspect the latter.

897: \item Our program is not very good at dealing with very

898: small data files (100 bytes or so), because significant

899: compression only kicks in for larger files.

900: We might deal with this by comparing various sets of such

901: pieces against each other, instead of individual ones.

902: \end{itemize}
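
The birthday figure mentioned in the list above can be checked directly. Assuming $365$ equally likely birthdays and independence between people (the usual idealization, not something derived from our data), the probability that at least two of $k$ people share a birthday is
\[
p(k) = 1 - \prod_{i=0}^{k-1} \left(1 - {i \over 365}\right),
\]
which first exceeds one half at $k=23$, where $p(23) \approx 0.507$. The number of pairs, ${k \choose 2} = 253$ for $k=23$, grows quadratically in $k$, which is why coincidental matches become likely sooner than intuition suggests. The same quadratic growth applies to pairs of music pieces.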

903:

904: \subsection*{Acknowledgments}

905: We thank John Tromp for useful discussions.

906:

907: \begin{thebibliography}{99}

908:

909:

910:

911: \bibitem{BCL02a}

912: D.~Benedetto, E.~Caglioti, and V.~Loreto.

913: Language trees and zipping, {\em Physical Review Letters},

914: 88:4(2002) 048702.

915:

916: \bibitem{BCL02b}

917: Ph.~Ball.

918: Algorithm makes tongue tree, {\em Nature}, 22 January,

919: 2002.

920:

921: \bibitem{BGLVZ98}

922: C.H.~Bennett, P.~G\'acs, M.~Li, P.M.B.~Vit\'anyi, and W.~Zurek.

923: Information Distance, {\em IEEE Transactions on Information Theory},

924: 44:4(1998), 1407--1423.

925:

926: \bibitem{Br00}

927: D.~Bryant, V.~Berry, P.~Kearney, M.~Li, T.~Jiang,

928: T.~Wareham and H.~Zhang. A practical algorithm for

929: recovering the best supported edges of an evolutionary tree.

930: {\em Proc. 11th  ACM-SIAM Symposium on Discrete Algorithms},

931: 287--296, 2000.

932:

933: %\bibitem{CF02}

934: %M.~Cooper and J.~Foote.

935: %Automatic music summarization via similarity analysis,

936: %{\em Proc.~IRCAM}, 2002.

937:

938: \bibitem{CVfolk}

939: W.~Chai and B.~Vercoe.

940: Folk music classification using hidden Markov models.

941: {\em Proc.~of International Conference on Artificial Intelligence}, 2001.

942:

943: \bibitem{DTWml}

944: R.~Dannenberg, B.~Thom, and D.~Watson.

945: A machine learning approach to musical style recognition,

946: {\em Proc.~International Computer Music Conference}, pp.~344--347, 1997.

947:

948: \bibitem{GKCwavelet}

949: M.~Grimaldi, A.~Kokaram, and P.~Cunningham.

950: Classifying music by genre using the wavelet packet transform

951: and a round-robin ensemble.

952: Technical report TCD-CS-2002-64, Trinity College Dublin, 2002.

953: http://www.cs.tcd.ie/publications/tech-reports/reports.02/TCD-CS-2002-64.pdf

954:

955: \bibitem{Ji01}

956: T.~Jiang, P.~Kearney, and M.~Li.

957: A Polynomial Time Approximation Scheme for Inferring Evolutionary Trees from

958: Quartet Topologies and its Application.

959: {\em SIAM J. Computing}, 30:6(2001), 1942--1961.

960:

961: \bibitem{LBCKKZ01}

962: M.~Li, J.H.~Badger, X.~Chen, S.~Kwong, P.~Kearney, and H.~Zhang.

963: An information-based sequence distance and its application

964: to whole mitochondrial genome phylogeny,

965: {\em Bioinformatics}, 17:2(2001), 149--154.

966:

967: \bibitem{Li01}

968: M.~Li and P.M.B.~Vit\'anyi.

969: Algorithmic Complexity,

970: pp.~376--382 in: {\em International Encyclopedia

971: of the Social \& Behavioral Sciences},

972: N.J.~Smelser and P.B.~Baltes, Eds., Pergamon, Oxford, 2001/2002.

973:

974: \bibitem{Li03}

975: M.~Li, X.~Chen, X.~Li, B.~Ma, and P.~Vit\'anyi.

976: The similarity metric,

977: {\em Proc. 14th ACM-SIAM Symposium on Discrete Algorithms}, 2003.

978:

979: \bibitem{LiVi97}

980: M.~Li and P.M.B.~Vit\'anyi.

981: {\em An Introduction to Kolmogorov Complexity

982: and its Applications}, Springer-Verlag, New York, 2nd Edition, 1997.

983:

984: \bibitem{Sneural}

985: P.~Scott.

986: Music classification using neural networks, 2001.\\

987: http://www.stanford.edu/class/ee373a/musicclassification.pdf

988:

989: \bibitem{SID}

990: Shared Information Distance or Software Integrity

991: Detection, Computer Science, University of California, Santa Barbara,

992:  http://dna.cs.ucsb.edu/SID/

993:

994: \bibitem{TC02}

995: G.~Tzanetakis and P.~Cook, Music genre classification of audio signals,

996: {\em IEEE Transactions on Speech and Audio Processing},

997: 10(5):293--302, 2002.

998:

999: \end{thebibliography}

1000:

1001:

1002: \appendix

1003:

1004: \section{Appendix: The Music Pieces Used}

1005: \begin{table}[htb]

1006: \begin{center}

1007: \begin{tabular}{|l|l|l|} \hline

1008: Composer & Piece & Acronym\\ \hline\hline

1009: J.S.~Bach (10) & Wohltemperierte Klavier II: Preludes and fugues 1,2 & BachWTK2\{F,P\}\{1,2\} (s)\\

1010:           & Goldberg Variations: Aria, Variations 1,2  & BachGold\{Aria,V1,V2\} (m) \\

1011:           & Kunst der Fuge: Variations 1,2             & BachKdF\{1,2\} (m) \\

1012:           & Invention 1                                & BachInven1 (m) \\

1013: Beethoven (10) & Sonata no.~8 (Path\'etique), 1st movement   & BeetSon8m1 \\

1014:           & Sonata no.~14 (Mondschein), 3 movements    & BeetSon14m\{1,2,3\}\\

1015:           & Sonata no.~21 (Waldstein), 2nd movement    & BeetSon21m2\\

1016:           & Sonata no.~23 (Appassionata)               & BeetSon23\\

1017:           & Sonata no.~26 (Les Adieux)                 & BeetSon26\\

1018:           & Sonata no.~29 (Hammerklavier)              & BeetSon29\\

1019:           & Romance no.~1                              & BeetRomance1\\

1020:           & F\"ur Elise                                & BeetFurElise\\

1021: Buxtehude (8) & Prelude and fugues, BuxWV 139,143,144,163  & BuxtPF\{139,143,144,163\} \\

1022:           & Toccata and fugue, BuxWV 165               & BuxtTF165 \\

1023:           & Fugue, BuxWV 174                           & BuxtFug174\\

1024:           & Passacaglia, BuxWV 161                     & BuxtPassa161\\

1025:           & Canzonetta, BuxWV 168                      & BuxtCanz168\\

1026: Chopin (10) & Pr\'eludes op.~28: 1, 15, 22, 24 & ChopPrel\{1,15,22,24\} (s)\\

1027:           & Etudes op.~10, nos.~1, 2, and 3            & ChopEtu\{1,2,3\} (m)\\

1028:           & Nocturnes nos.~1 and 2                     & ChopNoct\{1,2\} (m)\\

1029:           & Sonata no.~2, 3rd movement                 & ChopSon2m3 (m)\\

1030: Debussy (5) & Suite bergamasque, 4 movements             & DebusBerg\{1,2,3,4\} (s)\\

1031:           & Children's corner suite (Gradus ad Parnassum) & DebusChCorm1 (m)\\

1032: Haydn (7) & Sonatas nos.~27, 28, 37, and 38            & HaydnSon\{27,28,37,38\} (m)\\

1033:           & Sonata no.~40, movements 1,2               & HaydnSon40m\{1,2\} (m)\\

1034:           & Andante and variations                     & HaydnAndaVari (m)\\

1035: Mozart (10) & Sonatas nos.~1,2,3,4,6,19                & MozSon\{1,2,3,4,6,19\} \\

1036:           & Rondo from Sonata no.~16                   & MozSon16Rondo \\

1037:           & Fantasias K397, 475                        & MozFantK\{397,475\} \\

1038:           & Variations ``Ah, vous dirai-je, Maman''    & MozVarsDirais\\ \hline

1039: \end{tabular}

1040: \end{center}

1041: \caption{The 60 classical pieces used

1042: (`m' indicates presence in the medium set, `s' in the small and medium sets)}\label{tableclassicalpieces}

1043: \end{table}

1044:

1045: \begin{table}[htb]

1046: %http://www.fortunecity.com/tinpan/mingus/51/eng.html

1047: \begin{center}

1048: \begin{tabular}{|l|l|} \hline

1049: John Coltrane& Blue trane\\

1050:              & Giant steps\\

1051:              & Lazy bird\\

1052:              & Impressions\\

1053: Miles Davis  & Milestones\\

1054:              & Seven steps to heaven\\

1055:              & Solar\\

1056:              & So what\\

1057: George Gershwin & Summertime\\

1058: Dizzy Gillespie & Night in Tunisia\\

1059: Thelonious Monk & Round midnight\\

1060: Charlie Parker  & Yardbird suite\\ \hline

1061: \end{tabular}

1062: \end{center}

1063: \caption{The 12 jazz pieces used}\label{tablejazzpieces}

1064: \end{table}

1065:

1066: \begin{table}[htb]

1067: %http://www.fortunecity.com/tinpan/mingus/51/eng.html

1068: \begin{center}

1069: \begin{tabular}{|l|l|} \hline

1070: The Beatles  & Eleanor Rigby\\

1071:              & Michelle\\

1072: Eric Clapton & Cocaine\\

1073:              & Layla\\

1074: Dire Straits & Money for nothing\\

1075: Led Zeppelin & Stairway to heaven\\

1076: Metallica    & One\\

1077: Jimi Hendrix & Hey Joe\\

1078:              & Voodoo chile\\

1079: The Police   & Every breath you take\\

1080:              & Message in a bottle\\

1081: Rush         & Yyz\\ \hline

1082: \end{tabular}

1083: \end{center}

1084: \caption{The 12 rock pieces used}\label{tablerockpieces}

1085: \end{table}

1086:

1087:

1088: \end{document}

1089:

1090:

1091: