
1: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

2: %

3: %

4: %  DATA SONIFICATION AND SOUND VISUALIZATION

5: %

6: %  January, 1999

7: %

8: %  Computational Science and Engineering

9: %

10: %  Preprint ANL/MCS-P738-0199

11: %

12: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

13: %%%%%%               Preamble                  %%%%%%

14: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

15: %

16:

17: \documentstyle[psfig,12pt]{article}

18: \pagestyle{plain}

19: \setlength{\textheight}{8.0in}

20: \setlength{\textwidth}{6.0in}

21: \setlength{\evensidemargin}{0.3in}

22: \setlength{\oddsidemargin}{0.3in}

23: \setlength{\topmargin}{0.0in}

24: \setlength{\parskip}{2ex}

25: \setlength{\parindent}{2em}

26: \newcommand{\Yes}{\rule{1.0in}{0.02in}}

27: \newcommand{\Yesm}{\hspace{-2em}\rule{1.3in}{0.02in}}

28: \newcounter{labelflag} \setcounter{labelflag}{0}

29: \newcommand{\labelon}{\setcounter{labelflag}{1}}

30: \newcommand{\Label}[1]{

31:                        \ifnum\thelabelflag=1

32:                           \ifmmode

33:                              \makebox[0in][l]{\qquad\fbox{\rm#1}}

34:                           \else

35:                              \marginpar{\vspace{0.7\baselineskip}

36:                                         \hspace{-1.1\textwidth}

37:                                         \fbox{\rm#1}}

38:                           \fi

39:                        \fi

40:                        \label{#1}

41:                       }

42: % \labelon                            % Keys are printed,

43: \newcommand{\be}{\begin{equation}}

44: \newcommand{\ee}{\end{equation}}

45:

46:

47: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

48:

49: \begin{document}

50:

51: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

52: %%%%%%      Title         %%%%%

53: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

54:

55:

56: \begin{center}

57: {\Large\bf Data Sonification and Sound Visualization}

58: \end{center}

59:

60: \noindent

61: Hans G.\ Kaper \\

62: \hspace*{2em}

63: \textit{

64: Mathematics and Computer Science Division,

65: Argonne National Laboratory

66: } \\

67: Sever Tipei \\

68: \hspace*{2em}

69: \textit{

70: School of Music,

71: University of Illinois

72: } \\

73: Elizabeth Wiebel \\

74: \hspace*{2em}

75: \textit{

76: Mathematics and Computer Science Division,

77: Argonne National Laboratory

78: }

79:

80: \medskip

81:

82: \begin{abstract}

83: This article describes a collaborative project

84: between researchers in the Mathematics and Computer

85: Science Division at Argonne National Laboratory

86: and the Computer Music Project of the University

87: of Illinois at Urbana-Champaign.

88: The project focuses on the use of sound for the

89: exploration and analysis of complex data sets

90: in scientific computing.

91: The article addresses digital sound synthesis

92: in the context of DIASS (Digital Instrument

93: for Additive Sound Synthesis) and sound

94: visualization in a virtual-reality environment

95: by means of M4CAVE.

96: It describes the procedures and preliminary results

97: of some experiments in scientific sonification

98: and sound visualization.

99: \end{abstract}

100:

101: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

102: %%%%%       Body           %%%%%

103: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

104:

105: \medskip

106:

107: \noindent

108: While most computational scientists routinely use

109: visual imaging techniques to explore and analyze

110: large data sets, they tend to be much less familiar

111: with the use of sound.

112: Yet, sound signals carry significant amounts of

113: information and can be used advantageously to

114: increase the bandwidth of the human/computer

115: interface.

116: The project described in this article focuses on

117: scientific sonification---the faithful rendering

118: of scientific data in sounds---and the visualization

119: of sounds in a virtual-reality environment.

120: The project, which grew out of an effort to apply

121: the latest supercomputing technology to

122: the process of music composition (see Box~1),

123: is a collaboration between

124: Argonne National Laboratory (ANL, Mathematics and

125: Computer Science Division) and the University of

126: Illinois at Urbana-Champaign (UIUC, Computer Music

127: Project).

128:

129: Digital sound synthesis is addressed in Section~1;

130: the discussion centers on DIASS

131: (Digital Instrument for Additive Sound Synthesis).

132: Section~2 describes some experiments in

133: scientific sonification.

134: Sound visualization in a virtual-reality (VR)

135: environment is discussed in Section~3;

136: here, the main tool is M4CAVE, a program to

137: visualize sounds from a score file.

138: Section~4 contains some general observations

139: about the project.

140:

141: \section{Digital Sound Synthesis}

142: %

143: Digital sound synthesis is a way to generate

144: a stream of numbers representing the sampled

145: values of an audio waveform.

146: To realize the sounds, one sends these samples

147: through a digital-to-analog converter (DAC),

148: which converts the numbers to a continuously

149: varying voltage that can be amplified and sent to

150: a loudspeaker.

151:

152: One way of viewing the digital sound-synthesis process

153: is to imagine a computer program that calculates

154: the sample values according to a mathematical formula

155: and sends those samples, one after the other, to the DAC.

156: All the calculations are carried out by a program,

157: which can be changed in arbitrary ways by the user.

158: From this point of view, digital synthesis is the same

159: as software synthesis.

160: Software synthesis contrasts with hardware synthesis,

161: where the calculations are carried out in special

162: circuitry.

163: Hardware synthesis has the advantage of high-speed

164: operation but lacks the flexibility of software

165: synthesis.

166: Software synthesis is the technique of choice

167: if one wishes to develop an instrument for

168: data sonification.

169:

170: With software synthesis, one can indeed realize any

171: imaginable sound---provided one has the time

172: to wait for the results.

173: With a sampling rate of 44,100 samples per second,

174: the time available per sample is only about 23 microseconds,

175: too short for real-time synthesis of reasonably complex sounds.

176: For this reason, most of today's synthesis programs

177: generate a sound file, which is then played through a DAC.

178: But data sonification in real time may become feasible

179: on tomorrow's high-performance computing architectures.

180: Our research effort focuses on the development of

181: a flexible and powerful digital instrument for

182: scientific sonification and on finding optimal ways

183: to convey information through the medium of sound.

184:

185: \subsection{DIASS -- A Digital Instrument}

186:

187: Two pieces of software constitute the main tools

188: of the project: DIASS,

189: a Digital Instrument for Additive Sound Synthesis,

190: and M4CAVE,

191: a program for the visualization of sound objects

192: in a multimedia environment.

193: Both are part of a comprehensive

194: {\em Environment for Music Composition},

195: which includes additional software for

196: computer-assisted composition and

197: automatic music notation.

198: Figure~\ref{f-env} gives a schematic overview

199: of the various elements of the {\em Environment\/};

200: C and S mark the data entry points for

201: composition and sonification, respectively.

202: %

203: \begin{figure}[htbp]

204: \hspace*{-0.0in}

205: \centering{\psfig{file=cse-fig1.eps,height=3in,width=3in}}

206: \caption{The

207: {\em Environment for Music Composition}.

208: \label{f-env}

209: }

210: \end{figure}

211: %

212:

213: In this section we describe the workings of DIASS;

214: we will describe M4CAVE after we have discussed

215: our ideas on scientific sonification.

216:

217: \subsubsection{The Instrument}

218: %

219: The DIASS instrument functions as part of

220: the M4C synthesis language developed

221: by Beauchamp and his associates

222: at the University of Illinois~\cite{M4C}.

223: Synthesis languages like M4C are designed

224: around the notion that the user creates

225: an instrument together with a score

226: that references the instrument.

227: The synthesis program reads the instrument, feeds it

228: the data from the score file, and computes the final

229: audio signal, which is then written to a sound file

230: for later playback~\cite{roads}.

231:

232: The M4C synthesis language is embedded in the C language.

233: As part of the current project, the instrument

234: and relevant parts of M4C were redesigned

235: for a distributed-memory environment.

236: The parallel implementation uses the standard

237: MPI message-passing library~\cite{MPI}.

238:

239: Like all additive-synthesis instruments,

240: DIASS creates sounds through a summation

241: of simple sine waves.

242: The basic formula is

243: \[

244:   S (t) = \sum_i P_i (t)

245:   = \sum_i a_i (t) \sin (2 \pi f_i (t) t + \phi_i (t)) .

246: \]

247: The individual sine waves that make up a sound

248: are commonly designated as the ``partials'' of the sound,

249: hence the symbol $P$.

250: The sum extends over all partials that are

251: active at the time $t$;

252: $a_i$ is the amplitude, $f_i$ the frequency,

253: and $\phi_i$ the phase of the $i$th partial.

254: These variables can be modulated periodically or otherwise;

255: the modulations evolve on a slow time scale,

256: typically on the order of the duration of a sound.

257: Phase modulation is barely distinguishable from

258: frequency modulation, particularly in the case

259: of time-varying frequency spectra, and is not

260: implemented in DIASS.
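
The summation formula can be made concrete with a short sketch.
The Python fragment below is an illustration under simplifying assumptions,
not the DIASS code: amplitudes, frequencies, and phases are held constant
over the sound, and the names \verb|Partial| and \verb|synthesize|
are hypothetical.

```python
import math
from dataclasses import dataclass

SAMPLE_RATE = 44100  # samples per second, as quoted in the text


@dataclass
class Partial:
    """One sine-wave partial (hypothetical structure, constant parameters)."""
    amplitude: float    # a_i
    frequency: float    # f_i in Hz
    phase: float = 0.0  # phi_i in radians


def synthesize(partials, duration, sample_rate=SAMPLE_RATE):
    """Return S(t) = sum_i a_i sin(2 pi f_i t + phi_i), sampled at sample_rate."""
    n_samples = int(round(duration * sample_rate))
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        samples.append(sum(
            p.amplitude * math.sin(2.0 * math.pi * p.frequency * t + p.phase)
            for p in partials
        ))
    return samples


# Three partials that are deliberately not harmonically related.
sound = [Partial(0.5, 220.0), Partial(0.3, 331.7), Partial(0.2, 475.2)]
wave = synthesize(sound, duration=0.01)
```

In an actual instrument the amplitude and frequency of each partial
would be functions of time driven by the controls described below;
the sketch only shows the inner summation over partials.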

261:

262: The audible frequencies range roughly

263: from 20 to 20,000~Hz, although in a digital system

264: the highest representable frequency is one-half the sampling frequency

265: (Nyquist criterion).
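
For the sampling rate used in this project the bound is easily worked out.
With $f_s$ denoting the sampling frequency
(a symbol introduced here only for illustration),
\[
  f_{\max} = \frac{f_s}{2} = \frac{44{,}100~\mbox{Hz}}{2} = 22{,}050~\mbox{Hz} ,
\]
so the representable range just covers the audible range quoted above.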

266:

267: The partials in a sound need not be

268: in any harmonic relationship

269: (that is, $f_i$ need not be a multiple of some

270: fundamental frequency $f_0$),

271: nor do they need to share any other property.

272: The definition of a sound is purely operational.

273: What distinguishes one ``sound'' from another

274: is that certain operations are defined

275: at the level of a sound and affect all

276: the partials that make up the sound.

277:

278: The evolution of a partial can be subject

279: to many other controls, besides

280: amplitude and frequency modulation.

281: Moreover, these controls can affect a single partial

282: or all the partials in a sound.

283: For example, reverberation, which represents

284: the combined effects of the size and

285: acoustic characteristics of the hall,

286: affects all the partials in a sound simultaneously,

287: although not necessarily in the same way.

288: Furthermore, if a random element is present,

289: it must be applied at the level of a sound;

290: otherwise, a complex wave is perceived as

291: a collection of independent sine waves,

292: instead of a single sound.

293: Hence, it is important that all partials

294: in a sound access the same random number sequence

295: and that the controls of any partial

296: that changes its allegiance and moves

297: from one sound to another be adjusted accordingly.

298:

299: %

300: \begin{table}[htb]

301: \begin{center}

302: \caption{Static (S) and dynamic (D) control parameters in DIASS.

303:  \label{t-controls} }

304: \vspace*{2ex}

305: \begin{footnotesize}

306: \begin{tabular}{|| l | l | l ||}\hline

307: \multicolumn{1}{||c|}{Level} &

308: \multicolumn{1}{c|}{Description}&

309: \multicolumn{1}{c||}{Control Parameter} \\ \hline

310:

311: Partial & Carrier (sine) wave     & S: Starting time, duration, phase \\

312:         &                         & D: Amplitude, frequency \\

313:         & AM (tremolo) wave       & S: Wave type, phase \\

314:         &                         & D: Amplitude, frequency \\

315:         & FM (vibrato) wave       & S: Wave type, phase \\

316:         &                         & D: Amplitude, frequency \\

317:         & Amplitude transients    & S: Max size \\

318:         &                         & D: Shape \\

319:         & Amplitude transient rate& S: Max rate \\

320:         &                         & D: Rate shape \\

321:         & Frequency transients    & S: Max size \\

322:         &                         & D: Shape \\

323:         & Frequency transient rate& S: Max rate \\

324:         &                         & D: Rate shape \\

325: Sound   & Timbre                  & D: Partial-to-sound relation \\

326:         & Localization            & D: Panning \\

327:         & Reverberation           & S: Duration, decay rate, mix \\

328:         & Hall                    & S: Hall size, reflection coefficient \\ \hline

329: \end{tabular}

330: \end{footnotesize}

331: \end{center}

332: \end{table}

333: %

334: Table~\ref{t-controls} lists the control parameters

335: that can be applied in DIASS.

336: Some, like starting time and duration,

337: do not change for the duration of a sound;

338: they are static and determined by a single value.

339: Others are dynamic;

340: their evolution is controlled by an envelope---a

341: normalized function consisting of

342: linear and exponential segments---and a maximum size.

343: Not all control parameters are totally independent;

344: some occur only in certain combinations, and

345: some are designed to reinforce others.
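
The role of an envelope and a maximum size can be sketched as follows.
The fragment is a minimal illustration, assuming a particular segment layout
and a particular exponential shape; the names \verb|envelope_value| and
\verb|control_value| and the curvature constant are hypothetical
and are not taken from DIASS.

```python
import math


def envelope_value(segments, x):
    """Evaluate a normalized envelope at x in [0, 1].

    segments: list of (x0, y0, x1, y1, kind) with kind "lin" or "exp";
    this layout and the curvature constant are assumptions of the sketch.
    """
    CURVATURE = 4.0  # hypothetical steepness of exponential segments
    for x0, y0, x1, y1, kind in segments:
        if x0 <= x <= x1:
            u = (x - x0) / (x1 - x0)  # position inside the segment, 0..1
            if kind == "exp":
                # Warp u with an exponential curve that still maps 0->0, 1->1.
                u = math.expm1(CURVATURE * u) / math.expm1(CURVATURE)
            return y0 + (y1 - y0) * u
    raise ValueError("x lies outside the envelope")


def control_value(segments, max_size, t, start, duration):
    """Dynamic control = maximum size times the normalized envelope."""
    return max_size * envelope_value(segments, (t - start) / duration)


# Linear rise followed by an exponentially shaped fall, normalized to [0, 1].
env = [(0.0, 0.0, 0.1, 1.0, "lin"), (0.1, 1.0, 1.0, 0.0, "exp")]
peak = control_value(env, max_size=0.8, t=0.2, start=0.0, duration=2.0)
```

Separating the normalized shape from the maximum size means that
the same envelope can be reused for controls of very different magnitude.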

346:

347: The control parameters give DIASS its flexibility

348: and make it an instrument suitable for data sonification.

349: On the other hand, the fact that the control parameters

350: act at the level of a partial as well as at the level

351: of a sound (or even at the level of a collection of sounds)

352: significantly increases its computational complexity.

353:

354: \subsubsection{The Score}

355: %

356: Input for DIASS consists of a raw score file

357: detailing the controls.

358: The raw score file is transformed

359: into a score file for the instrument---a

360: collection of ``Instrument cards'' (I-cards),

361: one for each partial, which are fed

362: to the instrument by M4C.

363: The transformation is accomplished

364: in a number of steps.

365:

366: Among the controls are certain global operations

367: (``macros''), which are defined at the level of a sound.

368: In a first pass, these global controls are expanded

369: into controls for the individual partials.

370: The next step consists of the application of

371: the loudness routines.

372: These routines operate at the sound level and ensure

373: that the sounds have the desired loudness.

374: The final step consists of the application of

375: the anticlip routines.

376: For various reasons, historical as well as technical,

377: sound samples are stored as 16-bit integers.

378: The anticlip routines guarantee that none of

379: the sample values produced by the instrument

380: from the score file exceeds the 16-bit range.

381: Because loudness and anticlip play a significant role

382: in sonification, we discuss the issues in more detail.

383:

384: \paragraph{Loudness.}

385: The perception of loudness is a subjective experience.

386: Although the perceived loudness of a sound is related

387: to the amplitudes of its constituent partials,

388: the relation is nonlinear and depends on

389: the frequencies of the partials.

390: At the most elementary level,

391: pure sinusoidal waves of low or high frequencies

392: require a higher energy flow and therefore a larger

393: amplitude to achieve the same loudness level

394: as similar waves at mid-range frequencies.

395: When waves of different frequencies

396: are superimposed to form a sound,

397: the situation becomes still more complicated.

398: The sum of two tones of the same frequency

399: produced by two identical instruments

400: played simultaneously is not perceived as twice

401: as loud as the tone produced by a single instrument.

402:

403: An algorithm for data sonification must reflect

404: these subjective experiences.

405: For example, when we sonify two degrees of freedom,

406: mapping one ($x_1$, say) to amplitude and the other

407: ($x_2$, say) to frequency, then we should perceive

408: equal loudness levels when $x_1$ has the same value,

409: irrespective of the values of $x_2$.

410: Also, when the variable $x_1$ increases or decreases,

411: we should perceive a proportional increase

412: or decrease in the loudness level.

413:

414: The loudness routines in DIASS incorporate

415: the relevant results of psychoacoustic

416: research~\cite{montreal}

417: and give the user full control over the perceived

418: loudness of a sound.

419: They also scale each partial so each sample value

420: fits in a 16-bit register (see Box~2).

421:

422: \paragraph{Anticlip.}

423: When several sounds coexist and their waveforms

424: are added, sample values may exceed 16 bits (overflow),

425: even when the individual waveforms stay

426: within the 16-bit limit.

427: Overflow gives rise to ``clipping''---a popping

428: noise---when the sound file is played.

429: The anticlip routines in DIASS check

430: the score for potential overflow

431: and rescale the sounds as necessary,

432: while preserving the ratio of perceived loudness levels.

433: Thus it is possible to produce an entire sound file

434: in a single run from the score file, even when

435: the sounds cover a wide dynamic range.
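
The overflow problem itself can be illustrated with a few lines of code.
The sketch below is a deliberate simplification and not the DIASS algorithm:
it works on computed samples rather than on the score, and it applies
one common gain, which preserves amplitude ratios but not necessarily
perceived loudness ratios. All function names are hypothetical.

```python
INT16_MAX = 32767  # largest magnitude kept for a signed 16-bit sample


def mix(waveforms):
    """Add equally long waveforms sample by sample."""
    return [sum(column) for column in zip(*waveforms)]


def anticlip_gain(waveforms, limit=INT16_MAX):
    """Return one gain <= 1 that keeps the mixed signal within the limit."""
    peak = max(abs(s) for s in mix(waveforms))
    return 1.0 if peak <= limit else limit / peak


def mix_without_clipping(waveforms, limit=INT16_MAX):
    """Apply the common gain to the mix and round to integer samples."""
    gain = anticlip_gain(waveforms, limit)
    return [int(round(gain * s)) for s in mix(waveforms)]


# Two sounds that fit individually but overflow when added.
loud_a = [20000, -25000, 30000]
loud_b = [15000, -10000, 10000]
mixed = mix_without_clipping([loud_a, loud_b])
```

Here the unscaled sum peaks at 40,000, beyond the 16-bit limit,
and the common gain brings the peak back to 32,767.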

436:

437: To appreciate the difficulty inherent in the scaling processes,

438: consider the case of a sound cluster consisting of

439: numerous complex sounds, all very loud and resulting in clipping,

440: followed by a barely audible sound with only two or three partials.

441: If the cluster's amplitude is brought down to fit the register

442: capacity, and that of the soft tiny sound following it

443: is scaled proportionally,

444: the latter disappears under system noise.

445: On the other hand, if only the loud cluster is scaled,

446: the relationship between the two sound events

447: is completely distorted.

448: In the past, individual sounds

449: or groups of sounds were often generated separately

450: and then merged with the help of analog equipment

451: or an additional digital mixer.

452: The loudness and anticlip routines in DIASS

453: deal with this problem by adjusting both loud and

454: soft sounds so their perceived loudness matches

455: the desired relationship specified by the user,

456: and no clipping occurs (see Box~3).

457:

458: \subsubsection{The Editor}

459: %

460: Features like the loudness routines make DIASS

461: a fine-tuned, flexible, and precise instrument

462: suitable for data sonification.

463: Of course, they require the specification of

464: significant amounts of input data.

465: The editor in DIASS is designed to facilitate

466: this process.

467: It comes in a ``slow'' and a ``fast'' version.

468:

469: In the slow version, data are entered

470: one at a time, either in response to questions

471: from a menu or through a graphical user interface (GUI).

472: The process gives the user the opportunity

473: to build sounds step by step, experiment, and fine-tune

474: the instrument.

475: It is suitable for sound composition and for designing

476: prototype experiments in sonification.

477: The fast version uses the same code but reads

478: the responses to the menu questions from a script.

479: This version is used for sonification experiments.

480:

481: \subsubsection{Computing Requirements}

482: %

483: The sound synthesis software embodied in DIASS

484: is computationally intensive (see Box~4).

485: The instrument proper,

486: the engine that computes the samples,

487: has been implemented in a workstation environment

488: and on the IBM Scalable POWERparallel (SP) system.

489: Parallelism is implemented at the sound level

490: to minimize communication among the processors

491: and enable all partials of a sound to access

492: the same random number sequence.

493: In parallel mode, at least four processors are

494: used---one to distribute the tasks and

495: supervise the entire run (the ``master'' processor),

496: a second to mix the results (the ``mixer''),

497: and at least two ``slave'' nodes to compute

498: the samples one sound at a time.

499: Sounds are computed in their starting-time order,

500: irrespective of their duration or complexity.

501: (A smart load-balancing algorithm would take into account

502: the duration of the various sounds and the

503: number of their partials.)
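
One possible form of such a load-balancing scheme is sketched below.
It is not part of DIASS, which processes sounds in starting-time order;
it is the standard greedy ``longest processing time first'' heuristic,
and the cost model (duration times number of partials)
as well as the name \verb|balance| are assumptions made for illustration.

```python
import heapq


def balance(sounds, n_workers):
    """Assign sounds to workers: most expensive first, least-loaded worker next.

    sounds: list of (name, duration_seconds, n_partials).
    The cost model duration * n_partials is an assumption of this sketch.
    """
    heap = [(0.0, w) for w in range(n_workers)]  # (accumulated cost, worker)
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    by_cost = sorted(sounds, key=lambda s: s[1] * s[2], reverse=True)
    for name, duration, n_partials in by_cost:
        load, worker = heapq.heappop(heap)       # currently least-loaded worker
        assignment[worker].append(name)
        heapq.heappush(heap, (load + duration * n_partials, worker))
    return assignment


sounds = [("a", 10.0, 40), ("b", 2.0, 5), ("c", 8.0, 30), ("d", 1.0, 3)]
plan = balance(sounds, n_workers=2)
```

In this small example the single expensive sound occupies one worker
while the three cheaper sounds share the other.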

504:

505: Performance depends greatly on the complexity

506: of the sounds---that is, on the number of partials per sound

507: and the number of active controls for each partial.

508: Typically, the time to generate a two-channel sound file

509: for a 2'26" musical composition

510: with 236 sounds and 4939 partials

511: ranges from almost two hours on four processors

512: to about 10 minutes on 34 processors of the SP.

513: Figure~\ref{f-speedup} gives some indication of

514: the speedups one observes in a multiprocessing

515: environment.

516: The three graphs correspond to three variants

517: of the same 2'26" piece with different complexity.

518: The time $T_p$ refers to a computation

519: on $p+2$ processors ($p$ ``slaves'');

520: all times are approximate, as they were

521: extracted from data given by LoadLeveler,

522: a not very sophisticated timing instrument

523: for the SP.

524: Speedup is measured relative to the performance

525: on four processors (two compute nodes).

526: One observes the typical linear speedup

527: until saturation sets in.

528: The more complex the piece (the more partials),

529: the later saturation sets in.
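
In symbols, with $S_p$ denoting the speedup on $p$ compute nodes
(a notation introduced here for illustration),
the definition above reads
\[
  S_p = \frac{T_2}{T_p} ,
\]
so ideal linear scaling corresponds to $S_p = p/2$.
Taking the timings quoted above at face value
(roughly 120 minutes on four processors, $p = 2$,
and roughly 10 minutes on 34 processors, $p = 32$),
one finds
\[
  S_{32} \approx \frac{120}{10} = 12 ,
\]
compared with the ideal value $32/2 = 16$.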

530: %

531: \begin{center}

532: \begin{figure}[htbp]

533: \hspace*{-0.2in}

534: \centering{\psfig{file=cse-fig2.ps,width=3.0in}}

535: \caption{Timing results for DIASS on an IBM SP.

536: \label{f-speedup}

537: }

538: \end{figure}

539: \end{center}

540: %

541: \vspace*{-0.3in}

542:

543: With a sampling rate of 44,100 samples per second

544: and two-channel output, a sound file occupies

545: 176~KB per second of sound,

546: so the sound file for the 2'26"~musical composition

547: takes close to 25.8~MB of memory.
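
These figures follow directly from the storage format,
assuming two bytes (16 bits) per sample per channel
and 1~KB = 1000 bytes:
\[
  44{,}100 \times 2~\mbox{channels} \times 2~\mbox{bytes}
  = 176{,}400~\mbox{bytes per second} \approx 176~\mbox{KB per second} ,
\]
and, for a piece lasting 2'26" = 146~seconds,
\[
  146 \times 176{,}400~\mbox{bytes} = 25{,}754{,}400~\mbox{bytes}
  \approx 25.8~\mbox{MB} .
\]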

548:

549: \section{Data Sonification}

550: %

551: Sonification is the faithful rendition of data in sounds.

552: When the data come from scientific experiments---actual

553: physical experiments or computational experiments---

554: we speak of ``scientific sonification.''

555: Scientific sonification is therefore the analog

556: of scientific visualization,

557: where we deal with aural

558: instead of visual images.

559: Because sounds can convey significant amounts

560: of information, sonification has the potential

561: to increase the bandwidth of the human/computer

562: interface.

563: Yet, its use in scientific computing has received

564: limited attention.

565: One reason is, of course, that our sense of vision

566: seems much more dominant than our sense of hearing.

567: Another important reason is the lack of

568: a suitable instrument for scientific sonification.

569: One of the goals of our project is to demonstrate

570: that, with an instrument like DIASS, one can probe

571: multidimensional datasets with surgical precision

572: and uncover structures that may be hidden to the eye.

573:

574: \subsection{Past Experiments}

575: %

576: An early experiment with scientific sonification

577: was done by Yeung~\cite{yeung}.

578: Seven chemical variables were matched with

579: seven variables of sound:

580: two with frequency, one each with loudness,

581: decay, direction, duration, and rest (silence

582: between sounds).

583: His test subjects (professional chemists) were

584: able to understand the different patterns

585: of sound representations and correctly classify

586: the chemicals with a 90\% accuracy rate before and

587: a 98\% accuracy rate after training.

588: His experiment showed that motivated expert users

589: can easily adapt to complex auditory displays.

590:

591: Recently, a successful application of

592: scientific sonification was reported in

593: physics by Pereverzev et al.~\cite{nature}.

594: The authors were able to detect quantum oscillations

595: between two weakly coupled reservoirs

596: of superfluid ${}^3$He using sound,

597: where oscilloscope traces failed

598: to reveal structure.

599:

600: Several other experiments reported in the literature

601: refer to situations where sounds are used in

602: combination with visual images for data analysis.

603: Bly~\cite{bly} ran discriminant analysis experiments

604: using sound and graphics to represent multivariate,

605: time-varying, and logarithmic data.

606: Mezrich et al.~\cite{mezrich} used sound and

607: dynamic graphics to represent

608: multivariable time series data.

609: The ``Exvis'' experiment at the

610: University of Massachusetts at Lowell~\cite{smith}

611: expanded this work by assigning sonic attributes

612: to visual icons.

613: The importance of sound localization is recognized

614: by ongoing work at NASA-Ames~\cite{wenzel}.

615: The evaluation of auditory display techniques

616: is reported extensively at the annual conferences of ICAD,

617: the International Conference on Auditory Display;

618: see~\cite{kramer}.

619: Sound as a component of the human/computer interface

620: is discussed in~\cite{buxton}.

621:

622: Most of the attempts described above used MIDI-controlled

623: synthesizer sounds, which have drastic limitations

624: in the number and range of their control parameters.

625: Bargar et al.~\cite{bargar} at the National Center

626: for Supercomputing Applications (NCSA)

627: have developed a complex instrument

628: with interactive capabilities,

629: which includes the VSS sound server

630: for the CAVE virtual-reality environment.

631:

632: \subsection{What We Have Done So Far}

633: %

634: Much of our work so far has been focused on

635: the development of DIASS~\cite{ICMC92,ICMC95}.

636: In addition, we have used DIASS for two preliminary

637: experiments in scientific sonification, one in chemistry,

638: the other in materials science.

639:

640: The first experiment used data from Dr.~Jeff Tilson,

641: a computational chemist at ANL,

642: who studied the binding of a carbon atom

643: to a protonated thiophene molecule.

644: The data represented the difference in

645: the energy levels before and after the binding

646: at $128\times128\times128$ mesh points

647: of a regular computational grid in space.

648: Because the data were static,

649: we arbitrarily identified time with

650: one of the spatial coordinates

651: and sonified data in planes parallel to this axis.

652: The time to traverse a plane over its full length

653: was usually kept at 30 seconds.

654: In a typical experiment, we assigned a sound to

655: every other point in the vertical direction,

656: distributing the frequencies regularly over

657: a specified frequency range, and used the data in the

658: horizontal direction to generate amplitude envelopes

659: for each of the sounds.

660: Thus, a sound would become louder or softer

661: as the data increased or decreased, and

662: the evolution of the loudness distribution

663: within the ensemble of 64 sounds was an indicator

664: of the spatial distribution of the energy difference

665: before and after the reaction.

666: The sound parameters chosen for the representation

667: of the data varied from one experiment to another.
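
The mapping used in a typical run can be sketched as follows.
The fragment is a hedged illustration rather than the code we used:
the linear spacing of the frequencies, the use of the magnitude of the data
as amplitude, the normalization by the largest magnitude,
and the name \verb|plane_to_sounds| are all assumptions of the sketch.

```python
def plane_to_sounds(plane, f_low, f_high, traverse_time=30.0):
    """Map a 2-D data plane to a list of sounds.

    plane: list of rows (vertical direction); each row holds the values
    along the horizontal direction, which plays the role of time.
    Every other row becomes one sound with a fixed frequency; its data
    become an amplitude envelope of (time, normalized amplitude) pairs.
    """
    rows = plane[::2]  # every other point in the vertical direction
    step = (f_high - f_low) / (len(rows) - 1) if len(rows) > 1 else 0.0
    scale = max(abs(v) for row in rows for v in row) or 1.0
    sounds = []
    for k, row in enumerate(rows):
        dt = traverse_time / (len(row) - 1)  # time between data points
        envelope = [(j * dt, abs(v) / scale) for j, v in enumerate(row)]
        sounds.append({"frequency": f_low + k * step, "envelope": envelope})
    return sounds


# A tiny 4 x 3 stand-in for one 128 x 128 plane of the chemistry data.
plane = [[0.0, 1.0, 2.0], [9.0, 9.0, 9.0], [4.0, 2.0, 0.0], [9.0, 9.0, 9.0]]
sounds = plane_to_sounds(plane, f_low=200.0, f_high=400.0)
```

Applied to a plane with 128 rows, the same selection of every other row
yields the ensemble of 64 sounds mentioned above.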

668:

669: The second experiment involved data from

670: a numerical simulation in materials science.

671: The scientists were interested in patterns of motion of

672: magnetic flux vortices through a superconducting medium.

673: The medium was represented by $384\times256$ mesh points

674: in a rectangular domain.

675: As the vortices are driven across the domain,

676: from left to right, by an external force,

677: they repel each other but are attracted by

678: regularly or randomly distributed defects

679: in the material.

680: In this experiment,

681: frequency and frequency modulation (vibrato)

682: were used to represent movement in the plane,

683: and changes in loudness were connected to

684: changes in the speed of a vortex.

685: A traveling window of constant width

686: was used to capture the motion of a number

687: of vortices simultaneously.

688:

689: These investigations are ongoing,

690: and the results have not been subjected

691: to rigorous statistical evaluation.

692: They have merely served to demonstrate

693: the capabilities of DIASS and

694: explore various mappings from

695: the degrees of freedom in the data to

696: the parameters controlling the sound synthesis process.

697: Samples can be heard on the Web~\cite{web-sonification}.

698:

699: \subsection{What We Have Found So Far}

700: %

701: General conclusions are that

702: (i) the sounds produced in each experiment

703: conveyed information about

704: the qualitative nature of the data,

705: and (ii) DIASS is a flexible

706: and sophisticated tool capable of

707: rendering subtle variations in the data.

708:

709: Changes in some control variables,

710: such as time, frequency, and amplitude,

711: are immediately recognizable.

712: Changes in the combination of partials

713: in a sound, identifiable through its timbre,

714: can be recognized with some practice.

715: Some effects are enhanced by modifiers

716: like reverberation,

717: amplitude modulation (tremolo), and

718: frequency modulation (vibrato).

719: In some instances, a modifier may lump two,

720: three, or more degrees of freedom together,

721: like hall size, duration, and acoustic properties

722: in the case of reverberation.

723: Through the proper manipulation of reverberation,

724: loudness, and spectrum, one can create

725: the illusion of sounds being produced

726: at arbitrary locations in a room,

727: even with only two speakers.

728:

729: Like the eye,

730: the ear has a very high power of discrimination.

731: Even a coarse grid,

732: such as the tempered tuning used in Western music,

733: includes about 100 identifiable discrete steps over the

734: frequency range encompassed by a piano keyboard.

735: Contemporary music, as well as some non-Western

736: traditional music, successfully uses smaller increments

737: of a quarter tone or less for a total of some 200 or more

738: identifiable steps in the audible range.

739: Equal discriminating power is available

740: in the realm of timbre.
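
The step counts can be checked against the equal-tempered scale,
in which adjacent semitones differ in frequency by a factor $2^{1/12}$.
A standard piano keyboard has 88 keys and hence 87 semitone steps,
in line with the figure of about 100.
Over the audible range of 20 to 20,000~Hz,
the number of semitone steps is
\[
  12 \log_2 \frac{20{,}000}{20} = 12 \log_2 1000 \approx 120 ,
\]
and quarter-tone steps (a factor $2^{1/24}$) double this count
to about 240, consistent with the estimate of 200 or more.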

741:

742: Sound is an obvious means to identify regularities

743: in the time domain, both at the microlevel

744: and on a larger scale,

745: and to bring out transitions between random states

746: and periodic happenings.

747: Most auditory processes are based on the

748: recognition of time patterns

749: (periodic repetitions giving birth to pitch,

750: amplitude, or frequency modulation;

751: spectral consistency creating stable timbres

752: in a complex sound; etc.),

753: and the ear is highly attuned to detect

754: such regularities.

755:

756: Most conceptual problems in scientific sonification

757: are related to finding suitable mappings between

758: the space of data and the space of sounds.

759: Common sense points toward letting the two domains

760: share the coordinates of physical space-time if

761: these are relevant and translating

762: other degrees of freedom in the data

763: into separate sound parameters.

764: On the other hand, it may be advantageous

765: to experiment with alternative mappings.

766: Sonification software must be sufficiently flexible

767: that a user can pair different sets of parameters

768: in the two domains.

769:

770: Any mapping between data and sound parameters

771: must allow for redundancies to enable

772: the exploration of data at different levels

773: of complexity.

774: Similar to visualization software,

775: sonification software must have utilities

776: for zooming, modifying the audio palette,

777: switching between visual and aural representation

778: of parameters, defining time loops,

779: slowing down or speeding up, and so forth.

780:

781: Our experiments also showed that DIASS,

782: at least in its present form, has its limitations.

783: One limitation concerns the sheer volume of data

784: in scientific sonification.

785: While the composition of a musical piece

786: (the original intent behind DIASS)

787: typically entails the handling of

788: a few thousand sounds,

789: each with a dozen or so partials,

790: the number of data points in the

791: computational chemistry experiment

792: ran into the millions,

793: a difference of several orders of magnitude.

794: By the same token, while a typical amplitude envelope

795: for a partial or sound in a musical composition

796: involves ten or even fewer segments,

797: both experiments required envelopes with

798: well over 100 such segments.

799: Another difficulty encountered was the fact

800: that both experiments required sounds

801: to be accurately located in space.

802: While panning is very effective in pinpointing the source

803: on a horizontal line, suggesting the height

804: of a sound is a major challenge.

805: We hope that additions to the software

806: as well as a contemplated eight-speaker system

807: will help us get closer to a realistic

808: three-dimensional representation of sounds.

809: Finally, to become an effective tool for

810: sonification, DIASS must operate in real time.

811: All three concerns are being addressed

812: in the new C++ version of DIASS currently

813: under development.

814:

815: \section{Sound Visualization in a VR Environment}

816: %

817: The notion of sound visualization may at first sight

818: seem incongruous in the context of data sonification.

819: However, as has been recognized by several researchers,

820: the structure of a sound is difficult to detect

821: without proper training,

822: and any means of aiding the detection process

823: will enhance the value of data sonification.

824: Visualizing sounds is one of these means.

825: In this project we are focusing on

826: the visualization of sounds in the CAVE,

827: a room-size virtual-reality (VR) environment~\cite{cave},

828: and on the ImmersaDesk, a two-dimensional version.

829:

830: \subsection{M4CAVE -- A Visualization Tool}

831: %

832: The software collectively known as M4CAVE

833: takes a score file from the sound synthesis program DIASS

834: and renders the sounds represented by the score

835: as visual images in a CAVE or ImmersaDesk.

836: The images are computed on the fly and are made

837: to correspond exactly to the sounds one hears through

838: a one-to-one mapping between control parameters

839: and visual attributes.

840: The code, which is written in C++,

841: uses OpenGL for visualizing objects.

842:

843: \subsubsection{Graphical Representations}

844: %

845: Currently, M4CAVE can represent sounds

846: as a collection of spheres (or cubes or polyhedra),

847: as a cloud of confetti-like particles,

848: or as  a collection of planes.

849:

850: The spheres representation is the most developed

851: and incorporates more parameters of a sound

852: into the visualization than either of the others.

853: Sounds are visualized as stacks of spheres,

854: each sphere corresponding to a partial in the sound.

855: The position of a sphere along the vertical axis

856: is determined by the frequency of the partial,

857: and its size is proportional to the amplitude.

858: A sound's position in the stereo field

859: determines the placement of the spheres

860: in the room.

861: The visual objects rotate or pulse

862: when tremolo or vibrato is applied,

863: and their color varies when

864: reverberation is present.

865: An optional grid in the background

866: shows the octaves divided into

867: twelve equal increments.

868: Figure~\ref{f-cave}---taken from

869: our Web site~\cite{web-sonification},

870: where more samples can be found---shows

871: a visualization of nine sounds

872: with different numbers of partials.

873: %

874: \begin{center}

875: \begin{figure}[htb]

876: \hspace*{0.0in}

877: \centering{\psfig{file=cse-fig3.ps,width=3.0in}}

878: \caption{Visualization of nine sounds.

879: (Picture taken from a CAVE simulator.)

880: \label{f-cave}

881: }

882: \end{figure}

883: \end{center}

884: %

885:

886: \vspace*{-0.5in}

887: The plane and cloud representations were designed

888: more on the basis of artistic considerations.

889: (Remember that the purpose of the visualization

890: is to aid the perception of sounds.)

891: The strength of the cloud representation is

892: in showing tremolo and vibrato in the sound.

893: The planes representation is unique

894: in that it limits the visualization to only

895: one partial (usually the fundamental) of each sound.

896: The various representations can be combined,

897: and the mappings chosen for each representation

898: can be varied by means of a menu.

899:

900: \subsection{Preliminary Findings}

901: %

902: We have used M4CAVE to explore various mappings

903: from the sound domain to the visual domain.

904: Besides the obvious short score files to test

905: the implementation of these mappings,

906: we have used score files generated with DIASS

907: of various musical compositions, notably the

908: ``A.N.L.-folds'' of Tipei~\cite{folds}.

909: A.N.L.-folds is an example of a

910: {\em manifold composition},

911: described in Box~1.

912: Each member of A.N.L.-folds lasts exactly 2'26''

913: and comprises between 200 and 500 sounds of

914: medium to great complexity.

915: The picture of Fig.~\ref{f-cave} was taken

916: from a run of one of these A.N.L.-folds.

917:

918: The combination of visual images and sounds

919: indeed provides an extremely powerful tool

920: for uncovering complicated structures.

921: Sometimes, the sounds reveal features

922: that are hidden to the eye;

923: at other times, the visual images illuminate

924: features that are not easily detectable in the sound.

925: The two modes of perception reinforce each other,

926: and both improve with practice.

927:

928: \section{Larger Issues}

929: %

930: This project is unusual in several respects.

931: It is somewhat speculative, in the sense that

932: we don't have much experience with the use of

933: sound in scientific computing.

934: This is the main reason why the involvement of

935: someone expert in the intricacies of the sound world

936: is critical for its success.

937: In our case, the expertise comes from the realm

938: of music composition.

939:

940: When do we declare ``success''?

941: Can we reasonably expect that sonification

942: will evolve to the same level of usefulness

943: as visualization for computational science?

944: The answers to these questions depend

945: on one's expectations.

946: Ours is a visually oriented culture

947: {\em par excellence}, and as a society

948: we watch rather than listen.

949: Contemporary musical culture is often reduced

950: to entertainment genres that use a simple-minded

951: vocabulary---no small impediment to discovering

952: the potential benefits of the world of sound.

953: But given unusual and unexpected sonorities,

954: we may yet discover that

955: we have not lost the ability to listen.

956:

957: When we engage in this type of research,

958: it is easy to get swept up by unreasonable

959: expectations, looking for the ``killer application.''

960: But the killer application is a phantom,

961: not worth pursuing.

962: What we can offer is a systematic investigation

963: of the potential of a new tool.

964: If it helps us understand some computational data sets

965: a little better, or if it enables us

966: to explore these data sets more easily and in more detail,

967: we have good reason to claim success.

968: If the project adds to our understanding

969: of aural semiotics, we have even more reason

970: to claim success.

971: And if none of these successes materializes,

972: we can still claim that the people involved,

973: both scientists and musicians,

974: gained by becoming more familiar with

975: each other's work and ways of thinking.

976: Such a rapprochement has, in fact, already

977: occurred and led to a new ``Discovery'' course

978: entitled

979: {\em Music, Science, and Technology}

980: at UIUC, where some of the issues presented here

981: are being discussed in a formal educational context.

982:

983: \section*{Acknowledgments}

984: %

985: This work was partially supported by the

986: Mathematical, Information, and Computational Sciences Division

987: subprogram of the Office of Computational and Technology Research,

988: U.S. Department of Energy, under Contract W-31-109-Eng-38.

989:

990: \begin{thebibliography}{99}

991:

992: \bibitem{buxton}

993: Baecker, R.~M., J.~Grudin, W.~Buxton, and S.~Greenberg,

994: \textsl{Readings in Human-Computer Interaction:

995: Toward the Year 2000},

996: second edition, Morgan Kaufmann Publ., Inc., San Francisco, 1995

997:

998: \bibitem{bargar}

999: Bargar, R., I.~Choi, S.~Das, and C.~Goudeseune,

1000: ``Model-based interactive sound for an immersive virtual environment,''

1001: \textsl{Proc.\ 1994 Int'l.\ Computer Music Conference}

1002: (Tokyo, Japan), pp.\ 471--477.

1003:

1004: \bibitem{M4C}

1005: Beauchamp, J.,

1006: \textsl{Music 4C Introduction},

1007: Computer Music Project, School of Music,

1008: University of Illinois at Urbana-Champaign, 1993.

1009: URL: http://cmp-rs.music.uiuc.edu/cmp/software/m4c.html

1010:

1011: \bibitem{bly}

1012: Bly, S.,

1013: \textsl{Sound and Computer Information Presentation},

1014: Ph.D.\ thesis, University of California -- Davis, 1982

1015: (unpublished)

1016:

1017: \bibitem{fletcher-munson}

1018: Fletcher, H.\ and W.~A.~Munson,

1019: ``Loudness, its definition, measurement, and calculation,''

1020: {\em J.~Acoust.\ Soc.\ Am.} {\bf 5} (1933), 82

1021:

1022: \bibitem{MPI}

1023: Gropp, W., E.~Lusk, and A.~Skjellum,

1024: \textsl{Using MPI: Portable Parallel Programming with the

1025: Message-Passing Interface},

1026: MIT Press, 1994.

1027: See also URL:

1028: http://www.mcs.anl.gov/mpi/index.html

1029:

1030: \bibitem{hiller-book}

1031: Hiller, L.\ and L.~Isaacson,

1032: \textsl{Experimental Music},

1033: McGraw-Hill, 1959;

1034: reprinted by Greenwood Press, 1983

1035:

1036: \bibitem{hiller-quartet}

1037: Hiller, L.,

1038: \textsl{Computer Music Retrospective},

1039: Compact disc WER 60128-50,

1040: WERGO Schallplatten GmbH,

1041: Mainz, Germany, 1989

1042:

1043: \bibitem{ISO}

1044: International Organization for Standardization (ISO),

1045: ``Acoustics -- Normal equal-loudness level contours,''

1046: Publ.\ No.~226:1987

1047:

1048: \bibitem{ICMC95}

1049: Kaper, H.~G., D.~Ralley, J.~M.~Restrepo, and S.~Tipei,

1050: ``Additive synthesis with DIASS\_M4C

1051: on Argonne National Laboratory's IBM POWERparallel System (SP),''

1052: \textsl{Proc.\ 1995 Int'l.\ Computer Music Conference}

1053: (Banff, Canada), pp.\ 351--352

1054:

1055: \bibitem{montreal}

1056: Kaper, H.~G., D.~Ralley, and S.~Tipei,

1057: ``Perceived equal loudness of complex tones:

1058: A software implementation for computer music composition,''

1059: \textsl{Proc.\ 1996 Int'l.\ Conference in Music Perception and Cognition}

1060: (Montreal, Canada), pp.\ 127--132

1061:

1062: \bibitem{kramer}

1063: Kramer, G.\ (ed.),

1064: \textsl{Auditory Display: Sonification, Audification,

1065: and Auditory Interfaces},

1066: Proc.\ ICAD '92,

1067: Addison-Wesley Publ.\ Co., 1994.

1068: For proceedings of later conferences,

1069: consult URL: http://www.santafe.edu/\~{ }icad

1070:

1071: \bibitem{ICMC92}

1072: Kriese, C.\ and S.~Tipei,

1073: ``A compositional approach to additive synthesis on supercomputers,''

1074: \textsl{Proc.\ 1992 Int'l.\ Computer Music Conference}

1075: (San Jose, California), pp.\ 394--395

1076:

1077: \bibitem{mezrich}

1078: Mezrich, J., S.~Frysinger, and R.~Slivjanovski,

1079: ``Dynamic representation of multivariate time series data,''

1080: {\em J.~Amer.\ Stat.\ Ass.} {\bf 79} (1984), 34--40

1081:

1082: \bibitem{nature}

1083: Pereverzev, S.~V., A.~Loshak, S.~Backhaus, J.~C.~Davis,

1084: and R.~E.~Packard,

1085: ``Quantum oscillations between two weakly coupled

1086: reservoirs of superfluid ${}^3$He,''

1087: {\em Nature} {\bf 388} (1997), 449--451

1088:

1089: \bibitem{roads}

1090: Roads, C.,

1091: \textsl{The Computer Music Tutorial},

1092: MIT Press, Cambridge, Mass., 1996

1093:

1094: \bibitem{roederer}

1095: Roederer, J.~G.,

1096: \textsl{The Physics and Psychophysics of Music},

1097: 3rd edition.

1098: Springer-Verlag, 1995

1099:

1100: \bibitem{rossing}

1101: Rossing, T.~D.,

1102: \textsl{The Science of Sound},

1103: Addison-Wesley Publ.\ Co., 1990

1104:

1105: \bibitem{ICMC}

1106: Simoni, M.\ (ed.),

1107: \textsl{Proc.\ ICMC98, Int'l Computer Music Conference}

1108: (Ann Arbor, Michigan), October 1998;

1109: see also proceedings of earlier conferences

1110:

1111: \bibitem{smith}

1112: Smith, S.\ and M.~Williams,

1113: ``The use of sound in an exploratory visualization experiment,''

1114: CS Dept., U.~Mass. at Lowell,

1115: tech report R-89-002, 1989

1116:

1117: \bibitem{tipei}

1118: Tipei, S.,

1119: ``The computer: A composer's collaborator,''

1120: {\em Leonardo\/} {\bf 22}(2) (1989), 189--195

1121:

1122: \bibitem{manif}

1123: Tipei, S.,

1124: ``Manifold compositions --- A (super)computer-assisted composition

1125: experiment in progress,''

1126: \textsl{Proc.\ 1989 Int'l.\ Computer Music Conference}

1127: (Columbus, Ohio), pp.\ 324--327

1128:

1129: \bibitem{folds}

1130: Tipei, S.,

1131: ``A.N.L.-folds.\

1132: mani 1943-0000;

1133: mani 1985r-2101;

1134: mani 1943r-0101;

1135: mani 1996m-1001;

1136: mani 1996t-2001''

1137: (1996).

1138: Report ANL/MCS-P679-0897,

1139: Mathematics and Computer Science Division,

1140: Argonne National Laboratory

1141:

1142: \bibitem{web-sonification}

1143: URL:

1144: http://mcs.anl.gov/appliedmath/Sonification/index.html

1145:

1146: \bibitem{cave}

1147: URL:

1148: http://www.evl.uic.edu/pape/CAVE/prog/CAVEGuide.html

1149:

1150: \bibitem{wenzel}

1151: Wenzel, E., S.~Fisher, P.~Stone, and S.~Foster,

1152: ``A system for three-dimensional acoustic `visualization'

1153: in a virtual environment workstation,''

1154: \textsl{Proc.\ Visualization '90:

1155: First IEEE Conf.\ on Visualization},

1156: IEEE Computer Society Press, Washington,

1157: pp.\ 329--337

1158:

1159: \bibitem{xenakis}

1160: Xenakis, I.,

1161: \textsl{Formalized Music: Thought and Mathematics

1162: in Musical Composition},

1163: revised edition, Pendragon Press, 1992

1164:

1165: \bibitem{yeung}

1166: Yeung, E.,

1167: ``Pattern recognition by audio representation

1168: of multivariate analytical data,''

1169: \textit{Analytical Chemistry} {\bf 52} (1980), 1120--1123

1170:

1171: \bibitem{zwicker-80}

1172: Zwicker, E.\ and E.~Terhardt,

1173: ``Analytical expressions for critical-band rate

1174: and critical bandwidth as a function of frequency,''

1175: {\em J.~Acoust.\ Soc.\ Am.} {\bf 68} (1980), 5

1176:

1177: \end{thebibliography}

1178:

1179: \newpage

1180:

1181: {\bf Hans G. Kaper}

1182: is Sr.\ Mathematician at Argonne National Laboratory.

1183: After receiving his Ph.D.\ in mathematics from the

1184: University of Groningen (the Netherlands) in 1965,

1185: he held positions at the University of Groningen

1186: and Stanford University.

1187: In 1969 he joined the staff of Argonne.

1188: He was director of the

1189: Mathematics and Computer Science Division

1190: from 1987 to 1991.

1191: Kaper's professional interests are in

1192: applied mathematics, particularly

1193: mathematics of physical systems and

1194: scientific computing.

1195: He is a corresponding member of the

1196: Royal Netherlands Academy of Sciences.

1197: His main interest outside mathematics is classical music.

1198: He is chairman of ``Arts at Argonne,''

1199: concert impresario, and an accomplished pianist.

1200: He can be reached at kaper@mcs.anl.gov or

1201: http://www.mcs.anl.gov/\~{ }kaper/index.html.

1202:

1203: {\bf Sever Tipei}

1204: is professor of composition and music theory at the

1205: University of Illinois at Urbana-Champaign (UIUC),

1206: where he also manages the Computer Music Project

1207: of the UIUC Experimental Music Studios.

1208: He has a Diploma in piano

1209: from the Bucharest Conservatory in Romania

1210: and a DMA in composition from the University of Michigan.

1211: Tipei has been involved in computer music since 1973

1212: and regards the composition of music both as an experimental

1213: and as a speculative endeavor.

1214: He can be reached at s-tipei@uiuc.edu or

1215: http://cmp-rs.music.uiuc.edu/people/tipei/index.html.

1216:

1217: {\bf Elizabeth Wiebel}

1218: was a participant in the

1219: Student Research Participation Program,

1220: which is sponsored by the

1221: Division of Educational Programs

1222: of Argonne National Laboratory.

1223: She is an undergraduate student

1224: at St.\ Norbert College in De Pere, Wisconsin,

1225: where she is pursuing a degree in

1226: mathematics and computer science.

1227: In addition, Wiebel studies and teaches piano

1228: through the St.\ Norbert College music department.

1229: She is currently spending a semester

1230: at Richmond College in London (England).

1231: She can be reached at wiebep@sncac.snc.edu,

1232: F300398@Richmond.ac.uk,

1233: or

1234: http://members.tripod.com/\~{ }LibbyW.

1235:

1236: \newpage

1237:

1238: \section*{Box~1. \quad Computer-Assisted Music Composition}

1239: %

1240: The idea of using computers for music composition

1241: goes back to the 1950s, when Lejaren Hiller performed

1242: his experiments at the University of Illinois~\cite{hiller-book}.

1243: The premiere of his Quartet No.\ 4 for strings

1244: ``Illiac Suite''~\cite{hiller-quartet}

1245: (May 1957) is generally regarded as

1246: the birth of computer music.

1247: Since then, computers have helped many composers

1248: to algorithmically synthesize new sounds and

1249: produce new pieces for acoustic as well as

1250: digital instruments.

1251: The proceedings of the annual conferences

1252: sponsored by the ICMA

1253: (International Computer Music Association)

1254: are good sources of references~\cite{ICMC}.

1255:

1256: Why would a composer need computer assistance

1257: when composing?

1258: A quick answer is that, as in many other areas,

1259: routine operations can be relegated to the machine.

1260: A more sophisticated reason may be that the composer

1261: may rely on expert systems to write Bach-like chorales

1262: or imitate the mannerisms of Chopin or Rachmaninov.

1263: There are, however, more compelling reasons

1264: when composing is viewed as a speculative

1265: and experimental endeavor, rather than as

1266: an ability to manufacture pleasing sounds~\cite{tipei}.

1267:

1268: Music is basically a dynamic event evolving

1269: in a multidimensional space;

1270: as such, it can be formalized~\cite{xenakis}.

1271: The composer controls the evolution by supplying

1272: a set of rules, and accepts the output as long as

1273: it is consistent with the logic of the program

1274: and the input data.

1275: If the set of rules allows for a certain degree

1276: of randomness, the output will be different

1277: every time a new ``seed'' is introduced.

1278: The same code and input data may thus produce

1279: an unlimited number of compositions,

1280: all belonging to the same ``equivalence class''

1281: or {\em manifold composition}~\cite{manif}.

1282: The members of a manifold composition are variants

1283: of the same piece; they share the same structure

1284: and are the result of the same process, but differ

1285: in the way specific events are arranged in time.

1286:

1287: A nontraditional way of composing,

1288: the manifolds show how high-performance computing

1289: provides the composer with new means

1290: to try out compositional strategies

1291: or materials and hear the results

1292: in a reasonable amount of time.

1293:

1294: \newpage

1295:

1296: \section*{Box~2. \quad Loudness}

1297: %

1298: Sound is transmitted through sound waves---periodic

1299: pressure variations that cause the eardrums to vibrate.

1300: But the perception of loudness has as much to do with

1301: the amount of energy that is carried by the sound wave

1302: as with the processing of this energy that takes place

1303: in the ear and the brain once the sound wave has hit

1304: the eardrums.

1305: The latter is a much more subjective part of the experience.

1306: The algorithms underlying the loudness routines

1307: of DIASS therefore incorporate formal definitions,

1308: as well as results of psychoacoustic research experiments.

1309: We summarize the most relevant elements of the algorithm,

1310: referring the reader to~\cite{roederer} or~\cite{rossing}

1311: for details.

1312:

1313: The definition of (perceived) loudness begins

1314: with the consideration of the energy carried

1315: by the sound wave.

1316: The {\em intensity\/} $I$ of a pure tone (sinusoidal sound)

1317: is expressed in terms of its average pressure variation

1318: $\Delta p$ (measured in newton/m$^2$),

1319: \[

1320:   I = 20 \times \log_{10} (\Delta p / \Delta p_0) .

1321: \]

1322: $\Delta p_0$ is a reference value,

1323: usually identified with a traveling wave

1324: of 1,000~Hz at the threshold of hearing,

1325: $\Delta p_0 = 2 \times 10^{-5}$~newton/m$^2$.

1326: The unit of $I$ is the decibel (dB).
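
As a quick check of the scale
(an illustrative calculation with an arbitrarily chosen pressure value),
a tenfold increase in $\Delta p$ raises $I$ by 20~dB.
For instance, for $\Delta p = 2 \times 10^{-3}$~newton/m$^2$,
\[
  I = 20 \times \log_{10} \left( \frac{2 \times 10^{-3}}{2 \times 10^{-5}} \right)
    = 20 \times \log_{10} 100 = 40~\mbox{dB} .
\]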

1327:

1328: Because of the way acoustical vibrations are processed

1329: in the cochlea (the internal ear),

1330: the sensation of loudness is strongly frequency dependent.

1331: For instance, while an intensity of 50~dB at 1,000~Hz is considered

1332: {\em piano}, the same intensity is barely audible at 60~Hz.

1333: In other words, to produce a given loudness sensation

1334: at low frequencies, a much higher intensity (energy flow)

1335: is needed than at 1,000~Hz.

1336: The intensity $I$ is therefore not a good measure of loudness

1337: if different frequencies are involved.

1338:

1339: In the 1930s, Fletcher and Munson~\cite{fletcher-munson}

1340: performed a series of loudness-matching experiments,

1341: from which they derived a set of

1342: {\em curves of equal loudness}.

1343: These are curves in the

1344: frequency ($f$) vs.\ intensity ($I$) plane;

1345: points on the same curve represent

1346: single continuously sounding pure tones

1347: that are perceived as being ``equally loud.''

1348: They are similar to those recommended

1349: by the International Organization for

1350: Standardization (ISO)~\cite{ISO}

1351: and are presented in Fig.~\ref{f-loudness}.

1352: The curves show clearly that,

1353: in order to be perceived as equally loud,

1354: very low and very high frequencies require

1355: much higher intensities (energy)

1356: than frequencies in the middle range

1357: of the spectrum of audible sounds.

1358: %

1359: \begin{figure}[htb]

1360: \hspace*{0.0in}

1361: \centering{\psfig{file=cse-fig4.ps,width=3.0in}}

1362: \caption{

1363: Curves of equal loudness (marked in phons)

1364: in the frequency vs.\ intensity plane.

1365: \label{f-loudness}

1366: }

1367: \end{figure}

1368: %

1369:

1370: The (physical) {\em loudness level\/} $L_p$

1371: of a Fletcher-Munson curve is identified

1372: with the value of $I$

1373: at the reference frequency of 1,000~Hz.

1374: The unit of $L_p$ is the phon.

1375: The Fletcher-Munson curves range from a loudness

1376: level of 0 to 120~phons over a frequency

1377: range from 25 to 16,000~Hz.

1378:

1379: The loudness level $L_p$ still does not measure loudness

1380: in an absolute manner: a tone whose $L_p$ is twice

1381: as large does not sound twice as loud.

1382: Following Rossing~\cite{rossing},

1383: we define the (subjective) {\em loudness level\/} $L_s$

1384: in terms of $L_p$ by the formula

1385: $L_s = 2^{(L_p-40)/10}$.

1386: The unit of $L_s$ is the sone.

1387: To be effective, loudness scaling must be done

1388: on the basis of sones.
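
As a worked illustration of this formula (added for orientation),
a loudness level of 40~phons corresponds to 1~sone,
and every increase of 10~phons doubles the loudness in sones:
\[
  L_p = 40 \Rightarrow L_s = 2^0 = 1 , \quad
  L_p = 50 \Rightarrow L_s = 2^1 = 2 , \quad
  L_p = 90 \Rightarrow L_s = 2^5 = 32 .
\]
Inverting the formula gives $L_p = 40 + 10 \log_2 L_s$,
so the level of $2^5$~sones used in Box~3 corresponds to 90~phons.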

1389:

1390: The loudness of a sound that is composed of

1391: several partials depends on how well the

1392: frequencies of the partials are separated.

1393: With each frequency $f$ is associated

1394: a {\em critical band}, whose width $\Delta f$

1395: is approximately given by the expression~\cite{zwicker-80}

1396: \[

1397:   \Delta f \approx 25 + 75 \left(1 + 1.4(f/1000)^2 \right)^{0.69} .

1398: \]
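
To give a feeling for the numbers
(a sample evaluation, assuming $f$ is measured in Hz,
as the factor $f/1000$ suggests),
the expression yields
\[
  \Delta f \approx 25 + 75 \times (2.4)^{0.69} \approx 162~\mbox{Hz}
  \quad \mbox{at } f = 1{,}000~\mbox{Hz} ,
\]
\[
  \Delta f \approx 25 + 75 \times (1.014)^{0.69} \approx 101~\mbox{Hz}
  \quad \mbox{at } f = 100~\mbox{Hz} .
\]
The critical band thus widens with increasing frequency
but remains close to 100~Hz at the low end of the spectrum.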

1399: Intensities within a critical band are added,

1400: and the loudness of a critical band can again

1401: be read off from the Fletcher-Munson tables.

1402: If the frequencies of its constituent partials

1403: are spread over several critical bands,

1404: the loudness of a sound is computed

1405: in accordance with a formula

1406: due to Rossing~\cite{rossing},

1407: \[

1408:   L_s = L_{s,m} + 0.3 \sum_i L_{s,i} .

1409: \]

1410: Here, $L_{s,m}$ is the loudness of the loudest critical band,

1411: and the sum extends over the remaining bands.
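
As an illustration with hypothetical values
(chosen only to show the arithmetic, not taken from DIASS),
consider a sound whose partials occupy three critical bands
with loudness 8, 4, and 2~sones.
The formula then gives
\[
  L_s = 8 + 0.3 \times (4 + 2) = 9.8~\mbox{sones} ,
\]
which is well below the value of 14~sones
that a plain sum of the band loudnesses would give.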

1412:

1413: The loudness routines in DIASS use

1414: critical band information and

1415: a table derived from the Fletcher-Munson curves

1416: to create complex sounds of specified loudness.

1417:

1418: \newpage

1419:

1420: \section*{Box~3. \quad Loudness of Sound Clusters}

1421:

1422: The waveform of Fig.~\ref{f-clusters},

1423: which was produced with DIASS,

1424: illustrates the concept of equal loudness

1425: across the frequency spectrum and for

1426: different timbres.

1427: The waveform represents five sound clusters,

1428: each lasting 5.5 seconds (except the fourth,

1429: which lasts 5.7 seconds).

1430: The clusters, although of widely different structure,

1431: have been designed to be perceived at the same

1432: loudness level ($2^5$ sones).

1433: %

1434: \begin{figure}[htb]

1435: \hspace*{0.0in}

1436: \centering{\psfig{file=cse-fig5.ps,width=4.0in}}

1437: \caption{

1438: Waveform of five sound clusters of equal perceived loudness.

1439: \label{f-clusters}

1440: }

1441: \end{figure}

1442: %

1443:

1444: The distribution of the sounds within each cluster

1445: is represented schematically in the diagram of

1446: % Table~2.

1447: Table~\ref{t-clusters}.

1448: The first sound cluster has 24 sounds.

1449: The fundamental frequencies of the sounds

1450: range from 40 to 5,000 Hz.

1451: Each sound is harmonically tuned;

1452: that is, it is made up of a fundamental

1453: and all its harmonics

1454: (partials whose frequencies

1455: are integer multiples of

1456: the fundamental frequency).

1457: The frequencies are limited to

1458: one-half of the sampling rate

1459: (Nyquist criterion);

1460: hence, the number of partials

1461: in this cluster is 754

1462: (at a sampling rate of 22,050 Hz).

1463: The second sound cluster has 5 sounds,

1464: harmonically tuned, with fundamental frequencies

1465: ranging from 40 to 4,000 Hz;

1466: the number of partials is 113.

1467: The third, fourth, and fifth cluster have

1468: 15, 1, and 10 sounds, with 453, 60, and 250 partials,

1469: respectively.

1470: All partials are assigned the same amplitude,

1471: which presents the worst-case scenario

1472: when one tries to obtain the same

1473: perceived loudness for all clusters.

1474:

1475: \begin{center}

1476: \begin{table}[h]

1477: \caption{Distribution of fundamentals

1478: in the clusters of Figure~5.  \label{t-clusters}}

1479: \vspace*{1ex}

1480: \begin{footnotesize}

1481: \hspace*{-3em}

1482: \begin{tabular}{||r| c c c c c||}\hline

1483: \multicolumn{1}{||c |}{Fundamental}

1484:          & 24 Sounds   & \hspace{-2em}5 Sounds      & \hspace{-2em}15 Sounds

1485:  & \hspace{-2em}1 Sound       & \hspace{-2em}10 Sounds \\

1486: \multicolumn{1}{||c |}{Frequency}

1487:          &\hspace{-2em}(754 partials) &\hspace{-2em}(113 partials) &\hspace{-2em}(453 partials) &\hspace{-2em}(60 partials)  &\hspace{-2em}(250 partials) \\\hline

1488:          &               &               &               &               & \\

1489: 5,000 Hz & \Yes          &               &               &               & \\

1490: 4,500 Hz & \Yes          &               &               &               & \Yesm\\

1491: 4,000 Hz & \Yes          & \Yesm         & \Yesm         &               & \\

1492: 3,000 Hz & \Yes          &               & \Yesm         &               & \\

1493: 2,666 Hz & \Yes          &               & \Yesm         &               & \\

1494: 2,000 Hz & \Yes          &               &               &               & \Yesm\\

1495: 1,666 Hz & \Yes          & \Yesm         & \Yesm         &               & \\

1496: 1,333 Hz & \Yes          &               &               &               & \Yesm\\

1497: 1,000 Hz & \Yes          &               & \Yesm         &               & \Yesm\\

1498: 750 Hz   & \Yes          & \Yesm         & \Yesm         &               & \\

1499: 625 Hz   & \Yes          &               &               &               & \Yesm\\

1500: 500 Hz   & \Yes          &               & \Yesm         &               & \\

1501: 400 Hz   & \Yes          &               &               &               & \Yesm\\

1502: 300 Hz   & \Yes          & \Yesm         & \Yesm         &               & \\

1503: 200 Hz   & \Yes          &               & \Yesm         &               & \\

1504: 165 Hz   & \Yes          &               &               &               & \Yesm\\

1505: 130 Hz   & \Yes          &               & \Yesm         &               & \\

1506: 90 Hz    & \Yes          &               & \Yesm         &               & \\

1507: 80 Hz    & \Yes          &               &               &               & \Yesm\\

1508: 70 Hz    & \Yes          &               &               &               & \Yesm\\

1509: 60 Hz    & \Yes          &               & \Yesm         &               & \\

1510: 53 Hz    & \Yes          &               & \Yesm         &               & \\

1511: 46 Hz    & \Yes          &               & \Yesm         &               & \\

1512: 40 Hz    & \Yes          & \Yesm         & \Yesm         & \Yesm         & \Yesm\\

1513:          &               &               &               &               & \\\hline

1514: Time     &

1515: \multicolumn{1}{l}{0.0"}&

1516: \multicolumn{1}{l}{\hspace{-1em}5.5"}&

1517: \multicolumn{1}{l}{\hspace{-2em}11.0"}&

1518: \multicolumn{1}{l}{\hspace{-1em}16.5"}&

1519: \multicolumn{1}{l||}{\hspace{-2em}22.2"}\\\hline

1520: \end{tabular}

1521: \end{footnotesize}

1522: \end{table}

1523: \end{center}

1524: %

1525:

1526: \newpage

1527:

1528: \section*{Box~4. \quad Computational Complexity}

1529:

1530: To give some idea of the computational complexity,

1531: consider the following simple scenario,

1532: where we wish to sonify time-varying data

1533: representing the values of two primary and

1534: several secondary observables measured

1535: over the course of an experiment.

1536: A natural choice is to map the primary observables

1537: onto loudness and frequency and to use

1538: amplitude and frequency modulation to monitor

1539: the secondary observables.

1540: The sample values of the sound wave $S$ must be

1541: calculated from an expression of the form

1542: \be

1543:   S (t) = a (t) \sin \left(2\pi f(t) t + \phi \right) .

1544:   \Label{S}

1545: \ee

1546: The frequency $f$ represents three degrees of freedom:

1547: the carrier frequency $f^C$, and the amplitude

1548: $a^{FM}$ and frequency $f^{FM}$ of the modulating wave,

1549: \be

1550:   f (t)

1551:   =

1552:   f^C (t)

1553:   + a^{FM} (t) \sin \left(2\pi f^{FM} t + \phi^{FM} \right) .

1554:   \Label{f}

1555: \ee

1556: The carrier frequency is identified with a primary observable;

1557: each of the remaining two degrees of freedom can be identified

1558: with a secondary observable.

1559:

1560: Similarly, the amplitude $a$ is given by an expression

1561: of the form

1562: \be

1563:   a (t)

1564:   =

1565:   a^C (t)

1566:   + a^{AM} (t) \sin \left(2\pi f^{AM} t + \phi^{AM} \right) .

1567:   \Label{a}

1568: \ee

1569: We compute the carrier amplitude $a^C$

1570: from the observed loudness, which is identified with

1571: one of the (primary) observables, so its value is given.

1572: The amplitude $a^{AM}$ and frequency $f^{AM}$ of the modulation

1573: represent two more degrees of freedom,

1574: which can be identified with

1575: two other secondary observables.

1576: In total, we therefore have two primary and four secondary variables

1577: (not counting the phases, which we assume to be static).
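Substituting Eqs.~(\ref{f}) and~(\ref{a}) into Eq.~(\ref{S}) collects all six
quantities in a single expression.
The display below is only a restatement of the three formulas above,
written out for convenience; it introduces no new symbols,
\begin{eqnarray*}
  S (t)
  & = &
  \left[ a^C (t)
  + a^{AM} (t) \sin \left(2\pi f^{AM} t + \phi^{AM} \right) \right] \\
  &   &
  \times \sin \left( 2\pi \left[ f^C (t)
  + a^{FM} (t) \sin \left(2\pi f^{FM} t + \phi^{FM} \right) \right] t
  + \phi \right) .
\end{eqnarray*}
Here $a^C$ and $f^C$ carry the two primary observables,
while $a^{AM}$, $f^{AM}$, $a^{FM}$, and $f^{FM}$
carry the four secondary observables.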

1578:

1579: The amplitude $a^C (t)$ must be computed such that

1580: $S (t)$ has the perceived loudness level $L_s (t)$,

1581: \be

1582:   L_s (S(t)) = L_s (t) .

1583:   \Label{L}

1584: \ee

1585: The loudness function $L_s$ is a nonlinear function of

1586: the amplitude and frequency of the partial (sound).

1587: Its computation is done in the loudness routines of DIASS

1588: and involves a significant number of operations,

1589: including table lookups; see Box~2.

1590:

1591: On the basis of these formulas we can obtain

1592: a rough estimate of the number of operations

1593: (additions, multiplications,

1594: evaluations of the sine,

1595: exponential, or logarithm function,

1596: and table lookups)

1597: required for the computation of a single sample value.

1598: The contribution that is most difficult to estimate is

1599: the computation of the carrier amplitude from

1600: the loudness;

1601: the data in Table~\ref{t-ops}

1602: represent the minimum number of operations.

1603: %

1604: \begin{table}[htb]

1605: \begin{center}

1606: \caption{Number of operations per partial per sample value.  \label{t-ops} }

1607: \vspace*{2ex}

1608: \begin{small}

1609: \begin{tabular}{|| c || c | c | c | c ||}\hline

1610: Eq.       & Adds & Mults & Function evals & Table lookups \\\hline

1611: (\ref{S}) & 1    & 3     & 1        & -           \\

1612: (\ref{f}) & 2    & 3     & 1        & -           \\

1613: (\ref{a}) & 2    & 3     & 1        & -           \\

1614: (\ref{L}) & 1    & 3     & 2        & 1           \\\hline

1615: Total     & 6    & 12    & 5        & 1           \\\hline

1616: \end{tabular}

1617: \end{small}

1618: \end{center}

1619: \end{table}

1620: %

1621: Ignoring phases and so forth, we find a total of

1622: at least 24 operations per partial per sample value.

1623: Hence, at the standard rate of 44,100 samples per second,

1624: one needs to perform more than

1625: 1 million operations per second for each partial.
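As a check on this estimate, the column totals in Table~\ref{t-ops}
can be combined with the sampling rate.
The symbol $N$ for the number of partials is introduced here only for illustration,
and the linear scaling assumes that each partial is computed independently
at the minimum cost listed in the table,
\[
  6 + 12 + 5 + 1 = 24 ,
  \qquad
  24 \times 44{,}100 = 1{,}058{,}400 \approx 1.06 \times 10^6
  \ \mbox{operations per second per partial} .
\]
Under that assumption, a sound with $N$ partials would require at least
$1{,}058{,}400 \, N$ operations per second,
for example about $1.06 \times 10^7$ for $N = 10$.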

1626:

1627: The simultaneous sonification of more observables

1628: is obviously much more complicated;

1629: in fact, the complications grow exponentially.

1630: A careful estimate of the computational complexity

1631: requires an analysis of the anticlip routines,

1632: which is beyond the scope of the present article.

1633:

1634: \end{document}

1635: