1: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2: %
3: %  
4: %  DATA SONIFICATION AND SOUND VISUALIZATION
5: %
6: %  January, 1999
7: %
8: %  Computational Science and Engineering
9: %
10: %  Preprint ANL/MCS-P738-0199
11: %
12: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
13: %%%%%%               Preamble                  %%%%%%
14: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
15: %
16: 
17: \documentstyle[psfig,12pt]{article}
18: \pagestyle{plain}
19: \setlength{\textheight}{8.0in}
20: \setlength{\textwidth}{6.0in}
21: \setlength{\evensidemargin}{0.3in}
22: \setlength{\oddsidemargin}{0.3in}
23: \setlength{\topmargin}{0.0in}
24: \setlength{\parskip}{2ex}
25: \setlength{\parindent}{2em}
26: \newcommand{\Yes}{\rule{1.0in}{0.02in}}
27: \newcommand{\Yesm}{\hspace{-2em}\rule{1.3in}{0.02in}}
28: \newcounter{labelflag} \setcounter{labelflag}{0}
29: \newcommand{\labelon}{\setcounter{labelflag}{1}}
30: \newcommand{\Label}[1]{
31:                        \ifnum\thelabelflag=1
32:                           \ifmmode
33:                              \makebox[0in][l]{\qquad\fbox{\rm#1}}
34:                           \else
35:                              \marginpar{\vspace{0.7\baselineskip}
36:                                         \hspace{-1.1\textwidth}
37:                                         \fbox{\rm#1}}
38:                           \fi
39:                        \fi
40:                        \label{#1}
41:                       }
42: % \labelon                            % Keys are printed,
43: \newcommand{\be}{\begin{equation}}
44: \newcommand{\ee}{\end{equation}}
45: 
46: 
47: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
48:                        
49: \begin{document}       
50:                       
51: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
52: %%%%%%      Title         %%%%%
53: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
54:    
55: 
56: \begin{center}
57: {\Large\bf Data Sonification and Sound Visualization}
58: \end{center}
59:    
60: \noindent
61: Hans G.\ Kaper \\
62: \hspace*{2em}
63: \textit{
64: Mathematics and Computer Science Division,
65: Argonne National Laboratory
66: } \\
67: Sever Tipei \\
68: \hspace*{2em}
69: \textit{
70: School of Music,
71: University of Illinois
72: } \\
73: Elizabeth Wiebel \\
74: \hspace*{2em}
75: \textit{
76: Mathematics and Computer Science Division,
77: Argonne National Laboratory
78: }
79: 
80: \medskip
81: 
82: \begin{abstract}
83: This article describes a collaborative project
84: between researchers in the Mathematics and Computer
85: Science Division at Argonne National Laboratory
86: and the Computer Music Project of the University
87: of Illinois at Urbana-Champaign.
88: The project focuses on the use of sound for the
89: exploration and analysis of complex data sets
90: in scientific computing.
91: The article addresses digital sound synthesis
92: in the context of DIASS (Digital Instrument
93: for Additive Sound Synthesis) and sound
94: visualization in a virtual-reality environment
95: by means of M4CAVE.
96: It describes the procedures and preliminary results
97: of some experiments in scientific sonification
98: and sound visualization.
99: \end{abstract}
100: 
101: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
102: %%%%%       Body           %%%%%
103: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
104: 
105: \medskip
106: 
107: \noindent
108: While most computational scientists routinely use
109: visual imaging techniques to explore and analyze
110: large data sets, they tend to be much less familiar
111: with the use of sound.
112: Yet, sound signals carry significant amounts of
113: information and can be used advantageously to
114: increase the bandwidth of the human/computer
115: interface.
116: The project described in this article focuses on
117: scientific sonification---the faithful rendering
118: of scientific data in sounds---and the visualization
119: of sounds in a virtual-reality environment.
120: The project, which grew out of an effort to apply
121: the latest supercomputing technology to
122: the process of music composition (see Box~1),
123: is a collaboration between
124: Argonne National Laboratory (ANL, Mathematics and
125: Computer Science Division) and the University of
126: Illinois at Urbana-Champaign (UIUC, Computer Music
127: Project).
128: 
129: Digital sound synthesis is addressed in Section~1;
130: the discussion centers on DIASS
131: (Digital Instrument for Additive Sound Synthesis).
132: Section~2 describes some experiments in
133: scientific sonification.
134: Sound visualization in a virtual-reality (VR)
135: environment is discussed in Section~3;
136: here, the main tool is M4CAVE, a program to
137: visualize sounds from a score file.
138: Section~4 contains some general observations
139: about the project.
140: 
141: \section{Digital Sound Synthesis}
142: %
143: Digital sound synthesis is a way to generate
144: a stream of numbers representing the sampled
145: values of an audio waveform.
146: To realize the sounds, one sends these samples
147: through a digital-to-analog converter (DAC),
148: which converts the numbers to a continuously
149: varying voltage that can be amplified and sent to
150: a loudspeaker.
151: 
152: One way of viewing the digital sound-synthesis process
153: is to imagine a computer program that calculates
154: the sample values according to a mathematical formula
155: and sends those samples, one after the other, to the DAC.
156: All the calculations are carried out by a program,
157: which can be changed in arbitrary ways by the user.
158: From this point of view, digital synthesis is the same
159: as software synthesis.
160: Software synthesis contrasts with hardware synthesis,
161: where the calculations are carried out in special
162: circuitry.
163: Hardware synthesis has the advantage of high-speed
164: operation but lacks the flexibility of software
165: synthesis.
166: Software synthesis is the technique of choice
167: if one wishes to develop an instrument for
168: data sonification.
169: 
170: With software synthesis, one can indeed realize any
171: imaginable sound---provided one has the time
172: to wait for the results.
173: With a sampling rate of 44,100 samples per second,
174: the time available per sample is only about 23 microseconds,
175: too short for real-time synthesis of reasonably complex sounds.
176: For this reason, most of today's synthesis programs
177: generate a sound file, which is then played through a DAC.
178: But data sonification in real time may become feasible
179: on tomorrow's high-performance computing architectures.
180: Our research effort focuses on the development of
181: a flexible and powerful digital instrument for
182: scientific sonification and on finding optimal ways
183: to convey information through the medium of sound.
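
As a back-of-the-envelope check of this time budget
(the symbols $f_s$ for the sampling rate and $\Delta t$
for the sampling interval are introduced here only for illustration),
the time available per sample is
\[
  \Delta t = \frac{1}{f_s}
  = \frac{1}{44{,}100} \, \mbox{s}
  \approx 22.7 \, \mu\mbox{s} .
\]
If a sound had, say, 100 active partials (a hypothetical figure),
each partial, including all its envelopes and modulations,
would have to be evaluated in roughly $0.23 \, \mu$s
to keep up with the DAC.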
184: 
185: \subsection{DIASS -- A Digital Instrument}
186: 
187: Two pieces of software constitute the main tools
188: of the project: DIASS,
189: a Digital Instrument for Additive Sound Synthesis,
190: and M4CAVE,
191: a program for the visualization of sound objects
192: in a multimedia environment.
193: Both are part of a comprehensive
194: {\em Environment for Music Composition},
195: which includes additional software for
196: computer-assisted composition and
197: automatic music notation.
198: Figure~\ref{f-env} gives a schematic overview
199: of the various elements of the {\em Environment\/};
200: C and S mark the data entry points for
201: composition and sonification, respectively.
202: % 
203: \begin{figure}[htbp]
204: \hspace*{-0.0in}  
205: \centering{\psfig{file=cse-fig1.eps,height=3in,width=3in}}
206: \caption{The
207: {\em Environment for Music Composition}.
208: \label{f-env}
209: }
210: \end{figure}
211: %
212: 
213: In this section we describe the workings of DIASS;
214: we will describe M4CAVE after we have discussed
215: our ideas on scientific sonification.
216: 
217: \subsubsection{The Instrument}
218: %
219: The DIASS instrument functions as part of
220: the M4C synthesis language developed
221: by Beauchamp and his associates
222: at the University of Illinois~\cite{M4C}.
223: Synthesis languages like M4C are designed
224: around the notion that the user creates
225: an instrument together with a score
226: that references the instrument.
227: The synthesis program reads the instrument, feeds it
228: the data from the score file, and computes the final
229: audio signal, which is then written to a sound file
230: for later playback~\cite{roads}.
231: 
232: The M4C synthesis language is embedded in the C language.
233: As part of the current project, the instrument
234: and relevant parts of M4C were redesigned
235: for a distributed-memory environment.
236: The parallel implementation uses the standard
237: MPI message-passing library~\cite{MPI}.
238: 
239: Like all additive-synthesis instruments,
240: DIASS creates sounds through a summation
241: of simple sine waves.
242: The basic formula is
243: \[
244:   S (t) = \sum_i P_i (t)
245:   = \sum_i a_i (t) \sin (2 \pi f_i (t) t + \phi_i (t)) .
246: \]
247: The individual sine waves that make up a sound
248: are commonly designated as the ``partials'' of the sound,
249: hence the symbol $P$.
250: The sum extends over all partials that are
251: active at the time $t$;
252: $a_i$ is the amplitude, $f_i$ the frequency,
253: and $\phi_i$ the phase of the $i$th partial.
254: These variables can be modulated periodically or otherwise;
255: the modulations evolve on a slow time scale,
256: typically on the order of the duration of a sound.
257: Phase modulation is barely distinguishable from
258: frequency modulation, particularly in the case
259: of time-varying frequency spectra, and is not
260: implemented in DIASS.
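
In a software implementation the formula is evaluated
only at the sampling instants.
Writing $f_s$ for the sampling rate and $t_n = n / f_s$
for the $n$th sampling instant
(notation introduced here for illustration),
one may sketch the stream of sample values as
\[
  S_n = S (t_n)
  = \sum_i a_i (t_n) \sin (2 \pi f_i (t_n) t_n + \phi_i (t_n)) ,
  \quad n = 0, 1, 2, \ldots ,
\]
so that one second of a sound made up of $N$ partials requires
on the order of $N f_s$ sine evaluations, in addition to the cost
of updating the slowly varying $a_i$ and $f_i$.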
261: 
262: The audible frequencies range roughly
263: from 20 to 20,000~Hz, although in practice
264: the upper limit is one-half the sampling frequency
265: (Nyquist criterion).
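
For example, with the sampling frequency of
$f_s = 44{,}100$~Hz used elsewhere in this article
(the symbol $f_s$ is introduced here for illustration),
the highest frequency that can be represented is
\[
  f_{\max} = \frac{f_s}{2} = 22{,}050 \, \mbox{Hz} ,
\]
which lies just above the nominal upper limit of hearing.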
266: 
267: The partials in a sound need not be
268: in any harmonic relationship
269: (that is, $f_i$ need not be a multiple of some
270: fundamental frequency $f_0$),
271: nor do they need to share any other property.
272: The definition of a sound is purely operational.
273: What distinguishes one ``sound'' from another
274: is that certain operations are defined
275: at the level of a sound and affect all
276: the partials that make up the sound.
277: 
278: The evolution of a partial can be subject
279: to many other controls, besides
280: amplitude and frequency modulation.
281: Moreover, these controls can affect a single partial
282: or all the partials in a sound.
283: For example, reverberation, which represents
284: the combined effects of the size and
285: acoustic characteristics of the hall,
286: affects all the partials in a sound simultaneously,
287: although not necessarily in the same way.
288: Furthermore, if a random element is present,
289: it must be applied at the level of a sound;
290: otherwise, a complex wave is perceived as
291: a collection of independent sine waves,
292: instead of a single sound.
293: Hence, it is important that all partials
294: in a sound access the same random number sequence
295: and that the controls of any partial
296: that changes its allegiance and moves
297: from one sound to another be adjusted accordingly.
298: 
299: %
300: \begin{table}[htb]
301: \begin{center}
302: \caption{Static (S) and dynamic (D) control parameters in DIASS.
303:  \label{t-controls} }
304: \vspace*{2ex}
305: \begin{footnotesize}
306: \begin{tabular}{|| l | l | l ||}\hline
307: \multicolumn{1}{||c|}{Level} &
308: \multicolumn{1}{c|}{Description}&
309: \multicolumn{1}{c||}{Control Parameter} \\ \hline
310: 
311: Partial & Carrier (sine) wave     & S: Starting time, duration, phase \\
312:         &                         & D: Amplitude, frequency \\
313:         & AM (tremolo) wave       & S: Wave type, phase \\
314:         &                         & D: Amplitude, frequency \\
315:         & FM (vibrato) wave       & S: Wave type, phase \\
316:         &                         & D: Amplitude, frequency \\
317:         & Amplitude transients    & S: Max size \\
318:         &                         & D: Shape \\
319:         & Amplitude transient rate& S: Max rate \\
320:         &                         & D: Rate shape \\
321:         & Frequency transients    & S: Max size \\
322:         &                         & D: Shape \\
323:         & Frequency transient rate& S: Max rate \\
324:         &                         & D: Rate shape \\
325: Sound   & Timbre                  & D: Partial-to-sound relation \\
326:         & Localization            & D: Panning \\
327:         & Reverberation           & S: Duration, decay rate, mix \\
328:         & Hall                    & S: Hall size, reflection coefficient \\ \hline
329: \end{tabular}
330: \end{footnotesize}
331: \end{center}
332: \end{table}
333: %
334: Table~\ref{t-controls} lists the control parameters
335: that can be applied in DIASS.
336: Some, like starting time and duration,
337: do not change for the duration of a sound;
338: they are static and determined by a single value.
339: Others are dynamic;
340: their evolution is controlled by an envelope---a
341: normalized function consisting of
342: linear and exponential segments---and a maximum size.
343: Not all control parameters are totally independent;
344: some occur only in certain combinations, and
345: some are designed to reinforce others.
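
The precise envelope format is not spelled out here,
so the following is only a sketch under stated assumptions.
Suppose a dynamic parameter $p$ is given by
$p (t) = p_{\max} \, e (t)$,
where $p_{\max}$ is the maximum size and the envelope $e$,
with $0 \le e (t) \le 1$, is specified by breakpoints
$(t_k, e_k)$; all of these symbols are hypothetical.
On a linear segment $t_k \le t \le t_{k+1}$ one would then have
\[
  e (t) = e_k + (e_{k+1} - e_k) \, \frac{t - t_k}{t_{k+1} - t_k} ,
\]
and on an exponential segment (with $e_k, e_{k+1} > 0$)
\[
  e (t) = e_k \left( \frac{e_{k+1}}{e_k}
  \right)^{(t - t_k)/(t_{k+1} - t_k)} .
\]
Both expressions interpolate between $e_k$ and $e_{k+1}$;
the exponential form changes by equal ratios in equal times,
which is roughly how changes in loudness and pitch are perceived.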
346: 
347: The control parameters give DIASS its flexibility
348: and make it an instrument suitable for data sonification.
349: On the other hand, the fact that the control parameters
350: act at the level of a partial as well as at the level
351: of a sound (or even at the level of a collection of sounds)
352: significantly increases its computational complexity.
353: 
354: \subsubsection{The Score}
355: %
356: Input for DIASS consists of a raw score file
357: detailing the controls.
358: The raw score file is transformed
359: into a score file for the instrument---a
360: collection of ``Instrument cards'' (I-cards),
361: one for each partial, which are fed
362: to the instrument by M4C.
363: The transformation is accomplished
364: in a number of steps.
365: 
366: Among the controls are certain global operations
367: (``macros''), which are defined at the level of a sound.
368: In a first pass, these global controls are expanded
369: into controls for the individual partials.
370: The next step consists of the application of
371: the loudness routines.
372: These routines operate at the sound level and ensure
373: that the sounds have the desired loudness.
374: The final step consists of the application of
375: the anticlip routines.
376: For various reasons, historical as well as technical,
377: sound samples are stored as 16-bit integers.
378: The anticlip routines guarantee that none of
379: the sample values produced by the instrument
380: from the score file exceeds 16 bits.
381: Because loudness and anticlip play a significant role
382: in sonification, we discuss the issues in more detail.
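
To make the 16-bit limit concrete: assuming the usual signed
(two's-complement) representation, which is an assumption
about the storage format rather than a statement about the DIASS code,
a sample value $s$ must satisfy
\[
  -2^{15} = -32{,}768 \le s \le 32{,}767 = 2^{15} - 1 ,
\]
and the ratio between the full range and the smallest representable
step corresponds to a dynamic range of about
\[
  20 \log_{10} 2^{16} \approx 96 \, \mbox{dB} .
\]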
383: 
384: \paragraph{Loudness.}
385: The perception of loudness is a subjective experience.
386: Although the perceived loudness of a sound is related
387: to the amplitudes of its constituent partials,
388: the relation is nonlinear and depends on
389: the frequencies of the partials.
390: At the most elementary level,
391: pure sinusoidal waves of low or high frequencies
392: require a higher energy flow and therefore a larger
393: amplitude to achieve the same loudness level
394: as similar waves at mid-range frequencies.
395: When waves of different frequencies
396: are superimposed to form a sound,
397: the situation becomes still more complicated.
398: The sum of two tones of the same frequency
399: produced by two identical instruments
400: played simultaneously is not perceived as twice
401: as loud as the tone produced by a single instrument.
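
A standard textbook estimate, given here as an illustration
and not as the model implemented in DIASS, makes the point.
Two identical but independent sources double the sound intensity,
which raises the level by only
\[
  10 \log_{10} 2 \approx 3 \, \mbox{dB} ;
\]
even if the two waves were perfectly in phase, so that the amplitude
doubled, the increase would be
\[
  20 \log_{10} 2 \approx 6 \, \mbox{dB} .
\]
A common rule of thumb is that a sound is judged twice as loud
only after an increase of roughly 10~dB, so neither case
is heard as a doubling of loudness.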
402: 
403: An algorithm for data sonification must reflect
404: these subjective experiences.
405: For example, when we sonify two degrees of freedom,
406: mapping one ($x_1$, say) to amplitude and the other
407: ($x_2$, say) to frequency, then we should perceive
408: equal loudness levels when $x_1$ has the same value, 
409: irrespective of the values of $x_2$.
410: Also, when the variable $x_1$ increases or decreases,
411: we should perceive a proportional increase
412: or decrease in the loudness level.
413: 
414: The loudness routines in DIASS incorporate
415: the relevant results of psychoacoustic
416: research~\cite{montreal}
417: and give the user full control over the perceived
418: loudness of a sound.
419: They also scale each partial so each sample value
420: fits in a 16-bit register (see Box~2).
421: 
422: \paragraph{Anticlip.}
423: When several sounds coexist and their waveforms
424: are added, sample values may exceed 16 bits (overflow),
425: even when the individual waveforms stay
426: within the 16-bit limit.
427: Overflow gives rise to ``clipping''---a popping
428: noise---when the sound file is played.
429: The anticlip routines in DIASS check
430: the score for potential overflow
431: and rescale the sounds as necessary,
432: while preserving the ratio of perceived loudness levels.
433: Thus it is possible to produce an entire sound file
434: in a single run from the score file, even when
435: the sounds cover a wide dynamic range.
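
The simplest conceivable rescaling, shown here only as a baseline
and not as the algorithm used by the anticlip routines,
multiplies every sample by one common factor.
If $S_k (t_n)$ denotes the $n$th sample of the $k$th sound and
$M = \max_n | \sum_k S_k (t_n) |$ is the largest magnitude
in the mix (hypothetical notation), the factor
\[
  c = \min \left( 1, \frac{2^{15} - 1}{M} \right)
\]
guarantees $| c \sum_k S_k (t_n) | \le 2^{15} - 1$ for all $n$.
Such a uniform factor keeps amplitude ratios unchanged
but takes no account of perceived loudness.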
436: 
437: To appreciate the difficulty inherent in the scaling processes,
438: consider the case of a sound cluster consisting of
439: numerous complex sounds, all very loud and resulting in clipping,
440: followed by a barely audible sound with only two or three partials.
441: If the cluster's amplitude is brought down to fit the register
442: capacity, and that of the soft tiny sound following it
443: is scaled proportionally,
444: the latter disappears under system noise.
445: On the other hand, if only the loud cluster is scaled,
446: the relationship between the two sound events
447: is completely distorted.
448: In the past, individual sounds
449: or groups of sounds were often generated separately
450: and then merged with the help of analog equipment
451: or an additional digital mixer.
452: The loudness and anticlip routines in DIASS
453: deal with this problem by adjusting both loud and
454: soft sounds so their perceived loudness matches
455: the desired relationship specified by the user,
456: and no clipping occurs (see Box~3).
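
A small numerical illustration with hypothetical figures:
suppose the mixed cluster peaks at four times the register capacity,
while the soft sound that follows has a peak amplitude
of only 8 quantization steps.
Scaling everything by the common factor $1/4$
(a reduction of $20 \log_{10} 4 \approx 12$~dB)
removes the clipping but leaves the soft sound with a peak of
\[
  8 \times \frac{1}{4} = 2
\]
quantization steps, that is, barely above the rounding error
of the 16-bit representation.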
457: 
458: \subsubsection{The Editor}
459: %
460: Features like the loudness routines make DIASS
461: a fine-tuned, flexible, and precise instrument
462: suitable for data sonification.
463: Of course, they require the specification of
464: significant amounts of input data.
465: The editor in DIASS is designed to facilitate
466: this process.
467: It comes in a ``slow'' and a ``fast'' version.
468: 
469: In the slow version, data are entered
470: one at a time, either in response to questions
471: from a menu or through a graphic user interface (GUI).
472: The process gives the user the opportunity
473: to build sounds step by step, experiment, and fine-tune
474: the instrument.
475: It is suitable for sound composition and for designing
476: prototype experiments in sonification.
477: The fast version uses the same code but reads
478: the responses to the menu questions from a script.
479: This version is used for sonification experiments.
480: 
481: \subsubsection{Computing Requirements}
482: %
483: The sound synthesis software embodied in DIASS
484: is computationally intensive (see Box~4).
485: The instrument proper,
486: the engine that computes the samples,
487: has been implemented in a workstation environment
488: and on the IBM Scalable POWERparallel (SP) system.
489: Parallelism is implemented at the sound level
490: to minimize communication among the processors
491: and enable all partials of a sound to access
492: the same random number sequence.
493: In parallel mode, at least four processors are
494: used---one to distribute the tasks and
495: supervise the entire run (the ``master'' processor),
496: a second to mix the results (the ``mixer''),
497: and at least two ``slave'' nodes to compute
498: the samples one sound at a time.
499: Sounds are computed in their starting-time order,
500: irrespective of their duration or complexity.
501: (A smart load-balancing algorithm would take into account
502: the duration of the various sounds and the
503: number of their partials.)
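
As a rough sketch of what such an algorithm might use
(the cost model and symbols are assumptions, not part of DIASS),
let $d_j$ be the duration of sound $j$ in seconds,
$n_j$ the number of its partials, and $f_s$ the sampling rate.
The work needed to compute sound $j$ is then approximately
\[
  W_j \propto f_s \, d_j \, n_j ,
\]
and a scheduler could, for instance, hand out the sounds
in order of decreasing $W_j$ instead of increasing starting time,
so that long, complex sounds do not end up being computed last.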
504: 
505: Performance depends greatly on the complexity
506: of the sounds---that is, on the number of partials per sound
507: and the number of active controls for each partial.
508: Typically, the time to generate a two-channel sound file
509: for a 2'26" musical composition
510: with 236 sounds and 4939 partials
511: ranges from almost two hours on four processors
512: to about 10 minutes on 34 processors of the SP.
513: Figure~\ref{f-speedup} gives some indication of
514: the speedups one observes in a multiprocessing
515: environment.
516: The three graphs correspond to three variants
517: of the same 2'26" piece with different complexity.
518: The time $T_p$ refers to a computation
519: on $p+2$ processors ($p$ ``slaves'');
520: all times are approximate, as they were
521: extracted from data given by LoadLeveler,
522: which is not a very sophisticated timing instrument
523: for the SP.
524: Speedup is measured relative to the performance
525: on four processors (two compute nodes).
526: One observes the typical linear speedup
527: until saturation sets in.
528: The more complex the piece (the more partials),
529: the later saturation sets in.
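
In symbols, the speedup considered here is presumably $T_2 / T_p$,
and linear speedup means $T_2 / T_p \approx p / 2$.
As a rough consistency check based on the approximate
timings quoted above (about 120 minutes on four processors,
that is, $p = 2$, and about 10 minutes on 34 processors,
that is, $p = 32$),
\[
  \frac{T_2}{T_{32}} \approx \frac{120}{10} = 12 ,
  \qquad
  \frac{p}{2} = \frac{32}{2} = 16 ,
\]
which corresponds to a parallel efficiency of roughly
$12 / 16 = 0.75$ at 34 processors.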
530: %
531: \begin{center}
532: \begin{figure}[htbp]
533: \hspace*{-0.2in}
534: \centering{\psfig{file=cse-fig2.ps,width=3.0in}}
535: \caption{Timing results for DIASS on an IBM SP.
536: \label{f-speedup}
537: }
538: \end{figure}
539: \end{center}
540: %
541: \vspace*{-0.3in}
542: 
543: With a sampling rate of 44,100 samples per second,
544: 16-bit samples, and two-channel output, a sound file occupies
545: about 176~KB per second of sound,
546: so the sound file for the 2'26"~musical composition
547: takes close to 25.8~MB of storage.
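
The arithmetic behind these figures, assuming 16-bit (2-byte)
samples as stated earlier and decimal units
(1~KB = $10^3$ bytes, 1~MB = $10^6$ bytes), is
\[
  44{,}100 \times 2 \, \mbox{channels} \times 2 \, \mbox{bytes}
  = 176{,}400 \, \mbox{bytes per second} ,
\]
and, for a piece lasting 2'26" = 146 seconds,
\[
  176{,}400 \times 146 = 25{,}754{,}400 \, \mbox{bytes}
  \approx 25.8 \, \mbox{MB} .
\]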
548: 
549: \section{Data Sonification}
550: %
551: Sonification is the faithful rendition of data in sounds.
552: When the data come from scientific experiments---actual
553: physical experiments or computational experiments---
554: we speak of ``scientific sonification.''
555: Scientific sonification is therefore the analog
556: of scientific visualization,
557: where we deal with aural
558: instead of visual images.
559: Because sounds can convey significant amounts
560: of information, sonification has the potential
561: to increase the bandwidth of the human/computer
562: interface.
563: Yet, its use in scientific computing has received
564: limited attention.
565: One reason is, of course, that our sense of vision
566: seems much more dominant than our sense of hearing.
567: Another important reason is the lack of
568: a suitable instrument for scientific sonification.
569: One of the goals of our project is to demonstrate
570: that, with an instrument like DIASS, one can probe
571: multidimensional datasets with surgical precision
572: and uncover structures that may be hidden from the eye.
573: 
574: \subsection{Past Experiments}
575: %
576: An early experiment with scientific sonification
577: was done by Yeung~\cite{yeung}.
578: Seven chemical variables were matched with
579: seven variables of sound:
580: two with frequency, one each with loudness,
581: decay, direction, duration, and rest (silence
582: between sounds).
583: His test subjects (professional chemists) were
584: able to understand the different patterns
585: of sound representations and correctly classify
586: the chemicals with a 90\% accuracy rate before and
587: a 98\% accuracy rate after training.
588: His experiment showed that motivated expert users
589: can easily adapt to complex auditory displays.
590: 
591: Recently, a successful application of
592: scientific sonification was reported in
593: physics by Pereverzev et al.~\cite{nature}.
594: The authors were able to detect quantum oscillations
595: between two weakly coupled reservoirs
596: of superfluid ${}^3$He using sound,
597: where oscilloscope traces failed
598: to reveal structure.
599: 
600: Several other experiments reported in the literature
601: refer to situations where sounds are used in
602: combination with visual images for data analysis.
603: Bly~\cite{bly} ran discriminant analysis experiments
604: using sound and graphics to represent multivariate,
605: time-varying, and logarithmic data.
606: Mezrich et al.~\cite{mezrich} used sound and
607: dynamic graphics to represent
608: multivariable time series data.
609: The ``Exvis'' experiment at the
610: University of Massachusetts at Lowell~\cite{smith}
611: expanded this work by assigning sonic attributes
612: to visual icons.
613: The importance of sound localization is recognized
614: by ongoing work at NASA-Ames~\cite{wenzel}.
615: The evaluation of auditory display techniques
616: is reported extensively at the annual conferences of ICAD,
617: the International Conference on Auditory Display;
618: see~\cite{kramer}.
619: Sound as a component of the human/computer interface
620: is discussed in~\cite{buxton}.
621: 
622: Most of the attempts described above used MIDI-controlled
623: synthesizer sounds, which have drastic limitations
624: in the number and range of their control parameters.
625: Bargar et al.~\cite{bargar} at the National Center
626: for Supercomputing Applications (NCSA)
627: have developed a complex instrument
628: with interactive capabilities,
629: which includes the VSS sound server
630: for the CAVE virtual-reality environment.
631: 
632: \subsection{What We Have Done So Far}
633: %
634: Much of our work so far has been focused on
635: the development of DIASS~\cite{ICMC92,ICMC95}.
636: In addition, we have used DIASS for two preliminary
637: experiments in scientific sonification, one in chemistry,
638: the other in materials science.
639: 
640: The first experiment used data from Dr.~Jeff Tilson,
641: a computational chemist at ANL,
642: who studied the binding of a carbon atom
643: to a protonated thiophene molecule.
644: The data represented the difference in
645: the energy levels before and after the binding
646: at $128\times128\times128$ mesh points
647: of a regular computational grid in space.
648: Because the data were static,
649: we arbitrarily identified time with
650: one of the spatial coordinates
651: and sonified data in planes parallel to this axis.
652: The time to traverse a plane over its full length
653: was usually kept at 30 seconds.
654: In a typical experiment, we assigned a sound to
655: every other point in the vertical direction,
656: distributing the frequencies regularly over
657: a specified frequency range, and used the data in the
658: horizontal direction to generate amplitude envelopes
659: for each of the sounds.
660: Thus, a sound would become louder or softer
661: as the data increased or decreased, and
662: the evolution of the loudness distribution
663: within the ensemble of 64 sounds was an indicator
664: of the distribution of the energy difference
665: before and after the reaction in space.
666: The sound parameters chosen for the representation
667: of the data varied from one experiment to another.
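
One way to make this mapping explicit, with hypothetical
symbols and assuming that ``regularly'' means equally spaced
in frequency (equal spacing in pitch would be
an equally plausible reading), is the following.
Taking every other one of the 128 grid points in the vertical
direction gives 64 sounds, with frequencies
\[
  f_k = f_{\min} + (k - 1) \, \frac{f_{\max} - f_{\min}}{63} ,
  \quad k = 1, \ldots, 64 ,
\]
where $[f_{\min}, f_{\max}]$ is the specified frequency range.
If the 128 grid points in the horizontal direction
are traversed in 30 seconds, successive data values are
\[
  \frac{30 \, \mbox{s}}{127} \approx 0.24 \, \mbox{s}
\]
apart, and, if every data value is used as a breakpoint,
the amplitude envelope of each sound consists of 127 segments.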
668: 
669: The second experiment involved data from
670: a numerical simulation in materials science.
671: The scientists were interested in patterns of motion of
672: magnetic flux vortices through a superconducting medium.
673: The medium was represented by $384\times256$ mesh points
674: in a rectangular domain.
675: As the vortices are driven across the domain,
676: from left to right, by an external force,
677: they repel each other but are attracted by
678: regularly or randomly distributed defects
679: in the material.
680: In this experiment,
681: frequency and frequency modulation (vibrato)
682: were used to represent movement in the plane,
683: and changes in loudness were connected to
684: changes in the speed of a vortex.
685: A traveling window of constant width
686: was used to capture the motion of a number
687: of vortices simultaneously.
688: 
689: These investigations are ongoing,
690: and the results have not been subjected
691: to rigorous statistical evaluation.
692: They have merely served to demonstrate
693: the capabilities of DIASS and
694: explore various mappings from
695: the degrees of freedom in the data to
696: the parameters controlling the sound synthesis process.
697: Samples can be heard on the Web~\cite{web-sonification}.
698: 
699: \subsection{What We Have Found So Far}
700: %
701: General conclusions are that 
702: (i) the sounds produced in each experiment
703: conveyed information about
704: the qualitative nature of the data,
705: and (ii) DIASS is a flexible
706: and sophisticated tool capable of 
707: rendering subtle variations in the data.
708: 
709: Changes in some control variables,
710: such as time, frequency, and amplitude,
711: are immediately recognizable.
712: Changes in the combination of partials
713: in a sound, identifiable through its timbre,
714: can be recognized with some practice.
715: Some effects are enhanced by modifiers
716: like reverberation,
717: amplitude modulation (tremolo), and
718: frequency modulation (vibrato).
719: In some instances, a modifier may lump two,
720: three, or more degrees of freedom together,
721: like hall size, duration, and acoustic properties
722: in the case of reverberation.
723: Through the proper manipulation of reverberation,
724: loudness, and spectrum, one can create
725: the illusion of sounds being produced
726: at arbitrary locations in a room,
727: even with only two speakers.
728: 
729: Like the eye,
730: the ear has a very high power of discrimination.
731: Even a coarse grid,
732: such as the tempered tuning used in Western music,
733: includes about 100 identifiable discrete steps over the
734: frequency range encompassed by a piano keyboard.
735: Contemporary music, as well as some non-Western
736: traditional music, successfully uses smaller increments
737: of a quarter tone or less for a total of some 200 or more
738: identifiable steps in the audible range.
739: Equally discriminating power is available
740: in the realm of timbre.
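
These counts can be checked with a short calculation,
given here as an illustration.
In equal-tempered tuning, adjacent steps differ by
a frequency ratio of $2^{1/12}$, so the frequency $n$ steps
above a reference frequency $f_0$ (hypothetical notation) is
\[
  f_n = f_0 \, 2^{n/12} .
\]
A standard piano has 88 keys, which is of the order of
the 100 steps mentioned above.
The nominal audible range from 20 to 20,000~Hz spans
$\log_2 (20{,}000 / 20) \approx 10$ octaves, hence about
\[
  12 \times 10 = 120 \mbox{ semitones}
  \quad \mbox{or} \quad
  24 \times 10 = 240 \mbox{ quarter tones} ,
\]
consistent with the estimate of 200 or more identifiable steps.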
741: 
742: Sound is an obvious means to identify regularities
743: in the time domain, both at the microlevel
744: and on a larger scale,
745: and to bring out transitions between random states
746: and periodic happenings.
747: Most auditory processes are based on the
748: recognition of time patterns 
749: (periodic repetitions giving birth to pitch,
750: amplitude, or frequency modulation;
751: spectral consistency creating stable timbres
752: in a complex sound; etc.),
753: and the ear is highly attuned to detect
754: such regularities.
755: 
756: Most conceptual problems in scientific sonification
757: are related to finding suitable mappings between
758: the space of data and the space of sounds.
759: Common sense points toward letting the two domains
760: share the coordinates of physical space-time if
761: these are relevant and translating
762: other degrees of freedom in the data
763: into separate sound parameters.
764: On the other hand, it may be advantageous
765: to experiment with alternative mappings.
766: Sonification software must be sufficiently flexible
767: that a user can pair different sets of parameters
768: in the two domains.
769: 
770: Any mapping between data and sound parameters
771: must allow for redundancies to enable
772: the exploration of data at different levels
773: of complexity.
774: Similar to visualization software,
775: sonification software must have utilities
776: for zooming, modifying the audio palette,
777: switching between visual and aural representation
778: of parameters, defining time loops,
779: slowing down or speeding up, and so forth.
780: 
781: Our experiments also showed that DIASS,
782: at least in its present form, has its limitations.
783: One limitation concerns the sheer volume of data
784: in scientific sonification.
785: While the composition of a musical piece
786: (the original intent behind DIASS)
787: typically entails the handling of
788: a few thousand sounds,
789: each with a dozen or so partials,
790: the number of data points in the
791: computational chemistry experiment
792: ran into the millions,
793: a difference of several orders of magnitude.
794: By the same token, while a typical amplitude envelope
795: for a partial or sound in a musical composition
796: involves ten or even fewer segments,
797: both experiments required envelopes with
798: well over 100 such segments.
799: Another difficulty encountered was the fact
800: that both experiments required sounds
801: to be accurately located in space.
802: While panning is very effective in pinpointing the source
803: on a horizontal line, suggesting the height
804: of a sound is a major challenge.
805: We hope that additions to the software
806: as well as a contemplated eight-speaker system
807: will help us get closer to a realistic
808: three-dimensional representation of sounds.
809: Finally, to become an effective tool for
810: sonification, DIASS must operate in real time.
811: All three concerns are being addressed
812: in the new C++ version of DIASS currently
813: under development.
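
The envelope sizes mentioned above are easier to picture
with a concrete representation.
The Python sketch below assumes that an envelope is piecewise linear
and stored as a list of (time, value) breakpoints,
so that $n$ breakpoints define $n-1$ segments.
This representation and the function \texttt{envelope\_value}
are assumptions made for illustration, not a description of DIASS.
%
\begin{verbatim}
```python
from bisect import bisect_right

def envelope_value(breakpoints, t):
    """Evaluate a piecewise-linear envelope at time t.

    breakpoints is a list of (time, value) pairs with strictly increasing
    times.  Outside the covered interval the nearest end value is held.
    """
    times = [time for time, _ in breakpoints]
    if t <= times[0]:
        return breakpoints[0][1]
    if t >= times[-1]:
        return breakpoints[-1][1]
    i = bisect_right(times, t) - 1           # index of the segment containing t
    (t0, v0), (t1, v1) = breakpoints[i], breakpoints[i + 1]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

# A four-segment envelope of the kind a musical score might use.
musical_envelope = [(0.0, 0.0), (0.1, 1.0), (0.3, 0.6), (1.5, 0.6), (2.0, 0.0)]
print(envelope_value(musical_envelope, 0.05))
```
\end{verbatim}
%
The sketch rebuilds the list of times on every call to stay short.
With envelopes of well over 100 segments evaluated at every sample,
one would presumably precompute that list once.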
814: 
815: \section{Sound Visualization in a VR Environment}
816: %
817: The notion of sound visualization may at first sight
818: seem incongruous in the context of data sonification.
819: However, as has been recognized by several researchers,
820: the structure of a sound is difficult to detect
821: without proper training,
822: and any means of aiding the detection process
823: will enhance the value of data sonification.
824: Visualizing sounds is one of these means.
825: In this project we are focusing on
826: the visualization of sounds in the CAVE,
827: a room-size virtual-reality (VR) environment~\cite{cave},
828: and on the ImmersaDesk, a two-dimensional version.
829: 
830: \subsection{M4CAVE -- A Visualization Tool}
831: %
832: The software collectively known as M4CAVE
833: takes a score file from the sound synthesis program DIASS
834: and renders the sounds represented by the score
835: as visual images in a CAVE or ImmersaDesk.
836: The images are computed on the fly and are made
837: to correspond exactly to the sounds one hears through
838: a one-to-one mapping between control parameters
839: and visual attributes.
840: The code, which is written in C++,
841: uses OpenGL for visualizing objects.
842: 
843: \subsubsection{Graphical Representations}
844: %
845: Currently, M4CAVE can represent sounds
846: as a collection of spheres (or cubes or polyhedra),
847: as a cloud of confetti-like particles,
848: or as  a collection of planes.
849: 
850: The spheres representation is the most developed
851: and incorporates more parameters of a sound
into the visualization than either of the others.
853: Sounds are visualized as stacks of spheres,
854: each sphere corresponding to a partial in the sound.
855: The position of a sphere along the vertical axis
856: is determined by the frequency of the partial,
857: and its size is proportional to the amplitude.
858: A sound's position in the stereo field
859: determines the placement of the spheres
860: in the room.
861: The visual objects rotate or pulse
862: when tremolo or vibrato is applied,
863: and their color varies when
864: reverberation is present.
865: An optional grid in the background
866: shows the octaves divided into
867: twelve equal increments.
868: Figure~\ref{f-cave}---taken from
869: our Web site~\cite{web-sonification},
870: where more samples can be found---shows
871: a visualization of nine sounds
872: with different numbers of partials.
873: %
874: \begin{center}
875: \begin{figure}[htb]
876: \hspace*{0.0in}
877: \centering{\psfig{file=cse-fig3.ps,width=3.0in}}
878: \caption{Visualization of nine sounds.
879: (Picture taken from a CAVE simulator.)
880: \label{f-cave}
881: }
882: \end{figure}
883: \end{center}
884: %
885: 
886: \vspace*{-0.5in}
887: The plane and cloud representations were designed
888: more on the basis of artistic considerations.
889: (Remember that the purpose of the visualization
890: is to aid the perception of sounds.)
891: The strength of the cloud representation is
892: in showing tremolo and vibrato in the sound.
893: The planes representation is unique
894: in that it limits the visualization to only
895: one partial (usually the fundamental) of each sound.
896: The various representations can be combined,
897: and the mappings chosen for each representation
898: can be varied by means of a menu.
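
The mapping used by the spheres representation can be sketched in a few lines.
The Python fragment below is a simplified illustration and not the M4CAVE code,
which is written in C++ with OpenGL.
It assumes a logarithmic height scale (equal height per octave,
in keeping with the octave grid),
a radius directly proportional to amplitude,
and a stereo position given as a number between $-1$ and $1$;
the class \texttt{Sphere}, the function names,
and all scale constants are hypothetical.
Rotation, pulsing, and color changes are left out.
%
\begin{verbatim}
```python
import math
from dataclasses import dataclass

@dataclass
class Sphere:
    x: float       # left-right placement, from the sound's stereo position
    y: float       # height, from the frequency of the partial
    radius: float  # size, from the amplitude of the partial

def partial_to_sphere(freq_hz, amplitude, pan,
                      f_ref=27.5, octave_height=1.0,
                      radius_scale=0.5, room_half_width=1.5):
    """Map one partial to a sphere; every scale constant is illustrative."""
    y = octave_height * math.log2(freq_hz / f_ref)   # equal height per octave
    radius = radius_scale * amplitude                # proportional to amplitude
    x = room_half_width * pan                        # pan in [-1, 1]
    return Sphere(x=x, y=y, radius=radius)

def sound_to_stack(partials, pan):
    """A sound becomes a stack of spheres, one per (frequency, amplitude) pair."""
    return [partial_to_sphere(f, a, pan) for f, a in partials]

stack = sound_to_stack([(110.0, 1.0), (220.0, 0.5), (440.0, 0.25)], pan=-1.0)
for sphere in stack:
    print(sphere)
```
\end{verbatim}
%
Because all partials of a sound share its stereo position,
they line up in a vertical stack, as in the description above.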
899: 
900: \subsection{Preliminary Findings}
901: %
902: We have used M4CAVE to explore various mappings
903: from the sound domain to the visual domain.
904: Besides the obvious short score files to test
905: the implementation of these mappings,
906: we have used score files generated with DIASS
907: of various musical compositions, notably the
908: ``A.N.L.-folds'' of Tipei~\cite{folds}.
909: A.N.L.-folds is an example of a
910: {\em manifold composition},
911: described in Box~1.
912: Each member of A.N.L.-folds lasts exactly 2'26''
913: and comprises between 200 and 500 sounds of
914: medium to great complexity.
915: The picture of Fig.~\ref{f-cave} was taken
916: from a run of one of these A.N.L.-folds.
917: 
918: The combination of visual images and sounds
indeed provides an extremely powerful tool
920: for uncovering complicated structures.
921: Sometimes, the sounds reveal features
922: that are hidden to the eye;
923: at other times, the visual images illuminate
924: features that are not easily detectable in the sound.
925: The two modes of perception reinforce each other,
926: and both improve with practice.
927: 
928: \section{Larger Issues}
929: %
930: This project is unusual in several respects.
931: It is somewhat speculative, in the sense that
we do not have much experience with the use of
933: sound in scientific computing.
934: This is the main reason why the involvement of
935: someone expert in the intricacies of the sound world
936: is critical for its success.
937: In our case, the expertise comes from the realm
938: of music composition.
939: 
940: When do we declare ``success''?
941: Can we reasonably expect that sonification
942: will evolve to the same level of usefulness
943: as visualization for computational science?
The answers to these questions depend
945: on one's expectations.
946: Ours is a visually oriented culture
947: {\em par excellence}, and as a society
948: we watch rather than listen.
949: Contemporary musical culture is often reduced
950: to entertainment genres that use a simple-minded
vocabulary---no small impediment to discovering
952: the potential benefits of the world of sound.
953: But given unusual and unexpected sonorities,
954: we may yet discover that
955: we have not lost the ability to listen.
956: 
957: When we engage in this type of research,
958: it is easy to get swept up by unreasonable
959: expectations, looking for the ``killer application.''
960: But the killer application is a phantom,
961: not worth pursuing.
962: What we can offer is a systematic investigation
963: of the potential of a new tool.
964: If it helps us understand some computational data sets
965: a little better, or if it enables us
966: to explore these data sets more easily and in more detail,
967: we have good reason to claim success.
968: If the project adds to our understanding
969: of aural semiotics, we have even more reason
970: to claim success.
971: And if none of these successes materializes,
972: we can still claim that the people involved,
973: both scientists and musicians,
974: gained by becoming more familiar with
975: each other's work and ways of thinking.
976: Such a rapprochement has, in fact, already
977: occurred and led to a new ``Discovery'' course
978: entitled
979: {\em Music, Science, and Technology}
980: at UIUC, where some of the issues presented here
981: are being discussed in a formal educational context.
982: 
983: \section*{Acknowledgments}
984: %
985: This work was partially supported by the
986: Mathematical, Information, and Computational Sciences Division
987: subprogram of the Office of Computational and Technology Research,
988: U.S. Department of Energy, under Contract W-31-109-Eng-38.
989: 
990: \begin{thebibliography}{99}
991: 
992: \bibitem{buxton}
993: Baecker, R.~M., J.~Grudin, W.~Buxton, and S.~Greenberg,
994: \textsl{Readings in Human-Computer Interaction:
995: Toward the Year 2000},
second edition, Morgan Kaufmann Publ., Inc., San Francisco, 1995.
997: 
998: \bibitem{bargar}
999: Bargar, R., I.~Choi, S.~Das, and C.~Goudeseune,
1000: ``Model-based interactive sound for an immersive virtual environment,''
1001: \textsl{Proc.\ 1994 Int'l.\ Computer Music Conference}
1002: (Tokyo, Japan), pp.\ 471--477.
1003: 
1004: \bibitem{M4C}
1005: Beauchamp, J.,
1006: \textsl{Music 4C Introduction},
1007: Computer Music Project, School of Music,
1008: University of Illinois at Urbana-Champaign, 1993.
1009: URL: http://cmp-rs.music.uiuc.edu/cmp/software/m4c.html
1010: 
1011: \bibitem{bly}
1012: Bly, S.,
1013: \textsl{Sound and Computer Information Presentation},
1014: Ph.D.\ thesis, University of California -- Davis, 1982
(unpublished).
1016: 
1017: \bibitem{fletcher-munson}
1018: Fletcher, H.\ and W.~A.~Munson,
1019: ``Loudness, its definition, measurement, and calculation,''
{\em J.~Acoust.\ Soc.\ Am.} {\bf 5} (1933), 82.
1021: 
1022: \bibitem{MPI}
1023: Gropp, W., E.~Lusk, and A.~Skjellum,
1024: \textsl{Using MPI: Portable Parallel Programming with the
1025: Message-Passing Interface},
1026: MIT Press, 1994.
1027: See also URL:
1028: http://www.mcs.anl.gov/mpi/index.html
1029: 
1030: \bibitem{hiller-book}
1031: Hiller, L.\ and L.~Isaacson,
1032: \textsl{Experimental Music},
1033: McGraw-Hill, 1959;
reprinted by Greenwood Press, 1983.
1035: 
1036: \bibitem{hiller-quartet}
1037: Hiller, L.,
1038: \textsl{Computer Music Retrospective},
1039: Compact disc WER 60128-50,
1040: WERGO Schallplatten GmbH,
Mainz, Germany, 1989.
1042: 
1043: \bibitem{ISO}
1044: International Organization for Standardization (ISO),
1045: ``Acoustics -- Normal equal-loudness level contours,''
Publ.\ No.~226:1987.
1047: 
1048: \bibitem{ICMC95}
1049: Kaper, H.~G., D.~Ralley, J.~M.~Restrepo, and S.~Tipei,
1050: ``Additive synthesis with DIASS\_M4C
1051: on Argonne National Laboratory's IBM POWERparallel System (SP),''
1052: \textsl{Proc.\ 1995 Int'l.\ Computer Music Conference}
(Banff, Canada), pp.\ 351--352.
1054: 
1055: \bibitem{montreal}
1056: Kaper, H.~G., D.~Ralley, and S.~Tipei,
1057: ``Perceived equal loudness of complex tones:
1058: A software implementation for computer music composition,''
1059: \textsl{Proc.\ 1996 Int'l.\ Conference in Music Perception and Cognition}
(Montreal, Canada), pp.\ 127--132.
1061: 
1062: \bibitem{kramer}
1063: Kramer, G.\ (ed.),
1064: \textsl{Auditory Display: Sonification, Audification,
1065: and Auditory Interfaces},
1066: Proc.\ ICAD '92,
1067: Addison-Wesley Publ.\ Co., 1994.
1068: For proceedings of later conferences,
1069: consult URL: http://www.santafe.edu/\~{ }icad
1070: 
1071: \bibitem{ICMC92}
1072: Kriese, C.\ and S.~Tipei,
1073: ``A compositional approach to additive synthesis on supercomputers,''
1074: \textsl{Proc.\ 1992 Int'l.\ Computer Music Conference}
(San Jose, California), pp.\ 394--395.
1076: 
1077: \bibitem{mezrich}
1078: Mezrich, J., S.~Frysinger, and R.~Slivjanovski,
1079: ``Dynamic representation of multivariate time series data,''
{\em J.~Amer.\ Stat.\ Ass.} {\bf 79} (1984), 34--40.
1081: 
1082: \bibitem{nature}
1083: Pereverzev, S.~V., A.~Loshak, S.~Backhaus, J.~C.~Davis,
1084: and R.~E.~Packard,
1085: ``Quantum oscillations between two weakly coupled
1086: reservoirs of superfluid ${}^3$He,''
{\em Nature} {\bf 388} (1997), 449--451.
1088: 
1089: \bibitem{roads}
1090: Roads, C.,
1091: \textsl{The Computer Music Tutorial},
MIT Press, Cambridge, Mass., 1996.
1093: 
1094: \bibitem{roederer}
1095: Roederer, J.~G.,
1096: \textsl{The Physics and Psychophysics of Music},
3rd edition,
Springer-Verlag, 1995.
1099: 
1100: \bibitem{rossing}
1101: Rossing, T.~D.,
1102: \textsl{The Science of Sound},
Addison-Wesley Publ.\ Co., 1990.
1104: 
1105: \bibitem{ICMC}
1106: Simoni, M.\ (ed.),
1107: \textsl{Proc.\ ICMC98, Int'l Computer Music Conference}
1108: (Ann Arbor, Michigan), October 1998;
see also proceedings of earlier conferences.
1110: 
1111: \bibitem{smith}
1112: Smith, S.\ and M.~Williams,
1113: ``The use of sound in an exploratory visualization experiment,''
CS Dept., U.~Mass.\ at Lowell,
tech report R-89-002, 1989.
1116: 
1117: \bibitem{tipei}
1118: Tipei, S.,
1119: ``The computer: A composer's collaborator,''
{\em Leonardo\/} {\bf 22}(2) (1989), 189--195.
1121: 
1122: \bibitem{manif}
1123: Tipei, S.,
1124: ``Manifold compositions --- A (super)computer-assisted composition
1125: experiment in progress,''
1126: \textsl{Proc.\ 1989 Int'l.\ Computer Music Conference}
(Columbus, Ohio), pp.\ 324--327.
1128: 
1129: \bibitem{folds}
1130: Tipei, S.,
1131: ``A.N.L.-folds.\
1132: mani 1943-0000;
1133: mani 1985r-2101;
1134: mani 1943r-0101;
1135: mani 1996m-1001;
1136: mani 1996t-2001''
1137: (1996).
1138: Report ANL/MCS-P679-0897,
1139: Mathematics and Computer Science Division,
Argonne National Laboratory.
1141: 
1142: \bibitem{web-sonification}
1143: URL:
1144: http://mcs.anl.gov/appliedmath/Sonification/index.html
1145: 
1146: \bibitem{cave}
1147: URL:
1148: http://www.evl.uic.edu/pape/CAVE/prog/CAVEGuide.html
1149: 
1150: \bibitem{wenzel}
1151: Wenzel, E., S.~Fisher, P.~Stone, and S.~Foster,
1152: ``A system for three-dimensional acoustic `visualization'
1153: in a virtual environment workstation,''
1154: \textsl{Proc.\ Visualization '90:
1155: First IEEE Conf.\ on Visualization},
1156: IEEE Computer Society Press, Washington,
pp.\ 329--337.
1158: 
1159: \bibitem{xenakis}
1160: Xenakis, I.,
1161: \textsl{Formalized Music: Thought and Mathematics
1162: in Musical Composition},
revised edition, Pendragon Press, 1992.
1164: 
1165: \bibitem{yeung}
1166: Yeung, E.,
1167: ``Pattern recognition by audio representation
1168: of multivariate analytical data,''
\textit{Analytical Chemistry} {\bf 52} (1980), 1120--1123.
1170: 
1171: \bibitem{zwicker-80}
1172: Zwicker, E.\ and E.~Terhardt,
1173: ``Analytical expressions for critical-band rate
1174: and critical bandwidth as a function of frequency,''
{\em J.~Acoust.\ Soc.\ Am.} {\bf 68} (1980), 5.
1176: 
1177: \end{thebibliography}
1178: 
1179: \newpage
1180: 
1181: {\bf Hans G. Kaper}
1182: is Sr.\ Mathematician at Argonne National Laboratory.
1183: After receiving his Ph.D.\ in mathematics from the
1184: University of Groningen (the Netherlands) in 1965,
1185: he held positions at the University of Groningen
1186: and Stanford University.
1187: In 1969 he joined the staff of Argonne.
1188: He was director of the
1189: Mathematics and Computer Science Division
1190: from 1987 to 1991.
1191: Kaper's professional interests are in
1192: applied mathematics, particularly
1193: mathematics of physical systems and
1194: scientific computing.
1195: He is a corresponding member of the
1196: Royal Netherlands Academy of Sciences.
1197: His main interest outside mathematics is classical music.
1198: He is chairman of ``Arts at Argonne,''
1199: concert impresario, and an accomplished pianist.
1200: He can be reached at kaper@mcs.anl.gov or
1201: http://www.mcs.anl.gov/\~{ }kaper/index.html.
1202: 
1203: {\bf Sever Tipei}
1204: is professor of composition and music theory at the
1205: University of Illinois at Urbana-Champaign (UIUC),
1206: where he also manages the Computer Music Project
1207: of the UIUC Experimental Music Studios.
1208: He has a Diploma in piano 
1209: from the Bucharest Conservatory in Romania
1210: and a DMA in composition from the University of Michigan.
1211: Tipei has been involved in computer music since 1973
1212: and regards the composition of music both as an experimental 
1213: and as a speculative endeavor.
1214: He can be reached at s-tipei@uiuc.edu or
1215: http://cmp-rs.music.uiuc.edu/people/tipei/index.html.
1216: 
1217: {\bf Elizabeth Wiebel}
1218: was a participant in the
1219: Student Research Participation Program,
1220: which is sponsored by the
1221: Division of Educational Programs
1222: of Argonne National Laboratory.
1223: She is an undergraduate student
1224: at St.\ Norbert College in De Pere, Wisconsin,
1225: where she is pursuing a degree in
1226: mathematics and computer science.
1227: In addition, Wiebel studies and teaches piano
1228: through the St.\ Norbert College music department.
1229: She is currently spending a semester
1230: at Richmond College in London (England).
1231: She can be reached at wiebep@sncac.snc.edu,
1232: F300398@Richmond.ac.uk,
1233: or
1234: http://members.tripod.com/\~{ }LibbyW.
1235: 
1236: \newpage
1237: 
1238: \section*{Box~1. \quad Computer-Assisted Music Composition}
1239: %
1240: The idea of using computers for music composition
1241: goes back to the 1950s, when Lejaren Hiller performed
1242: his experiments at the University of Illinois~\cite{hiller-book}.
1243: The premiere of his Quartet No.\ 4 for strings
1244: ``Illiac Suite''~\cite{hiller-quartet}
1245: (May 1957) is generally regarded as
1246: the birth of computer music.
1247: Since then, computers have helped many composers
1248: to algorithmically synthesize new sounds and
1249: produce new pieces for acoustic as well as
1250: digital instruments.
1251: The proceedings of the annual conferences
1252: sponsored by the ICMA
1253: (International Computer Music Association)
1254: are good sources of references~\cite{ICMC}.
1255: 
1256: Why would a composer need computer assistance 
1257: when composing?
1258: A quick answer is that, as in many other areas,
1259: routine operations can be relegated to the machine.
A more sophisticated reason is that the composer
may rely on expert systems to write Bach-like chorales
1262: or imitate the mannerisms of Chopin or Rachmaninov.  
1263: There are, however, more compelling reasons
1264: when composing is viewed as a speculative 
1265: and experimental endeavor, rather than as
1266: an ability to manufacture pleasing sounds~\cite{tipei}.
1267: 
1268: Music is basically a dynamic event evolving
1269: in a multidimensional space;
1270: as such, it can be formalized~\cite{xenakis}.
1271: The composer controls the evolution by supplying
1272: a set of rules, and accepts the output as long as
1273: it is consistent with the logic of the program
1274: and the input data.
1275: If the set of rules allows for a certain degree
1276: of randomness, the output will be different
1277: every time a new ``seed'' is introduced.
1278: The same code and input data may thus produce
1279: an unlimited number of compositions,
1280: all belonging to the same ``equivalence class''
1281: or {\em manifold composition}~\cite{manif}.
1282: The members of a manifold composition are variants
1283: of the same piece; they share the same structure
1284: and are the result of the same process, but differ
1285: in the way specific events are arranged in time.
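
The role of the seed can be illustrated with a toy generator.
The Python sketch below is not the software used for the manifold compositions;
the rule set (uniformly random onsets and pitches drawn from a fixed set
of MIDI note numbers) and the function \texttt{compose}
are hypothetical and serve only to show that
the same code and input data yield a different variant for each seed.
%
\begin{verbatim}
```python
import random

def compose(seed, n_events=8, duration=10.0, pitches=(60, 62, 64, 67, 69)):
    """Generate one variant: same rules and input data, seed-dependent timing."""
    rng = random.Random(seed)        # private generator; the seed picks the variant
    events = []
    for _ in range(n_events):
        onset = rng.uniform(0.0, duration)   # when the event starts, in seconds
        pitch = rng.choice(pitches)          # which pitch it uses
        events.append((round(onset, 3), pitch))
    return sorted(events)

variant_a = compose(seed=1)
variant_b = compose(seed=2)
print(variant_a)
print(variant_b)
```
\end{verbatim}
%
Both variants have the same number of events, the same duration,
and the same pitch material,
but the events fall at different times.
Running \texttt{compose} again with the same seed reproduces a variant exactly.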
1286: 
1287: A nontraditional way of composing,
the manifolds show how high-performance computing
1289: provides the composer with new means
1290: to try out compositional strategies
1291: or materials and hear the results
1292: in a reasonable amount of time.  
1293: 
1294: \newpage
1295: 
1296: \section*{Box~2. \quad Loudness}
1297: %
1298: Sound is transmitted through sound waves---periodic
1299: pressure variations that cause the eardrums to vibrate.
1300: But the perception of loudness has as much to do with
1301: the amount of energy that is carried by the sound wave
1302: as with the processing of this energy that takes place
1303: in the ear and the brain once the sound wave has hit
1304: the eardrums.
1305: The latter is a much more subjective part of the experience.
The algorithms underlying the loudness routines
of DIASS therefore incorporate formal definitions
as well as results of psychoacoustic research experiments.
1309: We summarize the most relevant elements of the algorithm,
1310: referring the reader to~\cite{roederer} or~\cite{rossing}
1311: for details.
1312: 
1313: The definition of (perceived) loudness begins
1314: with the consideration of the energy carried
1315: by the sound wave.
1316: The {\em intensity\/} $I$ of a pure tone (sinusoidal sound)
1317: is expressed in terms of its average pressure variation
1318: $\Delta p$ (measured in newton/m$^2$),
1319: \[
1320:   I = 20 \times \log_{10} (\Delta p / \Delta p_0) .
1321: \]
1322: $\Delta p_0$ is a reference value,
1323: usually identified with a traveling wave
1324: of 1,000~Hz at the threshold of hearing,
1325: $\Delta p_0 = 2 \times 10^{-5}$~newton/m$^2$.
1326: The unit of $I$ is the decibel (dB).
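As a worked illustration with an arbitrarily chosen value
(not one taken from the experiments),
a pure tone with $\Delta p = 2 \times 10^{-2}$~newton/m$^2$ has
\[
  I = 20 \times \log_{10} \left( \frac{2 \times 10^{-2}}{2 \times 10^{-5}} \right)
    = 20 \times \log_{10} 10^3
    = 60~\mbox{dB} .
\]
Every tenfold increase in $\Delta p$ therefore adds 20~dB to $I$.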
1327: 
1328: Because of the way acoustical vibrations are processed 
1329: in the cochlea (the internal ear),
1330: the sensation of loudness is strongly frequency dependent.
1331: For instance, while an intensity of 50~dB at 1,000~Hz is considered
1332: {\em piano}, the same intensity is barely audible at 60~Hz.
1333: In other words, to produce a given loudness sensation
1334: at low frequencies, a much higher intensity (energy flow)
1335: is needed than at 1,000~Hz.
1336: The intensity $I$ is therefore not a good measure of loudness
1337: if different frequencies are involved.
1338: 
1339: In the 1930s, Fletcher and Munson~\cite{fletcher-munson}
1340: performed a series of loudness-matching experiments,
1341: from which they derived a set of
1342: {\em curves of equal loudness}.
1343: These are curves in the
1344: frequency ($f$) vs.\ intensity ($I$) plane;
1345: points on the same curve represent
1346: single continuously sounding pure tones
1347: that are perceived as being ``equally loud.''
1348: They are similar to those recommended
1349: by the International Organization for
1350: Standardization (ISO)~\cite{ISO}
1351: and are presented in Fig.~\ref{f-loudness}.
1352: The curves show clearly that,
1353: in order to be perceived as equally loud,
1354: very low and very high frequencies require
1355: much higher intensities (energy)
1356: than frequencies in the middle range
1357: of the spectrum of audible sounds.
1358: %
1359: \begin{figure}[htb]
1360: \hspace*{0.0in}
1361: \centering{\psfig{file=cse-fig4.ps,width=3.0in}}
1362: \caption{
1363: Curves of equal loudness (marked in phons)
1364: in the frequency vs.\ intensity plane.
1365: \label{f-loudness}
1366: }
1367: \end{figure}
1368: %
1369: 
1370: The (physical) {\em loudness level\/} $L_p$
1371: of a Fletcher-Munson curve is identified
1372: with the value of $I$
1373: at the reference frequency of 1,000~Hz.
1374: The unit of $L_p$ is the phon.
1375: The Fletcher-Munson curves range from a loudness
1376: level of 0 to 120~phons over a frequency
1377: range from 25 to 16,000~Hz.
1378: 
1379: The loudness level $L_p$ still does not measure loudness
1380: in an absolute manner: a tone whose $L_p$ is twice
1381: as large does not sound twice as loud.
1382: Following Rossing~\cite{rossing},
1383: we define the (subjective) {\em loudness level\/} $L_s$
1384: in terms of $L_p$ by the formula
1385: $L_s = 2^{(L_p-40)/10}$.
The unit of $L_s$ is the sone.
1387: To be effective, loudness scaling must be done
1388: on the basis of sones.
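For instance, a loudness level of $L_p = 90$~phons corresponds to
\[
  L_s = 2^{(90-40)/10} = 2^5 = 32~\mbox{sones} ,
\]
whereas $L_p = 80$~phons gives $L_s = 2^4 = 16$~sones:
an increase of 10~phons doubles the subjective loudness.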
1389: 
1390: The loudness of a sound that is composed of
1391: several partials depends on how well the
1392: frequencies of the partials are separated.
1393: With each frequency $f$ is associated
1394: a {\em critical band}, whose width $\Delta f$
is approximately given by the expression~\cite{zwicker-80}
1396: \[
1397:   \Delta f \approx 25 + 75 \left(1 + 1.4(f/1000)^2 \right)^{0.69} .
1398: \]
1399: Intensities within a critical band are added,
1400: and the loudness of a critical band can again
1401: be read off from the Fletcher-Munson tables.
1402: If the frequencies of its constituent partials
1403: are spread over several critical bands,
1404: the loudness of a sound is computed
1405: in accordance with a formula
1406: due to Rossing~\cite{rossing},
1407: \[
1408:   L_s = L_{s,m} + 0.3 \sum_i L_{s,i} .
1409: \]
1410: Here, $L_{s,m}$ is the loudness of the loudest critical band,
1411: and the sum extends over the remaining bands.
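
The two formulas above translate directly into code.
The Python sketch below is not the DIASS loudness routine;
the function names are hypothetical,
and the step that turns the summed intensity of a critical band
into a loudness in sones is omitted because it requires the Fletcher-Munson table,
which is not reproduced here.
The per-band loudness values are therefore taken as given.
%
\begin{verbatim}
```python
def critical_bandwidth(f_hz):
    """Approximate width (Hz) of the critical band at frequency f_hz."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69

def total_loudness(band_loudness_sones):
    """Combine per-band loudness values (in sones).

    The loudest band counts in full; every other band contributes 0.3 times
    its own loudness.
    """
    if not band_loudness_sones:
        return 0.0
    loudest = max(band_loudness_sones)
    return loudest + 0.3 * (sum(band_loudness_sones) - loudest)

print(critical_bandwidth(1000.0))
print(total_loudness([8.0, 4.0, 2.0]))
```
\end{verbatim}
%
At 1,000~Hz the critical band is about 162~Hz wide.
Three bands of 8, 4, and 2~sones combine to
$8 + 0.3 \times (4 + 2) = 9.8$~sones,
well below the plain sum of 14~sones.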
1412: 
1413: The loudness routines in DIASS use
1414: critical band information and
1415: a table derived from the Fletcher-Munson curves
1416: to create complex sounds of specified loudness.
1417: 
1418: \newpage
1419: 
1420: \section*{Box~3. \quad Loudness of Sound Clusters}
1421: 
1422: The waveform of Fig.~\ref{f-clusters},
1423: which was produced with DIASS,
1424: illustrates the concept of equal loudness
1425: across the frequency spectrum and for
1426: different timbres.
1427: The waveform represents five sound clusters,
1428: each lasting 5.5 seconds (except the fourth,
1429: which lasts 5.7 seconds).
1430: The clusters, although of widely different structure,
1431: have been designed to be perceived at the same
1432: loudness level ($2^5$ sones).
1433: %
1434: \begin{figure}[htb]
1435: \hspace*{0.0in}
1436: \centering{\psfig{file=cse-fig5.ps,width=4.0in}}
1437: \caption{
1438: Waveform of five sound clusters of equal perceived loudness.
1439: \label{f-clusters}
1440: }
1441: \end{figure}
1442: %
1443: 
1444: The distribution of the sounds within each cluster
1445: is represented schematically in the diagram of
1446: % Table~2.
1447: Table~\ref{t-clusters}.
1448: The first sound cluster has 24 sounds.
1449: The fundamental frequencies of the sounds
1450: range from 40 to 5,000 Hz.
1451: Each sound is harmonically tuned;
1452: that is, it is made up of a fundamental
1453: and all its harmonics
1454: (partials whose frequencies
1455: are integer multiples of
1456: the fundamental frequency).
1457: The frequencies are limited to
1458: one-half of the sampling rate
1459: (Nyquist criterion);
1460: hence, the number of partials
1461: in this cluster is 754
1462: (at a sampling rate of 22,050 Hz).
1463: The second sound cluster has 5 sounds,
1464: harmonically tuned, with fundamental frequencies
1465: ranging from 40 to 4,000 Hz;
1466: the number of partials is 113.
The third, fourth, and fifth clusters have
1468: 15, 1, and 10 sounds, with 453, 60, and 250 partials,
1469: respectively.
1470: All partials are assigned the same amplitude,
1471: which presents the worst-case scenario
1472: when one tries to obtain the same
1473: perceived loudness for all clusters.
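
The ceiling that the Nyquist criterion places on the harmonics of a single sound
is easy to compute.
The Python sketch below uses hypothetical function names,
and its \texttt{limit} argument is an assumed optional cap
on the number of partials.
It gives only the upper bound set by the criterion
and does not attempt to reproduce the partial counts
of the clusters quoted above.
%
\begin{verbatim}
```python
def max_harmonics(fundamental_hz, sampling_rate_hz):
    """Largest n for which n * fundamental_hz does not exceed the Nyquist frequency."""
    nyquist_hz = sampling_rate_hz / 2.0
    return int(nyquist_hz // fundamental_hz)

def harmonic_frequencies(fundamental_hz, sampling_rate_hz, limit=None):
    """Frequencies of the harmonics kept for one sound.

    limit is a hypothetical optional cap on the number of partials.
    """
    n = max_harmonics(fundamental_hz, sampling_rate_hz)
    if limit is not None:
        n = min(n, limit)
    return [k * fundamental_hz for k in range(1, n + 1)]

print(harmonic_frequencies(5000.0, 22050.0))
print(max_harmonics(40.0, 22050.0))
```
\end{verbatim}
%
At a sampling rate of 22,050~Hz the Nyquist frequency is 11,025~Hz,
so a sound with a 5,000~Hz fundamental can carry only
the fundamental and its second harmonic at 10,000~Hz,
while a 40~Hz fundamental admits up to 275 harmonics.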
1474: 
1475: \begin{center}
1476: \begin{table}[h]
1477: \caption{Distribution of fundamentals
in the clusters of Fig.~\protect\ref{f-clusters}.  \label{t-clusters}}
1479: \vspace*{1ex}
1480: \begin{footnotesize}
1481: \hspace*{-3em}
1482: \begin{tabular}{||r| c c c c c||}\hline
1483: \multicolumn{1}{||c |}{Fundamental}
1484:          & 24 Sounds   & \hspace{-2em}5 Sounds      & \hspace{-2em}15 Sounds
1485:  & \hspace{-2em}1 Sound       & \hspace{-2em}10 Sounds \\
1486: \multicolumn{1}{||c |}{Frequency}
1487:          &\hspace{-2em}(754 partials) &\hspace{-2em}(113 partials) &\hspace{-2em}(453 partials) &\hspace{-2em}(60 partials)  &\hspace{-2em}(250 partials) \\\hline
1488:          &               &               &               &               & \\
1489: 5,000 Hz & \Yes          &               &               &               & \\
1490: 4,500 Hz & \Yes          &               &               &               & \Yesm\\
1491: 4,000 Hz & \Yes          & \Yesm         & \Yesm         &               & \\
1492: 3,000 Hz & \Yes          &               & \Yesm         &               & \\
1493: 2,666 Hz & \Yes          &               & \Yesm         &               & \\
1494: 2,000 Hz & \Yes          &               &               &               & \Yesm\\
1495: 1,666 Hz & \Yes          & \Yesm         & \Yesm         &               & \\
1496: 1,333 Hz & \Yes          &               &               &               & \Yesm\\
1497: 1,000 Hz & \Yes          &               & \Yesm         &               & \Yesm\\
1498: 750 Hz   & \Yes          & \Yesm         & \Yesm         &               & \\
1499: 625 Hz   & \Yes          &               &               &               & \Yesm\\
1500: 500 Hz   & \Yes          &               & \Yesm         &               & \\
1501: 400 Hz   & \Yes          &               &               &               & \Yesm\\
1502: 300 Hz   & \Yes          & \Yesm         & \Yesm         &               & \\
1503: 200 Hz   & \Yes          &               & \Yesm         &               & \\
1504: 165 Hz   & \Yes          &               &               &               & \Yesm\\
1505: 130 Hz   & \Yes          &               & \Yesm         &               & \\
1506: 90 Hz    & \Yes          &               & \Yesm         &               & \\
1507: 80 Hz    & \Yes          &               &               &               & \Yesm\\
1508: 70 Hz    & \Yes          &               &               &               & \Yesm\\
1509: 60 Hz    & \Yes          &               & \Yesm         &               & \\
1510: 53 Hz    & \Yes          &               & \Yesm         &               & \\
1511: 46 Hz    & \Yes          &               & \Yesm         &               & \\
1512: 40 Hz    & \Yes          & \Yesm         & \Yesm         & \Yesm         & \Yesm\\
1513:          &               &               &               &               & \\\hline
1514: Time     &
1515: \multicolumn{1}{l}{0.0"}&
1516: \multicolumn{1}{l}{\hspace{-1em}5.5"}&
1517: \multicolumn{1}{l}{\hspace{-2em}11.0"}&
1518: \multicolumn{1}{l}{\hspace{-1em}16.5"}&
1519: \multicolumn{1}{l||}{\hspace{-2em}22.2"}\\\hline
1520: \end{tabular}
1521: \end{footnotesize}
1522: \end{table}
1523: \end{center}
1524: %
1525: 
1526: \newpage
1527: 
1528: \section*{Box~4. \quad Computational Complexity}
1529: 
1530: To give some idea of the computational complexity,
1531: consider the following simple scenario,
1532: where we wish to sonify time-varying data
1533: representing the values of two primary and
1534: several secondary observables measured
1535: over the course of an experiment.
1536: A natural choice is to map the primary observables
1537: onto loudness and frequency and to use
1538: amplitude and frequency modulation to monitor
1539: the secondary observables.
1540: The sample values of the sound wave $S$ must be
1541: calculated from an expression of the form
1542: \be
1543:   S (t) = a (t) \sin \left(2\pi f(t) t + \phi \right) .
1544:   \Label{S}
1545: \ee
1546: The frequency $f$ represents three degrees of freedom:
1547: the carrier frequency $f^C$, and the amplitude
1548: $a^{FM}$ and frequency $f^{FM}$ of the modulating wave,
1549: \be
1550:   f (t)
1551:   =
1552:   f^C (t)
1553:   + a^{FM} (t) \sin \left(2\pi f^{FM} t + \phi^{FM} \right) .
1554:   \Label{f}
1555: \ee
The carrier frequency is identified with a primary observable;
each of the remaining two degrees of freedom can be identified
with a secondary observable.
1559: 
1560: Similarly, the amplitude $a$ is given by an expression
1561: of the form
1562: \be
1563:   a (t)
1564:   =
1565:   a^C (t)
1566:   + a^{AM} (t) \sin \left(2\pi f^{AM} t + \phi^{AM} \right) .
1567:   \Label{a}
1568: \ee
1569: We compute the carrier amplitude $a^C$
1570: from the observed loudness, which is identified with
1571: one of the (primary) observables, so its value is given.
1572: The amplitude $a^{AM}$ and frequency $f^{AM}$ of the modulation
1573: represent two more degrees of freedom,
1574: which can be identified with
1575: two other secondary observables.
In total, we therefore have two primary and four secondary variables
1577: (not counting the phases, which we assume to be static).
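
As a minimal sketch of how Eqs.~(\ref{S}), (\ref{f}), and (\ref{a})
combine into a sample value, the following Python fragment evaluates
them literally for one partial.
The function and parameter names are illustrative assumptions and are
not taken from DIASS; the carrier and modulation parameters are passed
in as their values at time $t$, and the static phases default to zero.

```python
import math

SAMPLE_RATE = 44100  # samples per second, the standard rate quoted in the text


def sample_value(t, a_c, f_c, a_am, f_am, a_fm, f_fm,
                 phi=0.0, phi_am=0.0, phi_fm=0.0):
    """Return S(t) for one partial, following Eqs. (S), (f), and (a) as written."""
    # Eq. (f): carrier frequency plus sinusoidal frequency modulation
    f = f_c + a_fm * math.sin(2.0 * math.pi * f_fm * t + phi_fm)
    # Eq. (a): carrier amplitude plus sinusoidal amplitude modulation
    a = a_c + a_am * math.sin(2.0 * math.pi * f_am * t + phi_am)
    # Eq. (S): the sample value of the sound wave
    return a * math.sin(2.0 * math.pi * f * t + phi)


def synthesize(duration, observables):
    """Sample S(t) at SAMPLE_RATE for `duration` seconds.

    `observables` is a hypothetical callback mapping a time t to a dict
    with the keys a_c, f_c, a_am, f_am, a_fm, f_fm (the two primary and
    four secondary variables evaluated at t).
    """
    n = int(round(duration * SAMPLE_RATE))
    samples = []
    for k in range(n):
        t = k / SAMPLE_RATE
        samples.append(sample_value(t, **observables(t)))
    return samples
```

Calling \verb|synthesize| with constant parameters for $0.01$ seconds
yields 441 sample values; for nonnegative $a^C$ and $a^{AM}$, each is
bounded in magnitude by $a^C + a^{AM}$.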
1578: 
1579: The amplitude $a^C (t)$ must be computed such that
1580: $S (t)$ has the perceived loudness level $L_s (t)$,
1581: \be
1582:   L_s (S(t)) = L_s (t) .
1583:   \Label{L}
1584: \ee
1585: The loudness function $L_s$ is a nonlinear function of
1586: the amplitude and frequency of the partial (sound).
1587: Its computation is done in the loudness routines of DIASS
1588: and involves a significant number of operations,
1589: including table lookups; see Box~2.
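
To illustrate why solving Eq.~(\ref{L}) for $a^C$ involves both
function evaluations and a table lookup, here is a toy sketch in Python.
It is emphatically not the DIASS loudness routine: the level model
(a logarithm of the amplitude plus a frequency-dependent offset) and
the tabulated offsets are assumptions made up for this example, and
all names are hypothetical.

```python
import bisect
import math

# Hypothetical frequency-dependent level offsets in dB. Illustrative values
# only; they are not taken from DIASS or from any published loudness standard.
FREQS_HZ = [100.0, 1000.0, 10000.0]
OFFSETS_DB = [-20.0, 0.0, -5.0]


def offset_db(freq):
    """Table lookup, linearly interpolated in log-frequency, clamped at the ends."""
    if freq <= FREQS_HZ[0]:
        return OFFSETS_DB[0]
    if freq >= FREQS_HZ[-1]:
        return OFFSETS_DB[-1]
    i = bisect.bisect_right(FREQS_HZ, freq) - 1
    w = math.log(freq / FREQS_HZ[i]) / math.log(FREQS_HZ[i + 1] / FREQS_HZ[i])
    return OFFSETS_DB[i] + w * (OFFSETS_DB[i + 1] - OFFSETS_DB[i])


def toy_level_db(amplitude, freq):
    """Toy stand-in for the loudness function: nonlinear in amplitude and frequency."""
    return 20.0 * math.log10(amplitude) + offset_db(freq)


def carrier_amplitude(target_db, freq):
    """Invert toy_level_db: the amplitude that gives target_db at frequency freq."""
    return 10.0 ** ((target_db - offset_db(freq)) / 20.0)
```

Even in this simplified model, each sample requires a table lookup and
at least one exponential; a realistic loudness routine can only add to
that cost.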
1590: 
1591: On the basis of these formulas we can obtain
1592: a rough estimate of the number of operations
(additions, multiplications, table lookups,
and evaluations of functions such as
sine, exponential, or logarithm)
1597: required for the computation of a single sample value.
1598: The contribution that is most difficult to estimate is
1599: the computation of the carrier amplitude from
1600: the loudness;
1601: the data in Table~\ref{t-ops}
1602: represent the minimum number of operations.
1603: %
1604: \begin{table}[htb]
1605: \begin{center}
1606: \caption{Number of operations per partial per sample value.  \label{t-ops} }
1607: \vspace*{2ex}
1608: \begin{small}
1609: \begin{tabular}{|| c || c | c | c | c ||}\hline
Eq.       & Additions & Multiplications & Function evaluations & Table lookups \\\hline
(\ref{S}) & 1    & 3     & 1        & --          \\
(\ref{f}) & 2    & 3     & 1        & --          \\
(\ref{a}) & 2    & 3     & 1        & --          \\
(\ref{L}) & 1    & 3     & 2        & 1           \\\hline
Total     & 6    & 12    & 5        & 1           \\\hline
1616: \end{tabular}
1617: \end{small}
1618: \end{center}
1619: \end{table}
1620: %
Ignoring phases and so forth, we find a total of
at least 24 operations per sample value for each partial.
Hence, at the standard rate of 44,100 samples per second,
one needs to perform more than
one million operations per second per partial.
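
Spelled out for a single partial, with the column totals of
Table~\ref{t-ops}, the arithmetic is
\[
  (6 + 12 + 5 + 1) \times 44{,}100
  = 24 \times 44{,}100
  = 1{,}058{,}400
\]
operations per second.
If one assumes (this is not stated above) that the cost is simply
additive over partials, a sound built from $N$ partials requires
at least $1{,}058{,}400\,N$ operations per second.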
1626: 
1627: The simultaneous sonification of more observables
1628: is obviously much more complicated;
1629: in fact, the complications grow exponentially.
1630: A careful estimate of the computational complexity
1631: requires an analysis of the anticlip routines,
1632: which is beyond the scope of the present article.
1633: 
1634: \end{document}
1635: