
Last Updated November 2008
This page lists all the papers produced by our group, starting with the most recent, with links to each available publication.
By Current Group Members


Vercoe, B. (2004). "Audio-Pro with Multiple DSPs and Dynamic Load Distribution" in British Telecom Technology Journal, (to appear October 2004).

Vercoe, B.L., Haidar, M., Kitamura, H., and Jayakumar, S. (2003). "Multiprocessor Csound: Audio-Pro with Multiple DSP's and Dynamic Load Distribution" in Proceedings, Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas.

Chai, Wei and Vercoe, Barry. (2001). "Folk Music Classification Using Hidden Markov Models" in Proceedings, International Conference on Artificial Intelligence, June 2001.

Vercoe, B.L. (2000). "Understanding Csound's Spectral Data Types" in Boulanger, R.C. (Ed.), The Csound Book, Cambridge, MA: MIT Press, pp. 437-447.

Vercoe, B.L. (2000). "Foreword" in Boulanger, R.C. (Ed.), The Csound Book, Cambridge, MA: MIT Press, pp. xxvii-xxx.

Scheirer, E. D. and Vercoe, B.L. (1999). "SAOL: The MPEG-4 Structured Audio Orchestra Language" in Computer Music Journal 23:2 (Summer 1999), pp 31-51.

Scheirer, E.D., and Vercoe, B.L. (1998). "The MPEG-4 Structured Audio Orchestra Language". Proc. 1998 Int. Computer Music Conf, Ann Arbor, MI, Oct 1998.

Martin, K.D., Scheirer, E.D., and Vercoe, B.L. (1998). "Music content analysis through models of audition" in Proceedings, 1998 ACM Multimedia Workshop on Content Processing of Music for Multimedia Applications, Bristol UK, (Sept. 1998).

Vercoe, B.L., Gardner, W.G., and Scheirer, E.D. (1998). "Structured Audio: Creation, Transmission, and Rendering of Parametric Sound Representations" in Proceedings of the IEEE 86:5 (May 1998), pp. 922-940.

Vercoe, B.L. (1997). "Computational Auditory Pathways to Music Understanding" in Deliege I. and Sloboda J. (Eds.), Perception and Cognition of Music, East Sussex, UK: Psychology Press, pp. 307-326.

Vercoe, B.L. (1996). "Extended Csound" in Proceedings, ICMC, Hong Kong, pp. 141-142.

Vercoe, B.L. (1996). "Extended Csound: A Manual for the Audio-Processing System and DSP Realtime Extensions", Analog Devices Inc., Norwood MA.

Vercoe, B.L. (1994). "Csound: A Manual for the Audio-Processing System", MIT Media Lab.

Ellis, D.P.W., and Vercoe, B.L. (1992). "A perceptual representation of sound for auditory signal separation" in Proceedings, Acoustical Society of America, Salt Lake City, May 1992.

Vercoe, B.L., (1990). "Synthetic Listeners and Synthetic Performers" in Proceedings, International Symposium on Multimedia Technology and Artificial Intelligence (Computerworld 90), Kobe Japan, Nov 1990, pp. 136-141.

Vercoe, B.L., (1990). "A Realtime Auditory Model of Rhythm Perception and Cognition" 2nd International Conference on Music and the Cognitive Sciences, Cambridge UK, Sept 1990.

Vercoe, B.L., and Ellis, D.P.W. (1990). "Real-time Csound: Software Synthesis with Sensing and Control" in Proceedings, ICMC, Glasgow, pp. 209-211.

Vercoe, B.L. (1988). "Hearing Polyphonic Music with the Connection Machine" in Proceedings, First Workshop on Artificial Intelligence and Music, AAAI-88, St. Paul, MN, pp. 183-194.

Vercoe, B.L., and Puckette, M.S. (1985). "Synthetic Rehearsal: Training the Synthetic Performer" in Proceedings, ICMC, Burnaby, BC, Canada, pp. 275-278.

Vercoe, B.L. (1984). "The Synthetic Performer in the Context of Live Performance" in Proceedings, International Computer Music Conference, Paris, pp. 199-200.

Vercoe, B.L. (1983). "Computer Systems and Languages for Audio Research" The New World of Digital Audio (Audio Engineering Society Special Edition), pp. 245-250.

Vercoe, B. (1982). "New Dimensions in Computer Music" Trends and Perspectives in Signal Processing II/2 (April, 1982), pp. 15-23.



Brown, J.C. (2008). "Mathematics of pulsed vocalizations with application to killer whale biphonation", J. Acoust. Soc. Am. 123, 2875-2883.

Brown, J.C., and P.J.O. Miller (2007). "Automatic classification of killer whale vocalizations using dynamic time warping", J. Acoust. Soc. Am. 122, 1201-1207.

Brown, Judith C. (2006). "Fundamental Frequency Tracking and Applications to Musical Signal Analysis", in Analysis, Synthesis, and Perception of Musical Sounds: The Sound of Music (Modern Acoustics and Signal Processing Series), edited by J.W. Beauchamp (Springer, New York), 90-121.

Ellis, D.P.W., B. Raj, J.C. Brown, M. Slaney, P. Smaragdis (2006). "Editorial for Special Section on Statistical and Perceptual Audio Processing," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 14, No. 1, pp. 2-4.

Brown, J.C. and P.J.O. Miller (2006). "Classifying Killer Whale Vocalization Using Time Warping", Acoustics Today, July 2006 (Echoes 16), 45-47.

Brown, J.C., A. Hodgins-Davis, and P.J.O. Miller (2006). "Classification of vocalizations of killer whales using dynamic time warping", J. Acoust. Soc. Am. 119, EL34-EL40.

Brown, J.C. and P. Smaragdis (2004). "Independent Component Analysis for automatic note extraction from musical trills", J. Acoust. Soc. Am. 115, 2295-2306.

Brown, J.C., Houix, O., and McAdams, S. (2001) "Feature dependence in the automatic identification of musical woodwind instruments" J. Acoust. Soc. Am. 109, 1064-1072.

Brown, J.C. (2000). "Computer Identification of Musical Woodwind Instruments" Invited lay language presentation for the Acoustical Society of America meeting held May 30 - June 3, 2000.

Brown, J.C. (1999). "Computer identification of musical instruments using pattern recognition with cepstral coefficients as features" J. Acoust. Soc. Am. 105, 1933-1941.

Puckette, M.S. and Brown, J.C. (1998). "Accuracy of frequency estimates using the phase vocoder" IEEE Trans. on Speech and Audio Processing 6, 166-176.

Brown, J.C. (1997). "Calculation of a Constant Q Spectral Transform" reprinted in Research papers in violin acoustics, 1975-1993, Ed. Carleen Hutchins, 35-44.

Brown, J.C., and Vaughan, K.V., (1996). "Pitch center of frequency modulated musical sounds" J. Acoust. Soc. Am. 100, 1728-1735.

Brown, J.C., (1996). "Frequency ratios of spectral components of musical sounds" J. Acoust. Soc. Am. 99, 1210-1218.

Brown, J.C., (1993). "Determination of the meter of musical scores by autocorrelation", J. Acoust. Soc. Am. 94, 1953-1957.

Brown, J.C., and Puckette, M.S., (1993). "A high resolution fundamental frequency determination based on phase changes of the fourier transform", J. Acoust. Soc. Am. 94, 662-667.

Brown, J.C., and Puckette, M.S., (1992). "An Efficient Algorithm for the Calculation of a Constant Q Transform", J. Acoust. Soc. Am. 92, 2698-2701.

Brown, J.C., (1992). "Musical Fundamental Frequency Tracking using a Pattern Recognition Method" J. Acoust. Soc. Am. 92 1394-1402.

Brown, J.C. and Zhang, B., (1991). "Musical Frequency Tracking using the Methods of Conventional and 'Narrowed' Autocorrelation" J. Acoust. Soc. Am. 89 2346-2354.

Palmer, C. and Brown, J.C., (1991). "Investigations in the amplitude of sounded piano tones" J. Acoust. Soc. Am. 90 60-66.

Brown, J.C., (1991). "Calculation of a Constant Q Spectral Transform" J. Acoust. Soc. Am. 89 425-434.

Brown, J.C., and Puckette, M.S., (1989). "Calculation of a Narrowed Autocorrelation Function", J. Acoust. Soc. Am. 85, 1595-1601.

Brown, J.C., - Older Publications



Sarkar, Mihir (2007). "TablaNet: a Real-Time Online Musical Collaboration System for Indian Percussion". Master's Thesis, MAS Department, Massachusetts Institute of Technology, Cambridge, MA, USA.

Sarkar, Mihir and Vercoe, Barry (2007). "Recognition and Prediction in a Network Music Performance System for Indian Percussion". International Conference on New Interfaces for Musical Expression (NIME 2007), New York, NY, USA.

M. Sarkar, B. Vercoe, and Y. Yang (2007). "Words that describe timbre: a study of auditory perception through language" [abstract], Language and Music as Cognitive Systems Conference (LMCS-2007), Cambridge, UK, May 11-13, 2007.



Li, Wu-Hsi (2008). "Musicpainter: a Collaborative Composing Environment". Master's Thesis, MAS Department, Massachusetts Institute of Technology, Cambridge, MA, USA.



Huang, Cheng-Zhi Anna (2008). "Melodic Variations: Toward Cross-Cultural Transformation". Master's Thesis, MAS Department, Massachusetts Institute of Technology, Cambridge, MA, USA.


By Group Alumni


Casey, M., "Auditory Group Theory: with Applications to Statistical Basis Methods for Structured Audio", Ph.D. Thesis, MIT Media Lab, February 1998.

Casey, M., "Independent Component Analysis and Sound Synthesis", presented at International Conference on Auditory Display, Palo Alto, CA, November 1997.

Casey, M., Gardner, W. and Basu, S., "Vision-Steered Beam Forming and Transaural Rendering for the Artificial Life Interactive Video Environment (ALIVE)", Proceedings of the Audio Engineering Society 99th Conference, New York, November 1996.

Casey, M. and Wachman, J., "Unsupervised Cross-Modal Analysis of Professional Discourse", Fourth International Conference on Spoken Language Processing (ICSLP 96) Workshop on the Integration of Gesture and Language in Speech, Delaware, October 1996.

Casey, M. and Anderson, D., "The Sound Dimension: Audio for Distributed Virtual Environments", IEEE Spectrum special issue on Distributed Virtual Environments, April 1997.

Casey, M. and Smaragdis, P., "Netsound: Structured Audio Encoding and Rendering", Proceedings of the International Computer Music Conference, ICMA, Hong Kong, September 1996.

Casey, M., "Multi-Model Classification as Basis for Computational Timbre Understanding", International Conference on Music Perception and Cognition, Montreal, August 1996.

Casey, M. "Understanding Musical Sound with Forward Models and Physical Models", Connection Science 6:2, 1995.

Casey, M. A. (1994). "Understanding Musical Sound with Forward Models and Physical Models", Connection Science, Vol. 6, nos. 2 & 3, 1994.

Casey, M., "Practice Makes Perfect: Distal Learning of Musical Instrument Control Parameters", International Conference on Music Perception and Cognition, Philadelphia, July 1993.



Chai, Wei. "Segmentation and Summarization of Music". IEEE Signal Processing Magazine, Special Issue on Semantic Retrieval of Multimedia, 2006.

Chai, Wei and Vercoe, Barry. "Detection of Key Change in Classical Piano Music". Proceedings of International Conference on Music Information Retrieval, 2005.

Chai, Wei and Vercoe, Barry. "Music Classification with Partial Selection Based on Confidence Measures". Proceedings of ICML05 Workshop on Machine Learning Techniques for Processing Multimedia Content, 2005.

Chai, Wei. "Automated Analysis of Musical Structure". PhD Dissertation. MIT 2005.

Chai, Wei and Vercoe, Barry. "Music Thumbnailing Via Structural Analysis." Proceedings of ACM Multimedia Conference, November 2003.

Chai, Wei and Vercoe, Barry. "Structural Analysis Of Musical Signals For Indexing and Thumbnailing." Proceedings of ACM/IEEE Joint Conference on Digital Libraries, May 2003.

Chai, Wei. "Structural Analysis Of Musical Signals Via Pattern Matching." Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, April 2003.

Chai, Wei. "Structural Analysis of Musical Signals for Indexing, Segmentation and Thumbnailing." Paper for the Major Area of the PhD General Exam, March 2003.

Chai, Wei and Vercoe, Barry. "Melody Retrieval On The Web." Proceedings of ACM/SPIE Conference on Multimedia Computing and Networking, Jan. 2002.

Chai, Wei. "Melody Retrieval On The Web." Master's thesis. MIT 2001.

Chai, Wei and Vercoe, Barry. "Folk Music Classification Using Hidden Markov Models." Proceedings of International Conference on Artificial Intelligence, June 2001.

Chai, Wei and Vercoe, Barry. "Using User Models in Music Information Retrieval Systems." Proceedings of International Symposium on Music Information Retrieval, Oct. 2000.

Kim, Youngmoo, Chai, Wei, Garcia, Ricardo and Vercoe, Barry. "Analysis of a Contour-based Representation for Melody." Proceedings of International Symposium on Music Information Retrieval, Oct. 2000.



Ellis, D.P.W. (1996). "Prediction-driven computational auditory scene analysis for dense sound mixtures" Proc. ESCA Workshop on the Auditory Basis of Speech Perception, Keele, July 1996. (6pp)

Ellis, D.P.W. (1996). "Prediction-driven computational auditory scene analysis" Ph.D. thesis, Dept. of Elec. Eng & Comp. Sci., M.I.T., June 1996. (180pp)

Ellis, D.P.W. (1995). "Underconstrained stochastic representations for top-down computational auditory scene analysis" Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, Mohonk, October 1995. (4pp)

Ellis, D.P.W., Rosenthal, D.F. (1995). "Mid-level representations for Computational Auditory Scene Analysis" Proc. Intl. Joint Conf. on Artif. Intell. Workshop on Computational Auditory Scene Analysis, Montreal, August 1995. (7pp)

Ellis, D.P.W. (1994). "A computer implementation of psychoacoustic grouping rules" Proc. 12th Intl. Conf. on Pattern Recognition, Jerusalem, October 1994. (9pp)

Ellis, D.P.W. (1993). "Vowel separation by glottal-pulse synchrony" Presented to the 126th meeting of the Acoustical Society of America, Denver, November 1993. (17pp)

Ellis, D.P.W. (1993). "Hierarchic models of sound for separation and restoration" Proc. 1993 IEEE Mohonk workshop on Applications of Signal Processing to Acoustics and Audio, October 1993. (4pp)

Ellis, D.P.W., Vercoe, B.L. (1992). "A perceptual representation of sound for auditory signal separation" Presented to the 123rd meeting of the Acoustical Society of America, Salt Lake City, May 1992. (8pp)

Ellis, D.P.W. (1992). "A Perceptual Representation of Audio" Master's thesis, EECS dept, MIT, February 1992. (88pp)



Gardner, W. G. (1999). "Reduced-Rank Modeling of Head-Related Impulse Responses Using Subset Selection". Proc. 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY.

Gardner, W. G. (1998). "3-D Audio Using Loudspeakers". Kluwer Academic Publishers, Norwell, MA. ISBN 0-7923-8156-4.

Gardner, W. G. (1997). "3-D Audio Using Loudspeakers". Ph.D. Thesis, MIT Media Lab.

Gardner, W. G. (1997). "Head-Tracked 3-D Audio Using Loudspeakers". Proc. 1997 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY.

Gardner, W. G. (1998). "Reverberation algorithms" in Applications of Digital Signal Processing to Audio and Acoustics, ed. M. Kahrs, K. Brandenburg, Kluwer Academic. ISBN number: 0-7923-8130-0

Gardner, W. G. (1995). "Transaural 3-D audio". MIT Media Lab Perceptual Computing Technical Report #342.

Gardner, W. G., and Martin, K. D. (1995). "HRTF measurements of a KEMAR". J. Acoust. Soc. Am. 97 (6), pp. 3907-3908.

Gardner, W. G., and Martin, K. D. (1994). "HRTF measurements of a KEMAR dummy head microphone". MIT Media Lab Perceptual Computing Technical Report #280. Included on the CD-ROM "Standards in Computer Generated Music", Goffredo Haus and Isabella Pighi, eds., published by the IEEE CS Technical Committee on Computer Generated Music, 1996.

Gardner, W. G. (1995). "Efficient convolution without input-output delay". J. Audio Eng. Soc. 43 (3), 127-136.

Gardner, W. G. (1994). "Efficient convolution without input-output delay". Presented at the 97th convention of the Audio Engineering Society, San Francisco. Preprint 3897.

Gardner, W. G., and Griesinger, D. (1994). "Reverberation level matching experiments". Proc. of the Sabine Centennial Symposium, Acoust. Soc. of Am., pp. 263-266.

Gardner, W. G. (1992). "Reverb: a reverberator design tool for Audiomedia". Proc. Int. Comp. Music Conf., San Jose, CA.

Gardner, W. G. (1992). "A realtime multichannel room simulator". J. Acoust. Soc. Am., 92 (A), 2395.

Gardner, W. G. (1992). "The virtual acoustic room". Master's thesis, MIT Media Lab.



Kim, Y. E. (2003). "A Framework for Parametric Singing Voice Analysis/Synthesis". Proceedings of the 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY: IEEE.

Kim, Youngmoo E. and Brian Whitman (2002). "Singer Identification in Popular Music Recordings Using Voice Coding Features". Proc. 2002 International Symposium on Music Information Retrieval, Paris, France, Oct. 2002.

Kim, Youngmoo E. (2001). "Excitation Codebook Design for Coding of the Singing Voice". Proc. 2001 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York, Oct. 2001.

Kim, Youngmoo, Wei Chai, Ricardo Garcia, and Barry Vercoe (2000). "Analysis of a Contour-based Representation for Melody". Proc. 2000 International Symposium on Music Information Retrieval, Plymouth, MA, Oct. 2000.

Kim, Youngmoo E. (1999). "Structured Encoding of the Singing Voice Using Prior Knowledge of the Musical Score". Proc. 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York, Oct. 1999.

Scheirer, Eric D., and Youngmoo E. Kim (1999). "Generalized Audio Coding with MPEG-4 Structured Audio". Proc. Audio Eng. Society 17th International Conference on High-Quality Audio Coding, Florence IT, Sept. 1999.

Martin, Keith D. and Youngmoo E. Kim (1998). "Musical instrument identification: a pattern-recognition approach". Presented at the 136th Meeting of the Acoustical Society of America, Norfolk, VA, October, 1998. (Winner of the Best Student Presentation Award in Musical Acoustics.)



Lefford, Nyssim (2000). "Recording Studios Without Walls: Geographically Unrestricted Music Collaboration". Master's Thesis, MAS Department, Massachusetts Institute of Technology.



Martin, K. D., Scheirer, E. D., and Vercoe, B. L. (1998). "Music content analysis through models of audition". Presented at the 1998 ACM Multimedia Workshop on Content Processing of Music for Multimedia Applications, Bristol, England, September, 1998. (A version of this has been submitted to the Journal of New Music Research.)

Martin, K. D. (1998). "Toward automatic sound source recognition: Identifying musical instruments". Presented at the 1998 NATO Advanced Study Institute on Computational Hearing, Il Ciocco, Italy, July, 1998.

Martin, K.D. (1997). "Echo Suppression in a Computational Model of the Precedence Effect" Presented at the 1997 IEEE Mohonk workshop on Applications of Signal Processing to Acoustics and Audio, October 1997.

Martin, K. D. and Scheirer, E. D. (1997). "Automatic Transcription of Simple Polyphonic Music: Integrating Musical Knowledge". Presented at SMPC, August 1997.

Martin, K. D. (1996). "Automatic Transcription of Simple Polyphonic Music: Robust Front End Processing". M.I.T. Media Lab Perceptual Computing Technical Report #399, November 1996, presented at the Third Joint Meeting of the Acoustical Societies of America and Japan, December, 1996.

Martin, K. D. (1996). "A Blackboard System for Automatic Transcription of Simple Polyphonic Music". M.I.T. Media Lab Perceptual Computing Technical Report #385, July 1996.

Martin, K. D. (1995). "Estimating Azimuth and Elevation from Interaural Differences". Presented at the 1995 IEEE Mohonk workshop on Applications of Signal Processing to Acoustics and Audio, October 1995.

Martin, K. D. (1995). "A Computational Model of Spatial Hearing". Master's thesis, EECS dept, MIT, June 1995.

Martin, K. D. (1994). "A Computational Model of Spatial Hearing". Presented at the 128th meeting of the Acoustical Society of America, Austin, November 1994.



Meyers, Owen (2007). "A Mood-Based Music Classification and Exploration System". Master's Thesis, MAS Department, Massachusetts Institute of Technology.



Reich, R.D. (2002). "Instrument identification through a simulated cochlear implant processing system". Master's Thesis, MIT, September 2002.

Reich, R.D. and Eddington, D. (2002). "Identification of musical instruments by normal-hearing subjects listening through a cochlear-implant simulation", poster presentation at 143rd Meeting of the Acoustical Society of America, June 2002.

Bradley, J.S., Reich, R. D. and Norcross, S. G. (2000). "On the combined effects of early- and late-arriving sound on spatial impression in concert halls". J. Acoust. Soc. Am. 108 (2), pp. 651-661.

Bradley, J.S., Reich, R.D. and Norcross, S. G. (1999). "On the combined effects of signal-to-noise ratio and room acoustics on speech intelligibility". J. Acoust. Soc. Am. 106 (4), pp. 1820-1828.

Bradley, J.S., Reich, R.D. and Norcross, S. G. (1999). "A Just Noticeable Difference in C50 for Speech". Applied Acoustics 58 (2), pp. 99-108.

Reich, R. and Bradley, J.S. (1998). "Optimizing Classroom Acoustics using Computer Model Studies". Canadian Acoustics 26 (4), pp. 15-21. (Director's Award for Best Student Publication)

Bradley, J. S. and Reich, R. (1998). "Computer Studies of Optimum Classroom Acoustics". Canadian Acoustics 26 (3), p. 16.



Scheirer, Eric D., and Barry L. Vercoe (in press). "SAOL: The MPEG-4 Structured Audio Orchestra Language". To appear in Computer Music Journal.

Scheirer, Eric D., and Youngjik Lee and Jae-Woo Yang (in press). "Synthetic and SNHC Audio in MPEG-4". In Advances in Multimedia: Standards, Systems, and Networks, Atul Puri and Tsuhan Chen (eds). New York: Marcel Dekker, in press.

Scheirer, Eric D. (1999). "Structured Audio and Effects Processing in the MPEG-4 Multimedia Standard". Multimedia Systems 7:1 (Jan 1999) pp. 11-22.

Scheirer, Eric D. (1998). "Music Perception Systems". Unpublished proposal for Ph.D. dissertation, MIT Media Laboratory, Oct. 1998.

Scheirer, Eric D., and Riitta Väänänen and Jyri Huopaniemi (1998). "AudioBIFS: The MPEG-4 Standard for Effects Processing". Proc. DAFX98 Workshop on Digital Audio Effects, Barcelona, Nov. 1998.

Scheirer, Eric D., and Barry L. Vercoe (1998). "The MPEG-4 Structured Audio Orchestra Language". Proc. 1998 Int. Computer Music Conf, Ann Arbor, MI, Oct 1998.

Scheirer, Eric D., and Lee Ray (1998). "Algorithmic and Wavetable Synthesis in the MPEG-4 Multimedia Standard". Proc. 105th Meeting of the AES (invited paper), San Francisco, Sept 1998.

Colomes, Catherine, and Caroline Jacobson and Eric Scheirer (1998). "Report on the MPEG-4 audio NADIB verification tests". ISO/IEC/JTC1 SC29/WG11 document W2276, Dublin IE, July 1998.

Scheirer, Eric D. (1998). "The MPEG-4 Structured Audio standard". Proc 1998 IEEE Intl Conf Acoustics, Speech, and Signal Processing, Seattle, USA, May 1998.

Meares, David, and Kaoru Watanabe and Eric D. Scheirer (1998). "Report on the MPEG-2 AAC Stereo Verification Tests". ISO/IEC/JTC1 SC29/WG11 (MPEG) document W2006, San Jose CA USA, Feb 1998.

Scheirer, Eric D. (1998). "Tempo and Beat Analysis of Acoustic Musical Signals". J. Acoust. Soc. Am. 103:1, pp 588-601, Jan 1998.

Grill, Bernhard, and B. Edler, I. Kaneko, Y. Lee, M. Nishiguchi, E. Scheirer, and M. Väänänen (eds), "MPEG-4 Audio Committee Draft". ISO/IEC SC29/WG11 (MPEG) document W1903, Fribourg, CH, Oct 1997.

Scheirer, Eric D. (1997). "Pulse Tracking with a Pitch Tracker". Proc 1997 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Mohonk, NY, Oct 1997.

Scheirer, Eric D., and Malcolm Slaney (1997). "Construction and Evaluation of a Robust Multifeature Speech/Music Discriminator". Proc 1997 IEEE Intl Conf Acoustics, Speech, and Signal Processing, Munich, Germany, April 1997.

Scheirer, Eric D. (1997). "Using Musical Knowledge to Extract Expressive Performance Information from Audio Recordings". In Readings in Computational Auditory Scene Analysis, Okuno and Rosenthal (eds.). Mahwah, NJ: Lawrence Erlbaum Publication, 1997.

Scheirer, Eric D. (1996). "Bregman's Chimerae: Music Perception as Auditory Scene Analysis". Presented at 4th Intl Conf Music Perception and Cognition, Montreal, August 1996.

Scheirer, Eric D. (1995). "Extracting Expressive Performance Information from Recorded Music". Master's thesis, MIT Media Lab, Aug 1995.



Smaragdis, P. (2001). "Redundancy reduction for computational audition, a unifying approach". Ph.D. dissertation, MAS department, Massachusetts Institute of Technology.

Boulanger, R., P. Smaragdis and J. Fitch, (2000). "Scanned Synthesis: an introduction and demonstration of a new synthesis and signal processing technique". In proceedings of the International Computer Music Conference 2000, Berlin, September 2000.

Smaragdis, P. (2000). "Efficient Csound Programming", in The Csound Book eds. Richard Boulanger and Barry Vercoe, MIT Press.

Smaragdis, P. (1999). "Information Theoretic Auditory Grouping", IJCAI-99 Workshop on Computational Auditory Scene Analysis. Stockholm, Sweden, August 1 1999.

Daniël W.E. Schobben, K. Torkkola and P. Smaragdis (1999). "Evaluation of Blind Signal Separation Methods". First International Workshop on Independent Component Analysis and Blind Signal Separation, Aussois, France, January 11-15, 1999.

Smaragdis, P. (1998). "Blind Separation of Convolved Mixtures in the Frequency Domain". International Workshop on Independence & Artificial Neural Networks, University of La Laguna, Tenerife, Spain, February 9 - 10, 1998. Revised version also in Neurocomputing 22 (1998) 21-34.

Smaragdis, P. (1997). "Efficient Blind Separation of Convolved Sound Mixtures", IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics. New Paltz NY, October 1997.

Smaragdis, P. (1997). "Information Theoretic Approaches to Source Separation", Master's Thesis, MAS Department, Massachusetts Institute of Technology.



Dobson, Kelly, Brian Whitman and Daniel P.W. Ellis (2005). "Learning Auditory Models of Machine Voices". To appear in the 2005 Workshop on Applications of Signal Processing to Audio and Acoustics.

Whitman, Brian (2004). "Learning the meaning of music". Dissertation defense, MIT, April 14, 2004.

Whitman, Brian, Daniel P.W. Ellis (2004). "Automatic Record Reviews". In Proceedings of ISMIR 2004 - 5th International Conference on Music Information Retrieval. October 10-14, 2004, Barcelona, Spain.

Berenzweig, Adam, Beth Logan, Daniel Ellis, Brian Whitman (2004). "A Large Scale Evaluation of Acoustic and Subjective Music Similarity Measures". Computer Music Journal, Summer 2004, 28(2), pp 63-76.

Berenzweig, Adam, Daniel Ellis, Beth Logan, Brian Whitman (2003). "A Large Scale Evaluation of Acoustic and Subjective Music Similarity Measures". In Proceedings of the 2003 International Symposium on Music Information Retrieval. 26-30 October 2003, Baltimore, MD.

Whitman, Brian (2003). "Semantic Rank Reduction of Music Audio". In Proceedings of the 2003 Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 19-22 October 2003, New Paltz, NY. pp. 135-138.

Recht, Ben and Brian Whitman (2003). "Musically Expressive Sound Textures from Generalized Audio". In Proceedings of the 2003 Digital Audio Effects (DAFX03) Conference. 8-11 September 2003, Queen Mary, University of London, U.K.

Whitman, Brian, Deb Roy, and Barry Vercoe (2003). "Learning Word Meanings and Descriptive Parameter Spaces from Music". In Proceedings of the HLT-NAACL03 workshop on Learning Word Meaning from Non-Linguistic Data. 26-31 May 2003, Edmonton, Alberta, Canada.

Whitman, Brian and Ryan Rifkin (2002). "Musical Query-by-Description as a Multiclass Learning Problem". In Proceedings of the IEEE Multimedia Signal Processing Conference. 8-11 December 2002, St. Thomas, USA.

Ellis, Daniel, Brian Whitman, Adam Berenzweig and Steve Lawrence (2002). "The Quest For Ground Truth in Musical Artist Similarity". In Proceedings of the 3rd International Conference on Music Information Retrieval. 13-17 October 2002, Paris, France.

Kim, Youngmoo and Brian Whitman (2002). "Singer Identification in Popular Music Recordings Using Voice Coding Features". In Proceedings of the 3rd International Conference on Music Information Retrieval. 13-17 October 2002, Paris, France.

Whitman, Brian and Paris Smaragdis (2002). "Combining Musical and Cultural Features for Intelligent Style Detection". In Proceedings of the 3rd International Conference on Music Information Retrieval. 13-17 October 2002, Paris, France.

Whitman, Brian and Steve Lawrence (2002). "Inferring Descriptions and Similarity for Music from Community Metadata". In "Voices of Nature," Proceedings of the 2002 International Computer Music Conference. pp 591-598. 16-21 September 2002, Göteborg, Sweden.

Whitman, Brian, Gary Flake, and Steve Lawrence (2001, September 10-12). "Artist Detection in Music with Minnowmatch". In Proceedings of the 2001 IEEE Workshop on Neural Networks for Signal Processing, pp. 559-568. Falmouth, Massachusetts.
