The AMI Meeting Corpus

I. McCowan

IDIAP, Martigny, Switzterland

To support the multi-disciplinary research in the AMI project, a 100-hour corpus of meetings is being collected. This corpus is being recorded in several instrumented rooms equipped with a variety of microphones, video cameras, electronic pens, presentation slide capture and white-board capture devices. As well as real meetings, the corpus contains a signi.cant proportion of scenario-driven meetings, which have been designed to elicit a rich range of realistic behaviors. To facilitate research, the raw data are being annotated at a number of levels including speech transcriptions, dialogue acts and summaries. The corpus is being distributed using a web server designed to allow convenient browsing and download of multimedia content and associated annotations. In this presentation we will overview AMI research themes, then discuss design of the instrumented rooms for data collection, design of the scenario-driven meetings, annotation schemes, and data distribution.


Paper presented at Measuring Behavior 2005 , 5th International Conference on Methods and Techniques in Behavioral Research, 30 August - 2 September 2005, Wageningen, The Netherlands.

© 2005 Noldus Information Technology bv