VALID DB

The 106-Person VALID Audio-Visual Database

Dr Richard Reilly, Niall Fox, Brian O’Mullane
The performance of audio, face, and multi-modal person recognition systems deployed in uncontrolled scenarios is typically lower than that of systems developed in highly controlled environments. To facilitate the development of robust audio, face, and multi-modal person recognition systems, the large and realistic multi-modal (audio-visual) VALID database was acquired in a noisy "real world" office scenario with no control over illumination or acoustic noise. In this paper we describe the acquisition and content of the VALID database, which consists of five recording sessions of 106 subjects over a period of one month. Speaker identification experiments are carried out using visual speech features extracted from the mouth region, and two types of features are examined. Performance on the uncontrolled VALID database is compared with that on the controlled XM2VTS database. The best accuracies on the VALID and XM2VTS databases are 63.21% and 97.17% respectively. The results highlight the degrading effect of an uncontrolled illumination environment and the importance of the database in providing real-world deployment metrics. The database will be made available to the academic community through ee.ucd.ie/validdb
Sponsor: Enterprise Ireland – Advanced Technology Research Programme
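
The accuracies quoted above can be read as closed-set identification rates: the percentage of test trials in which the system names the correct subject. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function name and the trial counts are hypothetical. In particular, the page does not state how many test trials were used; the example assumes one trial per subject (106 trials), under which 67 correct identifications would give 63.21% and 103 would give 97.17%.

```python
def identification_accuracy(predicted_ids, true_ids):
    """Return closed-set identification accuracy as a percentage.

    Hypothetical helper for illustration; not taken from the VALID paper.
    """
    if len(predicted_ids) != len(true_ids):
        raise ValueError("predicted_ids and true_ids must have the same length")
    if not true_ids:
        raise ValueError("at least one trial is required")
    # Count the trials where the predicted identity matches the true one.
    correct = sum(p == t for p, t in zip(predicted_ids, true_ids))
    return 100.0 * correct / len(true_ids)


# Assumption: one identification trial per subject, 106 subjects.
true_ids = list(range(106))

# Hypothetical outcome: the first 67 subjects are identified correctly,
# the remaining 39 are assigned a wrong label (-1).
predicted_ids = true_ids[:67] + [-1] * 39

print(f"{identification_accuracy(predicted_ids, true_ids):.2f}%")
```

If the real evaluation used a different number of trials per subject, the same function applies unchanged; only the lengths of the two lists differ.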