
VALID DB
The 106-person VALID Audio Visual Database
Dr Richard Reilly, Niall Fox, Brian O’Mullane
The performance of deployed audio, face, and multi-modal
person recognition systems in non-controlled scenarios
is typically lower than that of systems developed in highly
controlled environments. To facilitate
the development of robust audio, face, and multi-modal
person recognition systems, the large and realistic
multi-modal (audio-visual) VALID database was
acquired in a noisy “real world” office scenario
with no control over illumination or acoustic noise.
In this paper we describe the acquisition and
content of the VALID database, consisting of five
recording sessions of 106 subjects over a period
of one month. Speaker identification experiments
using visual speech features extracted from the
mouth region are carried out. Two types of features
are examined. Performance on the uncontrolled
VALID database is compared with that on the controlled
XM2VTS database. The best accuracies on the VALID and
XM2VTS databases are 63.21% and 97.17%, respectively.
The results highlight the degrading effect of
an uncontrolled illumination environment and
the importance of the database in providing real-world
deployment metrics. The database will be made
available to the academic community through ee.ucd.ie/validdb
Sponsor: Enterprise Ireland – Advanced Technology
Research Programme