Proceedings Paper

Real-time speaker identification for video conferencing
Author(s): S. Saravi; I. Zafar; E. A. Edirisinghe; R. S. Kalawsky

Paper Abstract

Automatic speaker identification in a videoconferencing environment lets attendees focus on the conference itself instead of manually working out which channel is active and who is speaking within that channel. In this work we present a real-time, audio-coupled, video-based approach to this problem, with the emphasis on the video analysis. The system is driven by the need to detect a talking human using computer vision algorithms. The first stage is a face detector, followed by a lip-localization algorithm that segments the lip region. We propose a novel approach to lip movement detection based on image registration using the Coherent Point Drift (CPD) algorithm, a technique for rigid and non-rigid registration of point sets. We provide experimental results to analyse the performance of the algorithm when monitoring real-life videoconferencing data.
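The abstract does not say how the CPD registration is turned into a lip movement decision, so the following is only a minimal sketch under stated assumptions, not the authors' method. It implements the standard rigid (similarity) variant of CPD in NumPy and, as an assumed criterion, uses the residual left after rigid alignment of two consecutive lip contours as a movement score: head motion is largely absorbed by the rigid transform, while a change in lip shape is not. The names `rigid_cpd` and `lip_motion_score`, the outlier weight `w`, and the iteration limits are hypothetical choices made for illustration.

```python
import numpy as np

def rigid_cpd(X, Y, w=0.0, max_iter=200, tol=1e-8):
    """Rigid CPD: align source points Y (M x D) to target points X (N x D).

    Returns (s, R, t, sigma2), where a source point y maps to s * R @ y + t
    and sigma2 is the final variance of the Gaussian mixture (the residual).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    N, D = X.shape
    M = Y.shape[0]
    s, R, t = 1.0, np.eye(D), np.zeros(D)
    # Initial variance: mean squared distance over all point pairs.
    sigma2 = ((X[None, :, :] - Y[:, None, :]) ** 2).sum() / (D * M * N)
    for _ in range(max_iter):
        # E-step: posterior P[m, n] that source point m generated target n.
        TY = s * Y @ R.T + t
        d2 = ((X[None, :, :] - TY[:, None, :]) ** 2).sum(axis=2)
        K = np.exp(-d2 / (2.0 * sigma2))
        c = (2.0 * np.pi * sigma2) ** (D / 2.0) * w / (1.0 - w) * M / N
        P = K / (K.sum(axis=0, keepdims=True) + c + np.finfo(float).tiny)
        # M-step: closed-form update of rotation, scale and translation.
        Np = P.sum()
        Pt1, P1 = P.sum(axis=0), P.sum(axis=1)
        mu_x = Pt1 @ X / Np
        mu_y = P1 @ Y / Np
        Xh, Yh = X - mu_x, Y - mu_y
        A = Xh.T @ P.T @ Yh
        U, _, Vt = np.linalg.svd(A)
        C = np.eye(D)
        C[-1, -1] = np.linalg.det(U @ Vt)  # guard against reflections
        R = U @ C @ Vt
        tr_AR = np.trace(A.T @ R)
        s = tr_AR / (P1 @ (Yh ** 2).sum(axis=1))
        t = mu_x - s * R @ mu_y
        sigma2 = (Pt1 @ (Xh ** 2).sum(axis=1) - s * tr_AR) / (Np * D)
        if sigma2 <= tol:  # converged to an (almost) exact rigid match
            sigma2 = tol
            break
    return s, R, t, sigma2

def lip_motion_score(prev_pts, curr_pts):
    """Assumed criterion: residual spread after rigidly aligning two contours."""
    _, _, _, sigma2 = rigid_cpd(curr_pts, prev_pts)
    return float(np.sqrt(sigma2))

if __name__ == "__main__":
    # Synthetic lip contours: an ellipse that either only moves rigidly
    # or also opens vertically between two frames.
    theta = np.linspace(0.0, 2.0 * np.pi, 24, endpoint=False)
    closed = np.c_[2.0 * np.cos(theta), 1.0 * np.sin(theta)]
    opened = np.c_[2.0 * np.cos(theta), 1.6 * np.sin(theta)]
    a = 0.1
    rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    shift = np.array([0.3, -0.2])
    print("head motion only:", lip_motion_score(closed, closed @ rot.T + shift))
    print("mouth opening   :", lip_motion_score(closed, opened @ rot.T + shift))
```

In a full pipeline the two point sets would come from the lip region segmented in consecutive frames, and the score would be compared against a threshold tuned on real data. A non-rigid CPD pass, which the abstract also mentions, could replace the rigid residual as the deformation measure.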

Paper Details

Date Published: 4 May 2010
PDF: 9 pages
Proc. SPIE 7724, Real-Time Image and Video Processing 2010, 77240D (4 May 2010); doi: 10.1117/12.854846
Author Affiliations
S. Saravi, Loughborough Univ. (United Kingdom)
I. Zafar, Loughborough Univ. (United Kingdom)
E. A. Edirisinghe, Loughborough Univ. (United Kingdom)
R. S. Kalawsky, Loughborough Univ. (United Kingdom)


Published in SPIE Proceedings Vol. 7724:
Real-Time Image and Video Processing 2010
Nasser Kehtarnavaz; Matthias F. Carlsohn, Editor(s)

© SPIE