Robust fuzzy statistical modeling of dynamic backgrounds in IR videos

Accounting for uncertainty in classification yields an improved alternative to conventional means of detecting moving objects.
13 November 2009
Fida El Baf and Thierry Bouwmans

Background modeling is a technique used in applications such as video surveillance, optical motion capture, and multimedia to detect moving objects. Modeling generally begins by acquiring a background image without moving objects. Under certain conditions, however, the background might not be available. Moreover, acquiring it may be complicated by changes in illumination as well as the addition or removal of objects from the scene. Different background modeling methods have been devised to deal with these problems of robustness and adaptation. In video surveillance, for example, moving objects are commonly detected by using so-called background subtraction, as detailed in Figure 1. The background model is initialized during a learning step by using N frames. Next, preliminary foreground detection is achieved. This consists of classifying a pixel as belonging to the background or foreground, and applying a foreground mask to the current frame to isolate the moving object. The background is then adapted over time, following the changes that have occurred in the scene.
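The four stages described above (learning, foreground detection, masking, and adaptation) can be sketched in a few lines. This is only an illustrative baseline, not the method proposed in this article: it assumes a single per-pixel median as the background model, a fixed intensity threshold, and a running-average update. The function names and the values of `threshold` and `alpha` are hypothetical choices made for the example.

```python
import numpy as np

def init_background(frames):
    """Learning step: estimate B from N training frames (per-pixel median)."""
    return np.median(np.stack(frames, axis=0), axis=0)

def foreground_mask(background, frame, threshold=25.0):
    """Foreground detection: True where the pixel differs strongly from B."""
    return np.abs(frame.astype(float) - background) > threshold

def update_background(background, frame, mask, alpha=0.05):
    """Adaptation: blend the current frame into B at background pixels only."""
    blended = (1.0 - alpha) * background + alpha * frame
    return np.where(mask, background, blended)

# Toy example: N = 10 noisy training frames of a flat 4x4 scene.
rng = np.random.default_rng(0)
training = [100.0 + rng.normal(0.0, 1.0, size=(4, 4)) for _ in range(10)]
B = init_background(training)

# Current image I with one bright "moving" pixel.
I = np.full((4, 4), 100.0)
I[1, 2] = 200.0

mask = foreground_mask(B, I)        # binary foreground mask
moving = np.where(mask, I, 0.0)     # mask applied to the current frame
B = update_background(B, I, mask)   # background maintenance
```

Replacing the median and the fixed threshold with a per-pixel statistical model is what distinguishes the GMM-based approaches discussed next.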

The choice of background model is key because it determines adaptation to dynamic changes. Gaussian mixture models (GMMs)1 are the most popular of the conventional approaches, but they have limitations that can cause false classifications in the foreground detection mask. In recent years, numerous improvements have been proposed,2 but none of them take into account uncertainties related to insufficient or noisy data in the training sequence. One way around this problem is to use ‘fuzzy’ concepts with GMMs. Type-2 fuzzy sets3 provide a theoretically well-founded framework for handling the uncertainty parameters of GMMs.4 Accordingly, here we describe an approach to modeling backgrounds using type-2 fuzzy GMMs (T2-FGMMs). We apply these primarily to IR videos,5 but they can be extended to RGB (additive primary color) videos.6,7

Figure 1. Background subtraction: the pipeline. t: Time. N: Number of frames. B: Background image. I: Current image.

Figure 2. Uncertain mean (μ) vector.

Pixels are characterized by their IR intensity. Consequently, the observation Xt is a scalar at time t, and the GMM is a mixture of K Gaussians. That is, each background pixel can have K distinct intensity values (usually K=3) that are modeled separately by a Gaussian, each of which in turn is represented by a mean and a variance. To obtain the two fuzzy versions of the GMM, T2-FGMM-UM and T2-FGMM-UV, we introduce the UM (uncertain mean) and UV (uncertain variance) matrices. Figure 2 shows the case for the UM.
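For reference, the mixture and its two fuzzy variants can be written as follows. This is a sketch of the notation, assuming the usual GMM form and the interval parametrization commonly used for type-2 fuzzy GMMs. The weights ω and the factors k_m and k_v, which control the width of the uncertainty intervals, are assumed symbols that this article does not define.

```latex
\begin{align}
  % Mixture of K Gaussians for the scalar observation X_t
  P(X_t) &= \sum_{i=1}^{K} \omega_{i,t}\,
            \eta\!\left(X_t;\ \mu_{i,t},\ \sigma_{i,t}^{2}\right),
  &
  \eta(x;\ \mu,\ \sigma^{2}) &= \frac{1}{\sqrt{2\pi}\,\sigma}
            \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) \\
  % T2-FGMM-UM: the mean is only known to lie in an interval
  \mu &\in [\underline{\mu},\ \overline{\mu}]
       = [\mu - k_m\sigma,\ \mu + k_m\sigma],
  & k_m &\ge 0 \\
  % T2-FGMM-UV: the standard deviation is only known to lie in an interval
  \sigma &\in [\underline{\sigma},\ \overline{\sigma}]
       = [k_v\sigma,\ \sigma / k_v],
  & 0 < k_v &\le 1
\end{align}
```

With k_m = 0 or k_v = 1 the intervals collapse to a single value and the model reduces to the conventional GMM.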

Figure 3. (top row) Current image, GMM.1 (bottom row) T2-FGMM-UM and T2-FGMM-UV.

Figure 4. (top row) Current image, ground truth. (middle row) T2-FGMM-UM and T2-FGMM-UV. (bottom row) GMM.8

Both fuzzy GMM versions can be used to model the background, and both must be trained by estimating the mean and variance.5 Once the training is complete, preliminary foreground detection can be carried out as in Stauffer and Grimson.1 First, the Gaussians are ordered and labeled. When a new frame enters at time t+1, a match test is made for each pixel by using the log-likelihood. Second, the pixel is classified as background or foreground following the label of the matched Gaussian. If no match is found, the pixel is classified as foreground. At this step, a binary mask is obtained. Finally, the parameters of the model are updated.
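The ordering, labeling, and matching steps can be illustrated for a single pixel. The sketch below follows the conventional (crisp) scheme of Stauffer and Grimson: Gaussians are ranked by weight over standard deviation, the top-ranked ones whose cumulative weight exceeds a threshold are labeled background, and a match is declared when the intensity lies within k standard deviations of a mean. That distance test is a stand-in for the log-likelihood test used by the fuzzy models, and the function name and the values of `T` and `k` are assumptions made for the example. The final parameter update is omitted.

```python
import numpy as np

def classify_pixel(x, weights, means, sigmas, T=0.7, k=2.5):
    """Classify one scalar intensity x as 'background' or 'foreground'."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)

    # Order the Gaussians: heavy, narrow components come first.
    order = np.argsort(-(weights / sigmas))

    # Label: the smallest leading group whose cumulative weight exceeds T
    # is treated as background; the remaining components are foreground.
    cumulative = np.cumsum(weights[order])
    n_background = int(np.searchsorted(cumulative, T, side="right")) + 1
    n_background = min(n_background, len(weights))

    # Match test: the first ranked Gaussian that matches decides the class.
    for rank, idx in enumerate(order):
        if abs(x - means[idx]) < k * sigmas[idx]:
            return "background" if rank < n_background else "foreground"

    # No Gaussian matched.
    return "foreground"

# Toy model with K = 3 Gaussians for one pixel.
w, mu, sd = [0.6, 0.3, 0.1], [100.0, 140.0, 200.0], [5.0, 8.0, 20.0]
labels = [classify_pixel(x, w, mu, sd) for x in (102.0, 145.0, 210.0, 30.0)]
```

Applying the same test to every pixel of the frame produces the binary foreground mask.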

We have tested our algorithms on Terravic data sets.9 Figure 3 shows results obtained using a GMM, T2-FGMM-UM, and T2-FGMM-UV on frame 150 of sequence IRTR01. For the qualitative evaluation we used Dataset 01: OSU Thermal Pedestrian Database.9 Figure 4 shows the results obtained on frame 27 of sequence 1 using the same three algorithms. All of them detect silhouettes well, but the T2-FGMM-UM gives fewer false detections. For quantitative evaluation (see Table 1), we used the ‘similarity’ (S) measure,8 which approaches 1 if the segmented image (A) corresponds to the ground truth (B), and 0 otherwise. The similarity values obtained for this experiment confirm the qualitative evaluation.
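A measure with the behavior described above can be computed as the ratio of the overlap to the combined area of the two masks. The sketch below assumes that definition, S(A, B) = |A ∩ B| / |A ∪ B|, which the article does not state explicitly. The function name and the toy masks are illustrative only.

```python
import numpy as np

def similarity(A, B):
    """Overlap ratio between a detected mask A and a ground-truth mask B."""
    A = np.asarray(A, dtype=bool)
    B = np.asarray(B, dtype=bool)
    union = np.logical_or(A, B).sum()
    if union == 0:
        # Both masks are empty: treat them as identical.
        return 1.0
    return float(np.logical_and(A, B).sum() / union)

# Toy masks: the detection covers 4 of the 6 ground-truth pixels.
detected = np.zeros((4, 4), dtype=bool)
detected[1:3, 1:3] = True
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:4] = True

score = similarity(detected, truth)
```

Because both missed pixels and false detections enlarge the union without enlarging the overlap, either kind of error lowers the score.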

Table 1. Performance analysis
S(A, B)    48%    43%    36%

In summary, we have modeled backgrounds using a T2-FGMM approach. Experiments on IR videos show that T2-FGMMs perform well and are more robust than conventional GMMs in difficult environments. This work confirms the relevance of fuzzy concepts to background subtraction. Our future work will concentrate on applying fuzzy concepts to other areas of the field.

Fida El Baf, Thierry Bouwmans
Laboratoire de Mathématiques, Image et Applications (MIA)
University of La Rochelle (ULR)
La Rochelle, France

Fida El Baf received her MS in imaging and calculus in 2005 and her PhD in mathematics applied to image processing in June 2009, both from ULR. Her research focuses on application of fuzzy concepts to the field of background subtraction.

Thierry Bouwmans received his PhD in image processing from the University of Littoral Côte d'Opale (Calais, France) in 1997. Between 2000 and 2007, he was a member of the Laboratoire Informatique, Image et Interaction at ULR. During this period, he was also working on the Aqu@theque project. He has been at the MIA since 2008. Currently, his research concerns moving-object detection in video sequences using fuzzy concepts.