Combining camera systems for human shape detection

Incorporating daylight and infrared optics offers improved rates of target acquisition.
09 February 2009
Alberto Broggi, Massimo Bertozzi, Mirko Felisa, Paolo Grisleri, and Michael Del Rose

The detection of pedestrians is an important field of research for commercial, governmental, and military organizations. In particular, the US Army is actively developing obstacle detection for multifunction utility/logistics equipment (MULE) vehicle operations, path following, and intent-based anti-tamper surveillance systems. This article introduces a new optical system for the detection of human shapes from unmanned MULE vehicles on the move.

Locating people against background noise can be a challenging task. For example, pedestrians can assume different poses, wear different clothes, and carry objects that obscure the distinctive human silhouette. These problems are further compounded by camera movements and different lighting conditions in uncontrolled outdoor environments.

To tackle this task, a number of monocular and stereo optical systems have been developed that use either visible light (daylight cameras) or far-infrared (FIR) wavelengths (7–14 μm). In many scenarios, FIR cameras (often called ‘thermal’ cameras) are well suited to initial detection. In hot, sunny environments, however, warm backgrounds reduce thermal contrast, targets become harder to pick out, and daylight cameras are a better choice. Daylight cameras also provide more detailed images and therefore more reliable target verification.

The simultaneous use of two stereo camera systems, one based on visible light (daylight) and the other on FIR wavelengths, has therefore been investigated to exploit the benefits of both technologies.1,2


Figure 1. The system installed on the experimental unmanned vehicle.

Designed and tested on vehicles as depicted in Figure 1, the system detects both stationary and moving pedestrians. It relies on passive sensors, which register apparent motion as a change in received infrared radiation when, for example, a person passes in front of a background at a different temperature, such as a building.
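As a rough illustration of this passive-motion principle (not the authors' code), consecutive FIR frames can be differenced so that a person moving against a background at a different apparent temperature produces a localized intensity change. The function name and threshold below are assumptions.

```python
import numpy as np

def thermal_motion_mask(prev_frame: np.ndarray,
                        curr_frame: np.ndarray,
                        threshold: int = 20) -> np.ndarray:
    """Flag pixels whose apparent temperature changed between two 8-bit FIR frames."""
    # Work in a signed type so the subtraction cannot wrap around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold
```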

In the first of four processing steps, the two stereo systems independently scan the target area. In this phase, different approaches are used to highlight portions of the images that warrant further attention: warm areas are detected in the FIR images, edge density is computed from both the FIR and daylight images, and techniques such as the disparity space image further process the initial data.
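The following sketch illustrates two of these attention cues, warm-area detection and edge density, in Python with OpenCV; the thresholds and window size are illustrative assumptions, not the values used by the system.

```python
import cv2
import numpy as np

def warm_areas(fir_frame: np.ndarray, warm_threshold: int = 160) -> np.ndarray:
    """Binary mask of warm regions in an 8-bit FIR frame."""
    _, mask = cv2.threshold(fir_frame, warm_threshold, 255, cv2.THRESH_BINARY)
    return mask

def edge_density(gray_frame: np.ndarray, window: int = 15) -> np.ndarray:
    """Local density of Canny edges; applicable to FIR or daylight images."""
    edges = cv2.Canny(gray_frame, 50, 150)
    # Average the binary edge map over a sliding window to get a density map.
    kernel = np.ones((window, window), np.float32) / (window * window)
    return cv2.filter2D(edges.astype(np.float32) / 255.0, -1, kernel)
```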

Stereo-based reconstruction of the scene allows the 3D position of features such as roads, together with their slope, distance, and size, to be measured against the system's calibration parameters, so that features incompatible with the presence of a person (or a small group of people) can be discarded.
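A minimal sketch of such a compatibility check follows, using the standard pinhole-stereo relation (depth Z = fB/d for focal length f, baseline B, and disparity d); the calibration values and height bounds are assumptions for illustration.

```python
FOCAL_PX = 700.0    # focal length in pixels (assumed calibration value)
BASELINE_M = 0.5    # stereo baseline in meters (assumed)

def is_person_sized(disparity_px: float, box_height_px: float,
                    min_h: float = 0.8, max_h: float = 2.2) -> bool:
    """Discard candidates whose real-world height is incompatible with a person."""
    if disparity_px <= 0:
        return False
    # Depth from disparity, then metric height from pixel height at that depth.
    depth_m = FOCAL_PX * BASELINE_M / disparity_px
    height_m = box_height_px * depth_m / FOCAL_PX
    return min_h <= height_m <= max_h
```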

In the second step, areas highlighted in the two different spectra are filtered and fused by applying symmetry, size, and distance constraints. In the third step, models and filters, including neural networks and adaptive boosting, are used to evaluate the presence of human shapes.3
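The sketch below suggests how candidates from the two spectra might be fused by requiring overlap between FIR and daylight detections; the overlap threshold is an assumption, and the actual system also applies symmetry and distance tests not shown here.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[0] + a[2], b[0] + b[2]), min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def fuse(fir_boxes, day_boxes, min_iou=0.3):
    """Keep FIR candidates confirmed by an overlapping daylight candidate."""
    return [f for f in fir_boxes
            if any(iou(f, d) >= min_iou for d in day_boxes)]
```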

Results are output in extensible markup language (XML) over a controller-area network (CAN) bus or a TCP/IP network, or they can be shown through a graphical user interface that allows inspection of the intermediate steps, as shown in Figure 2.
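Since the article does not specify the XML schema, the following sketch shows one hypothetical way detections could be serialized to XML for transmission over TCP/IP; the element and attribute names are invented for illustration.

```python
import xml.etree.ElementTree as ET

def detections_to_xml(frame_id: int, boxes) -> bytes:
    """Serialize (x, y, w, h) detections to an XML message (hypothetical schema)."""
    root = ET.Element("detections", frame=str(frame_id))
    for x, y, w, h in boxes:
        ET.SubElement(root, "pedestrian",
                      x=str(x), y=str(y), w=str(w), h=str(h))
    return ET.tostring(root, encoding="utf-8")

# Example: detections_to_xml(42, [(120, 80, 30, 90)])
```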


Figure 2. Wireless remote display of the graphical user interface during real-time field tests.

Designed to be modular, the system can be connected to a more complex assembly as a detection module, with results output through a CAN or TCP/IP network. It can also act as a master processing module: the system features plug-in capabilities and can accept preprocessed data from perception modules other than cameras, enabling low- or high-level data fusion.
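One hypothetical reading of this plug-in idea is sketched below: a master module collects preprocessed candidate regions from arbitrary perception modules for later fusion. The interface is an assumption, since the article does not define it.

```python
from typing import List, Protocol, Tuple

Box = Tuple[int, int, int, int]  # (x, y, w, h)

class PerceptionPlugin(Protocol):
    """Hypothetical interface for any module that supplies candidate regions."""
    def candidates(self) -> List[Box]: ...

def gather_candidates(plugins: List[PerceptionPlugin]) -> List[Box]:
    """Collect candidate regions from every registered perception module."""
    merged: List[Box] = []
    for plugin in plugins:
        merged.extend(plugin.candidates())
    return merged
```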

Currently, the system detects up to 90% of pedestrians in a given scene with a very low false-positive rate.

The result of a seven-year collaboration between VisLab at the University of Parma, Italy, and the US Army Tank-Automotive Research, Development and Engineering Center (TARDEC), the system has been tested in urban and rural environments using the VisLab experimental vehicle shown in Figure 3. It was also demonstrated at TARDEC in 2004 on a high mobility multipurpose wheeled vehicle, and at General Dynamics Robotic Systems on their experimental unmanned vehicle in April 2008.

Future work will focus on developing the plug-in structure, building custom hardware to house the four cameras and the processing unit, and adding a tracking system.


Figure 3. The system was tested in rural and urban areas using this VisLab experimental vehicle.

Alberto Broggi, Massimo Bertozzi, Mirko Felisa, Paolo Grisleri
VisLab, University of Parma
Parma, Italy

Alberto Broggi is the director of VisLab and the author of more than 150 publications covering research on automated vehicles. He is currently editor-in-chief of the Institute of Electrical and Electronics Engineers (IEEE) Transactions on Intelligent Transportation Systems for the 2004–2009 term.

Massimo Bertozzi is an assistant professor at the University of Parma and is working on applied machine vision for intelligent transport systems.

Mirko Felisa is a PhD candidate researching obstacle detection using stereo vision.

Paolo Grisleri is a researcher at VisLab and is working on real-time machine-vision systems.

Michael Del Rose
US Army Research, Development and Engineering Command International Technology Center-Atlantic (ITC-A)
Ruislip, United Kingdom

Michael Del Rose is the associate director for vehicles, robotics, and armaments research at ITC-A.

