
Proceedings Paper

Parallel processing architecture for H.264 deblocking filter on multi-core platforms
Author(s): Durga P. Prasad; Sekar Sonachalam; Mangesh K. Kunchamwar; Nageswara Rao Gunupudi

Paper Abstract

Massively parallel computing (multi-core) chips offer new solutions that satisfy the increasing demand for high-resolution, high-quality video compression technologies such as H.264. Such solutions provide not only exceptional quality but also efficiency, low power, and low latency that were previously unattainable in software-based designs. While custom hardware and Application Specific Integrated Circuit (ASIC) technologies may achieve low latency, low power, and real-time performance in some consumer devices, many applications require a flexible and scalable software-defined solution. The deblocking filter in an H.264 encoder/decoder poses difficult implementation challenges because of heavy data dependencies and the conditional nature of the computations. Deblocking filter implementations tend to be fixed and difficult to reconfigure for different needs. Scaling up for higher quality requirements, such as 10-bit pixel depth or a 4:2:2 chroma format, often reduces the throughput of a parallel architecture designed for a lower feature set. A scalable architecture for deblocking filtering, built on a massively parallel processor, means that the same encoder or decoder can be deployed in a variety of applications, at different video resolutions, for different power requirements, and at higher bit depths and finer chroma subsampling patterns such as YUV 4:2:2 or 4:4:4. Low-power, software-defined encoders/decoders may be implemented using a massively parallel processor array, like that found in HyperX technology, with 100 or more cores and distributed memory. The large number of processor elements allows the silicon device to operate more efficiently than conventional DSP or CPU technology. This software programming model for massively parallel processors offers a flexible implementation and a power efficiency close to that of ASIC solutions.
This work describes a scalable parallel architecture for an H.264-compliant deblocking filter for multi-core platforms such as HyperX technology. Parallel techniques at the level of independent macroblocks, sub-blocks, and pixel rows are examined. The deblocking architecture consists of a basic cell called the deblocking filter unit (DFU) and a dependent data buffer manager (DFM). The DFU can be instantiated several times to cater to different performance needs. The DFM serves the data required by however many DFUs are instantiated, and also manages all the neighboring data required for future processing by the DFUs. This approach achieves the scalability, flexibility, and performance required in deblocking filters.
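
As an illustration of macroblock-level parallelism only, the following Python sketch groups macroblocks into "waves" whose members have no mutual dependencies and could therefore be handed to separate filter units at the same time. It is not taken from the paper: the function name `wavefront_schedule`, the `skew` parameter, and the dependency pattern are assumptions made for this example. With `skew=1` a macroblock is assumed to wait only for its left and upper neighbors. With `skew=2` it is assumed to wait for the upper-right neighbor as well.

```python
from collections import defaultdict


def wavefront_schedule(mb_cols, mb_rows, skew=1):
    """Group macroblock coordinates (x, y) into waves that can run concurrently.

    Hypothetical helper for illustration, not the paper's DFU/DFM design.
    skew=1: assumes a macroblock depends on its left and upper neighbors.
    skew=2: assumes it also depends on its upper-right neighbor.
    """
    waves = defaultdict(list)
    for y in range(mb_rows):
        for x in range(mb_cols):
            # Every assumed dependency of (x, y) has a strictly smaller index,
            # so it lands in an earlier wave.
            waves[x + skew * y].append((x, y))
    return [waves[index] for index in sorted(waves)]


if __name__ == "__main__":
    # A 1920x1088 luma frame holds 120 x 68 macroblocks of 16x16 pixels.
    schedule = wavefront_schedule(120, 68)
    print("waves:", len(schedule))
    print("largest wave:", max(len(wave) for wave in schedule))
```

Under these assumptions, the size of the largest wave bounds how many filter units can be kept busy at once at the macroblock level. Sub-block and pixel-row parallelism would have to be exploited inside each unit.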

Paper Details

Date Published: 2 February 2012
PDF: 10 pages
Proc. SPIE 8295, Image Processing: Algorithms and Systems X; and Parallel Processing for Imaging Applications II, 829512 (2 February 2012); doi: 10.1117/12.912168
Author Affiliations
Durga P. Prasad, Parallel Prisms (United States)
Sekar Sonachalam, Parallel Prisms (United States)
Mangesh K. Kunchamwar, Parallel Prisms (United States)
Nageswara Rao Gunupudi, Parallel Prisms (United States)

Published in SPIE Proceedings Vol. 8295:
Image Processing: Algorithms and Systems X; and Parallel Processing for Imaging Applications II
Karen O. Egiazarian; John Recker; Guijin Wang; Sos S. Agaian; Atanas P. Gotchev, Editor(s)

© SPIE