Proceedings Volume 6507

Multimedia on Mobile Devices 2007


View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 27 February 2007
Contents: 10 Sessions, 31 Papers, 0 Presentations
Conference: Electronic Imaging 2007
Volume Number: 6507

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 6507
  • Video Processing I
  • Video Processing II
  • Systems for Multimedia
  • Media Delivery and Management I
  • Multimedia Applications
  • HCI Issues for Multimedia I
  • HCI Issues for Multimedia II
  • Media Delivery and Management II
  • Poster Session
Front Matter: Volume 6507
Front Matter: Volume 6507
This PDF file contains the front matter associated with SPIE Proceedings Volume 6507, including the Title Page, Copyright information, Table of Contents, Introduction (if any), and the Conference Committee listing.
Video Processing I
Color adaptation of videos for mobile devices
A large number of recorded videos cannot be viewed on mobile devices (e.g., PDAs or mobile phones) due to inappropriate screen resolutions or color depths of the displays. Recently, automatic transcoding algorithms have been introduced which facilitate the playback of previously recorded videos on new devices. One major challenge of transcoding is the preservation of the semantic content of the videos. Although much work has been done on the adaptation of image resolution, color adaptation of videos has not been addressed in detail before. In this paper, we present a novel color adaptation algorithm for videos which preserves their semantics. In our approach, the color depth of a video is adapted to facilitate playback on mobile devices which support only a limited number of different colors. We analyze our adaptation approach in the experimental results, visualize adapted keyframes, and illustrate that our approach obtains better quality and makes considerably more details recognizable.
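As a rough illustration of the color-depth adaptation step (not the authors' semantics-preserving analysis), the sketch below quantizes an 8-bit RGB frame onto a coarser palette with NumPy; the function name and the 3-bits-per-channel default are assumptions for the example.

```python
# Minimal sketch: reduce the color depth of one video frame, assuming frames
# arrive as H x W x 3 uint8 arrays.  Uniform per-channel quantization only
# illustrates the adaptation idea; the paper's semantic analysis is omitted.
import numpy as np

def adapt_frame(frame: np.ndarray, bits_per_channel: int = 3) -> np.ndarray:
    """Map an 8-bit RGB frame onto 2**(3 * bits_per_channel) display colors."""
    shift = 8 - bits_per_channel
    quantized = (frame >> shift) << shift            # drop the low-order bits
    offset = (1 << (shift - 1)) if shift > 0 else 0  # re-center each bin
    return (quantized + offset).astype(np.uint8)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)
    adapted = adapt_frame(frame, bits_per_channel=3)     # 512-color palette
    print("distinct colors after adaptation:",
          len(np.unique(adapted.reshape(-1, 3), axis=0)))
```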
Complexity-constrained rate-distortion optimization for Wyner-Ziv video coding
Wyner-Ziv video coding has been widely investigated in recent years. The main characteristic of Wyner-Ziv coding is that side information is available only to the decoder. Many current Wyner-Ziv video coding schemes separate the frames of a sequence into what are known as key frames and Wyner-Ziv frames and encode the two types with different approaches. Key frames are encoded using conventional video coding methods and Wyner-Ziv frames are encoded using channel coding techniques. At the decoder, the reconstructed key frames serve as the side information used to reconstruct the Wyner-Ziv frames. We have previously presented a Wyner-Ziv scheme that uses backward-channel-aware motion estimation to encode the key frames, where motion estimation was performed at the decoder and motion information was transmitted back to the encoder. We refer to these backward predictively coded frames as BP frames. In this paper, we extend our previous work to describe three types of motion estimators. We present a model to examine the analytical complexity-rate-distortion performance of BP frames for the three motion estimators.
Dynamic full-scalability conversion in scalable video coding
For outstanding coding efficiency with scalability functions, SVC (Scalable Video Coding) is being standardized. SVC can support spatial, temporal and SNR scalability, and these scalabilities are useful for providing a smooth video streaming service even in a time-varying network such as a mobile environment. However, current SVC is insufficient to support dynamic video conversion with scalability, so the adaptation of bitrate to a fluctuating network condition is limited. In this paper, we propose dynamic full-scalability conversion methods for QoS-adaptive video streaming in SVC. To accomplish dynamic full-scalability conversion, we develop corresponding bitstream extraction, encoding and decoding schemes. At the encoder, we insert IDR NAL units periodically to solve the problems of spatial scalability conversion. At the extractor, we analyze the SVC bitstream to obtain the information that enables dynamic extraction; real-time extraction is achieved by using this information. Finally, we develop the decoder so that it can manage the changing scalability. Experimental results verify dynamic full-scalability conversion and show that it is necessary under time-varying network conditions.
Video Processing II
Displays enabling mobile multimedia
With the rapid advances in telecommunications networks, mobile multimedia delivery to handsets is now a reality. While a truly immersive multimedia experience is still far ahead in the mobile world, significant advances have been made in the constituent audio-visual technologies to make this become possible. One of the critical components in multimedia delivery is the mobile handset display. While such alternatives as headset-style near-to-eye displays, autostereoscopic displays, mini-projectors, and roll-out flexible displays can deliver either a larger virtual screen size than the pocketable dimensions of the mobile device can offer, or an added degree of immersion by adding the illusion of the third dimension in the viewing experience, there are still challenges in the full deployment of such displays in real-life mobile communication terminals. Meanwhile, direct-view display technologies have developed steadily, and can provide a development platform for an even better viewing experience for multimedia in the near future. The paper presents an overview of the mobile display technology space with an emphasis on the advances and potential in developing direct-view displays further to meet the goal of enabling multimedia in the mobile domain.
A neighborhood analysis based technique for real-time error concealment in H.264 intra pictures
H.264's extensive use of context-based adaptive binary arithmetic or variable length coding makes streams highly susceptible to channel errors, a common occurrence over networks such as those used by mobile devices. Even a single bit error will cause a decoder to discard all stream data up to the next fixed-length resynchronisation point; in the worst case an entire slice is lost. In cases where retransmission and forward error concealment are not possible, a decoder should conceal any erroneous data in order to minimise the impact on the viewer. Stream errors can often be spotted early in the decode cycle of a macroblock; aborting that decode frees processor cycles which can instead be used to conceal errors at minimal cost, even as part of a real-time system. This paper demonstrates a technique that utilises Sobel convolution kernels to quickly analyse the neighbourhood surrounding erroneous macroblocks before performing a weighted multi-directional interpolation. This generates significantly improved statistical (PSNR) and visual (IEEE structural similarity) results when compared to the commonly used weighted pixel value averaging. Furthermore, it is also computationally scalable, both during analysis and concealment, achieving maximum performance from the spare processing power available.
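A rough sketch of the general idea, under simplifying assumptions: Sobel kernels estimate edge statistics of the intact neighbourhood of a lost 16x16 luma macroblock and weight a two-directional border interpolation. The paper's multi-directional weighting is richer; the function name, the 4-pixel strips, and the weighting heuristic are illustrative only.

```python
# Sketch of neighbourhood-guided concealment of a lost 16x16 luma block.
import numpy as np
from scipy.ndimage import sobel

def conceal_macroblock(luma: np.ndarray, y: int, x: int, size: int = 16) -> None:
    """Fill luma[y:y+size, x:x+size] in place; assumes an interior block."""
    pad = 4
    above = luma[y - pad:y, x:x + size].astype(np.float32)
    beside = luma[y:y + size, x - pad:x].astype(np.float32)

    # Edge statistics of intact neighbours (the corrupted block is excluded).
    e_vert = np.abs(sobel(above, axis=1)).mean() + np.abs(sobel(beside, axis=1)).mean()
    e_horiz = np.abs(sobel(above, axis=0)).mean() + np.abs(sobel(beside, axis=0)).mean()

    top = luma[y - 1, x:x + size].astype(np.float32)
    bottom = luma[y + size, x:x + size].astype(np.float32)
    left = luma[y:y + size, x - 1].astype(np.float32)
    right = luma[y:y + size, x + size].astype(np.float32)

    t = np.linspace(0.0, 1.0, size)[:, None]
    vert = (1 - t) * top[None, :] + t * bottom[None, :]    # column-wise fill
    s = np.linspace(0.0, 1.0, size)[None, :]
    horiz = (1 - s) * left[:, None] + s * right[:, None]   # row-wise fill

    # Heuristic: weight each interpolation direction by the edge energy that favours it.
    w_v, w_h = e_vert + 1e-6, e_horiz + 1e-6
    block = (w_v * vert + w_h * horiz) / (w_v + w_h)
    luma[y:y + size, x:x + size] = np.clip(block, 0, 255).astype(luma.dtype)
```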
Reducing computational complexity of three-dimensional discrete cosine transform in video coding process
Low-complexity video coding schemes aim to provide video encoding services also for devices with restricted computational power. A video coding process based on the three-dimensional discrete cosine transform (3D DCT) can offer a low-complexity video encoder by omitting the computationally demanding motion estimation operation. In this coding scheme, the fast transform, extended to three dimensions, is used instead of motion estimation to decorrelate the temporal dimension of the video data. Typically, the most complex part of the 3D DCT based coding process is the three-dimensional transform itself. In this paper, we demonstrate methods that can be used in a lossy coding process to reduce the number of one-dimensional transforms required to complete the full 3D DCT or its inverse. Because unnecessary computations can be omitted, fewer operations are required to complete the transform. Results include the computational savings obtained for standard video test sequences, reported in terms of computational operations. Generally, a reduced number of computational operations also implies longer battery lifetime for portable devices.
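One place such savings appear is the inverse transform, where coarse quantization leaves many 1-D coefficient vectors entirely zero. The sketch below, a simplified illustration rather than the paper's method, performs a separable inverse 3D DCT on an 8x8x8 cube and skips every all-zero vector; the cube size and function names are assumptions.

```python
# Separable inverse 3D DCT that skips all-zero 1-D vectors and counts the
# 1-D transforms actually performed.
import numpy as np
from scipy.fft import idct

def _maybe_idct(vec: np.ndarray, counter: list) -> np.ndarray:
    if not np.any(vec):              # inverse DCT of an all-zero vector is zero
        return vec
    counter[0] += 1
    return idct(vec, norm="ortho")

def sparse_idct3(coeffs: np.ndarray):
    """Return the reconstructed cube and the number of 1-D transforms used."""
    data = coeffs.astype(np.float64)
    counter = [0]
    for axis in range(3):
        data = np.apply_along_axis(_maybe_idct, axis, data, counter)
    return data, counter[0]

if __name__ == "__main__":
    cube = np.zeros((8, 8, 8))
    cube[0, 0, 0] = 100.0                      # only the DC coefficient survives
    _, n = sparse_idct3(cube)
    print("1-D transforms performed:", n, "out of", 3 * 8 * 8)   # 73 out of 192
```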
Systems for Multimedia
Restructuring a software based MPEG-4 video decoder for short latency hardware acceleration
Jani Boutellier, Olli Silvén, Tamas Erdelyi
The multimedia capabilities of emerging high-end battery-powered mobile devices rely on monolithic hardware accelerators with long latencies to minimize interrupt and software overheads. When compared to pure software implementations, monolithic hardware accelerator solutions need an order of magnitude less power. However, they are rather inflexible and difficult to modify to provide support for multiple coding standards. A more flexible alternative is to employ finer-grained short-latency accelerators that implement the individual coding functions. Unfortunately, with this approach the software overheads can become very high if interrupts are used for synchronizing the software and hardware. Preferably, the cost of hardware accelerator interfacing should be at the same level as that of software functions. In this paper we study the benefits attainable from such an approach. As a case study, we restructure an MPEG-4 video decoder in a manner that enables the simultaneous decoding of multiple bit streams using short-latency hardware accelerators. The approach takes multiple video bit streams as input and produces a multiplexed stream that is used to control the hardware accelerators without interrupts. The decoding processes of each stream can be considered as threads that share the same hardware resources. Software simulations predict that the energy efficiency of the approach would be significantly better than for a pure software implementation.
Hand-held analog television over WiMAX executed in SW
Daniel Iancu, Hua Ye, Murugappan Senthilvelan, et al.
This paper describes a device capable of performing the following tasks: it samples and decodes the composite analog TV signal, encodes the resulting RGB data into an MPEG-4 stream and sends it over a WiMAX link. On the other end of the link a similar device receives the WiMAX signal, in either TDD or FDD mode, decodes the MPEG data and displays it on the LCD display. The device can be a hand-held device, such as a mobile phone or a PDA. The algorithms for the analog TV, the WiMAX physical layer, the WiMAX MAC and the MPEG encoder/decoder are executed entirely in software in real time, using Sandbridge Technologies' low-power SB3011 digital signal processor. The SB3011 multithreaded digital signal processor includes four DSP cores with eight threads each, and one ARM processor. The execution of the algorithms requires all four cores in FDD mode. The WiMAX MAC is executed on the ARM processor.
Implementation of DCT-based video denoising algorithm with OMAP Innovator Development Kit
Recent development in the field of embedded systems has enabled mobile devices with significant computation power and long battery life. However, there is still only a limited number of video applications for such platforms. Due to the high computational requirements of video processing algorithms, intensive assembler optimization or even hardware design is required to meet the resource constraints of mobile platforms. One example of such a challenging video processing problem is video denoising. In this paper, we present a software implementation of a state-of-the-art video denoising algorithm on a mobile computational platform. The chosen algorithm is based on the three-dimensional discrete cosine transform (3D DCT) and block matching. Apart from its architectural simplicity, the algorithm allows computational scalability due to its "sliding window"-style processing. In addition, the main components of this algorithm are the 8-point DCT and block matching, which can be efficiently calculated with the hardware acceleration of a modern DSP. Our target platform is the OMAP Innovator development kit, a dual-processor environment including an ARM 925 RISC general-purpose processor (GPP) and a TMS320C55x digital signal processor (DSP). The C55x DSP offers hardware acceleration support for computing the DCT and block matching used intensively in the chosen denoising algorithm. Hardware acceleration can offer a significant speed-up in comparison to assembler optimization of the source code. The results demonstrate the possibility of implementing an efficient video denoising algorithm on a mobile computational platform with limited computational resources.
Multimedia support for J2ME on high-end PDAs
Mike Lehmann, Thomas Preuss, Mark Rambow, et al.
Platform independence is the major advantage of the Java programming language. Whereas Java is widespread on servers, desktop computers and mobile phones, PDA and Pocket PC applications are usually written in C, or in C# on the .NET Compact Framework. The paper focuses on the J2ME standard and its suitability for Pocket PCs and PDAs. In that context we also give an overview of existing Java Virtual Machines (JVMs). In particular we evaluate Esmertec's Jbed CDC, IBM Websphere Everyplace, Creme and MySaifu and compare them according to functional criteria as well as standard conformance and performance. Furthermore, a set of tests to benchmark these different JVMs is given. Finally, an example application is implemented, as part of the Bosporus project, to evaluate the JVMs from the programmer's perspective.
Media Delivery and Management I
Underflow prevention for AV streaming media under varying channel conditions
We propose a method to prevent receiver buffer underflow for AV streaming media under varying channel conditions using a variable-scale-factor adaptive media playout algorithm. Our proposed algorithm dynamically calculates a slow or a fast playout factor based on the current buffer state, target buffer level, past history of media data reception, estimate of future data arrival, content characteristics and the estimated current network conditions. As a result, our algorithm prevents underflow in situations where prior-art approaches can still underflow. Our algorithm can also avoid oscillations between slow and fast playout. Moreover, variants of our algorithm provide a smoother transition from the adaptive playout stage back to normal playout, thus improving the perceptual user experience. The proposed algorithm can also be used to reduce initial buffering latency at the start of a media stream playback while achieving the same robustness by reaching the desired target buffer level. We present a number of network simulation results for our proposed approach.
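A minimal sketch of how such a playout factor might be derived from the buffer state and an arrival-rate estimate; the thresholds, weights and clamping range below are illustrative assumptions, not the paper's formulation.

```python
# Compute a variable playout scale factor (1.0 = normal speed).
def playout_factor(buffer_ms: float,
                   target_ms: float,
                   est_arrival_rate: float,     # media ms received per wall-clock ms
                   max_slowdown: float = 0.75,
                   max_speedup: float = 1.25) -> float:
    # Signed deviation of the buffer from its target level.
    deviation = (buffer_ms - target_ms) / target_ms
    # If data arrives slower than it is consumed, bias towards slowing down.
    supply_bias = min(est_arrival_rate, 1.0) - 1.0
    factor = 1.0 + 0.5 * deviation + 0.5 * supply_bias
    # Clamp to a perceptually tolerable range; snap tiny corrections back to
    # normal playout to avoid oscillating between slow and fast modes.
    factor = max(max_slowdown, min(max_speedup, factor))
    return 1.0 if abs(factor - 1.0) < 0.02 else factor

# Example: buffer at half its target while the channel delivers only 80% of
# the consumption rate, so playout slows down (clamped to 0.75).
print(playout_factor(buffer_ms=1000, target_ms=2000, est_arrival_rate=0.8))
```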
The interactive contents authoring system for terrestrial digital multimedia broadcasting
Won-Sik Cheong, Sangwoo Ahn, Jihun Cha, et al.
This paper introduces an interactive contents authoring system which can easily and conveniently produce interactive contents for Terrestrial Digital Multimedia Broadcasting (T-DMB). For interactive broadcasting services, T-DMB adopted MPEG-4 Systems technology. For the interactive service to flourish on the market, various types of interactive content should be well provided before the service launches. In the MPEG-4 Systems specification, broadcasting contents are described by the combination of a large number of nodes, routes and descriptors. In order to provide interactive data services through the T-DMB network, it is essential to have an interactive contents authoring system which allows content authors to compose interactive contents easily and conveniently even if they lack any background in MPEG-4 Systems technology. The introduced authoring system provides a powerful graphical user interface and produces interactive broadcasting contents in both binary and textual formats. Therefore, the interactive contents authoring system presented in this paper should contribute greatly to a flourishing interactive service.
Musical Slide Show MAF with Protection and Governance using MPEG-21 IPMP Components and REL
The Musical Slide Show Multimedia Application Format (MAF), which is currently being standardized by the Moving Picture Experts Group (MPEG), conveys the concept of combining several established standard technologies in a single file format. It defines the format for packing up MP3 audio data along with JPEG images, MPEG-7 Simple Profile metadata, timed text, and an MPEG-4 LASeR script. Musical Slide Show MAF content is presented with the JPEG images and timed text synchronized to the MP3 audio track, and rendering effects on the JPEG images can be supported by the MPEG-4 LASeR script. The Musical Slide Show MAF thus enriches the consumption of MP3 content with synchronized and rendered JPEG images and text as well as MPEG-7 metadata about the MP3 audio content. However, there is no protection and governance mechanism for the Musical Slide Show MAF, which is an essential element for deploying this sort of content. In this paper, to manage Musical Slide Show MAF contents in a controlled manner, we present a protection and governance mechanism using MPEG-21 Intellectual Property Management and Protection (IPMP) Components and MPEG-21 Rights Expression Language (REL) technologies. We implement an authoring tool and a player tool for Musical Slide Show MAF contents and show the experimental results as well.
Multimedia Applications
Framework for emotional mobile computation for creating entertainment experience
Ambient media are media that manifest in the natural environment of the consumer. The perceivable borders between the media and the context in which they are used are becoming more and more blurred. The consumer moves through a digital space of services throughout his daily life. As we develop towards an experience society, the central point in the development of services is the creation of a consumer experience. This paper reviews the possibilities and potential of creating entertainment experiences with mobile phone platforms. It reviews sensor networks capable of acquiring consumer behavior data, interactivity strategies, and psychological models for emotional computation on mobile phones, and lays the foundations of a nomadic experience society. The paper rounds off with a presentation of several possible service scenarios in the field of entertainment and leisure computation on mobiles. The goal of this paper is to present a framework and an evaluation of the possibilities of applying sensor technology on mobile platforms to enhance the consumer's entertainment experience.
Sensometrics: Identifying pen digitizers by statistical multimedia signal processing
In this paper a new approach is introduced to identify pen-based digitizer devices based on handwritten samples used for biometric user authentication. This new method of digitizer identification based on signal properties can also be seen as a contribution to the emerging research area of so-called sensometrics. The goal of the work presented in this paper is to identify statistical features, derived from signals provided by pen-based digitizer tablets during the writing process, which allow identification, or at least group discrimination, of different device types. Based on a database of approximately 40,000 writing samples taken on 23 different pen digitizers, specific features for class discrimination are chosen, and a novel feature-vector-based classification system is implemented and experimentally validated. The goal of our experimental validation is to study the class space that can be obtained given a specific feature set, i.e. to which degree single tablets and/or groups of pen digitizers can be identified using our decision-tree-based classification. The results confirm that a group discrimination of devices can be achieved: by applying this new approach, the 23 different tablets in our database can be discriminated into 19 output groups.
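The shape of such a pipeline, sketched under heavy assumptions: a few statistical features are derived from pen signals (x, y, pressure over time) and a decision tree discriminates device groups. The feature set, the synthetic data and all names below are illustrative; the paper's 40,000-sample database and actual features are not reproduced.

```python
# Sketch: statistical features from pen-tablet signals + decision tree grouping.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def signal_features(x, y, pressure):
    """A few illustrative statistics of one handwriting signal."""
    dx, dy = np.diff(x), np.diff(y)
    return [
        np.std(pressure),            # pressure noise / quantization behaviour
        np.mean(np.hypot(dx, dy)),   # mean pen displacement per sample
        len(np.unique(pressure)),    # effective pressure resolution
    ]

rng = np.random.default_rng(1)
samples, labels = [], []
for tablet_id in range(3):                          # pretend: 3 device groups
    for _ in range(50):
        n = 200
        x = np.cumsum(rng.normal(0, 1 + tablet_id, n))
        y = np.cumsum(rng.normal(0, 1 + tablet_id, n))
        pressure = rng.integers(0, 256 // (tablet_id + 1), n)
        samples.append(signal_features(x, y, pressure))
        labels.append(tablet_id)

clf = DecisionTreeClassifier(max_depth=4, random_state=0)
clf.fit(samples[::2], labels[::2])                  # train on every other sample
print("held-out accuracy:", clf.score(samples[1::2], labels[1::2]))
```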
Semantic consumption of photos on mobile devices
Seungji Yang, Sihyoung Lee, Yong Man Ro, et al.
In this paper, we propose a new and promising photo album application format which enables augmented use of digital home photos over a wide range of mobile devices and semantic photo consumption while minimizing the user's manual tasks. The photo album application format packages a photo collection and its associated metadata based on the MPEG-4 file format. The schema of the album metadata is designed on two levels: collection- and item-level descriptions. The collection-level description is metadata related to a group of photos, each of which has an item-level description that contains its detailed information. To demonstrate the use of the proposed album format on mobile devices, a photo album system was also developed, which realizes semantic photo consumption in terms of situation, category, and person.
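A minimal sketch of the two-level metadata idea using plain dataclasses; the field names are assumptions for illustration, since the actual schema is defined by the proposed MPEG-4-based application format, not by this code.

```python
# Two-level album metadata: a collection-level description holding
# item-level descriptions of individual photos.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ItemDescription:              # item level: one photo
    file_name: str
    taken_at: str                   # e.g. "2007-01-27T14:30:00"
    category: str                   # e.g. "travel", "family"
    persons: List[str] = field(default_factory=list)

@dataclass
class CollectionDescription:        # collection level: a group of photos
    title: str
    situation: str                  # e.g. "vacation", "birthday party"
    items: List[ItemDescription] = field(default_factory=list)

album = CollectionDescription(
    title="Winter trip",
    situation="vacation",
    items=[ItemDescription("IMG_0001.JPG", "2007-01-27T14:30:00",
                           "travel", persons=["Alice"])],
)
# Semantic query at the item level, e.g. all photos showing a given person:
print([i.file_name for i in album.items if "Alice" in i.persons])
```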
HCI Issues for Multimedia I
Smooth transitions for mobile imagery browsing
René Rosenbaum, Heidrun Schumann
Due to the limitations of mobile environments, the handling of imagery is still problematic. This becomes especially apparent if two images are to be blended in order to create a smooth transition. As existing techniques fail to adapt to the restrictions of mobile environments, new ideas for the appropriate implementation of the blending are required. This publication proposes a new approach for the creation of transitions on mobile hardware. The foundation is the feature set of the new image compression standard JPEG2000. The main properties used within the proposed strategy are the Discrete Wavelet Transform as an integral part of the codec and the options for Random Spatial Access within the data-stream. Based on this, different options to blend the contents of two images in the JPEG2000 domain, in both stationary and remote environments, are introduced. A discussion of the results shows that this saves valuable computing power, can be applied to partial image data and achieves better visual results than traditional approaches. Thus, the proposed approach is well suited to support the browsing of image sequences on mobile devices.
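To give the flavour of wavelet-domain blending (not the codestream-level technique of the paper), the sketch below mixes one-level DWT coefficients of two grayscale images with PyWavelets and inverts the transform once; the generic biorthogonal filter stands in for the JPEG2000 5/3 and 9/7 filters.

```python
# Blend two equally sized grayscale images in the wavelet domain.
import numpy as np
import pywt

def wavelet_blend(img_a: np.ndarray, img_b: np.ndarray, alpha: float) -> np.ndarray:
    """Mix one-level DWT coefficients of img_a and img_b, then invert once."""
    ca, (ch, cv, cd) = pywt.dwt2(img_a.astype(np.float64), "bior2.2")
    cb, (dh, dv, dd) = pywt.dwt2(img_b.astype(np.float64), "bior2.2")
    mix = lambda p, q: (1.0 - alpha) * p + alpha * q
    blended = pywt.idwt2((mix(ca, cb), (mix(ch, dh), mix(cv, dv), mix(cd, dd))),
                         "bior2.2")
    return np.clip(blended, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    a = np.full((64, 64), 40, dtype=np.uint8)
    b = np.full((64, 64), 200, dtype=np.uint8)
    print(wavelet_blend(a, b, alpha=0.5).mean())   # roughly 120
```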
Experienced quality factors: qualitative evaluation approach to audiovisual quality
Subjective evaluation is used to identify impairment factors of multimedia quality. The final quality is often formulated via quantitative experiments, but this approach has its constraints, as subjects' quality interpretations, experiences and quality evaluation criteria are disregarded. To identify these quality evaluation factors, this study examined qualitatively the criteria participants used to evaluate audiovisual video quality. A semi-structured interview was conducted with 60 participants after a subjective audiovisual quality evaluation experiment. The assessment compared several relatively low audio-video bitrate ratios with five different television contents on a mobile device. In the analysis, methodological triangulation (grounded theory, Bayesian networks and correspondence analysis) was applied to approach the qualitative quality. The results showed that the most important evaluation criteria were visual quality factors, content, audio quality factors, usefulness and followability, and audiovisual interaction. Several relations between the quality factors and similarities between the contents were identified. As a methodological recommendation, content- and usage-related factors need to be examined further to improve quality evaluation experiments.
HCI Issues for Multimedia II
Tablet PC interaction with digital micromirror device (DMD)
Hakki H. Refai, Mostafa H. Dahshan, James J. Sluss Jr.
Digital light processing (DLP) is an innovative display technology that uses an optical switch array, known as a digital micromirror device (DMD), which allows digital control of light. To date, DMDs have been used primarily as high-speed spatial light modulators for projector applications. A tablet PC is a notebook or slate-shaped mobile PC. Its touch screen or digitizing tablet technology allows the user to operate the notebook with a stylus or digital pen instead of a keyboard or mouse. In this paper, we describe an interface solution that translates any sketch on the tablet PC screen to an identical mirror-copy over the cross-section of the DMD micromirrors, such that the image of the sketch can be projected onto a special screen. An algorithm has been created to control each single micromirror of the hundreds of thousands of micromirrors that cover the DMD surface. We demonstrate the successful application of a DMD to a high-speed two-dimensional (2D) scanning environment, acquiring the data from the tablet screen and launching its contents to the projection screen with very high accuracy, down to the 13.68 µm x 13.68 µm mirror pitch.
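A toy sketch of the coordinate side of such an interface: stylus samples from the tablet screen are scaled to micromirror indices on the DMD array. The 1024x768 screen and mirror counts and the function name are assumptions for the example, not figures from the paper.

```python
# Map stylus screen coordinates to DMD micromirror (column, row) indices.
def to_mirror(px: float, py: float,
              screen=(1024, 768), dmd=(1024, 768),
              mirror_on: bool = True):
    """Scale a screen coordinate to the nearest mirror and return the
    on/off state to be written for that mirror."""
    col = min(dmd[0] - 1, max(0, round(px * dmd[0] / screen[0])))
    row = min(dmd[1] - 1, max(0, round(py * dmd[1] / screen[1])))
    return col, row, mirror_on

# A pen stroke is a sequence of sampled points; each sample switches the
# corresponding mirror so the sketch appears on the projection screen.
stroke = [(10.0, 10.0), (11.5, 10.2), (13.0, 10.5)]
frame_updates = [to_mirror(x, y) for x, y in stroke]
print(frame_updates)
```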
TiDi browser: a novel photo browsing technique for mobile devices
Gerald Bieber, Christian Tominski, Bodo Urban
Today's digital photos can be tagged with information about when and where they were taken. On stationary computers, this information is often used to drive photo browsing; this is not yet the case for mobile devices. We describe first results of our current research on a novel photo browsing technique called TiDi Browser. TiDi Browser exploits the time and location information available in digital photos to facilitate the identification of personal events and the detection of patterns of specific occurrences in time and space. Along with a main view and thumbnail previews, our browser application provides two time lines. One time line visualizes the number of photos taken per temporal unit (e.g., day, week, etc.), allowing users to easily detect personal events in time. The second time line visualizes location information. Since two- or three-dimensional locations are difficult to represent on small displays, we reduce the location information to one-dimensional distance information, which is shown in the second time line. Both time lines serve a second purpose as a graphical user interface: they can be used to browse in time. Even larger photo collections can thus be browsed on very small displays intuitively and efficiently. We implemented our ideas in an interactive prototype that uses a client-server architecture. To save bandwidth, we transmit appropriately scaled photos that fit the display dimensions of the client (mobile device). To enhance the user's browsing experience, we apply caching and prefetching strategies.
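A small sketch of how the two time lines could be derived from photo tags: photos binned per day for the first, and a one-dimensional distance signal for the second. The haversine reduction relative to a fixed reference point and the dummy data are assumptions for illustration.

```python
# Derive the two TiDi-style time lines from (date, latitude, longitude) tags.
from collections import Counter
from datetime import date
from math import radians, sin, cos, asin, sqrt

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS positions (haversine)."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi, dlmb = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

photos = [                                   # dummy photo tags
    (date(2007, 2, 25), 53.55, 13.26),
    (date(2007, 2, 25), 53.56, 13.27),
    (date(2007, 2, 27), 48.14, 11.58),
]
home = (54.09, 12.14)                        # e.g. the user's home town

per_day = Counter(day for day, _, _ in photos)                        # time line 1
distances = [distance_km(lat, lon, *home) for _, lat, lon in photos]  # time line 2
print(dict(per_day), [round(d) for d in distances])
```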
Of MOS and men: bridging the gap between objective and subjective quality measurements in mobile TV
T. C. M. de Koning, P. Veldhoven, H. Knoche, et al.
In this paper we explore the relation between subjective and objective measures of video quality. We computed objective MOS values from video clips using the video quality measuring tool VQM and compared them to the clips' subjective Acceptability scores. Using the ITU-defined mapping (M2G) from MOS to binary Good or Better (GoB) values, we compared the M2G-translated values to the clips' subjective Acceptability scores at various encoding bitrates (32-224 kbps) and sizes (120x90, 168x126, 208x156 and 240x180). The results show that in the domain of mobile TV the ITU mapping M2G seriously overestimates Acceptability. The mapping M2A between MOS and Acceptability that we suggest provides a significant improvement of 76% in root mean square error (RMSE) over M2G. We show that Acceptability depends on more than just the visual quality and that both content type and size are essential to provide accurate estimates of Acceptability in the field of mobile TV. We illustrate this gain in Acceptability predictions for the popular content type football (soccer): in terms of RMSE, our content-dependent mapping (M2Af) yielded an improvement of 39% over M2A. Future research will validate the predictive power of our suggested mapping on other video material.
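To show how such mappings are scored (not the ITU GoB curve or the paper's fitted M2A), the sketch below maps MOS to acceptability with two placeholder logistic curves and compares them against dummy subjective data by RMSE.

```python
# Compare two candidate MOS-to-acceptability mappings by RMSE.
import numpy as np

def logistic_map(mos, midpoint, slope):
    """Map MOS (1..5) to a 0..1 acceptability estimate."""
    return 1.0 / (1.0 + np.exp(-slope * (mos - midpoint)))

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

mos = np.array([2.1, 2.8, 3.4, 3.9, 4.3])                  # objective scores (dummy)
acceptability = np.array([0.35, 0.55, 0.70, 0.82, 0.90])   # subjective data (dummy)

loose_fit = logistic_map(mos, midpoint=2.0, slope=2.0)     # overestimates acceptability
tighter_fit = logistic_map(mos, midpoint=3.0, slope=1.5)   # closer to the data
print(rmse(loose_fit, acceptability), rmse(tighter_fit, acceptability))
```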
Interfaces for mobile image browsing
René Rosenbaum, Heidrun Schumann
The available screen space of common mobile hardware is small compared to that of stationary devices. This leads to significant usability problems if large imagery is to be displayed. These can be overcome by an appropriate user interface in the form of an image browsing technique. Such an interface consists of two main parts: content representation and means for interaction. However, many existing techniques do not consider the limitations of mobile environments. To overcome them, this publication proposes three new interactive image browsing techniques based on approaches from general content browsing. To account for remote environments, it is also shown how to combine each technique with a dynamic image streaming strategy that significantly decreases the transmitted data volume. Overall, a significant enhancement regarding representation, interaction and use of bandwidth compared to traditional strategies is achieved by the introduced image browsing techniques.
Media Delivery and Management II
Novel layered scalable video coding transmission over MIMO wireless systems with partial CSI and adaptive channel selection
In this paper, we present a novel layered scalable video transmission scheme over multi-input multi-output (MIMO) wireless systems. The proposed scheme is able to adaptively select the MIMO sub-channels for prioritized delivery of layered video signals based only on estimated partial channel state information (CSI). This scheme is fundamentally different from open-loop (OL) MIMO systems such as V-BLAST, in which the CSI is only available at the receiver side. Without CSI at the transmitter, data sequences in OL-MIMO are transmitted simultaneously with equal power via multiple antennas. Therefore, OL-MIMO systems are not appropriate for transmitting compressed video data that need prioritized transmission. In this research, we assume that partial CSI, i.e. the ordering of each sub-channel's SNR strength, is available at the transmitting end through simple estimation and feedback. The adaptive channel selection (ACS) algorithm we developed switches the bit-streams automatically to match the ordering of SNR strength of the sub-channels. Essentially, the proposed ACS algorithm launches the higher-priority layer bit-stream into the sub-channel with higher SNR strength. In this fashion, we implicitly achieve unequal error protection (UEP) for layered scalable video coding transmission over a MIMO system. Experimental results show that the proposed scheme is able to achieve UEP with partial CSI, and the reconstructed video PSNRs demonstrate the performance improvement of the proposed system compared with an OL-MIMO system.
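The selection step reduces to a priority-to-strength assignment, sketched below under the assumption that only the fed-back SNR ordering is available; the layer names and channel indices are illustrative.

```python
# Adaptive channel selection: strongest sub-channel gets the highest-priority layer.
def select_channels(layer_priorities, snr_order):
    """layer_priorities: layer ids from most to least important (base layer first).
       snr_order: sub-channel indices from strongest to weakest, as fed back
       by the receiver.  Returns {layer_id: sub_channel_index}."""
    return {layer: ch for layer, ch in zip(layer_priorities, snr_order)}

# Example: 3 video layers over a 4x4 MIMO system whose receiver currently
# reports sub-channel 2 as the strongest.
mapping = select_channels(layer_priorities=["base", "enh1", "enh2"],
                          snr_order=[2, 0, 3, 1])
print(mapping)   # {'base': 2, 'enh1': 0, 'enh2': 3}
```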
An efficient client-based JPEG2000 image transmission protocol
J. P. Ortiz, V. G. Ruiz, I. García
JPEG2000 is a new still-image coding standard that makes it possible to implement efficient remote image browsing applications. In this kind of application, the client retrieves from a server only the desired regions of the remote images. Part 9 of the standard defines the JPIP protocol, offering a complete set of syntaxes and methods for the remote interrogation of JPEG2000 images. With the JPIP protocol, all of the heavy processing is done on the server side, so the client only has to request the desired region and wait for the associated information. This small advantage comes with several disadvantages: partial proxy caching is not supported, improvement techniques such as data prefetching are difficult to implement, and behaviour in wireless environments is poor. An image transmission protocol with a more capable client side avoids all of these disadvantages while requiring only a little more processing in the client. In this paper the current JPIP protocol is analyzed, showing how its philosophy affects the situations mentioned above. A new JPEG2000 image transmission protocol based on JPIP, J2KP, is presented. Its performance is compared with JPIP, and it is demonstrated that the features rejected by JPIP offer a considerable increase in performance when applied to the new protocol. Moreover, the processing overhead in the clients is nearly negligible.
Error resilient image transmission based on virtual SPIHT
Rongke Liu, Jie He, Xiaolin Zhang
SPIHT is one of the most efficient image compression algorithms. It has been successfully applied to a wide variety of images, such as medical and remote sensing images. However, it is highly susceptible to channel errors: a single bit error can potentially lead to decoder derailment. In this paper, we integrate new error-resilience tools into a wavelet coding algorithm and present an error-resilient image transmission scheme based on virtual set partitioning in hierarchical trees (SPIHT), EREC and a self-truncation mechanism. After wavelet decomposition, the virtual spatial-orientation trees in the wavelet domain are individually encoded using virtual SPIHT. Since the self-similarity across subbands is preserved, a high source coding efficiency can be achieved. The scheme is essentially tree-based coding, so error propagation is limited to within each virtual tree. The number of virtual trees may be adjusted according to the channel conditions: when the channel is excellent, we may decrease the number of trees to further improve the compression efficiency; otherwise, we increase the number of trees to guarantee error resilience. EREC is also adopted to enhance the error-resilience capability of the compressed bit streams. At the receiving side, a self-truncation mechanism based on the self-constraint of the set partition trees is introduced. Decoding of a sub-tree halts when a violation of the self-constraint relationship occurs in that tree, so the bits impacted by error propagation are limited and more likely located in the low bit-layers. In addition, an inter-tree interpolation method is applied so that some errors are compensated. Preliminary experimental results demonstrate that the proposed scheme achieves substantially better error resilience.
Complexity analysis and control in joint channel protection system for wireless video communications
Xin Jin, Guangxi Zhu
In wireless communications, channel coding and error control are essential to protect the video data from wireless interference. The power they consume, which is determined by the protection method used, directly affects system performance, especially on the decoding side. In this paper, a channel coding and error control system, called a joint channel protection (JCP) system here, is proposed as an improvement of the hybrid automatic repeat request (HARQ) system that integrates complexity controllability. Complexity models of the encoder and decoder are established based on theoretical analysis and statistical data retrieval using the time-complexity concept, and the relative variation in computational complexity is carefully studied to provide a proportional variation reference for complexity control. Based on these models, strategies are designed to control the system complexity by adjusting the packet length, the number of decoding iterations and the retransmission ratio according to the decoding quality and complexity level.
Poster Session
Intuitive user interface for mobile devices based on visual motion detection
Stefan Winkler, Karthik Rangaswamy, ZhiYing Zhou
The small form factor and unergonomic keys of mobile phones call for new and more natural approaches in user interface (UI) design. In this paper, we propose intuitive motion-based UI controls for mobile devices with built-in cameras based on the visual detection of the device's self-motion. We developed a car-racing game to test our new interface, and we conducted a user study to evaluate the accuracy, sensitivity, responsiveness and usability of our proposed system. Results show that our motion-based interface is well received by the users and clearly preferred over traditional button-based controls.
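A toy sketch of the underlying idea, under simplifying assumptions: the global shift between two consecutive camera frames is estimated by an exhaustive search over small displacements and then mapped to a steering value. The search range, the SAD criterion and the steering scale are assumptions; the paper's detector is not reproduced here.

```python
# Estimate device self-motion from two grayscale frames and map it to UI input.
import numpy as np

def global_shift(prev: np.ndarray, curr: np.ndarray, max_shift: int = 4):
    """Return the (dx, dy) that best aligns curr with prev (SAD criterion)."""
    h, w = prev.shape
    m = max_shift
    best, best_sad = (0, 0), np.inf
    core = prev[m:h - m, m:w - m].astype(np.int32)
    for dy in range(-m, m + 1):
        for dx in range(-m, m + 1):
            cand = curr[m + dy:h - m + dy, m + dx:w - m + dx].astype(np.int32)
            sad = np.abs(core - cand).mean()
            if sad < best_sad:
                best, best_sad = (dx, dy), sad
    return best

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    prev = rng.integers(0, 256, size=(60, 80), dtype=np.uint8)
    curr = np.roll(prev, shift=(0, 2), axis=(0, 1))  # content shifted 2 px right
    dx, dy = global_shift(prev, curr)
    steering = -dx / 4.0        # e.g. map horizontal motion to a steering value
    print(dx, dy, steering)
```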
Bidirectional Traffic Status Image Information Service based on T-DMB
Youngho Jeong, Soonchoul Kim, Geon Kim, et al.
Terrestrial Digital Multimedia Broadcasting (T-DMB) can provide a variety of data services such as traffic and travel information (TTI), EPG and BWS. Among these data services, the demand for TTI services has dramatically increased due to the rapid increase in the number of vehicles and in the number of people frequently traveling on long weekends. The TPEG protocol was developed to provide TTI services; however, it has so far been applied to only two application areas, RTM and PTI. Recently, traffic status image information (TSI) services based on still images or moving pictures have become more important, because public confidence in the accuracy of the currently used coded traffic information is low. In this paper, we propose a novel bidirectional TSI application. It can overcome the drawback of the conventional RTM service, which gives users no chance to make their own route decisions. The proposed application protocol is designed to be interoperable with TPEG and can be provided over T-DMB. To verify the stability of the proposed application protocol, we implemented the data server/client, the receiving platform and the bidirectional contents server/host server. Through experimental broadcasting using T-DMB and wireless communication networks, it is shown that the proposed bidirectional TSI application protocol operates stably and effectively.
Codesign toolset for application-specific instruction set processors
Pekka Jääskeläinen, Vladimír Guzma, Andrea Cilio, et al.
Application-specific programmable processors tailored to the requirements at hand are often at the center of today's embedded systems. Therefore, it is not surprising that considerable effort has been spent on constructing tools that assist in codesigning application-specific processors for embedded systems. It is desirable that such design toolsets support an automated design flow from application source code down to a synthesizable processor description and optimized machine code. In this paper, such a toolset is described. The toolset is based on a customizable processor architecture template, a VLIW-derived architecture paradigm called Transport Triggered Architecture (TTA). The toolset addresses some of the pressing shortcomings found in existing toolsets, such as the lack of automated design-space exploration, limited run-time retargetability of the design tools, and restrictions on the customization of the target processors.
Dynamic power management for UML modeled applications on multiprocessor SoC
Petri Kukkala, Tero Arpinen, Mikko Setälä, et al.
The paper presents a novel scheme of dynamic power management for UML modeled applications that are executed on a multiprocessor System-on-Chip (SoC) in a distributed manner. The UML models for both application and architecture are designed according to a well-defined UML profile for embedded system design, called TUT-Profile. Application processes are considered as elementary units of distributed execution, and their mapping on a multiprocessor SoC can be dynamically changed at run-time. Our approach on the dynamic power management balances utilized processor resources against current workload at runtime by (1) observing the processor and workload statistics, (2) re-evaluating the amount of required resources (i.e. the number of active processors), and (3) re-mapping the application processes to the minimum set of active processors. The inactive processors are set to a power-save state by using clock-gating. The approach integrates the well-known power management techniques tightly with the UML based design of embedded systems in a novel way. We evaluated the dynamic power management with a WLAN terminal implemented on a multiprocessor SoC on Altera Stratix II FPGA containing up to five Nios II processors and dedicated hardware accelerators. Measurements proved up to 21% savings in the power consumption of the whole FPGA board.