Proceedings Volume 7256

Multimedia on Mobile Devices 2009

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 28 January 2009
Contents: 7 Sessions, 23 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2009
Volume Number: 7256

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 7256
  • Security and Services for Mobile Devices
  • Mobile Media Coding and Processing
  • Large Media Processing
  • Safety and Location
  • 3D Video Delivery for Mobile Devices
  • Interactive Paper Session
Front Matter: Volume 7256
This PDF file contains the front matter associated with SPIE Proceedings Volume 7256, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Security and Services for Mobile Devices
A secure wireless mobile-to-server link
Modern mobile devices are among the most technologically advanced devices that people use on a daily basis, and current trends indicate continuous growth in mobile phone applications. Phones are now equipped with cameras that can capture still images and video, and with software that can read, convert, manipulate, communicate, and save multimedia in multiple formats. This tremendous progress has increased the volume of communicated sensitive information, which should be protected against unauthorized access. This paper discusses two general approaches to data protection, steganography and cryptography, and demonstrates how to integrate such algorithms with a mobile-to-server link used by many applications.
Image encryption based on edge information
This paper presents a new concept of image encryption based on edge information. The basic idea is to separate the image into its edges and the image without edges, and to encrypt them using any existing or new encryption algorithm. The user has the flexibility to encrypt the edges, the image without edges, or both, so that different security requirements can be met. The encrypted images are difficult for unauthorized users to decode, providing a high level of security. We also introduce a new lossless encryption algorithm using the 3D Cat Map. This algorithm can fully encrypt 2D images in a straightforward one-step process, simultaneously changing pixel locations and pixel values. Experimental examples demonstrate the performance of the presented algorithm in image encryption, and it can also withstand chosen-plaintext attacks. The presented encryption approach can encrypt all 2D and 3D images and can easily be implemented on mobile devices.
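The permutation idea behind such chaotic-map ciphers can be illustrated with the classical 2D Arnold cat map. The sketch below is a minimal illustration only: it scrambles pixel positions of a square grayscale image and is not the authors' one-step 3D Cat Map, which also changes pixel values.

```python
import numpy as np

def arnold_cat_map(img: np.ndarray, iterations: int = 10) -> np.ndarray:
    """Scramble a square grayscale image by iterating the 2D Arnold cat map.

    Each iteration moves the pixel at (x, y) to ((x + y) % n, (x + 2y) % n),
    a bijection that permutes positions without touching pixel values
    (unlike the 3D Cat Map described in the paper).
    """
    n = img.shape[0]
    assert img.shape[0] == img.shape[1], "the cat map needs a square image"
    out = img.copy()
    for _ in range(iterations):
        scrambled = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                scrambled[(x + y) % n, (x + 2 * y) % n] = out[x, y]
        out = scrambled
    return out
```

Because the map is periodic, decryption simply iterates the same map until the original arrangement returns, or applies the inverse map the same number of times.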
Mobile Media Coding and Processing
On-demand learning system using 4K video source
Akira Yutani, Yoshitsugu Manabe, Hideki Sunahara, et al.
There are various kinds of learning systems in the world, and quite a lot of them use video sources. These video sources also differ according to the learning content and aims. In this paper, we describe the usability of learning systems that use a super-high-definition video source, focusing on how such high-resolution material can be handled. Furthermore, we consider future progress and present problems by proposing an on-demand learning system using a super-high-definition video source. Super high resolution here means 4K (4096x2160 pixels).
Adaptive timeline aware client controlled HTTP streaming
We propose an adaptive, timeline-aware, client-controlled HTTP streaming method to improve performance when the client has buffer constraints and is connected to the network over a bandwidth-constrained link. The proposed approach uses the HTTP/1.1 byte-range feature or URL parameters to achieve better HTTP streaming performance and requires no changes on the HTTP server side. It can support pausing the HTTP stream without any network data transfer occurring during the paused state. It does not rely on TCP flow control and therefore works with any TCP/IP stack. The approach allows the client to intelligently employ single or multiple concurrent HTTP connections to receive streaming media, switching adaptively between them based on the media reception status with respect to the wall-clock timeline and the media playout timeline.
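As an illustration of the HTTP/1.1 byte-range feature the method builds on, the hedged sketch below fetches one range of a media resource with the Python `requests` library; the URL is hypothetical, and the paper's adaptive scheduling and connection-switching logic is not shown.

```python
import requests

def fetch_media_range(url: str, start: int, end: int) -> bytes:
    """Request one byte range of a media file over plain HTTP/1.1.

    A client with a small buffer can issue such requests on its own schedule
    (pausing simply means issuing no further requests) and can open several
    concurrent range requests when playout risks stalling.
    """
    resp = requests.get(url, headers={"Range": f"bytes={start}-{end}"}, timeout=10)
    resp.raise_for_status()  # a range-capable server answers 206 Partial Content
    return resp.content

# Example: pull the first 256 KiB of a (hypothetical) media URL.
chunk = fetch_media_range("http://example.com/media.mp4", 0, 256 * 1024 - 1)
```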
Progressive raster imagery beyond a means to overcome limited bandwidth
René Rosenbaum, Heidrun Schumann
Progressive refinement is a well-established approach to overcome bandwidth limitations in mobile environments. One outstanding benefit compared to related approaches is the provision of meaningful content previews during data transfer or processing. Although this functionality is highly relevant and useful, the related literature only addresses its support in certain communication stages or proposes systems for specific use cases. No publication is concerned with an abstraction or formalization of progression, or takes advantage of its beneficial properties in other application fields. In this publication we give a general view of progression, its key concepts, attributes, and common data processing pipeline. We abstract from specifics and usage scenarios in order to simplify the development of new algorithms and schemes and to derive guidelines for its general application. To show that progression is also able to solve problems beyond limited bandwidth, this contribution also introduces new application areas. The novel idea of content-oriented refinement allows emphasizing important image regions by an animated tour through the data. We also show that progressive representations are a very effective means for device adaptation. Both applications are motivated, discussed, and illustrated with different examples.
H.264/AVC intra-only coding (iAVC) techniques for video over wireless networks
Ming Yang, Monica Trifas, Guolun Xiong, et al.
The requirement to transmit video data over unreliable wireless networks (with the possibility of packet loss) is anticipated in the foreseeable future. Significant compression ratios and error resilience are both needed for complex applications including tele-operated robotics, vehicle-mounted cameras, sensor networks, etc. Block-matching-based inter-frame coding techniques, including MPEG-4 and H.264/AVC, do not perform well in these scenarios because of error propagation between frames. Many wireless applications therefore use intra-only coding technologies such as Motion-JPEG, which recover better from network data loss at the price of higher data rates. To address these issues, an intra-only coding scheme of H.264/AVC (iAVC) is proposed. In this approach, each frame is coded independently as an I-frame, and frame copy is applied to compensate for packet loss. This approach is a good balance between compression performance and error resilience: it achieves compression performance comparable to Motion-JPEG2000 (MJ2) with lower complexity, and error resilience similar to Motion-JPEG (MJ). Since intra-frame prediction in iAVC is strictly confined within a slice, memory usage is also extremely low. Low computational complexity and memory usage are crucial for mobile stations and devices in wireless networks.
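A minimal sketch of the frame-copy concealment step, assuming the decoder marks lost frames as `None`: because every frame is coded as an I-frame, substituting the last correctly decoded frame introduces no prediction drift into later frames.

```python
import numpy as np

def conceal_lost_frames(frames):
    """Frame-copy concealment for an intra-only stream.

    `frames` is a list of decoded frames in display order, with None standing
    in for a frame lost in the network (an assumption of this sketch). Each
    lost frame is replaced by a copy of the last good one.
    """
    last_good = None
    recovered = []
    for f in frames:
        if f is None and last_good is not None:
            recovered.append(last_good.copy())
        else:
            recovered.append(f)
            if f is not None:
                last_good = f
    return recovered
```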
New side information-generation method based on multiple reference frames for distributed video coding
Rong Ke Liu, Zhi Yue, Wei Hu
With the rapid development of wireless video communication and remote monitoring, encoders are required to be low-complexity and low-power. Distributed video coding (DVC) effectively reduces encoder complexity by shifting the majority of the computation to the decoder. Frame interpolation is a key component of a typical DVC system: it reconstructs the Wyner-Ziv (WZ) frame from the intra-coded key frames, a step known as side information generation. The smaller the difference between the side information and the original WZ frame, the lower the resulting bit rate. In this paper, we carry out a new motion-compensated interpolation based on non-uniform motion trajectories to refine the side information, using multiple reference frames. Test results show that the proposed scheme improves PSNR by 0.3-0.5 dB compared with the hierarchical block matching algorithm and improves rate-distortion performance to a certain degree.
Large Media Processing
Resource-saving image browsing based on JPEG2000, blurring, and progression
René Rosenbaum, Heidrun Schumann
Due to resource limitations, the handling of large imagery in mobile environments is still problematic. This applies especially to browsing and navigation tasks. As existing technology fails to adapt to these restrictions, new ideas for appropriate image handling are required. This publication proposes a new approach for image browsing based on JPEG2000, blurring, and progressive refinement. Blurring is an effective means of user guidance and is therefore combined with progression to improve usability during browsing. The resource-saving implementation of the user interface is founded on the image compression standard JPEG2000: the Discrete Wavelet Transform and the options for random spatial access are the main features we take advantage of in order to extract the user interface directly from the encoded image data. A discussion of the results shows that this approach saves valuable computing power and bandwidth compared to traditional technology and is an appropriate means to support browsing of large imagery on mobile devices.
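The blur-based guidance can be imitated in a few lines of Pillow. Note that this sketch simply blurs an already decoded image, whereas the paper obtains the low-detail context directly from the JPEG2000 codestream to save decoding work; file name and focus box are illustrative.

```python
from PIL import Image, ImageFilter

def blurred_context_view(path, focus_box):
    """Render a browsing view with a sharp focus region over a blurred context.

    `focus_box` is (left, upper, right, lower). The blur de-emphasizes the
    surroundings and guides the eye to the region of interest; the real
    system derives the context cheaply from low-resolution JPEG2000 data.
    """
    img = Image.open(path)
    context = img.filter(ImageFilter.GaussianBlur(radius=8))  # de-emphasized surroundings
    context.paste(img.crop(focus_box), focus_box[:2])         # sharp region of interest
    return context

view = blurred_context_view("large_photo.jpg", (400, 300, 900, 700))
```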
Adaptation of web pages and images for mobile applications
Stephan Kopf, Benjamin Guthier, Hendrik Lemelson, et al.
In this paper, we introduce our new visualization service, which presents web pages and images on arbitrary devices with differing display resolutions. We analyze the layout of a web page and simplify its structure and formatting rules, making much better use of the small screen of a mobile device. Our new image adaptation service combines several techniques. In a first step, border regions that do not contain relevant semantic content are identified and removed by cropping. Attention objects are identified in a second step: we use face detection, text detection, and contrast-based saliency maps to identify these objects and combine them into a region of interest. Optionally, the seam carving technique can be used to remove inner parts of an image. Additionally, we have developed a software tool to validate, add, delete, or modify all automatically extracted data. This tool also simulates different mobile devices, so that the user gets a feeling for what an adapted web page will look like. We have performed user studies to evaluate our web and image adaptation approach, asking questions about software ergonomics, the quality of the adapted content, and the perceived benefit of the adaptation.
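A rough sketch of the attention-object step using only face detection (the paper additionally uses text detection and contrast-based saliency maps); the file path and margin are illustrative, and the Haar cascade file ships with OpenCV.

```python
import cv2

def crop_to_faces(image_path, margin=40):
    """Crop an image to a region of interest around detected faces.

    Face detection here stands in for the fuller attention model of the paper.
    Returns the original image unchanged if no face is found.
    """
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return img
    # Bounding box around all detected faces, padded by a margin.
    x0 = max(min(x for x, y, w, h in faces) - margin, 0)
    y0 = max(min(y for x, y, w, h in faces) - margin, 0)
    x1 = min(max(x + w for x, y, w, h in faces) + margin, img.shape[1])
    y1 = min(max(y + h for x, y, w, h in faces) + margin, img.shape[0])
    return img[y0:y1, x0:x1]
```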
Graphics hardware accelerated panorama builder for mobile phones
Modern mobile communication devices frequently contain built-in cameras allowing users to capture high-resolution still images, but at the same time imaging applications face both usability and throughput bottlenecks. The difficulties of taking ad hoc pictures of printed paper documents with multi-megapixel cellular phone cameras, a common business use case, illustrate these problems. The result can be examined only after several seconds and is often blurry, so a new picture is needed even though the viewfinder image had looked good. The process can be frustrating, with waits and no way for the user to predict the quality beforehand. The problems can be traced to the mismatch between processor speed and camera resolution, and to the demands of application interactivity. In this context we analyze building mosaic images of printed documents from frames selected from VGA resolution (640x480 pixel) video. High interactivity is achieved by providing real-time feedback on quality while simultaneously guiding the user's actions. The graphics processing unit of the mobile device is used to speed up the reconstruction computations. To demonstrate the viability of the concept, we present an interactive document scanning application implemented on a Nokia N95 mobile phone.
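The frame-registration step of mosaic building can be sketched with standard feature matching; this CPU-only OpenCV illustration omits the GPU acceleration, frame selection, and blending described in the paper.

```python
import cv2
import numpy as np

def warp_onto(base, new_frame):
    """Register one video frame against the current mosaic using ORB features.

    Matches keypoints between the new frame and the mosaic, estimates a
    homography with RANSAC, and warps the new frame into mosaic coordinates.
    A real panorama builder would additionally blend the warped frame in.
    """
    orb = cv2.ORB_create(1000)
    k1, d1 = orb.detectAndCompute(base, None)
    k2, d2 = orb.detectAndCompute(new_frame, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d2, d1), key=lambda m: m.distance)[:200]
    src = np.float32([k2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return cv2.warpPerspective(new_frame, H, (base.shape[1], base.shape[0]))
```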
Safety and Location
An affordable wearable video system for emergency response training
Many emergency response units are currently faced with restrictive budgets that prohibit their use of advanced technology-based training solutions. Our work focuses on creating an affordable, mobile, state-of-the-art emergency response training solution through the integration of low-cost, commercially available products. The system we have developed consists of tracking, audio, and video capability, coupled with other sensors that can all be viewed through a unified visualization system. In this paper we focus on the video sub-system, which provides real-time tracking and video feeds from the training environment through a system of wearable and stationary cameras. These two camera systems interface with a management system that handles storage and indexing of the video during and after training exercises. The wearable systems give the command center live video and tracking information for each trainee in the exercise. The stationary camera systems provide a fixed point of reference for viewing the action during an exercise and consist of a small Linux-based portable computer and a mountable camera. The video management system consists of a server and database that work in tandem with a visualization application to provide real-time and after-action review capability to the training system.
Development of mobile preventive notification system (PreNotiS)
The tasks achievable by mobile handsets continuously exceed our imagination. Statistics show that mobile phone sales are soaring year after year, with predictions that they will reach a billion units in 2009, a large share of them smartphones. Mobile service providers, mobile application developers, and researchers have worked closely over the past decade to bring about revolutionary hardware and software advancements in handsets, such as embedded digital cameras, large memory capacity, accelerometers, touch-sensitive screens, GPS, and Wi-Fi, as well as in the network infrastructure that supports these features. Recently we presented PreNotiS, a multi-platform system for massive data collection from distributed sources such as cell phone users. This technology is intended to significantly simplify the response to events, helping, for example, special agencies to gather crucial information in time and respond as quickly as possible to prevent or contain potential emergency situations, and to act as a massive, centralized evidence-collection mechanism that effectively exploits advances in mobile application development platforms and the existing network infrastructure to give mobile phone users an easy-to-use, fast, and effective tool. We successfully demonstrated the functionality of the client-server application suite for posting user information to the server. This paper presents a new version of PreNotiS with a revised client application and all-new server capabilities. PreNotiS still puts forth the idea of a fast, efficient client-server application suite for mobile phones that, through a highly simplified user interface, collects security- or calamity-related information in a structured format from first responders and relays it to a central server, where the data is sorted into a database in a predefined manner. This information, which includes selections, images, and text, is instantly available to authorities and action forces through a secure web portal, helping them make decisions in a timely and prompt manner. Because all cell phones have self-localizing capability under the FCC E911 mandate, the communicated information can be further tagged automatically with location and time information at the server, and all of it is made available through the secure web portal.
An assisted GPS support for GPS simulators for embedded mobile positioning
Pradeep Kashyap, Abhay Samant, Phani K. Sagiraju, et al.
During recent years, location technologies have emerged as a research area with many possible applications in wireless communications, surveillance, military equipment, etc. Location Based Services (LBS) such as safety applications have become very popular. For example, the US Federal Communications Commission Enhanced 911 (E911) mandate seeks to provide emergency services personnel with location information that will enable them to dispatch assistance to wireless 911 callers much more quickly. Assisted GPS (A-GPS) is an extension of the conventional Global Positioning System (GPS) which increases start-up sensitivity by as much as 25 dB relative to conventional GPS and reduces start times to less than six seconds. In A-GPS, assistance data is delivered to the receiver through communication links. This paper addresses the generation of assistance data for GPS simulators used in testing A-GPS receivers. The proposed approach uses IP-based links and location support standards for assistance delivery, avoiding network-specific signaling mechanisms, so that GPS receiver developers can use this information to test A-GPS capabilities with basic GPS simulators. The approach is implemented for a GPS simulator developed by National Instruments.
Contextual interaction for geospatial visual analytics on mobile devices
Limited display area creates unique challenges for information presentation and user exploration of data on mobile devices. Traditional scrolling, panning and zooming interfaces pose significant cognitive burdens on the user to assimilate the new context after each interaction. To overcome these limitations, we examine the uses of "focus + context" techniques, specifically for performing visual analytic tasks with geospatial data on mobile devices. In particular, we adapted the translucency-based "focus + context" technique called "blending lens" to mobile devices. The adaptation enhances the lens functionalities with dynamically changing features based on users' navigation intentions, for mobile interaction. We extend the concept of "spatial context" of this method to include relevant semantic content to aid spatial navigation and analytical tasks such as finding related data. With these adaptations, the lens can be used to view spatially clustered results of a search query, related data based on various proximity functions (such as distance, category and time) and other correlative information for immediate in-field analysis, all without losing the current geospatial context.
3D Video Delivery for Mobile Devices
Mobile 3D television: development of core technological elements and user-centered evaluation methods toward an optimized system
A European consortium of six partners has been developing the core technological components of a mobile 3D television system over a DVB-H channel. In this overview paper, we present our current results on developing optimal methods for stereo-video content creation, coding, and transmission, and emphasize their significance for the power-constrained mobile platform equipped with an auto-stereoscopic display. We address user requirements by applying modern user-centered approaches that take into account different user groups and usage contexts, in contrast to laboratory assessment methods which, though standardized, offer limited applicability to real applications. To this end, we have been aiming at developing a methodological framework for the whole system development process. One of our goals has been to further develop the user-centered approach towards the experienced quality of critical system components. In this paper, we classify different research methods and technological solutions, analyzing their benefits and constraints. Based on this analysis, we present the user-centered methodological framework used throughout the whole development process of the system, aimed at achieving the best performance and a quality appealing to the end user.
Verification of 3D mobile broadcasting service based on depth-image based rendering technique in terrestrial-DMB
This paper presents a 3D (three-dimensional) mobile broadcasting service based on the depth-image-based rendering (DIBR) technique in terrestrial digital multimedia broadcasting (T-DMB). In designing and developing a 3D visual service for a mobile broadcasting system, we must consider system requirements such as minimizing the additional bit rate for 3D depth information due to the limited transmission channel bandwidth, ensuring backward compatibility with existing T-DMB, and maximizing the 3D effect while reducing eye fatigue. The 3D mobile broadcasting service based on the DIBR technique can be one solution that meets these requirements, because the bit rate allocated to the depth image in the DIBR scheme is lower than the additional video bit rate of other 3D formats, while 3D quality is maintained and backward compatibility with T-DMB is guaranteed. In this paper, we introduce an implementation of a DIBR-based 3D T-DMB system that supports real-time rendering with good image quality and depth effect at the receiver, verifying that it is feasible for mobile broadcasting. The verification is achieved through objective and subjective evaluation, based on simulation and implementation of the system. Finally, we expect that a DIBR-based 3D mobile broadcasting service could be commercialized in the near future.
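A much simplified sketch of the DIBR idea: each pixel of the 2D texture image is shifted horizontally by a disparity derived from its depth value to synthesize a virtual view. Hole filling for disocclusions and the real system's optimizations are deliberately omitted, and the depth-to-disparity mapping is an assumption of this sketch.

```python
import numpy as np

def render_virtual_view(texture, depth, max_disparity=16):
    """Synthesize one virtual view from a 2D image plus per-pixel depth (DIBR).

    Depth is assumed to be 0..255 with larger values meaning nearer objects;
    it is mapped linearly to a horizontal disparity and each pixel is shifted.
    Pixels uncovered by the shift are left black (no hole filling here).
    """
    h, w = depth.shape
    view = np.zeros_like(texture)
    disparity = (depth.astype(np.float32) / 255.0 * max_disparity).astype(np.int32)
    for y in range(h):
        for x in range(w):
            nx = x + disparity[y, x]
            if 0 <= nx < w:
                view[y, nx] = texture[y, x]
    return view
```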
Use scenarios: mobile 3D television and video
Dominik Strohmeier, Mandy Weitzel, Satu Jumisko-Pyykkö
The focus of 3D television and video has been on technical development, while hardly any attention has been paid to user expectations and needs for related applications. The objective of this study is to examine user requirements for mobile 3D television and video in depth. We conducted two qualitative studies, focus groups and probe studies, to improve our understanding of the users' perspective. Eight focus groups were carried out with altogether 46 participants, focusing on use scenario development. The data collection of the probe study was done over a period of four weeks in the field with nine participants to reveal intrinsic user needs and expectations. Both studies were conducted and analyzed independently so that they did not influence each other. The results of both studies provide novel aspects of users, system and content, and context of use. In the paper, we present personas as first archetype users of mobile 3D television and video. Putting these personas into contexts, we summarize the results of our studies and previous related work in the form of use scenarios to guide the user-centered development of 3D television and video.
Imaging and display systems for 3D mobile phone application
In this paper, features of stereoscopic imaging and display systems for 3D mobile applications are discussed. For the imaging part, 3D imaging layouts that reflect the characteristics of mobile phone use are presented, and the parallel and radial layout types are tested and compared. For the display part, autostereoscopic 3D displays, a typical parallax barrier and a slanted parallax barrier, are realized and their experimental results are compared for several applications. The paper also discusses the possibility of multi-view imaging and display systems for mobile phone applications.
Efficient stereoscopic contents file format on the basis of ISO base media file format
Kyuheon Kim, Jangwon Lee, Doug Young Suh, et al.
A lot of 3D content has been used in multimedia services; however, real 3D video content has been adopted only for limited applications, such as specially designed 3D cinemas, because of the difficulty of capturing real 3D video and the limitations of the display devices available on the market. Recently, however, diverse types of display devices for stereoscopic video content have been released. In particular, a mobile phone with a stereoscopic camera has come onto the market, which allows a user, as a consumer, to have more realistic experiences without glasses and, as a content creator, to take stereoscopic images or record stereoscopic video. However, users can only store and display this acquired stereoscopic content on their own devices, because no common file format for such content exists. This limitation prevents users from sharing their content with others, which makes it difficult for the market for stereoscopic content to expand. Therefore, this paper proposes a common file format for stereoscopic content on the basis of the ISO base media file format, which enables users to store and exchange pure stereoscopic content. This technology is also currently under development as an international MPEG standard, known as the stereoscopic video application format.
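For orientation, the sketch below walks the top-level box structure that any ISO base media file (e.g. an .mp4) shares: each box starts with a 32-bit big-endian size and a four-character type. The specific boxes the proposed stereoscopic application format adds are not modeled here, and the file name is illustrative.

```python
import struct

def list_top_level_boxes(path):
    """List (type, size) for the top-level boxes of an ISO base media file.

    Sizes of 0 (box extends to end of file) and 1 (64-bit size follows) are
    not handled in this sketch; a real parser must support both.
    """
    boxes = []
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            size, box_type = struct.unpack(">I4s", header)
            boxes.append((box_type.decode("ascii", "replace"), size))
            if size < 8:
                break
            f.seek(size - 8, 1)  # skip the box payload
    return boxes

print(list_top_level_boxes("sample.mp4"))
```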
Interactive Paper Session
Perceptual quality measurement for scalable video at low spatial resolution in mobile environments
Hosik Sohn, Hana Yoo, Cheon Seog Kim, et al.
Environments for the delivery and consumption of multimedia are often very heterogeneous, due to the use of various terminals in varying network conditions. One example of such an environment is a wireless network providing connectivity to a plethora of mobile devices. H.264/AVC Scalable Video Coding (SVC) can be utilized to deal with diverse usage environments. However, in order to optimally tailor scalable video content along the temporal, spatial, or perceptual quality axes, a quality metric is needed that reliably models subjective quality. The major contribution of this paper is the development of a novel quality metric for scalable video bit streams having a low spatial resolution, targeting consumption in wireless video applications. The proposed quality metric allows modeling the temporal, spatial, and perceptual quality characteristics of SVC bit streams. This is realized by taking into account several properties of the compressed bit streams, such as the temporal and spatial variation of the video content, the frame rate, and PSNR values. An extensive number of subjective experiments have been conducted to construct and verify the reliability of our quality metric. The experimental results show that the proposed quality metric is able to efficiently reflect subjective quality. Moreover, the performance of the quality metric is uniformly high for video sequences with different temporal and spatial characteristics.
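For reference, PSNR, one of the inputs the metric relies on, can be computed as below. The combined score shown afterwards is a purely hypothetical linear form with made-up weights, included only to illustrate how bit-stream features might be combined; the paper fits its own model against subjective test data.

```python
import numpy as np

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio between a reference and a distorted frame."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def toy_quality_score(mean_psnr, frame_rate, temporal_activity, spatial_detail):
    """Hypothetical combination of bit-stream features into one quality score.

    The weights and the linear form are placeholders, not the model from the paper.
    """
    return (0.6 * mean_psnr
            + 0.3 * frame_rate
            - 0.05 * temporal_activity
            - 0.05 * spatial_detail)
```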
A location-based notification- and visualization-system indicating social activities
Sammy David, Stefan Edlich
This paper examines the implementation of a notification and visualization system aimed at giving users a comprehensive way to spontaneously get in touch and stay in contact with their friends and acquaintances. An essential part of the system is the mobile application, based on the Google Android platform, which can be used to keep track of the spatial positions of a user's contacts. One of the main aspects of the presented system is the automatic contact alert mechanism, which notifies users every time one or more of their contacts are located nearby. If a contact is in the vicinity, the application initiates a visual and/or acoustic signal on the mobile device. At any time, users can take a glance at a geographical map displaying the surrounding area of their current position and all of their online contacts within this area. Moreover, users can retrieve further information for each displayed contact (if provided by the contact), such as their current activity or specific location metadata, e.g. the speed, direction, and distance of a contact relative to the user's own location. Additionally, a user can explicitly look out for a specific contact, regardless of where the contact is located globally, as long as the contact is logged on to the system and shares his or her location information. Furthermore, the system supports the exchange of location and event messages, enabling users to easily share location data or set up appointments with their contacts. Another main feature of the prototype is automatic location context determination, which provides users not just with raw location information on a map, but also with the contextual meaning specified by the contact. That means users can see that a contact is, for example, at home or at work when that contact is indeed within the spatial range of his or her home or work location. The system can automatically detect when a user reaches one of his or her most common places and provide this information to contacts (with the user's permission).
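The proximity-alert mechanism can be sketched as a great-circle distance test against a radius. The names, coordinates, and the 500 m threshold below are illustrative; the real system must also handle stale positions and privacy settings.

```python
import math

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates (haversine)."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearby_contacts(own_pos, contacts, radius_m=500.0):
    """Return the names of contacts within radius_m of the user's position."""
    return [name for name, (lat, lon) in contacts.items()
            if distance_m(own_pos[0], own_pos[1], lat, lon) <= radius_m]

# Illustrative positions only: Alice triggers an alert, Bob does not.
alerts = nearby_contacts((52.52, 13.405),
                         {"alice": (52.521, 13.41), "bob": (48.14, 11.58)})
```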
An Android based location service using GSMCellID and GPS to obtain a graphical guide to the nearest cash machine
Jurma Jacobsen, Stefan Edlich
There is a broad range of potentially useful mobile location-based applications, and one crucial point is making them available to the public at large. This case study illustrates the ability of Android, the operating system for mobile devices, to fulfill this demand in a mashup fashion by combining special geocoding web services with one integrated web service that returns data on the nearest cash machines. It shows an exemplary approach for building mobile location-based mashups for everyone: (1) as a basis for reaching as many people as possible, the open source Android OS is assumed to spread widely; (2) "everyone" also means that the handset does not have to be an expensive GPS device. This is realized by re-using the existing GSM infrastructure with the Cell of Origin (COO) method, which looks up the CellID in one of the growing number of web-accessible CellID databases; some of these databases are still undocumented and not yet published. Furthermore, the Google Maps API for Mobile (GMM) and the open source counterpart OpenCellID are used. Localizing the user's current position by looking up the cell to which the handset is currently connected (COO) is not as precise as GPS, but appears to be sufficient for many applications. GPS users are therefore the best served: for them the system is fully automated. By contrast, users without a GPS-capable phone can refine their location with one click on the map inside the determined circular region. Users are then shown and guided along a path to the nearest cash machine by integrating the Google Maps API with an overlay. Additionally, GPS users can keep track of themselves through a frequently updated view based on constantly requested precise GPS data for their position.