Mobile Devices and Multimedia: Enabling Technologies, Algorithms, and Applications 2014

Volume Details

Date Published: 20 February 2014

Contents: 6 Sessions, 25 Papers, 0 Presentations

Conference: IS&T/SPIE Electronic Imaging 2014

Volume Number: 9030

Front Matter: Volume 9030
Multimedia Content for Education
Emerging Mobile Applications and Enabling Technologies
Coding and Algorithms
Multimedia and Mobile Content
Interactive Paper Session

Front Matter: Volume 9030

This PDF file contains the front matter associated with SPIE Proceedings Volume 9030, including the Title Page, Copyright information, Table of Contents, Introduction (if any), and Conference Committee listing.

Multimedia Content for Education

Conception of a course for professional training and education in the field of computer and mobile forensics, part III: network forensics and penetration testing

Knut Kröger, Reiner Creutzburg

IT security and computer forensics are important components of information technology. From year to year, incidents and crimes that target IT systems or are committed with their help increase. More and more companies and authorities have security problems in their own IT infrastructure. To respond to these incidents professionally, it is important to have well-trained staff. The fact that many agencies and companies work with very sensitive data makes it necessary to further train their own employees in the field of network forensics and penetration testing. Motivated by these facts, this paper - a continuation of a paper of January 2012 [1], which showed the conception of a course for professional training and education in the field of computer and mobile forensics - addresses the practical implementation and the important relationships of network forensics and penetration testing.

A remote laboratory for USRP-based software defined radio

Rudresh Gandhinagar Ekanthappa, Rodrigo Escobar, Achot Matevossian, et al.

Electrical and computer engineering graduates need practical working skills with real-world electronic devices, which are addressed to some extent by hands-on laboratories. The deployment capacity of hands-on laboratories is typically constrained by insufficient equipment availability, facility shortages, and a lack of human resources for in-class support and maintenance. At the same time, at many sites, existing experimental systems are underutilized due to class scheduling bottlenecks. Online education is gaining popularity, and remote laboratories have been suggested to broaden access to experimentation resources. Remote laboratories resolve many problems, as various costs can be shared and student access to instrumentation is facilitated in terms of access time and location. Labs are converted to homework assignments that can be done without physical presence in laboratories. Even though they do not provide the full sense of hands-on experimentation, remote labs are a viable alternative for underserved educational sites. This paper studies the remote modality of USRP-based radio-communication labs offered by National Instruments (NI). The labs are offered to graduate and undergraduate students, and tentative assessments support the feasibility of remote deployments.

Emerging Mobile Applications and Enabling Technologies

Accessing multimedia content from mobile applications using semantic web technologies

Jörn Kreutel, Andrea Gerlach, Stefanie Klekamp, et al.

We describe the ideas and results of an applied research project that aims at leveraging the expressive power of semantic web technologies as a server-side backend for mobile applications that provide access to location and multimedia data and allow for a rich user experience in mobile scenarios, ranging from city and museum guides to multimedia enhancements of any kind of narrative content, including e-book applications. In particular, we will outline a reusable software architecture for both server-side functionality and native mobile platforms that is aimed at significantly decreasing the effort required for developing particular applications of that kind.

Real-time global illumination on mobile device

Minsu Ahn, Inwoo Ha, Hyong-Euk Lee, et al.

We propose a novel method for real-time global illumination on mobile devices. Our approach is based on instant radiosity, which uses a sequence of virtual point lights to represent the effect of indirect illumination. Our rendering process consists of three stages. With the primary light, the first stage generates local illumination with the shadow map on the GPU. The second stage of the global illumination uses the reflective shadow map on the GPU and generates the sequence of virtual point lights on the CPU. Finally, we use the splatting method of Dachsbacher et al. [1] and add the indirect illumination to the local illumination on the GPU. With the limited computing resources in mobile devices, only a small number of virtual point lights are allowed for real-time rendering. Our approach uses a multi-resolution sampling method with 3D geometry and attributes simultaneously and reduces the total number of virtual point lights. We also use a hybrid strategy, which collaboratively combines the CPUs and GPUs available in a mobile SoC, because of the limited computing resources in mobile devices. Experimental results demonstrate the global illumination performance of the proposed method.

Micro modules for mobile shape, color and spectral imaging with smartpads in industry, biology and medicine

Dietrich Hofmann, Paul-Gerald Dittrich, Eric Düntsch, et al.

The aim of this paper is to demonstrate a paradigm shift in shape, color and spectral measurements in industry, biology and medicine as well as in measurement education and training. Innovative hardware apps (hwapps) and software apps (swapps) with smartpads are fundamental enablers for the transformation from conventional stationary workplaces towards innovative mobile workplaces with in-field measurements and point-of-care (POC) diagnostics. Mobile open online courses (MOOCs) are transforming study habits. Practical examples for the application of innovative photonic micro shapemeters, colormeters and spectrometers will be given. The innovative approach opens enormous, so far untapped markets for measurement science, engineering and training. These innovative working conditions will be quickly accepted due to their convenience, reliability and affordability. Highly visible advantages of smartpads are their wide distribution, their worldwide connectivity via the Internet and cloud services, standardized interfaces like USB and HDMI, and the practical operating experience that users have gained with their private smartpads.

A mobile phone user interface for image-based dietary assessment

Ziad Ahmad, Nitin Khanna, Deborah A. Kerr, et al.

Many chronic diseases, including obesity and cancer, are related to diet. Such diseases may be prevented and/or successfully treated by accurately monitoring and assessing food and beverage intakes. Existing dietary assessment methods, such as the 24-hour dietary recall and the food frequency questionnaire, are burdensome and not generally accurate. In this paper, we present a user interface for a mobile telephone food record that relies on taking images, using the built-in camera, as the primary method of recording. We describe the design and implementation of this user interface, stressing the solutions we devised to meet the requirements imposed by the image analysis process while keeping the user interface easy to use.

Interactive real-time media streaming with reliable communication

Xunyu Pan, Kevin M. Free

Streaming media is a recent technique for delivering multimedia information from a source provider to an end user over the Internet. The major advantage of this technique is that the media player can start playing a multimedia file even before the entire file is transmitted. Most streaming media applications are currently implemented based on the client-server architecture, where a server system hosts the media file and a client system connects to this server system to download the file. Although the client-server architecture is successful in many situations, it may not be ideal to rely on such a system to provide the streaming service, as users may be required to register an account using personal information in order to use the service. This is troublesome if a user wishes to watch a movie while simultaneously interacting with a friend in another part of the world over the Internet. In this paper, we describe a new real-time media streaming application implemented on a peer-to-peer (P2P) architecture in order to overcome these challenges within a mobile environment. When using the peer-to-peer architecture, streaming media is shared directly between end users, called peers, with minimal or no reliance on a dedicated server. Based on the proposed software pεvμa (pronounced [revma]), named for the Greek word meaning stream, we can host a media file on any computer and directly stream it to a connected partner. To accomplish this, pεvμa utilizes the Microsoft .NET Framework and Windows Presentation Foundation, which are widely available on various types of Windows-compatible personal computers and mobile devices. With specially designed multi-threaded algorithms, the application can stream HD video at speeds upwards of 20 Mbps using the User Datagram Protocol (UDP). Streaming and playback are handled using synchronized threads that communicate with one another once a connection is established.
Alteration of playback, such as pausing playback or seeking to a different spot in the media file, will be reflected in all media streams. These techniques are designed to allow users at different locations to simultaneously view a full-length HD video and interactively control the media streaming session. To create a sustainable media stream with high quality, our system supports UDP packet loss recovery at high transmission speed using custom File-Buffers. Traditional real-time streaming protocols such as Real-time Transport Protocol/RTP Control Protocol (RTP/RTCP) provide no such error recovery mechanism. Finally, the system also features an Instant Messenger that allows users to interact socially with one another while they enjoy a media file. The ultimate goal of the application is to offer users a hassle-free way to watch a media file over long distances without having to upload any personal information into a third-party database. Moreover, the users can communicate with each other and stream media directly from one mobile device to another while remaining independent of the traditional sign-up required by most streaming services.
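
The abstract does not say how the custom File-Buffers detect lost UDP datagrams. A minimal sketch of one common approach follows, assuming (hypothetically) that every datagram carries a sequence number and that the receiver knows how many datagrams a buffer should contain. The function names are illustrative and are not taken from the paper.

```python
def missing_sequence_numbers(received, total):
    """Return the sequence numbers in range(total) that never arrived."""
    seen = set(received)
    return [seq for seq in range(total) if seq not in seen]


def reassemble(chunks, total):
    """Rebuild a buffer from a dict {sequence number: payload bytes}.

    Returns (data, []) when every chunk is present, otherwise
    (None, missing) so the caller can ask the peer to resend them.
    """
    missing = missing_sequence_numbers(chunks.keys(), total)
    if missing:
        return None, missing
    return b"".join(chunks[seq] for seq in range(total)), []
```

A receiver built this way would request retransmission of only the missing sequence numbers, which is one way to add recovery on top of UDP without giving up its low overhead.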

Coding and Algorithms

Efficient burst image compression using H.265/HEVC

Hoda Roodaki-Lavasani, Jani Lainema

New imaging use cases are emerging as more powerful camera hardware enters consumer markets. One family of such use cases is based on capturing multiple pictures instead of just one when taking a photograph. This kind of camera operation allows, e.g., selecting the most successful shot from a sequence of images, showing what happened right before or after the shot was taken, or combining the shots by computational means to improve either visible characteristics of the picture (such as dynamic range or focus) or the artistic aspects of the photo (e.g. by superimposing pictures on top of each other). Considering that photographic images are typically of high resolution and quality and that these kinds of image bursts can consist of at least tens of individual pictures, an efficient compression algorithm is desired. However, traditional video coding approaches fail to provide the random access properties these use cases require to achieve near-instantaneous access to the pictures in the coded sequence. That feature is critical to allow users to browse the pictures in an arbitrary order or imaging algorithms to extract desired pictures from the sequence quickly. This paper proposes coding structures that provide such random access properties while achieving coding efficiency superior to existing image coders. The results indicate that using the HEVC video codec with a single reference picture fixed for the whole sequence can achieve nearly as good compression as traditional IPPP coding structures. It is also shown that the selection of the reference frame can further improve the coding efficiency.
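
The random access argument can be made concrete by counting how many pictures must be decoded to display picture k. The sketch below assumes a simplified IPPP chain in which picture 0 is intra-coded and every later picture references its immediate predecessor, and a fixed-reference structure in which every picture references one intra-coded picture. Both helper functions are illustrative and are not part of the paper.

```python
def decodes_needed_ippp(k):
    """IPPP chain: pictures 0..k must all be decoded to reach picture k."""
    return k + 1


def decodes_needed_fixed_reference(k, ref=0):
    """Fixed reference: decode the reference, then picture k itself."""
    return 1 if k == ref else 2
```

Under these assumptions, reaching the tenth picture of a burst costs ten decodes with IPPP but only two with a fixed reference, which is why the latter suits arbitrary-order browsing.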

MPEG-4 solutions for virtualizing RDP-based applications

Bojan Joveski, Mihai Mitrea, Rama-Rao Ganji

The present paper provides a proof of concept for the use of the MPEG-4 multimedia scene representations (BiFS and LASeR) as a virtualization tool for RDP-based applications (e.g. MS Windows applications). Two main applicative benefits are thus granted. First, any legacy application can be virtualized without additional programming effort. Second, heterogeneous mobile devices (different manufacturers, OS) can collaboratively enjoy full multimedia experiences. From the methodological point of view, the main novelty consists in (1) designing an architecture allowing the conversion of the RDP content into a semantic multimedia scene graph and its subsequent rendering on the client and (2) providing the underlying scene graph management and interactivity tools. Experiments consider 5 users and two RDP applications (MS Word and Internet Explorer), and benchmark our solution against two state-of-the-art technologies (VNC and FreeRDP). The visual quality is evaluated by six objective measures (e.g. PSNR<37dB, SSIM<0.99). The network traffic evaluation shows that: (1) for text editing, the MPEG-based solutions outperform VNC by a factor of 1.8 while being 2 times heavier than FreeRDP; (2) for Internet browsing, the MPEG solutions outperform both VNC and FreeRDP by factors of 1.9 and 1.5, respectively. The average round-trip times (less than 40 ms) cope with real-time application constraints.

Evaluation of in-network adaptation of scalable high efficiency video coding (SHVC) in mobile environments

James Nightingale, Qi Wang, Christos Grecos, et al.

High Efficiency Video Coding (HEVC), the latest video compression standard (also known as H.265), can deliver video streams of comparable quality to the current H.264 Advanced Video Coding (H.264/AVC) standard with a 50% reduction in bandwidth. Research into SHVC, the scalable extension to the HEVC standard, is still in its infancy. One important area for investigation is whether, given the greater compression ratio of HEVC (and SHVC), the loss of packets containing video content will have a greater impact on the quality of delivered video than is the case with H.264/AVC or its scalable extension H.264/SVC. In this work we empirically evaluate the layer-based, in-network adaptation of video streams encoded using SHVC in situations where dynamically changing bandwidths and datagram loss ratios require the real-time adaptation of video streams. Through extensive experimentation, we establish a comprehensive set of benchmarks for SHVC-based high-definition video streaming in loss-prone network environments such as those commonly found in mobile networks. Among other results, we highlight that packet losses of only 1% can lead to a substantial reduction in PSNR of over 3 dB and error propagation in over 130 pictures following the one in which the loss occurred. This work is one of the earliest studies in this area to report benchmark evaluation results for the effects of datagram loss on SHVC picture quality, and it offers empirical and analytical insights into SHVC adaptation to lossy, mobile networking conditions.
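
For readers unfamiliar with the metric, PSNR for 8-bit samples is 10·log10(255² / MSE), so a drop of about 3 dB corresponds to a doubling of the mean squared error. A minimal sketch follows, operating on flat lists of sample values. It uses the standard PSNR definition and is not the evaluation code used in the paper.

```python
import math


def psnr(reference, degraded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(degraded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)
```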

Spatial domain entertainment audio decompression/compression

Y. K. Chan, Ka Him Kevin Tam

The ARM7 NEON processor with 128-bit SIMD hardware accelerator requires a peak performance of 13.99 Mega Cycles per Second for MP3 stereo entertainment quality decoding. For a similar compression bit rate, OGG and AAC are preferred over MP3. The Patent Cooperation Treaty (PCT) Application dated 28 August 2012 describes an audio decompression scheme producing a sequence of interleaving "min to Max" and "Max to min" rising and falling segments. The number of interior audio samples bounded by "min to Max" or "Max to min" can be {0|1|…|N} audio samples. The magnitudes of samples, including the bounding min and Max, are distributed as normalized constants between 0 and 1 of the bounding magnitudes. The decompressed audio is then a "sequence of static segments" on a frame-by-frame basis. Some of these frames need to be post-processed to elevate high frequencies. The post-processing is compression efficiency neutral, and the additional decoding complexity is only a small fraction of the overall decoding complexity, without the need for extra hardware. Compression efficiency can be expected to be very high, as the source audio has been decimated and converted to a set of data with only "segment length and corresponding segment magnitude" attributes. The PCT describes how these two attributes are efficiently coded by the PCT coding scheme. The PCT decoding efficiency is very high and decoding latency is basically zero. Both hardware requirements and run time are at least an order of magnitude better than those of MP3 variants. The side benefit is ultra-low power consumption on mobile devices. The acid test of whether such a simplistic waveform representation can reproduce authentic decompressed quality is benchmarked against OGG (aoTuv Beta 6.03) using three pairs of stereo audio frames and one broadcast-like voice audio frame, with each frame consisting of 2,028 samples at a 44.1 kHz sampling frequency.
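
To illustrate what a "segment length and corresponding segment magnitude" description of a waveform looks like, the sketch below splits a sample list at its local minima and maxima. This only illustrates the representation named in the abstract. It is not the coding scheme of the PCT application, and the function names are hypothetical.

```python
def turning_points(samples):
    """Indices of the first sample, every local min or max, and the last sample."""
    if len(samples) < 2:
        return list(range(len(samples)))
    points = [0]
    for i in range(1, len(samples) - 1):
        rising_before = samples[i] > samples[i - 1]
        rising_after = samples[i + 1] > samples[i]
        if rising_before != rising_after:  # direction changes at sample i
            points.append(i)
    points.append(len(samples) - 1)
    return points


def segments(samples):
    """Describe the waveform as (segment length, end magnitude) pairs."""
    pts = turning_points(samples)
    return [(end - start, samples[end]) for start, end in zip(pts, pts[1:])]
```

For the samples `[0, 2, 5, 3, 1, 4]` this yields a rising segment of length 2 ending at 5, a falling segment of length 2 ending at 1, and a rising segment of length 1 ending at 4.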

Multimedia and Mobile Content

Combining spherical harmonics and point lights for real-time photorealistic rendering

Inwoo Ha, Minsu Ahn, Hyungwook Lee, et al.

Photorealistic rendering with all-frequency lights and materials in real time is a difficult problem. Environment lights and complex materials can be approximated with spherical harmonics defined in the spherical Fourier domain. The low-frequency components of complex environment lights and materials are then projected onto just a few spherical harmonics bases, which makes real-time rendering possible in a low-dimensional space. However, high-frequency components, such as small bright lights and glossy reflections, are filtered out during the spherical harmonics projection. On the other hand, point lights represent high-frequency lights efficiently but are inefficient for low-frequency lights, such as smooth area lights. By combining the spherical harmonics and point light approaches, we can render a scene in real time while preserving both low- and high-frequency effects.

Multi-frame knowledge based text enhancement for mobile phone captured videos

Suleyman Ozarslan, P. Erhan Eren

In this study, we explore automated text recognition and enhancement using mobile phone captured videos of store receipts. We propose a method that combines Optical Character Recognition (OCR) with our proposed Row Based Multiple Frame Integration (RB-MFI) and Knowledge Based Correction (KBC) algorithms. In this method, the trained OCR engine is first used for recognition; then, the RB-MFI algorithm is applied to the output of the OCR. The RB-MFI algorithm determines and combines the most accurate rows of the text outputs extracted by OCR from multiple frames of the video. After RB-MFI, the KBC algorithm is applied to these rows to correct erroneous characters. Experimental results show that the proposed video-based approach, which includes the RB-MFI and KBC algorithms, increases the word recognition rate to 95% and the character recognition rate to 98%.
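
The abstract does not state how RB-MFI decides which row reading is the most accurate. As a rough sketch of the general idea of combining rows across frames, the code below assumes that the OCR output of every frame has the same number of rows and simply keeps, for each row, the reading that occurs most often. Majority voting is an assumption made for illustration and is not the authors' criterion.

```python
from collections import Counter


def integrate_rows(frames):
    """Merge per-frame OCR output row by row using a majority vote.

    frames: list of frames, each a list of row strings of equal length.
    """
    merged = []
    for readings in zip(*frames):  # all readings of the same row
        most_common_reading, _count = Counter(readings).most_common(1)[0]
        merged.append(most_common_reading)
    return merged
```

With three frames reading a row as "MILK 1.99", "M1LK 1.99" and "MILK 1.99", the vote keeps "MILK 1.99", so an error that appears in a single frame is discarded.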

Interactive Paper Session

Possibilities for retracing of copyright violations on current video game consoles by optical disk analysis

Frank Irmler, Reiner Creutzburg

This paper deals with the possibilities of retracing copyright violations on current video game consoles (e.g. Microsoft Xbox, Sony PlayStation, ...) by studying the corresponding optical storage media DVD and Blu-ray. The possibilities of forensic investigation of DVD and Blu-ray Discs are presented. It is shown which information can be read by using freeware and commercial software for forensic examination. A detailed analysis is given of the visualization of hidden content and the possibility of finding out information about the burning hardware used for writing on the optical discs. In connection with a forensic analysis of the Windows registry of a suspect's PC, a detailed overview of the crime scene for forged DVD and Blu-ray Discs can be obtained. Optical discs are examined under forensic aspects, and the obtained results are implemented in automatic analysis scripts for the commercial forensics program EnCase Forensic. It is shown that the drive used for writing optical storage media can be identified. In particular, Blu-ray Discs contain the serial number of the burner. These and other findings were incorporated into various EnCase scripts for professional forensic investigation with EnCase Forensic. Furthermore, a detailed flowchart for a forensic investigation of copyright infringement was developed.

Indoor positioning system using WLAN multipath signals as fingerprints for mobile devices

Anirban Saha, David Akopian

The Global Positioning System (GPS) has become a common positioning technology. While GPS has enabled many applications, satellite signals have yet to overcome many obstacles to enable indoor positioning. Meanwhile, due to the wide deployment of Wireless Local Area Networks (WLAN) in recent years, WLAN positioning algorithms have become popular for mobile device positioning in indoor environments. The most accurate WLAN positioning algorithms exploit the so-called fingerprinting concept, which consists of two stages. In the offline stage (training), the Received Signal Strength Indicator (RSSI) from a set of available Access Points (AP) is measured for a number of reference locations and stored in a database. Due to the availability of many APs and the complex structure of indoor environments, this information is distinctive for each reference location and thus is called a position fingerprint. In the online stage (testing), a mobile device receives RSSIs from the APs, and the resulting fingerprint is compared to the fingerprints stored in the database to find the best possible match. WLAN fingerprinting can provide high accuracy in indoor environments when there are many APs, which ensures fingerprint uniqueness. This paper provides a novel approach that uses WLAN multipath signals as possible fingerprints for positioning algorithms in applications with a lower number of access points. The paper provides an initial study.
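
The two-stage fingerprinting concept can be sketched in a few lines. The example assumes that the offline stage has produced a radio map from location name to a list of RSSI values in dBm, one per access point in a fixed order, and that the online stage matches by Euclidean distance. The names `radio_map` and `locate` are illustrative, and the multipath-based fingerprints proposed in the paper are not modeled here.

```python
import math


def locate(radio_map, observed):
    """Return the reference location whose fingerprint is closest to `observed`."""

    def distance(fingerprint):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(fingerprint, observed)))

    return min(radio_map, key=lambda location: distance(radio_map[location]))


# Offline stage: surveyed RSSI values per reference location (made-up numbers).
radio_map = {
    "lobby": [-40, -70, -80],
    "lab": [-75, -45, -60],
}

# Online stage: match a fresh measurement against the stored fingerprints.
estimated_location = locate(radio_map, [-42, -68, -79])
```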

Human activity recognition by smartphones regardless of device orientation

Jafet Morales, David Akopian, Sos Agaian

A new method for activity recognition using smartphones is proposed. Using three-axis accelerometer and gyroscope signals, the proposed system is able to identify low-level activities with a high level of accuracy. The method works regardless of the orientation of the device with respect to the body part to which it is attached. The algorithm achieves a high level of accuracy when trained on a small set of users and tested on an unknown user.

Implementation of a forensic tool to examine the Windows registry

Christian Leube, Knut Kröger, Reiner Creutzburg

This paper describes the design and prototypical implementation of a forensic tool for the automated analysis of the Windows registry. The concept provides a complete object-oriented analysis of functional requirements as well as detailed descriptions of the program components and the software architecture of the tool. The prototypical implementation of the tool on the basis of the developed concept shows its consistency. The implementation is partially described as an object-oriented design. Here, special emphasis is placed on the ease of maintenance and extensibility of the program. Information about the keys to be read during the analysis is defined in XML script files. The subsequently defined tests prove the consistency of the concept and the implementation [8].

Virtual tutorials, Wikipedia books, and multimedia-based teaching for blended learning support in a course on algorithms and data structures

Jenny Knackmuß, Reiner Creutzburg

The aim of this paper is to describe the benefit and support of virtual tutorials, Wikipedia books and multimedia-based teaching in a course on Algorithms and Data Structures. We describe our work and the experience gained from using virtual tutorials held in Netucate iLinc sessions and from using various multimedia and animation elements to support a deeper understanding of the ordinary classroom lectures on Algorithms and Data Structures for undergraduate computer science students. We describe the benefits, form, style and contents of those virtual tutorials. Furthermore, we mention the advantage of Wikipedia books in supporting the blended learning process using modern mobile devices. Finally, we give some first statistical measures of improved student scores after introducing this new form of teaching support.

Hacking and securing the AR.Drone 2.0 quadcopter: investigations for improving the security of a toy

Johann-Sebastian Pleban, Ricardo Band, Reiner Creutzburg

In this article we describe the security problems of the Parrot AR.Drone 2.0 quadcopter. Because it is promoted as a toy with low acquisition costs, it may end up being used by many individuals, which makes it a target for harmful attacks. In addition, the video stream of the drone could be of interest to a potential attacker because it can reveal confidential information. Therefore, we perform a security threat analysis on this particular drone. We focus mainly on obvious security vulnerabilities like the unencrypted Wi-Fi connection or the user management of the GNU/Linux operating system that runs on the drone. We show how the AR.Drone 2.0 can be hacked and hijacked. Our aim is to sensitize end users of AR.Drones by describing the security vulnerabilities and to show how the AR.Drone 2.0 could be secured against unauthorized access. We provide instructions for securing the drone's Wi-Fi connection and its operation with the official smartphone app and third-party PC software.

A new 1D parameter-control chaotic framework

Zhongyun Hua, Yicong Zhou, Chi-Man Pun, et al.

This paper introduces a novel parameter-control framework to produce many new one-dimensional (1D) chaotic maps. It has a simple structure and consists of two 1D chaotic maps, in which one is used as a seed map while the other acts as a control map that controls the parameter of the seed map. Examples and analysis results show that these newly generated chaotic maps have more complex structures and better chaos performance than their corresponding seed and control maps.
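
The seed map and control map coupling can be illustrated with two logistic maps. The sketch assumes that the control map output, which lies in [0, 1], is mapped linearly onto the seed parameter range [3.6, 4.0]. That range, the constant `control_r` and the function names are illustrative assumptions, and the actual framework in the paper may couple the maps differently.

```python
def logistic(x, r):
    """One iteration of the logistic map x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def parameter_controlled_orbit(x0, c0, control_r=3.99, steps=10):
    """Iterate a seed logistic map whose parameter is driven by a control map."""
    x, c = x0, c0
    orbit = []
    for _ in range(steps):
        c = logistic(c, control_r)   # control map output stays in [0, 1]
        seed_r = 3.6 + 0.4 * c       # assumed mapping onto [3.6, 4.0]
        x = logistic(x, seed_r)      # seed map with the controlled parameter
        orbit.append(x)
    return orbit
```

The sketch only shows the structure of the coupling. It makes no claim about the chaos performance reported in the paper.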

A new collage steganographic algorithm using cartoon design

Shuang Yi, Yicong Zhou, Chi-Man Pun, et al.

Existing collage steganographic methods suffer from low payload of embedding messages. To improve the payload while providing a high level of security protection to messages, this paper introduces a new collage steganographic algorithm using cartoon design. It embeds messages into the least significant bits (LSBs) of color cartoon objects, applies different permutations to each object, and adds objects to a cartoon cover image to obtain the stego image. Computer simulations and comparisons demonstrate that the proposed algorithm shows significantly higher capacity of embedding messages compared with existing collage steganographic methods.
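
The LSB embedding step mentioned above can be sketched on a flat list of 8-bit color values. This shows plain least-significant-bit substitution only. The per-object permutations and the collage construction described in the abstract are not modeled, and the function names are illustrative.

```python
def embed_bits(pixels, bits):
    """Write each message bit into the least significant bit of a pixel value."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover capacity")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_bits(pixels, count):
    """Read back the first `count` embedded bits."""
    return [value & 1 for value in pixels[:count]]
```

Each value changes by at most 1, which is why LSB substitution is visually hard to notice.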

Fixed tile rate codec for bandwidth saving in video processors

Vladimir Lachine, Chon-Tam Le Dinh, Dinh Kha Le, et al.

The paper presents an image compression circuit for bandwidth saving in video display processors. It is an intra-frame, tile-based compression algorithm offering visually lossless quality for compression rates between 1.5 and 2.5. RGB and YCbCr (4:4:4, 4:2:2 and 4:2:0) video formats are supported for 8/10-bit video signals. The Band Width Compressor (BWC) consists of a Lossless Compressor (LC) and a Quantization Compressor (QC) that generate output bit streams for tiles of pixels. The size of the output bit stream generated for a tile by the LC may be less or greater than the required size of the output memory block. The QC generates a bit stream that always fits an output memory block of the required size. The output bit stream generated by the LC is transmitted if its size is less than the required size of the output memory block. Otherwise, the output bit stream generated by the QC is transmitted. The LC works on a pixel basis. The difference between the original and predicted values of each pixel of a tile is encoded as a prefix and a suffix. The prefix is encoded by means of a variable-length code, and the suffix is encoded as is. The QC divides a tile of pixels into a set of blocks and quantizes the pixels of each block independently of the other blocks. The number of quantization bits for all pixels of a block depends on the standard deviation calculated over the block. The difference between a pixel's value and the average value over the block is quantized and transmitted.
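
The per-tile choice between the two compressors can be sketched as follows. The code takes the LC and QC outputs as already computed bit lists and follows the rule stated in the abstract, where the LC stream is used only when it is smaller than the output memory block. The function name and the bit-list representation are assumptions made for illustration.

```python
def select_stream(lossless_bits, quantized_bits, block_bits):
    """Pick the bit stream to transmit for one tile.

    block_bits is the required size of the output memory block in bits.
    """
    if len(quantized_bits) > block_bits:
        raise ValueError("the QC output is expected to always fit the block")
    if len(lossless_bits) < block_bits:
        return "LC", lossless_bits   # lossless result is small enough
    return "QC", quantized_bits      # fall back to the fixed-size result
```

Because the QC output always fits, every tile occupies at most one fixed-size memory block, which is what makes the tile rate fixed.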

Dealing with faulty measurements in WLAN indoor positioning

Jafet Morales, David Akopian, Sos Agaian

Recently, indoor positioning methods based on WLAN signal measurements have gained popularity because of their high localization accuracy. These methods exploit radio maps obtained from wireless signal measurement surveys on location grids. Measurement sets from various WLAN access points are called fingerprints and characterize the locations where the measurements are collected. As WLAN environments do not ensure continuous measurement availability, and faulty or rogue access points may unexpectedly change surveyed signal patterns, resiliency becomes an important issue to address using algorithmic methods. This paper first proposes a general fault model that integrates several reported models. Then performance degradations due to faults are studied for conventional fingerprinting methods. Two improvements to positioning systems are proposed for mitigating the impact of faulty measurements. The first improvement takes into account the intermittent unavailability of AP samples when calculating kNN. The second improvement allows the system to switch from a high-accuracy method that works only under normal conditions to a more resilient method whenever a high number of faults are suspected. Performance figures are provided for positioning with data surveyed from a real environment, to which varying amounts of faults have been introduced artificially.
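
The abstract does not give the details of the first improvement. One plausible way to account for missing AP samples in a kNN search is sketched below: readings marked `None` are skipped, and the distance is normalized by the number of access points that both fingerprints share. This is an illustrative assumption and not necessarily the authors' method.

```python
import math


def masked_distance(fingerprint, observed):
    """RMS difference over the access points present in both fingerprints."""
    pairs = [(a, b) for a, b in zip(fingerprint, observed)
             if a is not None and b is not None]
    if not pairs:
        return math.inf  # nothing to compare
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))


def knn_locations(radio_map, observed, k=3):
    """Return the k reference locations closest to the observed fingerprint."""
    ranked = sorted(radio_map,
                    key=lambda loc: masked_distance(radio_map[loc], observed))
    return ranked[:k]
```

Normalizing by the number of shared access points keeps a location from looking artificially close just because fewer of its readings could be compared.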

Fast ice image retrieval based on a multilayer system

Guoyu Lu, Scott Sorensen, Chandra Kambhamettu

We propose a multilayer system to perform ice image retrieval. Ice images are typically texture-less, which makes them difficult to retrieve. To achieve high accuracy, high-level local features are usually used in retrieving the images. However, most high-level features have high dimensionality, which slows down the retrieval process. To overcome this problem, we divide the retrieval process into 3 steps. Each step filters out a large portion of the images. As the features are constructed according to the ice image properties, an image can be localized quickly compared with the use of high-level features. The ice images are captured in the Arctic, where the ice state changes dramatically due to environmental and other influences. We build the first layer of the system on color information and edges, as color and edges are the most critical characteristics of ice images. We divide the second layer into two sublayers. The first sublayer uses an edge histogram. For the second sublayer, we detect salient points based on pixel values at the edge positions and connect every pair of adjacent points with straight lines. A new feature is built on the basis of the distance scale between adjacent salient points and the angles between connected lines. Our new feature is invariant to translation, rotation and scaling. As the features in the first two layers are holistic features, the time performance is much better than with high-level local features. The third layer applies the Harris detector to find correspondences between two features on a small set of filtered images. The experiments show that our system achieves good accuracy while maintaining much better time performance.