Proceedings Volume 8302

Imaging and Printing in a Web 2.0 World III

View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 24 February 2012
Contents: 6 Sessions, 22 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2012
Volume Number: 8302

Table of Contents

  • Front Matter: Volume 8302
  • Web Printing and Analysis
  • Online Photo Services
  • Social Media and Mobile Document Applications
  • Layout Analysis and Creation
  • Content Understanding
Front Matter: Volume 8302
Front Matter: Volume 8302
This PDF file contains the front matter associated with SPIE Proceedings Volume 8302, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Web Printing and Analysis
Text documents as social networks
Helen Balinsky, Alexander Balinsky, Steven J. Simske
The extraction of keywords and features is a fundamental problem in text data mining. Document processing applications directly depend on the quality and speed of the identification of salient terms and phrases. Applications as disparate as automatic document classification, information visualization, filtering and security policy enforcement all rely on the quality of automatically extracted keywords. Recently, a novel approach to rapid change detection in data streams and documents has been developed. It is based on ideas from image processing and in particular on the Helmholtz Principle from the Gestalt Theory of human perception. By modeling a document as a one-parameter family of graphs with its sentences or paragraphs defining the vertex set and with edges defined by Helmholtz's principle, we demonstrated that for some range of the parameters, the resulting graph becomes a small-world network. In this article we investigate the natural orientation of edges in such small-world networks. For two connected sentences, we can say which one is the first and which one is the second, according to their position in a document. This will make such a graph look like a small WWW-type network, and PageRank-type algorithms will produce an interesting ranking of the nodes in such a document.
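As a rough illustration of ranking sentences with a PageRank-type algorithm, the sketch below runs plain power-iteration PageRank on a small directed sentence graph. The toy edge list, the damping factor of 0.85, and the choice to point each edge from the later sentence to the earlier one are assumptions made for this example; the abstract does not specify them, and the Helmholtz-based edge construction is not reproduced here.

```python
def pagerank(edges, n, damping=0.85, iters=100):
    """Power-iteration PageRank on a directed graph with nodes 0..n-1."""
    out = [[] for _ in range(n)]
    for src, dst in edges:
        out[src].append(dst)
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[i] for i in range(n) if not out[i])
        new = [(1.0 - damping) / n + damping * dangling / n] * n
        for i in range(n):
            if out[i]:
                share = damping * rank[i] / len(out[i])
                for j in out[i]:
                    new[j] += share
        rank = new
    return rank


# Hypothetical 4-sentence document. Each pair (later, earlier) is an edge
# between two connected sentences, oriented by position (an assumption).
edges = [(1, 0), (2, 0), (3, 1), (3, 2)]
ranks = pagerank(edges, n=4)
for sentence, score in sorted(enumerate(ranks), key=lambda p: -p[1]):
    print(f"sentence {sentence}: {score:.3f}")
```

With this orientation, sentences that many later sentences connect back to accumulate rank; reversing the edge direction would instead favor sentences that follow many connected predecessors.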
A URL shortener for mobile web consumption
Hua Zhang, Xiao-Wei Wu, Yu Zhang, et al.
We present hp2.me, a URL shortener service for improving the mobile web consumption experience. Unlike other such services, given a short URL, hp2.me returns an image rendered from the salient regions of a web page. This approach to displaying web content improves the mobile web reading experience through reduced latency and improved clarity. It is faster to load a few large image files over a cellular data network than many small files, and limited mobile screen real estate can be better utilized to display relevant content. The hp2.me service is currently in external beta, and we present results to illustrate the advantages of this technique.
HP Smart Print
Hua Zhang, Zhen Liu, Yue Yuan, et al.
We present HP SmartPrint, a novel web browser plug-in which automatically suggests print-worthy content within a web page and provides an intuitive UI for users to make corrections to the initial suggestion, if needed. The resulting prints contain only user-desired content and exclude noise such as ads, thus increasing the desirability of the prints while minimizing the cost. This solution provides a streamlined web printing experience and will be shipping with most HP printers starting in 2011.
Online Photo Services
Kind of images in printed photo books
Reiner Fageth, Peter Schütz, Thomas Wagner
This paper describes the usage of images in tangible products as a function of their origin: digital still cameras (DSC) or mobile devices. It also shows that pictures from mobile devices are mainly used to complete the storytelling in photo books; they are currently not a driver for generating this kind of high-value product. Images taken with mobile devices largely generate only prints, mainly ordered via kiosk systems.
SmartFit: automatic photo fitting for variable data printing
Zachi Karni, Amir Gaash
We present an algorithm for smart image fitting: changing the size of an image so that it may fit "naturally" within a given frame. As the frame's dimensions will generally differ from those of the image, the algorithm preserves important details in their original aspect ratio, while less important details undergo more substantial deformations. This problem arises in many commercial print applications. One example is HP SmartStream Designer, a tool for creating variable and personalized content documents.
All new custom path photo book creation
Wiley Wang, Russ Muzzolini
In this paper, we present an all-new custom path that gives consumers full control over their photos and the format of their books, while providing guidance to make creation fast and easy. Users can choose to fully automate the initial creation and then customize every page. The system manages many design themes along with numerous design elements, such as layouts, backgrounds, embellishments and pattern bands. Users can also utilize photos from multiple sources including their computers, Shutterfly accounts, Shutterfly Share sites and Facebook. Users can also use a photo as a background, and add, move and resize photos and text, putting what they want where they want instead of being confined to templates. The new path allows users to add embellishments anywhere in the book, and the high-performance platform can support up to 1,000 photos per book and up to 25 pictures per page. The path offers either Smart Autofill or Storyboard features, allowing customers to populate their books with photos so they can add captions and customize the pages.
Investigation of the role of aesthetics in differentiating between photographs taken by amateur and professional photographers
Shao-Fu Xue, Qian Lin, Daniel R. Tretter, et al.
Automatically quantifying the aesthetic appeal of images is an interesting problem in computer science and image processing. In this paper, we incorporate aesthetic properties and convert them into computable image features for classifying photographs taken by amateur and professional photographers. In particular, color histograms, spatial edge distribution, and repetition identification are used as features. Results of experiments on professional and amateur photograph data sets confirm the discriminative power of these features.
Social Media and Mobile Document Applications
Measuring engagement effectiveness in social media
Lei Li, Tong Sun, Wei Peng, et al.
Social media is becoming increasingly prevalent with the advent of Web 2.0 technologies. Popular social media websites, such as Twitter and Facebook, are attracting a gigantic number of online users to post and share information. An interesting phenomenon in this trend is that more and more users share their experiences or issues with a product, and product service agents then use commercial social media listening and engagement tools (e.g. Radian6, Sysomos) to respond to users' complaints or issues and help them solve their problems. This is often called customer care in social media or social customer relationship management (CRM). However, existing commercial social media tools provide only aggregate-level trends, patterns and sentiment analysis based on keyword-centric, brand-relevant data, which offer little insight into one of the key questions in a social CRM system: how effective is our social customer care engagement? In this paper, we focus on addressing the problem of how to measure the effectiveness of engagement for service agents in customer care. Traditional CRM effectiveness measurements are defined for the call center scenario, where effectiveness is mostly based on the duration of each call and/or the number of answered calls per day. Unlike customer care in a call center, social media gives us access to detailed conversations between agents and customers, so effectiveness can be measured by analyzing the content of conversations and the sentiment of customers.
Building a scalable storage for images in a social network
Jaime Medrano Navarro, Jesús Javier Maestro de la Calle
Images are one of the key components of a social network. An image store needs to be highly scalable and provide redundancy, high availability and the ability to grow. Efficiency is also required so that disk usage and the need for processing power can be minimized. Tuenti's image storage uses a Content Delivery Network (CDN) as a web cache that allows us to meet high throughput requirements. When an image is not cached in the CDN, it is requested from the Image Routing Layer (IRL), which is in charge of finding its physical location. If the IRL is not able to retrieve the image from one of the locations, it can get it from the other copies available, so that neither the CDN nor the user notices the miss. If the requested size is not available in the storage, the IRL will automatically resize the best size available and serve it back. Expensive operations, such as finding the physical location or resizing, are only done when there is a cache miss on the CDN. The physical storage is split into homogeneous buckets that are spread across the storage servers. The growth strategy is to add more storage servers and to rebalance buckets towards them. Rebalancing not only provides free space on full servers but also allows the upload bandwidth to increase because there will be fewer buckets per server, and so fewer uploads per server.
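The fallback behavior described for the routing layer can be sketched as follows. This is only an illustration under stated assumptions, not Tuenti's implementation: the in-memory `Store` dictionaries, the `resize` placeholder, the `route` function, and the rule for picking the "best size available" (smallest stored width at or above the request, otherwise the largest one below it) are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical replica model: (image_id, width) -> encoded image bytes.
Store = Dict[Tuple[str, int], bytes]


def resize(data: bytes, width: int) -> bytes:
    """Placeholder for a real resizer; tags the payload so the demo is checkable."""
    return data + b"@" + str(width).encode()


def read_replicas(replicas: List[Optional[Store]], key: Tuple[str, int]) -> Optional[bytes]:
    """Return the first copy found, skipping unreachable (None) replicas."""
    for store in replicas:
        if store is not None and key in store:
            return store[key]
    return None


def route(replicas: List[Optional[Store]], image_id: str, width: int,
          known_widths: List[int]) -> Optional[bytes]:
    """Serve the exact size if any replica has it, else resize the best available."""
    exact = read_replicas(replicas, (image_id, width))
    if exact is not None:
        return exact
    # Assumed preference: downscale from the nearest larger size,
    # otherwise upscale from the largest smaller size.
    larger = sorted(w for w in known_widths if w >= width)
    smaller = sorted((w for w in known_widths if w < width), reverse=True)
    for w in larger + smaller:
        data = read_replicas(replicas, (image_id, w))
        if data is not None:
            return resize(data, width)
    return None


# First replica is down (None); the second one still answers.
replicas = [None, {("img1", 600): b"big", ("img1", 200): b"small"}]
hit = route(replicas, "img1", 200, [200, 600])
resized = route(replicas, "img1", 400, [200, 600])
missing = route(replicas, "nope", 200, [200, 600])
```

In a deployment like the one described, this path would run only on a CDN cache miss, so the cost of probing replicas and resizing is paid rarely.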
Color correction of smartphone photos with prior knowledge
The human visual system has the property of perceiving object color as constant regardless of the prevailing illumination. However, digital cameras usually lack this capability, and the captured images are digitally corrected to discount the color of the scene light based on the estimated illuminant. Illumination estimation might be erroneous in some artificial or chromatic lighting conditions. A method was proposed to correct digital photos captured with a smartphone camera using the smartphone owner's face as the reference. Taking advantage of the latest smartphones with two built-in cameras, we could use the front camera to capture the smartphone owner's face and compare it with the saved reference face image in order to estimate the scene illuminant. After that, we could properly adjust the capture settings for the main camera in order to take a decent target image; or we could automatically correct the target image based on the illumination estimated by comparing the two face images. The method was implemented on the iOS mobile platform. Experimental results show that the images adjusted using the proposed method are generally more favorable than the pictures taken directly by the default camera application.
XML data compression in web publishing
Ruiheng Qiu, Wei Hu, Zhi Tang, et al.
XML is widely used in various document formats on the web. But it has caused negative impacts such as expensive document distribution time over the web, and long content jumping and rendering delays, especially on mobile devices. We therefore propose a schema-based, efficient, queryable XML compressor, called XTrim, which significantly improves compression ratio by utilizing optimized information in XML Schema while supporting efficient queries. First, XTrim draws structure information from the XML document and its corresponding XML Schema. Then a novel technique is used to transform the XML tree-like structure into a compact indexed form to support efficient queries. At the same time, text values are obtained, and a language-based text trim method (LTT) that facilitates language-specific text compressors is adopted to reduce the size of text values in various languages. In LTT a word composition detection method is proposed to better process text in non-Latin languages. To evaluate the performance of XTrim, we have implemented a compressor and query engine prototype. Extensive experiments show that XTrim outperforms XMill and existing queryable alternatives in both compression ratio and query efficiency. Applying XTrim to documents reduces storage space by up to 30% and cuts the content jumping and rendering delay from 4 seconds to less than 100 ms.
Layout Analysis and Creation
Layout hierarchies for interactive design reuse
Darryl Greig, Andrew Hunter, David Slatter
The advent of viable long tail & self-publishing solutions ([1], [2]) has spawned new requirements for automatic layout technologies. In most cases these attempt to lay out whole pages, spreads or documents based on complete content data. In this paper we introduce a new approach to document layout based on the principle of interactive design reuse, in which a new design is created from an existing high quality design via a sequence of simple steps to establish the final content. Based on our experience building such a system we propose a method of building layout hierarchies and discuss the implementation of editing operations appropriate to this new paradigm.
Automatic page composition with combined image crop and layout metrics
Andrew Hunter, Darryl Greig
Automatic layout algorithms simplify the composition of image-rich documents, but they still require users to have sufficient artistry to supply well cropped and composed imagery. Combining an automatic cropping technology with a document layout system enables better results to be produced faster by less-skilled users. This paper reviews prior work in automatic image cropping and automatic page layout and presents a case for a combined crop and layout technology. We describe one such technology in a system for interactive publication design by amateur self-publishers and show that providing an automatic cropping system with additional information about the layout context can enable it to generate a more appropriate set of ranked crop options for a given image. Furthermore, we show that providing an automatic layout system with sets of ranked crop options for images can enable it to compose more appropriate page layouts.
Psychophysical evaluation of document visual similarity
Aziza Satkhozhina, Ildus Ahmadullin, Seungyon Lee, et al.
Applications that classify and search documents based on their visual appearance need to recognize what document features are the most critical to human perception when humans compare the documents. This paper presents the results of a psychophysical experiment where subjects were asked to group the documents based on their visual similarity. Results from 15 subjects were saved into similarity matrices, and tested for inter-rater agreement. The similarity matrix averaged across the subjects was analyzed using agglomerative hierarchical clustering to identify the clusters. The humans' clustering was approximated with the weighted sum of four distance matrices that we calculated based on four document features. We identified the relative importance of the document features using an optimization method. Then, we tested the approximation using K-fold cross validation and the K-nearest neighbor algorithm. The results of the testing confirm the effectiveness of our approach.
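The idea of approximating a human similarity matrix with a weighted sum of feature distance matrices can be sketched as below. The abstract does not name the optimization method, so a coarse grid search over non-negative weights summing to 1 is used here purely as a stand-in; the four toy distance matrices and the synthetic "human" target are made-up data for illustration only.

```python
from itertools import product


def symmetric(upper, n):
    """Build an n x n symmetric matrix with zero diagonal from its upper triangle."""
    m = [[0.0] * n for _ in range(n)]
    it = iter(upper)
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = m[j][i] = float(next(it))
    return m


def combine(mats, weights):
    """Entry-wise weighted sum of equally sized square matrices."""
    n = len(mats[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, mats)) for j in range(n)]
            for i in range(n)]


def sq_error(a, b):
    """Sum of squared entry-wise differences between two matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def fit_weights(mats, target, steps=10):
    """Grid search over weights in multiples of 1/steps that sum to 1."""
    best, best_err = None, float("inf")
    for combo in product(range(steps + 1), repeat=len(mats)):
        if sum(combo) != steps:
            continue
        w = [c / steps for c in combo]
        err = sq_error(combine(mats, w), target)
        if err < best_err:
            best, best_err = w, err
    return best


# Toy distance matrices for 4 documents and 4 hypothetical features.
mats = [symmetric(u, 4) for u in ([1, 2, 3, 1, 2, 1], [2, 1, 1, 3, 1, 2],
                                  [1, 1, 4, 2, 2, 3], [3, 2, 2, 1, 4, 1])]
# Synthetic stand-in for the averaged human matrix, built from known weights.
target = combine(mats, [0.4, 0.3, 0.2, 0.1])
weights = fit_weights(mats, target)
print(weights)
```

The recovered weights play the role of relative feature importance; a real study would fit them against measured subject data and validate on held-out documents.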
Similarity pyramid: browsing a document database with respect to visual similarity
Ildus Ahmadullin, Jan Allebach
Managing large document databases has become an important task. Sorting documents with respect to their visual similarity and layout features, and visualizing the whole document database, are desirable applications. A user may wish to search for documents in a database that are similar to a query in terms of their stylistic features, or he/she may want to browse the whole database. In these tasks, clustering similar documents and organizing the document database with respect to the clusters is preferable to presenting documents in a random order. In this paper, we propose organization of single-page documents in a 3-D hierarchical structure called a similarity pyramid. The pyramid is constructed from a stack of document database embeddings on a 2-D surface with the help of a nonlinear dimensionality reduction algorithm called Isomap. The mapping algorithm preserves similarity distances between documents by mapping documents that are close to each other in a feature space to points on a low-dimensional surface that are close to each other. Higher levels of the pyramid consist of document image icons that represent a large group of roughly similar documents, whereas lower levels contain document image icons representing small groups of very similar documents. A user can browse the database by moving along a certain level of a pyramid or by moving between different levels
Automatic design of magazine covers
Ali Jahanian, Jerry Liu, Daniel R. Tretter, et al.
In this paper, we propose a system for automatic design of magazine covers that quantifies a number of concepts from art and aesthetics. Our solution to automatic design of this type of media has been shaped by input from professional designers, magazine art directors and editorial boards, and journalists. Consequently, a number of principles in design and rules in designing magazine covers are delineated. Several techniques are derived and employed in order to quantify and implement these principles and rules in the format of a software framework. At this stage, our framework divides the task of design into three main modules: layout of magazine cover elements, choice of color for masthead and cover lines, and typography of cover lines. Feedback from professional designers on our designs suggests that our results are congruent with their intuition.
Content Understanding
Automatic content recognition for the next-generation TV experience
Smart TVs has been introduced. Second, applications running on mobile devices (so called "second-screen apps") have significantly enriched TV watching experience. As an enabler of content-aware TVs and apps, automatic content recognition (ACR) is attracting a lot of attention recently. This paper presents an overview of ACR in this context. It attempts to answer a number of questions: Why do we need ACR for the next generation TV experience? What is the relationship between ACR and existing technologies? What are the unique requirements and challenges on ACR in those applications? What are the typical implementation architectures? It also describes the existing products in this space.
Marketing image categorization using hybrid human-machine combinations
Nathan Gnanasambandam, Himanshu Madhu
Marketing instruments with nested, short-form, symbol-loaded content need to be studied differently. Image classification in the Web 2.0 world can dynamically use a configurable amount of internal and external data as well as varying levels of crowd-sourcing. Our work is one such examination of how to construct a hybrid technique involving learning and crowd-sourcing. Through a parameter called turkmix and the multitude of crowd-sourcing techniques available, we show that we can control the trend of metrics such as precision and recall on the hybrid categorizer.
Global image analysis to determine suitability for text-based image personalization
Lately, image personalization is becoming an interesting topic. Images with variable elements such as text usually appear much more appealing to the recipients. In this paper, we describe a method to pre-analyze the image and automatically suggest to the user the most suitable regions within an image for text-based personalization. The method is based on input gathered from experiments conducted with professional designers. It has been observed that regions that are spatially smooth and regions with existing text (e.g. signage, banners, etc.) are the best candidates for personalization. This gives rise to two sets of corresponding algorithms: one for identifying smooth areas, and one for locating text regions. Furthermore, based on the smooth and text regions found in the image, we derive an overall metric to rate the image in terms of its suitability for personalization (SFP).
Chrominance watermark embed using a full-color visibility model
A watermark embed scheme has been developed to insert a watermark with the maximum signal strength for a user selectable visibility constraint. By altering the watermark strength and direction to meet a visibility constraint, the maximum watermark signal for a particular image is inserted. The method consists of iterative embed software and a full color human visibility model plus a watermark signal strength metric. The iterative approach is based on the intersections between hyper-planes, which represent visibility and signal models, and the edges of a hyper-volume, which represent output device visibility and gamut constraints. The signal metric is based on the specific watermark modulation and detection methods and can be adapted to other modulation approaches. The visibility model takes into account the different contrast sensitivity functions of the human eye to L, a and b, and masking due to image content.
Document image orientation based on both text and image
This paper investigated the problem of orientation detection for document images with Chinese characters. These images may be in four orientations: right side up, upside down, and rotated 90° or 270° counterclockwise. First, we presented the structure of a text-recognition-based orientation detection algorithm. Text line verification and orientation judgment methods were mainly discussed, and multiple experiments were then carried out. Distance-difference-based text line verification and confidence-based text line verification were proposed and compared with methods without text line verification. Then, a picture-based orientation detection framework was adopted for the situation where no text line was detected. This high-level classification problem was solved by relatively low-level vision features including Color Moments (CM) and Edge Direction Histogram (EDH), with a distance-based classification scheme. Finally, a confidence-based classifier combination strategy was employed in order to make full use of the complementarity between different features and classifiers. Experiments showed that both text line verification methods were able to improve the accuracy of orientation detection, and picture-based orientation detection performed well on the no-text image set.