Imaging and Printing in a Web 2.0 World II

Front Matter: Volume 7879

This PDF file contains the front matter associated with SPIE Proceedings Volume 7879, including the Title Page, Copyright information, Table of Contents, Introduction, and the Conference Committee listing.

Web-based magazine design for self publishers

Andrew Hunter, David Slatter, Darryl Greig

Short run printing technology and web services such as MagCloud provide new opportunities for long-tail magazine publishing. They enable self publishers to supply magazines to a wide range of communities, including groups that are too small to be viable as target communities for conventional publishers. In a Web 2.0 world where users constantly discover new services and where they may be infrequent patrons of any single service, it is unreasonable to expect users to learn the complex service behaviors. Furthermore, we want to open up publishing opportunities to novices who are unlikely to have prior experience of publishing and who lack design expertise. Magazine design automation is an ambitious goal, but recent progress with another web service, Autophotobook, proves that some level of automation of publication design is feasible. This paper describes our current research effort to extend the automation capabilities of Autophotobook to address the issues of magazine design so that we can provide a service to support professional-quality self publishing by novice users for a wide range of community types and sizes.

Improve artwork designs through data ranking system

Wiley Wang, Russ Muzzolini

In personalized digital printing of products such as greeting cards, calendars, and photo books, people select artwork to match their photos according to their preferences. Artwork design elements are often categorized by occasion, style, and product. The number of designs grows significantly as customers demand more choices and as popular design trends rise and fade season by season. It is crucial to manage and understand how design elements are used in order to create the most desirable products. In this paper, we analyze and compare different design tracking systems. Artwork designs are labeled, ranked, and cross-referenced. For each system, we demonstrate the scale of application, the data collection techniques, and its advantages and disadvantages.

DOM-based print-link detection for web article extraction

Sam Liu, Suk-Hwan Lim, Jerry Liu

Web article pages usually have hyperlinks (or links) that lead to print-friendly web pages containing mainly the article content. Content extraction using these print-friendly pages is generally easier and more reliable, but there are many variations of the print-link representations in HTML that make robust print-link detection more difficult than it first appears. First, the link can be text-based, image-based, or both. For example, there is a lexicon of phrases used to indicate print-friendly pages, such as "print", "print article", "print-friendly version", etc. In addition, some links use printer-resembling image icons with or without a print phrase present. To complicate the matter further, not all the links contain a valid URL; instead, some pages are dynamically generated either by client-side JavaScript or by the server, so no URL is available for extraction. We estimate that more than 90% of Web article pages have print-links, of which about 35% have valid print-friendly URLs, which is a good percentage. Our solution to the print-link extraction problem has two stages: (1) the detection of the print-link, and (2) the retrieval of the print-friendly page URL from the link attributes, including the test for its validity. Experimental results based on roughly 2000 web article pages suggest our solution is capable of achieving over 99% precision and 97% recall.
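
The two stages described in this abstract can be illustrated with a minimal sketch. It is not the authors' detector: the regular expression, the class name `PrintLinkFinder`, and the validity rule in `is_valid_print_url` are assumptions made here for illustration, using only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser
import re

# Assumed lexicon: anything starting with "print" (print, printer, print-friendly).
PRINT_HINT = re.compile(r"\bprint", re.IGNORECASE)


class PrintLinkFinder(HTMLParser):
    """Stage 1: collect <a> elements whose text or nested <img> hints at printing."""

    def __init__(self):
        super().__init__()
        self.candidates = []  # href values of detected print-links (may be None)
        self._in_link = False
        self._href = None
        self._is_print = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._in_link = True
            self._href = attrs.get("href")
            self._is_print = False
        elif tag == "img" and self._in_link:
            # Image-based link: look at the alt text and the icon file name.
            hint = " ".join(filter(None, (attrs.get("alt"), attrs.get("src"))))
            if PRINT_HINT.search(hint):
                self._is_print = True

    def handle_data(self, data):
        # Text-based link: look for a print phrase in the anchor text.
        if self._in_link and PRINT_HINT.search(data):
            self._is_print = True

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            if self._is_print:
                self.candidates.append(self._href)
            self._in_link = False


def is_valid_print_url(href):
    """Stage 2 (assumed rule): reject empty, fragment-only, and script links."""
    if not href:
        return False
    href = href.strip()
    return not (href.startswith("#") or href.lower().startswith("javascript:"))


def find_print_urls(html):
    finder = PrintLinkFinder()
    finder.feed(html)
    return [href for href in finder.candidates if is_valid_print_url(href)]
```

A link such as `<a href="javascript:window.print()">` with a printer icon would be detected in stage 1 but rejected in stage 2, which mirrors the case where no URL is available for extraction.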

A web-based troubleshooting tool to help customers self-solve color issues with a digital printing workflow

Hector J. Santos-Villalobos, Victor Loewen, Mark Lehto, et al.

Current printing technologies enable customers to reproduce high quality, realistic, and colorful hard copies of their digital documents. Although the activity of printing is transparent to the customers, the progression of a customer's document through the color printing workflow (CPW) is a complex process that may alter the colors in the print job. Given the complexity of the CPW, it is a difficult problem to diagnose the source of the color issue. Novel tools and methods that address this challenge are beneficial for both the manufacturer and its customers. We propose a Web-based troubleshooting tool that helps customers to self-solve color issues with electrophotographic laser printers when printing solid colors in graphics and text. The tool helps the customer to reconfigure his/her CPW following printing best practices. If the issue is still unresolved, the tool guides the user to search the gamut of the printer for his/her color preference. The usability of the tool was carefully evaluated with human subject experiments. Also, the description and organization of the troubleshooting tasks were continuously reviewed and improved in regular meetings of the development team. In this paper, we describe the troubleshooting strategy, the color preference search algorithm, and the results of the usability experiments.

Language-based color editing for mobile device

Yonghui Zhao, Raja Bala, Karen M. Braun, et al.

Natural language color (NLC) was initially developed as a web-based application and then deployed in one Xerox print driver. NLC changes the image-editing paradigm from the use of curves, sliders, and knobs to the use of verbal text-based commands such as "make light green much less yellowish". The technology appeals to a common user who has no expert knowledge in color science, and this naturally leads one to think about its use in mobile devices. A prototype GUI design for language-based color editing on the iPhone platform is presented that uses several of its haptic interfaces (e.g. "slot-machine", shaking, swiping, etc.). A textual interface is provided to select a color to be modified within the image and a direction of change for the modification. A swipe interface is provided to select a magnitude and polarity for the modification. Actions on the textual and swipe interfaces are converted to natural language commands that are in turn used to derive a color transformation that is applied to relevant portions of the image to yield a modified image. The modifications are displayed in real time to the user.

Personalized imaging: moving closer to reality

Hengzhou Ding, Raja Bala, Zhigang Fan, et al.

The availability of web and on-line image sharing services makes image personalization and customization a more interesting topic. Nonetheless, designing a personalized image is a time-consuming task, requiring hours of work by expert designers. Observing the potential opportunity to make the design process easier and more amenable to ordinary users, we presented a semi-automatic tool for designing personalized images at the Electronic Imaging (EI) symposium last year [1, 2]. As a follow-up, we present several improvements to the original semi-automatic tool, for both text insertion and text replacement on planar surfaces. We also describe our effort in implementing the tool as a true web-based service, which eliminates the need for installation of any software or packages by the user. We believe that we have made the technology of image personalization more friendly and accessible to ordinary users.

Document similarity measures and document browsing

Ildus Ahmadullin, Jian Fan, Niranjan Damera-Venkata, et al.

Managing large document databases is an important task today. Being able to automatically compare document layouts and classify and search documents with respect to their visual appearance proves to be desirable in many applications. We measure single page documents' similarity with respect to distance functions between three document components: background, text, and saliency. Each document component is represented as a Gaussian mixture distribution, and distances between different documents' components are calculated as probabilistic similarities between corresponding distributions. The similarity measure between documents is represented as a weighted sum of the components' distances. Using this document similarity measure, we propose a browsing mechanism operating on a document dataset. For these purposes, we use a hierarchical browsing environment which we call the document similarity pyramid. It allows the user to browse a large document dataset and to search for documents in the dataset that are similar to the query. The user can browse the dataset on different levels of the pyramid, and zoom into the documents that are of interest.
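
The weighted-sum combination mentioned in this abstract can be sketched as follows. The per-component distances are taken as given inputs here (the Gaussian mixture comparison is not reproduced), and the function names and weight values are assumptions for illustration only.

```python
# The three document components named in the abstract.
COMPONENTS = ("background", "text", "saliency")


def document_distance(component_distances, weights):
    """Weighted sum of per-component distances between two documents."""
    return sum(weights[c] * component_distances[c] for c in COMPONENTS)


def rank_by_similarity(distances_to_query, weights):
    """Order document ids from most to least similar to a query.

    distances_to_query maps a document id to its per-component
    distances from the query document (hypothetical input format).
    """
    return sorted(
        distances_to_query,
        key=lambda doc_id: document_distance(distances_to_query[doc_id], weights),
    )
```

With weights of, say, 0.2, 0.5, and 0.3 for background, text, and saliency, a document that differs from the query mainly in background would rank ahead of one that differs mainly in text.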

Adaptive removal of background and white space from document images using seam categorization

Claude Fillion, Zhigang Fan, Vishal Monga

Document images are obtained regularly by rasterization of document content and as scans of printed documents. Resizing via background and white space removal is often desired for better consumption of these images, whether on displays or in print. While white space and background are easy to identify in images, existing methods such as naïve removal and content-aware resizing (seam carving) each have limitations that can lead to undesirable artifacts, such as uneven spacing between lines of text or poor arrangement of content. An adaptive method based on image content is hence needed. In this paper we propose an adaptive method to intelligently remove white space and background content from document images. Document images are different from pictorial images in structure. They typically contain objects (text letters, pictures and graphics) separated by uniform background, which includes both white paper space and other uniform color background. Pixels in uniform background regions are excellent candidates for deletion if resizing is required, as they introduce less change in document content and style, compared with deletion of object pixels. We propose a background deletion method that exploits both local and global context. The method aims to retain the document structural information and image quality.
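
For contrast with the adaptive method proposed in this abstract, the naïve removal baseline it mentions can be sketched as below. This is only the baseline, written under the assumption that the image is a list of rows of grayscale values; the function names and the `tolerance` parameter are hypothetical.

```python
def is_uniform(pixels, tolerance=0):
    """True if every pixel in the row or column has (nearly) the same value."""
    return max(pixels) - min(pixels) <= tolerance


def drop_uniform_lines(lines, tolerance=0):
    """Delete every uniform row (or column, if given columns)."""
    return [line for line in lines if not is_uniform(line, tolerance)]


def remove_uniform_background(image, tolerance=0):
    """Naive removal: delete all uniform rows, then all uniform columns."""
    rows = drop_uniform_lines(image, tolerance)
    if not rows:
        return []
    columns = [list(col) for col in zip(*rows)]
    kept = drop_uniform_lines(columns, tolerance)
    return [list(row) for row in zip(*kept)]
```

Because every uniform line is deleted regardless of context, all spacing between text lines collapses, which is the kind of artifact that motivates a context-aware method.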

Aesthetic role of transparency and layering in the creation of photo layouts

Maria V. Ortiz Segovia, Niranjan Damera-Venkata, Eamonn O'Brien-Strain, et al.

Even though technology has allowed us to measure many different aspects of images, it is still a challenge to objectively measure their aesthetic appeal. A more complex challenge is presented when an arrangement of images is to be analyzed, such as in a photo-book page. Several approaches have been proposed to measure the appeal of a document layout that, in general, make use of geometric features such as the position and size of a single object relative to the overall layout. Fewer efforts have been made to include in a metric the influence of the content and composition of images in the layout. Many of the aesthetic characteristics that graphic designers and artists use in their daily work have been either left out of the analysis or only roughly approximated in an effort to materialize the concepts. Moreover, graphic design tools such as transparency and layering play an important role in the professional creation of layouts for documents such as posters and flyers. The main goal of our study is to apply similar techniques within an automated photo-layout generation tool. Among other design techniques, the tool makes use of layering and transparency in the layout to produce a professional-looking arrangement of the pictures. Two series of experiments with people from different levels of expertise with graphic design provided us with the tools to make the results of our system more appealing. In this paper, we discuss the results of our experiments in the context of distinct graphic design concepts.

Automatic picture orientation detection based on classifier combination

Yuejia Sun, Changsong Liu, Xiaoqing Ding, et al.

Automatic picture orientation recognition is of great significance in many applications such as consumer gallery management, webpage browsing, content-based searching, and web printing. We try to solve this high-level classification problem using relatively low-level features, including Spatial Color Moment (CM) and Edge Direction Histogram (EDH). An improved distance-based classification scheme is adopted as our classifier. Instead of collecting and training samples for all four classes, we propose an input-vector-rotating strategy, which is computationally more efficient than several conventional schemes. We then study classifier combination algorithms to make full use of the complementarity between different features and classifiers. Our classifier combination methods operate at two levels: feature level and measurement level. We present two classifier combination structures (parallel and cascaded) at the measurement level with a rejection option. As the precondition of measurement-level methods, the theory of Classifier's Confidence Analysis (CCA) is introduced with the definition of concepts such as classifier's confidence and generalized confidence. The classification system finally approached 90% recognition accuracy on a wide unconstrained consumer picture set.
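
One way to read the input-vector-rotating strategy is that a single "upright" model scores the input under each of the four rotations, so no separate training set per orientation is needed. The sketch below illustrates that reading on a toy grid of scalar block features; `upright_score` is a hypothetical stand-in for the trained classifier, and real features such as edge direction histograms would also need their bins permuted on rotation.

```python
def rotate_grid_cw(grid):
    """Rotate a square grid of block features 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def detect_orientation(grid, upright_score):
    """Return how many 90-degree clockwise turns make the input look most upright.

    upright_score is a caller-supplied function (assumed here) that returns
    a higher value the more a feature grid resembles an upright picture.
    """
    best_turns, best_score = 0, float("-inf")
    current = grid
    for turns in range(4):
        score = upright_score(current)
        if score > best_score:
            best_turns, best_score = turns, score
        current = rotate_grid_cw(current)
    return best_turns
```

For example, with a toy scorer that prefers a bright top row (sky) over a dark bottom row, a picture whose bright blocks sit on the right edge needs three clockwise turns to become upright.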

Whiteboard sharing: capture, process, and print or email

Michael Gormish, Berna Erol, Daniel G. Van Olst, et al.

Whiteboards support face-to-face meetings by facilitating the sharing of ideas, focusing attention, and summarizing. However, at the end of the meeting participants desire some record of the information from the whiteboard. While there are whiteboards with built-in printers, they are expensive and relatively uncommon. We consider the capture of the information on a whiteboard with a mobile phone, improving the image quality with a cloud service, and sharing the results. This paper describes the algorithm for improving whiteboard image quality, the user experience for both a web widget and a smartphone application, and the necessary adaptations for providing this as a web service. The web widget and the mobile apps for both iPhone and Android are currently freely available and have been used by more than 50,000 people.

Building a print on demand web service

Prakash Reddy, Benedict Rozario, Shariff Dudekula, et al.

There is considerable effort underway to digitize all books that have ever been printed. There is a need for a service that can take raw book scans and convert them into Print on Demand (POD) books. Such a service augments the digitization effort and enables broader access to a wider audience. To make this service practical we have identified three key challenges that needed to be addressed: a) produce high quality images by eliminating artifacts that exist due to the age of the document or that are introduced during the scanning process; b) develop an efficient automated system to process book scans with minimum human intervention; and c) build an ecosystem which allows the target audience to discover these books.

iULib: where UDL and Wikipedia could meet

Yonghong Tian, Tiejun Huang, Wen Gao

Empowering the group collaboration and knowledge-sharing capabilities of the Universal Digital Library (UDL) is an important task now that more than 1.5 million digitized books are open to access online. One motivation for developing such a platform is the emergence of Web 2.0 in recent years, especially the rapidly increasing popularity of Wikipedia. This paper presents our vision, which we call iULib, about where and how UDL and Wikipedia could meet. In the first phase, we directly apply the Wiki architecture and software in UDL to upgrade the digital library into an interactive platform that facilitates community and collaboration. A preliminary implementation shows the feasibility and reliability of our design. Furthermore, as a free encyclopedia that assembles contributions from different users, Wikipedia may also be used as a knowledge base for UDL. As a result, UDL can be upgraded into an intelligent platform for information retrieval and knowledge sharing. Our practice at the WikipediaMM task in ImageCLEF 2008 shows that the knowledge network constructed from Wikipedia can be used to effectively expand the query semantics of image retrieval. It is expected that Wikipedia and the digital library can integrate each other's valuable results and best practices to benefit each other.

Book Widget: embedding automated photo-document publication on the Web and in mobile devices

Eamonn O'Brien-Strain, Andrew Hunter, Jerry Liu, et al.

We describe a cloud-based automated-publishing platform that allows third party developers to embed our software components into their applications, enabling their users to rapidly create documents for interactive viewing, or fulfillment via mail or retail printing. We also describe how applications built on this platform can integrate with a variety of different consumer digital ecosystems, and how we will address the quality and scaling challenges.

Semantic photo books: leveraging blogs and social media for photo book creation

Mohamad Rabbath, Philipp Sandhaus, Susanne Boll

Recently, we observed a substantial increase in users' interest in sharing their photos online in travel blogs, social communities and photo sharing websites. An interesting aspect of these web platforms is their high level of user-media interaction, which makes them a high-quality source of semantic annotations: users comment on each other's photos, add external links to their travel blogs, tag each other in the social communities and add captions and descriptions to their photos. However, while those media assets are shared online, many users still highly appreciate the representation of these media in appealing physical photo books where the semantics are represented in the form of descriptive text, maps, and external elements in addition to their related photos. Thus, in this paper we aim at fulfilling this need and provide an approach for creating photo books from Web 2.0 resources. We concentrate on two kinds of online shared media as resources for printable photo books: (a) blogs, especially travel blogs, and (b) social community websites like Facebook, which witness a rapidly growing number of shared media elements including photos. We introduce an approach to select media elements including photos, geographical maps and texts from both blogs and social networks semi-automatically, and then use these elements to create a printable photo book with an appealing layout. Because the selected media elements can be too many for the resulting book, we choose the most suitable ones by exploiting content-based, social-based, and interaction-based criteria. Additionally we add external media elements such as geographical maps, texts and externally hosted photos from linked resources. Having selected the important media, our approach uses a genetic algorithm to create an appealing layout using aesthetic rules, such as positioning the photo with the related text or map in a way that respects the golden ratio and symmetry.
Distributing the media over the pages is done by optimizing the distribution according to several rules such that no pages with purely textual elements without photos are produced. For the page layout appropriate photos are chosen for the background based on their salience. Other media assets, such as texts, photos and geographical maps are positioned in the foreground by a dynamic page layout algorithm respecting both the content of the photos and the background, and common rules for visual layout. The result of our system is a photo book in a printable format. We implemented our approach as web services that analyze the media elements, enrich them, and create the layout in order to finally publish a photo book. The connection to those services is implemented in two interfaces. The first is a tool to select entries from personal blogs, and the second is a Facebook application that allows the user to select photos from his albums.

Automatic image selection scheme utilizing comments for insertion of images into weblogs

Tomoaki Konno, Emi Myodo, Koichi Takagi, et al.

This paper presents a scheme which utilizes comments given to images on an image-sharing site in order to obtain an appropriate image for insertion into poem-like weblogs (blogs) as a way to represent their atmosphere (impression). The result shows that utilizing comments is effective. To achieve this purpose, there are two issues: how impression words are extracted from blogs and how images representing the impression words are obtained. Assuming that it is important to obtain images representing the impression words, this paper focuses on only the latter issue. We hypothesize that comments and tags extracted from an image-sharing site can be adequate for obtaining images corresponding to impression words at low cost. In particular, utilizing comments can be more appropriate for the image search with impression words than utilizing tags because the impression words are often used in comments. Therefore, we propose a scheme which utilizes comments to obtain appropriate images. In order to investigate the effectiveness of utilizing comments, conformance between impression words and the images was evaluated. The rating for conformance is 3.5 on a scale of 1 to 5 when utilizing comments, which is 0.6 higher than when utilizing tags.

Title identification of web article pages using HTML and visual features

Jian Fan, Ping Luo, Parag Joshi

Extracting informative content from Web article pages has many applications such as printing and content reuse. The title is a very significant and unique component of an article. However, identifying the true title is not an easy problem even for human readers. In this paper, we present a title identification method that takes into account several features, including the title field of the HTML page and the HTML tag of a DOM node as well as font size and horizontal alignment. We tested our method on a ground truth data set consisting of 1993 pages from 98 web sites and achieved 97.5% accuracy, about 20% above a baseline method based on only the font size.
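
The idea of combining the four feature types named in this abstract can be sketched as a simple scoring function. The weights, the `TAG_BONUS` table, and the candidate-node dictionary format are assumptions for illustration and are not taken from the paper; string similarity uses `difflib.SequenceMatcher` from the standard library.

```python
from difflib import SequenceMatcher

# Hypothetical tag preferences: heading tags are more likely to hold the title.
TAG_BONUS = {"h1": 1.0, "h2": 0.6, "h3": 0.3}


def score_candidate(node, page_title, max_font_size):
    """Combine title-field match, tag, font size, and alignment (assumed weights)."""
    title_match = SequenceMatcher(
        None, node["text"].lower(), page_title.lower()
    ).ratio()
    tag_score = TAG_BONUS.get(node["tag"], 0.0)
    font_score = node["font_size"] / max_font_size
    align_score = 1.0 if node["align"] in ("left", "center") else 0.0
    return (0.4 * title_match + 0.25 * tag_score
            + 0.25 * font_score + 0.1 * align_score)


def identify_title(nodes, page_title):
    """Return the candidate DOM node with the highest combined score."""
    max_font = max(node["font_size"] for node in nodes)
    return max(nodes, key=lambda n: score_candidate(n, page_title, max_font))
```

A site banner set in the largest font would win under a font-size-only baseline, whereas the combined score can prefer a smaller `h1` headline that closely matches the HTML title field.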

Creating 3D realistic head: from two orthogonal photos to multiview face contents

Yuan Lin, Qian Lin, Feng Tang, et al.

3D head models have many applications, such as virtual conferences, 3D web games, and so on. Several existing web-based face modeling solutions can create a 3D face model from one or two user-uploaded face images, but they are limited to generating a 3D model of the face region only. The accuracy of such reconstruction is very limited for side views, as well as for hair regions. The goal of our research is to develop a framework for reconstructing a realistic 3D human head based on two approximately orthogonal views. Our framework takes two images and goes through segmentation, feature point detection, 3D bald head reconstruction, 3D hair reconstruction and texture mapping to create a 3D head model. The main contribution of the paper is that the processing steps are applied to both the face region and the hair region.

Mobile multimedia understanding applications: an overview

Xiaofan Lin

In recent years, mobile devices are quickly reaching almost every corner of our daily life in a variety of forms: personal media players, smart phones, netbooks, and tablets. Besides the more powerful, smaller, and more versatile hardware, another driving force is the vast number of software applications ("apps") on those mobile devices. A number of mobile apps employ intelligent multimedia understanding (MU) technologies. This paper gives an overview of such apps. The focus is not on the underlying MU techniques, which are already covered by a huge amount of literature. Instead, it attempts to shed some light on the junction of mobile apps and MU. For this purpose, it addresses a number of important aspects: unique requirements and characteristics of MU-related apps, values brought in by MU, typical MU technologies, various system architectures, available development tools, and related standards.

Learning object detectors from online image search

Feng Tang, Daniel R. Tretter

Being able to detect distinguishable objects is a key component in many high-level computer vision applications. Traditional methods for building such detectors require a large amount of carefully collected and cleaned data. For example, to build a face detector, a large number of face images need to be collected, and the faces in each image need to be cropped and aligned as the data for training. This process is tedious and error-prone. Recently more and more people are sharing their photos on the internet. If we could leverage these data for building a detector, it would save a tremendous amount of effort in collecting training data. Popular internet search engines and community photo websites like Google image search, Picasa, and Flickr make it possible to harvest online images for image understanding tasks. In this paper, we develop a method leveraging images obtained from online image search to build an object detector. The proposed method can automatically identify the most distinguishable features across the downloaded images. Using these learned features, a detector can be built to detect the object in a new image. Experiments show promising results of our approach.

Image categorization for marketing purposes

Mishari I. Almishari, Haengju Lee, Nathan Gnanasambandam

Images meant for marketing and promotional purposes (i.e. coupons) represent a basic component in incentivizing customers to visit shopping outlets and purchase discounted commodities. They also help department stores in attracting more customers and, potentially, speeding up their cash flow. While coupons are available from various sources (print, web, etc.), categorizing these monetary instruments is a benefit to the users. We are interested in an automatic categorizer system that aggregates these coupons from different sources (web, digital coupons, paper coupons, etc.) and assigns a type to each of these coupons in an efficient manner. While there are several dimensions to this problem, in this paper we study the problem of accurately categorizing/classifying the coupons. We propose and evaluate four different techniques for categorizing the coupons, namely a word-based model, an n-gram-based model, an externally weighing model, and a weight decaying model, which take advantage of known machine learning algorithms. We evaluate these techniques and they achieve high accuracies in the range of 73.1% to 93.2%. We provide various examples of accuracy optimizations that can be performed and show a progressive increase in categorization accuracy for our test dataset.

Text extraction from web images

Changsong Liu, Cheng Yang, Xiaoqing Ding, et al.

Web images constitute an important part of web documents and have become a powerful medium of expression, especially images containing text. The text embedded in web images often carries semantic information related to the layout and content of the pages. Statistics show that there is a significant need to detect and recognize text in web images. In this paper, we first give a short review of the methods proposed for text detection and recognition in web images; then a framework to extract text from web images is presented, including stages of text localization and recognition. In the text localization stage, a localization method is applied to generate text candidates and a two-stage strategy is utilized to select text candidates; text regions are then localized using a coarse-to-fine text line extraction algorithm. For text recognition, two text region binarization methods have been proposed to improve the performance of text recognition in web images. Experimental results for text localization and recognition prove the effectiveness of these methods. Additionally, a recognition evaluation for text regions in web images has been conducted as a benchmark.

Web image annotation using two-step filtering on social tags

Sunyoung Cho, Jaeseong Cha, Hyeran Byun

Web image annotation has become an important issue with the explosion of web images and the need for effective image search. Social tags have recently been utilized for image annotation because they can reflect the user's tagging tendency and reduce the semantic gap. However, an effective filtering procedure is required to extract the relevant tags because of user subjectivity and noisy tags. In this paper, we propose a two-step filtering on social tags for image annotation. This method conducts the filtering and verification tasks by analyzing the tags of visual neighbor images using a voting method and co-occurrence analysis. Our method consists of the following three steps: 1) the tag candidate set is found by searching the visual neighbor images, 2) from a given tag candidate set, coarse filtering is conducted by tag grouping and a voting technique, 3) the dense filtering is conducted by using similarity verification on the coarsely filtered candidate tag set. To evaluate the performance of our approach, we conduct experiments on a social-tagged image dataset obtained from Flickr. We compare the annotation accuracy between the voting method and our proposed method. Our experimental results show that our method improves image annotation.
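
The voting part of the coarse filtering step can be sketched as follows. This covers only a plain vote over the tags of visual neighbor images; tag grouping and the similarity-based dense filtering are not shown, and the function name and the `min_votes` threshold are assumptions for illustration.

```python
from collections import Counter


def vote_tags(neighbor_tags, min_votes=2):
    """Keep tags that appear on at least min_votes visual neighbor images.

    neighbor_tags is a list with one tag list per neighbor image
    (hypothetical input format). Each neighbor votes once per tag.
    """
    votes = Counter()
    for tags in neighbor_tags:
        votes.update(set(tags))
    return [tag for tag, count in votes.most_common() if count >= min_votes]
```

Tags used by only one neighbor, such as personal or date tags, are dropped by the threshold, which is one simple way to suppress subjective and noisy tags before a finer verification step.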
