Proceedings Volume 6809

Visualization and Data Analysis 2008


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 27 January 2008
Contents: 10 Sessions, 23 Papers, 0 Presentations
Conference: Electronic Imaging 2008
Volume Number: 6809

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 6809
  • Information Visualization I
  • Information Visualization II
  • Visual Analytics
  • Flow Visualization
  • Image Analysis
  • Information Visualization III
  • Scientific Visualization
  • Volume Visualization
  • Interactive Paper Session
Front Matter: Volume 6809
Front Matter: Volume 6809
This PDF file contains the front matter associated with SPIE-IS&T Proceedings Volume 6809, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Information Visualization I
Visualizing extreme-scale data
The ability to extract meaning from the huge amounts of data obtained from simulations, experiments, sensors, or the world wide web gives one a tremendous advantage over others in the respective area of business or study. Visualization has become a hot topic because it enables that ability. As data sizes grow from terascale to petascale and exascale, new visualization techniques must be developed and integrated into data analysis tools and problem-solving environments so the collected data can be fully exploited. In this talk, I will point out a few important directions for advancing visualization technology, including parallel visualization, knowledge-assisted visualization, intelligent visualization, and in situ visualization. I will draw on some of the projects we have done at UC Davis in my discussion.
Analysis of eye-tracking experiments performed on a Tobii T60
Chris Weigle, David C. Banks
Commercial eye-gaze trackers have the potential to be an important tool for quantifying the benefits of new visualization techniques. The expense of such trackers has made their use relatively infrequent in visualization studies. As such, it is difficult for researchers to compare multiple devices: obtaining several demonstration models is impractical in cost and time, and quantitative measures from real-world use are not readily available. In this paper, we present a sample protocol to determine the accuracy of a gaze-tracking device.
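A minimal sketch of how such an accuracy measure might be computed, assuming gaze samples and a fixation target in screen pixels plus a known viewing distance and pixel pitch (the function and parameter names are illustrative, not taken from the paper):

    import numpy as np

    def angular_error_deg(gaze_px, target_px, viewing_distance_mm, pixel_pitch_mm):
        """Angular error (degrees) between gaze samples and a fixation target.

        gaze_px, target_px: (N, 2) and (2,) arrays of screen coordinates in pixels.
        viewing_distance_mm: eye-to-screen distance; pixel_pitch_mm: size of one pixel.
        """
        offset_mm = (np.asarray(gaze_px) - np.asarray(target_px)) * pixel_pitch_mm
        dist_mm = np.linalg.norm(offset_mm, axis=1)
        return np.degrees(np.arctan2(dist_mm, viewing_distance_mm))

    # Example: mean accuracy over samples recorded while fixating a known target.
    gaze = np.array([[962.0, 541.0], [958.0, 538.0], [965.0, 545.0]])
    print(angular_error_deg(gaze, np.array([960.0, 540.0]), 600.0, 0.27).mean())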
Information Visualization II
Visualizing multidimensional query results using animation
Amit P. Sawant, Christopher G. Healey
Effective representation of large, complex collections of information (datasets) presents a difficult challenge. Visualization is a solution that uses a visual interface to support efficient analysis and discovery within the data. Our primary goal in this paper is to present a technique that allows viewers to compare multiple query results representing user-selected subsets of a multidimensional dataset. We present an algorithm that visualizes multidimensional information along a space-filling spiral. Graphical glyphs that vary their position, color, and texture appearance are used to represent attribute values for the data elements in each query result. Guidelines from human perception allow us to construct glyphs that are specifically designed to support exploration, facilitate the discovery of trends and relationships both within and between data elements, and highlight exceptions. A clustering algorithm applied to a user-chosen ranking attribute bundles together similar data elements. This encapsulation is used to show relationships across different queries via animations that morph between query results. We apply our techniques to the MovieLens recommender system to demonstrate their applicability in a real-world environment, and then conclude with a simple validation experiment to identify the strengths and limitations of our design, compared to a traditional side-by-side visualization.
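As a rough illustration of the layout idea only (not the authors' implementation), glyph centers can be placed along an Archimedean spiral so consecutive glyphs stay roughly evenly spaced; the color/texture mapping, clustering, and animation steps are omitted:

    import numpy as np

    def spiral_positions(n, spacing=1.0):
        """Place n glyph centers along an Archimedean spiral r = a * theta,
        stepping theta so consecutive glyphs stay roughly `spacing` apart."""
        a = spacing / (2.0 * np.pi)      # radial growth per full turn
        pts, theta = [], 0.0
        for _ in range(n):
            r = a * theta
            pts.append((r * np.cos(theta), r * np.sin(theta)))
            # advance by about one `spacing` of arc length; avoid div-by-zero at center
            theta += spacing / max(r, spacing)
        return np.array(pts)

    centers = spiral_positions(500)   # one slot per data element of a query result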
Extending the dimensionality of flatland with attribute view probabilistic models
Eric Neufeld, Mikelis Bickis, Kevin Grant
In much of Bertin's Semiology of Graphics, marks representing individuals are arranged on paper according to their various attributes (components). Paper and computer monitors can conveniently map two attributes to width and height, and can map other attributes into nonspatial dimensions such as texture, or colour. Good visualizations exploit the human perceptual apparatus so that key relationships are quickly detected as interesting patterns. Graphical models take a somewhat dual approach with respect to the original information. Components, rather than individuals, are represented as marks. Links between marks represent conceptually simple, easily computable, and typically probabilistic relationships of possibly varying strength, and the viewer studies the diagram to discover deeper relationships. Although visually annotated graphical models have been around for almost a century, they have not been widely used. We argue that they have the potential to represent multivariate data as generically as pie charts represent univariate data. The present work suggests a semiology for graphical models, and discusses the consequences for information visualization.
Concept relationship editor: a visual interface to support the assertion of synonymy relationships between taxonomic classifications
Paul Craig, Jessie Kennedy
An increasingly common approach taken by taxonomists to define the relationships between taxa in alternative hierarchical classifications is to use a set-based notation which states the relationship between two taxa from alternative classifications. Textual recording of these relationships is cumbersome and difficult for taxonomists to manage. While text-based GUI tools are beginning to appear which ease the process, these have several limitations. Interactive visual tools offer greater potential to allow taxonomists to explore the taxa in these hierarchies and specify such relationships. This paper describes the Concept Relationship Editor, an interactive visualisation tool designed to support the assertion of relationships between taxonomic classifications. The tool operates using an interactive space-filling adjacency layout which allows users to expand multiple lists of taxa with common parents so they can explore and assert relationships between two classifications.
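The set-based relationships themselves can be illustrated with a small sketch: given the sets of lower-level members (for example, specimens or child concepts) covered by two taxon concepts, classify how the concepts relate. The relationship names below follow common usage and may differ from the paper's notation:

    def concept_relationship(a, b):
        """Classify the set relationship between two taxon concepts,
        each given as a set of lower-level members (e.g., specimens)."""
        a, b = set(a), set(b)
        if a == b:
            return "congruent"        # same members
        if a > b:
            return "includes"         # a is a proper superset of b
        if a < b:
            return "included in"      # a is a proper subset of b
        if a & b:
            return "overlaps"
        return "excludes"

    print(concept_relationship({"sp1", "sp2", "sp3"}, {"sp2", "sp3"}))  # includes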
Visual and analytical extensions for the table lens
Mathias John, Christian Tominski, Heidrun Schumann
Many visualization approaches teach us that ease of use is the key to effective visual data analysis. The Table Lens is an excellent example of a simple, yet expressive visual method that can help in analyzing even larger volumes of data. In this work, we present two extensions of the original Table Lens approach. In particular, we extend the Table Lens with Two-Tone Pseudo Coloring (TTPC) and a hybrid clustering. By integrating TTPC into the Table Lens, we obtain visual representations that can communicate larger volumes of data while still maintaining precision. Second, we propose to integrate a data analysis step that implements a hybrid clustering based on self-organizing maps and hierarchical clustering. The analysis step helps to extract and communicate complementary structural information about the data and also serves to drive interactive information drill-down.
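A minimal sketch of the Two-Tone Pseudo Coloring idea as commonly described: the value range is split into discrete bands, each with its own color, and a cell is filled with two adjacent band colors whose split encodes where the value lies inside its band, so exact values remain readable. The palette and API below are illustrative assumptions, not the authors' code:

    def two_tone(value, vmin, vmax, palette):
        """Return (color_low, color_high, fraction_of_color_high) for a value.

        The range [vmin, vmax] is split into len(palette)-1 bands; the fraction
        of the second color encodes where the value lies inside its band."""
        bands = len(palette) - 1
        t = (value - vmin) / float(vmax - vmin) * bands
        i = min(int(t), bands - 1)          # band index
        return palette[i], palette[i + 1], t - i

    palette = ["#ffffcc", "#a1dab4", "#41b6c4", "#2c7fb8", "#253494"]
    print(two_tone(42.0, 0.0, 100.0, palette))   # two colors plus split fraction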
Visual Analytics
Visual analytics techniques for large multi-attribute time series data
Ming C. Hao, Umeshwar Dayal, Daniel A. Keim
Time series data commonly occur when variables are monitored over time. Many real-world applications involve the comparison of long time series across multiple variables (multi-attributes). Often business people want to compare this year's monthly sales with last year's sales to make decisions. Data warehouse administrators (DBAs) want to know their daily data loading job performance, and they need to detect outliers early enough to act upon them. In this paper, two new visual analytic techniques are introduced: the color cell-based Visual Time Series Line Charts and Maps highlight significant changes over time in long time series data, and the new Visual Content Query facilitates finding the contents and histories of interesting patterns and anomalies, which leads to root cause identification. We have applied both methods to two real-world applications, mining enterprise data warehouse and customer credit card fraud data, to illustrate the wide applicability and usefulness of these techniques.
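A minimal sketch of a color cell-based time series display, assuming an hourly metric laid out as a day-by-hour grid of colored cells (the layout and color scale are assumptions, not the authors' exact design):

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical hourly metric for 60 days (rows = days, columns = hours).
    rng = np.random.default_rng(0)
    values = rng.normal(100.0, 15.0, size=(60, 24))

    fig, ax = plt.subplots(figsize=(8, 4))
    im = ax.imshow(values, aspect="auto", cmap="RdYlGn_r")  # red = high / anomalous
    ax.set_xlabel("hour of day")
    ax.set_ylabel("day")
    fig.colorbar(im, label="metric value")
    plt.show()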
Flow Visualization
Streamline visualization of multiple 2D vector fields
The analysis of data that consists of multiple vector fields can be greatly facilitated by the simultaneous visualization of the vector fields. An effective visualization must accurately reflect the key physical structures of the fields in a way that does not allow for an unintended bias toward one distribution. While there are several effective techniques to visualize a single vector field through equally spaced streamlines, applying these techniques to individual vector fields and combining them in a single image yields undesirable artifacts. In this paper, we present strategies for the effective visualization of two vector fields through the use of streamlines.
Interactive view-driven evenly spaced streamline placement
Zhanping Liu, Robert J. Moorhead II
This paper presents an Interactive View-Driven Evenly Spaced Streamline placement algorithm (IVDESS) for 3D explorative visualization of large complex planar or curved surface flows. IVDESS rapidly performs accurate streamline integration in 3D physical space, i.e., the flow field, while achieving high-quality streamline density control in 2D view space, i.e., the output image. The correspondence between the two spaces is established by using a projection-unprojection pair constituted through geometric surface rendering. An inter-frame physical-space seeding strategy based on streamline reuse and lengthening is adopted, on top of intra-frame view-space seeding, to not only enable coherent flow navigation but also speed up placement generation. IVDESS employs a view-sensitive streamline representation that is well suited for streamline reuse, lengthening, and rendering. In addition, it avoids temporal incoherence caused by streamline splitting and jaggy lines caused by unprojection errors. Our algorithm can run at interactive frame rates (9 FPS for placement generation) to allow for 3D exploration of surface flows with smooth evolution of high-density (1%) evenly spaced streamlines in a large window (990 × 700 pixels) on an ordinary PC without either pre-processing or GPU support.
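IVDESS itself adds view-space density control, surface projection, and inter-frame reuse; as background only, a much-simplified 2D sketch of the underlying evenly spaced placement idea (brute-force spacing tests, random seeding, no view-driven or inter-frame logic) might look like this:

    import numpy as np

    def trace_dir(seed, v, sign, d_test, placed, step=0.01, max_steps=1000):
        """One direction of a streamline using midpoint (RK2) steps; stops when it
        leaves the unit square or gets within d_test of already placed samples."""
        pts, p = [], np.asarray(seed, float).copy()
        for _ in range(max_steps):
            k1 = v(p)
            k2 = v(p + sign * 0.5 * step * k1)
            n = np.linalg.norm(k2)
            if n < 1e-9:
                break
            p = p + sign * step * k2 / n
            if not (0.0 <= p[0] <= 1.0 and 0.0 <= p[1] <= 1.0):
                break
            if placed and min(np.linalg.norm(p - q) for q in placed) < d_test:
                break
            pts.append(p.copy())
        return pts

    def evenly_spaced(v, d_sep=0.05, d_test=0.025, n_seeds=300):
        """Greedy evenly spaced placement: accept a random seed only if it is at
        least d_sep away from every sample of every accepted streamline."""
        rng = np.random.default_rng(1)
        placed, lines = [], []
        for seed in rng.random((n_seeds, 2)):
            if placed and min(np.linalg.norm(seed - q) for q in placed) < d_sep:
                continue
            back = trace_dir(seed, v, -1.0, d_test, placed)
            fwd = trace_dir(seed, v, +1.0, d_test, placed)
            line = back[::-1] + [np.asarray(seed, float)] + fwd
            placed.extend(line)
            lines.append(np.array(line))
        return lines

    lines = evenly_spaced(lambda p: np.array([0.5 - p[1], p[0] - 0.5]))  # a vortex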
Exploration of uncertainty in bidirectional vector fields
Torre Zuk, Jon Downton, David Gray, et al.
While their importance is increasingly recognized, there remain many challenges in the development of uncertainty visualizations. We introduce two uncertainty visualizations for 2D bidirectional vector fields: one based on a static glyph and the other based on animated flow. These visualizations were designed for the task of understanding and interpreting anisotropic rock property models in the domain of seismic data processing. Aspects of the implementations are discussed relating to design, interaction, and tasks.
Image Analysis
Zooming in multispectral datacubes using PCA
Alexander Broersen, Robert van Liere, Ron M. A. Heeren
Imaging mass spectrometry is a technique to determine which materials a small physical sample is made of. Current feature extraction techniques fail to extract certain small, high-resolution characteristics from these multispectral datacubes. Causes are a low signal-to-noise ratio, the presence of dominant but uninteresting features, and the huge number of variables in the dataset. In this paper, we present a zooming technique based on principal component analysis (PCA) to select regions in a datacube for enhanced feature extraction at the highest possible resolution. It enables the selection of spectral and spatial regions at a low resolution and the recursive application of PCA to zoom in on interesting, correlated features. This approach is not based on complex and data-specific denoising algorithms. Moreover, it decreases execution time when additional filters have to be applied. The technique utilizes a higher signal-to-noise ratio in the data without losing the high-resolution characteristics. Less interesting and/or dominating features can be excluded in the spectral and spatial dimensions. For these reasons, more features can be distinguished, and in greater detail. Analysts can zoom into a feature of interest by increasing the resolution.
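A minimal sketch of the core operation, assuming a (rows, columns, bands) datacube stored as a NumPy array: compute PCA score images for the whole cube, then recompute them on a selected spatial/spectral sub-region to zoom in (the variable names and the selected region are hypothetical):

    import numpy as np

    def pca_scores(cube, n_components=3):
        """PCA score images for a (rows, cols, bands) datacube via SVD."""
        r, c, b = cube.shape
        X = cube.reshape(-1, b).astype(float)
        X -= X.mean(axis=0)                       # center each spectral channel
        U, S, Vt = np.linalg.svd(X, full_matrices=False)
        return (X @ Vt[:n_components].T).reshape(r, c, n_components)

    # Hypothetical datacube; zooming = re-running PCA on a spatial/spectral sub-region.
    cube = np.random.default_rng(0).random((128, 128, 200))
    overview = pca_scores(cube)
    zoomed = pca_scores(cube[30:60, 40:80, 50:120])   # selected region of interest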
Image analysis of hyperspectral and multispectral data using projection pursuit
Given recent advancements in modern hyperspectral (HS) sensors and continual improvements in spatial and spectral resolution, the potential for information extraction has increased drastically. As a result, more sophisticated feature extraction and target detection (TD) algorithms are needed to improve the performance of the image analyst, whether computer-based or human. In this paper, a novel TD algorithm based on Projection Pursuit (PP) is proposed and implemented. PP is a well-known technique for dimensionality reduction in multi-band data sets without loss of any critical information. This technique highlights different features of interest in an image, thus improving and simplifying subsequent anomaly detection. The new target detection technique is based on a hybrid of PP and the Reed-Xiaoli (RX) anomaly detector. In this study, combining PP with the RX detector (PPRX) adds value to the standard RX detection technique and leads to the development of a TD method that can be applied to hyperspectral/multispectral (MS) data sets. This novel technique, after being trained using the Projection Index (PI) and a priori information about the target of interest, utilizes the RX detector to evaluate each potential projection. The main drawback of previously introduced PP methods, such as those based on Information Divergence and Kurtosis/Skewness, is that these techniques are sensitive to statistical outliers and cannot be used to highlight a specific target of interest. This study uses three data sets: (1) 4-band IKONOS multispectral data, (2) 210-band HYDICE data, and (3) a 200-band simulated hyperspectral data set.
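For reference, a minimal sketch of the global RX anomaly detector that the proposed PPRX hybrid builds on: the Mahalanobis distance of each pixel spectrum from the scene mean (the cube and its dimensions below are hypothetical):

    import numpy as np

    def rx_scores(cube, eps=1e-6):
        """Global Reed-Xiaoli (RX) anomaly scores for a (rows, cols, bands) cube:
        Mahalanobis distance of each pixel spectrum from the scene mean."""
        r, c, b = cube.shape
        X = cube.reshape(-1, b).astype(float)
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False) + eps * np.eye(b)   # regularized covariance
        inv = np.linalg.inv(cov)
        D = X - mu
        scores = np.einsum("ij,jk,ik->i", D, inv, D)      # (x-mu)^T C^-1 (x-mu)
        return scores.reshape(r, c)

    cube = np.random.default_rng(1).random((64, 64, 30))  # hypothetical MS/HS cube
    anomaly_map = rx_scores(cube)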
Information Visualization III
G-Space: a linear time graph layout
Brian Wylie, Jeffrey Baumes, Timothy M. Shead
We describe G-Space, a straightforward linear-time layout algorithm that draws undirected graphs based purely on their topological features. The algorithm is divided into two phases. The first phase is an embedding of the graph into a 2D plane using the graph-theoretical distances as coordinates. These coordinates are computed with the same process used by HDE (High-Dimensional Embedding) algorithms; in our case we do a Low-Dimensional Embedding (LDE) and directly map the graph distances into a two-dimensional geometric space. The second phase is the resolution of the many-to-one mappings that frequently occur within the low-dimensional embedding. The resulting layout appears to have advantages over existing methods: it can be computed rapidly, and it can be used to answer topological questions quickly and intuitively.
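A minimal sketch of the embedding phase as described, assuming two pivot vertices whose graph-theoretic (BFS) distances serve as x and y coordinates; pivot selection and the second-phase resolution of many-to-one mappings are omitted:

    from collections import deque

    def bfs_distances(adj, source):
        """Unweighted shortest-path distances from source over an adjacency dict."""
        dist, queue = {source: 0}, deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return dist

    def lde_layout(adj, pivot_x, pivot_y):
        """2D layout: x = distance to pivot_x, y = distance to pivot_y."""
        dx, dy = bfs_distances(adj, pivot_x), bfs_distances(adj, pivot_y)
        return {v: (dx[v], dy[v]) for v in adj}

    adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
    print(lde_layout(adj, pivot_x=0, pivot_y=4))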
Visual analysis and exploration of complex corporate shareholder networks
Tatiana Tekušová, Jörn Kohlhammer
The analysis of large corporate shareholder network structures is an important task in corporate governance, financing, and financial investment domains. In a modern economy, large structures of cross-corporation, cross-border shareholder relationships exist, forming complex networks. These networks are often difficult to analyze with traditional approaches. An efficient visualization of the networks helps to reveal the interdependent shareholding formations and the controlling patterns. In this paper, we propose an effective visualization tool that supports the financial analyst in understanding complex shareholding networks. We develop an interactive visual analysis system by combining state-of-the-art visualization technologies with economic analysis methods. Our system is capable of revealing patterns in large corporate shareholder networks, allows the visual identification of the ultimate shareholders, and supports the visual analysis of integrated cash flow and control rights. We apply our system to an extensive real-world database of shareholder relationships, showing its usefulness for effective visual analysis.
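One common way to compute integrated (direct plus indirect) ownership in such networks is the Neumann series of the direct-ownership matrix; the sketch below shows that formulation and is an assumption for illustration, not necessarily the method used by the authors:

    import numpy as np

    def integrated_ownership(A):
        """Integrated ownership Y = A + A^2 + A^3 + ... = A (I - A)^-1,
        where A[i, j] is the direct stake of shareholder i in firm j (0..1)."""
        n = A.shape[0]
        return A @ np.linalg.inv(np.eye(n) - A)

    # Hypothetical 3-firm chain: firm 0 owns 60% of firm 1, firm 1 owns 50% of firm 2.
    A = np.array([[0.0, 0.6, 0.0],
                  [0.0, 0.0, 0.5],
                  [0.0, 0.0, 0.0]])
    print(integrated_ownership(A))   # firm 0's integrated stake in firm 2 is 0.30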
Visualizing the temporal distribution of terminologies for biological ontology development
Communities in biology have developed a number of ontologies that provide standard terminologies for the characteristics of various concepts and their relationships. However, it is difficult to construct and maintain such ontologies, since it is a non-trivial task to identify commonly used potential member terms for a particular ontology when such terms change constantly as research in the field advances. In this paper, we propose a visualization system, called BioTermViz, which presents the temporal distribution of ontological terms drawn from the text of published journal abstracts. BioTermViz shows the temporal distribution of terms for journal abstracts ordered by publication time, the occurrences of annotated Gene Ontology concepts per abstract, and the ontological hierarchy of the terms. With a combination of these three types of information, we can capture the global tendency in the use of terms and identify a particular term or terms to be created, modified, segmented, or removed, effectively developing biological ontologies in an interactive manner. To demonstrate the practical utility of BioTermViz, we describe several scenarios for the development of an ontology for a specific sub-class of proteins, the ubiquitin-protein ligases.
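The underlying data transformation can be sketched simply: count how often each ontology term is annotated per publication year, yielding the term-by-time distribution such a tool could display (the input layout and term identifiers below are illustrative only):

    from collections import Counter, defaultdict

    # Hypothetical input: (publication_year, [ontology terms annotated to the abstract]).
    abstracts = [
        (2004, ["GO:0004842", "GO:0016567"]),
        (2005, ["GO:0004842"]),
        (2006, ["GO:0016567", "GO:0061630"]),
    ]

    term_by_year = defaultdict(Counter)
    for year, terms in abstracts:
        term_by_year[year].update(terms)

    for year in sorted(term_by_year):
        print(year, dict(term_by_year[year]))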
Scientific Visualization
The forensic validity of visual analytics
The growing use of visualization and visual analytics in wide-ranging fields has led to the need for visual analytics capabilities to be legally admissible, especially when applied to digital forensics. This brings the need to consider legal implications when performing visual analytics, an issue not traditionally examined in visualization and visual analytics techniques and research. While digital data is generally admissible under the Federal Rules of Evidence [10][21], a comprehensive validation of the digital evidence is considered prudent. A comprehensive validation requires validation of the digital data under rules for authentication, hearsay, the best evidence rule, and privilege. Additional issues arise with digital data related to admissibility and the validity of what information was examined, to what extent, and whether the analysis process was sufficiently covered by a search warrant. For instance, a search warrant generally covers very narrow requirements as to what law enforcement is allowed to examine and acquire during an investigation. When searching a hard drive for child pornography, how admissible is evidence of an unrelated crime, e.g., drug dealing? This is further complicated by the concept of "in plain view": when performing an analysis of a hard drive, what would be considered "in plain view"? The purpose of this paper is to discuss the issues of digital forensics as they apply to visual analytics, to identify how visual analytics techniques fit into the digital forensics analysis process, to discuss how visual analytics techniques can improve the legal admissibility of digital data, and to identify what research is needed to further improve this process. The goal of this paper is to open up consideration of legal ramifications among the visualization community; the author is not a lawyer and the discussion is not meant to be inclusive of all differences in laws between states and countries.
Volume Visualization
Polar stratospheric cloud visualization: volume reconstruction from intersecting curvilinear cross sections
Jessica R. Crouch, Chris Weigle, Jonathan Gleason, et al.
The CALIPSO satellite launched by NASA in 2006 uses an on-board LIDAR instrument to measure the vertical distribution of clouds and aerosols along the orbital path. This satellite's dense vertical sampling of the atmosphere provides previously unavailable information about the altitude and composition of clouds, including the polar stratospheric clouds (PSCs) that play an important role in the annual formation of polar ozone holes. Reconstruction of cloud surfaces through interpolation of CALIPSO data is challenging due to the sparsity of the data in the non-vertical dimensions and the complex sampling pattern created by intersecting non-planar orbital paths. This paper presents a method for computing cloud surfaces by reconstructing a continuous cloud surface distance field. The distance field reconstruction is performed via shape-based interpolation of the cloud contours on each cross section using a medial axis representation of each contour. The interpolation algorithm employs a projection operator that is defined in terms of (latitude, longitude, altitude) coordinates, so that projection between cross sections follows the earth's curved atmosphere and preserves cloud altitude. This process successfully interpolated cloud contours from CALIPSO data acquired during the 2006 polar winter and enabled three-dimensional visualization of the PSCs.
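The paper's method uses medial axis representations and an earth-curvature-aware projection; a simplified planar sketch of the general shape-based interpolation idea, blending signed distance fields of two cross-section cloud masks, might look like this (the masks and grid are hypothetical):

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def signed_distance(mask):
        """Signed distance to the region boundary: negative inside, positive outside."""
        inside = distance_transform_edt(mask)
        outside = distance_transform_edt(~mask)
        return outside - inside

    def interpolate_contour(mask_a, mask_b, t):
        """Shape-based interpolation of two cross-section cloud masks:
        blend their signed distance fields and threshold at zero."""
        d = (1.0 - t) * signed_distance(mask_a) + t * signed_distance(mask_b)
        return d <= 0.0          # interpolated cloud region

    # Hypothetical binary cloud masks on two neighboring cross sections.
    yy, xx = np.mgrid[0:100, 0:100]
    a = (xx - 40) ** 2 + (yy - 50) ** 2 < 15 ** 2
    b = (xx - 60) ** 2 + (yy - 50) ** 2 < 25 ** 2
    mid = interpolate_contour(a, b, 0.5)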
Integration of information and volume visualization for analysis of cell lineage and gene expression during embryogenesis
Andrej Cedilnik, Jeffrey Baumes, Luis Ibanez, et al.
Dramatic technological advances in the field of genomics have made it possible to sequence the complete genomes of many different organisms. With this overwhelming amount of data at hand, biologists are now confronted with the challenge of understanding the function of the many different elements of the genome. One of the best places to start gaining insight on the mechanisms by which the genome controls an organism is the study of embryogenesis. There are multiple and inter-related layers of information that must be established in order to understand how the genome controls the formation of an organism. One is cell lineage which describes how patterns of cell division give rise to different parts of an organism. Another is gene expression which describes when and where different genes are turned on. Both of these data types can now be acquired using fluorescent laser-scanning (confocal or 2-photon) microscopy of embryos tagged with fluorescent proteins to generate 3D movies of developing embryos. However, analyzing the wealth of resulting images requires tools capable of interactively visualizing several different types of information as well as being scalable to terabytes of data. This paper describes how the combination of existing large data volume visualization and the new Titan information visualization framework of the Visualization Toolkit (VTK) can be applied to the problem of studying the cell lineage of an organism. In particular, by linking the visualization of spatial and temporal gene expression data with novel ways of visualizing cell lineage data, users can study how the genome regulates different aspects of embryonic development.
A phrase-driven grammar system for interactive data visualization
Sang Yun Lee, Ulrich Neumann
A Phrase-Driven Grammar System (PDGS) is a novel GUI for facilitating the visualization of data. The PDGS integrates data source applications and external visualization tools into its framework and functions as a middle-layer application to coordinate their operations. It allows users to formulate data query and visualization descriptions by selecting graphical icons in a menu or on a map. To make the specification of data queries and visualizations intuitive and efficient, we designed a graphical user interface and a natural-language-like grammar, the Phrase-Driven Grammar (PDG). The formulation of PDG data query and visualization descriptions is a constrained natural-language phrase-building process. PDG phrases produce graphical visualizations of the data query, allowing users to interactively explore meaningful data relationships, trends, and exceptions.
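A toy sketch of a constrained, natural-language-like phrase grammar: the interface only offers tokens that may legally follow the phrase built so far, so every completed phrase maps to a valid query/visualization description. The grammar and vocabulary below are invented for illustration and are not PDG itself:

    # Each phrase slot lists the tokens that may legally be chosen for it.
    GRAMMAR = [
        ("verb", ["show", "compare"]),
        ("measure", ["sales", "rainfall", "population"]),
        ("grouping", ["by year", "by region"]),
        ("chart", ["as bar chart", "as line chart", "as map"]),
    ]

    def next_choices(phrase):
        """Tokens the user may pick next, given the partially built phrase."""
        slot = len(phrase)
        return GRAMMAR[slot][1] if slot < len(GRAMMAR) else []

    phrase = []
    for _ in GRAMMAR:                     # simulate always picking the first option
        phrase.append(next_choices(phrase)[0])
    print(" ".join(phrase))               # "show sales by year as bar chart"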
Interactive Paper Session
Efficient sequence classification by R2-Kernel
A novel kernel, named the R2-kernel, is presented for efficient sequence classification based on the Support Vector Machine (SVM). As an intuitive similarity measure, R2 naturally introduces a stop technique into multi-class SVM evaluation: when there exist K support vectors from class i that have similarity beyond a certain threshold with a test sequence X, we can assign sequence X to class i directly and stop the SVM evaluation. The stop technique is seamlessly integrated into the multi-class SVM to improve its evaluation efficiency without a negative impact on performance. Experimental results confirm the efficiency of the stop technique introduced by the R2-kernel.
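The stop technique can be sketched independently of the kernel's definition (which the abstract does not give): during evaluation, if at least K support vectors of some class are similar enough to the test sequence, assign that class immediately; otherwise fall back to full evaluation. All names, the similarity function, and the toy models below are placeholders:

    def classify_with_early_stop(x, models, similarity, K=3, threshold=0.9):
        """models: {class_label: (support_vectors, decision_function)}.
        Returns a label early if K support vectors of one class are similar
        enough to x; otherwise falls back to full one-vs-rest evaluation."""
        for label, (svs, _) in models.items():
            hits = sum(1 for sv in svs if similarity(x, sv) >= threshold)
            if hits >= K:
                return label                       # early stop, skip full evaluation
        # Full evaluation: pick the class with the largest decision value.
        return max(models, key=lambda lbl: models[lbl][1](x))

    # Toy usage with a made-up similarity and two classes of short sequences.
    sim = lambda a, b: sum(c1 == c2 for c1, c2 in zip(a, b)) / max(len(a), len(b))
    models = {
        "kinase": (["MKKL", "MKKV"], lambda x: sim(x, "MKKL")),
        "ligase": (["AAGT", "AAGC"], lambda x: sim(x, "AAGC")),
    }
    print(classify_with_early_stop("MKKL", models, sim, K=2, threshold=0.7))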
Visualization of multidimensional database
The concept of multidimensional databases has been extensively researched and is widely used in actual database applications. It plays an important role in contemporary information technology, but due to the complexity of its inner structure, database design is a complicated process and users have a hard time fully understanding and using such databases. An effective visualization tool for higher-dimensional information systems helps database designers and users alike. Most visualization techniques focus on displaying dimensional data using spreadsheets and charts. This may be sufficient for databases with three or fewer dimensions, but for higher dimensions various combinations of projection operations are needed, and a full grasp of the total database architecture is very difficult. This study reviews existing visualization techniques for multidimensional databases and then proposes an alternative approach to visualize a database of any dimension by adopting the diagramming tool proposed by Kiviat for software engineering processes. In this diagramming method, each dimension is represented by one branch of concentric spikes. This paper documents a C++-based visualization tool that makes extensive use of the OpenGL graphics library and GUI functions. Detailed examples of actual databases demonstrate the feasibility and effectiveness of the approach in visualizing multidimensional databases.
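A minimal sketch of the Kiviat-style mapping: each dimension becomes one spoke, and a record's value on that dimension (normalized by the dimension's maximum) sets the distance from the center (the function and data are illustrative):

    import math

    def kiviat_points(values, maxima):
        """Vertex coordinates of a Kiviat (radar) diagram: one spoke per dimension,
        each value normalized by its dimension's maximum."""
        n = len(values)
        pts = []
        for i, (v, m) in enumerate(zip(values, maxima)):
            angle = 2.0 * math.pi * i / n
            r = v / float(m)
            pts.append((r * math.cos(angle), r * math.sin(angle)))
        return pts

    # Hypothetical 6-dimensional record from a multidimensional database.
    print(kiviat_points([3, 7, 2, 9, 5, 4], [10] * 6))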
Visually exploring worldwide incidents tracking system data
Shree D. Chhatwal, Stuart J. Rose
This paper presents and explores the application of a visualization and analysis tool, Juxter, as an interface for exploring incidents described within the Worldwide Incidents Tracking System, and describes several refinements that improve user interaction and aid the identification of patterns and trends.