Proceedings Paper

Visually comparing multiple partitions of data with applications to clustering
Author(s): Jianping Zhou; Shawn Konecni; Georges Grinstein

Paper Abstract

Tightly coupled visualization and analysis is a powerful approach to data exploration, especially for clustering. We describe one such integration of analysis and visualization for evaluating multiple partitions of a data set. A partition is a decomposition of a data set into a family of disjoint subsets. Partitions may result from clustering, from groupings of categorical dimensions, from binned numerical dimensions, from predetermined class-labeling dimensions, or from prior knowledge structured in a mutually exclusive format (each data item associated with one and only one outcome). Partition or cluster stability analysis can be used to identify near-optimal structures, build ensembles, or conduct validation. We extend Parallel Sets into a new visualization tool that supports the mutual comparison and evaluation of multiple partitions of the same data set. We describe a novel layout algorithm that rearranges the order of records and dimensions informatively. We provide examples of its application to analyzing data stability and correlation at the record, cluster, and dimension levels within a single interactive display.
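
To make the idea of comparing two partitions of the same records concrete, the following is a minimal sketch and not the authors' tool or layout algorithm. It cross-tabulates two label assignments (the counts that a Parallel Sets style display would typically draw as ribbons between categories) and computes the Rand index as one simple, swapped-in agreement measure. The function names and the example labels are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def contingency(labels_a, labels_b):
    """Count the records that fall in each (block of A, block of B) pair."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same records")
    return Counter(zip(labels_a, labels_b))


def rand_index(labels_a, labels_b):
    """Fraction of record pairs on which the two partitions agree.

    A pair agrees when both partitions put the two records together,
    or both put them apart. This is an illustrative measure only; the
    abstract does not say which stability measure the paper uses.
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both partitions must label the same records")
    if n < 2:
        return 1.0
    agree = 0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
    return agree / (n * (n - 1) // 2)


# Hypothetical example: a clustering result and a class-label dimension
# over the same six records.
clusters = [0, 0, 1, 1, 2, 2]
classes = ["a", "a", "a", "b", "c", "c"]

table = contingency(clusters, classes)
score = rand_index(clusters, classes)
```

Because each record carries exactly one label per partition, every record contributes to exactly one cell of the table, which reflects the mutually exclusive format described in the abstract.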

Paper Details

Date Published: 20 January 2009
PDF: 12 pages
Proc. SPIE 7243, Visualization and Data Analysis 2009, 72430J (20 January 2009); doi: 10.1117/12.810093
Jianping Zhou, Univ. of Massachusetts, Lowell (United States)
Shawn Konecni, Univ. of Massachusetts, Lowell (United States)
Georges Grinstein, Univ. of Massachusetts, Lowell (United States)

Published in SPIE Proceedings Vol. 7243:
Visualization and Data Analysis 2009
Katy Börner; Jinah Park, Editor(s)