Proceedings Paper

Visual analytics of anomaly detection in large data streams
Author(s): Ming C. Hao; Umeshwar Dayal; Daniel A. Keim; Ratnesh K. Sharma; Abhay Mehta

Paper Abstract

Most data streams are multi-dimensional and high-speed, and contain massive volumes of continuous information. They arise in everyday applications such as telephone calls, retail sales, data center performance, and oil production operations. Many analysts want insight into the behavior of this data. They want to catch exceptions in flight, reveal the causes of the anomalies, and take immediate action. To guide the user in finding anomalies in a large data stream quickly, we derive a new automated neighborhood threshold marking technique, called AnomalyMarker. This technique is built on cell-based data streams and user-defined thresholds. We extend the scope of the data points around the threshold to include the surrounding areas. The idea is to define a focus area (marked area) which enables users to (1) visually group the interesting data points related to the anomalies (i.e., problems that occur persistently or occasionally) in order to observe their behavior; and (2) discover the factors related to an anomaly by visualizing the correlations between the problem attribute and the attributes of the nearby data items from the entire multi-dimensional data stream. Mining results are quickly presented in graphical representations (i.e., tooltips) so that the user can zoom into the problem regions. We introduce several algorithms that aim to optimize the size and extent of the anomaly markers. We have successfully applied this technique to detect data stream anomalies in large real-world enterprise server performance and data center energy management data.
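The marking idea described in the abstract can be illustrated with a minimal sketch. This is not the AnomalyMarker algorithm from the paper, whose details the abstract does not give. It assumes a one-dimensional sequence of cells, a fixed neighborhood radius, and Pearson correlation as the measure of relatedness. All function names, attribute names, and sample values are hypothetical.

```python
from math import sqrt

# Illustrative sketch only: names, radius, and data are assumptions,
# not taken from the paper.

def mark_neighborhood(values, threshold, radius=1):
    """Return sorted indices of cells at or above the threshold,
    extended by `radius` cells on each side (the marked area)."""
    marked = set()
    last = len(values) - 1
    for i, v in enumerate(values):
        if v >= threshold:
            lo = max(0, i - radius)
            hi = min(last, i + radius)
            marked.update(range(lo, hi + 1))
    return sorted(marked)

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def correlate_in_marked_area(stream, problem_attr, threshold, radius=1):
    """Correlate the problem attribute with every other attribute,
    using only the cells inside the marked area."""
    marked = mark_neighborhood(stream[problem_attr], threshold, radius)
    problem = [stream[problem_attr][i] for i in marked]
    corr = {
        name: pearson(problem, [col[i] for i in marked])
        for name, col in stream.items()
        if name != problem_attr
    }
    return marked, corr

# Made-up sample data: one list per attribute, one entry per cell.
stream = {
    "cpu_util":  [20, 25, 30, 85, 95, 40, 22, 21],
    "queue_len": [1, 1, 2, 9, 12, 3, 1, 1],
    "fan_speed": [5, 5, 5, 5, 5, 5, 5, 5],
}
marked, corr = correlate_in_marked_area(stream, "cpu_util", threshold=80, radius=1)
print(marked)  # [2, 3, 4, 5]
print(corr)
```

In this sample, the two cells above the threshold pull in one neighbor on each side, and the hypothetical `queue_len` attribute correlates strongly with the problem attribute inside the marked area while the constant `fan_speed` does not.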

Paper Details

Date Published: 20 January 2009
PDF: 10 pages
Proc. SPIE 7243, Visualization and Data Analysis 2009, 72430B (20 January 2009); doi: 10.1117/12.810945
Ming C. Hao, Hewlett-Packard Labs. Palo Alto (United States)
Umeshwar Dayal, Hewlett-Packard Labs. Palo Alto (United States)
Daniel A. Keim, Univ. of Konstanz (Germany)
Ratnesh K. Sharma, Hewlett-Packard Labs. Palo Alto (United States)
Abhay Mehta, Hewlett-Packard Labs. Palo Alto (United States)


Published in SPIE Proceedings Vol. 7243:
Visualization and Data Analysis 2009
Katy Börner; Jinah Park, Editor(s)
