Proceedings Paper

A boosted distance metric: application to content based image retrieval and classification of digitized histopathology
Author(s): Jay Naik; Scott Doyle; Ajay Basavanhally; Shridar Ganesan; Michael D. Feldman; John E. Tomaszewski; Anant Madabhushi

Paper Abstract

Distance metrics are often used to quantify the similarity between two objects, each represented by a set of features in a high-dimensional space. The Euclidean metric is a popular distance metric, employed in a variety of applications. Non-Euclidean distance metrics have also been proposed, and choosing a distance metric for a specific application or domain is non-trivial. Furthermore, most distance metrics treat each dimension or object feature as having the same relative importance in determining object similarity. In many applications, such as Content-Based Image Retrieval (CBIR), where images are quantified and then compared according to their image content, it may be beneficial to use a similarity metric in which features are weighted according to their ability to distinguish between object classes. In the CBIR paradigm, every image is represented as a vector of quantitative feature values derived from the image content, and a similarity measure is applied to determine which of the database images is most similar to the query. In this work, we present a boosted distance metric (BDM), in which individual features are weighted according to their discriminatory power, and compare the performance of this metric to 9 traditional distance metrics in a CBIR system for digital histopathology. We apply our system to three breast tissue histology cohorts: (1) 54 breast histology studies corresponding to benign and cancerous images, (2) 36 breast cancer studies corresponding to low and high Bloom-Richardson (BR) grades, and (3) 41 breast cancer studies with high and low levels of lymphocytic infiltration. Across all 3 data cohorts, the BDM outperforms the 9 traditional metrics, achieving a greater area under the precision-recall curve.
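The idea of weighting features by discriminatory power and then ranking database images by distance to a query can be illustrated with a small sketch. Everything here is an assumption made for illustration, since the abstract does not give the formulation: the weights come from AdaBoost over single-feature threshold stumps, the distance is a weighted Euclidean distance, and the names `train_stump`, `boosted_feature_weights`, `weighted_distance`, and `retrieve` are hypothetical. It should not be read as the authors' actual BDM.

```python
import math

def train_stump(X, y, w, j):
    """Best threshold stump on feature j under sample weights w.
    Returns (weighted_error, threshold, polarity). Labels are -1 or +1."""
    values = sorted(set(row[j] for row in X))
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    best = (float("inf"), thresholds[0], 1)
    for t in thresholds:
        for polarity in (1, -1):
            err = sum(wi for row, yi, wi in zip(X, y, w)
                      if (polarity if row[j] > t else -polarity) != yi)
            if err < best[0]:
                best = (err, t, polarity)
    return best

def boosted_feature_weights(X, y, rounds=10):
    """Assumed scheme: each feature's weight is the sum of the AdaBoost
    alphas of the stumps that selected it, normalized to sum to 1."""
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n
    alpha = [0.0] * d
    for _ in range(rounds):
        # Pick the feature whose best stump has the lowest weighted error.
        err, t, pol, j = min(train_stump(X, y, w, j) + (j,) for j in range(d))
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        a = 0.5 * math.log((1 - err) / err)
        alpha[j] += a
        # Standard AdaBoost reweighting: misclassified samples gain weight.
        preds = [pol if row[j] > t else -pol for row in X]
        w = [wi * math.exp(-a * yi * p) for wi, yi, p in zip(w, y, preds)]
        total_w = sum(w)
        w = [wi / total_w for wi in w]
    total = sum(alpha)
    if total == 0:
        return [1.0 / d] * d  # no feature was informative: fall back to uniform
    return [a / total for a in alpha]

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (an assumed form for the metric)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def retrieve(query, database, weights, k=3):
    """Indices of the k database items closest to the query."""
    order = sorted(range(len(database)),
                   key=lambda i: weighted_distance(query, database[i], weights))
    return order[:k]

# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.9, 4.5], [1.0, 1.5], [1.1, 5.5]]
y = [-1, -1, -1, 1, 1, 1]
weights = boosted_feature_weights(X, y)
top = retrieve([0.15, 5.2], X, weights, k=3)
print(weights, top)
```

With uniform weights, the second feature's larger numeric range would dominate the distance. The learned weights shift the comparison toward the feature that actually separates the two classes, which is the motivation the abstract gives for a weighted metric.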
In addition, we performed SVM classification using the BDM along with the traditional metrics, and found that the boosted metric achieves a higher classification accuracy (over 96%) in distinguishing between the tissue classes in each of the 3 data cohorts considered. The 10 different similarity metrics were also used to generate similarity matrices between all samples in each of the 3 cohorts. For each cohort, each of the 10 similarity matrices was subjected to normalized cuts, resulting in a reduced-dimensional representation of the data samples. The BDM resulted in the best discrimination between tissue classes in the reduced embedding space.
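The step from a similarity matrix to a reduced-dimensional embedding via normalized cuts can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the paper: distances are converted to similarities with a Gaussian kernel (the abstract does not say how), only a one-dimensional embedding is computed, and the eigenvector is found by plain power iteration so that the example needs no external libraries. The function names and the `sigma` parameter are hypothetical.

```python
import math
import random

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (assumed form of the metric)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def similarity_matrix(X, weights, sigma=1.0):
    """Assumption: similarity is a Gaussian kernel of the weighted distance."""
    n = len(X)
    return [[math.exp(-weighted_distance(X[i], X[j], weights) ** 2
                      / (2 * sigma ** 2))
             for j in range(n)] for i in range(n)]

def ncut_embedding_1d(W, iters=500, seed=0):
    """One-dimensional normalized-cut embedding of a similarity matrix W.

    Normalized cuts solves (D - W) y = lambda * D y and uses the eigenvector
    with the second-smallest eigenvalue. Equivalently, take the second-largest
    eigenvector z of M = D^{-1/2} W D^{-1/2} and return y = D^{-1/2} z.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    M = [[W[i][j] * inv_sqrt[i] * inv_sqrt[j] for j in range(n)]
         for i in range(n)]
    # The top eigenvector of M is known in closed form: D^{1/2} 1, normalized.
    norm = math.sqrt(sum(deg))
    v1 = [math.sqrt(d) / norm for d in deg]
    rng = random.Random(seed)
    z = [rng.uniform(-1, 1) for _ in range(n)]
    for _ in range(iters):
        # Remove the component along the trivial eigenvector.
        dot = sum(a * b for a, b in zip(z, v1))
        z = [a - dot * b for a, b in zip(z, v1)]
        # Power iteration on M + I (the shift keeps all eigenvalues positive).
        z = [sum(M[i][j] * z[j] for j in range(n)) + z[i] for i in range(n)]
        length = math.sqrt(sum(a * a for a in z))
        z = [a / length for a in z]
    return [a * s for a, s in zip(z, inv_sqrt)]

# Toy data: two well-separated groups of three samples each.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]]
W = similarity_matrix(X, weights=[0.5, 0.5], sigma=1.0)
embedding = ncut_embedding_1d(W)
print(embedding)
```

In this toy case the sign of each embedding coordinate indicates which group a sample falls into. How well the classes separate in the embedding depends on the metric used to build the similarity matrix, which is the comparison the abstract describes.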

Paper Details

Date Published: 3 March 2009
PDF: 12 pages
Proc. SPIE 7260, Medical Imaging 2009: Computer-Aided Diagnosis, 72603F (3 March 2009); doi: 10.1117/12.813931
Author Affiliations
Jay Naik, Rutgers Univ. (United States)
Scott Doyle, Rutgers Univ. (United States)
Ajay Basavanhally, Rutgers Univ. (United States)
Shridar Ganesan, Cancer Institute of New Jersey (United States)
Michael D. Feldman, Univ. of Pennsylvania (United States)
John E. Tomaszewski, Univ. of Pennsylvania (United States)
Anant Madabhushi, Rutgers Univ. (United States)

Published in SPIE Proceedings Vol. 7260:
Medical Imaging 2009: Computer-Aided Diagnosis
Nico Karssemeijer; Maryellen L. Giger, Editor(s)

© SPIE. Terms of Use