Computer-assisted diagnosis in cervical histopathology

Computer-facilitated histology image analysis may play an important role in clinical diagnosis of cancers and identification of prognostic and therapeutic targets.
07 December 2010
Lei He, L. Rodney Long, Sameer Antani and George Thoma

Histopathology,1 the microscopic study of biopsies to locate and classify disease, is usually performed by histopathologists, who examine a thin slice of tissue under an optical or electron microscope. With different imaging technologies,2 histology images of diverse modalities are produced for manual or automated analysis. In general, such images are taken at low magnification and include many objects of interest, such as cells and prominent cellular structures (e.g., nuclei). These are widely distributed in the images and surrounded by different neighboring tissues (in the cervix, for example, epithelium and stroma: see Figure 1). In histology image analysis, histopathologists visually examine the cellular morphology and tissue distribution to deduce whether tissue regions are cancerous and to determine the malignancy level. Presently, such diagnosis is regarded as the gold standard for clinical diagnosis of cancers, as well as for identification of prognostic and therapeutic targets.


Figure 1. Cervical histology image example.

Because manual analysis of histology images remains the primary method for identifying cancerous tissues, diagnosis depends heavily on the expertise and experience of the practitioner. Aside from being time consuming, manual grading of cancers is difficult to reproduce because of intra- and interobserver variation. To overcome these problems, computer-assisted diagnostic (CAD) systems are becoming increasingly important in this field.3 The National Library of Medicine, in collaboration with the National Cancer Institute, is developing a new CAD system for automated cervical intraepithelial neoplasia (CIN) detection and grading. A typical CAD system includes functions for image preprocessing, segmentation, feature extraction, dimension reduction, disease detection and grading, and postprocessing (see Figure 2). Image preprocessing improves the quality of the input image so that segmentation can accurately extract the regions of interest. Feature extraction and dimension reduction identify a small number of mathematical features that pattern-recognition and machine-learning techniques then use for disease identification and classification. For applications such as image annotation, postprocessing may be needed to derive high-level knowledge from the analysis results.


Figure 2. Computer-assisted diagnostic-system flowchart.
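
The stages listed above can be chained as plain functions. The following is a minimal sketch in Python, not the system under development: every function name, the intensity threshold, the three features, and the decision cutoff are hypothetical placeholders chosen only to make the data flow visible, and none of them has clinical meaning.

```python
from statistics import mean, pstdev

def preprocess(image):
    """Rescale gray levels to [0, 1]; a stand-in for real preprocessing."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a constant image
    return [[(p - lo) / span for p in row] for row in image]

def segment(image, threshold=0.5):
    """Mark pixels darker than an (arbitrary) threshold as foreground."""
    return [[1 if p < threshold else 0 for p in row] for row in image]

def extract_features(image, mask):
    """Compute a few simple statistics of the foreground region."""
    fg = [p for row, mrow in zip(image, mask) for p, m in zip(row, mrow) if m]
    total = sum(len(row) for row in image)
    return {
        "nuclear_fraction": len(fg) / total,
        "mean_intensity": mean(fg) if fg else 0.0,
        "intensity_spread": pstdev(fg) if fg else 0.0,
    }

def reduce_dimensions(features, keep=("nuclear_fraction",)):
    """Keep a small subset of features (assumed selection, not learned)."""
    return [features[name] for name in keep]

def classify(vector, cutoff=0.3):
    """Toy decision rule; the cutoff is illustrative only."""
    return "flag_for_review" if vector[0] > cutoff else "no_flag"

def run_pipeline(image):
    clean = preprocess(image)
    mask = segment(clean)
    features = extract_features(clean, mask)
    return classify(reduce_dimensions(features)), features

# Toy 3x3 gray-level image: one dark column, two bright columns.
label, features = run_pipeline([[10, 200, 220], [20, 210, 230], [15, 205, 225]])
print(label, features)
```

In a real system each stage would be replaced by a far more capable component (for example, a trained classifier in place of the fixed cutoff), but the interfaces between stages would look much the same.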

Using the histology image in Figure 1, we tested our current segmentation algorithm.4 The objective was to extract four target classes from the hematoxylin- and eosin-stained cervix histology image: nuclei, red blood cells (in the stroma), cytoplasm (in the epithelium), and background. Pixels from all of the target classes are intermixed along the boundary between stroma and epithelium. We compared our results against other well-known approaches, including multiple thresholding, K-means clustering, Markov random-field segmentation by graph cut,5 and multiphase active contours (see Figure 3).6,7 We applied Gaussian-mixture models to estimate the distributions of different object classes and extracted the four classes in different color channels. The stroma (left: dark-gray regions) and epithelium (right: light-gray regions) are well separated by our algorithm. Furthermore, compared to the other approaches, our method segments cellular features more accurately.


Figure 3. Cervical histology image-segmentation results. From left to right: (top row) our result,4 multiple thresholding, K-means clustering; (bottom row) Markov random-field segmentation by graph cut,5 Chan-Vese active contour,6 Samson's model.7
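
To illustrate the general idea of using a Gaussian-mixture model to separate object classes by intensity, the sketch below fits a one-dimensional mixture with expectation-maximization and assigns each pixel to its most probable component. It is an assumption-laden toy, not the segmentation algorithm reported here: it works on a single channel, the function names are hypothetical, the initialization (quantile means, overall variance divided by k squared) is a simple heuristic, and the intensity values are made up.

```python
import math

def gaussian_pdf(x, mu, var):
    """Density of a 1-D normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(values, k, iterations=50, min_var=1e-3):
    """Fit a k-component 1-D Gaussian mixture by expectation-maximization."""
    ordered = sorted(values)
    n = len(ordered)
    # Heuristic start: means at evenly spaced quantiles, equal weights,
    # and a shared variance scaled down from the overall variance.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    overall = sum(values) / n
    var0 = max(sum((v - overall) ** 2 for v in values) / n / k ** 2, min_var)
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            dens = [w * gaussian_pdf(v, m, s)
                    for w, m, s in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            spread = sum(r[j] * (v - means[j]) ** 2
                         for r, v in zip(resp, values)) / nj
            variances[j] = max(spread, min_var)
    return weights, means, variances

def label_pixels(values, weights, means, variances):
    """Assign each value to the component with the highest weighted density."""
    labels = []
    for v in values:
        dens = [w * gaussian_pdf(v, m, s)
                for w, m, s in zip(weights, means, variances)]
        labels.append(max(range(len(dens)), key=dens.__getitem__))
    return labels

# Made-up gray levels forming four well-separated intensity groups.
intensities = [20, 22, 24, 26, 100, 102, 104, 106,
               180, 182, 184, 186, 240, 242, 244, 246]
weights, means, variances = fit_gmm_1d(intensities, k=4)
labels = label_pixels(intensities, weights, means, variances)
print(means, labels)
```

Unlike a hard threshold or K-means assignment, the mixture model estimates a weight and variance for each class, so classes with different spreads or pixel counts can still be separated. Real stained-tissue intensities overlap far more than this toy data, which is one reason a practical method would combine several color channels.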

Our future work includes further development of CAD modules for feature extraction and classification. Achieving the eventual goal of consistent and rapid CIN grading would significantly reduce diagnostic time, lessen intra- and interobserver variation, and ultimately benefit the patient.

This research is supported by the Intramural Research Program of the National Institutes of Health, National Library of Medicine, and the Lister Hill National Center for Biomedical Communications.


Lei He, L. Rodney Long, Sameer Antani, George Thoma
National Library of Medicine
Bethesda, MD

Lei He works on histology image analysis. He received his PhD in electrical engineering from the University of Cincinnati. His research interests include medical-image analysis, computer vision, and machine learning.

Rodney Long is an electronics engineer for the Communications Engineering Branch. He worked for 14 years in industry as a software developer and systems engineer. His research interests are telecommunications, image processing, and scientific/biomedical databases.

Sameer Antani, staff scientist at the National Library of Medicine and the National Institutes of Health, studies multimodal (image and text) biomedical informatics and next-generation, multimedia-rich scientific publications. He is a member of the IEEE, IEEE Computer Society, and SPIE.

George Thoma, a branch chief, directs research and development programs in document-image analysis, biomedical imaging, and related areas. He earned a BS from Swarthmore College, and his MS and PhD from the University of Pennsylvania, all in electrical engineering. He is a SPIE Fellow.

