Proceedings Volume 10574

Medical Imaging 2018: Image Processing


Purchase the printed version of this volume at proceedings.com or access the digital version at SPIE Digital Library.

Volume Details

Date Published: 22 May 2018
Contents: 17 Sessions, 115 Papers, 47 Presentations
Conference: SPIE Medical Imaging 2018
Volume Number: 10574

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 10574
  • Brain: Shapes and Biomarkers
  • Deep Learning: Segmentation
  • Image Enhancement
  • Machine Learning
  • Registration
  • Keynote and Highlights
  • fMRI and DTI
  • Motion
  • Image Features
  • Deep Learning: Lesions and Pathologies
  • Deep Learning: Generative Adversarial Networks
  • Poster Session: Enhancement
  • Poster Session: Machine Learning
  • Poster Session: Quantification and Modeling
  • Poster Session: Registration
  • Poster Session: Segmentation
Front Matter: Volume 10574
This PDF file contains the front matter associated with SPIE Proceedings Volume 10574, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Brain: Shapes and Biomarkers
Sulcal depth-based cortical shape analysis in normal healthy control and schizophrenia groups
Ilwoo Lyu, Hakmook Kang, Neil D. Woodward, et al.
Sulcal depth is an important marker of brain anatomy in neuroscience and the study of neurological function. Previously, sulcal depth has been explored at the region-of-interest (ROI) level to increase statistical sensitivity to group differences. In this paper, we present a fully automated method that enables inference of ROI properties from a sulcal-region-focused perspective, consisting of two main components: 1) sulcal depth computation and 2) sulcal curve-based refined ROIs. In conventional statistical analysis, average sulcal depth measurements are employed in several ROIs of the cortical surface. However, averaging sulcal depth over a full ROI blurs the measurements, which may reduce sensitivity to sulcal depth changes in neurological and psychiatric disorders. To overcome this blurring effect, we focus on the sulcal fundic regions in each ROI by filtering out gyral regions. Consequently, the proposed method is more sensitive to group differences than a traditional ROI approach. In our experiment, we performed a cortical morphological analysis of sulcal depth reduction in schizophrenia, with a comparison to a normal healthy control group. We show that the proposed method is more sensitive to abnormalities of sulcal depth in schizophrenia: sulcal depth is significantly smaller in most cortical lobes in schizophrenia compared to healthy controls (p < 0.05).
Skull segmentation from MR scans using a higher-order shape model based on convolutional restricted Boltzmann machines
Oula Puonti, Koen Van Leemput, Jesper D. Nielsen, et al.
Transcranial brain stimulation (TBS) techniques such as transcranial magnetic stimulation (TMS), transcranial direct current stimulation (tDCS) and others have seen a strong increase as tools in therapy and research within the last 20 years. In order to precisely target the stimulation, it is important to accurately model the individual head anatomy of a subject. Of particular importance is accurate reconstruction of the skull, as it has the strongest impact on the current pathways due to its low conductivity. Thus, automated tools that can reliably reconstruct the anatomy of the human head from magnetic resonance (MR) scans would be highly valuable for the application of transcranial stimulation methods. These head models can also be used to inform source localization for EEG and MEG. Automated segmentation of the skull from MR images is, however, challenging, as the skull emits very little signal in MR. In order to avoid topological defects, such as holes in the segmentations, a strong model of the skull shape is needed. In this paper we propose a new shape model for skull segmentation based on so-called convolutional restricted Boltzmann machines (cRBMs). Compared to traditionally used lower-order shape models, such as pair-wise Markov random fields (MRFs), cRBMs model local shapes in larger spatial neighborhoods while still allowing for efficient inference. We compare the skull segmentation accuracy of our approach to two previously published methods and show significant improvement.
Imaging biomarkers for the diagnosis of Prion disease
Liane S. Canas, Benjamin Yvernault, Carole Sudre, et al.
Prion diseases are a group of progressive neurodegenerative conditions which cause cognitive impairment and neurological deficits. To date, there is no accurate measure that can be used to diagnose this illness, or to quantify the evolution of symptoms over time. Prion disease, due to its rarity, is in fact commonly mistaken for other types of dementia. A robust tool to diagnose and quantify the progression of the disease is key, as it would lead to more appropriately timed clinical trials and thereby improve patients' quality of life. The approaches used to study other types of neurodegenerative diseases are not satisfactory for capturing the progression of the human form of Prion disease, due to the large heterogeneity of Prion disease phenotypes and the lack of a consistent geometrical pattern of disease progression. In this paper, we aim to identify and select imaging biomarkers that are relevant for the diagnosis of Prion disease. We extract features from magnetic resonance imaging data and use genetic and demographic information from a cohort affected by genetic forms of the disease. The proposed framework consists of a multi-modal, subject-specific feature extraction step, followed by a Gaussian Process classifier used to calculate the probability that a subject will be diagnosed with Prion disease. We show that the proposed method improves the characterisation of Prion disease.
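The classification stage lends itself to a compact sketch. Below is a minimal, hypothetical illustration of a Gaussian Process classifier over multi-modal subject features using scikit-learn; the feature matrix, labels, and kernel choice are placeholders, not the authors' actual pipeline.

```python
# Hedged sketch of GP classification over multi-modal features.
# X rows are subjects (imaging + genetic + demographic features); values are fake.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 12))        # placeholder multi-modal feature matrix
y = rng.integers(0, 2, size=40)      # placeholder diagnoses (1 = Prion disease)

# RBF kernel plus a noise term; the GP posterior yields P(disease | features).
gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0) + WhiteKernel(),
                                random_state=0)
proba = cross_val_predict(gpc, X, y, cv=5, method="predict_proba")[:, 1]
print("P(Prion) for first five subjects:", proba[:5].round(3))
```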
Constructing statistically unbiased cortical surface templates using feature-space covariance
The choice of surface template plays an important role in cross-sectional subject analyses involving cortical brain surfaces because there is a tendency toward registration bias given variations in inter-individual and inter-group sulcal and gyral patterns. In order to account for the bias and spatial smoothing, we propose a feature-based unbiased average template surface. In contrast to prior approaches, we factor in the sample population covariance and assign weights based on feature information to minimize the influence of covariance in the sampled population. The mean surface is computed by applying the weights obtained from an inverse covariance matrix, which guarantees that multiple representations from similar groups (e.g., in imaging, demographic, or diagnosis information) are down-weighted to yield an unbiased mean in feature space. Results are validated by applying this approach in two different applications. For evaluation, the proposed unbiased weighted surface mean is compared with un-weighted means both qualitatively and quantitatively (mean squared error and absolute relative distance of both means with respect to a baseline). In the first application, we validated the stability of the proposed optimal mean on a scan-rescan reproducibility dataset by incrementally adding duplicate subjects. In the second application, we used clinical research data to evaluate the difference between the weighted and unweighted means when different numbers of subjects were included in control versus schizophrenia groups. In both cases, the proposed method achieved greater stability, indicating reduced impact of sampling bias. The weighted mean is built on covariance information in feature space, as opposed to spatial location, making this a generic approach applicable to any feature of interest.
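As a rough illustration of the inverse-covariance weighting idea, the sketch below computes a weighted mean of per-subject surface feature vectors, with weights proportional to the row sums of the inverse sample covariance so that redundant (highly covarying) subjects are down-weighted. Array shapes and the regularization constant are assumptions, not the paper's settings.

```python
# Minimal sketch: inverse-covariance weighted mean over subjects in feature space.
import numpy as np

def covariance_weighted_mean(S, eps=1e-6):
    """S: (n_subjects, n_features). Returns the weighted mean feature vector."""
    n = S.shape[0]
    C = np.cov(S) + eps * np.eye(n)      # n x n covariance across subjects
    w = np.linalg.inv(C) @ np.ones(n)    # row sums of the inverse covariance
    w = w / w.sum()                      # normalize weights to sum to one
    return w @ S                         # down-weights redundant subjects

S = np.random.default_rng(1).normal(size=(10, 2562))   # e.g. vertex-wise features
mean_surface_features = covariance_weighted_mean(S)
```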
Deep Learning: Segmentation
Segmentation of anatomical structures in cardiac CTA using multi-label V-Net
Hui Tang, Mehdi Moradi, Ahmed El Harouni, et al.
Segmenting anatomical structures in the chest is a crucial step in many automatic disease detection applications. Multi-atlas based methods have been developed for this task; however, due to the required deformable registration step, they are often computationally expensive and create a bottleneck in terms of processing time. In contrast, convolutional neural networks (CNNs) with 2D or 3D kernels, although slow to train, are very fast in the deployment stage and have been employed to solve segmentation tasks in medical imaging. An improvement in the performance of neural networks for medical image segmentation was recently reported when the Dice similarity coefficient (DSC) was used to optimize the weights of a fully convolutional architecture called V-Net. However, in that work, only the DSC calculated for one foreground object is optimized; as a result, DSC-based segmentation CNNs are only able to perform binary segmentation. In this paper, we extend the binary V-Net architecture to a multi-label segmentation network and use it to segment multiple anatomical structures in cardiac CTA. The method uses a multi-label V-Net optimized by the sum of the DSC over all anatomies, followed by a post-processing method to refine the segmented surface. Our method takes on average less than 3 s to segment a full CTA volume. In contrast, the fastest multi-atlas based methods published so far take around 10 min. Our method achieves an average DSC of 76% for 16 segmented anatomies using four-fold cross validation, which is close to the state of the art.
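A multi-label soft Dice objective of the kind described, i.e. optimizing the sum (here, the mean) of per-class Dice scores, can be sketched in a few lines of PyTorch. Tensor shapes and the class count of 16 are illustrative, not the paper's exact configuration.

```python
# Hedged sketch of a multi-label soft Dice loss (1 - mean per-class Dice).
import torch

def multilabel_dice_loss(logits, target, eps=1e-5):
    """logits: (B, C, D, H, W); target: (B, D, H, W) integer labels in [0, C)."""
    probs = torch.softmax(logits, dim=1)
    onehot = torch.nn.functional.one_hot(target, probs.shape[1])
    onehot = onehot.permute(0, 4, 1, 2, 3).float()      # -> (B, C, D, H, W)
    dims = (0, 2, 3, 4)                                  # sum over batch and voxels
    intersection = (probs * onehot).sum(dims)
    denom = probs.sum(dims) + onehot.sum(dims)
    dice_per_class = (2 * intersection + eps) / (denom + eps)
    return 1.0 - dice_per_class.mean()                   # minimize 1 - mean Dice

logits = torch.randn(1, 16, 8, 32, 32, requires_grad=True)   # 16 anatomies
target = torch.randint(0, 16, (1, 8, 32, 32))
multilabel_dice_loss(logits, target).backward()
```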
Iterative convolutional neural networks for automatic vertebra identification and segmentation in CT images
Segmentation and identification of the vertebrae in CT images are important steps for automatic analysis of the spine. This paper presents an automatic method based on iterative convolutional neural networks. These utilize the inherent order of the vertebral column to simplify the detection problem, so that the network can be trained with as few as ten manual reference segmentations. Vertebrae are segmented and identified one-by-one in sequential order, using an iterative procedure. Vertebrae are first roughly localized and identified in low-resolution images that enable the analysis of context information, and afterwards reanalyzed in the original high-resolution images to obtain a fine segmentation. The method was trained and evaluated with 15 spine CT scans from the MICCAI CSI 2014 workshop challenge. These scans cover the whole thoracic and lumbar part of the spine of healthy young adults. In contrast to a non-iterative convolutional neural network, which made labeling mistakes, the proposed iterative method correctly identified all vertebrae. Our method achieved a mean Dice coefficient of 0.948 and a mean surface distance of 0.29 mm and thus outperforms the best method that participated in the original challenge.
Splenomegaly segmentation using global convolutional kernels and conditional generative adversarial networks
Spleen volume estimation using automated image segmentation techniques may be used to detect splenomegaly (an abnormally enlarged spleen) on Magnetic Resonance Imaging (MRI) scans. In recent years, deep convolutional neural network (DCNN) segmentation methods have demonstrated advantages for abdominal organ segmentation. However, variations in both the size and shape of the spleen on MRI images may result in large false positive and false negative labeling when deploying DCNN based methods. In this paper, we propose the Splenomegaly Segmentation Network (SSNet) to address spatial variations when segmenting extraordinarily large spleens. SSNet was designed based on the framework of image-to-image conditional generative adversarial networks (cGAN). Specifically, the Global Convolutional Network (GCN) was used as the generator to reduce false negatives, while the Markovian discriminator (PatchGAN) was used to alleviate false positives. A cohort of clinically acquired 3D MRI scans (both T1 weighted and T2 weighted) from patients with splenomegaly was used to train and test the networks. The experimental results demonstrated a mean Dice coefficient of 0.9260 and a median Dice coefficient of 0.9262 using SSNet on independently tested MRI volumes of patients with splenomegaly.
Segmentation of left ventricle myocardium in porcine cardiac cine MR images using a hybrid of fully convolutional neural networks and convolutional LSTM
Dongqing Zhang, Ilknur Icke, Belma Dogdas, et al.
In the development of treatments for cardiovascular diseases, short axis cardiac cine MRI is important for the assessment of various structural and functional properties of the heart. In short axis cardiac cine MRI, cardiac properties including the ventricle dimensions, stroke volume, and ejection fraction can be extracted based on accurate segmentation of the left ventricle (LV) myocardium. One of the most advanced segmentation methods is based on fully convolutional neural networks (FCN) and can successfully segment cardiac cine MRI slices. However, the temporal dependency between slices acquired at neighboring time points is not used. Here, based on our previously proposed FCN structure, we propose a new algorithm to segment the LV myocardium in porcine short axis cardiac cine MRI by incorporating convolutional long short-term memory (Conv-LSTM) to leverage the temporal dependency. In this approach, instead of processing each slice independently as in a conventional CNN-based approach, the Conv-LSTM architecture captures the dynamics of cardiac motion over time. In a leave-one-out experiment on 8 porcine specimens (3,600 slices), the proposed approach was shown to be promising, achieving an average mean Dice similarity coefficient (DSC) of 0.84, Hausdorff distance (HD) of 6.35 mm, and average perpendicular distance (APD) of 1.09 mm when compared with manual segmentations, improving on the performance of our previous FCN-based approach (average mean DSC = 0.84, HD = 6.78 mm, and APD = 1.11 mm). Qualitatively, our model showed robustness against low image quality and complications in the surrounding anatomy due to its ability to capture the dynamics of cardiac motion.
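The temporal modeling can be illustrated with a minimal convolutional LSTM cell, in which the usual LSTM gates are computed by convolutions so that hidden state is carried across cine time points. This is a generic ConvLSTM sketch in PyTorch, not the authors' hybrid FCN/Conv-LSTM architecture; channel and frame counts are placeholders.

```python
# Generic ConvLSTM cell: LSTM gating with convolutions instead of dense layers.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        # A single convolution produces all four gates at once.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)        # updated per-pixel cell memory
        h = o * torch.tanh(c)                # new hidden state
        return h, (h, c)

cell = ConvLSTMCell(in_ch=64, hid_ch=64)
state = (torch.zeros(1, 64, 32, 32), torch.zeros(1, 64, 32, 32))
for x_t in torch.randn(20, 1, 64, 32, 32):   # 20 cine time points of features
    out, state = cell(x_t, state)
```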
Towards dense volumetric pancreas segmentation in CT using 3D fully convolutional networks
Holger Roth, Masahiro Oda, Natsuki Shimizu, et al.
Pancreas segmentation in computed tomography imaging has been historically difficult for automated methods because of the large shape and size variations between patients. In this work, we describe a custom-built 3D fully convolutional network (FCN) that can process a 3D image of the whole pancreas and produce an automatic segmentation. We investigate two variations of the 3D FCN architecture: one with concatenation and one with summation skip connections to the decoder part of the network. We evaluate our methods on a dataset from a clinical trial with gastric cancer patients, including 147 contrast-enhanced abdominal CT scans acquired in the portal venous phase. Using the summation architecture, we achieve an average Dice score of 89.7 ± 3.8 (range [79.8, 94.8])% in testing, establishing a new state of the art in pancreas segmentation on this dataset.
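The two skip-connection variants differ only in how encoder and decoder feature maps are merged, as the toy PyTorch snippet below illustrates; channel counts are arbitrary. Summation keeps the channel count fixed (making the decoder cheaper), while concatenation doubles it.

```python
# Toy contrast of the two skip-connection variants compared above.
import torch

enc = torch.randn(1, 32, 16, 16, 16)   # encoder feature map
dec = torch.randn(1, 32, 16, 16, 16)   # upsampled decoder map, same resolution

skip_concat = torch.cat([enc, dec], dim=1)   # doubles channels -> 64
skip_sum = enc + dec                         # keeps channels -> 32
```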
An effective fully deep convolutional neural networks for mitochondria segmentation based on ATUM-SEM
Recent studies have indicated that mitochondrial function is related to degenerative disorders and diseases of aging. Due to the rapid development of electron microscopy (EM), stacks delivered by EM can be used to investigate the link between mitochondrial function and physical structure. However, one of the main challenges in mitochondria research is developing suitable segmentation algorithms to obtain the shapes of mitochondria. Deep neural networks (DNNs) have been widely applied to neuron membrane segmentation problems by virtue of their exceptional performance; for this reason, their application to mitochondria segmentation holds great promise. In this paper, we propose an effective deep learning approach to mitochondria segmentation in Automated Tape-Collecting Ultramicrotome Scanning Electron Microscopy (ATUM-SEM) stacks. The proposed algorithm contains three parts: (1) a histogram equalization preprocessing step to keep the dataset consistent; (2) a fused fully convolutional network (FCN), motivated by the principle "the deeper, the better", that builds a much deeper network architecture for more accurate mitochondria segmentation; and (3) a fully connected conditional random field (CRF) to optimize segmentation results. Evaluation was performed on a dataset of a stack of 31 slices from ATUM-SEM, with 20 images used for training and 11 images for testing. For comparison, the U-Net approach was evaluated on the same dataset. The Jaccard index between the automated segmentation and expert manual segmentations indicates that our method (90.1%) outperforms U-Net (87.9%) and performs well on mitochondria of different shapes and sizes.
Image Enhancement
A log-Euclidean and total variation based variational framework for computational sonography
Jyotirmoy Banerjee, Premal A. Patel, Fred Ushakov, et al.
We propose a spatial compounding technique and variational framework to improve 3D ultrasound image quality by compositing multiple ultrasound volumes acquired from different probe orientations. In the composite volume, instead of intensity values, we estimate a tensor at every voxel. The resultant tensor image encapsulates the directional information of the underlying imaging data and can be used to generate ultrasound volumes from arbitrary, potentially unseen, probe positions. Extending the work of Hennersperger et al.,1 we introduce a log-Euclidean framework to ensure that the tensors are positive-definite, eventually ensuring non-negative images. Additionally, we regularise the underpinning ill-posed variational problem while preserving edge information by relying on a total variation penalisation of the tensor field in the log domain. We present results on in vivo human data to show the efficacy of the approach.
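The log-Euclidean idea can be sketched per voxel: average the matrix logarithms of the tensors from the individual acquisitions and map back with the matrix exponential, which keeps the result symmetric positive-definite. The minimal example below uses unweighted averaging and diagonal test tensors for illustration; the paper's variational formulation with total variation regularization is not shown.

```python
# Minimal sketch of log-Euclidean compounding of per-voxel tensors.
import numpy as np
from scipy.linalg import expm, logm

def log_euclidean_mean(tensors):
    """tensors: iterable of symmetric positive-definite 3x3 arrays, one per
    probe orientation at a given voxel. Returns their log-Euclidean mean."""
    tensors = list(tensors)
    log_mean = sum(logm(T) for T in tensors) / len(tensors)
    return expm(log_mean)      # expm of a symmetric matrix is SPD

T1 = np.diag([2.0, 1.0, 0.5])  # illustrative tensors from two orientations
T2 = np.diag([1.0, 1.5, 0.8])
print(log_euclidean_mean([T1, T2]))
```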
Visualization of coronary artery calcium in dual energy chest radiography using automatic rib suppression
Bo Zhou, Yi Jiang, Di Wen, et al.
Coronary artery calcification (CAC) as assessed with the CT calcium score is the best biomarker for predicting future cardiac events. Dual energy (DE) chest radiography offers an inexpensive, low radiation-dose alternative to CT. We have shown that CAC can be visualized using non-gated, 2-shot, DE chest x-ray imaging. However, calcium signal from the ribs, superimposed on the CAC signal, can interfere with both image registration and reader detection of CAC. To improve the registration algorithm and the detectability of CAC, we created an enhanced CAC visualization algorithm with automatic rib segmentation and suppression. In this paper, we trained an active appearance model (AAM) to detect the spatial location of the ribs in the DE bone images. Then, we suppressed the rib signal in the single-kVp images using a kernel-filter based method. After rib suppression, we used affine and non-rigid registration to align the CAC in the low and high kVp images. 1800 dual energy cases were used in our experiment after data augmentation of 60 cases. We evaluated the segmentation from the rib AAM using six-fold cross validation. The average Dice coefficient of the rib segmentation in the heart region is 0.92. The reduction of CAC mis-registration by the enhanced CAC registration algorithm was evaluated using both simulated and clinical images. Adding rib suppression improves the image registration significantly. Our results show that in both simulated synthetic images and clinical images, the CAC overlap improves from 0% to > 50% after rib suppression. These results suggest that CAC visualization and registration can be significantly improved using rib suppression when rib signal confounds CAC signal in DE chest radiography.
Radiation dose reduction in digital breast tomosynthesis (DBT) by means of deep-learning-based supervised image processing
Junchi Liu, Amin Zarshenas, Ammar Qadir, et al.
To reduce cumulative radiation exposure and lifetime risk of radiation-induced cancer from breast cancer screening, we developed a deep-learning-based supervised image-processing technique called neural network convolution (NNC) for radiation dose reduction in DBT. NNC employs patch-based neural network regression in a convolutional manner to convert lower-dose (LD) to higher-dose (HD) tomosynthesis images. We trained our NNC with quarter-dose (25% of the standard dose: 12 mAs at 32 kVp) raw projection images and corresponding "teaching" higher-dose (HD) images (200% of the standard dose: 99 mAs at 32 kVp) of a breast cadaver phantom acquired with a DBT system (Selenia Dimensions, Hologic, CA). Once trained, NNC no longer requires HD images. It converts new LD images to images that look like HD images, hence the term "virtual" HD (VHD) images. We reconstructed tomosynthesis slices on a research DBT system. To determine a dose reduction rate, we acquired 4 studies of another test phantom at 4 different radiation doses (1.35, 2.7, 4.04, and 5.39 mGy entrance dose). The Structural SIMilarity (SSIM) index was used to evaluate image quality. For testing, we collected half-dose (50% of the standard dose: 32±14 mAs at 33±5 kVp) and full-dose (standard dose: 68±23 mAs at 33±5 kVp) images of 10 clinical cases with the DBT system at University of Iowa Hospitals and Clinics. NNC converted half-dose DBT images of the 10 clinical cases to VHD DBT images that were equivalent to full-dose DBT images. Our cadaver phantom experiment demonstrated a 79% dose reduction.
Image enhancement method for digital mammography
Nikolai V. Slavine, Stephen Seiler, Timothy J. Blackburn, et al.
Purpose: To evaluate in clinical use a practical iterative deconvolution method to enhance contrast and image resolution in digital breast tomosynthesis. A novel, rapidly converging, iterative deconvolution algorithm for improving the quantitative accuracy of breast images previously reconstructed by a commercial breast tomosynthesis system is demonstrated. Materials and Methods: The method was tested on phantoms and clinical breast imaging data. Data acquisition was performed on a commercial Hologic Selenia Dimensions digital breast tomosynthesis system. The method was applied to patient breast images previously processed with Hologic Selenia conventional and C-View software to determine improvements in resolution and contrast-to-noise ratio (CNR). Results: In all of the phantom and patient breast studies, the post-processed images proved to have higher resolution and contrast compared with images reconstructed by Hologic's methods. In general, the CNR values reached a plateau at around 8 iterations, with an average improvement factor of about 1.8 for processed Hologic Selenia images. Improvements in image resolution after the application of the method are also demonstrated. Conclusions: A rapidly converging, iterative deconvolution algorithm with a novel resolution subsets-based approach that operates on patient DICOM images has been used for quantitative improvement in digital breast tomosynthesis. The method can be applied to clinical breast images to improve image quality to diagnostically acceptable levels and will be crucial in facilitating diagnosis of tumor progression at the earliest stages. The method can be considered an extended blind deblurring (or Richardson-Lucy-like) algorithm with multiple resolution levels.
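Since the conclusion describes the method as an extended Richardson-Lucy-like algorithm, a plain Richardson-Lucy loop is a useful reference point. The sketch below is the textbook iteration, not the authors' resolution-subsets variant; the PSF and the iteration count (8, where CNR plateaued above) are placeholders.

```python
# Textbook Richardson-Lucy deconvolution loop (reference sketch only).
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(image, psf, iterations=8):
    estimate = np.full_like(image, image.mean())   # flat initial estimate
    psf_mirror = psf[::-1, ::-1]
    for _ in range(iterations):
        blurred = fftconvolve(estimate, psf, mode="same")
        ratio = image / np.maximum(blurred, 1e-12)        # avoid divide-by-zero
        estimate = estimate * fftconvolve(ratio, psf_mirror, mode="same")
    return estimate

# Example with a small Gaussian PSF on a random "slice".
x = np.linspace(-2, 2, 9)
psf = np.exp(-(x[:, None] ** 2 + x[None, :] ** 2))
psf /= psf.sum()
restored = richardson_lucy(np.random.default_rng(0).random((64, 64)), psf)
```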
Image reconstruction using priors from deep learning
Devi Ayyagari, Nisha Ramesh, Dimitri Yatsenko, et al.
Tomosynthesis, i.e. reconstruction of 3D volumes using projections from a limited perspective, is a classical inverse, ill-posed or under-constrained problem. Data insufficiency leads to reconstruction artifacts that vary in severity depending on the particular problem, the reconstruction method, and the object being imaged. Machine learning has been used successfully in tomographic problems where data is insufficient, but the challenge with machine learning is that it introduces bias from the learning dataset. A novel framework to improve the quality of the tomosynthesis reconstruction that limits the learning dataset bias by maintaining consistency with the observed data is proposed. Convolutional Neural Networks (CNN) are embedded as regularizers in the reconstruction process to introduce the expected features and characteristics of the likely imaged object. The minimization of the objective function keeps the solution consistent with the observations and limits the bias introduced by the machine learning regularizers, improving the quality of the reconstruction. The proposed method has been developed and studied in the specific problem of Cone Beam Tomosynthesis Fluoroscopy (CBT-fluoroscopy)1 but it is a general framework that can be applied to any image reconstruction problem that is limited by data insufficiency.
Machine Learning
Automated abdominal plane and circumference estimation in 3D US for fetal screening
Ultrasound is increasingly becoming a 3D modality. Mechanical and matrix array transducers are able to deliver 3D images with good spatial and temporal resolution. 3D imaging facilitates the application of automated image analysis to enhance workflows, which has the potential to make ultrasound a less operator-dependent modality. However, the analysis of the more complex 3D images, and the fact that examination standards are defined on 2D images, pose barriers to the use of 3D in daily clinical practice. In this paper, we address a part of the canonical fetal screening program, namely the localization of the abdominal cross-sectional plane with the corresponding measurement of the abdominal circumference in this plane. For this purpose, a fully automated pipeline has been designed, starting with a random forest based anatomical landmark detection. A feature-trained shape model of the fetal torso including inner organs, with the abdominal cross-sectional plane encoded into the model, is then transformed into the patient space using the landmark localizations. In a free-form deformation step, the model is individualized to the image, using a torso probability map generated by a convolutional neural network as an additional feature image. After adaptation, the abdominal plane and the abdominal torso contour in that plane are directly obtained. This allows the measurement of the abdominal circumference as well as the rendering of the plane for visual assessment. The method has been trained on 126 and evaluated on 42 abdominal 3D US datasets. An average plane offset error of 5.8 mm and an average relative circumference error of 4.9% were achieved on the evaluation set.
Left ventricle segmentation in 3D ultrasound by combining structured random forests with active shape models
F. Khellaf, S. Leclerc, J. D. Voorneveld, et al.
Segmentation of the left ventricle (LV) in 3D echocardiography is essential to evaluate cardiac function. It is however a challenging task due to the anisotropy of speckle structure and typical artifacts associated with echocardiography. Several methods have been designed to segment the LV in 3D echocardiograms, but the development of more robust algorithms is still actively investigated. In this paper, we propose a new framework combining Structured Random Forests (SRF), a machine learning technique that shows great potential for edge detection, with Active Shape Models and we compare our segmentation results with state-of-the-art algorithms. We have tested our algorithm on the multi-center, multi-vendor CETUS challenge database, consisting of 45 sequences of 3D echocardiographic volumes. Segmentation was performed and evaluated for end-diastolic (ED) and end-systolic (ES) phases. The results show that combining machine learning with a shape model provides a very competitive LV segmentation, with a mean surface distance of 2.04 ± 0.48 mm for ED and 2.18 ± 0.79 mm for ES. The ejection fraction correlation coefficient reaches 0.87. The overall segmentation score outperforms the best results obtained during the challenge, while there is still room for further improvement, e.g. by increasing the size of the training set for the SRF or by implementing an automatic method to initialize our segmentation.
Fine segmentation of tiny blood vessel based on full connected conditional random field
Chenglong Wang, Masahiro Oda, Yasushi Yoshino, et al.
In this paper, we present an efficient trainable conditional random field (CRF) model using a newly proposed scale-targeted loss function to improve segmentation accuracy on tiny blood vessels in 3D medical images. Blood vessel segmentation remains a big challenge in the medical image processing field due to vessels' elongated structure and low contrast. The conventional locally connected CRF model performs poorly on tiny elongated structures due to its limited capability of capturing pairwise potentials. To overcome this drawback, we use a fully-connected CRF model to capture the pairwise potentials. This paper also introduces a new scale-targeted loss function aimed at improving segmentation accuracy on tiny blood vessels. Experimental results on both phantom data and clinical CT data showed that the proposed approach improves segmentation accuracy on tiny blood vessels. Compared to the previous loss function, our proposed loss function improved sensitivity by about 10% on phantom data and 14% on clinical CT data.
Automatic and fast CT liver segmentation using sparse ensemble with machine learned contexts
A fast and automatic method, using machine learning and min-cuts on a sparse graph, for segmenting the liver from contrast-enhanced CT (CTCE) datasets is proposed. The method first localizes the liver by estimating its centroid using a machine-learnt model with features that capture global contextual information. Individual 'N' rapid segmentations are carried out by running a min-cut on a sparse 3D rectilinear graph placed at the estimated liver centroid with fractional offsets. Edges of the graph are assigned a cost that is a function of a conditional probability, predicted using a second machine-learnt model, which encodes relative location along with local context. The costs represent the likelihood of the edge crossing the liver boundary. Finally, a 3D ensemble of 'N' such low-resolution, high-variance sparse segmentations gives a final high-resolution, low-variance semantic segmentation. The proposed method is tested on three publicly available challenge databases (SLIVER07, 3Dircadb1 and Anatomy3) with M-fold cross validation. On the most popular database, SLIVER07, consisting of 20 datasets, we obtained a mean Dice score of 0.961 with 4-fold cross validation and an average run-time of 6.22 s on commodity hardware (Intel 3.6 GHz dual core, with no GPU). On a combined database of 60 datasets from all three, we obtained a mean Dice score of 0.934 with 6-fold cross validation.
Nearest neighbor 3D segmentation with context features
Evelin Hristova, Heinrich Schulz, Tom Brosch, et al.
Automated and fast multi-label segmentation of medical images is challenging and clinically important. This paper builds upon a supervised machine learning framework that uses training data sets with dense organ annotations and vantage point trees to classify voxels in unseen images based on similarity of binary feature vectors extracted from the data. Without explicit model knowledge, the algorithm is applicable to different modalities and organs, and achieves high accuracy. The method is successfully tested on 70 abdominal CT and 42 pelvic MR images. With respect to ground truth, an average Dice overlap score of 0.76 for the CT segmentation of liver, spleen and kidneys is achieved. The mean score for the MR delineation of bladder, bones, prostate and rectum is 0.65. Additionally, we benchmark several variations of the main components of the method and reduce the computation time by up to 47% without significant loss of accuracy. The segmentation results are – for a nearest neighbor method – surprisingly accurate, robust as well as data and time efficient.
Detecting multiple myeloma via generalized multiple-instance learning
Jan Hering, Jan Kybic, Lukáš Lambert
We address the task of automatic detection of lesions caused by multiple myeloma (MM) in femurs or other long bones from CT data. Such detection is already an important part of the multiple myeloma diagnosis and staging. However, it is so far performed mostly manually, which is very time consuming. We formulate the detection as a multiple instance learning (MIL) problem, where instances are grouped into bags and only bag labels are available. In our case, instances are regions in the image and bags correspond to images. This has the advantage of requiring only subject-level annotation (ground truth), which is much easier to get than voxel-level manual segmentation. We consider a generalization of the standard MIL formulation where we introduce a threshold on the number of required positive instances in positive bags. This corresponds better to the classification procedure used by the radiology experts and is more robust with respect to false positive instances. We extend several existing MIL algorithms to solve the generalized case by estimating the threshold during learning. We compare the proposed methods with the baseline method on a dataset of 220 subjects. We show that the generalized MIL formulation outperforms standard MIL methods for this task. For the task of distinguishing between healthy controls and MM patients with infiltrations, our best method makes almost no mistakes with a mean AUC of 0.982 and F1 = 0.965. We outperform the baseline method significantly in all conducted experiments.
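The generalization is easy to state concretely: a bag is positive only when the number of positive instances reaches a threshold, rather than the standard-MIL threshold of one. A minimal sketch, assuming instance scores from some instance-level classifier and an illustrative threshold:

```python
# Generalized MIL bag rule: positive bag needs >= theta positive instances.
import numpy as np

def bag_label(instance_scores, score_cutoff=0.5, theta=3):
    """instance_scores: per-region classifier outputs for one image (bag)."""
    n_positive = np.sum(np.asarray(instance_scores) >= score_cutoff)
    return int(n_positive >= theta)

print(bag_label([0.9, 0.8, 0.2, 0.7], theta=3))   # 1: three positive regions
print(bag_label([0.9, 0.2, 0.1, 0.3], theta=3))   # 0: standard MIL would say 1
```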
Registration
A multilevel Markov Chain Monte Carlo approach for uncertainty quantification in deformable registration
Sandra Schultz, Heinz Handels, Jan Ehrhardt
In many clinical settings, image-guided diagnostics and therapy involve decisions concerning treatment and intervention based on registered image data of patients. Therefore, knowledge about the reliability of the registration result is crucial. In this paper, we tackle this issue by estimating the registration uncertainty based on Bayesian analysis, i.e. examining the posterior distribution of parameters describing the underlying transformations. The intractability of the posterior distribution allows only an approximation, usually realized by Monte Carlo sampling methods. Conventional Markov Chain Monte Carlo (MCMC) algorithms require a large number of posterior samples to ensure robust estimates, which imposes a high computational burden. The contribution of this work is the embedding of the MCMC approach into a cost-reducing multilevel framework. Multilevel MCMC fits into the multi-resolution framework usually applied for image registration. In this work we evaluate the performance of our method using a B-spline transformation framework, i.e. the B-spline coefficients are the parameters to estimate. We demonstrate its correctness by comparison with a ground-truth posterior distribution, evaluate its efficiency through examination of the cost reduction, and show its reliability as an uncertainty estimator on brain MRI images.
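For intuition, the single-level building block is an ordinary Metropolis-Hastings sampler over transformation parameters. The toy below samples the posterior of a 1D translation between two synthetic signals under a Gaussian likelihood; the multilevel embedding and B-spline parameterization that are the paper's contribution are not shown, and all constants are illustrative.

```python
# Toy Metropolis-Hastings sampler over a 1D translation parameter.
import numpy as np

rng = np.random.default_rng(0)

def log_posterior(t, fixed, moving, sigma=0.1):
    # Gaussian likelihood of the shifted moving signal; flat prior on t.
    idx = np.arange(len(moving))
    shifted = np.interp(idx - t, idx, moving)   # moving shifted right by t
    return -np.sum((fixed - shifted) ** 2) / (2 * sigma ** 2)

x = np.linspace(0, 1, 200)
fixed = np.exp(-((x - 0.50) ** 2) / 0.01)
moving = np.exp(-((x - 0.45) ** 2) / 0.01)       # true shift ~10 grid units

samples, t = [], 0.0
logp = log_posterior(t, fixed, moving)
for _ in range(5000):
    t_new = t + rng.normal(scale=0.5)            # random-walk proposal
    logp_new = log_posterior(t_new, fixed, moving)
    if np.log(rng.random()) < logp_new - logp:   # accept/reject step
        t, logp = t_new, logp_new
    samples.append(t)
post = np.array(samples[1000:])                  # discard burn-in
print(f"posterior translation: {post.mean():.2f} +/- {post.std():.2f} grid units")
```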
Quadratic: quality of dice in registration circuits
Image registration involves identification of a transformation to fit a target image to a reference image space. The success of the registration process is vital for correct interpretation of the results of many medical image-processing applications, including multi-atlas segmentation. While there are several validation metrics employed in rigid registration to examine the accuracy of the method, non-rigid registrations (NRR) are in most cases validated subjectively, validated in offline cases, or validated based on image similarity metrics, all of which have been shown to correlate poorly with true registration quality. In this paper, we model the error for each target scan by expanding on the idea of Assessing Quality Using Image Registration Circuits (AQUIRC), which created a model for error "quality" associated with NRR. Here, we model the Dice similarity coefficient (DSC) error in the network for a more interpretable measure. We test four functional models for the relationship between edge DSC and circuit DSC using a leave-one-out strategy: linear, quadratic, third-order, and multiplicative. We found that the quadratic model most accurately learns the NRR-DSC, with a median correlation coefficient of 0.58 with the true NRR-DSC; we call this the QUADRATIC (QUAlity of Dice in RegistrATIon Circuits) model. The QUADRATIC model is used for multi-atlas segmentation based on majority vote. Choosing the four best atlases predicted by the QUADRATIC model resulted in a 7% increase in the DSC between the segmented image and the true labels.
Self-reference-based and during-registration detection of motion artifacts in spatio-temporal image data
Eike Mücke, Heinz Handels, René Werner
Respiration-correlated or 4D CT imaging represents the standard of care in radiation therapy treatment planning for patients with tumors subject to significant breathing-induced motion. Applications like motion field estimation, correspondence modeling and 4D dose simulation further rely on deformable image registration (DIR) of the individual phase images of the 4D CT data set, with DIR accuracy and the reliability of derived information being impeded by common 4D CT motion artifacts. Development of image-based approaches for reduction of artifacts or dampening their influence on DIR would benefit from precise artifact detection and localization. In this work, we propose applying groupwise non-linear registration of the 4D CT phase images and during-registration analysis of phase-based contributions to the DIR cost function to detect and localize artifacts. In detail, we build on the B-spline-based elastix framework and focus on the variance metric, with the rationale being that contributions of artifact-affected phase images and image regions to the variance metric, and their respective distances to the implicit reference frame (= self reference), are significantly larger than those of non-affected ones. Evaluation is based on selected artifact-free 4D CT data sets of lung tumor patients. By manipulating the 4D CT reconstruction, we introduced artifacts at specific breathing phases and known localizations. Results show that both detecting artifact-affected breathing phases and localizing the artifacts during registration are feasible. The present proof-of-concept opens up the opportunity for targeted local adjustment of, e.g., regularization weights for artifact-affected image regions to increase robustness of DIR in artifact-affected spatio-temporal image data.
GPU-based stochastic-gradient optimization for non-rigid medical image registration in time-critical applications
Parag Bhosale, Marius Staring, Zaid Al-Ars, et al.
Currently, non-rigid image registration algorithms are too computationally intensive to use in time-critical applications. Existing implementations that focus on speed typically address this by either parallelization on GPU-hardware, or by introducing methodically novel techniques into CPU-oriented algorithms. Stochastic gradient descent (SGD) optimization and variations thereof have proven to drastically reduce the computational burden for CPU-based image registration, but have not been successfully applied in GPU hardware due to its stochastic nature. This paper proposes 1) NiftyRegSGD, a SGD optimization for the GPU-based image registration tool NiftyReg, 2) random chunk sampler, a new random sampling strategy that better utilizes the memory bandwidth of GPU hardware. Experiments have been performed on 3D lung CT data of 19 patients, which compared NiftyRegSGD (with and without random chunk sampler) with CPU-based elastix Fast Adaptive SGD (FASGD) and NiftyReg. The registration runtime was 21.5s, 4.4s and 2.8s for elastix-FASGD, NiftyRegSGD without, and NiftyRegSGD with random chunk sampling, respectively, while similar accuracy was obtained. Our method is publicly available at https://github.com/SuperElastix/NiftyRegSGD.
Deformable image registration using convolutional neural networks
Koen A. J. Eppenhof, Maxime W. Lafarge, Pim Moeskops, et al.
Deformable image registration can be time-consuming and often needs extensive parameterization to perform well on a specific application. We present a step towards a registration framework based on a three-dimensional convolutional neural network. The network directly learns transformations between pairs of three-dimensional images. The outputs of the network are three maps for the x, y, and z components of a thin plate spline transformation grid. The network is trained on synthetic random transformations, which are applied to a small set of representative images for the desired application. Training therefore does not require manually annotated ground truth deformation information. The methodology is demonstrated on public data sets of inspiration-expiration lung CT image pairs, which come with annotated corresponding landmarks for evaluation of the registration accuracy. Advantages of this methodology are its fast registration times and its minimal parameterization.
Keynote and Highlights
Foveal fully convolutional nets for multi-organ segmentation
Most fully automatic segmentation approaches target a single anatomical structure in a specific combination of image modalities and are often difficult to extend to other modalities and protocols or segmentation tasks. More recently, deep learning-based approaches promise to be readily adaptable to new applications as long as a suitable training set is available, although most deep learning architectures are still tuned towards a specific application and data domain. In this paper, we propose a novel fully convolutional neural network architecture for image segmentation and show that the same architecture with the same learning parameters can be used to train models for 20 different organs on two different protocols, while still achieving segmentation accuracy that is on par with the state-of-the-art. In addition, the architecture was designed to minimize the amount of GPU memory required for processing large images, which facilitates the application to full-resolution whole-body CT scans. We have evaluated our method on the publicly available data set of the VISCERAL multi-organ segmentation challenge and compared the performance of our method with those of the challenge and two recently proposed deep learning-based approaches. We achieved the highest Dice similarity coefficients for 17 out of 20 organs for the contrast enhanced CT scans and for 10 out of 20 organs for the uncontrasted CT scans in a cross-comparison between our method and participating methods.
A novel framework for the local extraction of extra-axial cerebrospinal fluid from MR brain images
Mahmoud Mostapha, Mark D. Shen, SunHyung Kim, et al.
The quantification of cerebrospinal fluid (CSF) in the human brain has been shown to play an important role in early postnatal brain development. Extra-axial CSF (EA-CSF), the CSF in the subarachnoid space, is promising for the early detection of children at risk for neurodevelopmental disorders. Currently, though, there is no tool to extract local EA-CSF measurements in a way that is suitable for localized analysis. In this paper, we propose a novel framework for localized, cortical-surface-based analysis of EA-CSF. In our proposed processing, we combine probabilistic brain tissue segmentation, cortical surface reconstruction, and streamline-based local EA-CSF quantification. For streamline computation, we employ the vector field generated by solving a Laplacian partial differential equation (PDE) between the cortical surface and the outer CSF hull. To achieve sub-voxel accuracy while minimizing numerical errors, fourth-order Runge-Kutta (RK4) integration was used to generate the streamlines. Finally, the local EA-CSF is computed by integrating the CSF probability along the generated streamlines. The proposed local EA-CSF extraction tool was used to study early postnatal brain development in typically developing infants. The results show that the proposed localized EA-CSF extraction pipeline can identify statistically significant regions that are not observed with the previous global approach.
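The streamline step can be sketched directly: integrate the vector field with RK4, sampling the field by trilinear interpolation at each sub-step. The snippet below is a generic integrator under assumed array conventions (a (3, X, Y, Z) field in voxel coordinates); step size and step count are placeholders.

```python
# Generic RK4 streamline integration through a 3D vector field.
import numpy as np
from scipy.ndimage import map_coordinates

def sample(field, p):
    """Trilinear interpolation of a (3, X, Y, Z) vector field at point p."""
    return np.array([map_coordinates(field[i], p.reshape(3, 1), order=1)[0]
                     for i in range(3)])

def rk4_streamline(field, p0, step=0.25, n_steps=200):
    pts = [np.asarray(p0, dtype=float)]
    for _ in range(n_steps):
        p = pts[-1]
        k1 = sample(field, p)
        k2 = sample(field, p + 0.5 * step * k1)
        k3 = sample(field, p + 0.5 * step * k2)
        k4 = sample(field, p + step * k3)
        pts.append(p + (step / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4))
    return np.array(pts)

# Example: a constant field pointing along +x.
field = np.zeros((3, 20, 20, 20))
field[0] = 1.0
line = rk4_streamline(field, p0=(2.0, 10.0, 10.0), step=0.5, n_steps=30)
```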
A statistical model for image registration performance: effect of tissue deformation
M. D. Ketcha, T. De Silva, R. Han, et al.
Purpose: The accuracy of image registration is a critical factor in image-guidance systems, so it is important to quantifiably understand factors that fundamentally limit performance of the registration task. In this work, we extend a recently derived model for the effect of quantum noise on registration error to a more “generalized” model in which tissue deformation is incorporated as an additional source of “noise” described by a power-law distribution, analogous to “anatomical clutter” in signal detection theory.

Methods: We apply a statistical framework that incorporates objective image quality factors such as spatial resolution and image noise combined with a statistical representation of anatomical clutter to predict the root-mean-squared error (RMSE) of transformation parameters in a rigid registration. Model predictions are compared to simulation studies in CT-to-CT slice registration using the cross-correlation (CC) similarity metric.

Results: RMSE predictions are shown to accurately model the impact of dose and soft-tissue clutter on measured RMSE performance. Further, these predictions reveal dose levels at which the registration becomes soft-tissue clutter limited, where further increase provides no improvement in registration performance.

Conclusions: Incorporating tissue deformation into a statistical registration model is an important step in understanding the limits of image registration performance and selecting pertinent registration methods for a particular registration task. The generalized noise model and RMSE analysis provide insight on how to optimize registration tasks with respect to image acquisition protocol (e.g., dose, reconstruction parameters) and registration method (e.g., level of blur).
fMRI and DTI
SHARD: spherical harmonic-based robust outlier detection for HARDI methods
High Angular Resolution Diffusion Imaging (HARDI) models are used to capture complex intra-voxel microarchitectures. The magnetic resonance imaging sequences that are sensitized to diffusion are often highly accelerated and prone to motion, physiologic, and imaging artifacts. In diffusion tensor imaging, robust statistical approaches have been shown to greatly reduce these adverse factors without human intervention. Similar approaches would be possible with HARDI methods, but robust versions of each distinct HARDI approach would be necessary. To avoid the computational and pragmatic burdens of creating individual robust HARDI analysis variants, we propose a robust outlier imputation model that mitigates outliers prior to traditional HARDI analysis. This model uses a weighted spherical harmonic fit of diffusion-weighted magnetic resonance imaging scans to estimate and restore values that were corrupted during acquisition. Briefly, spherical harmonics of 6th order were used to generate basis functions, which were weighted by the diffusion signal for detection of outliers. For validation, a single healthy volunteer was scanned in a single session comprising two scans, one without head movement and the other with deliberate head movement, at a b-value of 3000 s/mm2 with 64 diffusion-weighted directions and a single b0 (5 averages) per scan. The deliberate motion of the volunteer created natural artifacts in the acquisition of one of the scans. The imputation model shows a reduction in root mean squared error of the raw signal intensities and an improvement for the HARDI method Q-ball in terms of the Angular Correlation Coefficient. The results reveal both quantitative and qualitative improvement. The proposed model can be used as a general pre-processing step before applying any HARDI model, restoring artifacts created by outlier diffusion signals in certain gradient volumes.
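A simplified version of the fit-and-flag idea: build a 6th-order (even-degree) real spherical harmonic basis over the gradient directions, fit by least squares, and flag directions with extreme residuals. The unweighted fit and the 3-sigma cutoff are illustrative simplifications of the weighted scheme described above.

```python
# Simplified SH-fit outlier detection over diffusion gradient directions.
import numpy as np
from scipy.special import sph_harm

def real_sh_basis(polar, azimuth, lmax=6):
    """Real SH basis, even orders only (diffusion signals are antipodally
    symmetric). polar/azimuth: one entry per gradient direction."""
    cols = []
    for l in range(0, lmax + 1, 2):
        for m in range(-l, l + 1):
            Y = sph_harm(abs(m), l, azimuth, polar)   # scipy order: (m, n, theta, phi)
            if m < 0:
                cols.append(np.sqrt(2) * Y.imag)
            elif m == 0:
                cols.append(Y.real)
            else:
                cols.append(np.sqrt(2) * Y.real)
    return np.stack(cols, axis=1)

def detect_outliers(signal, polar, azimuth, z_cut=3.0):
    B = real_sh_basis(polar, azimuth)
    coef, *_ = np.linalg.lstsq(B, signal, rcond=None)
    resid = signal - B @ coef
    z = (resid - resid.mean()) / resid.std()
    return np.abs(z) > z_cut      # True marks candidate gradient volumes to impute

rng = np.random.default_rng(0)
polar = np.arccos(rng.uniform(-1, 1, 64))
azimuth = rng.uniform(0, 2 * np.pi, 64)
signal = np.exp(-2.0 * np.cos(polar) ** 2)   # smooth synthetic DW signal
signal[10] *= 0.2                            # simulate one corrupted volume
flags = detect_outliers(signal, polar, azimuth)
```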
Regional autonomy changes in resting-state functional MRI in patients with HIV associated neurocognitive disorder
Adora M. DSouza, Anas Z. Abidin, Udaysankar Chockanathan, et al.
In this study, we investigate whether there are discernible changes in the influence that brain regions have on themselves once patients show symptoms of HIV Associated Neurocognitive Disorder (HAND), using functional MRI (fMRI). Simple functional connectivity measures, such as correlation, cannot reveal such information. To this end, we use mutual connectivity analysis (MCA) with Local Models (LM), which reveals a measure of influence in terms of predictability. Once such measures of interaction are obtained, we train two classifiers to characterize differences in patterns of regional self-influence between healthy subjects and subjects presenting with HAND symptoms. The two classifiers we use are Support Vector Machines (SVM) and Localized Generalized Matrix Learning Vector Quantization (LGMLVQ). Performing machine learning on fMRI connectivity measures is popularly known as multi-voxel pattern analysis (MVPA). By performing such an analysis, we are interested in studying the impact HIV infection has on an individual's brain. The high area under the receiver operating characteristic curve (AUC) and accuracy values for 100 different train/test separations using MCA-LM self-influence measures (SVM: mean AUC=0.86, LGMLVQ: mean AUC=0.88, SVM and LGMLVQ: mean accuracy=0.78), compared with standard MVPA analysis using cross-correlation between fMRI time-series (SVM: mean AUC=0.58, LGMLVQ: mean AUC=0.57), demonstrate that self-influence features can be more discriminative than measures of interaction between time-series pairs. Furthermore, our results suggest that incorporating measures of self-influence into the MVPA analysis commonly used in fMRI has the potential to provide a performance boost and indicate important changes in the dynamics of brain regions as a consequence of HIV infection.
Tensor-based vs. matrix-based rank reduction in dynamic brain connectivity
Fatemeh Mokhtari, Rhiannon E. Mayhugh, Christina E. Hugenschmidt, et al.
The spatio-temporal information associated with dynamic connectivity from functional magnetic resonance imaging (fMRI) data can be fully represented using a multi-modal tensorial structure. Following a sliding-window correlation analysis, the dynamic connectivity data are represented by a 3rd-order tensor with three modes: 1-2) connectivity and 3) time. In typical dynamic connectivity analysis of fMRI data, the tensor is often flattened into matrix format, resulting in mixed information embedded within the different modes. If a tensor-based data analysis is used, the information underlying the data structure is preserved rather than mixed. In this study, data dimension reduction was performed on dynamic brain networks from two fMRI datasets processed using tensor-based higher-order singular value decomposition (HOSVD) and regular matrix-based SVD. In the first dataset, brain networks were used to predict walking speed in a population of older adults enrolled in a weight loss study. For the second dataset, fMRI networks were collected from moderate-to-heavy alcohol consumers and classification was performed to identify networks associated with resting state vs. an emotional stress task. We hypothesized that the reduced-rank dynamic connectivity from HOSVD would result in superior classification compared to matrix-based SVD using the same linear support vector machine with a 50-iteration random-sampling cross-validation procedure. Results demonstrated that HOSVD (accuracy > 90% for both datasets) significantly outperformed regular SVD, which failed to correctly identify the grouping status (accuracy ~ 50%).
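The contrast between the two decompositions can be sketched with a bare-bones HOSVD: factor matrices come from SVDs of the tensor's mode unfoldings, and the core is the tensor projected onto them. Tensor dimensions and truncation ranks below are illustrative; a matrix-based alternative would flatten the tensor before a single SVD.

```python
# Bare-bones HOSVD of a 3rd-order connectivity tensor (ROIs x ROIs x windows).
import numpy as np

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def mode_product(X, M, mode):
    """Multiply tensor X by matrix M along `mode` (M: (r, X.shape[mode]))."""
    Xk = np.moveaxis(X, mode, 0)
    return np.moveaxis(np.tensordot(M, Xk, axes=(1, 0)), 0, mode)

def hosvd(X, ranks):
    # Factor matrices: leading left singular vectors of each mode unfolding.
    U = [np.linalg.svd(unfold(X, k), full_matrices=False)[0][:, :r]
         for k, r in enumerate(ranks)]
    core = X
    for k, Uk in enumerate(U):
        core = mode_product(core, Uk.T, k)   # project onto each factor
    return core, U

X = np.random.default_rng(2).normal(size=(90, 90, 120))
core, U = hosvd(X, ranks=(10, 10, 5))
X_hat = core
for k, Uk in enumerate(U):
    X_hat = mode_product(X_hat, Uk, k)       # rank-reduced reconstruction
```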
Extrapolated nonnegative decompositions for the analysis of functional connectivity
Nicolas Honnorat, Christos Davatzikos
Functional MRI (fMRI) captures brain function by recording the oxygen consumption of a large number of brain voxels simultaneously over time. The set of time series obtained is typically decomposed using Principal Component Analysis (PCA) or Independent Component Analysis (ICA) to reveal the regions and networks organizing the brain. In this work, we introduce a novel decomposition approach. We separate brain activations and de-activations, and we separately decompose co-activations, captured by the correlation between the activations; co-deactivations, measured by the correlation between the de-activations; and the correlations between activations and de-activations. The decomposition is performed by a nonnegative factorization method known to generate sparse decompositions, which we accelerate by extrapolation. As a result, our approach produces compact fMRI scan decompositions in reasonable time, offering a rich interpretation of the interactions between brain regions. The experiments presented here, performed on a dataset of forty scans provided by the Human Connectome Project, demonstrate the quality of our decompositions and indicate that extrapolation offers a speedup of an order of magnitude.
Strain map of the tongue in normal and ALS speech patterns from tagged and diffusion MRI
Fangxu Xing, Jerry L. Prince, Maureen Stone, et al.
Amyotrophic Lateral Sclerosis (ALS) is a neurological disease that causes death of the neurons controlling muscle movements. Loss of speech and swallowing functions is a major impact of the degeneration of the tongue muscles. In speech studies using magnetic resonance (MR) techniques, diffusion tensor imaging (DTI) is used to capture internal tongue muscle fiber structures in three dimensions (3D) in a non-invasive manner. Tagged magnetic resonance images (tMRI) are used to record tongue motion during speech. In this work, we aim to combine information obtained with both MR imaging techniques to compare the functional characteristics of the tongue between normal and ALS subjects. We first extracted the 3D motion of the tongue using tMRI from fourteen normal subjects during speech. The estimated motion sequences were then warped using diffeomorphic registration into the b0 spaces of the DTI data of two normal subjects and an ALS patient. We then constructed motion atlases by averaging all warped motion fields in each b0 space, and computed strain in the line of action along the muscle fiber directions provided by tractography. Strain in line with the fiber directions provides a quantitative map of the potentially active regions of the tongue during speech. Comparison between normal and ALS subjects explores how the volume of compressing tongue tissue during speech changes in the face of muscle degradation. The proposed framework provides for the first time a dynamic map of contracting fibers in ALS speech patterns, and has the potential to provide more insight into the detrimental effects of ALS on speech.
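Strain "in the line of action" of a fiber has a simple closed form: project the strain tensor onto the unit fiber direction, e_f = f^T E f. The sketch below assumes a Green-Lagrange strain tensor from the motion field and a fiber direction from tractography; the numbers are made up.

```python
# Projection of a strain tensor onto a fiber direction: e_f = f^T E f.
import numpy as np

def strain_along_fiber(E, f):
    f = f / np.linalg.norm(f)     # ensure a unit direction
    return f @ E @ f              # negative values indicate compression

E = np.array([[-0.10, 0.02, 0.00],
              [ 0.02, 0.05, 0.01],
              [ 0.00, 0.01, 0.03]])   # illustrative strain tensor at a voxel
f = np.array([1.0, 0.0, 0.0])         # illustrative fiber direction
print(strain_along_fiber(E, f))       # -0.10: fiber-aligned compression
```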
TRAFIC: fiber tract classification using deep learning
Prince D. Ngattai Lam, Gaetan Belhomme, Jessica Ferrall, et al.
We present TRAFIC, a fully automated tool for the labeling and classification of brain fiber tracts. TRAFIC classifies new fibers using a neural network trained using shape features computed from previously traced and manually corrected fiber tracts. It is independent from a DTI Atlas as it is applied to already traced fibers. This work is motivated by medical applications where the process of extracting fibers from a DTI atlas, or classifying fibers manually is time consuming and requires knowledge about brain anatomy. With this new approach we were able to classify traced fiber tracts obtaining encouraging results. In this report we will present in detail the methods used and the results achieved with our approach.
Evaluation of inter-site bias and variance in diffusion-weighted MRI
An understanding of the bias and variance of diffusion-weighted magnetic resonance imaging (DW-MRI) acquisitions across scanners, study sites, or over time is essential for the incorporation of multiple data sources into a single clinical study. Studies that combine samples from various sites may introduce confounding factors due to site-specific artifacts and patterns. Differences in bias and variance across sites may render the scans incomparable, and, without correction, inferences obtained from these data may be misleading. We present an analysis of the bias and variance of scans of the same subjects across different sites and evaluate their impact on statistical analyses. In previous work, we presented a simulation extrapolation (SIMEX) technique for bias estimation as well as a wild bootstrap technique for variance estimation in metrics obtained from a Q-ball imaging (QBI) reconstruction of empirical high angular resolution diffusion imaging (HARDI) data. We now apply those techniques to data acquired from 5 healthy volunteers on 3 independent scanners under closely matched acquisition protocols. The bias and variance of generalized fractional anisotropy (GFA) measurements were estimated on a voxel-wise basis for each scan and compared across study sites to identify site-specific differences. Further, we provide model recommendations that can be used to determine the extent of the impact of bias and variance, as well as aspects of the analysis that should account for these differences. We include a decision tree to help researchers determine whether model adjustments are necessary based on the bias and variance results.
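As a hedged illustration of the variance-estimation half of that pipeline, the sketch below runs a wild bootstrap on a generic per-voxel linear model: residuals are flipped with Rademacher signs and the fit is repeated. The Q-ball/GFA fitting itself is not reproduced; `X` (design matrix) and `y` (per-voxel measurements) are placeholders.

```python
import numpy as np

def wild_bootstrap_variance(y, X, n_boot=500, seed=0):
    """Wild-bootstrap variance of the coefficients of y = X b + e."""
    rng = np.random.default_rng(seed)
    b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b_hat
    boots = np.empty((n_boot, b_hat.size))
    for i in range(n_boot):
        signs = rng.choice([-1.0, 1.0], size=resid.shape)  # Rademacher weights
        y_star = X @ b_hat + signs * resid                 # resampled response
        boots[i], *_ = np.linalg.lstsq(X, y_star, rcond=None)
    return boots.var(axis=0)
```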
Motion
A novel filtering approach for 3D harmonic phase analysis of tagged MRI
Xiaokai Wang, Maureen L. Stone, Jerry L. Prince, et al.
Harmonic phase analysis has been used to perform noninvasive organ motion and strain estimation from tagged magnetic resonance imaging (MRI). The filtering process used to produce the harmonic phase images for tissue tracking influences the estimation accuracy. In this work, we evaluated different filtering approaches and propose a novel high-pass filter for volumes tagged in individual directions. Testing was done using an open benchmarking dataset and synthetic images obtained using a mechanical model. We compared estimation results from our filtering approach with results from the traditional filtering approach. Our results indicate that 1) the proposed high-pass filter outperforms the traditional filtering approach, reducing error by as much as 50%, and 2) the accuracy improvements are especially marked for complex deformations.
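For intuition, a HARP-style phase extraction with a crude high-pass step might look like the sketch below: keep one conjugate half of the spectrum, suppress the low-frequency anatomy peak, and take the phase of the inverse transform. This only illustrates the filtering concept under an assumed tag orientation; the paper's filter design is more specific.

```python
import numpy as np

def harp_phase_highpass(tagged, cutoff=0.08):
    """Illustrative harmonic-phase extraction for a 2D slice tagged along x.

    `cutoff` (cycles/pixel) is an assumed radius for the high-pass mask.
    """
    F = np.fft.fft2(tagged)
    fy = np.fft.fftfreq(tagged.shape[0])[:, None]
    fx = np.fft.fftfreq(tagged.shape[1])[None, :]
    # High-pass removes the DC/anatomy peak; keeping fx > 0 selects a single
    # harmonic side so the phase is well defined.
    keep = (np.hypot(fy, fx) > cutoff) & (fx > 0)
    return np.angle(np.fft.ifft2(F * keep))  # wrapped harmonic phase
```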
Feasibility of intra-acquisition motion correction for 4D DSA reconstruction for applications in the thorax and abdomen
Martin Wagner, Paul Laeseke, Colin Harari, et al.
The recently proposed 4D DSA technique enables reconstruction of time-resolved 3D volumes from two C-arm CT acquisitions. This provides information on blood flow in neurovascular applications and can be used for the diagnosis and treatment of vascular diseases. For applications in the thorax and abdomen, respiratory motion can prevent successful 4D DSA reconstruction and cause severe artifacts. The purpose of this work is to propose a novel technique for motion-compensated 4D DSA reconstruction to enable applications in the thorax and abdomen. The approach uses deformable 2D registration to align the projection images of a non-contrast and a contrast-enhanced scan. A subset of projection images acquired in a similar respiratory state is then selected, and an iterative simultaneous multiplicative algebraic reconstruction is applied to determine a 3D constraint volume. A 2D-3D registration step then aligns the remaining projection images with the 3D constraint volume. Finally, a constrained back-projection is performed to create a 3D volume for each projection image. A pig study was performed in which 4D DSA acquisitions were obtained with and without respiratory motion to evaluate the feasibility of the approach. The Dice similarity coefficient between the reference 3D constraint volume and the motion-compensated reconstruction was 51.12%, compared to 35.99% without motion compensation. This technique could improve the workflow for procedures in interventional radiology, e.g. liver embolizations, where changes in blood flow have to be monitored carefully.
Deep-learning-based CT motion artifact recognition in coronary arteries
T. Elss, H. Nickisch, T. Wissel, et al.
The detection and subsequent correction of motion artifacts is essential for the high diagnostic value of non-invasive coronary angiography using cardiac CT. However, motion correction algorithms have a substantial computational footprint and possible failure modes, which warrants a motion artifact detection step to decide whether motion correction is required in the first place. We investigate how accurately motion artifacts in the coronary arteries can be predicted by deep learning approaches. A forward model simulating cardiac motion by creating and integrating artificial motion vector fields in the filtered back projection (FBP) algorithm allows us to generate training data from nine prospectively ECG-triggered high quality clinical cases. We train a Convolutional Neural Network (CNN) classifying 2D motion-free and motion-perturbed coronary cross-section images and achieve a classification accuracy of 94.4% ± 2.9% by four-fold cross-validation.
Population-based respiratory 4D motion atlas construction and its application for VR simulations of liver punctures
Virtual reality (VR) training simulators of liver needle insertion in the hepatic area of breathing virtual patients often require 4D image data acquisitions as a prerequisite. Here, a population-based breathing virtual patient 4D atlas is first built; the atlas then mitigates the need for a dose-relevant or expensive 4D CT or MRI acquisition for a new patient, since the mean atlas motion can be warped instead. The central contribution of this work is the construction and reuse of population-based, learned 4D motion models.
Sensitivity analysis of Jacobian determinant used in treatment planning for lung cancer
Wei Shao, Sarah E. Gerard, Yue Pan, et al.
Four-dimensional computed tomography (4DCT) is regularly used to visualize tumor motion in radiation therapy for lung cancer. These 4DCT images can be analyzed to estimate local ventilation by finding a dense correspondence map between the end-inhalation and end-exhalation CT image volumes using deformable image registration. Lung regions with ventilation values above a threshold are labeled as regions of high pulmonary function and are avoided when possible in the radiation plan. This paper investigates the sensitivity of the relative Jacobian error to small registration errors. We present a linear approximation of the relative Jacobian error. Next, we give a formula for the sensitivity of the relative Jacobian error with respect to the Jacobian of the perturbation displacement field. Preliminary sensitivity analysis results are presented using 4DCT scans from 10 individuals. For each subject, we generated 6400 random, smooth, biologically plausible perturbation vector fields using a cubic B-spline model. We showed that the correlation between the Jacobian determinant and the Frobenius norm of the sensitivity matrix is close to -1, which implies that the relative Jacobian error in high-functional regions is less sensitive to noise. We also showed that small displacement errors, of 0.53 mm on average, may lead to a 10% relative change in the Jacobian determinant. We finally showed that the average relative Jacobian error and the sensitivity of the system for all subjects are positively correlated (close to +1), i.e., regions with high sensitivity have, on average, more error in the Jacobian determinant.
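The central quantity here, the voxelwise Jacobian determinant of a deformation x ↦ x + u(x), can be computed as det(I + ∇u). A minimal NumPy sketch, assuming a displacement field of shape (3, Z, Y, X) and known voxel spacing, is:

```python
import numpy as np

def jacobian_determinant(disp, spacing=(1.0, 1.0, 1.0)):
    """Voxelwise det(I + grad(u)) for a 3D displacement field u.

    Values > 1 indicate local expansion, < 1 local contraction.
    """
    grads = [np.gradient(disp[i], *spacing) for i in range(3)]  # du_i/dx_j
    J = np.empty((3, 3) + disp.shape[1:])
    for i in range(3):
        for j in range(3):
            J[i, j] = grads[i][j]
        J[i, i] += 1.0  # identity term of the mapping x + u(x)
    # Put the 3x3 axes last so the determinant is taken per voxel.
    return np.linalg.det(np.moveaxis(J, (0, 1), (-2, -1)))
```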
Image Features
HoDOr: histogram of differential orientations for rigid landmark tracking in medical images
Abhishek Tiwari, Kedar Anil Patwardhan
Feature extraction plays a pivotal role in pattern recognition and matching. An ideal feature should be invariant to image transformations such as translation, rotation, and scaling. In this work, we present a novel rotation-invariant feature based on the Histogram of Oriented Gradients (HOG). We compare the performance of the proposed approach with the HOG feature on 2D phantom data as well as 3D medical imaging data. We used traditional histogram comparison measures, such as the Bhattacharyya distance and the Normalized Correlation Coefficient (NCC), to assess the efficacy of the proposed approach under image rotation. In our experiments, the proposed feature performs 40%, 20%, and 28% better than the HOG feature on phantom (2D), Computed Tomography (CT, 3D), and Ultrasound (US, 3D) data, respectively, for image matching and landmark tracking tasks.
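Both histogram comparison measures mentioned are standard; a minimal version, assuming 1D histograms as NumPy arrays, is sketched below.

```python
import numpy as np

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two histograms (normalized inside)."""
    p = p / p.sum()
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))       # Bhattacharyya coefficient in [0, 1]
    return -np.log(max(bc, 1e-12))

def normalized_correlation(p, q):
    """Normalized correlation coefficient between two histograms."""
    p = p - p.mean()
    q = q - q.mean()
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q) + 1e-12))
```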
Topological leakage detection and freeze-and-grow propagation for improved CT-based airway segmentation
Syed Ahmed Nadeem, Eric A. Hoffman, Jered P. Sieren, et al.
Numerous large multi-center studies are incorporating the use of computed tomography (CT)-based characterization of the lung parenchyma and bronchial tree to understand chronic obstructive pulmonary disease status and progression. To the best of our knowledge, there are no fully automated airway tree segmentation methods free of the need for user review. A failure in even a fraction of segmentation results necessitates manual revision of all segmentation masks, which is laborious considering the thousands of image data sets evaluated in large studies. In this paper, we present a novel CT-based airway tree segmentation algorithm using topological leakage detection and freeze-and-grow propagation. The method is fully automated, requiring no manual inputs or post-segmentation editing. It uses simple intensity-based connectivity and a freeze-and-grow propagation algorithm to iteratively grow the airway tree starting from an initial seed inside the trachea. It begins with a conservative parameter and then gradually shifts toward more generous parameter values. The method was applied to chest CT scans of fifteen subjects at total lung capacity. Airway segmentation results were qualitatively assessed and performed comparably to an established airway segmentation method, with no major visual leakages.
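A heavily simplified sketch of the conservative-to-generous growing idea is shown below: a 26-connected component is grown from a tracheal seed under an air-like HU threshold that is relaxed stepwise. The published topological leakage detection and freeze-and-grow bookkeeping are deliberately omitted; thresholds and connectivity are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def iterative_airway_grow(ct, seed, thresholds=(-1000, -950, -900)):
    """Grow the airway component containing `seed` (a (z, y, x) tuple),
    relaxing the intensity threshold from conservative to generous."""
    mask = np.zeros(ct.shape, dtype=bool)
    mask[seed] = True
    structure = np.ones((3, 3, 3), dtype=bool)      # 26-connectivity
    for t in thresholds:
        candidate = (ct < t) | mask                  # keep what we have
        labels, _ = ndimage.label(candidate, structure=structure)
        mask = labels == labels[seed]                # seeded component only
    return mask
```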
Inter-scanner variation independent descriptors for constrained diffeomorphic demons registration of retina OCT
Purpose: OCT offers high in-plane micrometer resolution, enabling studies of neurodegenerative and ocular-disease mechanisms via imaging of the retina at low cost. An important component of such studies is inter-scanner deformable image registration. Image quality of OCT, however, is suboptimal, with poor signal-to-noise ratio and through-plane resolution. In addition, the geometry of OCT is improperly defined. We developed a diffeomorphic deformable registration method incorporating constraints that accommodate the improper geometry and a decentralized modality-insensitive neighborhood descriptor (D-MIND) robust against degradation of OCT image quality and inter-scanner variability. Method: The method, called D-MIND Demons, estimates diffeomorphisms using D-MINDs under constraints on the direction of velocity fields in a MIND-Demons framework. Descriptiveness of D-MINDs with/without denoising was ranked against four other shape/texture-based descriptors. Performance of D-MIND Demons and its variants incorporating other descriptors was compared for cross-scanner, intra- and inter-subject deformable registration using clinical retina OCT data. Results: D-MINDs outperformed other descriptors, with a difference in mutual descriptiveness between high-contrast and homogeneous regions > 0.2. Among Demons variants, D-MIND Demons was computationally efficient, demonstrating robustness against OCT image degradation (noise, speckle, intensity non-uniformity, and poor through-plane resolution) and consistent registration accuracy [(4±4 μm) and (4±6 μm) in cross-scanner intra- and inter-subject registration] regardless of denoising. Conclusions: A promising method for cross-scanner, intra- and inter-subject OCT image registration has been developed for ophthalmological and neurological studies of retinal structures. The approach could assist image segmentation, evaluation of longitudinal disease progression, and patient population analysis, which in turn facilitate diagnosis and patient-specific treatment.
A temporal-frequency variant on robust principal component analysis for segmentation of motile cilia in optical coherence tomography images (Conference Presentation)
James P. McLean, Yuye Ling, Christine Hendon
Optical Coherence Tomography (OCT) has established itself as an important tool for studying the role of cilia in Mucociliary Clearance (MCC) due to its ability to observe the cilia's temporal characteristics over a large field of view. To obtain useful, quantitative measures of this dynamic morphology, the ciliated tissue layer needs to be segmented from other, static components. This is currently accomplished using speckle variance processing, a technique whose success relies on subjective thresholding and which lacks sensitivity to other sources of speckle noise. We present a modified, frequency-constrained version of Robust Principal Component Analysis (RPCA), which we call Frequency Constrained RPCA (FC-RPCA), as an alternative method for dynamic segmentation of cilia from time-varying OCT B-scans. Based on sparse representation theory, FC-RPCA decomposes a stack of images in time into a low-rank (static) and a sparse (dynamic) matrix. The sparse matrix represents the segmented cilia layer because of the sparse frequency spectrum exhibited by the cilia's characteristic beating pattern. The algorithm introduces an additional feature, a user-defined frequency constraint on the sparse component, which prevents other sources of speckle noise, such as slow-moving mucus clouds at the tissue's surface, from being segmented with the cilia. The algorithm was used to segment motile cilia in 17 datasets of ex-vivo human ciliated epithelium with high accuracy. Furthermore, FC-RPCA requires no parameter tuning across datasets, demonstrating its capability as a robust tool for processing large volumes of data. When compared with the standard speckle variance method, FC-RPCA performed with improved accuracy and selectivity.
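For orientation, a minimal unconstrained RPCA via principal component pursuit (inexact augmented Lagrangian) is sketched below; each column of M would hold one vectorized frame. The paper's frequency constraint on the sparse term is omitted, and the parameter defaults follow common conventions rather than the authors' settings.

```python
import numpy as np

def rpca_pcp(M, lam=None, mu=None, n_iter=100):
    """Decompose M into low-rank L (static tissue) + sparse S (motion)."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else 0.25 * m * n / (np.abs(M).sum() + 1e-12)
    L, S, Y = np.zeros_like(M), np.zeros_like(M), np.zeros_like(M)
    shrink = lambda A, tau: np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)
    for _ in range(n_iter):
        # Singular-value thresholding gives the low-rank update.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * shrink(sig, 1.0 / mu)) @ Vt
        # Entrywise soft thresholding gives the sparse update.
        S = shrink(M - L + Y / mu, lam / mu)
        Y += mu * (M - L - S)                        # dual ascent step
    return L, S
```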
Classification of malignant and benign liver tumors using a radiomics approach
Martijn P. A. Starmans, Razvan L. Miclea, Sebastian R. van der Voort, et al.
Correct diagnosis of the liver tumor phenotype is crucial for treatment planning, especially the distinction between malignant and benign lesions. Clinical practice includes manual scoring of the tumors on Magnetic Resonance (MR) images by a radiologist. As this is challenging and subjective, it is often followed by a biopsy. In this study, we propose a radiomics approach as an objective and non-invasive alternative for distinguishing between malignant and benign phenotypes. T2-weighted (T2w) MR sequences of 119 patients from multiple centers were collected. We developed an efficient semi-automatic segmentation method, which was used by a radiologist to delineate the tumors. Within these regions, features quantifying tumor shape, intensity, texture, heterogeneity and orientation were extracted. Patient characteristics and semantic features were added for a total of 424 features. Classification was performed using Support Vector Machines (SVMs). The performance was evaluated using internal random-split cross-validation. On the training set within each iteration, feature selection and hyperparameter optimization were performed. To this end, another cross-validation was performed by splitting the training sets into training and validation parts. The optimal settings were evaluated on the independent test sets. Manual scoring by a radiologist was also performed. The radiomics approach resulted in 95% confidence intervals of [0.75, 0.92] for the AUC, [0.76, 0.96] for the specificity, and [0.52, 0.82] for the sensitivity. These approach the performance of the radiologist, which was an AUC of 0.93, specificity of 0.70 and sensitivity of 0.93. Hence, radiomics has the potential to predict liver tumor benignity in an objective and non-invasive manner.
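The nested validation scheme described, an inner loop for hyperparameter selection inside outer random splits, is compactly expressed with scikit-learn; the grid, split counts, and the feature matrix `X` / label vector `y` below are placeholders, not the study's configuration.

```python
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Inner loop: tune the SVM on the training part of each outer split only.
inner = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_grid={"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01]},
    cv=5,
)
# Outer loop: repeated random splits yield an unbiased performance spread.
outer = StratifiedShuffleSplit(n_splits=50, test_size=0.2, random_state=0)
scores = cross_val_score(inner, X, y, cv=outer, scoring="roc_auc")
print(scores.mean(), scores.std())
```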
Quantitative phase and texture angularity analysis of brain white matter lesions in multiple sclerosis
Shalese Baxandall, Shrushrita Sharma, Peng Zhai, et al.
Structural changes to nerve fiber tracts are extremely common in neurological diseases such as multiple sclerosis (MS). Accurate quantification is vital. However, while nerve fiber damage is often seen as multi-focal lesions in magnetic resonance imaging (MRI), measurement through visual perception is limited. Our goal was to characterize the texture pattern of the lesions in MRI and determine how texture orientation metrics relate to lesion structure using two new methods: phase congruency and multi-resolution spatial-frequency analysis. The former aims to optimize the detection of the ‘edges and corners’ of a structure, and the latter evaluates both the radial and angular distributions of image texture associated with the various forming scales of a structure. The radial texture spectra were previously confirmed to measure the severity of nerve fiber damage, and were thus included for validation. All measures were also done in the control brain white matter for comparison. Using clinical images of MS patients, we found that both phase congruency and weighted mean phase detected invisible lesion patterns and were significantly greater in lesions, suggesting higher structure complexity, than the control tissue. Similarly, multi-angular spatial-frequency analysis detected much higher texture across the whole frequency spectrum in lesions than the control areas. Such angular complexity was consistent with findings from radial texture. Analysis of the phase and texture alignment may prove to be a useful new approach for assessing invisible changes in lesions using clinical MRI and thereby lead to improved management of patients with MS and similar disorders.
Deep Learning: Lesions and Pathologies
MRI tumor segmentation with densely connected 3D CNN
Lele Chen, Yue Wu, Adora M. DSouza, et al.
Glioma is one of the most common and aggressive types of primary brain tumors. Accurate segmentation of subcortical brain structures is crucial to the study of gliomas, as it helps monitor their progression and aids the evaluation of treatment outcomes. However, the large amount of human labor required makes it difficult to obtain manually segmented Magnetic Resonance Imaging (MRI) data, limiting the use of precise quantitative measurements in clinical practice. In this work, we address this problem by developing a 3D Convolutional Neural Network (3D CNN) based model to automatically segment gliomas. The major difficulty for our segmentation model is that the location, structure, and shape of gliomas vary significantly among patients. In order to accurately classify each voxel, our model captures multi-scale contextual information by extracting features from two scales of receptive fields. To fully exploit the tumor structure, we propose a novel architecture that hierarchically segments the different lesion regions: necrotic and non-enhancing tumor (NCR/NET), peritumoral edema (ED), and GD-enhancing tumor (ET). Additionally, we utilize densely connected convolutional blocks to further boost the performance. We train our model with a patch-wise training schema to mitigate the class imbalance problem. The proposed method was validated on the BraTS 2017 dataset and achieves Dice scores of 0.72, 0.83 and 0.81 for the complete tumor, tumor core and enhancing tumor, respectively. These results are comparable to the reported state-of-the-art results, and our method is better than existing 3D-based methods in terms of compactness, time and space efficiency.
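As a sketch of the densely connected convolutional blocks mentioned, here is a minimal 3D dense block in PyTorch: every layer receives the concatenation of all earlier feature maps. Channel counts, depth, and normalization are illustrative, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class DenseBlock3D(nn.Module):
    """Minimal 3D dense block (illustrative growth rate and depth)."""
    def __init__(self, in_ch, growth=16, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm3d(ch), nn.ReLU(inplace=True),
                nn.Conv3d(ch, growth, kernel_size=3, padding=1)))
            ch += growth  # each layer sees all previous feature maps

    def forward(self, x):
        for layer in self.layers:
            x = torch.cat([x, layer(x)], dim=1)  # dense connectivity
        return x
```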
Quantification of lung abnormalities in cystic fibrosis using deep networks
Filipe Marques, Florian Dubost, Mariette Kemner-van de Corput, et al.
Cystic fibrosis is a genetic disease which may appear in early life with structural abnormalities in lung tissues. We propose to detect these abnormalities using a texture classification approach. Our method is a cascade of two convolutional neural networks. The first network detects the presence of abnormal tissues. The second network identifies the type of structural abnormality: bronchiectasis, atelectasis or mucus plugging. We also propose a network computing pixel-wise heatmaps of abnormality presence, learning only from patch-wise annotations. Our database consists of CT scans of 194 subjects. We use 154 subjects to train our algorithms and the 40 remaining ones as a test set. We compare our method with a random forest and a single neural network approach. The first network reaches a sensitivity of 0.62 for disease detection, 0.10 higher than the random forest classifier and 0.17 higher than the single neural network. Our cascade approach yields a final class-averaged F1-score of 0.38, outperforming the baseline method and the single network by 0.15 and 0.10, respectively.
Deep learning for biomarker regression: application to osteoporosis and emphysema on chest CT scans
Germán González, George R. Washko, Raúl San José Estépar
Introduction: Biomarker computation using deep learning often relies on a two-step process, in which the deep learning algorithm segments the region of interest and the biomarker is then measured. We propose an alternative paradigm, in which the biomarker is estimated directly using a regression network. We showcase this image-to-biomarker paradigm using two biomarkers: the estimation of bone mineral density (BMD) and the estimation of the lung percentage of emphysema from CT scans. Materials and methods: We use a large database of 9,925 CT scans to train, validate and test the network, for which reference-standard BMD and percentage emphysema have already been computed. First, the 3D dataset is reduced to a set of canonical 2D slices where the organ of interest is visible (either the spine for BMD or the lungs for emphysema). This data reduction is performed using an automatic object detector. Second, the regression neural network is composed of three convolutional layers, followed by a fully connected and an output layer. The network is optimized using a momentum optimizer with an exponential decay rate, using the root mean squared error as the cost function. Results: The Pearson correlation coefficients obtained against the reference standards are r = 0.940 (p < 0.00001) and r = 0.976 (p < 0.00001) for BMD and percentage emphysema, respectively. Conclusions: The deep-learning regression architecture can learn biomarkers from images directly, without indicating the structures of interest. This approach simplifies the development of biomarker extraction algorithms. The proposed data reduction based on object detectors conveys enough information to compute the biomarkers of interest.
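The described regression network, three convolutional layers followed by a fully connected and an output layer, trained with a momentum optimizer, exponential decay, and an RMSE cost, might be sketched in PyTorch as follows; kernel sizes, widths, and learning-rate values are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiomarkerRegressor(nn.Module):
    """Image-to-biomarker regression: 2D slice in, one scalar out."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(128, 64),
                                  nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):
        return self.head(self.features(x))

model = BiomarkerRegressor()
opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.98)
rmse = lambda pred, target: torch.sqrt(F.mse_loss(pred, target))  # RMSE cost
```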
Microaneurysm detection using deep learning and interleaved freezing
Piotr Chudzik, Somshubra Majumdar, Francesco Caliva, et al.
Diabetes affects one in eleven adults. Diabetic retinopathy is a microvascular complication of diabetes and the leading cause of blindness in the working-age population. Microaneurysms are the earliest clinical signs of diabetic retinopathy. This paper proposes an automatic method for detecting microaneurysms in fundus photographs. A novel patch-based fully convolutional neural network for the detection of microaneurysms is proposed. Compared to other methods that require five processing stages, it requires only two. Furthermore, a novel network fine-tuning scheme called Interleaved Freezing is presented. This procedure significantly reduces the amount of time needed to re-train a network and produces competitive results. The proposed method was evaluated using publicly available and widely used datasets: E-Ophtha and ROC. It outperforms the state-of-the-art methods in terms of the free-response receiver operating characteristic (FROC) metric. The simplicity, performance, efficiency and robustness of the proposed method demonstrate its suitability for diabetic retinopathy screening applications.
Dataset variability leverages white-matter lesion segmentation performance with convolutional neural network
Domen Ravnik, Tim Jerman, Franjo Pernuš, et al.
The performance of convolutional neural network (CNN) based white-matter lesion segmentation in magnetic resonance (MR) brain images was evaluated under various conditions involving different levels of image preprocessing and augmentation and different compositions of the training dataset. On images of sixty multiple sclerosis patients, half acquired on one scanner and half on another scanner from a different vendor, we first created highly accurate multi-rater consensus-based lesion segmentations, which were used in several experiments to evaluate the CNN segmentation results. First, the CNN was trained and tested without preprocessing the images and with various combinations of preprocessing techniques, namely histogram-based intensity standardization, normalization by whitening, and training dataset augmentation by flipping the images across the midsagittal plane. Then, the CNN was trained and tested on images of the same, different or interleaved scanner datasets using a cross-validation approach. The results indicate that image preprocessing has little impact on performance in a same-scanner situation, while between-scanner performance benefits most from intensity standardization and normalization, and further from incorporating heterogeneous multi-scanner datasets in the training phase. Under such conditions the between-scanner performance of the CNN approaches that of the ideal situation, in which the CNN is trained and tested on the same scanner dataset.
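The compared preprocessing steps and the flip augmentation are standard operations; a hedged sketch, assuming a chosen reference volume for histogram-based standardization and a left-right last axis, is shown below.

```python
import numpy as np
from skimage.exposure import match_histograms

def preprocess(vol, reference):
    """Intensity standardization followed by whitening normalization."""
    vol = match_histograms(vol, reference)          # histogram-based standardization
    return (vol - vol.mean()) / (vol.std() + 1e-8)  # zero mean, unit variance

def augment_midsagittal(vol):
    """Original plus a flip across the midsagittal plane (assumed last axis)."""
    return [vol, vol[..., ::-1]]
```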
Deep Learning: Generative Adversarial Networks
Modelling the progression of Alzheimer's disease in MRI using generative adversarial networks
Christopher Bowles, Roger Gunn, Alexander Hammers, et al.
Being able to accurately model the progression of Alzheimer's disease (AD) is important for the diagnosis and prognosis of the disease, as well as for evaluating the effect of disease-modifying treatments. Whilst there has been success in modeling the progression of AD-related clinical biomarkers and image-derived features over the course of the disease, modeling the expected progression as observed in magnetic resonance (MR) images directly remains a challenge. Here, we apply recently developed ideas from the field of generative adversarial networks (GANs), which provide a powerful way to model and manipulate MR images directly through the technique of image arithmetic. This allows synthetic images based upon an individual subject's MR image to be produced expressing different levels of the features associated with AD. We demonstrate how the model can be used to both introduce and remove AD-like features from two regions in the brain, and show that these predicted changes correspond well to the observed changes over a longitudinal examination. We also propose a modification to the GAN training procedure to encourage the model to better represent the more extreme cases of AD present in the dataset. We show the benefit of this modification by comparing the ability of the resulting models to encode and reconstruct real images with high atrophy and other unusual features.
Learning implicit brain MRI manifolds with deep learning
Camilo Bermudez, Andrew J. Plassard, Larry T. Davis, et al.
An important task in image processing and neuroimaging is to extract quantitative information from the acquired images in order to make observations about the presence of disease or markers of development in populations. Having a low-dimensional manifold of an image allows for easier statistical comparisons between groups and the synthesis of group representatives. Previous studies have sought to identify the best mapping of brain MRI to a low-dimensional manifold, but have been limited by assumptions of explicit similarity measures. In this work, we use deep learning techniques to investigate implicit manifolds of normal brains and generate new, high-quality images. We explore implicit manifolds by addressing the problems of image synthesis and image denoising as important tools in manifold learning. First, we propose the unsupervised synthesis of T1-weighted brain MRI using a Generative Adversarial Network (GAN) trained on 528 examples of 2D axial slices of brain MRI. Synthesized images were first shown to be unique by performing a cross-correlation with the training set. Real and synthesized images were then assessed in a blinded manner by two imaging experts, providing an image quality score of 1-5. The quality scores of the synthetic images showed substantial overlap with those of the real images. Moreover, we use an autoencoder with skip connections for image denoising, showing that the proposed method results in a higher PSNR than FSL SUSAN after denoising. This work shows the power of artificial networks to synthesize realistic imaging data, which can be used to improve image processing techniques, and provides a quantitative framework for assessing structural changes in the brain.
Chest x-ray generation and data augmentation for cardiovascular abnormality classification
Ali Madani, Mehdi Moradi, Alexandros Karargyris, et al.
Medical imaging datasets are limited in size due to privacy issues and the high cost of obtaining annotations. Augmentation is a widely used practice in deep learning to enrich the data in data-limited scenarios and to avoid overfitting. However, standard augmentation methods that produce new examples of data by varying lighting, field of view, and spatial rigid transformations do not capture the biological variance of medical imaging data and could result in unrealistic images. Generative adversarial networks (GANs) provide an avenue to understand the underlying structure of image data which can then be utilized to generate new realistic samples. In this work, we investigate the use of GANs for producing chest X-ray images to augment a dataset. This dataset is then used to train a convolutional neural network to classify images for cardiovascular abnormalities. We compare our augmentation strategy with traditional data augmentation and show higher accuracy for normal vs abnormal classification in chest X-rays.
Contextual loss functions for optimization of convolutional neural networks generating pseudo CTs from MRI
M. van Stralen, Y. Zhou, P. J. Wozny, et al.
Predicting pseudo CT images from MRI data has received increasing attention for use in MRI-only radiation therapy planning and PET-MRI attenuation correction, eliminating the need for harmful CT scanning. Current approaches focus on voxelwise mean absolute error (MAE) and peak signal-to-noise-ratio (PSNR) for optimization and evaluation. Contextual losses such as structural similarity (SSIM) are known to promote perceptual image quality. We investigate the use of these contextual losses for optimization.

Patch-based 3D fully convolutional neural networks (FCN) were optimized for prediction of pseudo CT images from 3D gradient echo pelvic MRI data and compared to ground truth CT data of 26 patients. CT data was non-rigidly registered to MRI for training and evaluation. We compared voxelwise L1 and L2 loss functions, with contextual multi-scale L1 and L2 (MSL1 and MSL2), and SSIM. Performance was evaluated using MAE, PSNR, SSIM and the overlap of segmented cortical bone in the reconstructions, by the dice similarity metric. Evaluation was carried out in cross-validation.

All optimizations converged successfully, with PSNR between 25 and 30 HU, except for one of the folds of the SSIM optimizations. MSL1 and MSL2 are at least on par with their single-scale counterparts. MSL1 overcomes some of the instabilities of the L1-optimized prediction models. MSL2 optimization is stable and, on average, outperforms all the other losses, although quantitative evaluations based on MAE, PSNR and SSIM show only minor differences. Direct optimization using SSIM visually excelled in terms of subjective perceptual image quality, at the expense of a voxelwise quantitative performance drop.

Contextual loss functions can improve the prediction performance of FCNs without changing the network architecture. The suggested subjective superiority of contextual losses in reconstructing local structures merits further investigation.
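A multi-scale L1 loss of the kind compared here (MSL1) can be sketched by evaluating L1 at several average-pooled resolutions; the number of scales and the uniform weighting below are assumptions, not the authors' definition. Inputs are 5D tensors (batch, channel, depth, height, width).

```python
import torch.nn.functional as F

def multiscale_l1(pred, target, n_scales=3):
    """Contextual MSL1 sketch: voxelwise L1 plus L1 on pooled versions."""
    loss = 0.0
    for s in range(n_scales):
        loss = loss + F.l1_loss(pred, target)
        if s < n_scales - 1:                  # coarsen for the next scale
            pred = F.avg_pool3d(pred, 2)
            target = F.avg_pool3d(target, 2)
    return loss / n_scales
```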
Poster Session: Enhancement
Multi-grid nonlocal techniques for x-ray scatter correction
Yingying Gu, Jun Zhang, Ping Xue
In this work, we used nonlocal priors in a Bayesian approach for X-ray scatter correction. The control parameters of our algorithms such as the patch sizes and search areas were set in such a way that significant improvement in correction results can be achieved. This, however, led to drastic increases in computation time. To solve this problem, we developed a novel multi-grid technique based on some observations on the matching process involved in the nonlocal priors. Experimental results have demonstrated that this technique is effective, accelerating the computation time significantly while maintaining the quality of correction results. In addition to scatter correction, it can also be used for other image processing applications where fast high-dimensional nonlocal filtering is needed.
A denoising algorithm for CT image using low-rank sparse coding
Yang Lei, Dong Xu, Zhengyang Zhou, et al.
We propose a denoising method for CT images based on low-rank sparse coding. The proposed method constructs an adaptive dictionary of image patches and estimates the sparse coding regularization parameters using a Bayesian interpretation. A low-rank approximation approach is used to simultaneously construct the dictionary and achieve sparse representation through clustering of similar image patches. A variable-splitting scheme and a quadratic optimization are used to reconstruct the CT image from the achieved sparse coefficients. We tested this denoising method on phantom, brain and abdominal CT images. The experimental results show that the proposed method delivers state-of-the-art denoising performance, both in terms of objective criteria and visual quality.
A new method to reduce cone beam artifacts by optimal combination of FDK and TV-IR images
In a cone beam computed tomography (CT) system, the Feldkamp, Davis, and Kress (FDK) algorithm produces cone beam artifacts due to the missing cone region, which has insufficient object sampling in frequency space. While total variation minimization based iterative reconstruction (TV-IR) may reduce the cone beam artifacts by filling in the missing cone region, it introduces image blurring or increased noise depending on the regularization parameter. In this work, we propose a method to reduce cone beam artifacts by an optimal combination of FDK and TV-IR images. The method maintains the exactness of FDK by using FDK data outside the missing cone region and preserves the benefit of TV-IR in cone beam artifact reduction by using TV-IR data inside the missing cone region. For the evaluation, a Defrise disk phantom and a vertical plates phantom were used, and image quality was compared using the structural similarity (SSIM) index for different regularization parameters of TV-IR. The results show that both TV-IR and the proposed method were effective in cone beam artifact reduction, but the proposed method provided good image quality regardless of the regularization parameter.
CT artifact reduction via U-net CNN
C. Zhang, Y. Xing
Purpose: Our preliminary study showed the capability of a deep learning neural network (DLNN) based method to eliminate a specific type of artifact in CT images. This work comprehensively studies the applicability of a U-net CNN architecture to improving the image quality of CT reconstructions by testing its performance in various artifact removal tasks. Methods: A U-net architecture is trained on a large dataset of contaminated and expected image pairs. The expected images, known as reference images, are acquired from ground truths or using a superior imaging system. A proper initialization of network parameters, a careful normalization of the original data and a residual learning objective are incorporated into the framework to boost training convergence. Both numerical and real data studies were conducted to validate this method. Results: In the numerical studies, we found that DLNN-based artifact reduction is powerful and can work well in reducing nearly all types of artifacts and recovering detailed structural information in low-quality images (e.g. plain FBP reconstructions) if the network is trained with ground truths provided. In real situations where ground truth is not available, the proposed method can characterize the discrepancy between contaminated data and higher-quality reference labels produced by other techniques, mimicking their capability of reducing artifacts. Generalization to disjoint data was also examined using testing data. All results show that the DLNN framework can be applied to various artifact reduction tasks and outperforms conventional methods with a shorter runtime. Conclusion: This work obtained promising results, with the U-net architecture successfully characterizing both global and local artifact patterns. By forward propagating contaminated images through the trained network, undesired artifacts can be greatly reduced while structural information is maintained for an input CT image. It should be noted that the proposed deep network must be trained independently for each specific case.
Automatic segmentation of thoracic aorta segments in low-dose chest CT
Julia M. H. Noothout, Bob D. de Vos, Jelmer M. Wolterink, et al.
Morphological analysis and identification of pathologies in the aorta are important for cardiovascular diagnosis and risk assessment in patients. Manual annotation is time-consuming and cumbersome in CT scans acquired without contrast enhancement and with low radiation dose. Hence, we propose an automatic method to segment the ascending aorta, the aortic arch and the thoracic descending aorta in low-dose chest CT without contrast enhancement. Segmentation was performed using a dilated convolutional neural network (CNN), with a receptive field of 131 × 131 voxels, that classified voxels in axial, coronal and sagittal image slices. To obtain a final segmentation, the obtained probabilities of the three planes were averaged per class, and voxels were subsequently assigned to the class with the highest class probability. Two-fold cross-validation experiments were performed where ten scans were used to train the network and another ten to evaluate the performance. Dice coefficients of 0.83 ± 0.07, 0.86 ± 0.06 and 0.88 ± 0.05, and Average Symmetrical Surface Distances (ASSDs) of 2.44 ± 1.28, 1.56 ± 0.68 and 1.87 ± 1.30 mm were obtained for the ascending aorta, the aortic arch and the descending aorta, respectively. The results indicate that the proposed method could be used in large-scale studies analyzing the anatomical location of pathology and morphology of the thoracic aorta.
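The described fusion of the three per-plane classifiers reduces to averaging the class probabilities and taking a per-voxel arg-max, e.g. (assuming the three softmax volumes have been resampled to a common (n_classes, Z, Y, X) grid):

```python
import numpy as np

def fuse_planes(probs_axial, probs_coronal, probs_sagittal):
    """Average per-class probabilities over the three planes, then label
    each voxel with the most probable class."""
    mean_probs = (probs_axial + probs_coronal + probs_sagittal) / 3.0
    return np.argmax(mean_probs, axis=0)  # final label map, shape (Z, Y, X)
```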
Fast super-resolution with iterative-guided back projection for 3D MR images
Yutaro Iwamoto, Xian-Hua Han, Akihiko Shiino, et al.
Multimodal magnetic resonance images (e.g., T1-weighted images (T1WI) and T2-weighted images (T2WI)) are used for accurate medical image analysis. Images of different modalities have different resolutions, depending on the pulse sequence parameters chosen under limited data acquisition time. Therefore, interpolation methods are used to match the low-resolution (LR) image with the high-resolution (HR) image. However, interpolation causes blurring that affects analysis accuracy. Although some recent works such as the non-local-means (NLM) filter have demonstrated impressive super-resolution (SR) performance when an HR image of another modality is available, the filter has a high computational cost. Therefore, we propose a fast SR framework with iterative-guided back projection, which incorporates iterative back projection with a guided filter (GF), for resolution enhancement of LR images (e.g., T2WI) by referring to HR images of another modality (e.g., T1WI). The proposed method achieves both higher accuracy than conventional interpolation methods and the original GF, and computational efficiency, by applying an integral 3D image technique. Although the proposed method is visually slightly inferior in accuracy to the state-of-the-art NLM filter, it runs 22 times faster than the state-of-the-art method when expanding three times in the slice-select direction, from 180 × 216 × 60 voxels to 180 × 216 × 180 voxels; the computational time of our method is only about 1 min. Therefore, the proposed method can be applied to various applications in practice, including not only multimodal MR images but also multimodal image analysis with computed tomography (CT) and positron emission tomography (PET).
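A hedged sketch of the iterative-guided back projection idea: upsample the LR image, repeatedly back-project the residual of a simulated LR image, and regularize each iterate with a guided filter steered by the HR image of the other modality. The filter radius, iteration count, and resampling are assumptions, and the paper's integral-image acceleration is omitted.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from skimage.transform import resize

def guided_filter(I, p, r=4, eps=1e-3):
    """Basic guided filter (He et al.): smooth p while following edges of I."""
    box = lambda x: uniform_filter(x, size=2 * r + 1)
    mI, mp = box(I), box(p)
    a = (box(I * p) - mI * mp) / (box(I * I) - mI * mI + eps)
    b = mp - a * mI
    return box(a) * I + box(b)

def iterative_guided_bp(lr, hr_guide, n_iter=10):
    """Super-resolve `lr` to the shape of `hr_guide` (other-modality HR image)."""
    hr = resize(lr, hr_guide.shape, order=1, anti_aliasing=False)
    for _ in range(n_iter):
        simulated = resize(hr, lr.shape, order=1, anti_aliasing=True)
        residual = resize(lr - simulated, hr_guide.shape, order=1,
                          anti_aliasing=False)
        hr = guided_filter(hr_guide, hr + residual)  # guided regularization
    return hr
```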
Poster Session: Machine Learning
Automatic localization and segmentation of optical disk based on faster R-CNN and level set in fundus image
Defeng Zhang, Weifang Zhu, Heming Zhao, et al.
The processing and analysis of retinal fundus images is widely studied because many ocular fundus diseases, such as diabetic retinopathy and hypertensive retinopathy, can be diagnosed and treated based on the corresponding analysis results. The shape, border, size and pathological depression of the optic disc (OD), the main anatomical structure of the ocular fundus, are important auxiliary parameters for the diagnosis of fundus diseases, so precise localization and segmentation of the OD is important. Considering the excellent performance of deep learning in object detection and localization, an automatic OD localization and segmentation algorithm based on Faster R-CNN and a shape-constrained level set is presented in this paper. First, the Faster R-CNN+ZF model is used to locate the OD via a bounding box (B-box). Second, the main blood vessels in the B-box are removed using the Hessian matrix if necessary. Finally, a shape-constrained level set algorithm is used to segment the boundary of the OD. The localization algorithm was trained on 4000 images selected from Kaggle and tested on the MESSIDOR database. For OD localization, a mean average precision (mAP) of 99.9% was achieved, with an average time of 0.21 s per image. The segmentation algorithm was tested on 120 images randomly selected from the MESSIDOR database, achieving an average matching score of 85.4%.
Circle-like foreign element detection in chest x-rays using normalized cross-correlation and unsupervised clustering
Fatema T. Zohora, Sameer Antani, K. C. Santosh
The presence of foreign objects (buttons, medical devices) adversely impacts the performance of automated chest X-ray (CXR) screening. We present a novel image processing and machine learning technique to detect circle-like foreign elements in CXR images that helps avoid confusion in the automated detection of abnormalities, such as nodules and other calcifications. In our technique, we apply normalized cross-correlation using a few templates to collect potential circle-like elements and unsupervised clustering to make a decision. We validated our fully automatic technique on a set of 400 publicly available images hosted by LHNCBC, U.S. National Library of Medicine (NLM), National Institutes of Health (NIH). Our method achieved an accuracy greater than 90% and outperforms existing techniques reported in the literature.
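The template-matching stage can be sketched with scikit-image's normalized cross-correlation; the threshold and templates below are placeholders, and the subsequent unsupervised clustering that makes the final decision is not shown.

```python
import numpy as np
from skimage.feature import match_template

def circle_candidates(cxr, templates, threshold=0.5):
    """Collect (y, x, score) peaks of NCC against circular templates."""
    hits = []
    for t in templates:
        ncc = match_template(cxr, t, pad_input=True)  # scores in [-1, 1]
        ys, xs = np.where(ncc > threshold)
        hits += [(y, x, float(ncc[y, x])) for y, x in zip(ys, xs)]
    return hits
```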
Orientation regression in hand radiographs: a transfer learning approach
Ivo M. Baltruschat, Axel Saalbach, Mattias P. Heinrich, et al.
Most radiologists prefer an upright orientation of the anatomy in a digital X-ray image for consistency and quality reasons. In almost half of clinical cases, the anatomy is not upright oriented, which is why the images must be digitally rotated by radiographers. Earlier work has shown that automated orientation detection results in small error rates, but requires specially designed algorithms for individual anatomies. In this work, we propose a novel approach that overcomes time-consuming feature engineering by means of Residual Neural Networks (ResNets), which extract generic low-level and high-level features and provide promising solutions for medical imaging. Our method uses the learned representations to estimate the orientation via linear regression, and can be further improved by fine-tuning selected ResNet layers. The method was evaluated on 926 hand X-ray images and achieves a state-of-the-art mean absolute error of 2.79°.
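The transfer-learning recipe described, reusing pretrained features, regressing the angle with a linear head, and fine-tuning selected layers, might look like this in PyTorch/torchvision (assuming torchvision ≥ 0.13 for the `weights` argument; the choice of ResNet-18 and of which stage to unfreeze is illustrative):

```python
import torch.nn as nn
from torchvision import models

# Re-purpose an ImageNet-pretrained ResNet for angle regression.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 1)  # single output: the angle
# Fine-tune only the last residual stage and the new head; freeze the rest.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith(("layer4", "fc"))
# Note: the stem expects 3-channel input; grayscale X-rays can simply be
# replicated across the three channels before being fed to the model.
```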
Organ localization and identification in thoracic CT volumes using 3D CNNs leveraging spatial anatomic relations
In this paper, we present a model to obtain prior knowledge for organ localization in CT thorax images using three-dimensional convolutional neural networks (3D CNNs). Specifically, we use the knowledge obtained from CNNs in a Bayesian detector to establish the presence and location of a given target organ defined within a spherical coordinate system. We train a CNN to perform a soft detection of the target organ potentially present at any point x = [r, θ, φ]^T. This probability outcome is used as a prior in a Bayesian model whose posterior probability serves to provide a more accurate solution to the target organ detection problem. The likelihoods for the Bayesian model are obtained by performing a spatial analysis of the organs in annotated training volumes. Thoracic CT images from the NSCLC-Radiomics dataset are used in our case study, which demonstrates the enhancement in robustness and accuracy of organ identification. The average detector accuracies for the right lung, left lung, and heart were 94.87%, 95.37%, and 90.76% after the CNN stage, respectively. Introducing spatial relationships via a Bayes classifier improved the detector accuracies to 95.14%, 96.20%, and 95.15%, respectively, showing a marked improvement in heart detection. This workflow improves the detection rate since the decision is made employing both lower-level features (edges, contours, etc.) and complex higher-level features (spatial relationships between organs). This strategy also presents a new application of CNNs and a novel methodology for introducing higher-level context features, such as spatial relationships between objects present at different locations in images, into real-world object detection problems.
Automatic valve segmentation in cardiac ultrasound time series data
Yoni Dukler, Yurun Ge, Yizhou Qian, et al.
We consider the problem of automatically tracking the mitral valve in cardiac ultrasound time series and present an unsupervised method for decomposing and segmenting the mitral valve from noisy ultrasound videos. To do so we propose a Robust Nonnegative Matrix Factorization (RNMF) method that naturally decomposes the time series into three separate parts, highlighting the cardiac cycle, mitral valve, and ultrasound noise. The low rank component of RNMF captures the simple motions of the cardiac cycle effectively aside from the sporadic motion of the mitral valve tissue that is captured innately in our RNMF sparse signal term. Using the RNMF representation, we introduce a simple valve object detection algorithm. Our method performs especially well in noisy time series when existing methods fail, differentiating general noise from the subtle and complex motions of the mitral valve. The valve is then segmented using simple thresholding and diffusion. The method presented is highly robust to low quality ultrasound video, and does not require manual preprocessing, prior labeling, or any training data.
Transfer learning for diabetic retinopathy
Jeremy Benson, Hector Carrillo, Jeff Wigdahl, et al.
Diabetic Retinopathy (DR) is a leading cause of blindness worldwide and is estimated to threaten the vision of nearly 200 million by 2030. To cope with this ever-increasing population, the use of image processing algorithms to screen for those at risk has been on the rise. Research-oriented solutions have proven effective in classifying images with or without DR, but often fail to address the true need of the clinic: referring only those who need to be seen by a specialist, and reading every single case. In this work, we leverage an array of image pre-processing techniques, as well as transfer learning, to re-purpose an existing deep network for our tasks in DR. We train, test, and validate our system on 979 clinical cases, achieving a 95% Area Under the Curve (AUC) for referring severe DR, with an equal-error sensitivity and specificity of 90%. Our system does not reject any images based on their quality, and is agnostic in terms of eye side and field. These results show that general-purpose classifiers can, with the right type of input, have a major impact in clinical environments or for teams lacking access to large volumes of data or high-throughput supercomputers.
Extraction of brain tissue from CT head images using fully convolutional neural networks
Zeynettin Akkus, Petro M. Kostandy, Kenneth A. Philbrick, et al.
Removing non-brain tissues such as the skull, scalp and face from head computed tomography (CT) images is an important field of study in brain image processing applications. It is a prerequisite step in numerous quantitative imaging analyses of neurological diseases, as it improves the computational speed and accuracy of quantitative analyses and image coregistration. In this study, we present an accurate method based on fully convolutional neural networks (fCNN) to remove non-brain tissues from head CT images in a time-efficient manner. The method includes an encoding part, which has sequential convolutional filters that produce activation maps of the input image in a low-dimensional space, and a decoding part, consisting of convolutional filters that reconstruct the input image from the reduced representation. We trained the fCNN on 122 volumetric head CT images and tested it on 22 unseen volumetric head CT images against an expert's manual brain segmentation masks. The performance of our method on the test set is: Dice coefficient = 0.998 ± 0.001 (mean ± standard deviation), recall = 0.998 ± 0.001, precision = 0.998 ± 0.001, and accuracy = 0.9995 ± 0.0001. Our method extracts the complete volumetric brain from a head CT image in 2 s, which is much faster than previous methods. To the best of our knowledge, this is the first study using an fCNN to perform skull stripping of CT images. Our approach based on fCNNs provides accurate extraction of brain tissue from head CT images in a time-efficient manner.
Deep learning-based depth estimation from a synthetic endoscopy image training set
Colorectal cancer is the fourth leading cause of cancer deaths worldwide. The detection and removal of premalignant lesions through an endoscopic colonoscopy is the most effective way to reduce colorectal cancer mortality. Unfortunately, conventional colonoscopy has an almost 25% polyp miss rate, in part due to the lack of depth information and contrast of the surface of the colon. Estimating depth using conventional hardware and software methods is challenging in endoscopy due to limited endoscope size and deformable mucosa. In this work, we use a joint deep learning and graphical model-based framework for depth estimation from endoscopy images. Since depth is an inherently continuous property of an object, it can easily be posed as a continuous graphical learning problem. Unlike previous approaches, this method does not require hand-crafted features. Large amounts of augmented data are required to train such a framework. Since there is limited availability of colonoscopy images with ground-truth depth maps and colon texture is highly patient-specific, we generated training images using a synthetic, texture-free colon phantom to train our models. Initial results show that our system can estimate depths for phantom test data with a relative error of 0.164. The resulting depth maps could prove valuable for 3D reconstruction and automated Computer Aided Detection (CAD) to assist in identifying lesions.
Automatic lung ultrasound B-line recognition in pediatric populations for the detection of pneumonia
Pneumonic lung sonograms are known to include vertical comet-tail artifacts called B-lines. In this study, the potential of histogram properties of lung ultrasound images for the automatic identification of B-line artifacts is explored. Five histogram features (skewness, kurtosis, standard deviation, energy and average) were calculated for intercostal spaces. The sample consisted of 15 positively and 15 negatively diagnosed B-mode videos selected by a medical expert and captured in a local pediatric health institute. For each frame, an initial domain of interest (DOI) starting from the pleural line is automatically outlined. The pleura is detected by brightness-based thresholding. Smaller regions containing the intercostal spaces inside the DOI are then outlined and the histogram features are estimated. The classification potential of the features was evaluated independently, in pairs, and as the group of 5. For single-feature analysis, the optimal threshold was selected based on the ROC (receiver operating characteristic) curve. For studying features in pairs, a support vector machine (SVM) analysis using an RBF kernel was performed. Finally, for studying the five features together, PCA (principal component analysis) was used to determine the two principal components and apply an algorithm able to identify a B-line in the intercostal space. The results revealed that energy performed best as a discriminator when using a single feature, with 77% sensitivity, 75% specificity and 75% accuracy. When using features in pairs, average and skewness performed best, with 93% sensitivity, 86% specificity and 88% accuracy. Finally, analyzing the 5 features, the results were 100% sensitivity, 98% specificity and 98% accuracy.
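The five per-intercostal-space features can be computed with NumPy/SciPy; the bin count and the 'energy' definition (sum of squared bin probabilities) are assumptions consistent with common usage, not necessarily the study's exact definitions.

```python
import numpy as np
from scipy import stats

def histogram_features(region):
    """Skewness, kurtosis, standard deviation, energy, and average of the
    pixel intensities in one intercostal-space region."""
    x = region.ravel().astype(float)
    p = np.histogram(x, bins=64)[0] / x.size   # bin probabilities
    return {
        "skewness": float(stats.skew(x)),
        "kurtosis": float(stats.kurtosis(x)),
        "std": float(x.std()),
        "energy": float(np.sum(p ** 2)),       # assumed definition
        "average": float(x.mean()),
    }
```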
Brain decoding using deep convolutional network and its application in cross-subject analysis
Recent advances in functional magnetic resonance imaging (fMRI) techniques and machine learning have shown that it is possible to decode distinct brain states from complex brain activities, which has attracted widespread attention. Deep learning is a popular machine learning method and has achieved remarkable results in fields such as speech recognition and image recognition. However, there are many challenges in medical image analysis when using deep learning. Aiming to address the difficulties of subject-transfer decoding, high-dimensional feature extraction and slow computation, here we propose a deep convolutional decoding (DCD) model. First, a deep convolutional network architecture serves as a subject-transfer feature extractor on task fMRI (tfMRI) data. Then, the high-dimensional abstract features are used to identify specific brain cognitive states. The experimental results show that our proposed method can achieve higher decoding accuracy of brain states across different subjects compared with traditional methods.
Neural network fusion: a novel CT-MR aortic aneurysm image segmentation method
Duo Wang, Rui Zhang, Jin Zhu, et al.
Medical imaging examination of patients usually involves more than one imaging modality, such as Computed Tomography (CT), Magnetic Resonance (MR) and Positron Emission Tomography (PET) imaging. Multimodal imaging allows examiners to benefit from the advantages of each modality. For example, for abdominal aortic aneurysm, CT imaging shows calcium deposits in the aorta clearly, while MR imaging distinguishes thrombus and soft tissues better. Analyzing and segmenting both CT and MR images to combine the results will greatly help radiologists and doctors to treat the disease. In this work, we present methods for using deep neural network models to perform such multi-modal medical image segmentation.

As CT and MR images of the abdominal area cannot be well registered due to non-affine deformations, a naive approach is to train the CT and MR segmentation networks separately. However, such an approach is time-consuming and resource-inefficient. We propose a new approach that fuses the high-level parts of the CT and MR networks together, hypothesizing that neurons recognizing the high-level concepts of aortic aneurysm can be shared across modalities. Such a network can be trained end-to-end with non-registered CT and MR images in a shorter training time. Moreover, network fusion allows a shared representation of the aorta in both CT and MR images to be learnt. Through experiments we discovered that for parts of the aorta showing similar aneurysm conditions, the distances between their representations in the neural network are shorter. Such feature-level distances are helpful for registering CT and MR images.
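The fusion idea, modality-specific low-level layers feeding a shared high-level stage, can be sketched in PyTorch as follows; the 2D setting, layer sizes, and two-class output are illustrative, not the paper's architecture.

```python
import torch.nn as nn

class FusedSegNet(nn.Module):
    """Separate CT/MR encoders, shared high-level layers, separate heads."""
    def __init__(self, ch=16):
        super().__init__()
        def encoder():
            return nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.enc_ct, self.enc_mr = encoder(), encoder()
        self.shared = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
                                    nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.head_ct = nn.Conv2d(ch, 2, 1)   # background / aneurysm logits
        self.head_mr = nn.Conv2d(ch, 2, 1)

    def forward(self, x, modality):
        feat = self.enc_ct(x) if modality == "ct" else self.enc_mr(x)
        feat = self.shared(feat)             # fused high-level representation
        return self.head_ct(feat) if modality == "ct" else self.head_mr(feat)
```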
Spine centerline extraction and efficient spine reading of MRI and CT data
C. Lorenz, N. Vogt, P. Börnert, et al.
Radiological assessment of the spine is performed regularly in the context of orthopedics, neurology, oncology, and trauma management. Due to the extension and curved geometry of the spinal column, reading is time-consuming and requires substantial user interaction to navigate through the data during inspection. In this paper, a spine-geometry-guided viewing approach is proposed that facilitates reading by reducing the degrees of freedom to be manipulated during inspection of the data. The method uses the spine centerline as a representation of the spine geometry. We assume that the renderings most useful for reading are those that can be locally defined by a rotation and translation relative to the spine centerline. The resulting renderings locally preserve the relation to the spine and lead to curved planar reformats that can be adjusted using a small set of parameters to minimize user interaction. The spine centerline is extracted by an automated image-to-image foveal fully convolutional neural network (FFCN) based approach. The network consists of three parallel convolutional pathways working on different levels of resolution and processed fields of view. The outputs of the parallel pathways are combined by a subsequent feature-integration pathway to yield the final centerline probability map, which is converted into a set of spine centerline points. The network has been trained separately on two data set types, one comprising a mixture of T1- and T2-weighted spine MR images and one using CT image data. We achieve an average centerline position error of 1.7 mm for MR and 0.9 mm for CT and a DICE coefficient of 0.84 for MR and 0.95 for CT. Based on the centerline thus obtained, viewing and multi-planar reformatting can be easily facilitated.
Poster Session: Quantification and Modeling
Segmentation of subcutaneous fat within mouse skin in 3D OCT image data using random forests
Timo Kepp, Christine Droigk, Malte Casper, et al.
Cryolipolysis is a well-established cosmetic procedure for non-invasive local fat reduction. This technique selectively destroys subcutaneous fat cells using controlled cooling. Thickness measurements of subcutaneous fat were conducted using a mouse model. For detailed examination of mouse skin, optical coherence tomography (OCT) was performed, which is a non-invasive imaging modality. Due to the high number of image slices, manual delineation is not feasible and automatic segmentation algorithms are required. In this work, an algorithm for the automatic 3D segmentation of the subcutaneous fat layer is presented, which is based on a random forest classification followed by a graph-based refinement step. Our approach is able to accurately segment the subcutaneous fat layer with an overall average symmetric surface distance of 11.80±6.05 μm and a Dice coefficient of 0.921 ± 0.045. Furthermore, the graph-based refinement step was shown to increase the accuracy and robustness of the segmentation results of the random forest classifier.
Automatic detection of the inner ears in head CT images using deep convolutional neural networks
Cochlear implants (CIs) use electrode arrays that are surgically inserted into the cochlea to stimulate nerve endings to replace the natural electro-mechanical transduction mechanism and restore hearing for patients with profound hearing loss. Post-operatively, the CI needs to be programmed. Traditionally, this is done by an audiologist who is blind to the positions of the electrodes relative to the cochlea and relies on the patient’s subjective response to stimuli. This is a trial-and-error process that can be frustratingly long (dozens of programming sessions are not unusual). To assist audiologists, we have proposed what we call IGCIP, image-guided cochlear implant programming. In IGCIP, we use image processing algorithms to segment the intra-cochlear anatomy in pre-operative CT images and to localize the electrode arrays in post-operative CTs. We have shown that programming strategies informed by image-derived information significantly improve hearing outcomes for both adult and pediatric populations. We are now aiming at deploying these techniques clinically, which requires full automation. One challenge we face is the lack of standard image acquisition protocols. The content of the image volumes we need to process thus varies greatly, and visual inspection and labelling are currently required to initialize processing pipelines. In this work we propose a deep learning-based approach to automatically detect whether a head CT volume contains two ears, one ear, or no ear. Our approach has been tested on a data set that contains over 2,000 CT volumes from 153 patients, and we achieve an overall 95.97% classification accuracy.
Multiorgan structures detection using deep convolutional neural networks
Many automatic image analysis algorithms in medical imaging require a good initialization to work properly. A similar problem occurs in many imaging-based clinical workflows, which depend on anatomical landmarks. The localization of anatomic structures based on a defined context provides a solution to that problem, which turns out to be more challenging in medical imaging, where labeled images are difficult to obtain. We propose a two-stage process to detect and regress 2D bounding boxes of predefined anatomical structures based on a 2D surrounding context. First, we use a deep convolutional neural network (DCNN) architecture to detect the optimal slice where an anatomical structure is present, based on relevant landmark features. After this detection, we employ a similar architecture to perform a 2D regression that proposes a bounding box encompassing the structure. We trained and tested our system for 57 anatomical structures defined in axial, sagittal and coronal planes with a dataset of 504 labeled Computed Tomography (CT) scans. We compared our method with a well-known object detection algorithm (Viola-Jones) and with the inter-rater error for two human experts. Despite the relatively small number of scans and the exhaustive number of structures analyzed, our method obtained promising and consistent results, which suggests our architecture generalizes well to other anatomical structures.
Coupling reconstruction and motion estimation for dynamic MRI through optical flow constraint
Ningning Zhao, Daniel O'Connor, Wenbo Gu, et al.
This paper addresses the problems of dynamic magnetic resonance image (DMRI) reconstruction and motion estimation jointly. Because of the inherent anatomical movements in DMRI acquisition, reconstruction of DMRI using motion estimation/compensation (ME/MC) has been explored under the compressed sensing (CS) scheme. In this paper, by embedding the intensity-based optical flow (OF) constraint into the traditional CS scheme, we are able to couple DMRI reconstruction with motion vector estimation. Moreover, the OF constraint is employed at a specific coarse resolution scale in order to reduce the computational complexity. The resulting optimization problem is then solved using a primal-dual algorithm due to its efficiency when dealing with nondifferentiable problems. Experiments on highly accelerated dynamic cardiac MRI with multiple receiver coils validate the performance of the proposed algorithm.
Sinogram synthesis using convolutional-neural-network for sparsely view-sampled CT
Jongha Lee, Hoyeon Lee, Seungryong Cho
Reducing the number of projections in computed tomography (CT) has been exploited as a low-dose option in conjunction with advanced iterative image reconstruction algorithms. While such iterative image reconstruction methods do provide useful images and valuable insights into the inverse imaging problems, it is an intriguing question whether missing view projection data in the sinogram can be successfully recovered. Several approaches to interpolating the missing sinogram data have been reported. Deep-learning based super-resolution techniques in the field of natural image enhancement have recently been introduced and have shown promising results. Inspired by these super-resolution techniques, we earlier proposed a sinogram inpainting method that uses a convolutional neural network for sparsely view-sampled CT. Despite the encouraging initial results, our previously proposed method had two drawbacks: the measured sinogram was contaminated in the process of filling the missing sinogram by the deep learning network, and the sum of the interpolated sinogram along the detector row at each view angle was not preserved. In this study, we improve our previously developed deep-learning based sinogram synthesis method by adding new layers and modifying the size of the receptive field in the deep learning network to overcome these limitations. Quantitative evaluations of image accuracy and quality using real patients’ CT images show that the new approach synthesizes a more accurate sinogram and thus leads to higher CT image quality than the previous one.
High resolution robust and smooth precision matrices to capture functional connectivity
Nicolas Honnorat, Christos Davatzikos
Resting-state functional MRI (fMRI) provides crucial insight into brain organization by offering a means to measure the functional connectivity between brain regions. A popular measure, the effective functional connectivity, is derived from the precision matrix obtained by inverting the correlations between the fMRI signals of brain regions. This approach has been widely adopted to build brain connectomes for large populations. For small populations and single fMRI scans, however, the significant amount of noise in the fMRI scans reduces the quality of the precision matrices, and the non-invertibility of the correlation matrices calls for more sophisticated precision estimators. These issues are especially dramatic at full brain resolution. In this work, we investigate several approaches to improve full-resolution precision matrices derived from single fMRI scans. First, we compare three approaches for the computation of the correlation matrix. Then, we investigate two regularized inversions, in combination with a correlation shrinkage and two spatial smoothing strategies. In these experiments, the quality of precision matrices obtained for random fMRI half scans was measured by their generalization: their fit to the unseen time points. Our experiments, using ten high-resolution scans from the Human Connectome Project, indicate that correlation shrinkage strongly improves precision generalization. The two regularizations are associated with similar generalization. Smoothing the fMRI signal before the inversion deteriorates the generalization, whereas a penalty directly improving the smoothness of the precision matrix can improve the generalization, in particular for short time series and in combination with shrinkage.
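As one concrete instance of the shrinkage strategy evaluated here, a minimal numpy sketch (the shrinkage weight alpha is an assumption to be tuned on held-out time points, not a value from the paper):

```python
# Sketch: shrink the (possibly singular) correlation matrix toward the
# identity before inversion, guaranteeing an invertible estimate.
import numpy as np

def shrunk_precision(ts, alpha=0.1):
    """ts: (timepoints, regions) fMRI signal matrix."""
    z = (ts - ts.mean(0)) / ts.std(0)      # standardize each region's signal
    corr = (z.T @ z) / ts.shape[0]         # sample correlation matrix
    shrunk = (1 - alpha) * corr + alpha * np.eye(corr.shape[1])
    return np.linalg.inv(shrunk)           # invertible for any alpha > 0
```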
Hubs defined with participation coefficient metric altered following acute mTBI
Xiaocui Wang, Chuanzhu Sun, Shan Wang, et al.
Patients with mild traumatic brain injury (mTBI) may suffer from a widespread spectrum of symptoms that arise from damage to long-distance white matter connections in distributed brain networks. In brain network analysis, increasing attention has been devoted to assessing the functional roles of regions by estimating the spatial layout of their connections among different modules, using the participation coefficient. In the present study, we aimed to investigate the role of hubs in inter-subnetwork information coordination and integration after mTBI by using participation coefficients. 74 mTBI patients within 7 days post-injury and 51 matched healthy controls were enrolled in this study. Our results showed that hubs in mTBI patients were distributed across more extensive networks, such as the default mode network (DMN), ventral attention network (VAN), frontoparietal network (FPN), somatomotor network (SMN) and visual network (VN), whereas in healthy controls they were limited to the first three. Participation coefficients in mTBI were significantly decreased in the DMN (P=0.015) and FPN (P=0.02), and increased in the VN (P=0.035). An SVM trained with participation coefficient metrics was able to identify mTBI patients from controls with 78% accuracy, supporting its diagnostic potential in clinical settings. The differences between the two groups may reflect functional network reorganization in the mTBI group.
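For reference, the participation coefficient used here is the standard Guimerà-Amaral definition, P_i = 1 - Σ_m (k_im / k_i)², where k_im is node i's connection strength into module m and k_i its total strength. A short numpy sketch (array shapes are assumptions):

```python
# Sketch of the participation coefficient over a weighted connectivity matrix.
import numpy as np

def participation_coefficient(W, modules):
    """W: (n, n) weighted connectivity matrix; modules: (n,) module labels."""
    k = W.sum(axis=1)                            # total node strength
    P = np.ones(len(W))
    for m in np.unique(modules):
        k_m = W[:, modules == m].sum(axis=1)     # strength into module m
        P -= (k_m / k) ** 2
    return P   # 0 for nodes confined to one module, toward 1 for diverse hubs
```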
Aorta and pulmonary artery segmentation using optimal surface graph cuts in non-contrast CT
Zahra Sedghi Gamechi, Andres M. Arias-Lorza, Jesper Holst Pedersen, et al.
Accurate measurements of the size and shape of the aorta and pulmonary arteries are important as risk factors for cardiovascular diseases and for Chronic Obstructive Pulmonary Disease (COPD).1 The aim of this paper is to propose an automated method for segmenting the aorta and pulmonary arteries in low-dose non-ECG-gated non-contrast CT scans. Low contrast and a high noise level make automatic segmentation in such images a challenging task. In the proposed method, a minimum cost path tracking algorithm first traces the centerline between user-defined seed points. The cost function is based on a multi-directional medialness filter and a lumen intensity similarity metric. The vessel radius is also estimated from the medialness filter. The extracted centerlines are then smoothed and dilated non-uniformly according to the extracted local vessel radius and subsequently used as initialization for a graph-cut segmentation. The algorithm is evaluated on 225 low-dose non-ECG-gated non-contrast CT scans from a lung cancer screening trial. Quantitative analysis of 25 scans with full manual annotations yields a Dice overlap of 0.94±0.01 for the aorta and 0.92±0.01 for the pulmonary arteries. Qualitative validation by visual inspection of 200 scans shows successful segmentation in 93% of all cases for the aorta and 94% for the pulmonary arteries.
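The centerline tracing step can be illustrated with scikit-image's generic minimum-cost path routine; this is a sketch only, with `cost` standing in for the paper's combined medialness/intensity-similarity cost map (low inside the vessel):

```python
# Sketch: trace a minimum-cost centerline between two user-defined seeds
# through a precomputed 3D cost volume.
import numpy as np
from skimage.graph import route_through_array

def trace_centerline(cost, seed_a, seed_b):
    """cost: 3D array; seed_a, seed_b: (z, y, x) index tuples."""
    path, total_cost = route_through_array(
        cost, seed_a, seed_b, fully_connected=True, geometric=True)
    return np.array(path), total_cost
```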
Model based rib-cage unfolding for trauma CT
A CT rib-cage unfolding method is proposed that does not require determining rib centerlines; instead, it determines the visceral cavity surface by model-based segmentation. Image intensities are sampled across this surface, which is flattened using a model-based 3D thin-plate-spline registration. An average rib centerline model projected onto this surface serves as a reference system for the registration. The flattening registration is designed so that ribs similar to the centerline model are mapped onto parallel lines, preserving their relative length. Ribs deviating from this model accordingly appear as deviations from straight parallel ribs in the unfolded view. As the mapping is continuous, details in the intercostal space and those adjacent to the ribs are also rendered well. The most beneficial application area is trauma CT, where fast detection of rib fractures is a crucial task. Specifically in trauma, automatic rib centerline detection may not be guaranteed due to fractures and dislocations. Visual assessment on the large public LIDC database of lung CT demonstrated the general feasibility of this early work.
Thoracic lymph node station recognition on CT images based on automatic anatomy recognition with an optimal parent strategy
Many papers have been published on the detection and segmentation of lymph nodes from medical images. However, it remains a challenging problem owing to low contrast with surrounding soft tissues and the variations of lymph node size and shape on computed tomography (CT) images. This is particularly difficult on the low-dose CT of PET/CT acquisitions. In this study, we utilize our previous automatic anatomy recognition (AAR) framework to recognize the thoracic lymph node stations defined by the International Association for the Study of Lung Cancer (IASLC) lymph node map. The lymph node stations themselves are viewed as anatomic objects and are localized by using a one-shot method in the AAR framework. Two strategies have been taken in this paper for integration into the AAR framework. The first is to combine some lymph node stations into composite lymph node stations according to their geometric nearness. The other is to find the optimal parent (organ or union of organs) as an anchor for each lymph node station based on the recognition error, and thereby find an overall optimal hierarchy arranging anchor organs and lymph node stations. Based on 28 contrast-enhanced thoracic CT image data sets for model building and 12 independent data sets for testing, our results show that thoracic lymph node stations can be localized within 2-3 voxels of the ground truth.
Tapering analysis of airways with bronchiectasis
Kin Quan, Rebecca J. Shipley, Ryutaro Tanno, et al.
Bronchiectasis is the permanent dilation of airways. Patients with the disease can suffer recurrent exacerbations, reducing their quality of life. The gold standard to diagnose and monitor bronchiectasis is inspection of chest computed tomography (CT) scans. A clinician examines the broncho-arterial ratio to determine if an airway is bronchiectatic. The visual analysis assumes the blood vessel diameter remains constant, although this assumption is disputed in the literature. We propose a simple measurement of tapering along the airways to diagnose and monitor bronchiectasis. To this end, we constructed a pipeline to measure the cross-sectional area along the airways at contiguous intervals, starting from the carina to the most distal point observable. Using a phantom with calibrated 3D-printed structures, the precision and accuracy of our algorithm extend to the sub-voxel level. The tapering measurement is robust to bifurcations along the airway and was applied to chest CT images acquired in clinical practice. The result is a statistical difference in tapering rate between airways with bronchiectasis and controls.
Volumetric versus area-based density assessment: comparisons using automated quantitative measurements from a large screening cohort
Aimilia Gastounioti, Meng-Kang Hsieh, Lauren Pantalone, et al.
Mammographic density is an established risk factor for breast cancer. However, area-based density (ABD) measured in 2D mammograms considers the projection rather than the actual volume of dense tissue, which may be an important limitation. With the increasing utilization of digital breast tomosynthesis (DBT) in screening, there is an opportunity to routinely estimate volumetric breast density (VBD). In this study, we investigate associations between DBT-VBD and ABD extracted from standard-dose digital mammography (DM) and from synthetic 2D digital mammography (sDM), which is increasingly replacing DM. We retrospectively analyzed bilateral imaging data from a random sample of 1000 women, acquired over a transitional period at our institution when all women had DBT, sDM and DM acquired as part of their routine breast screening. For each exam, ABD was measured in DM and sDM images with the publicly available “LIBRA” software, while DBT-VBD was measured using a previously validated, fully-automated computer algorithm. Spearman correlation (r) was used to compare VBD to ABD measurements. For each density measure, we also estimated the within-woman intraclass correlation (ICC); finally, to compare to clinical assessments, we performed analysis of variance (ANOVA) to evaluate the variation across the assigned clinical BI-RADS breast density categories. DBT-VBD was moderately correlated with ABD from DM (r=0.70) and sDM (r=0.66). All density measures had strong bilateral symmetry (ICC = [0.85, 0.95]) but were significantly different across BI-RADS density categories (ANOVA, p<0.001). Our results contribute to further elaborating the clinical implications of breast density measures estimated with DBT, which may better capture the volumetric amount of dense tissue within the breast than area-based measures and visual assessment.
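For clarity, a small sketch of the two agreement statistics (the exact ICC variant used in the paper is not specified, so a one-way ICC(1,1) for bilateral symmetry is assumed here):

```python
# Sketch: Spearman correlation between paired density measures, and a
# one-way random-effects ICC(1,1) for left/right bilateral agreement.
import numpy as np
from scipy.stats import spearmanr

def icc_1(left, right):
    """One-way random-effects ICC(1,1) for paired bilateral measurements."""
    pairs = np.column_stack([left, right])
    n, k = pairs.shape
    ms_between = k * np.var(pairs.mean(axis=1), ddof=1)
    ms_within = ((pairs - pairs.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Spearman correlation between paired measures, e.g. (hypothetical arrays):
# r, p = spearmanr(vbd_values, abd_values)
```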
Subject-specific brain tumor growth modelling via an efficient Bayesian inference framework
Yongjin Chang, Gregory C. Sharp, Quanzheng Li, et al.
An accurate prediction of brain tumor progression is crucial for optimized treatment of the tumors. Gliomas are primarily treated by combining surgery, external beam radiotherapy, and chemotherapy. Among them, radiotherapy is a non-invasive and effective therapy, and an understanding of tumor growth will allow better therapy planning. In particular, estimating parameters associated with tumor growth, such as the diffusion coefficient and proliferation rate, is crucial to accurately characterize physiology of tumor growth and to develop predictive models of tumor infiltration and recurrence. Accurate parameter estimation, however, is a challenging task due to inaccurate tumor boundaries and the approximation of the tumor growth model. Here, we introduce a Bayesian framework for a subject-specific tumor growth model that estimates the tumor parameters effectively. This is achieved by using an improved elliptical slice sampling method based on an adaptive sample region. Experimental results on clinical data demonstrate that the proposed method provides a higher acceptance rate, while preserving the parameter estimation accuracy, compared with other state-of-the-art methods such as Metropolis-Hastings and elliptical slice sampling without any modification. Our approach has the potential to provide a method to individualize therapy, thereby offering an optimized treatment.
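For context, the unmodified elliptical slice sampling step (Murray et al., 2010) that serves as the baseline here can be sketched in a few lines; `log_lik` and `prior_sample` are placeholders standing in for the tumor growth model's likelihood and its Gaussian prior, not the paper's code:

```python
# Sketch of one standard elliptical slice sampling transition: propose on
# the ellipse through the current state and a prior draw, shrinking the
# angular bracket until the slice level is exceeded.
import numpy as np

def elliptical_slice(f, prior_sample, log_lik, rng=np.random):
    nu = prior_sample()                         # auxiliary draw ~ N(0, Sigma)
    log_y = log_lik(f) + np.log(rng.uniform())  # slice level
    theta = rng.uniform(0.0, 2 * np.pi)
    lo, hi = theta - 2 * np.pi, theta
    while True:
        f_new = f * np.cos(theta) + nu * np.sin(theta)
        if log_lik(f_new) > log_y:
            return f_new                        # accepted point on the ellipse
        if theta < 0:                           # shrink bracket toward 0
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)
```

The paper's contribution, adapting the sample region, modifies this bracketing step to raise the acceptance rate.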
Image-based assessment of uncertainty in quantification of carotid lumen
Lilli Kaufhold, Andreas Harloff, Christian Schumann, et al.
Measurements of the vessel lumen diameter are often used to determine the degree of atherosclerotic disease in carotid arteries. However, quantification results vary with imaging technique and acquisition settings. In this work, we aim to provide a tool that quantifies the lumen diameter on different image datasets and gives an estimate of the quantification uncertainties, so that they can be taken into consideration when evaluating and comparing measurements. For the segmentation of the vessel lumen we present an algorithm using ray-casting techniques and partial volume correction. We furthermore propose a scheme for analysis and exploration of the lumen diameter. Finally, we present a clinically relevant application scenario, in which we explore the agreement between lumen diameter estimates in corresponding CTA, CEMRA, TOF and subtraction images of carotid vessels with severe carotid atherosclerotic plaques.
Automated Agatston score computation in non-ECG gated CT scans using deep learning
Carlos Cano-Espinosa, Germán González, George R. Washko, et al.
Introduction: The Agatston score is a well-established metric of cardiovascular disease related to clinical outcomes. It is computed from CT scans by a) measuring the volume and intensity of the atherosclerotic plaques and b) aggregating this information in an index. Objective: To generate a convolutional neural network that takes a non-contrast chest CT scan as input and outputs the associated Agatston score directly, without a prior segmentation of Coronary Artery Calcifications (CAC). Materials and methods: We use a database of 5973 non-contrast non-ECG-gated chest CT scans for which the Agatston score has been manually computed. The heart in each scan is cropped automatically using an object detector. The database is split into 4973 cases for training and 1000 for testing. We train a 3D deep convolutional neural network to regress the Agatston score directly from the extracted hearts. Results: The proposed method yields a Pearson correlation coefficient of r = 0.93; p ≤ 0.0001 against the manual reference standard in the 1000 test cases. It further correctly stratifies 72.6% of the cases with respect to standard risk groups. This compares to more complex state-of-the-art methods based on prior segmentations of the CACs, which achieve r = 0.94 in ECG-gated pulmonary CT. Conclusions: A convolutional neural network can regress the Agatston score from the image of the heart directly, without a prior segmentation of the CACs. This is a new and simpler paradigm in Agatston score computation that yields results similar to the state-of-the-art literature.
Generative statistical modeling of left atrial appendage appearance to substantiate clinical paradigms for stroke risk stratification
The left atrial appendage (LAA) is the source of 91% of the thrombi in patients with atrial arrhythmias (~2.3 million US adults), turning this region into a potential source of stroke. LAA geometries have been clinically categorized into four appearance groups, viz. Cauliflower, Cactus, Chicken-Wing and WindSock, based on visual appearance in 3D volume visualizations of contrast-enhanced computed tomography (CT) imaging, and have further been correlated with stroke risk by considering clinical mortality statistics. However, such classification from visual appearance is limited by human subjectivity and is not sophisticated enough to address all the characteristics of the geometries. Quantification of LAA geometry metrics can provide a more repeatable and reliable estimate of the LAA characteristics that correspond with stasis risk, and in turn cardioembolic risk. We present an approach to quantify the appearance of the LAA in patients with atrial fibrillation (AF) using a weighted set of baseline eigen-modes of LAA appearance variation, as a means to objectify classification of patient-specific LAAs into the four accepted clinical appearance groups. Clinical images of 16 AF patients (4 per LAA appearance category) were identified and visualized as volume images. All the volume images were rigidly reoriented to be spatially co-registered, normalized in terms of intensity, resampled and finally reshaped appropriately to carry out principal component analysis (PCA), parametrizing the LAA region’s appearance based on principal components (eigen-modes) of greyscale appearance and generating 16 eigen-modes of appearance variation. Our pilot studies show that the most dominant LAA appearance (i.e., reconstructable using the fewest eigen-modes) resembles the Chicken-Wing class, which is known to have the lowest stroke risk per clinical mortality statistics. Our findings indicate the possibility that LAA geometries with high risk of stroke are higher-order statistical variants of underlying lower-risk shapes.
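A minimal sketch of the eigen-mode computation follows; the volume size is a placeholder, and the preprocessing (co-registration, intensity normalization, resampling) is assumed already done:

```python
# Sketch: flatten each preprocessed LAA volume to a vector and run PCA to
# obtain eigen-modes of greyscale appearance variation.
import numpy as np
from sklearn.decomposition import PCA

volumes = np.random.rand(16, 64, 64, 64)   # placeholder for 16 LAA volumes
X = volumes.reshape(16, -1)                 # one row per patient

pca = PCA(n_components=15)                  # at most n_samples - 1 modes
weights = pca.fit_transform(X)              # per-patient mode weights
modes = pca.components_                     # eigen-modes of appearance

# A patient whose volume is reconstructable with a small residual from few
# modes is "dominant" in the sense used by the abstract.
```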
Feature analysis of high SUV regions based on FDG-PET
Y. Torigoe, S. Oshiro, T. Tetsuya, et al.
In this study, we propose methods for extracting high-SUV regions from FDG-PET images and perform feature analysis of the cancers. Since FDG accumulates in large amounts in cancer cells, FDG-PET makes it possible to image areas suspected of cancer. SUV indicates the degree of FDG accumulation in the body and is one of the significant criteria for diagnosis; therefore, extraction of high-SUV regions is considered to be effective. To extract them, we calculate the curvatures of the four-dimensional hyper-surface of the FDG-PET images. This calculation yields three curvatures, which express structural properties such as linear shape and degree of isolation. We confirm these features using phantom data and anatomical images. Then, we extract high-SUV regions based on these curvatures of the FDG-PET images. However, since FDG remains not only in cancer cells but also in the brain, cardiac muscle, bladder, and so on, high-SUV regions cannot simply be labeled malignant. Therefore, we perform feature analysis on the extracted regions and evaluate them quantitatively from the viewpoint of functional and morphological indicators. As functional indicators, we use the average, maximum and variance of SUV; as morphological indicators, we use the degree of sphericity and the average of the third curvature. In this paper, we apply the above methods to six cancer cases and analyze the features unique to cancer.
Relating regional characteristics of left atrial shape to presence of scar in patients with atrial fibrillation
Soroosh Sanatkhani, Michael Oladosu, Karandeep Chera, et al.
Pulmonary vein isolation (PVI) is an established procedure for atrial fibrillation (AF) patients. Pre-procedural screening is necessary prior to PVI in order to reduce the likelihood of AF recurrence and improve the overall success rate of the procedure. However, current reliable methods to determine AF triggers are invasive. In this paper, we present an approach to relate regional characteristics of left atrial (LA) shape to the existence of low-voltage areas (LVA), which indicate the presence of scar in invasive exams. A cohort of 29 patient-specific clinical AF images were each segmented into 3D surface bodies representing the LA. An iterative-closest-point-based similarity transformation was used to find the best-fit sphere for each patient-specific LA, and the mean deviation of the LA wall from this sphere was determined using a signed point-to-surface regional distance metric. Regional departure from the best-fit sphere was reduced to a metric of global LA sphericity. Next, the LA was divided into six regions to perform an analysis of regional sphericity. This analysis revealed that the sphericity of the inferior-posterior LA region was related to several clinical variables, including a direct correlation with body mass index (BMI) and an inverse correlation with left ventricular ejection fraction (EF), suggesting a diseased heart that has dilated asymmetrically. Our observations therefore demonstrate promise for being leveraged as a non-invasive patient selection tool to increase the success rate of PVI procedures.
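The geometric core can be illustrated with a simple algebraic least-squares sphere fit (a simpler stand-in for the ICP-based similarity fit used in the paper) and the signed point-to-sphere distance:

```python
# Sketch: fit a sphere to LA surface vertices by linear least squares, then
# measure each vertex's signed departure from that sphere.
import numpy as np

def fit_sphere(pts):
    """pts: (n, 3) LA surface vertices. Returns (center, radius)."""
    # |x|^2 = 2 c.x + (r^2 - |c|^2) is linear in (c, r^2 - |c|^2).
    A = np.hstack([2 * pts, np.ones((len(pts), 1))])
    b = (pts ** 2).sum(axis=1)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = w[:3]
    radius = np.sqrt(w[3] + center @ center)
    return center, radius

def signed_distance(pts, center, radius):
    """Positive outside the best-fit sphere, negative inside."""
    return np.linalg.norm(pts - center, axis=1) - radius
```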
Automatic anatomy recognition using neural network learning of object relationships via virtual landmarks
Fengxia Yan, Jayaram K. Udupa, Yubing Tong, et al.
The recently developed body-wide Automatic Anatomy Recognition (AAR) methodology depends on fuzzy modeling of individual objects, hierarchically arranging objects, constructing an anatomy ensemble of these models, and a dichotomous object recognition–delineation process. The parent-to-offspring spatial relationship in the object hierarchy is crucial in the AAR method. We have found this relationship to be quite complex, and as such any improvement in capturing this relationship information in the anatomy model will improve the process of recognition itself. Currently, the method encodes this relationship based on the layout of the geometric centers of the objects. Motivated by the concept of virtual landmarks (VLs), this paper presents a new one-shot AAR recognition method that utilizes the VLs to learn object relationships by training a neural network to predict the pose and the VLs of an offspring object given the VLs of the parent object in the hierarchy. We set up two neural networks for each parent-offspring object pair in a body region, one for predicting the VLs and another for predicting the pose parameters. The VL-based learning/prediction method is evaluated on two object hierarchies involving 14 objects. We utilize 54 computed tomography (CT) image data sets of head and neck cancer patients and the associated object contours drawn by dosimetrists for routine radiation therapy treatment planning. The VL neural network method is found to yield more accurate object localization than the currently used simple AAR method.
Training classifiers with limited data using the Radon cumulative distribution transform
Cailey E. Fitzgerald, Liam Cattell, Gustavo K. Rohde
Our purpose in this study is to investigate whether a recently introduced image transform, denoted the Radon cumulative distribution transform (R-CDT), can serve as a viable preprocessing step for augmenting the robustness of end-to-end systems trained with fewer samples. To assess the ability of the R-CDT to perform this aim, we used a standard machine learning dataset, MNIST, and a preliminary dataset of liver cell nuclei images derived from one of two tissue types: benign or malignant tumor lesions. We separated the data into training and testing sets, with 20% of the total data used for testing across all training set size conditions. To simulate a range of limited training set sizes, we randomly generated data subsets ranging in size from 80% to 0.8% of the total dataset to be used for training. Linear classification algorithms were implemented via logistic regression and a support vector machine model with a linear kernel, on both the raw images and images transformed via the R-CDT. Additionally, non-linear classification accuracies were assessed by comparing the R-CDT paired with a shallow CNN against a deep CNN used to classify images. Results indicate that classification in Radon cumulative distribution transform space outperforms classification in image space under conditions of limited data, as one is likely to see in medical imaging.
Poster Session: Registration
Enhancement of breast periphery region in digital mammography
Ana Luiza Menegatti Pavan, Antoine Vacavant, Andre Petean Trindade, et al.
Volumetric breast density has been shown to be one of the strongest risk factors for breast cancer. This metric can be estimated using digital mammograms. During mammography acquisition, the breast is compressed and part of it loses contact with the paddle, resulting in an uncompressed peripheral region with thickness variation. Therefore, reliable density estimation in the breast periphery is a problem, which affects the accuracy of volumetric breast density measurement. The aim of this study was to enhance the breast periphery to address the problem of thickness variation. Herein, we present an automatic algorithm to correct breast periphery thickness without changing pixel values in the internal breast region. The corrected pixel values in the periphery were based on mean values over iso-distance lines from the breast skin-line, using only adipose tissue information. The algorithm automatically detects the periphery region where thickness should be corrected, and a correction factor is applied to the breast periphery to enhance the region. We also compare our contribution with two other algorithms from the state of the art and show its accuracy by means of different quality measures. Experienced radiologists subjectively evaluated the resulting images from the three methods in relation to the original mammogram. The mean pixel value, skewness and kurtosis of the histograms of the three methods were used as comparison metrics. As a result, the methodology presented herein proved to be a good approach to perform before calculating volumetric breast density.
Fast diffeomorphic image registration via GPU-based parallel computing: an investigation of the matching cost function
Jiong Wu, Xiaoying Tang
Large deformation diffeomorphic metric mapping (LDDMM) is one of the state-of-the-art deformable image registration algorithms and has been shown to be of superior performance, especially for brain images. LDDMM was originally proposed for matching intra-modality images, with the Sum of Squared Difference (SSD) being used as the matching cost function. Extension of LDDMM to other types of matching cost functions has been very limited. In this paper, we systematically evaluated three different matching cost functions, the SSD, Mutual Information (MI), and Cross Correlation (CC), in the LDDMM-image setting, based on 14 subcortical and ventricular structures in a total of 120 pairs of brain images. In addition, we proposed an efficient implementation of those three LDDMM-image settings via GPU-based parallel computing and quantitatively compared it with the standard open source implementation of LDDMM-SSD in terms of both registration accuracy and computational time. The proposed parallelization and optimization approach resulted in a 28-fold acceleration relative to the standard open source implementation, on a 4-core machine with a GTX970 card (29.67 mins versus 828.35 mins on average), without sacrificing registration accuracy. Comparing the three matching cost functions, we observed that LDDMM-CC worked best in terms of registration accuracy, obtaining Dice overlaps larger than 0.853 for a majority of the structures of interest.
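For reference, the three matching costs can be written compactly for flattened image arrays; this is a hedged numpy sketch (sign conventions and the MI bin count are assumptions, and LDDMM itself of course optimizes derivatives of these costs, not just their values):

```python
# Sketch of the three matching cost functions compared in the paper.
import numpy as np

def ssd(a, b):
    return np.mean((a - b) ** 2)

def cc(a, b):                       # normalized cross correlation
    a, b = a - a.mean(), b - b.mean()
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def mi(a, b, bins=64):              # mutual information via joint histogram
    h, _, _ = np.histogram2d(a, b, bins=bins)
    p = h / h.sum()
    px, py = p.sum(1, keepdims=True), p.sum(0, keepdims=True)
    nz = p > 0
    return np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz]))
```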
Group-wise shape correspondence of variable and complex objects
Ilwoo Lyu, Jonathan Perdomo, Gabriel S. Yapuncich, et al.
We present a group-wise shape correspondence method for analyzing variable and complex objects in a population study. The proposed method begins with the standard spherical harmonics (SPHARM) point distribution models (PDM) and their spherical mappings. For complex and variable objects, the SPHARM correspondence based on equal-area spherical mapping is imperfect. For such objects, we present here a novel group-wise correspondence. As an example dataset, we use 12 second mandibular molars representing 6 living or fossil euarchontan species. To improve the initial correspondence of the SPHARM-PDM representation, we first apply a rigid transformation to each subject using five well-known landmarks (molar cusps). We further enhance the correspondence by optimizing landmarks (local) and a multidimensional geometric property (global) over each subject with the spherical harmonic representation. The resulting average shape model better captures sharp landmark representation in quantitative evaluation, as well as a clear separation of the different species, compared with the SPHARM-PDM method.
Poster Session: Segmentation
Student beats the teacher: deep neural networks for lateral ventricles segmentation in brain MR
Ventricular volume and its progression are known to be linked to several brain diseases such as dementia and schizophrenia. Therefore, accurate measurement of ventricle volume is vital for longitudinal studies of these disorders, making automated ventricle segmentation algorithms desirable. In the past few years, deep neural networks have been shown to outperform classical models in many imaging domains. However, the success of deep networks depends on manually labeled data sets, which are expensive to acquire, especially for higher-dimensional data in the medical domain. In this work, we show that deep neural networks can be trained on much-cheaper-to-acquire pseudo-labels (e.g., generated by other, less accurate automated methods) and still produce segmentations more accurate than the labels themselves. To show this, we use noisy segmentation labels generated by a conventional region growing algorithm to train a deep network for lateral ventricle segmentation. Then, on a large manually annotated test set, we show that the network significantly outperforms the conventional region growing algorithm that was used to produce its training labels. Our experiments report a Dice Similarity Coefficient (DSC) of 0.874 for the trained network compared to 0.754 for the conventional region growing algorithm (p < 0.001).
Fully convolutional neural networks improve abdominal organ segmentation
Abdominal image segmentation is a challenging yet important clinical problem. Variations in body size, position, and relative organ positions greatly complicate the segmentation process. Historically, multi-atlas methods have achieved leading results across imaging modalities and anatomical targets. However, deep learning is rapidly overtaking classical approaches for image segmentation. Recently, Zhou et al. showed that fully convolutional networks produce excellent results in abdominal organ segmentation of computed tomography (CT) scans. Yet, deep learning approaches have not been applied to whole-abdomen magnetic resonance imaging (MRI) segmentation. Herein, we evaluate the applicability of an existing fully convolutional neural network (FCNN) designed for CT imaging to segment abdominal organs on T2-weighted (T2w) MRIs with two examples. In the primary example, we compare a classical multi-atlas approach with the FCNN on forty-five T2w MRIs acquired from splenomegaly patients with five organs labeled (liver, spleen, left kidney, right kidney, and stomach). Thirty-six images were used for training while nine were used for testing. The FCNN resulted in a Dice similarity coefficient (DSC) of 0.930 in spleens, 0.730 in left kidneys, 0.780 in right kidneys, 0.913 in livers, and 0.556 in stomachs. The performance measures for livers, spleens, right kidneys, and stomachs were significantly better than multi-atlas (p < 0.05, Wilcoxon rank-sum test). In a secondary example, we compare the multi-atlas approach with the FCNN on 138 distinct T2w MRIs with manually labeled pancreases (one label). On the pancreas dataset, the FCNN resulted in a median DSC of 0.691 in pancreases versus 0.287 for multi-atlas. The results are highly promising, given the relatively limited training data and the absence of task-specific training of the FCNN model, and illustrate the potential of deep learning approaches to transcend imaging modalities.
Multi-class segmentation of neuronal electron microscopy images using deep learning
Nivedita Khobragade, Chirag Agarwal
Study of the connectivity of neural circuits is an essential step towards a better understanding of the functioning of the nervous system. With recent improvements in imaging techniques, high-resolution and high-volume images are being generated, requiring automated segmentation techniques. We present a pixel-wise classification method based on the Bayesian SegNet architecture. We carried out multi-class segmentation on serial section Transmission Electron Microscopy (ssTEM) images of the Drosophila third instar larva ventral nerve cord, labeling the four classes of neuron membranes, neuron intracellular space, mitochondria and glia/extracellular space. Bayesian SegNet was trained using 256 ssTEM images of 256 x 256 pixels and tested on 64 different ssTEM images of the same size, from the same serial stack. Due to high class imbalance, we used a class-balanced version of Bayesian SegNet by re-weighting each class based on its relative frequency. We achieved an overall accuracy of 93% and a mean class accuracy of 88% for pixel-wise segmentation using this encoder-decoder approach. Evaluating the segmentation results using similarity metrics such as SSIM and the Dice coefficient, we obtained scores of 0.994 and 0.886, respectively. Additionally, we used the network trained on the 256 ssTEM images of Drosophila third instar larva for multi-class labeling of the ISBI 2012 challenge ssTEM dataset.
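The class re-weighting can be sketched as median-frequency balancing, the scheme commonly paired with SegNet-style training (the paper's exact weighting may differ):

```python
# Sketch: per-class loss weights from relative pixel frequency, so rare
# classes (e.g., mitochondria) receive larger loss weight.
import numpy as np

def class_weights(label_maps, n_classes=4):
    """label_maps: iterable of integer label arrays with values 0..n_classes-1."""
    counts = np.zeros(n_classes)
    for lab in label_maps:
        counts += np.bincount(lab.ravel(), minlength=n_classes)
    freq = counts / counts.sum()        # assumes every class appears at least once
    return np.median(freq) / freq       # weight_c = median(freq) / freq_c
```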
Automatic segmentation of fibroglandular tissue in breast MRI using anatomy-driven three-dimensional spatial context
Dong Wei, Susan Weinstein, Meng-Kang Hsieh, et al.
The relative amount of fibroglandular tissue (FGT) in the breast has been shown to be a risk factor for breast cancer. However, automatic segmentation of FGT in breast MRI is challenging, due mainly to its wide variation in anatomy (e.g., amount, location and pattern) and various imaging artifacts, especially the prevalent bias-field artifact. Motivated by a previous work demonstrating improved FGT segmentation with a 2-D a priori likelihood atlas, we propose a machine learning-based framework using 3-D FGT context. The framework uses features specifically defined with respect to the breast anatomy to capture the spatially varying likelihood of FGT, and allows (a) intuitive standardization across breasts of different sizes and shapes, and (b) easy incorporation of additional information helpful to the segmentation (e.g., texture). Extending the concept of the 2-D atlas, our framework not only captures the spatial likelihood of FGT in 3-D context, but also broadens its applicability to both sagittal and axial breast MRI, rather than being limited to the plane in which the 2-D atlas is constructed. Experimental results showed improved segmentation accuracy over the 2-D atlas method, and demonstrated further improvement from incorporating well-established texture descriptors.
Extraction of breast lesions from ultrasound imagery: Bhattacharyya gradient flow approach
Mahsa Torkaman, Romeil Sandhu, Allen Tannenbaum
Breast cancer is one of the most commonly diagnosed neoplasms among American women and the second leading cause of death among women all over the world. In order to reduce the mortality rate and cost of treatment, early diagnosis and treatment are essential. Accurate and reliable diagnosis is required in order to ensure the most effective treatment, and a second opinion is often advisable. In this paper, we address the problem of breast lesion detection from ultrasound imagery by means of active contours, whose evolution is driven by maximizing the Bhattacharyya distance1 between probability density functions (PDFs). The proposed method was applied to ultrasound breast imagery, and the lesion boundary was obtained by maximizing the distance-based energy functional such that the maximum (optimal contour) is attained at the boundary of the potential lesion. We compared the results of the proposed method quantitatively, using the Dice coefficient (similarity index),2 to the well-known GrowCut segmentation method3 and demonstrated that the Bhattacharyya approach outperforms GrowCut in most cases.
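The driving quantity is the Bhattacharyya distance between the intensity PDFs inside and outside the evolving contour; a minimal histogram-based sketch follows (bin count and intensity range are assumptions, and the gradient flow on this energy is beyond a short example):

```python
# Sketch: Bhattacharyya distance between two empirical intensity PDFs.
import numpy as np

def bhattacharyya_distance(pixels_in, pixels_out, bins=64, rng=(0, 255)):
    p, _ = np.histogram(pixels_in, bins=bins, range=rng)
    q, _ = np.histogram(pixels_out, bins=bins, range=rng)
    p, q = p / p.sum(), q / q.sum()
    bc = np.sum(np.sqrt(p * q))        # Bhattacharyya coefficient in [0, 1]
    return -np.log(bc + 1e-12)         # large when the two PDFs separate well
```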
Coupled dictionary learning for joint MR image restoration and segmentation
To achieve better segmentation of MR images, image restoration is typically used as a preprocessing step, especially for low-quality MR images. Recent studies have demonstrated that dictionary learning methods can achieve promising performance for both image restoration and image segmentation. These methods typically learn paired dictionaries of image patches from different sources and use a common sparse representation to characterize paired image patches, such as low-quality image patches and their high-quality counterparts for image restoration, and image patches and their corresponding segmentation labels for image segmentation. Since learning these dictionaries jointly in a unified framework may improve image restoration and segmentation simultaneously, we propose a coupled dictionary learning method to concurrently learn dictionaries for joint image restoration and image segmentation based on sparse representations in a multi-atlas image segmentation framework. In particular, three dictionaries, including a dictionary of low-quality image patches, a dictionary of high-quality image patches, and a dictionary of segmentation label patches, are learned in a unified framework so that the learned dictionaries for image restoration and segmentation can benefit each other. Our method has been evaluated for segmenting the hippocampus in MR T1 images collected with scanners of different magnetic field strengths. The experimental results demonstrate that our method achieved better image restoration and segmentation performance than state-of-the-art dictionary learning and sparse representation based image restoration and image segmentation methods.
Exudate segmentation using fully convolutional neural networks and inception modules
Piotr Chudzik, Somshubra Majumdar, Francesco Caliva, et al.
Diabetic retinopathy (DR) is an eye disease associated with diabetes mellitus and the leading cause of preventable blindness in the working-age population. Early detection and treatment of DR are essential to prevent vision loss. Exudates are one of the earliest signs of diabetic retinopathy. This paper proposes an automatic method for the detection and segmentation of exudates in fundus photographs. A novel fully convolutional neural network architecture with Inception modules is proposed. Compared to other methods, it does not require the removal of other anatomical structures. Furthermore, a transfer learning approach is applied between small datasets of different modalities from the same domain. To the best of the authors’ knowledge, it is the first time that such an approach has been used in the exudate segmentation domain. The proposed method was evaluated using the publicly available E-Ophtha datasets. It achieved better results than the state-of-the-art methods in terms of sensitivity and specificity metrics. The proposed algorithm also accomplished better results in a diseased/not-diseased evaluation scenario, which indicates its applicability for screening purposes. The simplicity, performance, efficiency and robustness of the proposed method demonstrate its suitability for diabetic retinopathy screening applications.
Deformable model reconstruction of the subarachnoid space
Jeffrey Glaister, Muhan Shao, Xiang Li, et al.
The subarachnoid space is a layer in the meninges that surrounds the brain and is filled with trabeculae and cerebrospinal fluid. Quantifying the volume and thickness of the subarachnoid space is of interest in order to study the pathogenesis of neurodegenerative diseases and compare with healthy subjects. We present an automatic method to reconstruct the subarachnoid space with subvoxel accuracy using a nested deformable model. The method initializes the deformable model using the convex hull of the union of the outer surfaces of the cerebrum, cerebellum and brainstem. A region force is derived from the subject’s T1-weighted and T2-weighted MRI to drive the deformable model to the outer surface of the subarachnoid space. The proposed method is compared to a semi-automatic delineation from the subject’s T2-weighted MRI and an existing multi-atlas-based method. A small pilot study comparing the volume and thickness measurements in a set of age-matched subjects with normal pressure hydrocephalus and healthy controls is presented to show the efficacy of the proposed method.
Convolutional neural network based automatic plaque characterization for intracoronary optical coherence tomography images
Shenghua He, Jie Zheng, Akiko Maehara, et al.
Optical coherence tomography (OCT) can provide high-resolution cross-sectional images for analyzing superficial plaques in coronary arteries. Commonly, plaque characterization using intra-coronary OCT images is performed manually by expert observers. This manual analysis is time-consuming and its accuracy heavily relies on the experience of the human observers. Traditional machine learning based methods, such as the least squares support vector machine and random forest methods, have recently been employed to automatically characterize plaque regions in OCT images. Several processing steps, including feature extraction, informative feature selection, and final pixel classification, are commonly used in these traditional methods. Therefore, the final classification accuracy can be jeopardized by error or inaccuracy within each of these steps. In this study, we propose a convolutional neural network (CNN) based method to automatically characterize plaques in OCT images. Unlike traditional methods, our method uses the image as a direct input and performs classification as a single-step process. Experiments on 269 OCT images showed that the average prediction accuracy of the CNN-based method was 0.866, indicating great promise for clinical translation.
Sequential neural networks for biologically informed glioma segmentation
In the last five years, advances in processing power and computational efficiency of graphics processing units have catalyzed dozens of deep neural network segmentation algorithms for a variety of target tissues and malignancies. However, few of these algorithms preconfigure any biological context for their chosen segmentation tissues, instead relying on the neural network’s optimizer to develop such associations de novo. We present a novel method for applying deep neural networks to the problem of glioma tissue segmentation that takes into account the structured nature of gliomas: edematous tissue surrounding mutually exclusive regions of enhancing and non-enhancing tumor. We trained separate deep neural networks with a 3D U-Net architecture in a tree structure to create segmentations for edema, non-enhancing tumor, and enhancing tumor regions. Specifically, training was configured such that the whole tumor region including edema was predicted first, and its output segmentation was fed as input into separate models to predict enhancing and non-enhancing tumor. We trained our model on publicly available pre- and post-contrast T1 images, T2 images, and FLAIR images, and validated our trained model on patient data from an ongoing clinical trial.
A hybrid segmentation method for partitioning the liver based on 4D DCE-MR images
Tian Zhang, Zhiyi Wu, Jurgen H. Runge, et al.
The Couinaud classification of hepatic anatomy partitions the liver into eight functionally independent segments. Detection and segmentation of the hepatic vein (HV), portal vein (PV) and inferior vena cava (IVC) play an important role in the subsequent delineation of the liver segments. To facilitate pharmacokinetic modeling of the liver based on the same data, a 4D DCE-MR scan protocol was selected. This yields images with high temporal resolution but low spatial resolution. Since the liver’s vasculature consists of many tiny branches, segmentation of these images is challenging. The proposed framework starts with registration of the 4D DCE-MRI series, followed by region growing from manually annotated seeds in the main branches of key blood vessels in the liver. It calculates the Pearson correlation between the time intensity curves (TICs) of a seed and all voxels. A maximum correlation map for each vessel is obtained by combining the correlation maps of all branches of the same vessel through a per-voxel maximum selection. The maximum correlation map is incorporated in a level set scheme to individually delineate the main vessels. Subsequently, the eight liver segments are segmented based on three vertical intersecting planes fit through the three skeleton branches of the HV and the IVC’s center of mass, as well as a horizontal plane fit through the skeleton of the PV. Our delineation of the vessels is more accurate than the results of two state-of-the-art techniques on five subjects in terms of the average symmetric surface distance (ASSD) and modified Hausdorff distance (MHD). Furthermore, the proposed liver partitioning achieves large overlap with manual reference segmentations (expressed as Dice coefficient) in all but a small minority of segments (mean values between 87% and 94% for segments 2-8). The lower mean overlap for segment 1 (72%) is due to the limited spatial resolution of our DCE-MR scan protocol.
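The correlation step lends itself to a compact, vectorized sketch: the Pearson correlation between one seed's time-intensity curve and every voxel's TIC in the registered 4D series (a sketch, not the authors' implementation):

```python
# Sketch: Pearson correlation map between a seed TIC and all voxel TICs.
import numpy as np

def correlation_map(series, seed):
    """series: (t, z, y, x) registered DCE-MRI; seed: (z, y, x) index tuple."""
    t = series.shape[0]
    X = series.reshape(t, -1)
    X = X - X.mean(axis=0)                          # center each voxel's TIC
    s = series[(slice(None),) + tuple(seed)]
    s = s - s.mean()                                # center the seed TIC
    num = X.T @ s
    den = np.linalg.norm(X, axis=0) * np.linalg.norm(s) + 1e-12
    return (num / den).reshape(series.shape[1:])    # one correlation per voxel
```

Per-voxel maximum selection over the maps of all branches of a vessel then yields the maximum correlation map described above.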
A new medical image segmentation model based on fractional order differentiation and level set
Bo Chen, Shan Huang, Feifei Xie, et al.
Segmenting medical images is still a challenging task for both traditional local and global methods because of image intensity inhomogeneity. This paper makes two contributions: (i) a new hybrid model is proposed for medical image segmentation, built on fractional order differentiation, a level set description and curve evolution; and (ii) three popular definitions of fractional order differentiation, the Fourier-domain, Grünwald-Letnikov (G-L) and Riemann-Liouville (R-L) definitions, are investigated and compared through experimental results. Because fractional order differentiation enhances the high-frequency features of images while preserving their low-frequency features in a nonlinear manner, one of these definitions is used in our hybrid model to segment inhomogeneous images. The proposed hybrid model also integrates fractional order differentiation, fractional order gradient magnitude and difference image information. The widely used Dice similarity coefficient metric is employed to evaluate the segmentation results quantitatively. Firstly, experimental results demonstrated that only a slight difference exists among the Fourier-domain, G-L and R-L expressions of fractional order differentiation. This outcome supports our selection of one of the three definitions in our hybrid model. Secondly, further experiments compared our hybrid segmentation model with other existing segmentation models; our hybrid model showed a noticeable gain in segmenting intensity-inhomogeneous images.
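Of the three definitions, the Grünwald-Letnikov form is the most directly discretizable; a 1D sketch with unit spacing follows (the truncation length is an assumption):

```python
# Sketch: discrete G-L fractional derivative of order alpha, using the
# coefficient recursion c_0 = 1, c_k = c_{k-1} * (k - 1 - alpha) / k,
# which equals (-1)^k * binomial(alpha, k).
import numpy as np

def gl_fractional_derivative(f, alpha, n_terms=20):
    """f: 1D signal; returns the order-alpha G-L derivative (spacing h = 1)."""
    f = np.asarray(f, dtype=float)
    c = np.empty(n_terms)
    c[0] = 1.0
    for k in range(1, n_terms):
        c[k] = c[k - 1] * (k - 1 - alpha) / k
    out = np.zeros_like(f)
    for k in range(n_terms):
        out[k:] += c[k] * f[:len(f) - k]   # out[x] = sum_k c_k * f[x - k]
    return out
```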
Automatic PET cervical tumor segmentation by deep learning with prior information
Liyuan Chen, Chenyang Shen, Shulong Li, et al.
Cervical tumor segmentation on 3D 18F-FDG PET images is a challenging task due to the proximity between the cervix and the bladder. Since the bladder accumulates a high concentration of 18F-FDG tracer, its intensity is similar to that of the cervical tumor in the PET image, which prevents traditional segmentation methods based on intensity variation from achieving high accuracy. We propose a supervised machine learning method that integrates a convolutional neural network (CNN) with prior information about the cervical tumor. In the proposed prior information constraint CNN (PIC-CNN) algorithm, we first construct a CNN to weaken the bladder intensity in the image. Based on the roundness of the cervical tumor and the relative position of the bladder and cervix, we then obtain the final segmentation from the network output by an auto-thresholding method. We evaluate the performance of the proposed PIC-CNN method on PET images from 50 cervical cancer patients whose cervix and bladder are abutting. The PIC-CNN method achieves a mean DSC of 0.84, whereas a transfer learning method based on fully convolutional networks (FCN) achieves 0.77. Traditional segmentation methods such as automatic thresholding and region growing only achieve DSC values of 0.59 and 0.52, respectively. The proposed method thus provides a more accurate way to segment cervical tumors in 3D PET images.
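The sketch below illustrates one plausible reading of the roundness-driven auto-thresholding step: scan candidate thresholds on a 2D probability slice and keep the connected component closest to a disk (roundness = 4πA/P²). The threshold grid and the restriction to 2D are assumptions, and the paper's bladder-cervix relative-position prior is omitted here for brevity.

```python
import numpy as np
from skimage import measure

def roundness(region):
    """4*pi*area / perimeter^2: equals 1 for a perfect disk, smaller otherwise."""
    if region.perimeter == 0:
        return 0.0
    return 4.0 * np.pi * region.area / region.perimeter ** 2

def auto_threshold_roundest(prob, thresholds=np.linspace(0.3, 0.9, 13)):
    """Scan candidate thresholds on a 2D tumor-probability slice and return
    the binary mask of the single roundest connected component found."""
    best_mask, best_score = None, -1.0
    for t in thresholds:
        labels = measure.label(prob > t)
        for region in measure.regionprops(labels):
            score = roundness(region)
            if score > best_score:
                best_score = score
                best_mask = labels == region.label
    return best_mask
```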
Automated segmentation of cellular images using an effective region force
Understanding the behaviour of cells is an important problem for biologists. Significant research has been done to facilitate this by automating the segmentation of microscopic cellular images. Bright-field images of cells are particularly difficult to segment due to features such as low contrast, missing boundaries and broken halos. In this paper, we present two algorithms for automated segmentation of cellular images. Both are based on a graph-partitioning approach in which each pixel is modelled as a node of a weighted graph. The method combines an effective region force with the Laplacian and the total variation boundary forces, respectively, to give the two models. The region force can be interpreted as the conditional probability of a pixel belonging to a certain class (cell or background) given a small set of already labelled pixels. For practicality, we use only a small set of background pixels from the border of the cell images as the labelled set. Both algorithms are tested on bright-field images and give good results. Owing to its faster performance, the Laplacian-based algorithm is also tested on a variety of other datasets, including fluorescent images, phase-contrast images, and 2D and 3D simulated images. The results show that the algorithm performs well and consistently across a range of cell image features such as shape, size, contrast and noise level.
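A minimal sketch of the Laplacian-plus-region-force idea follows: a 4-connected pixel graph with Gaussian intensity weights, and a quadratic region term that pulls each pixel toward its prior probability, solved as a sparse linear system. The exact energy, the region-force construction from border pixels, and the total-variation variant from the paper are not reproduced; the quadratic form and parameter values here are assumptions.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def grid_laplacian(img, beta=100.0):
    """Sparse graph Laplacian of a 2D image, 4-connected pixels,
    Gaussian intensity weights w_ij = exp(-beta * (I_i - I_j)^2)."""
    h, w = img.shape
    idx = np.arange(h * w).reshape(h, w)
    flat = img.ravel().astype(np.float64)
    pairs = [(idx[:, :-1].ravel(), idx[:, 1:].ravel()),   # horizontal edges
             (idx[:-1, :].ravel(), idx[1:, :].ravel())]   # vertical edges
    a = np.concatenate([p[0] for p in pairs])
    b = np.concatenate([p[1] for p in pairs])
    wgt = np.exp(-beta * (flat[a] - flat[b]) ** 2)
    W = sp.coo_matrix((np.concatenate([wgt, wgt]),
                       (np.concatenate([a, b]), np.concatenate([b, a]))),
                      shape=(h * w, h * w)).tocsr()
    return sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W

def segment_with_region_force(img, prior, gamma=0.5):
    """Minimize x^T L x + gamma * ||x - prior||^2 over per-pixel scores x,
    where `prior` holds foreground probabilities (the region force)."""
    L = grid_laplacian(img)
    A = (L + gamma * sp.identity(img.size)).tocsc()
    x = spsolve(A, gamma * prior.ravel())
    return (x > 0.5).reshape(img.shape)
```

The smoothness term discourages label changes across similar-intensity neighbours, while the region force anchors the solution to the class probabilities, which is the division of labour the abstract describes.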
Improved stability of whole brain surface parcellation with multi-atlas segmentation
Whole brain segmentation and cortical surface parcellation are essential in understanding the brain’s anatomical-functional relationships. Multi-atlas segmentation has been regarded as one of the leading methods for whole brain segmentation. In our recent work, the multi-atlas technique was adapted to surface reconstruction using a method called Multi-atlas CRUISE (MaCRUISE). MaCRUISE not only performs consistent volume-surface analyses but is also more robust than FreeSurfer. However, MaCRUISE did not provide a detailed surface parcellation, which hindered region of interest (ROI) based analyses on surfaces. Herein, the MaCRUISE surface parcellation (MaCRUISEsp) method is proposed to parcellate the inner, central and outer surfaces reconstructed by MaCRUISE. MaCRUISEsp parcellates each of these surfaces with 98 cortical labels using a volume segmentation based surface parcellation (VSBSP), followed by a topological correction step. To validate the performance of MaCRUISEsp, 21 scan-rescan magnetic resonance imaging (MRI) T1 volume pairs from the Kirby21 dataset were used for a reproducibility analysis. MaCRUISEsp achieved a median Dice Similarity Coefficient (DSC) of 0.948 for central surfaces. Meanwhile, FreeSurfer achieved 0.905 DSC for inner surfaces and 0.881 DSC for outer surfaces, while the proposed method achieved 0.929 DSC for inner surfaces and 0.835 DSC for outer surfaces. Qualitatively the results are encouraging, but they are not directly comparable, as the two approaches use different definitions of cortical labels.
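Since DSC is the reproducibility metric here (and throughout this session), a minimal sketch of its computation on binary masks and per parcellation label may be useful; the function names and the dict-based per-label form are illustrative conventions, not from the paper.

```python
import numpy as np

def dice(a, b):
    """Dice Similarity Coefficient between two binary masks (or label indicator
    arrays over surface vertices): 2|A ∩ B| / (|A| + |B|)."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom > 0 else 1.0

def per_label_dice(parc1, parc2, labels):
    """Per-label DSC between two parcellations, e.g. scan vs. rescan
    with 98 cortical labels; the median over labels summarizes agreement."""
    return {lab: dice(parc1 == lab, parc2 == lab) for lab in labels}
```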
Feature extraction using convolutional neural networks for multi-atlas based image segmentation
Multi-atlas based image segmentation (MAIS) in conjunction with pattern recognition based label fusion strategies has achieved promising performance in a variety of image segmentation problems, including hippocampus segmentation from MR images. Pattern recognition based label fusion consists of an image feature extraction component and a pattern recognition component. Since feature extraction plays an important role in this kind of label fusion, a variety of feature extraction methods have been proposed, including texture features and random projection features. However, these methods are not adaptive to different segmentation problems. Following the success of convolutional neural networks in image feature extraction, we propose a feature extraction method based on convolutional neural networks for multi-atlas based image segmentation. The proposed method has been validated on 135 T1-weighted magnetic resonance imaging (MRI) scans and their hippocampus labels provided by the EADC-ADNI harmonized segmentation protocol. We also compared our method with state-of-the-art pattern recognition based MAIS methods, including Local Label Learning and Random Local Binary Patterns. The experimental results demonstrate that our method achieves competitive hippocampus segmentation performance compared with the alternatives.
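To illustrate how learned features can drive label fusion, the sketch below pairs a small 3D patch encoder with a similarity-weighted vote over atlas patches. The architecture, the Gaussian similarity kernel, and all names are hypothetical stand-ins; the paper's actual network and fusion rule are not reproduced here.

```python
import torch
import torch.nn as nn

class PatchFeatureNet(nn.Module):
    """Toy 3D CNN mapping an image patch to a feature vector
    (illustrative architecture, not the authors' network)."""
    def __init__(self, feat_dim=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, patch):                 # patch: (B, 1, D, H, W)
        return self.fc(self.conv(patch).flatten(1))

def fuse_labels(target_feat, atlas_feats, atlas_labels, sigma=1.0):
    """Weight each atlas patch's label by feature similarity to the target
    patch and return a soft foreground vote for the target voxel.

    target_feat  : (feat_dim,)   feature of the target patch
    atlas_feats  : (n, feat_dim) features of corresponding atlas patches
    atlas_labels : (n,)          binary labels of the atlas center voxels
    """
    d2 = ((atlas_feats - target_feat) ** 2).sum(dim=1)       # squared distances
    w = torch.softmax(-d2 / (2 * sigma ** 2), dim=0)          # Gaussian weights
    return (w * atlas_labels.float()).sum()
```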
Random walk based optic chiasm localization using multi-parametric MRI for patients with pituitary adenoma
Min Sun, Zhiqiang Zhang M.D., Chiyuan Ma M.D., et al.
The relative position of the optic chiasm and the pituitary adenoma affects the pattern and severity of visual field defects, the most common and earliest-onset visual disability induced by this kind of tumor. In this paper we describe an interactive method to localize the optic chiasm from multi-parametric magnetic resonance imaging (MRI) data using a combined random walk algorithm. In the optic chiasm extraction framework, the modified random walk segmentation integrates information from T1-weighted (T1W) and T2-weighted (T2W) three-dimensional (3-D) MRI data into the energy formulation to deduce the probabilities that voxels belong to the foreground and background. To avoid including wrong regions in the object, we design a threshold-based region detection method that segments the optic chiasm from the probability map. The proposed method was tested on T1W and T2W MRI data from 16 patients diagnosed with pituitary adenoma. Experimental results show that the proposed method segments the optic chiasm effectively and accurately, assisting the diagnosis and treatment of patients with pituitary tumors.
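A rough approximation of the multi-parametric idea can be sketched with scikit-image's stock random walker, stacking co-registered T1W and T2W volumes as channels so the edge weights combine both contrasts; this is a stand-in for the paper's modified formulation, and the seed labels, `beta` and probability threshold are assumptions. Note `channel_axis` requires scikit-image >= 0.19 (older versions used `multichannel=True`).

```python
import numpy as np
from skimage.segmentation import random_walker

def chiasm_random_walk(t1, t2, labels, beta=130, thresh=0.95):
    """Multi-parametric random walk segmentation of the optic chiasm.

    t1, t2 : co-registered 3D T1W / T2W volumes
    labels : int array; 0 = unlabeled, 1 = chiasm seeds, 2 = background seeds
    """
    data = np.stack([t1, t2], axis=-1).astype(np.float64)
    prob = random_walker(data, labels, beta=beta, mode='cg',
                         channel_axis=-1, return_full_prob=True)
    # prob[0] is the probability map for label 1 (chiasm); keeping only
    # high-probability voxels mimics the threshold-based region detection.
    return prob[0] > thresh
```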
Advanced two-layer level set with a soft distance constraint for dual surfaces segmentation in medical images
Yuanbo Ji, Rob J. van der Geest, Saman Nazarian, et al.
Anatomical objects in medical images very often have dual contours or surfaces that are highly correlated. Manually segmenting both of them by following local image details is tedious and subjective. In this study, we propose a two-layer region-based level set method with a soft distance constraint, which not only regularizes the level set evolution at two levels, but also effectively imposes prior information on wall thickness. By updating the level set function and the distance constraint functions alternately, the method simultaneously optimizes both contours while regularizing their distance. The method was applied to segment the inner and outer walls of both the left atrium (LA) and the left ventricle (LV) from MR images, using a rough initialization from inside the blood pool. Compared to manual annotations from experienced observers, the proposed method achieved an average perpendicular distance (APD) of less than 1 mm for the LA segmentation, and less than 1.5 mm for the LV segmentation, at both inner and outer contours. The method can be used as a practical tool for fast and accurate dual wall annotation given proper initialization.
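The core coupling can be illustrated with a simplified sketch: two signed level set functions for the inner and outer walls evolved alternately, with a quadratic penalty pulling their gap toward a target thickness d0. This is a loose simplification; the paper uses a single two-layer level set function with separate distance constraint functions, and the data forces, step size and penalty weight below are assumptions.

```python
import numpy as np

def distance_constraint_force(phi_inner, phi_outer, d0, lam=0.1):
    """Soft wall-thickness prior: quadratic penalty lam * (phi_outer -
    phi_inner - d0)^2, returned as gradient-descent forces on each function."""
    gap = phi_outer - phi_inner - d0
    return lam * gap, -lam * gap          # forces on phi_inner, phi_outer

def evolve_step(phi_inner, phi_outer, data_force_in, data_force_out,
                d0=3.0, lam=0.1, dt=0.5):
    """One alternating update: region-based data forces (e.g. from a
    Chan-Vese-style term, computed elsewhere) plus the distance constraint."""
    f_in, f_out = distance_constraint_force(phi_inner, phi_outer, d0, lam)
    phi_inner = phi_inner + dt * (data_force_in + f_in)
    phi_outer = phi_outer + dt * (data_force_out + f_out)
    return phi_inner, phi_outer
```

Because the penalty is soft rather than a hard equality, the two contours can still deviate locally where the image evidence demands it, which is what makes the thickness prior usable across varying anatomy.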