Proceedings Volume 10949

Medical Imaging 2019: Image Processing


Purchase the printed version of this volume at proceedings.com or access the digital version at SPIE Digital Library.

Volume Details

Date Published: 17 June 2019
Contents: 14 Sessions, 121 Papers, 52 Presentations
Conference: SPIE Medical Imaging 2019
Volume Number: 10949

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 10949
  • Image Reconstruction and Synthesis
  • Deep Learning: Segmentation
  • Image Enhancement and Modeling
  • Brain: Shapes and Biomarkers
  • fMRI and DTI
  • Keynote and Highlights
  • Machine Learning for Clinical Prediction
  • Classification
  • Cardiac Imaging
  • Registration and Motion
  • Deep Learning: Lesions and Pathologies
  • OCT and Microscopy
  • Poster Session
Front Matter: Volume 10949
Front Matter: Volume 10949
This PDF file contains the front matter associated with SPIE Proceedings Volume 10949 including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
Image Reconstruction and Synthesis
Self-consistent deep learning-based boosting of 4D cone-beam computed tomography reconstruction
Frederic Madesta, Tobias Gauer, Thilo Sentker, et al.
Inter-fractional magnitude and trajectory changes are of great importance for radiotherapy (RT) of moving targets. In order to verify the amount and characteristics of patient-specific respiratory motion prior to each RT treatment session, a time-resolved cone-beam computed tomography (4D CBCT) scan is necessary. However, due to sparse-view artifacts, the resulting image quality is limited when applying current 4D CBCT reconstruction approaches. In this study, a new deep learning-based boosting approach for 4D CBCT reconstruction is presented that does not rely on any a-priori information (e.g. 4D CT images) and is applicable to arbitrary reconstruction algorithms. It is shown that the overall image quality is significantly improved after boosting; in particular, sparse-view sampling artifacts are suppressed.
Image-domain multi-material decomposition for dual-energy CT with non-convex sparsity regularization
Dual-energy CT (DECT) has the potential to decompose tissues into different materials. However, the classic direct inversion (DI) method for multi-material decomposition (MMD) cannot accurately separate more than two basis materials due to the ill-posed problem and amplified image noise. We proposed a novel integrated MMD method that addresses the piecewise smoothness and intrinsic sparsity properties of the decomposition image. The proposed MMD was formulated as an optimization problem including a quadratic data fidelity term, an isotropic total variation term that encourages image smoothness, and a non-convex penalty function that promotes decomposition image sparseness. The mass and volume conservation rules were formulated as a probability simplex constraint. An accelerated primal-dual splitting approach with line search was applied to solve the optimization problem. The proposed method with different penalty functions was compared against DI on a digital phantom, a Catphan 600 phantom, a Quantitative Imaging phantom, and a pelvis patient. The proposed framework distinctly separated the CT image into up to 12 basis materials plus air with high decomposition accuracy. Cross-talk between different materials was substantially reduced, as shown by the decreased off-diagonal elements of the Normalized Cross Correlation (NCC) matrix. The mean square error of the measured electron densities was reduced by 72.6%. Across all datasets, the proposed method improved the average Volume Fraction (VF) accuracy from 63.9% to 99.8% and increased the diagonality of the NCC matrix from 0.73 to 0.96. Compared with DI, the proposed MMD framework improved decomposition accuracy and material separation.
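The probability simplex constraint on the per-voxel volume fractions (mass and volume conservation) can be enforced with a standard Euclidean projection. Below is a minimal NumPy sketch of the sort-based projection of Duchi et al.; it illustrates the constraint only and is not the paper's accelerated primal-dual solver.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of a vector of material volume fractions onto
    the probability simplex {x : x >= 0, sum(x) = 1} (sort-based algorithm)."""
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * idx > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

# Example: raw fractions from an unconstrained update are mapped back onto
# the simplex so they stay non-negative and sum to one.
print(project_to_simplex(np.array([0.7, 0.5, -0.1])))  # -> [0.6, 0.4, 0.0]
```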
Non-learning based deep parallel MRI reconstruction (NLDpMRI)
Fast data acquisition in Magnetic Resonance Imaging (MRI) is in great demand, and scan time directly depends on the number of acquired k-space samples. Recently, deep learning-based MRI reconstruction techniques have been suggested to accelerate MR image acquisition. The most common issues in deep learning-based MRI reconstruction approaches are generalizability and transferability. For different MRI scanner configurations, these approaches require the network to be trained from scratch every time with a new training dataset, acquired under the new configuration, to be able to provide good reconstruction performance. Here, we propose a new generalized parallel imaging method based on deep neural networks called NLDpMRI to reduce any structured aliasing ambiguities related to the different k-space undersampling patterns for accelerated data acquisition. Two loss functions, non-regularized and regularized, are proposed for parallel MRI reconstruction using deep network optimization, and we reconstruct MR images by optimizing the proposed loss functions over the network parameters. Unlike other deep learning-based MRI reconstruction approaches, our method does not include a training step in which the network learns from a large number of training samples; it only needs the single undersampled multi-coil k-space dataset for reconstruction. Also, the proposed method can handle k-space data with different undersampling patterns and different numbers of coils. Experimental results show that the proposed method outperforms the current state-of-the-art GRAPPA method and the deep learning-based variational network method.
Unpaired whole-body MR to CT synthesis with correlation coefficient constrained adversarial learning
MR to CT image synthesis plays an important role in medical image analysis, and its applications include, but are not limited to, PET-MR attenuation correction and MR-only radiation therapy planning. Recently, deep learning-based image synthesis techniques have achieved much success. However, most of the current methods require large amounts of paired data from two different modalities, which greatly limits their usage, as in some situations paired data are infeasible to obtain. Some efforts have been proposed to relax this constraint, such as cycle-consistent adversarial networks (Cycle-GAN). However, the cycle consistency loss is an indirect structural similarity constraint between the input and synthesized images, and it can lead to inferior synthesized results. To overcome this challenge, a novel correlation coefficient loss is proposed to directly enforce the structural similarity between the MR and synthesized CT images, which can not only improve the representation capability of the network but also guarantee the structural consistency between MR and synthesized CT images. In addition, to overcome the problem of large variance in whole-body mapping, we use a multi-view adversarial learning scheme to combine the complementary information along different directions to provide more robust synthesized results. Experimental results demonstrate that our method can achieve better MR to CT synthesis results both qualitatively and quantitatively with unpaired MR and CT images compared with state-of-the-art methods.
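The correlation coefficient loss described above can be read as one minus the Pearson correlation between the input MR and the synthesized CT. The PyTorch sketch below illustrates that reading; the paper's exact formulation and its multi-view weighting are not reproduced here.

```python
import torch

def correlation_coefficient_loss(mr, synth_ct, eps=1e-8):
    """1 - Pearson correlation between two images, used as a direct
    structural-similarity constraint between the MR input and the
    synthesized CT (illustrative sketch)."""
    x = mr.flatten() - mr.mean()
    y = synth_ct.flatten() - synth_ct.mean()
    r = (x * y).sum() / (x.norm() * y.norm() + eps)
    return 1.0 - r
```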
Iterative reconstruction for low dose CT using Plug-and-Play alternating direction method of multipliers (ADMM) framework
Concerns over the risks of radiation dose from diagnostic CT motivated the utilization of low dose CT (LdCT). However, due to the extremely low X-ray photon statistics in LdCT, the reconstruction problem is ill-posed and noise-contaminated. Conventional Compressed Sensing (CS) methods have been investigated to enhance the signal-to-noise ratio of LdCT at the cost of image resolution and low contrast object visibility. In this work, we adapted a flexible, iterative reconstruction framework, termed Plug-and-Play (PnP) alternating direction method of multipliers (ADMM), that incorporates state-of-the-art denoising algorithms into model-based image reconstruction. The PnP ADMM framework combines a least-squares data fidelity term with a regularization term for image smoothness and is solved through ADMM. An off-the-shelf image denoiser, the Block-Matching 3D-transform shrinkage (BM3D) filter, is plugged in to substitute an ADMM module. The PnP ADMM was evaluated on low dose scans of an ACR 464 phantom and two lung screening datasets and compared with the Filtered Back Projection (FBP), the Total Variation (TV), the BM3D post-processing method, and the BM3D regularization method. The proposed framework distinguished the line pairs at 9 lp/cm resolution on the ACR phantom and the fissure line in the left lung, resolving the same or better image details than FBP reconstruction of higher dose scans with up to 18 times less dose. Compared with conventional iterative reconstruction methods resulting in comparable image noise, the proposed method is significantly better at recovering image details and improving low contrast conspicuity.
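A generic Plug-and-Play ADMM loop has the structure sketched below: the proximal step of the regularizer is replaced by an off-the-shelf denoiser such as BM3D, while the data-fidelity subproblem couples the reconstruction to the measurements. The sketch assumes user-supplied forward/back projectors `A`/`At` and a `denoise(v, sigma)` callable, and approximates the x-subproblem with a few gradient steps rather than the paper's exact solver.

```python
import numpy as np

def pnp_admm(y, A, At, denoise, rho=1.0, step=1e-3, n_iter=50):
    """Plug-and-Play ADMM sketch for x ~ argmin 0.5*||A(x) - y||^2 + R(x),
    where prox_R is replaced by a plugged-in denoiser (e.g. BM3D)."""
    x = At(y)                                   # initial back-projection
    v = x.copy()
    u = np.zeros_like(x)
    for _ in range(n_iter):
        # x-update: data fidelity + quadratic coupling, a few gradient steps
        for _ in range(5):
            grad = At(A(x) - y) + rho * (x - v + u)
            x = x - step * grad
        # v-update: regularizer prox replaced by the denoiser
        v = denoise(x + u, sigma=np.sqrt(1.0 / rho))
        # dual update
        u = u + x - v
    return x
```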
Deep Learning: Segmentation
Two-level training of a 3D U-Net for accurate segmentation of the intra-cochlear anatomy in head CTs with limited ground truth training data
Cochlear implants (CIs) use electrode arrays that are surgically inserted into the cochlea to treat patients with hearing loss. For CI recipients, sound bypasses the natural transduction mechanism and directly stimulates the neural regions, thus creating a sense of hearing. Post-operatively, CIs need to be programmed. Traditionally, this is done by an audiologist who is blind to the positions of the electrodes relative to the cochlea and relies only on the subjective response of the patient. Multiple programming sessions are usually needed, which can take a frustratingly long time. We have developed an image-guided cochlear implant programming (IGCIP) system to facilitate the process. In IGCIP, we segment the intra-cochlear anatomy and localize the electrode arrays in the patient's head CT image. By utilizing their spatial relationship, we can suggest programming settings that can significantly improve hearing outcomes. To segment the intra-cochlear anatomy, we use an active shape model (ASM)-based method. Though it produces satisfactory results in most cases, sub-optimal segmentations still occur. As an alternative, herein we explore using a deep learning method to perform the segmentation task. Large image sets with accurate ground truth (in our case manual delineation) are typically needed to train a deep learning model for segmentation, but such a dataset does not exist for our application. To tackle this problem, we use segmentations generated by the ASM-based method to pre-train the model and fine-tune it on a small image set for which accurate manual delineation is available. Using this method, we achieve better results than the ASM-based method.
Improving splenomegaly segmentation by learning from heterogeneous multi-source labels
Splenomegaly segmentation on abdominal computed tomography (CT) scans is essential for identifying spleen biomarkers and has applications for quantitative assessment in patients with liver and spleen disease. Deep convolutional neural network automated segmentation has shown promising performance for splenomegaly segmentation. However, manual labeling of abdominal structures is resource intensive, so labeled abdominal imaging data are a rare resource despite their essential role in algorithm training. Hence, the number of annotated labels (e.g., spleen only) is typically limited within a single study. However, with the development of data sharing techniques, more and more publicly available labeled cohorts are available from different sources. A key new challenge is to co-learn from the multi-source data, even with different numbers of labeled abdominal organs in each study. Thus, it is appealing to design a co-learning strategy to train a deep network from heterogeneously labeled scans. In this paper, we propose a new deep convolutional neural network (DCNN) based method that integrates heterogeneous multi-source labeled cohorts for splenomegaly segmentation. To enable the proposed approach, a novel loss function is introduced based on the Dice similarity coefficient to adaptively learn multi-organ information from different sources. Three cohorts were employed in our experiments: the first cohort (98 CT scans) has only splenomegaly labels, while the second training cohort (100 CT scans) has 15 distinct anatomical labels with normal spleens. A separate, independent cohort consisting of 19 splenomegaly CT scans with labeled spleen was used as the testing cohort. The proposed method achieved the highest median Dice similarity coefficient value (0.94), which is superior (p-value < 0.01 against each other method) to the baselines of multi-atlas segmentation (0.86), SS-Net segmentation with only spleen labels (0.90), and U-Net segmentation with multi-organ training (0.91). Our approach for adapting the loss function and training structure is not specific to the abdominal context and may be beneficial in other situations where datasets with varied label sets are available.
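One way to realize a loss that "adaptively learns multi-organ information from different sources" is to evaluate a soft Dice term only over the organ classes actually annotated in the cohort a batch comes from and to ignore the rest. The PyTorch sketch below illustrates that masking idea; it is not the authors' exact formulation.

```python
import torch

def masked_dice_loss(probs, onehot_target, labeled_classes, eps=1e-6):
    """Soft Dice loss averaged over only the classes annotated in the source
    cohort of this batch. probs and onehot_target have shape (B, C, D, H, W);
    labeled_classes lists the annotated class indices for this cohort."""
    dices = []
    for c in labeled_classes:
        p = probs[:, c].flatten()
        t = onehot_target[:, c].flatten()
        inter = (p * t).sum()
        dices.append((2.0 * inter + eps) / (p.sum() + t.sum() + eps))
    return 1.0 - torch.stack(dices).mean()
```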
Simultaneous MR knee image segmentation and bias field correction using deep learning and partial convolution
Intensity inhomogeneity is a great challenge for automated organ segmentation in magnetic resonance (MR) images. Many segmentation methods fail to deliver satisfactory results when the images are corrupted by a bias field. Although inhomogeneity correction methods exist, they often fail to remove the bias field completely in knee MR images. We present a new iterative approach that simultaneously predicts the segmentation mask of knee structures using a 3D U-net and estimates the bias field in 3D MR knee images using partial convolution operations. First, the test images run through a trained 3D U-net to generate a preliminary segmentation result, which is then fed to the partial convolution filter to create a preliminary estimation of the bias field using the segmented bone mask. Finally, the estimated bias field is used to produce bias field corrected images as the new inputs to the 3D U-net. Through this loop, the segmentation results and bias field correction are iteratively improved. The proposed method was evaluated on 20 proton-density (PD)-weighted knee MRI scans with manually created segmentation ground truth using 10-fold cross-validation. In our preliminary experiments, the proposed method outperformed the conventional inhomogeneity-correction-plus-segmentation setup in terms of both segmentation accuracy and speed.
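The alternation described above (segment, estimate the bias field from the bone mask, correct, re-segment) can be summarized in a few lines. In the sketch below, `unet` and `partial_conv_smooth` are stand-ins for the trained 3D U-net and the partial-convolution filter, and `bone_label` is a hypothetical label index.

```python
def iterative_segment_and_correct(image, unet, partial_conv_smooth,
                                  bone_label=1, n_iter=3, eps=1e-6):
    """Alternate between 3D U-net segmentation and partial-convolution
    bias-field estimation (illustrative sketch of the iterative loop)."""
    corrected = image
    for _ in range(n_iter):
        seg = unet(corrected)                          # preliminary segmentation
        bone_mask = (seg == bone_label)                # segmented bone region
        bias = partial_conv_smooth(image, bone_mask)   # smooth field from masked voxels
        corrected = image / (bias + eps)               # corrected input for the next pass
    return seg, corrected
```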
Distributed deep learning for robust multi-site segmentation of CT imaging after traumatic brain injury
Machine learning models are becoming commonplace in the domain of medical imaging, and with these methods comes an ever-increasing need for more data. However, to preserve patient anonymity it is frequently impractical or prohibited to transfer protected health information (PHI) between institutions. Additionally, due to the nature of some studies, there may not be a large public dataset available on which to train models. To address this conundrum, we analyze the efficacy of transferring the model itself in lieu of data between different sites. By doing so we accomplish two goals: 1) the model gains access to training on a larger dataset that it could not normally obtain and 2) the model better generalizes, having trained on data from separate locations. In this paper, we implement multi-site learning with disparate datasets from the National Institutes of Health (NIH) and Vanderbilt University Medical Center (VUMC) without compromising PHI. Three neural networks are trained to convergence on a computed tomography (CT) brain hematoma segmentation task: one only with NIH data, one only with VUMC data, and one multi-site model alternating between NIH and VUMC data. Resultant lesion masks with the multi-site model attain an average Dice similarity coefficient of 0.64 and the automatically segmented hematoma volumes correlate to those done manually with a Pearson correlation coefficient of 0.87, corresponding to an 8% and 5% improvement, respectively, over the single-site model counterparts.
Improving V-Nets for multi-class abdominal organ segmentation
Segmentation is one of the most important tasks in medical image analysis. With the development of deep learning, fully convolutional networks (FCNs) have become the dominant approach for this task, and their extension to 3D achieved considerable improvements for automated organ segmentation in volumetric imaging data, such as computed tomography (CT). One popular FCN network architecture for 3D volumes is V-Net, originally proposed for single region segmentation. This network effectively solved the imbalance problem between foreground and background voxels by proposing a loss function based on the Dice similarity metric. In this work, we extend the depth of the original V-Net to obtain better features to model the increased complexity of multi-class segmentation tasks at higher input/output resolutions using modern large-memory GPUs. Furthermore, we markedly improved the training behaviour of V-Net by employing batch normalization layers throughout the network. In this way, we can efficiently improve the stability of the training optimization, achieving faster and more stable convergence. We show that our architectural changes and refinements dramatically improve the segmentation performance on a large abdominal CT dataset and obtain close to 90% average Dice score.
A fully automated CT-based airway segmentation algorithm using deep learning and topological leakage detection and branch augmentation approaches
Quantitative CT-based characterization of bronchial morphology is widely used in chronic obstructive pulmonary disease (COPD) related research and clinical studies. Fully automated airway tree segmentation methods, which are critical for large multi-site COPD studies, are currently lacking. A critical challenge is that airway segmentation failures, e.g., leakages or early truncation, in even a small fraction of cases warrant manual intervention for all cases. In this paper, we present a fully automated CT-based hybrid algorithm for human airway segmentation that combines both deep learning and conventional image processing approaches. A three-dimensional (3-D) U-Net is developed to compute a voxel-level likelihood map of airway lumen space from a chest CT image at total lung capacity (TLC). This likelihood map is fed into a conventional image processing cascade that iteratively augments airway branches and removes leakages using newly developed freeze-and-grow and progressive threshold parameter relaxation approaches. The new method has been applied on fifteen TLC human chest CT scans from an ongoing COPD study and its performance has been quantitatively compared with the results of a semi-automated industry-standard software involving manual review and correction. Experimental results show significant improvements in terms of branch level accuracy using the new method as compared to the unedited results from the industry-standard method, while matching their manually edited results. In terms of segmentation volume leakage, the new method significantly reduced segmentation leakages as compared to both unedited and edited results of the industry-standard method.
Image Enhancement and Modeling
Multi-modal image fusion for multispectral super-resolution in microscopy
Spectral imaging is a ubiquitous tool in modern biochemistry. Despite acquiring dozens to thousands of spectral channels, existing technology cannot capture spectral images at the same spatial resolution as structural microscopy. Due to partial voluming and low light exposure, spectral images are often difficult to interpret and analyze. This highlights a need to upsample the low-resolution spectral image by using spatial information contained in the high-resolution image, thereby creating a fused representation with high specificity both spatially and spectrally. In this paper, we propose a framework for the fusion of co-registered structural and spectral microscopy images to create super-resolved representations of spectral images. As a first application, we super-resolve spectral images of ex-vivo retinal tissue imaged with confocal laser scanning microscopy, by using spatial information from structured illumination microscopy. Second, we super-resolve mass spectroscopic images of mouse brain tissue, by using spatial information from high-resolution histology images. We present a systematic validation of model assumptions crucial towards maintaining the original nature of spectra and the applicability of super-resolution. Goodness-of-fit for spectral predictions is evaluated through functional R2 values, and the spatial quality of the super-resolved images is evaluated using normalized mutual information.
Sharpness preserved sinogram synthesis using convolutional neural network for sparse-view CT imaging
Sparse-view computed tomography (CT) is an effective way to lower the radiation exposure, but results in streaking artifacts in the reconstructed CT image due to insufficient projection views. Several approaches have been reported for full-view sinogram synthesis by interpolating the missing data into the sparse-view sinogram. However, current interpolation methods tend to generate over-smoothed sinograms, which cannot preserve the sharpness of the image. Such sharpness often corresponds to region boundaries or tissue texture and is of high importance as a clinical indicator. To address this issue, this paper proposes an efficient sharpness-preserving sparse-view CT sinogram synthesis method based on a convolutional neural network (CNN). Sharpness preservation is enforced by a zero-order and first-order difference based loss function in the model. This study takes advantage of the residual design to overcome the problem of degradation for our deep network (20 layers), which is capable of extracting high level information and dealing with large sample dimensions (672 x 672). The proposed model design and loss function achieved a better performance in both quantitative and qualitative evaluation compared to current state-of-the-art works. This study also performs ablation tests on the effect of different designs and investigates hyper-parameter settings in the loss function.
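The zero-order and first-order difference based loss can be read as an intensity term plus a finite-difference (gradient) term between the synthesized and ground-truth sinograms. The PyTorch sketch below shows that reading; the weight `lam` is a placeholder and not the paper's setting.

```python
import torch.nn.functional as F

def sharpness_loss(pred, target, lam=1.0):
    """Zero-order (intensity) plus first-order (finite difference) L1 loss
    for sinogram synthesis; the gradient term penalizes over-smoothing.
    pred and target have shape (B, 1, H, W)."""
    l0 = F.l1_loss(pred, target)
    dx = F.l1_loss(pred[..., :, 1:] - pred[..., :, :-1],
                   target[..., :, 1:] - target[..., :, :-1])
    dy = F.l1_loss(pred[..., 1:, :] - pred[..., :-1, :],
                   target[..., 1:, :] - target[..., :-1, :])
    return l0 + lam * (dx + dy)
```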
Deep residual dense U-Net for resolution enhancement in accelerated MRI acquisition
A typical Magnetic Resonance Imaging (MRI) scan may take 20 to 60 minutes. Reducing MRI scan time is beneficial for both patient experience and cost considerations. Accelerated MRI scans may be achieved by acquiring a smaller amount of k-space data (down-sampling in k-space). However, this leads to lower resolution and aliasing artifacts in the reconstructed images. There are many existing approaches for attempting to reconstruct high-quality images from down-sampled k-space data, with varying complexity and performance. In recent years, deep-learning approaches have been proposed for this task, and promising results have been reported. Still, the problem remains challenging, especially because of the high fidelity requirement in most medical applications employing reconstructed MRI images. In this work, we propose a deep-learning approach aiming at reconstructing high-quality images from accelerated MRI acquisition. Specifically, we use a Convolutional Neural Network (CNN) to learn the differences between the aliased images and the original images, employing a U-Net-like architecture. Further, a micro-architecture termed Residual Dense Block (RDB) is introduced for learning a better feature representation than the plain U-Net. Considering the peculiarity of the down-sampled k-space data, we introduce a new term to the loss function in learning, which effectively employs the given k-space data during training to provide additional regularization on the update of the network weights. To evaluate the proposed approach, we compare it with other state-of-the-art methods. In both visual inspection and evaluation using standard metrics, the proposed approach is able to deliver improved performance, demonstrating its potential for providing an effective solution.
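The additional loss term that "employs the given k-space data" can be interpreted as a data-consistency penalty at the acquired k-space locations. Below is a minimal PyTorch sketch under the assumptions of single-coil data and a Cartesian sampling mask; it is an interpretation, not necessarily the authors' exact term.

```python
import torch

def kspace_consistency_loss(pred_image, acquired_kspace, mask):
    """Penalize disagreement between the k-space of the reconstructed image
    and the actually acquired samples. pred_image: (B, H, W) real-valued;
    acquired_kspace: (B, H, W) complex; mask: (B, H, W) binary sampling mask."""
    pred_k = torch.fft.fft2(pred_image.to(torch.complex64))
    diff = (pred_k - acquired_kspace) * mask
    return (diff.abs() ** 2).mean()
```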
Artificial neural network filters for enhancing 3D optical microscopy images of neurites
Shih-Luen Wang, Seyed M. M. Kahaki, Armen Stepanyants
The ability to extract accurate morphology of labeled neurons from microscopy images is crucial for mapping brain connectivity and for understanding changes in connectivity that underlie learning and neurological disorders. There are, however, two problems, specific to optical microscopy imaging of neurons, which make accurate neuron tracing exceedingly challenging: (i) neurites can appear broken due to inhomogeneous labeling and (ii) neurites can appear fused in 3D due to limited resolution. Here, we propose and evaluate several artificial neural network (ANN) architectures and conventional image enhancement filters with the aim of alleviating both problems. We developed four image quality metrics to evaluate the effects of the proposed filters: normalized intensity in the cross-over regions between neurites, effective radius of neurites, coefficient of variation of intensity along neurites, and local background to neurite intensity ratio. Our results show that ANN-based filters, trained on optimized semi-manual traces of neurites, can significantly outperform conventional filters. In particular, U-Net based filtering can virtually eliminate background intensity, while also reducing the effective radius of neurites to nearly 1 voxel. In addition, this filter significantly decreases intensity in the cross-over regions between neurites and reduces fluctuations of intensity on neurites’ centerlines. These results suggest that including an ANN-based filtering step, which does not require substantial extra time or computing power, can be beneficial for automated neuron tracing projects.
Volumetric texture modeling using dominant and discriminative binary patterns
Parmeet S. Bhatia, Amit Kale, Zhigang Peng
Volumetric texture analysis is an important task in the medical imaging domain and is widely used for characterizing tissues and tumors in medical volumes. Local binary pattern (LBP) based texture descriptors are quite successful at characterizing texture information in 2D images. Unfortunately, the number of binary patterns grows exponentially with the number of bits in the LBP. Hence its straightforward extension to the 3D domain results in an extremely large number of bit patterns that may not be relevant for subsequent tasks like classification. In this work we present an efficient extension of LBP for 3D data using a decision tree. The leaves of this tree represent texture words whose binary patterns are encoded by the path followed from the root to reach the leaf. Once trained, this tree is used to create a histogram in bag-of-words fashion that can be used as a texture descriptor for the whole volumetric image. For training, each voxel is converted into a 3D LBP pattern and is assigned the label of its corresponding volumetric image. These patterns are used in a supervised fashion to construct the decision tree. The leaves of the corresponding tree are used as the texture descriptor for downstream learning tasks. The proposed texture descriptor achieved state-of-the-art classification results on the RFAI database. We further showed its efficacy on an MR knee protocol classification task where we obtained near perfect results. The proposed algorithm is extremely efficient, computing the texture descriptor of a typical MRI image in less than 100 milliseconds.
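Once such a tree is trained on per-voxel 3D LBP bit patterns, the volume-level descriptor is a normalized histogram of the leaves reached by each voxel's pattern. The scikit-learn sketch below illustrates that bag-of-words step; the function names are hypothetical and the leaf budget is a placeholder.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_texture_tree(train_patterns, train_labels, n_words=256):
    """Fit a decision tree on per-voxel 3D-LBP bit patterns; its leaves
    act as learned texture words."""
    return DecisionTreeClassifier(max_leaf_nodes=n_words).fit(train_patterns, train_labels)

def texture_descriptor(lbp_patterns, tree):
    """Bag-of-words histogram over the leaves reached by one volume's voxels."""
    leaf_ids = tree.apply(lbp_patterns)                       # leaf node index per voxel
    hist = np.bincount(leaf_ids, minlength=tree.tree_.node_count).astype(float)
    return hist / hist.sum()                                  # normalized descriptor
```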
Brain: Shapes and Biomarkers
Regularized topological data analysis for extraction of coherent brain regions
Ishaan Batta, Nicolas Honnorat, Christos Davatzikos
Clustering is widely used in medical imaging to reduce data dimension and discover subgroups in patient populations. However, most of the current clustering algorithms depend on scale parameters which are especially difficult to select. Persistence homology has been introduced to address this issue. This topological data analysis framework analyses a dataset at multiple scales by generating clusters of increasing sizes, similar to single-linkage hierarchical clustering. Because of this approach, however, the results are sensitive to the presence of noise and outliers. Several strategies have been suggested to fix this issue. In this paper, we support this research effort by demonstrating how gradient-preserving data smoothing, such as total variation regularization, can improve the stability of persistence homology results, and we derive analytical confidence regions for the significance of the persistence measured for clusters based on Pearson distances. We demonstrate the advantages of our methods by analysing structural and functional MRI data released by the Human Connectome Project.
Automatic quality control using hierarchical shape analysis for cerebellum parcellation
Automatic and accurate cerebellum parcellation has long been a challenging task due to the relative surface complexity and large anatomical variation of the human cerebellum. An inaccurate segmentation will inevitably bias further studies. In this paper we present an automatic approach for the quality control of cerebellum parcellation based on shape analysis in a hierarchical structure. We assume that the overall shape variation of a segmented structure comes from both population and segmentation variation. In this hierarchical structure, the higher level shape mainly captures the population variation of the human cerebellum, while the lower level shape captures both population and segmentation variation. We use a partial least squares regression to combine the lower level and higher level shape information. By compensating for population variation, we show that the estimated segmentation variation is highly correlated with the accuracy of the cerebellum parcellation results, which not only provides a confidence measurement of the cerebellum parcellation, but also gives some clues about when a segmentation software may fail in real scenarios.
Cerebellum parcellation with convolutional neural networks
To better understand cerebellum-related diseases and functional mapping of the cerebellum, quantitative measurements of cerebellar regions in magnetic resonance (MR) images have been studied in both clinical and neurological studies. Such studies have revealed that different spinocerebellar ataxia (SCA) subtypes have different patterns of cerebellar atrophy and that atrophy of different cerebellar regions is correlated with specific functional losses. Previous methods to automatically parcellate the cerebellum, that is, to identify its sub-regions, have been largely based on multi-atlas segmentation. Recently, deep convolutional neural network (CNN) algorithms have been shown to have high speed and accuracy in cerebral sub-cortical structure segmentation from MR images. In this work, two three-dimensional CNNs were used to parcellate the cerebellum into 28 regions. First, a locating network was used to predict a bounding box around the cerebellum. Second, a parcellating network was used to parcellate the cerebellum using the entire region within the bounding box. A leave-one-out cross validation of fifteen manually delineated images was performed. Compared with a previously reported state-of-the-art algorithm, the proposed algorithm shows superior Dice coefficients. The proposed algorithm was further applied to three MR images of a healthy subject and subjects with SCA6 and SCA8, respectively. A Singularity container of this algorithm is publicly available.
Model selection for spatiotemporal modeling of early childhood sub-cortical development
James Fishbaugh, Beatriz Paniagua, Mahmoud Mostapha, et al.
Spatiotemporal shape models capture the dynamics of shape change over time and are an essential tool for monitoring and measuring anatomical growth or degeneration. In this paper we evaluate non-parametric shape regression on the challenging problem of modeling early childhood sub-cortical development starting from birth. Due to the flexibility of the model, it can be challenging to choose parameters which lead to a good model fit yet do not overfit. We systematically test a variety of parameter settings to evaluate model fit as well as the sensitivity of the method to specific parameters, and we explore the impact of missing data on model estimation.
fMRI and DTI
Detecting connectivity changes in autism spectrum disorder using large-scale Granger causality
We investigated functional MRI connectivity changes in brain networks of subjects with Autism Spectrum Disorder (ASD) using large-scale Granger causality (lsGC), which can provide a truly multivariate representation of directed connectivity. To this end, we investigated the use of lsGC for capturing pair-wise interactions between regional time series extracted using ROIs from different resting-state brain networks. We studied these measures in a dataset comprising 59 subjects (34 healthy, 25 autistic; age-matched) from the Autism Brain Imaging Data Exchange (ABIDE) project. A general linear model was used to study the differences between the two groups, controlling for age, when comparing: (i) connectivity strength and diversity of each node in the network, (ii) global graph measures, and (iii) regional graph statistics. Clustering coefficient and small-worldness properties were significantly (p<0.05) increased in ASD subjects. Furthermore, we were able to localize differences in connectivity strength within the nodes of the frontoparietal, cingulo-opercular, as well as the sensorimotor network, in line with previously published literature. For comparison, a corresponding analysis using correlation-based connectivity did not reveal any significant differences between groups. Our results indicate that lsGC, in combination with a network analysis framework, can serve as an alternative methodology for the analysis of clinical resting-state fMRI data.
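Given a subject's lsGC connectivity matrix, the global graph measures compared between groups (e.g. the clustering coefficient, and the characteristic path length that enters small-worldness) can be computed with standard graph tools after thresholding. A hedged NetworkX sketch follows; the threshold value and the binarized, undirected treatment of the matrix are assumptions, not the paper's exact pipeline.

```python
import numpy as np
import networkx as nx

def global_graph_measures(conn, threshold=0.1):
    """Binarize a connectivity matrix at a given threshold and compute the
    average clustering coefficient and characteristic path length."""
    adj = (np.abs(conn) > threshold).astype(int)
    np.fill_diagonal(adj, 0)
    g = nx.from_numpy_array(adj)                         # undirected binarized graph
    clustering = nx.average_clustering(g)
    path_len = (nx.average_shortest_path_length(g)
                if nx.is_connected(g) else float("inf"))
    return clustering, path_len
```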
Brain network identification in asynchronous task fMRI data using robust and scalable tensor decomposition
Jian Li, Jessica L. Wisnowski, Anand A. Joshi, et al.
The goal of this work is to robustly identify common brain networks and their corresponding temporal dynamics across subjects in asynchronous task functional MRI (tfMRI) signals. We approached this problem using a robust and scalable tensor decomposition method combined with the BrainSync algorithm. We first used the BrainSync algorithm to temporally align asynchronous tfMRI data, allowing us to study common brain networks across subjects. We mapped the synchronized tfMRI data into a 3D tensor (vertices × time × session) and performed a greedy canonical polyadic (CP) decomposition, reducing the rank to 20 in order to improve the signal-to-noise ratio (SNR). We incorporated Nesterov-accelerated adaptive moment estimation into our previously developed scalable and robust sequential CP decomposition (SRSCPD) framework and applied this improved version of SRSCPD to the rank-reduced tensor to identify dynamic brain networks. We successfully identified 9 brain networks with their corresponding temporal dynamics from 40 subjects using Human Connectome Project tfMRI data without using any prior information with regard to the task designs. Three of these show the subjects' responses to cues at the beginning of each task block (fronto-parietal attentional control network, visual network and executive control network); one corresponds to the default mode network that exhibits deactivation during the tasks; four show motor networks (left hand, right hand, tongue, and both feet) where the temporal dynamics are strongly correlated with the task designs, and the remaining component reflects physiological noise (respiration).
Harmonizing 1.5T/3T diffusion weighted MRI through development of deep learning stabilized microarchitecture estimators
Diffusion weighted magnetic resonance imaging (DW-MRI) is interpreted as a quantitative method that is sensitive to tissue microarchitecture at a millimeter scale. However, the sensitization is dependent on acquisition sequences (e.g., diffusion time, gradient strength, etc.) and susceptible to imaging artifacts. Hence, comparison of quantitative DW-MRI biomarkers across field strengths (including different scanners, hardware performance, and sequence design considerations) is a challenging area of research. We propose a novel method to estimate microstructure using DW-MRI that is robust to scanner differences between 1.5T and 3T imaging. We propose to use a null space deep network (NSDN) architecture to model DW-MRI signal as fiber orientation distributions (FOD) to represent tissue microstructure. The NSDN approach is consistent with histologically observed microstructure (on a previously acquired ex vivo squirrel monkey dataset) and scan-rescan data. The contribution of this work is that we incorporate identical dual networks (IDN) to minimize the influence of scanner effects via scan-rescan data. Briefly, our estimator is trained on two datasets. First, a histology dataset was acquired on three squirrel monkeys with corresponding DW-MRI and confocal histology (512 independent voxels). Second, 37 control subjects from the Baltimore Longitudinal Study of Aging (67-95 y/o) were identified who had been scanned at 1.5T and 3T scanners (b-value of 700 s/mm2, voxel resolution of 2.2 mm, 30-32 gradient volumes) with an average interval of 4 years (standard deviation 1.3 years). After image registration, we used paired white matter (WM) voxels for 17 subjects and 440 histology voxels for training and 20 subjects and 72 histology voxels for testing. We compare the proposed estimator with super-resolved constrained spherical deconvolution (CSD) and a previously presented regression deep neural network (DNN). NSDN outperformed CSD and DNN in angular correlation coefficient (ACC) 0.81 versus 0.28 and 0.46, mean squared error (MSE) 0.001 versus 0.003 and 0.03, and general fractional anisotropy (GFA) 0.05 versus 0.05 and 0.09. Further validation and evaluation with contemporaneous imaging are necessary, but NSDN is a promising avenue for building an understanding of microarchitecture in a consistent and device-independent manner.
Improved estimation of dynamic connectivity from resting-state fMRI data
Functional magnetic resonance imaging (fMRI) has been widely used for neuronal connectivity analysis. As a data-driven technique, independent component analysis (ICA) has become a valuable tool for fMRI studies. Recently, due to the dynamic nature of the human brain, time-varying connectivity analysis has been regarded as an important measure to reveal essential information within the network. The sliding window approach has been commonly used to extract dynamic information from fMRI time series. However, it has some limitations due to the assumption that connectivity at a given time can be estimated from all the samples of the input time series data spanned by the selected window. To address this issue, we apply a time-varying graphical lasso model (TVGL) proposed by Hallac et al., which can infer the network even when the observation interval is at only one time point. On the other hand, recent results have shown that individual connectivity profiles can be used as a “fingerprint” to identify subjects from a large group. We hypothesize that subject-specific FC profiles may have a critical effect on analyzing FC dynamics at the group level. In this work, we apply a group ICA (GICA) based data-driven framework to assess dynamic functional network connectivity (dFNC), based on the combination of GICA and TVGL. Also, we use a regression model to remove the subject-specific individuality in detecting functional dynamics. The results support our hypothesis and suggest that removing the individual effect may help us assess the connectivity dynamics within the human brain.
Longitudinal structural connectivity in the developing brain with projective non-negative matrix factorization
Heejong Kim, Joseph Piven, Guido Gerig
Understanding early brain changes has the potential to reveal imaging biomarkers for pre-symptomatic diagnosis and thus opportunities for optimal therapeutic intervention, for example in the early diagnosis of infants at risk of autism or of altered development due to drug exposure. In this paper, we propose a framework to analyze longitudinal changes of structural connectivity in the early developing infant brain by exploring underlying network components of brain structural connectivity and its changes with age. Structural connectivity is a non-negative sparse network. Projective non-negative matrix factorization (PNMF) offers benefits in sparsity and learning fewer parameters for non-negative sparse data. The number of matrix subcomponents was estimated by automatic relevance determination PNMF (ARDPNMF) for the brain connectivity networks of the given data. We apply linear mixed effects modeling on the resulting loadings from ARDPNMF to model longitudinal network component changes over time. The proposed framework was validated on a synthetic example generated by known linear mixed effects on loadings of a known number of bases with different levels of additive noise. Feasibility of the framework on real data has been demonstrated by analysis of structural connectivity networks of high angular resolution diffusion imaging (HARDI) data from an ongoing neuroimaging study of autism. A total of 139 image data sets from high-risk and low-risk subjects acquired at multiple time points have been processed. Results demonstrate the feasibility of the framework to analyze connectivity network properties as a function of age and the potential to eventually explore differences associated with risk status.
Keynote and Highlights
Deep learning for inverse imaging problems: some recent approaches (Conference Presentation)
In this talk we discuss the idea of data-driven regularisers for inverse imaging problems. We are in particular interested in the combination of model-based and purely data-driven image processing approaches. In this context we will make a journey from “shallow” learning for computing optimal parameters for variational regularisation models by bilevel optimization to the investigation of different approaches that use deep neural networks for solving inverse imaging problems. Alongside all approaches that are being discussed, their numerical solution and available solution guarantees will be stated.
PADDIT: Probabilistic Augmentation of Data using Diffeomorphic Image Transformation
Mauricio Orbes-Arteaga, Lauge Sørensen, Jorge Cardoso, et al.
For proper generalization performance of convolutional neural networks (CNNs) in medical image segmentation, the learnt features should be invariant under particular non-linear shape variations of the input. To induce invariance in CNNs to such transformations, we propose Probabilistic Augmentation of Data using Diffeomorphic Image Transformation (PADDIT) – a systematic framework for generating realistic transformations that can be used to augment data for training CNNs. The main advantage of PADDIT is the ability to produce transformations that capture the morphological variability in the training data. To this end, PADDIT constructs a mean template which represents the main shape tendency of the training data. A Hamiltonian Monte Carlo (HMC) scheme is used to sample transformations which warp the training images to the generated mean template. Augmented images are created by warping the training images using the sampled transformations. We show that CNNs trained with PADDIT outperform CNNs trained without augmentation and with generic augmentation (0.2 and 0.15 higher Dice accuracy, respectively) in segmenting white matter hyperintensities from T1 and FLAIR brain MRI scans.
Effect of statistical mismatch between training and test images for CNN-based deformable registration
M. D. Ketcha, T. De Silva, R. Han, et al.
Recently, convolutional neural networks (CNNs) have been proposed as a method for deformable image registration, offering a variety of potential advantages compared to physical model-based methods, including faster runtime and the ability to learn complicated functions without explicit models. A persistent question for CNNs is the uncertainty in their behavior when the image statistics (e.g., noise and resolution) of the test data deviate from those of the training data. In this work we investigated the influence of statistical properties of image noise (in CT, for example, related to radiation dose) and deformation magnitude, trained registration networks over a range of dose and deformation levels, and evaluated registration performance (target registration error, TRE) as the statistics of the test data deviated from those of the training data. Generally, registration performance was optimal when the statistics of the test data matched those of the training data, except in cases of very low-dose data, where networks trained on a combination of high- and low-dose images achieved the best TRE. Furthermore, TRE was found to be limited by the highest-dose training data, with no improvement in TRE for test images of higher dose than that in the training data. Understanding and quantifying the relationship between statistical aspects of the training and test data – and the failure modes caused by statistical mismatch – is an important step in the development of CNN-based registration methods. This work provided new insight on the optima and tradeoffs with respect to image noise (dose) and deformation magnitude, providing important guidance in building training sets that are best-suited to particular imaging conditions and applications.
Segmentation of corneal optical coherence tomography images using randomized Hough transform
Amr Elsawy, Mohamed Abdel-Mottaleb, Mohamed Abou Shousha
Measuring the thickness of different corneal microlayers is important for the diagnosis of common corneal eye diseases such as dry eye, keratoconus, Fuchs endothelial dystrophy, and corneal graft rejection. High resolution corneal images, obtained using optical coherence tomography (OCT), have made it possible to measure the thickness of different corneal microlayers in vivo. The manual segmentation of these images is subjective and time consuming. Therefore, automatic segmentation is necessary. Several methods have been proposed for segmenting corneal OCT images, but none of these methods segments all the microlayer interfaces, and they are not robust. In addition, the lack of a large annotated database of corneal OCT images impedes the application of machine learning methods such as deep learning, which have proven to be very powerful. In this paper, we present a new corneal OCT image segmentation algorithm using the randomized Hough transform. To the best of our knowledge, we developed the first automatic segmentation method for the six corneal microlayer interfaces. The proposed method includes a robust estimate of the relative distances of inner corneal interfaces with respect to outer corneal interfaces. Also, it properly handles the correct ordering and the non-intersection of corneal microlayer interfaces. The proposed method was tested on 15 corneal OCT images that were randomly selected. OCT images were manually segmented by two trained operators for comparison. Comparison with the manual segmentation shows that the proposed method has a mean segmentation error of 3.77 ± 4.25 pixels across all interfaces, which corresponds to 5.66 ± 6.38 μm. The mean segmentation error between the two manual operators is 4.07 ± 4.71 pixels, which corresponds to 6.11 ± 7.07 μm. The proposed method takes a mean time of 2.59 ± 0.06 seconds to segment six corneal interfaces.
Machine Learning for Clinical Prediction
Reproducible evaluation of methods for predicting progression to Alzheimer's disease from clinical and neuroimaging data
Various machine learning methods have been proposed for predicting progression of patients with mild cognitive impairment (MCI) to Alzheimer's disease (AD) using neuroimaging data. Even though the vast majority of these works use the public dataset ADNI, reproducing their results is complicated because they often do not make available elements that are essential for reproducibility, such as selected participants and input data, image preprocessing and cross-validation procedures. Comparability is also an issue. Specifically, the influence of different components, such as preprocessing, feature extraction or classification algorithms, on the performance is difficult to evaluate. Finally, these studies rarely compare their results to models built from clinical data only, a critical aspect to demonstrate the utility of neuroimaging. In our previous work [1, 2], we presented a framework for reproducible and objective classification experiments in AD, which included automatic conversion of the ADNI database into the BIDS community standard, image preprocessing pipelines and machine learning evaluation. We applied this framework to perform unimodal classifications of T1 MRI and FDG-PET images. In the present paper, we extend this work to the combination of multimodal clinical and neuroimaging data. All experiments are based on standard approaches (namely SVM and random forests). In particular, we assess the added value of neuroimaging over using only clinical data. We first demonstrate that using only demographic and clinical data (gender, education level, MMSE, CDR sum of boxes, RAVLT, ADAS-Cog) results in a balanced accuracy of 76% (AUC of 0.85). This performance is higher than that of standard neuroimaging-based classifiers. We then propose a simple trick to improve the performance of neuroimaging-based classifiers: training from AD patients and controls (rather than from MCI patients) improves the performance of FDG-PET classification by 5 percent points, reaching the level of the clinical classifier. Finally, when combining clinical and neuroimaging data, prediction results further improved to a 79% balanced accuracy and an AUC of 0.89. These prediction accuracies, obtained in a reproducible way, provide a baseline on which to build, and against which to compare, more sophisticated methods. All the code of the framework and the experiments is publicly available at https://gitlab.icm-institute.org/aramislab/AD-ML.
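The clinical-data-only baseline described above (demographic and cognitive scores evaluated with balanced accuracy and AUC) can be reproduced in spirit with a few lines of scikit-learn. The sketch below is a simplified stand-in, not the authors' released framework: variable names are hypothetical, and the paper also evaluates SVM classifiers and a more careful cross-validation design.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

def evaluate_clinical_baseline(X, y, n_splits=5):
    """Cross-validated balanced accuracy and AUC of a random-forest classifier
    trained on clinical features only (e.g. gender, education, MMSE, CDR-SB,
    RAVLT, ADAS-Cog); y is progression to AD (0/1)."""
    clf = RandomForestClassifier(n_estimators=500, random_state=0)
    scores = cross_validate(clf, X, y, cv=n_splits,
                            scoring=["balanced_accuracy", "roc_auc"])
    return (scores["test_balanced_accuracy"].mean(),
            scores["test_roc_auc"].mean())
```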
Reduction of unnecessary thyroid biopsies using deep learning
Thyroid nodules are extremely common lesions and are highly detectable by ultrasound (US). Several studies have shown that the overall incidence of papillary thyroid cancer in patients with nodules selected for biopsy is only about 10%. Therefore, there is a clinical need for a dramatic reduction of thyroid biopsies. In this study, we present a guided classification system using deep learning that predicts malignancy of nodules from B-mode US. We retrospectively collected transverse and longitudinal images of 150 benign and 150 malignant thyroid nodules with biopsy-proven results. We divided our dataset into training (n=460), validation (n=40), and test (n=100) datasets. We manually segmented nodules from B-mode US images and provided the nodule mask as a second input channel to the convolutional neural network (CNN) to increase the attention on nodule regions in images. We evaluated the classification performance of different CNN architectures, such as the Inception and ResNet50 architectures, with different input images. The InceptionV3 model showed the best performance on the test dataset: 86% (sensitivity), 90% (specificity), and 90% precision when the threshold was set for highest accuracy. When the threshold was set for maximum sensitivity (0 missed cancers), the ROC curve suggests the number of biopsies may be reduced by 52% without missing patients with malignant thyroid nodules. We anticipate that this performance can be further improved by including more patients and information from other ultrasound modalities.
Direct prediction of cardiovascular mortality from low-dose chest CT using deep learning
Cardiovascular disease (CVD) is a leading cause of death in the lung cancer screening population. Chest CT scans made in lung cancer screening are suitable for identification of participants at risk of CVD. Existing methods analyzing CT images from lung cancer screening for prediction of CVD events or mortality use engineered features extracted from the images combined with patient information. In this work we propose a method that automatically predicts 5-year cardiovascular mortality directly from chest CT scans without the need for hand-crafting image features. A set of 1,583 participants of the National Lung Screening Trial was included (1,188 survivors, 395 nonsurvivors). Low-dose chest CT images acquired at baseline were analyzed and the follow-up time was 5 years. To limit the analysis to the heart region, the heart was first localized by our previously developed algorithm for organ localization exploiting convolutional neural networks. Thereafter, a convolutional autoencoder was used to encode the identified heart region. Finally, based on the extracted encodings subjects were classified into survivors or non-survivors using a neural network. The performance of the method was assessed in eight cross-validation experiments with 1,433 images used for training, 50 for validation and 100 for testing. The method achieved a performance with an area under the ROC curve of 0.73. The results demonstrate that prediction of cardiovascular mortality directly from low-dose screening chest CT scans, without hand-crafted features, is feasible, allowing identification of subjects at risk of fatal CVD events.
Spatial integration of radiology and pathology images to characterize breast cancer aggressiveness on pre-surgical MRI
The widespread use of screening mammography has resulted in a remarkable rise in the diagnosis of Ductal Carcinoma In Situ (DCIS). A resultant challenge is the early screening of these patients to identify those with concurrent invasive breast cancer (IBC), as one in five DCIS diagnosed at biopsy is upgraded to IBC following surgery. Both x-ray mammography and multi-parametric Magnetic Resonance Imaging (MRI) lack the ability to distinguish DCIS from IBC reliably. Our robust methodology for 3D alignment of histopathology images and MRI provides a unique opportunity to spatially map digitized histopathology slides on pre-surgical MRI, which is particularly important in tumors where DCIS and IBC co-occur as well as for the study of tumor heterogeneity. In this proof-of-concept study, we developed and evaluated a methodological framework for the 3D spatial alignment of MRI and histopathology slices, using x-ray radiographs as an intermediate modality. Our methodology involves (1) the co-registration of 2D x-ray radiographs showing macrosections and corresponding 2D histology slices, (2) the 3D reconstruction of the ex vivo specimen based on the x-ray images and aligned histology slices, and (3) the registration of the 3D reconstructed ex vivo specimen with the 3D MRI. The spatially co-registered MRI and histopathology images may enable the identification of MRI features that distinguish aggressive from indolent disease on in vivo MRI.
A computational method to aid the detection and annotation of pleural lesions in CT images of the thorax
Azael de Melo e Sousa, Alexandre Xavier Falcão, Ericson Bagatin, et al.
Several thoracic diseases can affect the pleural space. Pleural-based lesions usually require a careful and time-consuming visual inspection of the computed tomography (CT) slices to be detected. In order to facilitate this task, we propose a computational method that automatically detects pleural-based lesion candidates on the lung surface. The first step of this method is the segmentation of both lungs. For that purpose, any segmentation method can be applied, but in this work we used ALTIS, a fast sequence of image processing operators that automatically segments each lung (i.e., air volume) and the trachea. The proposed approach helps the specialist during the annotation process, allowing the creation of properly annotated datasets and the development of machine learning methods for computer-aided diagnosis. The evaluation of the proposed method was performed on a set of 40 CT scans of patients with pleural plaques and tumors (lung nodules). Two thoracic radiologists and one pulmonologist assessed the images and provided clinical data. Experiments indicate that the proposed method managed to detect most anomalies in a matter of seconds.
Classification
Body part and imaging modality classification for a general radiology cognitive assistant
Chinyere Agunwa, Mehdi Moradi, Ken C. L. Wong, et al.
Decision support systems built for radiologists need to cover a fairly wide range of image types, with the ability to route each image to the relevant algorithm. Furthermore, the training of such networks requires building large datasets with significant efforts in image curation. In situations where the DICOM tag of an image is unavailable, or unreliable, a classifier that can automatically detect the body part depicted in the image, as well as the imaging modality, is necessary. Previous work has shown the use of imaging and textual features to distinguish between imaging modalities. In this work, we present a model for the simultaneous classification of body part and imaging modality, which to our knowledge has not been done before, as part of the larger effort to create a cognitive assistant for radiologists. The classification network covers 10 classes and is built on a VGG architecture, using transfer learning to learn generic features. An accuracy of 94.8% is achieved.
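An illustrative transfer-learning sketch in the spirit of the abstract (VGG backbone, 10-class head); the specific VGG variant, frozen layers, and training hyper-parameters are assumptions, not the authors' setup.

```python
# Sketch: transfer learning with a VGG backbone for a 10-class
# body-part / modality classifier (details are assumed).
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Freeze the convolutional feature extractor to reuse generic ImageNet features.
for p in model.features.parameters():
    p.requires_grad = False

# Replace the final fully connected layer with a 10-class head.
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 10)

optimizer = torch.optim.Adam(model.classifier[6].parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on dummy grayscale images replicated to
# three channels to match the ImageNet-pretrained input.
images = torch.randn(4, 1, 224, 224).repeat(1, 3, 1, 1)
labels = torch.randint(0, 10, (4,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```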
Interpretable explanations of black box classifiers applied on medical images by meaningful perturbations using variational autoencoders
Hristina Uzunova, Jan Ehrhardt, Timo Kepp, et al.
The growing popularity of black box machine learning methods for medical image analysis makes their interpretability a crucial task. To make a system, e.g. a trained neural network, trustworthy for a clinician, it needs to be able to explain its decisions and predictions. In this work, we tackle the problem of generating plausible explanations for the predictions of medical image classifiers that are trained to differentiate between different types of pathologies and healthy tissues. An intuitive solution to determine which image regions influence the trained classifier is to find out whether the classification results change when those regions are deleted. This idea can be formulated as a minimization problem and thus efficiently implemented. However, the meaning of “deletion” of image regions, in our case pathologies in medical images, is not defined. We contribute by defining the deletion of pathologies as the replacement by their healthy-looking equivalent generated using variational autoencoders. The experiments with a classification neural network on OCT (Optical Coherence Tomography) images and brain lesion MRIs show that a meaningful replacement of “deleted” image regions has a significant impact on the reliability of the generated explanations. The proposed deletion method is shown to be successful, as our approach delivers the best results compared to four other established methods.
Fourier decomposition free-breathing 1H MRI perfusion maps in asthma
Objective: We aimed to develop a user-friendly image-analysis pipeline to simultaneously generate perfusion and ventilation maps derived from Fourier decomposition of free-breathing pulmonary 1H magnetic resonance imaging (FDMRI). Methods: Free-breathing 1H MR images were non-rigidly deformed to a 1H reference image selected halfway between inspiration and expiration, using modality independent neighbourhood descriptor-based registration. The 1H reference image was segmented using multi-region coupled continuous max-flow. The co-registered image sequence was Fourier transformed on a voxel-by-voxel basis to generate images of the voxel-wise power spectrum. The two largest intensity peaks in the power spectrum corresponded to respiratory and cardiac frequencies, which were used to generate ventilation and perfusion maps, respectively. Perfusion and ventilation defects were measured using fuzzy c-means clustering in 15 asthmatics who provided written informed consent and underwent pulmonary function tests and MRI. Results: The proposed FDMRI pipeline was used to generate perfusion maps in 15 asthma patients for direct comparison with 3He and FDMRI ventilation maps. FDMRI perfusion measurements were significantly correlated with FDMRI ventilation (r2=0.48, p=0.03) and 3He MRI ventilation (r2=0.44, p=0.05). Conclusion: Ventilation and perfusion free-breathing 1H MRI maps were generated in asthmatics with clinically acceptable accuracy and minimal user interaction using a pipeline compatible with high-throughput clinical workflows.
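A minimal sketch of the Fourier-decomposition step on a registered series (array shapes, frame rate, and frequency bands are assumptions, not the authors' protocol): a voxel-wise FFT over time, with the peak magnitudes in the respiratory and cardiac bands forming the ventilation and perfusion maps.

```python
# Sketch: voxel-wise Fourier decomposition of a registered free-breathing series.
import numpy as np

def fd_maps(series, dt, resp_band=(0.1, 0.5), card_band=(0.8, 2.0)):
    """series: registered images, shape (T, H, W); dt: frame interval in seconds."""
    spectrum = np.abs(np.fft.rfft(series - series.mean(axis=0), axis=0))
    freqs = np.fft.rfftfreq(series.shape[0], d=dt)

    def peak_map(band):
        sel = (freqs >= band[0]) & (freqs <= band[1])
        return spectrum[sel].max(axis=0)     # peak magnitude per voxel in the band

    ventilation = peak_map(resp_band)        # respiratory-frequency peak
    perfusion = peak_map(card_band)          # cardiac-frequency peak
    return ventilation, perfusion

# Example on synthetic data: 200 frames at ~3.3 frames/s.
t = np.arange(200) * 0.3
series = (np.sin(2 * np.pi * 0.25 * t)[:, None, None]        # breathing at 0.25 Hz
          + 0.3 * np.sin(2 * np.pi * 1.1 * t)[:, None, None]  # heartbeat at 1.1 Hz
          + np.random.randn(200, 32, 32) * 0.05)
vent, perf = fd_maps(series, dt=0.3)
```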
Localization and labeling of cervical vertebral bones in the micro-CT images of rabbit fetuses using a 3D deep convolutional neural network
Antong Chen, Dahai Xue, Tosha Shah, et al.
In developmental and reproductive toxicology (DART) studies, high-throughput micro-CT imaging of Dutch-Belted rabbit fetuses has been used as a method for the assessment of compound-induced skeletal abnormalities. Since performing visual inspection of the micro-CT images by the DART scientists is a time- and resource-intensive task, an automatic strategy was proposed to localize, segment out, label, and evaluate each bone on the skeleton in a testing environment. However, due to the lack of robustness of this bone localization approach, failures to localize certain bones on the critical path while traversing the skeleton, e.g., the cervical vertebral bones, could lead to localization errors for other bones downstream. Herein, an approach based on deep convolutional neural networks (CNN) is proposed to automatically localize each cervical vertebral bone, represented by its center. For each center, a 3D probability map with Gaussian decay is computed, with the center itself being the maximum. From cervical vertebrae C1 to C7, the 7 volumes of distance transforms are stacked to form a 4-dimensional array. The deep CNN with a 3D U-Net architecture is used to estimate the probability maps for vertebral bone centers from the CT images as the input. A post-processing scheme is then applied to find all the regions with positive response, eliminate the false ones using a point-based registration method, and provide the locations and labels for the 7 cervical vertebral bones. Experiments were carried out on a dataset of 345 rabbit fetus micro-CT volumes. The images were randomly divided into training/validation/testing sets at an 80/10/10 ratio. Results demonstrated a 94.3% success rate for localization and labeling on the testing dataset of 35 images, and for all the successful cases the average bone-by-bone localization error was 0.84 voxels.
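A hedged sketch of the probability-map target described above (volume size, sigma, and center coordinates are dummy values, not the study's parameters): one 3D map with Gaussian decay per cervical vertebra, stacked into a multi-channel regression target for a 3D U-Net.

```python
# Sketch: build 3D Gaussian-decay probability maps around vertebral-bone centers.
import numpy as np

def gaussian_heatmap(shape, center, sigma=3.0):
    """shape: (Z, Y, X) of the volume; center: (z, y, x) voxel coordinates."""
    zz, yy, xx = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
    d2 = (zz - center[0]) ** 2 + (yy - center[1]) ** 2 + (xx - center[2]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2)).astype(np.float32)  # maximum of 1 at the center

# Stack one map per cervical vertebra (C1-C7) into a 7-channel training target.
volume_shape = (96, 96, 96)                          # hypothetical resampled size
centers = [(40 + 6 * i, 48, 48) for i in range(7)]   # dummy C1..C7 centers
target = np.stack([gaussian_heatmap(volume_shape, c) for c in centers], axis=0)
print(target.shape)  # (7, 96, 96, 96)
```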
Quantitative and qualitative methods for efficient evaluation of multiple 3D organ segmentations
Volker Dicken, Annika Hänsch, Jan Moltz, et al.
Quantitative comparison of automatic results for multi-organ segmentation by means of Dice scores often does not yield satisfactory results. It is especially challenging when reference contours may be prone to errors. We developed a novel approach that analyzes regions of high mismatch between automatic and reference segmentations. We extract various metrics characterizing these mismatch clusters and compare them to other metrics derived from volume overlap and surface distance histograms by correlating them with qualitative ratings from clinical experts. We show that some novel features based on the mismatch sets or surface distance histograms perform better than the Dice score. We also show how the mismatch clusters can be used to generate visualizations that reduce the workload for visual inspection of segmentation results. The visualizations directly compare the reference to the automatic result at locations of high mismatch in orthogonal 2D views and 3D scenes zoomed to the appropriate positions. This can make it easier to detect systematic problems of an algorithm or to compare recurrent error patterns for different variants of segmentation algorithms, such as differently parameterized or trained CNN models.
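An illustrative sketch of the two ingredients mentioned above (inputs and the per-cluster metric are assumed, not the authors' implementation): the Dice score, and connected mismatch clusters extracted from the symmetric difference of the two segmentations.

```python
# Sketch: Dice score and connected mismatch clusters between automatic and reference masks.
import numpy as np
from scipy import ndimage

def dice(a, b):
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def mismatch_clusters(auto, ref, voxel_volume_ml=0.001):
    """Volume (ml) of each connected component of the symmetric difference."""
    mismatch = np.logical_xor(auto.astype(bool), ref.astype(bool))
    labels, n = ndimage.label(mismatch)
    sizes = ndimage.sum(mismatch, labels, index=range(1, n + 1))
    return np.asarray(sizes) * voxel_volume_ml

# Dummy example: a sphere shifted by two voxels against its reference.
zz, yy, xx = np.ogrid[:64, :64, :64]
ref = (zz - 32) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2
auto = (zz - 34) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2
print(dice(auto, ref), mismatch_clusters(auto, ref))
```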
Cardiac Imaging
Automatic cardiac landmark localization by a recurrent neural network
Mike van Zon, Mitko Veta, Shuo Li
Localization of cardiac anatomical landmarks is an important step towards a more robust and accurate analysis of the heart. A fully automatic hybrid framework is proposed that detects key landmark locations in cardiac magnetic resonance (MR) images. Our method is trained and evaluated for the detection of mitral valve points on long-axis MRI and RV insert points in short-axis MRI. The framework incorporates four key modules for the localization of the landmark points. The first module crops the MR image around the heart using a convolutional neural network (CNN). The second module employs a U-Net to obtain an efficient feature representation of the cardiac image, as well as to detect a preliminary location of the landmark points. In the third module, the feature representation of a cardiac image is processed with a Recurrent Neural Network (RNN). The RNN leverages either spatial or temporal dynamics from neighboring slices in space or time and obtains a second prediction for the landmark locations. In the last module the two predictions from the U-Net and RNN are combined and final locations for the landmarks are extracted. The framework is separately trained and evaluated for the localization of each landmark; it achieves a final average error of 2.87 mm for the mitral valve points and an average error of 3.64 mm for the right ventricular insert points. Our method shows that the use of a recurrent neural network to model additional temporal or spatial dependencies improves localization accuracy and achieves promising results.
Coronary calcium detection using 3D attention identical dual deep network based on weakly supervised learning
Coronary artery calcium (CAC) is a biomarker of advanced subclinical coronary artery disease and predicts myocardial infarction and death prior to age 60 years. Slice-wise manual delineation has been regarded as the gold standard of coronary calcium detection. However, manual efforts are time and resource consuming and even impracticable for large-scale cohorts. In this paper, we propose the attention identical dual network (AID-Net) to perform CAC detection using scan-rescan longitudinal non-contrast CT scans with weakly supervised attention, using only per-scan labels. To improve performance, 3D attention mechanisms were integrated into the AID-Net to provide complementary information for the classification task. Moreover, 3D Gradient-weighted Class Activation Mapping (Grad-CAM) was also proposed at the testing stage to interpret the behavior of the deep neural network. 5075 non-contrast chest CT scans were used as training, validation and testing datasets. Baseline performance was assessed on the same cohort. The proposed AID-Net achieved superior performance, with a classification accuracy of 0.9272 and an AUC of 0.9627.
Semi-automatic aortic valve tract segmentation in 3D cardiac magnetic resonance images using shape-based B-spline explicit active surfaces
Sandro Queirós, Pedro Morais, Jaime C. Fonseca, et al.
Accurate preoperative sizing of the aortic annulus (AoA) is crucial to determine the best fitting prosthesis to be implanted during transcatheter aortic valve (AV) implantation (TAVI). Although multidetector row computed tomography is currently the standard imaging modality for such assessment, 3D cardiac magnetic resonance (CMR) is a feasible radiation-free alternative. However, automatic AV segmentation and sizing in 3D CMR images is so far underexplored. In this sense, this study proposes a novel semi-automatic algorithm for AV tract segmentation and sizing in 3D CMR images using the recently presented shape-based B-spline Explicit Active Surfaces (BEAS) framework. Upon initializing the AV tract surface using two user-defined points, a dual-stage shape-based BEAS evolution is performed to segment the patient-specific AV wall. The obtained surface is then aligned with multiple reference AV tract surfaces to estimate the location of the aortic annulus, allowing the extraction of the relevant clinical measurements. The framework was validated on thirty datasets from a publicly available CMR benchmark, assessing the segmentation accuracy and the measurements’ agreement against manual sizing. The automated segmentation showed an average absolute distance error of 0.54 mm against manually delineated surfaces, while proving robust to the algorithm’s parameters. In turn, automated AoA area-derived diameters showed an excellent agreement with manual-based ones (-0.30±0.77 mm), comparable to the interobserver agreement. Overall, the proposed framework proved to be accurate, robust and computationally efficient (around 1 sec) for AV tract segmentation and sizing in 3D CMR images, thus showing its potential for preoperative TAVI planning.
Towards increased trustworthiness of deep learning segmentation methods on cardiac MRI
Current state-of-the-art deep learning segmentation methods have not yet made a broad entrance into the clinical setting in spite of high demand for such automatic methods. One important reason is the lack of reliability caused by models that fail unnoticed and often locally produce anatomically implausible results, errors that medical experts would not make. This paper presents an automatic image segmentation method based on (Bayesian) dilated convolutional networks (DCNN) that generates segmentation masks and spatial uncertainty maps for the input image at hand. The method was trained and evaluated using segmentation of the left ventricle (LV) cavity, right ventricle (RV) endocardium and myocardium (Myo) at end-diastole (ED) and end-systole (ES) in 100 cardiac 2D MR scans from the MICCAI 2017 Challenge (ACDC). Combining segmentations and uncertainty maps and employing a human-in-the-loop setting, we provide evidence that image areas indicated as highly uncertain, regarding the obtained segmentation, almost entirely cover regions of incorrect segmentations. The fused information can be harnessed to increase segmentation performance. Our results reveal that we can obtain valuable spatial uncertainty maps with low computational effort using DCNNs.
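A minimal sketch of one common way to obtain spatial uncertainty maps from an (approximately) Bayesian network, namely Monte-Carlo dropout with per-pixel predictive entropy; the toy network and the use of MC dropout here are assumptions, not the authors' exact DCNN.

```python
# Sketch: per-pixel uncertainty maps from Monte-Carlo dropout (assumed setup).
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.Dropout2d(0.1),
            nn.Conv2d(32, 32, 3, padding=2, dilation=2), nn.ReLU(), nn.Dropout2d(0.1),
            nn.Conv2d(32, n_classes, 1),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_predict(model, x, n_samples=20):
    model.train()  # keep dropout active at test time
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1) for _ in range(n_samples)])
    mean = probs.mean(dim=0)                                   # (B, C, H, W)
    entropy = -(mean * torch.log(mean + 1e-8)).sum(dim=1)      # per-pixel uncertainty map
    return mean.argmax(dim=1), entropy

model = TinySegNet()
segmentation, uncertainty = mc_dropout_predict(model, torch.randn(1, 1, 128, 128))
```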
Automatic identification of coronary arteries and viewpoints in 2D x-ray angiography using deep learning (Conference Presentation)
Tanveer F. Syeda-Mahmood, Hui Tang
The growth in volume of multi-dimensional imaging and the advent of high resolution scanners is placing a large viewing burden on clinicians. In many situations, summaries of these studies would suffice, particularly for quick viewing, easy transport and procedure planning. One easy way to organize these studies is by viewpoints depicting left and right coronary arteries. This is a difficult problem, however, requiring automated methods to (a) extract coronary arteries, (b) recognize the identity of arteries as left or right coronary arteries, and (c) recognize the viewpoints from which they are taken to examine their potential pathologies. In this paper, we present a deep learning solution that addresses this problem by using a segmentation network for detection of coronary arteries and a residual deep learning network for recognizing simultaneously the viewpoint and artery identity. Results show that the deep learning method produces reliable classification for many viewpoints.
Registration and Motion
Unsupervised learning for large motion thoracic CT follow-up registration
Image registration is the process of aligning two or more images to achieve point-wise spatial correspondence. Typically, image registration is phrased as an optimization problem w.r.t. a spatial mapping that minimizes a suitable cost function and common approaches estimate solutions by applying iterative optimization schemes such as gradient descent or Newton-type methods. This optimization is performed independently for each pair of images, which can be time consuming. In this paper we present an unsupervised learning-based approach for deformable image registration of thoracic CT scans. Our experiments show that our method performs comparable to conventional image registration methods and in particular is able to deal with large motions. Registration of a new unseen pair of images only requires a single forward pass through the network yielding the desired deformation field in less than 0.2 seconds. Furthermore, as a novelty in the context of deep-learning-based registration, we use the edge-based normalized gradient fields distance measure together with the curvature regularization as a loss function of the registration network.
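A hedged sketch of an edge-based normalized gradient fields (NGF) distance usable as a network loss, as described above; this is a simplified 2D central-difference version with an assumed epsilon, not the authors' exact implementation.

```python
# Sketch: differentiable normalized gradient fields (NGF) distance in PyTorch (2D).
import torch

def ngf_loss(fixed, moving, eps=1e-2):
    """fixed, moving: tensors of shape (B, 1, H, W)."""
    def gradients(img):
        gx = (img[..., :, 2:] - img[..., :, :-2])[..., 1:-1, :] / 2.0
        gy = (img[..., 2:, :] - img[..., :-2, :])[..., :, 1:-1] / 2.0
        return gx, gy

    fx, fy = gradients(fixed)
    mx, my = gradients(moving)
    dot = fx * mx + fy * my
    norm_f = fx ** 2 + fy ** 2 + eps ** 2
    norm_m = mx ** 2 + my ** 2 + eps ** 2
    # 1 where gradients are (anti-)parallel edges, 0 where they are unrelated.
    alignment = dot ** 2 / (norm_f * norm_m)
    return (1.0 - alignment).mean()

fixed = torch.rand(1, 1, 128, 128)
moving = torch.rand(1, 1, 128, 128, requires_grad=True)
loss = ngf_loss(fixed, moving)   # differentiable, usable as a registration-network loss
loss.backward()
```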
Progressively growing convolutional networks for end-to-end deformable image registration
Koen A. J. Eppenhof, Maxime W. Lafarge, Josien P. W. Pluim
Deformable image registration is often a slow process when using conventional methods. To speed up deformable registration, there is growing interest in using convolutional neural networks. They are comparatively fast and can be trained to estimate full-resolution deformation fields directly from pairs of images. Because deep learning-based registration methods often require rigid or affine pre-registration of the images, they do not perform true end-to-end image registration. To address this, we propose a progressive training method for end-to-end image registration with convolutional networks. The network is first trained to find large deformations at a low resolution using a smaller part of the full architecture. The network is then gradually expanded during training by adding higher resolution layers that allow the network to learn more fine-grained deformations from higher resolution data. By starting at a lower resolution, the network is able to learn larger deformations more quickly at the start of training, making pre-registration redundant. We apply this method to pulmonary CT data, and use it to register inhalation to exhalation images. We train the network using the CREATIS pulmonary CT data set, and apply the trained network to register the DIRLAB pulmonary CT data set. By computing the target registration error at corresponding landmarks we show that the error for end-to-end registration is significantly reduced by using progressive training, while retaining sub-second registration times.
Accurate registration of in vivo time-lapse images
Seyed M. M. Kahaki, Shih-Luen Wang, Armen Stepanyants
In vivo imaging experiments often require automated detection and tracking of changes in the specimen. These tasks can be hindered by variations in the position and orientation of the specimen relative to the microscope, as well as by linear and nonlinear tissue deformations. We propose a feature-based registration method, coupled with optimal transformations, designed to address these problems in 3D time-lapse microscopy images. Features are detected as local regions of maximum intensity in source and target image stacks, and their bipartite intensity dissimilarity matrix is used as an input to the Hungarian algorithm to establish initial correspondences. A random sampling refinement method is employed to eliminate outliers, and the resulting set of corresponding features is used to determine an optimal translation, rigid, affine, or B-spline transformation for the registration of the source and target images. Accuracy of the proposed algorithm was tested on fluorescently labeled axons imaged over a 68-day period with a two-photon laser scanning microscope. To that end, multiple axons in individual stacks of images were traced semi-manually and optimized in 3D, and the distances between the corresponding traces were measured before and after the registration. The results show that there is a progressive improvement in the registration accuracy with increasing complexity of the transformations. In particular, sub-micrometer accuracy (2-3 voxels) was achieved with the regularized affine and B-spline transformations.
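An illustrative sketch of the correspondence step described above (the feature descriptors, dissimilarity measure, and the simple outlier filter are assumptions standing in for the authors' intensity dissimilarity and random-sampling refinement): the Hungarian algorithm applied to a bipartite dissimilarity matrix.

```python
# Sketch: Hungarian matching of detected features between source and target stacks.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

# Hypothetical feature descriptors: local intensity patches flattened to vectors.
source_feats = rng.random((40, 27))   # 40 features from the source stack
target_feats = rng.random((35, 27))   # 35 features from the target stack

# Bipartite dissimilarity matrix (here: Euclidean distance between descriptors).
diss = np.linalg.norm(source_feats[:, None, :] - target_feats[None, :, :], axis=-1)

# The Hungarian algorithm gives the minimum-cost one-to-one assignment.
rows, cols = linear_sum_assignment(diss)
correspondences = list(zip(rows, cols))

# A simple cost-based outlier rejection, standing in for the random-sampling refinement.
costs = diss[rows, cols]
keep = costs < np.median(costs) + 2 * costs.std()
inliers = [c for c, k in zip(correspondences, keep) if k]
```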
Analysis of the kinematic motion of the wrist from 4D magnetic resonance imaging
Batool Abbas, James Fishbaugh, Catherine Petchprapa, et al.
Static MRI and CT are both limited in their capacity and usability in the study of wrist kinematics of living human subjects. 4D MRI provides an effective means to addressing these limitations but it comes with its own set of challenges, including low resolution and anisotropic voxel size. In this paper, we describe our methodology to effectively solve these challenges and quantify the 3D kinematics of dynamic wrist data acquired from two volunteers using 4D MRI.
Automatic left ventricular segmentation in 4D interventional ultrasound data using a patient-specific temporal synchronized shape prior
Pedro Morais, Sandro Queirós, Carla Pereira, et al.
The fusion of pre-operative 3D magnetic resonance (MR) images with real-time 3D ultrasound (US) images can be the most beneficial way to guide minimally invasive cardiovascular interventions without radiation. Previously, we addressed this topic through a strategy to segment the left ventricle (LV) on interventional 3D US data using a personalized shape prior obtained from a pre-operative MR scan. Nevertheless, this approach was semi-automatic, requiring a manual alignment between US and MR image coordinate systems. In this paper, we present a novel solution to automate the abovementioned pipeline. In this sense, a method to automatically detect the right ventricular (RV) insertion point on the US data was developed, which is subsequently combined with pre-operative annotations of the RV position in the MR volume, therefore allowing an automatic alignment of their coordinate systems. Moreover, a novel strategy to ensure a correct temporal synchronization of the US and MR models is applied. Finally, a full evaluation of the proposed automatic pipeline is performed. The proposed automatic framework was tested in a clinical database with 24 patients containing both MR and US scans. A similar performance between the proposed and the previous semi-automatic version was found in terms of relevant clinical measurements. Additionally, the automatic strategy to detect the RV insertion point showed its effectiveness, with a good agreement against manually identified landmarks. The proposed automatic method showed high feasibility and a performance similar to the semi-automatic version, reinforcing its potential for normal clinical routine.
Lung cancer detection using co-learning from chest CT images and clinical demographics
Early detection of lung cancer is essential in reducing mortality. Recent studies have demonstrated the clinical utility of low-dose computed tomography (CT) to detect lung cancer among individuals selected based on very limited clinical information. However, this strategy yields high false positive rates, which can lead to unnecessary and potentially harmful procedures. To address such challenges, we established a pipeline that co-learns from detailed clinical demographics and 3D CT images. Toward this end, we leveraged data from the Consortium for Molecular and Cellular Characterization of Screen-Detected Lesions (MCL), which focuses on early detection of lung cancer. A 3D attention-based deep convolutional neural net (DCNN) is proposed to identify lung cancer from the chest CT scan without prior anatomical location of the suspicious nodule. To improve upon the non-invasive discrimination between benign and malignant, we applied a random forest classifier to a dataset integrating clinical information to imaging data. The results show that the AUC obtained from clinical demographics alone was 0.635 while the attention network alone reached an accuracy of 0.687. In contrast when applying our proposed pipeline integrating clinical and imaging variables, we reached an AUC of 0.787 on the testing dataset. The proposed network both efficiently captures anatomical information for classification and also generates attention maps that explain the features that drive performance.
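A hedged sketch of the co-learning step (the feature names, cohort size, and label construction are hypothetical, not MCL data): a random forest combining an image-derived malignancy score with clinical demographics, evaluated by AUC.

```python
# Sketch: random forest over an imaging score plus clinical demographics (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Columns: [CNN malignancy score, age, pack-years, sex] -- illustrative only.
cnn_score = rng.random(n)
age = rng.normal(62, 7, n)
pack_years = rng.normal(40, 15, n)
sex = rng.integers(0, 2, n)
X = np.column_stack([cnn_score, age, pack_years, sex])
y = (cnn_score + 0.01 * (age - 62) + rng.normal(0, 0.3, n) > 0.6).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```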
Deep Learning: Lesions and Pathologies
Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder
Lesions that appear hyperintense in both Fluid Attenuated Inversion Recovery (FLAIR) and T2-weighted magnetic resonance images (MRIs) of the human brain are common in the brains of the elderly population and may be caused by ischemia or demyelination. Lesions are biomarkers for various neurodegenerative diseases, making accurate quantification of them important for both disease diagnosis and progression. Automatic lesion detection using supervised learning requires manually annotated images, which can often be impractical to acquire. Unsupervised lesion detection, on the other hand, does not require any manual delineation; however, these methods can be challenging to construct due to the variability in lesion load, placement of lesions, and voxel intensities. Here we present a novel approach to address this problem using a convolutional autoencoder, which learns to segment brain lesions as well as the white matter, gray matter, and cerebrospinal fluid by reconstructing FLAIR images as conical combinations of softmax layer outputs generated from the corresponding T1, T2, and FLAIR images. Some of the advantages of this model are that it accurately learns to segment lesions regardless of lesion load, and it can be used to quickly and robustly segment new images that were not in the training set. Comparisons with state-of-the-art segmentation methods evaluated on ground truth manual labels indicate that the proposed method works well for generating accurate lesion segmentations without the need for manual annotations.
Fully automated unruptured intracranial aneurysm detection and segmentation from digital subtraction angiography series using an end-to-end spatiotemporal deep neural network
Hailan Jin, Yin Yin, Minghui Hu, et al.
Introduction: Digital subtraction angiography (DSA) is the gold standard in detection of intracranial aneurysms, a potentially life-threatening condition. Early detection, diagnosis and treatment of unruptured intracranial aneurysms (UIAs) based on DSA can effectively decrease the incidence of cerebral hemorrhage. Methods: We proposed and evaluated a novel fully automated detection and segmentation deep neural network structure to help neurologists find and contour UIAs from 2D+time DSA sequences during UIA treatment. The network structure is based on a general U-shape design for medical image segmentation and detection. The network further uses a fully convolutional technique to detect aneurysms in high resolution DSA frames. In addition, a bidirectional convolutional long short-term memory (LSTM) module is introduced at each level of the network to capture the contrast medium flow change across the 2D DSA frames. The resulting network incorporates both spatial and temporal information from DSA sequences and can be trained end-to-end. Experiments: The proposed network structure was trained with DSA sequences from 347 patients with presence of UIAs. After that, the system was evaluated on an independent test set with 947 DSA sequences from 146 patients. Results: 316 out of 354 (89.3%) aneurysms were successfully detected, which corresponds to a more clinically relevant blood-vessel-level sensitivity of 94.3% at a false positive rate of 3.77 per sequence. The system runs in less than one second per sequence, with an average Dice coefficient of 0.533.
CT synthesis from MR images for orthopedic applications in the lower arm using a conditional generative adversarial network
F. Zijlstra, K. Willemsen M.D., M. C. Florkow, et al.
Purpose: To assess the feasibility of deep learning-based high resolution synthetic CT generation from MRI scans of the lower arm for orthopedic applications. Methods: A conditional Generative Adversarial Network was trained to synthesize CT images from multi-echo MR images. A training set of MRI and CT scans of 9 ex vivo lower arms was acquired and the CT images were registered to the MRI images. Three-fold cross-validation was applied to generate independent results for the entire dataset. The synthetic CT images were quantitatively evaluated with the mean absolute error metric, and Dice similarity and surface to surface distance on cortical bone segmentations. Results: The mean absolute error was 63.5 HU on the overall tissue volume and 144.2 HU on the cortical bone. The mean Dice similarity of the cortical bone segmentations was 0.86. The average surface to surface distance between bone on real and synthetic CT was 0.48 mm. Qualitatively, the synthetic CT images corresponded well with the real CT scans and partially maintained high resolution structures in the trabecular bone. The bone segmentations on synthetic CT images showed some false positives on tendons, but the general shape of the bone was accurately reconstructed. Conclusions: This study demonstrates that high quality synthetic CT can be generated from MRI scans of the lower arm. The good correspondence of the bone segmentations demonstrates that synthetic CT could be competitive with real CT in applications that depend on such segmentations, such as planning of orthopedic surgery and 3D printing.
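A minimal sketch of the three reported evaluation metrics (inputs are dummy arrays; the surface-distance definition here is a simple one-directional mean, an assumption rather than the authors' exact measure): mean absolute error in HU, Dice similarity of bone masks, and a surface-to-surface distance.

```python
# Sketch: MAE (HU), Dice, and mean surface-to-surface distance on synthetic volumes.
import numpy as np
from scipy import ndimage

def mae_hu(real_ct, synth_ct, mask=None):
    diff = np.abs(real_ct - synth_ct)
    return diff[mask].mean() if mask is not None else diff.mean()

def dice(a, b):
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def mean_surface_distance(a, b, spacing=(1.0, 1.0, 1.0)):
    """Mean distance from the surface of mask a to the surface of mask b (one direction)."""
    surf_a = a.astype(bool) ^ ndimage.binary_erosion(a.astype(bool))
    surf_b = b.astype(bool) ^ ndimage.binary_erosion(b.astype(bool))
    dist_to_b = ndimage.distance_transform_edt(~surf_b, sampling=spacing)
    return dist_to_b[surf_a].mean()

# Dummy volumes: a synthetic CT offset by 50 HU and a slightly shifted bone mask.
real = np.random.normal(0, 200, (32, 32, 32))
synth = real + 50
bone_real = np.zeros((32, 32, 32), bool); bone_real[10:20, 10:20, 10:20] = True
bone_synth = np.roll(bone_real, 1, axis=0)
print(mae_hu(real, synth), dice(bone_real, bone_synth),
      mean_surface_distance(bone_real, bone_synth))
```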
Weakly supervised fully convolutional network for PET lesion segmentation
S. Afshari, A. BenTaieb, Z. MiriKharaji, et al.
The effort involved in creating accurate ground truth segmentation maps hinders advances in machine learning approaches to tumor delineation in clinical positron emission tomography (PET) scans. To address this challenge, we propose a fully convolutional network (FCN) model to delineate tumor volumes from PET scans automatically while relying on weak annotations in the form of bounding boxes (without delineations) around tumor lesions. To achieve this, we propose a novel loss function that dynamically combines a supervised component, designed to leverage the training bounding boxes, with an unsupervised component, inspired by the Mumford-Shah piecewise constant level-set image segmentation model. The model is trained end-to-end with the proposed differentiable loss function and is validated on a public clinical PET dataset of head and neck tumors. Using only bounding box annotations as supervision, the model achieves competitive results with state-of-the-art supervised and semi-automatic segmentation approaches. Our proposed approach improves the Dice similarity by approximately 30% and reduces the unsigned distance error by approximately 7 mm compared to a model trained with only bounding boxes (weak supervision). Also, after a post-processing step (morphological operations), our weakly supervised approach differs by only 7% in Dice similarity from the fully supervised model.
Lesion focused super-resolution
Jin Zhu, Guang Yang, Pietro Lio
Super-resolution (SR) for image enhancement has great importance in medical image applications. Broadly speaking, there are two types of SR: one requires multiple low resolution (LR) images from different views of the same object to reconstruct the high resolution (HR) output, and the other relies on learning from a large amount of training data, i.e., LR-HR pairs. In a real clinical environment, acquiring images from multiple views is expensive and sometimes infeasible. In this paper, we present a novel Generative Adversarial Networks (GAN) based learning framework to achieve SR from an LR input. By performing simulation-based studies on the Multimodal Brain Tumor Segmentation Challenge (BraTS) datasets, we demonstrate the efficacy of our method for brain tumor MRI enhancement. Compared to bilinear interpolation and other state-of-the-art SR methods, our model is lesion focused, which not only results in better perceptual image quality without blurring, but is also more efficient and directly benefits downstream clinical tasks, e.g., lesion detection and abnormality enhancement. Therefore, we can envisage the application of our SR method to boost image spatial resolution while maintaining crucial diagnostic information for further clinical tasks.
OCT and Microscopy
Variational autoencoding tissue response to microenvironment perturbation
Geoffrey F. Schau, Guillaume Thibault, Mark A. Dane, et al.
This work applies deep variational autoencoder learning architecture to study multi-cellular growth characteristics of human mammary epithelial cells in response to diverse microenvironment perturbations. Our approach introduces a novel method of visualizing learned feature spaces of trained variational autoencoding models that enables visualization of principal features in two dimensions. We find that unsupervised learned features more closely associate with expert annotation of cell colony organization than biologically-inspired hand-crafted features, demonstrating the utility of deep learning systems to meaningfully characterize features of multi-cellular growth characteristics in a fully unsupervised and data-driven manner.
Approximation of a pipeline of unsupervised retina image analysis methods with a CNN
A pipeline of unsupervised image analysis methods for extraction of geometrical features from retinal fundus images has previously been developed. Features related to vessel caliber, tortuosity and bifurcations, have been identified as potential biomarkers for a variety of diseases, including diabetes and Alzheimer’s. The current computationally expensive pipeline takes 24 minutes to process a single image, which impedes implementation in a screening setting. In this work, we approximate the pipeline with a convolutional neural network (CNN) that enables processing of a single image in a few seconds. As an additional benefit, the trained CNN is sensitive to key structures in the retina and can be used as a pretrained network for related disease classification tasks. Our model is based on the ResNet-50 architecture and outputs four biomarkers that describe global properties of the vascular tree in retinal fundus images. Intraclass correlation coefficients between the predictions of the CNN and the results of the pipeline showed strong agreement (0.86 - 0.91) for three of four biomarkers and moderate agreement (0.42) for one biomarker. Class activation maps were created to illustrate the attention of the network. The maps show qualitatively that the activations of the network overlap with the biomarkers of interest, and that the network is able to distinguish venules from arterioles. Moreover, local high and low tortuous regions are clearly identified, confirming that a CNN is sensitive to key structures in the retina.
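An illustrative sketch of the regression setup described above (the exact backbone configuration and training details are assumptions, not the trained model): a ResNet-50 with a four-output head predicting the global vascular biomarkers.

```python
# Sketch: ResNet-50 backbone with a four-output regression head for retinal biomarkers.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 4)   # four biomarkers as regression targets

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One dummy training step on fundus-sized crops and pipeline-derived targets.
images = torch.randn(2, 3, 224, 224)
targets = torch.randn(2, 4)          # e.g. caliber, tortuosity, bifurcation features
loss = criterion(model(images), targets)
loss.backward()
optimizer.step()
```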
Segmentation of corneal optical coherence tomography images using Graph Search and Radon transform
Amr Elsawy, Mohamed Abdel-Mottaleb, Mohamed Abou Shousha
Various common corneal eye diseases, such as dry eye, Fuchs endothelial dystrophy, Keratoconus and corneal graft rejection, can be diagnosed based on the changes in the thickness of corneal microlayers. Optical Coherence Tomography (OCT) technology made it possible to obtain high resolution corneal images that show the microlayered structures of the cornea. Manual segmentation is subjective and not feasible due to the large volume of obtained images. Existing automatic methods, used for segmenting corneal layer interfaces, are not robust and they segment few corneal microlayer interfaces. Moreover, there is no large annotated database of corneal OCT images, which is an obstacle towards the application of powerful machine learning methods such as deep learning for the segmentation of corneal interfaces. In this paper, we propose a novel segmentation method for corneal OCT images using Graph Search and Radon Transform. To the best of our knowledge, we are the first to develop an automatic segmentation method for the six corneal microlayer interfaces. The proposed method involves a novel image denoising method and an inner interfaces localization method. The proposed method was tested on 15 corneal OCT images. The images were randomly selected and manually segmented by two operators. Experimental results show that our method has a mean segmentation error of 3.87 ± 5.21 pixels (i.e. 5.81 ± 7.82μm) across all interfaces compared to the segmentation of the manual operators. The two manual operators have mean segmentation difference of 4.07 ± 4.71 pixels (i.e. 6.11 ± 7.07μm). The mean running time to segment all the corneal microlayer interfaces is 6.66 ± 0.22 seconds.
Framework for the co-registration of MRI and histology images in prostate cancer patients with radical prostatectomy
Mirabela Rusu, Christian Kunder, Richard Fan, et al.
Prostate magnetic resonance imaging (MRI) allows the detection and treatment planning of clinically significant cancers. However, indolent cancers, e.g., those with Gleason scores 3+3, are not readily distinguishable on MRI. Thus an image-guided biopsy is still required before proceeding with a radical treatment for aggressive tumors or considering active surveillance for indolent disease. The excision of the prostate as part of radical prostatectomy treatments provides a unique opportunity to correlate whole-mount histology slices with MRI. Through a careful spatial alignment of histology slices and MRI, the extent of aggressive and indolent disease can be mapped on MRI which allows one to investigate MRI-derived features that might be able to distinguish aggressive from indolent cancers. Here, we introduce a framework for the 3D spatial integration of radiology and pathology images in the prostate. Our approach, first, uses groupwise-registration methods to reconstruct the histology specimen prior to sectioning, and incorporates the MRI as a spatial constraint, and, then, performs a multi-modal 3D affine and deformable alignment between the reconstructed histology specimen and the MRI. We tested our approach on 15 studies and found a Dice similarity coefficient of 0.94±0.02 and a urethra deviation of 1.11±0.34 mm between the histology reconstruction and the MRI. Our robust framework successfully mapped the extent of disease from histology slices on MRI and created ground truth labels for characterizing aggressive and indolent disease on MRI.
Predicting histopathological findings of gastric cancer via deep generalized multi-instance learning
Mengjie Fang, Wenjuan Zhang, Di Dong, et al.
In this paper, we investigate the problem of predicting the histopathological findings of gastric cancer (GC) from preoperative CT images. Unlike most existing classification systems, which assess the global imaging phenotype of tissues directly, we formulate the problem as a generalized multi-instance learning (GMIL) task and design a deep GMIL framework to address it. Specifically, the proposed framework aims at training a powerful convolutional neural network (CNN) which is able to discriminate the informative patches from the neighboring confusing patches and yield accurate patient-level classification. To achieve this, we first train a CNN for coarse patch-level classification in a GMIL manner to form several groups which contain the informative patches for each histopathological category, the intra-tumor ambiguous patches, and the extra-tumor irrelevant patches, respectively. Then we modify the fully-connected layer to introduce the latter two classes of patches and retrain the CNN model. In the inference stage, patient-level classification is implemented based on the group of candidate informative patches automatically recognized by the model. To evaluate the performance and generalizability of our approach, we successively apply it to predict two kinds of histopathological findings (differentiation degree [two categories] and Lauren classification [three categories]) on a dataset including 433 GC patients with venous phase contrast-enhanced CT scans. Experimental results reveal that our deep GMIL model has a powerful predictive ability, with accuracies of 0.815 and 0.731 in the two applications respectively, and it significantly outperforms the standard CNN model and the traditional texture-based model (more than 14% and 17% accuracy increase).
Objects characterization-based approach to enhance detection of degree of malignancy in breast cancer histopathology
Histologic grading from images has become widely accepted as a powerful indicator of prognosis in breast cancer. Automated grading can assist the doctor in diagnosing the medical condition. But algorithms still lag behind human experts in this task, as human experts excel in identifying parts, detecting characteristics and relating concepts and semantics. This can be improved by making algorithms distinguish and characterize the most relevant types of objects in the image and characterizing images in those terms. We propose a three-stage automated approach named OBI (Object-based Identification) with steps: 1. Object-based identification, which identifies the “type of object” of each region and characterizes it; 2. Learn about image, which characterizes the distribution of characteristics of those types of objects in the image; 3. Determination of degree of malignancy, which assigns a degree of malignancy based on a classifier over object type characteristics (the statistical distribution of characteristics of structures) in the image. Our proof-of-concept prototype uses the publicly available Mytos-Atypia dataset [19] to compare accuracy with alternatives. Results summary: human expert (medical doctor) 84%, classic machine learning 74%, convolutional neural networks (CNN) 78%, our approach (OBI) 86%. As future work, we expect to generalize our results to other datasets and problems, explore mimicking knowledge of human concepts further, merge the object-based approach with CNN techniques and adapt it to other medical imaging contexts.
Poster Session
Bayesian inference for uncertainty quantification in point-based deformable image registration
Sandra Schultz, Julia Krüger, Heinz Handels, et al.
In image guided diagnostics the treatment of patients is often decided based on registered image data. During the registration process errors can occur, e.g., due to incorrect model assumptions or non-corresponding areas due to image artifacts or pathologies. Therefore, the study of approaches that analyze the accuracy and reliability of registration results has become increasingly important in recent years. One way to quantify registration uncertainty is based on the posterior distribution of the transformation parameters. Since the exact computation of the posterior distribution is intractable, variational Bayes inference can be used to efficiently provide an approximate solution. Recently, a probabilistic approach to intensity-based registration has been developed that uses sparse point-based representations of images and shows an intrinsic ability to deal with corrupted data. A natural output are correspondence probabilities between the two point sets which provide a measure for potentially non-corresponding and thus incorrectly deformed regions. In order to perform a comparative analysis of registration uncertainty and correspondence probabilities, we integrate a nonlinear point-based probabilistic registration method in a variational Bayesian framework. The developed method is applied to MR images with brain lesions, where both measures show moderate correlations, but a different behavior with respect to altered regularization. Further, we simulate realistic ground-truth data to allow for a correlation analysis between both measures and local registration errors. In fact, registration errors due to model differences cannot be depicted by registration uncertainty, however, in the presence of corrupted image areas, a strong correlation can be found.
3D bifurcations characterization for intra-cranial aneurysms prediction
An aneurysm is a vascular disorder represented by a ballooning of a blood vessel. The blood vessel’s wall is distorted by the blood flow, and a bulge forms there. When ruptured, the aneurysm may cause irreversible damage and could even lead to premature death. Intra-cranial aneurysms are the ones presenting the higher risks. In this work, thanks to a graph-based approach, we detect the bifurcations located on the circle of Willis within the mouse cerebral vasculature. Once properly located in the 3D stack, we characterize the cerebral artery bifurcations, i.e., we gather several properties of each bifurcation, such as its angles or cross-sectional areas, in order to further estimate geometrical patterns that can favor the occurrence of an intra-cranial aneurysm. Indeed, apart from genetic predisposition and environmental risk factors (high blood pressure, smoking habits, ...), the anatomical disposition of the brain vasculature may influence the chances of an aneurysm forming. Our objective in this paper is to obtain accurate measurements of the 3D bifurcations.
Shape-based three-dimensional body composition extrapolation using multimodality registration
Yao Lu, James K. Hahn
The ubiquity of commodity-level optical scan devices and reconstruction technologies has enabled the general public to monitor their body shape related health status anywhere, anytime, without assistance from professionals. Commercial optical body scan systems extract anthropometries from the virtual body shapes, from which body compositions are estimated. However, in most cases, these estimations are limited to the quantity of fat in the whole body instead of a fine-granularity voxel-level fat distribution estimation. To bridge the gap between the 3D body shape and fine-granularity voxel-level fat distribution, we present an innovative shape-based voxel-level body composition extrapolation method using multimodality registration. First, we optimize shape compliance between a generic body composition template and the 3D body shape. Then, we optimize data compliance between the shape-optimized body composition template and a body composition reference from the DXA pixel-level body composition assessment. We evaluate the performance of our method with different subjects. On average, the Root Mean Square Error (RMSE) of our body composition extrapolation is 1.19%, and the R-squared value between our estimation and the ground truth is 0.985. The experimental result shows that our algorithm can robustly estimate voxel-level body composition for 3D body shapes with a high degree of accuracy.
Optimal input configuration of dynamic contrast enhanced MRI in convolutional neural networks for liver segmentation
Mariëlle J. A. Jansen, Hugo J. Kuijf, Josien P. W. Pluim
Most MRI liver segmentation methods use a structural 3D scan as input, such as a T1 or T2 weighted scan. Segmentation performance may be improved by utilizing both structural and functional information, as contained in dynamic contrast enhanced (DCE) MR series. Dynamic information can be incorporated in a segmentation method based on convolutional neural networks in a number of ways. In this study, the optimal input configuration of DCE MR images for convolutional neural networks (CNNs) is studied. The performance of three different input configurations for CNNs is studied for a liver segmentation task. The three configurations are I) one phase image of the DCE-MR series as input image; II) the separate phases of the DCE-MR as input images; and III) the separate phases of the DCE-MR as channels of one input image. The three input configurations are fed into a dilated fully convolutional network and into a small U-net. The CNNs were trained using 19 annotated DCE-MR series and tested on another 19 annotated DCE-MR series. The performance of the three input configurations for both networks is evaluated against manual annotations. The results show that both neural networks perform better when the separate phases of the DCE-MR series are used as channels of an input image in comparison to one phase as input image or the separate phases as input images. No significant difference between the performances of the two network architectures was found for the separate phases as channels of an input image.
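A minimal sketch of the three input configurations compared above, in terms of tensor shapes for a DCE-MR series with P phases (shapes and the toy convolution are illustrative; the actual networks are omitted).

```python
# Sketch: the three DCE-MR input configurations as tensor layouts.
import torch

P, H, W = 6, 256, 256
dce_series = torch.randn(P, H, W)          # registered DCE-MR phases of one slice

# I) a single phase as the input image: shape (batch, 1, H, W)
config_one_phase = dce_series[2][None, None]

# II) separate phases as separate input images: P tensors of shape (batch, 1, H, W)
config_separate = [phase[None, None] for phase in dce_series]

# III) separate phases as channels of one input image: shape (batch, P, H, W);
#      the first convolution of the CNN then needs P input channels.
config_channels = dce_series[None]

first_conv = torch.nn.Conv2d(in_channels=P, out_channels=32, kernel_size=3, padding=1)
features = first_conv(config_channels)     # (1, 32, 256, 256)
```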
Incorporating CT prior information in the robust fuzzy C-means algorithm for QSPECT image segmentation
Bones are a common site of metastases in a number of cancers including prostate and breast cancer. Assessing response or progression typically relies on planar bone scintigraphy. However, quantitative bone SPECT (BQSPECT) has the potential to provide more accurate assessment. An important component of BQSPECT is segmenting lesions and bones in order to calculate metrics like tumor uptake and metabolic tumor burden. However, due to the poor spatial resolution, noise, and contrast properties of SPECT images, segmentation of bone SPECT images is challenging. In this study, we propose and evaluate a fuzzy C-means (FCM) clustering-based semi-automatic segmentation method for Tc-99m MDP quantitative SPECT/CT. The FCM clustering algorithm has been widely used in medical image segmentation. Yet, the poor resolution and noise properties of SPECT images result in sub-optimal segmentation. We propose to incorporate information from registered CT images, which can be used to segment normal bones quite readily, into the FCM segmentation algorithm. The proposed method modifies the objective function of the robust fuzzy C-means (RFCM) method to include prior information about bone from CT images and spatial information from the SPECT image to allow for simultaneously segmenting lesion and bone in BQSPECT/CT images. The method was evaluated using realistic simulated BQSPECT images. The method and algorithm parameters were evaluated with respect to the Dice similarity coefficient (DSC) computed using segmentation results. The effect of the number of iterations used to reconstruct the BQSPECT images was also studied. For the simulated BQSPECT images studied, an average DSC value of 0.75 was obtained for lesions larger than 2 cm3 with the proposed method.
Automatic two-chamber segmentation in cardiac CTA using 3D fully convolutional neural networks
Cardiac chamber segmentation has proved to be essential in many clinical applications including cardiac functional analysis, myocardium analysis and electrophysiology studies for ablation planning. Traditional rule-based or model-based approaches have been widely developed and employed, however these methods can be time consuming to run and sometimes fail when certain rules are not met. Recent advances in deep learning provide a new approach to solving these segmentation problems. In this work we employ a TensorFlow implementation of the 3D U-Net trained with 413 cardiac CTA volumes to segment the left ventricle (LV) and the left atrium (LA). The network is tested on 162 unseen volumes. For the LV the Dice similarity coefficient (DSC) reaches 90.2±2.6% and for the LA 87.6±7.5%. The number of training and testing samples far exceeds the common size of datasets seen in the literature thanks to the existing rule-based algorithm in Vitrea®’s Cardiac Functional CT protocol, which was used to provide the segmentation labels. The labels are manually filtered, and only accurate labels are kept for training and testing. For the datasets with inaccurate labels, the trained network has proved to perform better in generating more accurate boundaries around the aortic valve, mitral valve and the apex of the LV. The TensorFlow implementation allows for faster training, which takes 3-4 hours, and inferencing, which takes less than 6 seconds to simultaneously segment 12 CT volumes. This significantly reduces the pre-processing time required for cardiac functional CT studies, which usually consist of 10-20 cardiac phases and take minutes to segment with traditional methods.
Multiscale deep desmoking for laparoscopic surgery
In minimally invasive surgery, smoke generated by devices such as electrocautery and laser ablation severely deteriorates image quality. This creates an uncomfortable view for the surgeon, which may increase surgical risk and degrade the performance of computer assisted surgery algorithms such as segmentation, reconstruction, tracking, etc. Therefore, real-time smoke removal is required to keep a clear field of view. In this paper, we propose a real-time smoke removal approach based on a Convolutional Neural Network (CNN). An encoder-decoder architecture with a Laplacian image pyramid decomposition input strategy is proposed. This is an end-to-end network which takes the smoke image and its Laplacian image pyramid decomposition as inputs, and outputs a smoke-free image directly without relying on any physical models or estimation of intermediate parameters. This design can easily be further embedded into deep learning based follow-up image guided surgery processes such as segmentation and tracking tasks. A dataset with synthetic smoke images generated from Blender and Adobe Photoshop is employed for training the network. The result is evaluated quantitatively on synthetic images and qualitatively on a laparoscopic dataset degraded with real smoke. Our proposed method can eliminate smoke effectively while preserving the original colors and reaches 26 fps for a video of size 512 × 512 on our training machine. The obtained results not only demonstrate the efficiency and effectiveness of the proposed CNN structure, but also prove the potency of training the network on a synthetic dataset.
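A hedged sketch of the Laplacian pyramid input mentioned above (an OpenCV-based decomposition with an assumed number of levels; the desmoking network itself is not shown).

```python
# Sketch: Laplacian image pyramid decomposition of a laparoscopic frame.
import cv2
import numpy as np

def laplacian_pyramid(img, levels=3):
    """Return [L0, ..., L_{levels-1}, residual] for a float32 image."""
    pyramid, current = [], img
    for _ in range(levels):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)     # band-pass detail at this scale
        current = down
    pyramid.append(current)              # low-frequency residual
    return pyramid

smoke_img = (np.random.rand(512, 512, 3) * 255).astype(np.uint8)  # stand-in frame
pyr = laplacian_pyramid(smoke_img.astype(np.float32) / 255.0)
print([level.shape for level in pyr])    # (512,512,3), (256,256,3), (128,128,3), (64,64,3)
```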
Subvoxel vessel wall thickness measurements from vessel wall MR images
K. M. van Hespen, J. J. M. Zwanenburg, J. Hendrikse, et al.
Early vessel wall thickening is seen as an indicator for the development of cerebrovascular disease. Quantification of wall thickening using conventional measurement methods is difficult owing to the relatively thin vessel wall compared to the acquired MR voxel size. We hypothesize that a convolutional neural network (CNN) can incorporate the spatial orientation, shape, and intensity distribution of the vessel wall in an accurate thickness estimation for subvoxel walls. MR imaging of 34 post-mortem specimens was performed using a 3D gradient echo protocol (isotropic acquired voxel size: 0.11 mm; acquisition time: 5h46m). Simulating clinically feasible resolutions, image patches were sampled at an isotropic voxel size of 0.8 mm (patch size: 11³ voxels). Image patches were sampled centered around vessel wall voxels, where the wall thickness of the center voxel was measured at the original resolution using a validated measurement method. The image patches were fed into our CNN, which consisted of five subsequent 3D convolutional layers, followed by two fully connected layers feeding into the linearly activated output layer. Our network can distinguish walls with a target thickness between 0.2-1.0 mm. In this range, the median offset between the target thickness and estimated thickness is 0.14 mm (interquartile range: 0.22 mm). For walls with a target thickness below and above half the voxel size (0.4 mm), the median offset is 0.17 mm and 0.10 mm, respectively. In conclusion, our results show that a CNN can accurately measure the thickness of subvoxel vessel walls, down to half the voxel size.
VinceptionC3D: a 3D convolutional neural network for retinal OCT image classification
Shuanglang Feng, Weifang Zhu, Heming Zhao, et al.
In order to enable further and more accurate automatic analysis and processing of optical coherence tomography (OCT) images, such as layer segmentation, disease region segmentation, registration, etc., it is necessary to screen OCT images first. In this paper, we propose an efficient multi-class 3D retinal OCT image classification network named VinceptionC3D. VinceptionC3D is a 3D convolutional neural network which improves on the basic C3D by adding improved 3D inception modules. Our main contributions are: (1) demonstrating that a fine-tuned C3D pretrained on natural action video datasets can be applied to the classification of 3D retinal OCT images; (2) improving the network by employing 3D inception modules which can capture multi-scale features. The proposed method is trained and tested on 873 3D OCT images with 6 classes. The average accuracies of the C3D with randomly initialized weights, the C3D with pre-trained weights, and the proposed VinceptionC3D with pre-trained weights are 89.35%, 92.09% and 94.04%, respectively. The results show that the proposed VinceptionC3D is effective for the 6-class 3D retinal OCT image classification.
Choroid segmentation in OCT images based on improved U-net
Xuena Cheng, Xinjian Chen, Yuhui Ma, et al.
Change of the thickness and volume of the choroid, which can be observed and quantified from optical coherence tomography (OCT) images, is a feature of many retinal diseases, such as age-related macular degeneration and myopic maculopathy. In this paper, we make targeted improvements to the U-net for segmenting the choroid of either normal or pathologically myopic retinas, obtaining the Bruch’s membrane (BM) and the choroidal-scleral interface (CSI). There are two main improvements to the U-net framework: (1) adding a refinement residual block (RRB) to the back of each encoder, which strengthens the recognition ability of each stage; (2) integrating the channel attention block (CAB) with the U-net, which enables high-level semantic information to guide the underlying details and handle the intra-class inconsistency problem. We validated our improved network on a dataset consisting of 952 OCT B-scans obtained from 95 eyes of both normal subjects and patients suffering from pathological myopia. Compared with manual segmentation, the mean choroid thickness difference is 8μm, and the mean Dice similarity coefficient is 85.0%.
Towards machine learning prediction of deep brain stimulation (DBS) intra-operative efficacy maps
Camilo Bermudez, William Rodriguez, Yuankai Huo, et al.
Deep brain stimulation (DBS) has the potential to improve the quality of life of people with a variety of neurological diseases. A key challenge in DBS is placing the stimulation electrode in the anatomical location that maximizes efficacy and minimizes side effects. Pre-operative localization of the optimal stimulation zone can reduce surgical time and morbidity. Current methods of producing efficacy probability maps rely on anatomical guidance from magnetic resonance imaging (MRI) to identify the areas with the highest efficacy in a population. In this work, we recast this problem as a classification problem, where each voxel in the MRI is a sample informed by the surrounding anatomy. We use a patch-based convolutional neural network to classify whether a stimulation coordinate yields a positive reduction in symptoms during surgery. We use a cohort of 187 patients with a total of 2,869 stimulation coordinates, from which 3D patches were extracted and associated with an efficacy score. We compare our results with a registration-based method of surgical planning. We show an improvement in classifying intraoperative stimulation coordinates as producing a positive reduction in symptoms, with an AUC of 0.670 compared to 0.627 for the baseline registration-based approach (p < 0.01). Although additional validation is needed, the proposed classification framework and deep learning method appear well suited for improving pre-surgical planning and personalizing treatment strategies.
Automated segmentation of the optic disc using deep learning
Lei Wang, Han Liu, Jian Zhang, et al.
Accurate segmentation of the optic disc (OD) depicted on color fundus images plays an important role in the early detection and quantitative diagnosis of retinal diseases, such as glaucoma and optic atrophy. In this study, we proposed a coarse-to-fine deep learning framework on the basis of a classical convolutional neural network (CNN), the U-Net model, for extracting the optic disc from fundus images. The network was trained separately on fundus images and their vessel density maps, leading to two coarse segmentation results for the entire images. We combined the results using an overlap strategy to identify a local image patch (disc candidate region), which was then fed into the U-Net model for further segmentation. Our experiments demonstrated that the developed framework achieved an average intersection over union (IoU) and Dice similarity coefficient (DSC) of 89.1% and 93.9%, respectively, on a total of 2,978 test images from our collected dataset and six public datasets, compared to 87.4% and 92.5% obtained by using the sole U-Net model. This suggests that the proposed method provides better segmentation performance and has potential for population-based disease screening.
Generation of retinal OCT images with diseases based on cGAN
Data imbalance is a classic problem in image classification, especially for medical images, where normal data greatly outnumber data with diseases. To make up for the absence of disease images, we investigate methods that can generate retinal OCT images with diseases from normal retinal images. Conditional GANs (cGAN) have shown significant success in natural image generation, but their applications to medical images are limited. In this work, we propose an end-to-end framework for OCT image generation based on cGAN. A structural similarity index (SSIM) loss is introduced so that the model can take structure-related details into consideration. In experiments, three kinds of retinal disease images are generated. The generated images preserve the natural structure of the retina and are thus visually appealing. The method is further validated by evaluating the classification performance of a model trained on the generated images.
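A minimal sketch of how an SSIM term can be added to a cGAN generator loss, assuming images scaled to [0, 1]; the weighting factor lambda_ssim and the use of binary cross-entropy for the adversarial term are illustrative assumptions, not the authors' exact formulation.

```python
# Hedged sketch: adversarial + structural-similarity loss for a cGAN generator.
import tensorflow as tf

def generator_loss(disc_fake_output, generated, target, lambda_ssim=10.0):
    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    adv = bce(tf.ones_like(disc_fake_output), disc_fake_output)   # fool the discriminator
    # SSIM is a similarity in [0, 1]; 1 - SSIM acts as a structural loss term.
    ssim = tf.reduce_mean(tf.image.ssim(generated, target, max_val=1.0))
    return adv + lambda_ssim * (1.0 - ssim)
```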
A probabilistic approach for the registration of images with missing correspondences
Julia Krüger, Jan Ehrhardt, Sandra Schultz, et al.
The registration of two medical images is usually based on the assumption that corresponding regions exist in both images. If this assumption is violated by, e.g., pathologies, most approaches encounter problems. The registration method proposed here is based on probabilistic correspondences between sparse image representations, leading to robust handling of potentially missing correspondences. A maximum-a-posteriori framework is used to derive the optimization criterion with respect to deformation parameters that aim to compensate not only spatial differences between the images but also appearance differences. A multi-resolution scheme speeds up the optimization and increases the robustness. The approach is compared to a state-of-the-art intensity-based variational registration method using MR brain images. The comprehensive quantitative evaluation using images with simulated stroke lesions shows a significantly higher accuracy and robustness of the proposed approach.
Active shape dictionary for automatic segmentation of pathological lung in low-dose CT image
Accurate lung segmentation is of great significance in clinical application. However, it is still a challenging task due to complex structures, pathological changes, individual differences and low image quality. In this paper, a novel shape dictionary-based approach, named active shape dictionary, is introduced to automatically delineate pathological lungs from clinical 3D CT images. The active shape dictionary improves sparse shape composition in eigenvector space to effectively reduce local shape reconstruction error. The proposed framework iteratively deforms the shape model toward the target boundary using discriminative appearance dictionary learning and gradient vector flow to drive the landmarks. The proposed algorithm is tested on 40 3D low-dose CT images with lung tumors. Compared to state-of-the-art methods, the proposed approach can robustly and accurately detect the pathological lung surface.
A generative-predictive framework to capture altered brain activity in fMRI and its association with genetic risk: application to Schizophrenia
Sayan Ghosal, Qiang Chen, Aaron L. Goldman, et al.
We present a generative-predictive framework that captures the differences in regional brain activity between a neurotypical cohort and a clinical population, as guided by patient-specific genetic risk. Our model assumes that the functional activations in the neurotypical subjects are distributed around a population mean, and that the altered brain activity in neuropsychiatric patients is defined via deviations from this neurotypical mean. We employ group sparsity to identify a set of brain regions that simultaneously explain the salient functional differences and specify a set of basis vectors that span the low-dimensional data subspace. The patient-specific projections onto this subspace are used as feature vectors to identify multivariate associations with genetic risk. We have evaluated our model on a task-based fMRI dataset from a population study of schizophrenia. We compare our model with two baseline methods, regression using the Least Absolute Shrinkage and Selection Operator (LASSO) and Random Forest (RF) regression, which establish a direct association between the brain activity during a working memory task and schizophrenia polygenic risk. Our model demonstrates greater consistency and robustness across bootstrapping experiments than the machine learning baselines. Moreover, the set of brain regions implicated by our model underlies the well-documented executive cognitive deficits in schizophrenia.
Stack-U-Net: refinement network for improved optic disc and cup image segmentation
In this work, we propose a special cascade network for image segmentation, which is based on U-Net networks as building blocks and the idea of iterative refinement. The model was mainly applied to achieve higher recognition quality for the task of finding the borders of the optic disc and cup, which are relevant to the presence of glaucoma. Compared to a single U-Net and the state-of-the-art methods for the investigated tasks, the presented method outperforms them on multiple benchmarks without increasing the volume of the datasets. Our experiments include comparison with the best-known methods on the publicly available databases DRIONS-DB, RIM-ONE v.3, and DRISHTI-GS, and evaluation on a private dataset collected in collaboration with the University of California San Francisco Medical School. An analysis of the architecture details is presented. It is argued that the model can be employed for a broad scope of image segmentation problems of similar nature.
Left ventricle segmentation in LGE-MRI using multiclass learning
Tanja Kurzendorfer, Katharina Breininger, Stefan Steidl, et al.
Cardiovascular diseases are the major cause of death worldwide. Magnetic resonance imaging (MRI) is often used for the diagnosis of cardiac diseases because of its good soft tissue contrast. Furthermore, fibrosis characterization of the myocardium can be important for accurate diagnosis and treatment planning. The clinical gold standard to visualize myocardial scarring is late gadolinium enhanced (LGE) MRI. However, the challenge lies in the accurate segmentation of the endocardial and epicardial borders because of the smooth transition between the blood pool and scarred myocardium, as contrast agent accumulates in the damaged tissue and leads to hyper-enhancement. An exact segmentation is essential for scar tissue quantification. We propose a deep learning-based method to segment the left ventricle's endocardium and epicardium in LGE-MRI. To this end, a multi-scale fully convolutional neural network with skip-connections (U-Net) and residual units is applied to solve the multiclass segmentation problem. As a loss function, weighted cross-entropy is used. The network is trained on 70 clinical LGE-MRI sequences, validated with 5, and evaluated with 26 data sets. The approach yields a mean Dice coefficient of 0.90 for the endocardium and 0.87 for the epicardium. The proposed method segments the endocardium and epicardium of the left ventricle fully automatically with high accuracy.
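The weighted cross-entropy loss mentioned above can be written as a small Keras-compatible function; the sketch below assumes one-hot targets and softmax outputs, and the class weights shown are placeholders rather than the values used for this network.

```python
# Minimal sketch of a class-weighted cross-entropy loss for multi-class
# segmentation (e.g., background / blood pool / myocardium); weights illustrative.
import tensorflow as tf

def weighted_categorical_crossentropy(class_weights):
    w = tf.constant(class_weights, dtype=tf.float32)

    def loss(y_true, y_pred):
        # y_true: one-hot (batch, H, W, C); y_pred: softmax probabilities.
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)
        per_pixel = -tf.reduce_sum(w * y_true * tf.math.log(y_pred), axis=-1)
        return tf.reduce_mean(per_pixel)

    return loss

loss_fn = weighted_categorical_crossentropy([0.2, 1.0, 1.0])  # placeholder weights
```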
A CNN based retinal regression model for Bruch’s membrane opening detection
Glaucoma is a common eye disease. It damages patients' vision and is difficult to diagnose. By locating the Bruch's membrane opening (BMO) in the Optical Coherence Tomography (OCT) image, we can compute important diagnostic parameters that can increase the probability of early diagnosis of glaucoma. In contrast to traditional methods, which depend on stratification results, this paper introduces a new method based on an end-to-end deep learning model to detect the BMO. Our model is composed of three parts. The first part is a CNN-based retinal feature extraction network; it extracts a feature map for both Optic Nerve Head (ONH) proposal and BMO detection. The second part is an ONH proposal network to detect the region of interest (ROI) containing the BMO. The third part uses the feature map from the ONH proposal network to regress the location of the BMO. The model shows a clear advantage over other methods in terms of accuracy. Satisfactory results have been obtained when compared with clinical results.
Robust harmonic field based tooth segmentation in real-life noisy scanned mesh
Jaedong Hwang, Sanghyeok Park, Seokjin Lee, et al.
Dental segmentation plays an important role in prosthetic dentistry, such as crowns and implants, and even in orthodontics. Since people have different dental structures, it is hard to build a general dental segmentation model. To date, only a few studies have tried to tackle this problem. In this paper, we propose simple and intuitive algorithms for a harmonic field based dental segmentation method to provide robustness for clinical dental mesh data. Our model includes additional grounds to the gum, a pair of different Dirichlet boundary conditions, and convex segmentation for post-processing. Our data are generated for clinical usage and therefore contain considerable noise, holes, and crowns. Moreover, some meshes have abraded teeth, which degrade the performance of the harmonic field due to its dramatic gradient changes. To the best of our knowledge, the proposed method and experiments are the first to deal with real clinical data containing noise and fragmented areas. We evaluate the results qualitatively and quantitatively to demonstrate the performance of the model. The model separates teeth from the gum and other teeth very accurately. We use intersection over union (IoU) to calculate the overlap ratio between teeth. Moreover, human evaluation is used for measuring and comparing the performance of our segmentation model against other models. We compare the segmentation results of a baseline model and our model. An ablation study shows that our model improves segmentation performance. Our model outperforms the baseline model at the expense of some negligible overlap.
Bioresorbable scaffold visualization in IVOCT images using CNNs and weakly supervised localization
Nils Gessert, Sarah Latus, Youssef S. Abdelwahed, et al.
Bioresorbable scaffolds have become a popular choice for treatment of coronary heart disease, replacing traditional metal stents. Often, intravascular optical coherence tomography is used to assess potential malapposition after implantation and for follow-up examinations later on. Typically, the scaffold is manually reviewed by an expert, analyzing each of the hundreds of image slices. As this is time-consuming, automatic stent detection and visualization approaches have been proposed, mostly for metal stent detection based on classic image processing. As bioresorbable scaffolds are harder to detect, recent approaches have used feature extraction and machine learning methods for automatic detection. However, these methods require detailed, pixel-level labels in each image slice and extensive feature engineering for the particular stent type, which might limit the approaches' generalization capabilities. Therefore, we propose a deep learning-based method for bioresorbable scaffold visualization using only image-level labels. A convolutional neural network is trained to predict whether an image slice contains a metal stent, a bioresorbable scaffold, or no device. Then, we derive local stent strut information by employing weakly supervised localization using saliency maps with guided backpropagation. As saliency maps are generally diffuse and noisy, we propose a novel patch-based method with image shifting which allows for high-resolution stent visualization. Our convolutional neural network model achieves a classification accuracy of 99.0% for image-level stent classification, which can be used for both high quality in-slice stent visualization and 3D rendering of the stent structure.
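For orientation, the snippet below computes a plain gradient-based saliency map from a trained classifier with TensorFlow; the paper's approach additionally uses guided backpropagation and patch-based image shifting, which are not reproduced here, so this is only an illustrative simplification.

```python
# Sketch of gradient-based saliency for weakly supervised localization.
import tensorflow as tf

def saliency_map(model, image, class_index):
    # image: (H, W, C) array; model: trained classifier returning logits.
    image = tf.convert_to_tensor(image[None, ...], dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(image)
        logits = model(image, training=False)
        score = logits[:, class_index]
    grads = tape.gradient(score, image)            # d(score) / d(pixel)
    return tf.reduce_max(tf.abs(grads), axis=-1)[0].numpy()
```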
Simultaneous and automatic two surface detection of renal cortex in 3D CT images by enhanced sparse shape composition
Automatic organ localization plays a significant role in medical image segmentation. This paper introduces a novel approach for simultaneous and automatic detection of the two surfaces of the renal cortex from contrast-enhanced abdominal CT scans. The proposed framework is an integrated procedure consisting of three main parts: (i) cortex model training, in which the variability of both shapes is modeled using principal component analysis of the manual annotations, and dual shape dictionaries and appearance dictionaries are constructed; (ii) outer mesh reconstruction, in which the initialized outer mesh is iteratively deformed to the target boundary; (iii) inner mesh reconstruction, in which the inner mesh is reconstructed using the same deformation coefficients and similarity transformation as the outer mesh together with the inner mesh shape dictionary. Our method was validated on a clinical data set of 37 CT scans using the leave-one-out cross-validation strategy. The proposed method improved the overall segmentation accuracy to a Dice similarity coefficient of 91.95%±3.15% for renal cortex segmentation.
Predicting cognitive scores from resting fMRI data and geometric features of the brain
Anand A. Joshi, Jian Li, Haleh Akrami, et al.
Anatomical T1-weighted Magnetic Resonance Imaging (MRI) and functional magnetic resonance imaging collected during resting (rfMRI) are promising markers that offer insight into the structure and function of the human brain. The objective of this work is to explore the use of a deep learning neural network to predict cognitive performance scores and ADHD indices in a group of ADHD and control subjects. First, we processed the rfMRI and MRI data of subjects using the BrainSuite fMRI Processing (BFP) pipeline to perform anatomical and functional preprocessing. This produces, for each subject, fMRI and geometric (anatomical) features represented in a standardized grayordinate system. The geometric and functional cortical data corresponding to the two hemispheres were then transformed to 128x128 multichannel images and input to the convolutional component of the neural network. Subcortical data were presented in a standard vector form and input to a standard input layer of the network. The neural network was implemented in Python using the Keras library with a TensorFlow backend. Training was performed on 168 images, with 90 images used for testing. We observed significant correlation between predicted and actual values of the indices tested: Performance IQ: 0.47; Verbal IQ: 0.41; ADHD: 0.57. Comparing these values to those from networks trained on functional-only and structural-only data, we saw that rfMRI is more informative than MRI, but the two modalities are highly complementary in terms of predicting these indices.
The segmentation of bladder cancer using the voxel-features-based method
Accurate segmentation of bladder cancer is the basis for determining the staging of bladder cancer. In our previous study, we segmented the inner and outer surfaces of the bladder wall and obtained the candidate region of bladder cancer; however, it is hard to segment the cancer region from the candidate region. To segment the cancer region accurately, we propose a voxel-feature-based method and extract 1159 features from each voxel of the candidate region. After feature extraction, the recursive feature elimination-based support vector machine classifier (SVM-RFE) method was adopted to obtain an optimal feature subset for the classification of the cancer and wall regions. According to feature selection and ranking, the 125 top-ranked features were selected as the optimal subset, with an area under the receiver operating characteristic curve, accuracy, sensitivity, and specificity of 1, 99.99%, 99.98%, and 1, respectively. Using the optimal subset, we calculated the probability of each voxel belonging to the cancer region, and then obtained the boundary separating the tumor and wall regions. The mean DSC of the segmentation results on the testing set is 0.9127, indicating that the proposed method can accurately segment the bladder cancer region.
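SVM-based recursive feature elimination of this kind can be expressed compactly with scikit-learn; the sketch below uses random placeholder data and an illustrative elimination step size, and is not the authors' implementation.

```python
# Illustrative SVM-RFE selection of top-ranked voxel features (placeholder data).
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE

X = np.random.rand(200, 1159)        # voxels x features (placeholder)
y = np.random.randint(0, 2, 200)     # 1 = cancer, 0 = wall (placeholder labels)

svm = SVC(kernel="linear")           # linear kernel exposes feature weights for ranking
selector = RFE(estimator=svm, n_features_to_select=125, step=10)
selector.fit(X, y)
selected = np.where(selector.support_)[0]
print(f"{len(selected)} features retained")
```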
Multi-coil magnetic resonance imaging reconstruction with a Markov random field prior
Marko Panić, Jan Aelterman, Vladimir Crnojević, et al.
Recent improvements in magnetic resonance image (MRI) reconstruction from partial data have been reported using spatial context modelling with Markov random field (MRF) priors. However, these algorithms have been developed only for magnitude images from single-coil measurements. In practice, most MRI images today are acquired using multi-coil data. In this paper, we extend our recent approach for MRI reconstruction with MRF priors to deal with multi-coil data, i.e., to be applicable in parallel MRI (pMRI) settings. Instead of reconstructing images from different coils independently and subsequently combining them into the final image, we recover the MRI image by jointly processing the undersampled measurements from all coils together with their estimated sensitivity maps. The proposed method incorporates a Bayesian formulation of the spatial context into the reconstruction problem. To solve the resulting problem, we derive an efficient algorithm based on the alternating direction method of multipliers (ADMM). Experimental results demonstrate the effectiveness of the proposed approach in comparison to some well-adopted methods for accelerated pMRI reconstruction from undersampled data.
Deep learning-based stenosis quantification from coronary CT angiography
Youngtaek Hong, Frederic Commandeur, Sebastien Cadet, et al.
Background: Coronary computed tomography angiography (CTA) allows quantification of stenosis. However, such quantitative analysis is not part of clinical routine. We evaluated the feasibility of utilizing deep learning for quantifying coronary artery disease from CTA. Methods: A total of 716 diseased segments in 156 patients (66 ± 10 years) who underwent CTA were analyzed. Minimal luminal area (MLA), percent diameter stenosis (DS), and percent contrast density difference (CDD) were measured using semi-automated software (Autoplaque) by an expert reader. Using the expert annotations, deep learning was performed with convolutional neural networks using 10-fold cross-validation to segment the CTA lumen and calcified plaque. MLA, DS and CDD computed using the deep learning-based approach were compared to the expert reader measurements. Results: There was excellent correlation between the expert reader and deep learning for all quantitative measures (r=0.984 for MLA; r=0.957 for DS; and r=0.975 for CDD, p<0.001 for all). The expert reader and deep learning method did not differ significantly for MLA (median 4.3 mm2 for both, p=0.68) and CDD (11.6 vs 11.1%, p=0.30), and differed significantly for DS (26.0 vs 26.6%, p<0.05); however, the ranges of all the quantitative measures were within the inter-observer variability between 2 expert readers. Conclusions: Our deep learning-based method allows accurate quantitative measurement of coronary artery disease segments from CTA and may enhance clinical reporting.
Tissue segmentation in volumetric laser endomicroscopy data using FusionNet and a domain-specific loss function
Volumetric Laser Endomicroscopy (VLE) is a promising balloon-based imaging technique for detecting early neoplasia in Barrett's Esophagus. Computer Aided Detection (CAD) techniques in particular show great promise compared to medical doctors, who cannot reliably find disease patterns in the noisy VLE signal. However, an essential pre-processing step for the CAD system is tissue segmentation. At present, tissue is segmented manually, which does not scale to the entire VLE scan consisting of 1,200 frames of 4,096 × 2,048 pixels. Furthermore, the current CAD methods cannot use the VLE scans to their full potential, as only a small segment of the esophagus is selected for further processing, while an automated segmentation system yields significantly more available data. This paper explores the possibility of automatically segmenting relevant tissue in VLE scans using FusionNet and a domain-specific loss function. The contribution of this work is threefold. First, we propose a tissue segmentation algorithm for VLE scans. Second, we introduce a weighted ground truth that exploits the signal-to-noise ratio characteristics of the VLE data. Third, we compare our algorithm's segmentations against those of two additional VLE experts. The results show that our algorithm's annotations are indistinguishable from the expert annotations, and therefore the algorithm can be used as a preprocessing step for further classification of the tissue.
Projection image-to-image translation in hybrid x-ray/MR imaging
The potential benefit of hybrid X-ray and MR imaging in the interventional environment is large due to the combination of fast imaging with high contrast variety. However, many existing image enhancement methods require the image information of both modalities to be present in the same domain. To unlock this potential, we present a solution for image-to-image translation from MR projections to corresponding X-ray projection images. The approach is based on a state-of-the-art image generator network that is modified to fit the specific application. Furthermore, we propose the inclusion of a gradient map in the loss function to allow the network to emphasize high-frequency details in image generation. Our approach is capable of creating X-ray projection images with natural appearance. Additionally, our extensions show clear improvement compared to the baseline method.
Deep learning based classification for metastasis of hepatocellular carcinoma with microscopic images
Hui Meng, Yuan Gao, Kun Wang, et al.
Hepatocellular carcinoma (HCC) is the second leading cause of cancer-related death worldwide. The high probability of metastasis makes its prognosis very poor even after potentially curative treatment. Detecting highly metastatic HCC will allow the development of effective approaches to reduce HCC mortality. The mechanism of HCC metastasis has been studied using gene profiling analysis, which indicated that HCC with different metastatic capabilities is differentiable. However, analyzing gene expression levels with conventional methods is time-consuming and complex. To distinguish HCC with different metastatic capabilities, we propose a deep learning-based method using microscopic images in animal models. In this study, we adopted convolutional neural networks (CNN) to learn the deep features of microscopic images for classifying each image into low metastatic HCC or high metastatic HCC. We evaluated our proposed classification method on a dataset containing 1920 white-light microscopic images of frozen sections from three tumor-bearing mice injected with HCC-LM3 (high metastasis) tumor cells and another three tumor-bearing mice injected with SMMC-7721 (low metastasis) tumor cells. Experimental results show that our method achieved an average accuracy of 0.85. This preliminary study demonstrates that our deep learning method has the potential to be applied to microscopic images for classification of HCC metastasis in animal models.
Improving myocardium segmentation in cardiac CT angiography using spectral information
Accurate segmentation of the left ventricle myocardium in cardiac CT angiography (CCTA) is essential for, e.g., the assessment of myocardial perfusion. Automatic deep learning methods for segmentation in CCTA might suffer from differences in contrast-agent attenuation between training and test data due to non-standardized contrast administration protocols and varying cardiac output. We propose augmentation of the training data with virtual mono-energetic reconstructions from a spectral CT scanner which show different attenuation levels of the contrast agent. We compare this to augmentation by linear scaling of all intensity values, and combine both types of augmentation. We train a 3D fully convolutional network (FCN) with 10 conventional CCTA images and corresponding virtual mono-energetic reconstructions acquired on a spectral CT scanner, and evaluate on 40 CCTA scans acquired on a conventional CT scanner. We show that training with data augmentation using virtual mono-energetic images improves upon training with only conventional images (Dice similarity coefficient (DSC) 0.895 ± 0.039 vs. 0.846 ± 0.125). In comparison, training with data augmentation using linear scaling improves the DSC to 0.890 ± 0.039. Moreover, combining the results of both augmentation methods leads to a DSC of 0.901 ± 0.036, showing that the two augmentations lead to different local improvements of the segmentations. Our results indicate that virtual mono-energetic images improve the generalization of an FCN used for myocardium segmentation in CCTA images.
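The linear intensity-scaling baseline can be sketched in a few lines of NumPy; the scaling range below is an assumption chosen only for illustration and is not stated in the abstract.

```python
# Hedged sketch of linear intensity-scaling augmentation: each training volume
# is rescaled by a random factor to mimic varying contrast-agent attenuation.
import numpy as np

def augment_intensity(volume_hu, low=0.8, high=1.2, rng=np.random.default_rng()):
    scale = rng.uniform(low, high)          # illustrative range of scaling factors
    return volume_hu * scale

# Example: generate three augmented copies of one (placeholder) CCTA volume.
volume = np.zeros((128, 128, 64), dtype=np.float32)
augmented = [augment_intensity(volume) for _ in range(3)]
```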
Automatic dental root CBCT image segmentation based on CNN and level set method
Jun Ma, Xiaoping Yang
Accurate segmentation of the teeth from Cone Beam Computed Tomography (CBCT) images is a critical step towards building a personalized 3D digital model, as it can provide important information to orthodontists for clinical treatment. However, teeth CBCT image segmentation is a challenging task, especially for the root parts, because the root contour of a tooth may be degraded by noise and by the surrounding alveolar bone or neighboring teeth. Most existing methods are semi-automatic or interactive, and there are few automatic, high-precision methods for tooth root segmentation. In this paper, we design a lightweight CNN architecture to accomplish this task as an end-to-end framework which can automatically segment the teeth from CBCT images. Specifically, we use ordinary convolutions, dilated convolutions and residual connections as the basic modules to build the network. After that, a geodesic active contour model is employed to refine the CNN's outputs, which further improves the segmentation results. The whole pipeline is fully automatic and requires no image-specific fine-tuning. The method is evaluated on a dental CBCT segmentation challenge and achieves state-of-the-art results.
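One way to prototype this kind of level-set refinement of a CNN probability map is with the morphological geodesic active contour in scikit-image, as sketched below; this variant and its parameters are stand-ins for the paper's own geodesic active contour model, not a reproduction of it.

```python
# Sketch: refine a CNN probability map with a morphological geodesic active contour.
import numpy as np
from skimage.segmentation import (inverse_gaussian_gradient,
                                  morphological_geodesic_active_contour)

def refine_with_gac(image, cnn_prob, threshold=0.5, iterations=50):
    gimage = inverse_gaussian_gradient(image)      # edge-stopping function
    init = cnn_prob > threshold                    # CNN output as initialization
    refined = morphological_geodesic_active_contour(
        gimage, iterations, init_level_set=init, smoothing=1, balloon=0)
    return refined.astype(np.uint8)
```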
Automatic rat brain segmentation from MRI using statistical shape models and random forest
In MRI neuroimaging, the shimming procedure is used before image acquisition to correct for inhomogeneity of the static magnetic field within the brain. To correctly adjust the field, the brain's location and edges must first be identified from quickly acquired low-resolution data. This process is currently carried out manually by an operator, which can be time-consuming and not always accurate. In this work, we implement a quick and automatic technique for brain segmentation to be potentially used during shimming. Our method is based on two main steps. First, a random forest classifier is used to obtain a preliminary segmentation from an input MRI image. Subsequently, a statistical shape model of the brain, previously generated from ground-truth segmentations, is fitted to the output of the classifier to obtain a model-based segmentation mask. In this way, a-priori knowledge of the brain's shape is included in the segmentation pipeline. The proposed methodology was tested on low-resolution images of rat brains and further validated on rabbit brain images of higher resolution. Our results suggest that the present method is promising for the desired purpose in terms of time efficiency, segmentation accuracy and repeatability. Moreover, the use of shape modeling was shown to be particularly useful when handling low-resolution data, which could lead to erroneous classifications when using only machine learning-based methods.
Semantic segmentation of computed tomography for radiotherapy with deep learning: compensating insufficient annotation quality using contour augmentation
Umair Javaid, Damien Dasnoy, John A. Lee
In radiotherapy treatment planning, manual annotation of organs-at-risk and target volumes is a difficult and time-consuming task, prone to intra- and inter-observer variability. Deep learning networks (DLNs) are gaining worldwide attention for automating such annotation tasks because of their ability to capture data hierarchy. However, for better performance DLNs require a large number of data samples, whereas annotated medical data is scarce. To remedy this, data augmentation is used to increase the training data for DLNs, enabling robust learning by incorporating spatial/translational invariance into the training phase. Importantly, the performance of DLNs is highly dependent on the ground truth (GT) quality: if the manual annotation is not accurate enough, the network cannot learn better than the annotated example. This highlights the need to compensate for possibly insufficient GT quality using augmentation, i.e., by providing more GTs per image, in order to improve the performance of DLNs. In this work, small random alterations were applied to the GT and each altered GT was considered as an additional annotation. Contour augmentation was used to train a dilated U-Net in a multiple-GTs-per-image setting, which was tested on a pelvic CT dataset acquired from 67 patients to segment bladder and rectum in a multi-class segmentation setting. By using contour augmentation (coupled with data augmentation), the network learnt better than with data augmentation only, as it was able to correct slightly offset contours in the GT. The segmentation results were quantified using spatial overlap, distance-based and probabilistic measures. The Dice scores for bladder and rectum are 0.88±0.19 and 0.89±0.04, and the average symmetric surface distances are 0.22±0.09 mm and 0.09±0.05 mm, respectively.
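Small random alterations of a ground-truth mask, as described above, could for instance be generated as below with SciPy; the choice of dilation, erosion, and shifting, and their magnitudes, are assumptions made for illustration rather than the paper's exact perturbation model.

```python
# Hedged sketch of contour augmentation: each GT mask is randomly dilated,
# eroded, or shifted, and every altered mask is kept as an extra annotation.
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion, shift

def augment_contour(mask, rng=np.random.default_rng()):
    op = rng.choice(["dilate", "erode", "shift"])
    if op == "dilate":
        return binary_dilation(mask, iterations=int(rng.integers(1, 3)))
    if op == "erode":
        return binary_erosion(mask, iterations=int(rng.integers(1, 3)))
    offsets = rng.integers(-2, 3, size=mask.ndim)     # shift by up to 2 voxels
    return shift(mask.astype(float), offsets, order=0) > 0.5

def make_extra_annotations(mask, n=3):
    return [augment_contour(mask) for _ in range(n)]
```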
An automatic end-to-end pipeline for CT image-based EGFR mutation status classification
Lin Tian, Rong Yuan
The epidermal growth factor receptor (EGFR) mutation status plays a key role in clinical decision support and prognosis for non-small-cell lung cancer (NSCLC). In this study, we present an automatic end-to-end pipeline to classify the EGFR mutation status according to features extracted from medical images via deep learning. We solve this problem in three steps: (I) locating tumor candidates via a 3D convolutional neural network (CNN), (II) extracting features via the pre-trained lower convolutional layers (layers before the fully connected layers) of a VGG16 network, (III) classifying EGFR mutation status from the extracted features with a logistic regression model. In the experiments, the dataset contains 83 chest CT series collected from patients with non-small-cell lung cancer, half of whom are positive for a mutation in EGFR. The whole dataset was divided into training and testing splits of 66 and 17 CT series, respectively. Our pipeline achieves an AUC of 0.725 (±0.009) in five-fold cross validation on the training dataset and an AUC of 0.75 on the testing dataset, which validates the efficacy and generalizability of our approach and shows the potential of non-invasive medical image analysis for detecting EGFR mutation status.
Sparse low-dimensional causal modeling for the analysis of brain function
Dushyant Sahoo, Nicolas Honnorat, Christos Davatzikos
Resting-state fMRI (rs-fMRI) provides a means to study how information is processed in the brain. This modality has been increasingly used to estimate dynamical interactions between brain regions. However, the noise and the limited temporal resolution obtained from typical rs-fMRI scans make the extraction of reliable dynamical interactions challenging. In this work, we propose a new approach to tackle these issues. We estimate Granger causality in full-resolution rs-fMRI data by fitting sparse low-dimensional multivariate autoregressive models. We elaborate an efficient optimization strategy by combining spatial and temporal dimensionality reduction, extrapolation and stochastic gradient descent. By processing the rs-fMRI scans of the hundred unrelated Human Connectome Project subjects, we demonstrate that our method captures interpretable brain interactions, in particular when a differentiable sparsity-inducing regularization is introduced in our framework.
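To make the idea concrete, the toy sketch below fits a first-order multivariate autoregressive model with a low-rank factorization and an L1 (sparsity) penalty by plain gradient descent on random data; the dimensions, rank, learning rate, and penalty are all placeholders, and the paper's actual optimization (with extrapolation and stochastic gradients) is considerably more elaborate.

```python
# Toy sketch: low-rank, sparsity-penalized first-order VAR fit by gradient descent.
import numpy as np

def fit_sparse_lowrank_var(X, rank=10, lam=1e-3, lr=1e-4, iters=500, seed=0):
    T, R = X.shape                                  # T time points, R regions
    rng = np.random.default_rng(seed)
    U = 0.01 * rng.standard_normal((R, rank))
    V = 0.01 * rng.standard_normal((rank, R))
    X_prev, X_next = X[:-1], X[1:]
    for _ in range(iters):
        A = U @ V                                   # low-rank connectivity matrix
        E = X_prev @ A.T - X_next                   # one-step prediction residuals
        grad_A = 2.0 * E.T @ X_prev / (T - 1) + lam * np.sign(A)
        gU, gV = grad_A @ V.T, U.T @ grad_A         # chain rule through A = U V
        U -= lr * gU
        V -= lr * gV
    return U @ V                                    # Granger-style influence weights

# Toy usage on random data: 300 time points, 50 regions.
A_hat = fit_sparse_lowrank_var(np.random.default_rng(1).standard_normal((300, 50)))
```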
Automated prostate segmentation of volumetric CT images using 3D deeply supervised dilated FCN
Bo Wang, Yang Lei, Tonghe Wang, et al.
Segmentation of the prostate in 3D CT images is a crucial step in treatment planning and procedure guidance such as brachytherapy and radiotherapy. However, manual segmentation of the prostate is very time-consuming and depends on the experience of the clinician. Automated prostate segmentation is therefore more helpful in practice, but the task is very challenging due to the low soft-tissue contrast in CT images. In this paper, we propose a 3D deeply supervised fully convolutional network (FCN) with dilated convolution kernels to automatically segment the prostate in CT images. The deep supervision strategy acquires more powerful discriminative capability and accelerates the optimization convergence in the training stage, while the dilated convolutions enlarge the receptive field to extract more global contextual information for accurate prostate segmentation. The presented method was evaluated using 15 prostate CT images and obtained a mean Dice similarity coefficient (DSC) of 0.85±0.04 and a mean surface distance (MSD) of 1.92±0.46 mm. The experimental results show that our approach yields accurate CT prostate segmentation, which can be employed for prostate-cancer treatment planning in brachytherapy and external beam radiotherapy.
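A compact way to see how dilated convolutions and deep supervision fit together is the Keras sketch below, which adds an auxiliary softmax output to an intermediate layer; the depth, filter counts, dilation rates, and loss weights are illustrative and do not reproduce the network described above.

```python
# Minimal sketch of a 3D FCN with dilated convolutions and a deeply supervised
# auxiliary output; hyperparameters are placeholders.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_dilated_fcn(input_shape=(64, 64, 64, 1), n_classes=2):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv3D(16, 3, padding="same", activation="relu")(inputs)
    x = layers.Conv3D(32, 3, padding="same", dilation_rate=2, activation="relu")(x)
    aux = layers.Conv3D(n_classes, 1, activation="softmax", name="aux")(x)   # deep supervision
    x = layers.Conv3D(32, 3, padding="same", dilation_rate=4, activation="relu")(x)
    main = layers.Conv3D(n_classes, 1, activation="softmax", name="main")(x)
    model = models.Model(inputs, [main, aux])
    model.compile(optimizer="adam",
                  loss={"main": "categorical_crossentropy",
                        "aux": "categorical_crossentropy"},
                  loss_weights={"main": 1.0, "aux": 0.4})
    return model

model = build_dilated_fcn()
```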
MRI-based synthetic CT generation using deep convolutional neural network
Yang Lei, Tonghe Wang, Yingzi Liu, et al.
We propose a learning method to generate synthetic CT (sCT) images for MRI-only radiation treatment planning. The proposed method integrates a dense-block concept into a cycle-generative adversarial network (cycle-GAN) framework, named dense-cycle-GAN in this study. Compared with a GAN, the cycle-GAN includes an inverse transformation between CT (ground truth) and sCT, which further constrains the learning model. A 2.5D fully convolutional network (FCN) with dense blocks was introduced in the generator to enable end-to-end transformation. An FCN is used as the discriminator to encourage the generator's sCT to be similar to the ground-truth CT images. The well-trained model was used to generate the sCT of a new MRI. The proposed algorithm was evaluated using 14 patients' data with both MRI and CT images. The mean absolute error (MAE), peak signal-to-noise ratio (PSNR) and normalized cross correlation (NCC) indexes were used to quantify the accuracy of the prediction algorithm. Overall, the MAE, PSNR and NCC were 60.9±11.7 HU, 24.6±0.9 dB, and 0.96±0.01, respectively. We have developed a novel deep learning-based method to generate sCT with high accuracy. The proposed method makes the sCT comparable to the planning CT. With further evaluation and clinical implementation, this method could be a useful tool for MRI-based radiation treatment planning and attenuation correction in a PET/MRI scanner.
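The three evaluation indexes named above (MAE, PSNR, NCC) can be computed with a few NumPy helpers, as sketched below for HU volumes of identical shape; this is a generic illustration, not the authors' evaluation code.

```python
# Generic implementations of MAE, PSNR, and NCC between a CT and a synthetic CT.
import numpy as np

def mae(ct, sct):
    return np.mean(np.abs(ct - sct))

def psnr(ct, sct, data_range=None):
    data_range = data_range or (ct.max() - ct.min())
    mse = np.mean((ct - sct) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def ncc(ct, sct):
    a = (ct - ct.mean()) / ct.std()
    b = (sct - sct.mean()) / sct.std()
    return np.mean(a * b)
```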
Montage based 3D medical image retrieval from traumatic brain injury cohort using deep convolutional neural network
Cailey I. Kerley, Yuankai Huo, Shikha Chaganti, et al.
Brain imaging analysis on clinically acquired computed tomography (CT) is essential for the diagnosis, risk prediction of progression, and treatment of the structural phenotypes of traumatic brain injury (TBI). However, in real clinical imaging scenarios, entire body CT images (e.g., neck, abdomen, chest, pelvis) are typically captured along with whole brain CT scans. For instance, in a typical sample of a clinical TBI imaging cohort, only ~15% of CT scans actually contain whole brain CT images suitable for volumetric brain analyses; the remaining are partial brain or non-brain images. Therefore, a manual image retrieval process is typically required to isolate the whole brain CT scans from the entire cohort. However, manual image retrieval is time- and resource-consuming and even more difficult for larger cohorts. To alleviate the manual effort, in this paper we propose an automated 3D medical image retrieval pipeline, called deep montage-based image retrieval (dMIR), which performs classification on 2D montage images via a deep convolutional neural network. The novelty of the proposed method is to characterize the medical image retrieval task in terms of montage images. In a cohort of 2000 clinically acquired TBI scans, 794 scans were used as training data, 206 scans were used as validation data, and the remaining 1000 scans were used as testing data. The proposed method achieved accuracy=1.0, recall=1.0, precision=1.0, f1=1.0 for the validation data, and accuracy=0.988, recall=0.962, precision=0.962, f1=0.962 for the testing data. Thus, the proposed dMIR is able to perform accurate whole brain CT image retrieval from large-scale clinical cohorts.
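Building the 2D montage inputs from a 3D CT volume can be done along the lines of the NumPy sketch below, which tiles evenly sampled axial slices into a grid; the grid size and slice-sampling scheme are assumptions for illustration only.

```python
# Hedged sketch: tile a 3D CT volume into a single 2D montage image.
import numpy as np

def make_montage(volume, grid=(5, 5)):
    rows, cols = grid
    depth, h, w = volume.shape
    idx = np.linspace(0, depth - 1, rows * cols).astype(int)   # evenly sampled slices
    montage = np.zeros((rows * h, cols * w), dtype=volume.dtype)
    for k, z in enumerate(idx):
        r, c = divmod(k, cols)
        montage[r * h:(r + 1) * h, c * w:(c + 1) * w] = volume[z]
    return montage

# Example on a placeholder volume with 120 axial slices of 256x256 voxels.
montage = make_montage(np.zeros((120, 256, 256), dtype=np.float32))
```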
Reproducibility evaluation of SLANT whole brain segmentation across clinical magnetic resonance imaging protocols
Whole brain segmentation on structural magnetic resonance imaging (MRI) is essential for understanding neuroanatomical-functional relationships. Traditionally, multi-atlas segmentation has been regarded as the standard method for whole brain segmentation. In the past few years, deep convolutional neural network (DCNN) segmentation methods have demonstrated their advantages in both accuracy and computational efficiency. Recently, we proposed the spatially localized atlas network tiles (SLANT) method, which is able to segment a 3D MRI brain scan into 132 anatomical regions. Commonly, DCNN segmentation methods yield inferior performance under external validation, especially when the testing patterns were not present in the training cohorts. Recently, we obtained a clinically acquired, multi-sequence MRI brain cohort of 1480 de-identified brain MRI scans from 395 patients using seven different MRI protocols. Moreover, each subject has at least two scans from different MRI protocols. Herein, we assess the SLANT method's intra- and inter-protocol reproducibility. SLANT achieved a coefficient of variation (CV) of less than 0.05 for intra-protocol experiments and less than 0.15 for inter-protocol experiments. The results show that the SLANT method achieved high intra- and inter-protocol reproducibility.
Group-wise alignment of resting fMRI in space and time
Haleh Akrami, Anand A. Joshi, Jian Li, et al.
Spontaneous brain activity is an important biomarker for various neurological and psychological conditions and can be measured using resting functional Magnetic Resonance Imaging (rfMRI). Since brain activity during resting is spontaneous, it is not possible to directly compare rfMRI time-courses across subjects. Moreover, the spatial configuration of functionally specialized brain regions can vary across subjects throughout the cortex, limiting our ability to make precise spatial comparisons. We describe a new approach to jointly align and synchronize fMRI data in space and time across a group of subjects. We build on previously described methods for inter-subject spatial "Hyper-Alignment" and temporal synchronization through the "BrainSync" transform. We first describe BrainSync Alignment (BSA), a group-based extension of the pair-wise BrainSync transform, that jointly synchronizes resting or task fMRI data across time for multiple subjects. We then explore the combination of BSA with Response Hyper-Alignment (RHA) and compare with Connectivity Hyper-Alignment (CHA), an alternative approach to spatial alignment based on resting fMRI. The result of applying RHA and BSA is both to produce improved functional spatial correspondence across a group of subjects and to align their time-series so that, even for spontaneous resting data, we see highly correlated temporal dynamics at homologous locations across the group. These spatiotemporally aligned data can then be used as an atlas in future applications. We validate these transfer functions by applying them to z-score maps of an independent dataset and calculating inter-subject correlation. The results show that RHA can be calculated from rfMRI and has output comparable to CHA by leveraging BSA. Moreover, through calculation and application of task fMRI-based spatial transformations on an independent dataset, we show that the combination of RHA and BSA significantly improves spatial functional alignment relative to either RHA or CHA alone.
A robust index for global tissue deformation analysis in ultrasound images
Arnaud Brignol, Farida Cheriet, Catherine Laporte
In this paper, a new index for global 2D tissue deformation analysis that does not rely on correlation-based tracking is computed from an ultrasound video sequence. First, the vertical and horizontal projections of each frame are computed, followed by the mean of the outer product of the two projections. Finally, the deformation index is the relative variation of this mean with respect to the first frame. The index was validated on simulated data (valve and echocardiography) and ex vivo data (raw meat). In the latter case, an ultrasound probe was robotically moved along the vertical axis to compress the meat. Results showed that the proposed index is robust and highly correlated with the average relative displacement of the landmarks located on the boundaries of the deformed part for the simulations (r = 0.90) and with the probe motion in the ex vivo case (r = 0.83). In comparison, a simple normalized cross-correlation approach gives poor results (r < 0.2) due to a lack of robustness in the tracking.
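The index as described lends itself to a direct NumPy transcription, sketched below for a sequence of grayscale frames; the use of means for the projections is an assumption, since the abstract does not specify whether sums or means are taken.

```python
# Sketch of the projection-based deformation index: per frame, take the mean of
# the outer product of the row and column projections and report its relative
# variation with respect to the first frame.
import numpy as np

def deformation_index(frames):
    # frames: iterable of 2D grayscale ultrasound images
    values = []
    for f in frames:
        v = f.mean(axis=1)                 # vertical (row) projection
        h = f.mean(axis=0)                 # horizontal (column) projection
        values.append(np.outer(v, h).mean())
    values = np.asarray(values)
    return (values - values[0]) / values[0]
```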
Nuclei counting in microscopy images with three dimensional generative adversarial networks
Shuo Han, Soonam Lee, Chichen Fu, et al.
Microscopy image analysis can provide substantial information for clinical study and understanding of biological structures. Two-photon microscopy is a type of fluorescence microscopy that can image deep into tissue with near-infrared excitation light. We are interested in methods that can detect and characterize nuclei in 3D fluorescence microscopy image volumes. In general, several challenges exist for counting nuclei in 3D image volumes. These include “crowding” and touching of nuclei, overlapping of nuclei, and shape and size variances of the nuclei. In this paper, a 3D nuclei counter using two different generative adversarial networks (GAN) is proposed and evaluated. Synthetic data that resembles real microscopy images is generated with a GAN and used to train another 3D GAN that counts the number of nuclei. Our approach is evaluated with respect to the number of ground-truth nuclei and compared with common ways of counting used in biological research. Fluorescence microscopy 3D image volumes of rat kidneys are used to test our 3D nuclei counter. The accuracy results of the proposed nuclei counter are compared with ImageJ’s 3D object counter (JACoP) and the 3D watershed. Both the counting accuracy and the object-based evaluation show that the proposed technique is successful for counting nuclei in 3D.
Cycle-consistent 3D-generative adversarial network for virtual bowel cleansing in CT colonography
CT colonography (CTC) uses orally administered fecal-tagging agents to indicate residual materials that could otherwise interfere with the interpretation of CTC images. To visualize the colon in virtual 3D endoluminal views, electronic cleansing (EC) can be used to subtract the fecal-tagged materials from the CTC images. However, conventional EC methods produce subtraction artifacts that distract readers and computer-aided detection systems. In this study, we used generative adversarial learning to transform fecal-tagged CTC input image volumes to corresponding virtually cleansed image volumes. To overcome the need for paired training samples, we used a cycle-consistent 3D-generative adversarial network (3D EC-cycleGAN) scheme that can be trained with unpaired samples. The associated generator and discriminator networks were implemented as 3D convolutional networks, and the loss functions were adapted to the unique requirements of EC in CTC. To investigate the feasibility of the approach, the 3D EC-cycleGAN was trained and tested with CTC image volumes of an anthropomorphic phantom filled partially with fecal tagging to recreate the attenuation ranges observed in clinical CTC. Our preliminary results indicate that the proposed 3D EC-cycleGAN can potentially learn to perform EC in CTC.
Robust discomfort detection for infants using an unsupervised roll estimation
Cheng Li, Arash Pourtaherian, W. E. Tjon A Ten, et al.
Discomfort detection for infants is essential in the healthcare domain, since infants lack the ability to verbalize their pain and discomfort. In this paper, we propose a robust and generic discomfort detection method for infants by exploiting a novel and efficient initialization method for facial landmark localization, using an unsupervised roll-angle estimation. The roll-angle estimation is achieved by fitting a 1st-order B-spline model to facial features obtained from the scale-normalized Laplacian of Gaussian operator. The proposed method can be applied to both daylight and infrared-light images and supports real-time implementation. Experimental results have shown that the proposed method improves the performance of discomfort detection by 6.0% and 4.2% for the AUC and AP using daylight images, together with 6.9% and 3.8% for infrared-light images, respectively.
Automatic detection of the region of interest in corneal endothelium images using dense convolutional neural networks
In images of the corneal endothelium (CE) acquired by specular microscopy, endothelial cells are commonly only visible in a part of the image due to varying contrast, mainly caused by challenging imaging conditions as a result of a strongly curved endothelium. In order to estimate the morphometric parameters of the corneal endothelium, the analyses need to be restricted to trustworthy regions – the region of interest (ROI) – where individual cells are discernible. We developed an automatic method to find the ROI using Dense U-nets, a densely connected network of convolutional layers. We tested the method on a heterogeneous dataset of 140 images, which contains a large number of blurred, noisy, and/or out-of-focus images, where the selection of the ROI for automatic biomarker extraction is vital. By using edge images as input, which can be estimated after retraining the same network, the Dense U-net detected the trustworthy areas with an accuracy of 98.94% and an area under the ROC curve (AUC) of 0.998, without being affected by the class imbalance (9:1 in our dataset). After applying the estimated ROI to the edge images, the mean absolute percentage error (MAPE) in the estimated endothelial parameters was 0.80% for ECD, 3.60% for CV, and 2.55% for HEX.
Pulmonary lobar segmentation from computed tomography scans based on a statistical finite element analysis of lobe shape
Yuwen Zhang, Mahyar Osanlouy, Alys R. Clark, et al.
Automatic identification of pulmonary lobes from imaging is important in disease assessment and treatment planning. However, the lobar fissures can be difficult to detect automatically, as they are thin, usually of fuzzy appearance, and incomplete on CT scans. The fissures can also be obscured by or confused with features of disease, for example the tissue abnormalities that characterize fibrosis. Traditional anatomical knowledge-based methods rely heavily on anatomic knowledge and largely ignore individual variability, which may result in failure to segment pathological lungs. In this study, we aim to overcome difficulties in identifying pulmonary fissures by using a statistical finite element shape model of the lobes to guide lobar segmentation. By deforming a principal component analysis-based statistical shape model onto an individual's lung shape, we predict the likely region of fissure locations to initialize the search region for fissures. Then, a Hessian-matrix eigenvalue analysis and a connected-component eigenvector-based analysis are used to determine a set of fissure-like candidate points. A smooth multi-level B-spline curve is fitted to the most fissure-like points (those with high fissure probability) and the fitted fissure plane is extrapolated to the lung boundaries. The method was tested on 20 inspiratory and expiratory CT scans, and the results show that the algorithm performs well both in healthy young subjects and in older subjects with fibrosis. The method was able to estimate the fissure location in 100% of cases, whereas two comparison segmentation software packages that use anatomy-based methods were unable to segment 7/20 and 9/20 subjects, respectively.
Fully automated detection and quantification of multiple retinal lesions in OCT volumes based on deep learning and improved DRLSE
Automated and quantitative analysis of retinal lesion regions is much needed in clinical practice. In this paper, we propose a method which effectively combines deep learning and improved distance regularized level set evolution (DRLSE) for automatically detecting and segmenting multiple retinal lesions in OCT volumes. The proposed method can segment five different retinal lesions: pigment epithelium detachment (PED), sub-retinal fluid (SRF), drusen, choroidal neovascularization (CNV), and macular holes (MH). We tested 500 B-scans from 15 3D OCT volumes. The experimental results validate the effectiveness and efficiency of the proposed method. Average values of 93.2%, 90.6% and 90.3% were achieved for the average precision (AP), the area under the curve (AUC) at intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05, and the Dice similarity coefficient (DICE), respectively.
Is hippocampus getting bumpier with age: a quantitative analysis of fine-scale dentational feature under the hippocampus on 552 healthy subjects
Shuxiu Cai, Xiaxia Yu, Qiaochu Zhang, et al.
In high-resolution images, a clearly observable morphological feature is the bumpy ridges on the inferior aspect of the hippocampus, which we refer to as hippocampal dentation. The dentations of the hippocampus in normal individuals vary greatly, from highly smooth to highly dentated. The degree of dentation is an interesting feature which has been shown to be correlated with episodic memory performance but not with hippocampal volume. Here we present a study which quantitatively evaluated the degree of bumpiness under the hippocampi in 552 healthy subjects aged from the mid-20s to 80. Specifically, principal component analysis (PCA), nonlinearly fitted to quantify the magnitude and frequency of the hippocampal dentations, was used to identify the major axes of the hippocampus and the dentations beneath it. Preliminary results demonstrate that the level of dentation varies between the left and right hippocampi within subjects, as well as across different age groups. This establishes an objective and quantitative measurement of this feature and can be extended to future comparisons between non-clinical and clinical groups.
Local and global transformations to improve learning of medical images applied to chest radiographs
Vidya M. S., Manikanda Krishnan V., Anirudh G., et al.
A potential drawback of computer-aided diagnosis (CAD) systems is that they tend to capture the noise characteristics along with signal variations due to the limited number of sources used in training. This leads to a decrease in performance on data from different sources. The variations in scanner settings, device manufacturers and sites pose a significant challenge to the learning capabilities of CAD systems for chest radiographs, also called chest X-rays (CXRs). In the proposed work, we investigate whether preprocessing transformations like global normalization along with local enhancements can tackle the variability of data from multiple sources in a supervised CXR classification system. We also propose a detail enhancement filter to enhance both finer structures and opacities in CXRs. With the proposed preprocessing improvement, experiments were performed on 13,000 images across three public and one private data source using a Dense Convolutional Network (DenseNet). The sensitivity at equal error rate (mean ± sd) improved from 0.888 ± 0.043 to 0.931 ± 0.030 by applying a combination of global histogram equalization with the proposed detail enhancement filter, compared to the raw images. We conclude that the proposed transformations are effective in improving the learning of CXRs from different data sources.
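A preprocessing pipeline of this flavor, global histogram equalization followed by a local detail-enhancement step, could be prototyped with scikit-image as below; the unsharp-mask step is a generic stand-in for the paper's proposed detail enhancement filter, and the parameters are assumptions.

```python
# Hedged sketch of CXR preprocessing: global histogram equalization followed by
# a local detail-enhancement step (approximated here by unsharp masking).
import numpy as np
from skimage import exposure, filters

def preprocess_cxr(image):
    img = image.astype(np.float64)
    img = (img - img.min()) / (np.ptp(img) + 1e-8)             # normalize to [0, 1]
    img = exposure.equalize_hist(img)                           # global equalization
    detail = filters.unsharp_mask(img, radius=3, amount=1.0)    # enhance fine structures
    return detail
```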
Region-guided adversarial learning for anatomical landmark detection in uterus ultrasound image
The length and thickness of the uterus and endometrium are morphological characteristics that serve as important measures for uterine diagnosis. In diagnosing the uterus, doctors mark anatomical landmark points of the uterus and endometrium in order to measure their length and thickness. However, it is difficult to reliably detect the landmarks of the uterus and endometrium due to the ambiguous boundaries and heterogeneous textures of uterus transvaginal ultrasound images. In this paper, we propose a novel region-guided adversarial learning framework for anatomical landmark detection in transvaginal ultrasound images, aiming at automatically detecting the landmark points of the uterus and endometrium with diagnostic precision. The proposed framework consists of a landmark predictor and two discriminators for the uterus and endometrium. The landmark predictor detects the desired landmarks of both the uterus and endometrium regions from the transvaginal ultrasound image. The discriminators determine whether the predicted landmarks of the uterus and endometrium are related to their regions or not (i.e., whether the predicted landmark points are on the region boundaries or not). By adversarial learning between the predictor and the discriminators with uterus and endometrium region images, the performance of the landmark predictor is improved. In testing, uterus and endometrium landmarks are predicted with the trained predictor only. Experimental results demonstrate that the proposed method achieves high accuracy in detecting landmarks of the uterus and endometrium in ultrasound images.
The impact of MRI-CT registration errors on deep learning-based synthetic CT generation
Purpose: To investigate the impact of image registration on deep learning-based synthetic CT (sCT) generation. Methods: Paired MR images and CT scans of the pelvic region of radiotherapy patients were obtained and non-rigidly registered. After manual verification of the registrations, the dataset was split into two groups containing either well-registered or poorly registered MR-CT pairs. In three scenarios, a patch-based U-Net deep learning architecture was trained for sCT generation on (i) exclusively well-registered data, (ii) mixtures of well-registered and poorly registered data, or (iii) poorly registered data only. Furthermore, a failure case was designed by introducing a single misregistered subject into a training set of six well-registered subjects. Reconstruction quality was assessed using the mean absolute error (MAE) over the entire body and specifically in bone, while the Dice similarity coefficient (DSC) evaluated cortical bone geometric fidelity. Results: The model trained on well-registered data had an average MAE of 27.6 ± 2.6 HU over the entire body contour and 79.1 ± 16.1 HU in bone. The average cortical bone DSC was 0.89. When patients with registration errors were added to the training, MAEs were higher and DSCs lower, with variations of up to 36 HU in the average bone MAE. The failure case demonstrated the potentially far-reaching consequences of a single misregistered subject in the training set, with variations of up to 38 HU in bone MAE. Conclusion: Poor registration quality of the training set had a negative impact on paired, deep learning-based sCT generation. Notably, as little as one poorly registered MR-CT pair in the training phase was capable of drastically altering a model.
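For reference, the two evaluation metrics used here can be computed as in the sketch below; the bone threshold and the synthetic data are illustrative assumptions, not values from the paper.

```python
import numpy as np

def mean_absolute_error(sct, ct, mask):
    """MAE in Hounsfield units between synthetic and real CT, restricted to a mask
    (e.g. the body contour or a bone mask)."""
    return np.abs(sct[mask] - ct[mask]).mean()

def dice_coefficient(seg_a, seg_b):
    """Dice similarity coefficient between two binary masks (e.g. cortical bone
    extracted from the synthetic and the real CT)."""
    intersection = np.logical_and(seg_a, seg_b).sum()
    return 2.0 * intersection / (seg_a.sum() + seg_b.sum())

# Hypothetical usage: bone defined by a simple HU threshold on both volumes.
rng = np.random.default_rng(1)
ct = rng.normal(0, 300, size=(32, 64, 64))
sct = ct + rng.normal(0, 30, size=ct.shape)
body = np.ones_like(ct, dtype=bool)
print("MAE body:", mean_absolute_error(sct, ct, body))
print("DSC bone:", dice_coefficient(sct > 200, ct > 200))
```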
Evolutionary multi-objective meta-optimization of deformation and tissue removal parameters improves the performance of deformable image registration of pre- and post-surgery images
Kleopatra Pirpinia, Peter A. N. Bosman, Jan-Jakob Sonke, et al.
Breast conserving surgery followed by radiotherapy is the standard of care for early-stage breast cancer patients. Deformable image registration (DIR) can in principle be of great value for accurate localization of the original tumor site to optimize breast irradiation after surgery. However, current state-of-the-art DIR methods are not very successful when tissue is present in one image but not in the other (i.e., in case of content mismatch). To tackle this challenge, we combined a multi-objective DIR approach with simulated tissue removal. Parameters defining the area to be removed, as well as key DIR parameters (which are often tuned manually for each DIR case), are determined by a multi-objective optimization process. In multi-objective optimization, not one but a set of solutions is found, representing high-quality trade-offs between objectives of interest. We used three state-of-the-art multi-objective evolutionary algorithms as meta-optimizers to search for the optimal parameters, and tested our approach on four test cases of computed tomography (CT) images of breast cancer patients before and after surgery. Results show that using meta-optimization with simulated tissue removal improves the performance of DIR. This way, sets of high-quality solutions could be obtained with a mean target registration error of 2.4 mm over four test cases and an estimated excised volume within 20% of the measured volume of the surgical resection specimen.
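The key output of such a multi-objective search is a set of non-dominated trade-off solutions rather than a single optimum. A minimal sketch of selecting the Pareto front among candidate parameter settings is shown below; the two objectives (e.g., an image dissimilarity term and a deformation magnitude term) and the random candidates are assumptions for illustration.

```python
import numpy as np

def pareto_front(objectives):
    """Return a boolean mask of non-dominated rows.

    `objectives` is an (N, M) array where each row holds M objective values
    (all to be minimized) for one candidate parameter setting.
    """
    n = objectives.shape[0]
    nondominated = np.ones(n, dtype=bool)
    for i in range(n):
        # A candidate is dominated if another is no worse in every objective
        # and strictly better in at least one.
        dominated_by_other = (np.all(objectives <= objectives[i], axis=1) &
                              np.any(objectives < objectives[i], axis=1))
        if dominated_by_other.any():
            nondominated[i] = False
    return nondominated

# Hypothetical candidates: columns could be (image dissimilarity, deformation magnitude).
candidates = np.random.default_rng(2).random((50, 2))
front = candidates[pareto_front(candidates)]
print(f"{len(front)} trade-off solutions on the Pareto front")
```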
Renal parenchyma segmentation from abdominal CT images using multi-atlas method with intensity and shape constraints
Segmentation of the renal parenchyma, consisting of the cortex and the medulla responsible for renal function, is necessary to assess contralateral renal hypertrophy and to predict renal function after renal partial nephrectomy (RPN). In this paper, we propose an automatic renal parenchyma segmentation from abdominal CT images using a multi-atlas method with intensity and shape constraints. First, atlas selection is performed to select training images from the training set that are similar in appearance to the target image, using volume-based registration and intensity similarity. Second, the renal parenchyma is segmented using volume- and model-based registration and intensity-constrained locally-weighted voting to segment the cortex and medulla, which have different intensities. Finally, the cortex and medulla are refined using a threshold value selected by a Gaussian mixture model and a cortex slab accumulation map, which reduces leakage into adjacent organs with intensity similar to the medulla and recovers under-segmented areas whose intensity is lower than in the training set. The average Dice similarity coefficient of the renal parenchyma was 92.68%, an improvement of 15.84% and 2.47% over segmentation using majority voting and intensity-constrained locally-weighted voting, respectively. Our method can be used to assess contralateral renal hypertrophy and to predict renal function by measuring the volume change of the renal parenchyma, and can establish the basis for treatment after renal partial nephrectomy.
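A minimal sketch of intensity-constrained locally-weighted label fusion is given below, assuming the atlases have already been registered to the target; the patch size, HU range, and weighting by inverse local mean squared difference are illustrative choices, not the paper's exact formulation.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def intensity_constrained_lw_voting(target, atlas_images, atlas_labels,
                                    hu_range=(80, 300), patch=5):
    """Fuse registered atlas labels into one segmentation of the target volume.

    Each atlas votes per voxel with a weight based on local intensity similarity
    (mean squared difference over a small patch); voxels outside `hu_range`
    receive no votes.
    """
    votes = np.zeros(target.shape, dtype=float)
    weights = np.zeros(target.shape, dtype=float)
    for img, lab in zip(atlas_images, atlas_labels):
        local_mse = uniform_filter((target - img) ** 2, size=patch)
        w = 1.0 / (local_mse + 1e-6)          # higher weight for locally similar atlases
        votes += w * lab
        weights += w
    fused = votes / np.maximum(weights, 1e-6)
    in_range = (target >= hu_range[0]) & (target <= hu_range[1])
    return (fused > 0.5) & in_range

# Hypothetical usage with three registered atlases
rng = np.random.default_rng(3)
tgt = rng.normal(150, 50, size=(16, 32, 32))
atlases = [tgt + rng.normal(0, 20, size=tgt.shape) for _ in range(3)]
labels = [(a > 150).astype(float) for a in atlases]
seg = intensity_constrained_lw_voting(tgt, atlases, labels)
```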
Discrimination of benign and malignant pulmonary tumors in computed tomography: effective priori information of fast learning network architecture
Hao-Jen Wang, Leng-Rong Chen, Li-Wei Chen, et al.
This study explores the influence of prior information on deep learning networks for discriminating benign from malignant pulmonary tumors in computed tomography. Because the number of nodule samples is sparse, we propose a Multiple-Window concept to provide prior knowledge to a convolutional neural network (CNN). In the Multiple-Window CNN, we use five windows, including the lung, abdomen, bone, and chest windows, to generate the nodule samples. By exploiting the wide dynamic range of CT images, this windowing extracts more prior information from a limited amount of data. The results show that CNN performance improves as suitable prior information (window channels) is included. When the input is the original DICOM image, the accuracy of the CNN is 0.82, the sensitivity 0.82, and the specificity 0.82. When the input comprises four window-type channels, the accuracy is 0.90, the sensitivity 0.84, and the specificity 0.96.
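CT windowing of this kind can be sketched as below, where each display window clips and rescales the HU range before the windowed views are stacked as CNN input channels; the specific window levels and widths shown are common conventions, not necessarily those used in the paper.

```python
import numpy as np

# Standard CT display windows, given as (window level, window width) in HU.
WINDOWS = {
    "lung":    (-600, 1500),
    "abdomen": (  40,  400),
    "bone":    ( 400, 1800),
    "chest":   (  50,  350),
}

def apply_window(hu_image, level, width):
    """Clip a CT image (in HU) to one display window and rescale it to [0, 1]."""
    lo, hi = level - width / 2.0, level + width / 2.0
    return (np.clip(hu_image, lo, hi) - lo) / (hi - lo)

def multi_window_channels(hu_patch):
    """Stack windowed versions of a nodule patch as CNN input channels."""
    return np.stack([apply_window(hu_patch, lvl, wid)
                     for lvl, wid in WINDOWS.values()], axis=0)

patch = np.random.default_rng(4).normal(0, 500, size=(64, 64))  # hypothetical nodule patch in HU
channels = multi_window_channels(patch)                          # shape (4, 64, 64)
```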
Constructing an average geometry and diffusion tensor magnetic resonance field from freshly explanted porcine hearts
Mia Mojica, Mihaela Pop, Maxime Sermesant, et al.
The local arrangement of cardiac fibers provides insight into the electrical and mechanical functions of the heart. Fiber directions can be obtained using diffusion tensor (DT) MR imaging and further integrated into computational heart models for accurate predictions of activation times and contraction. However, this information is typically not available due to the limitations of in-vivo cardiac DTI; thus, an average atlas can be used instead of individual fiber directions. In this work, we present a simple and computationally efficient pipeline for constructing a novel statistical cardiac atlas from ex-vivo high-resolution DT images of porcine hearts. Our framework involves normalizing the cardiac geometries, reorienting the local directional diffusion information, and computing the average diffusion tensor field. The registration step eliminates the need for landmarks, while the tensor reorientation strategy transforms the diffusion tensors into the common space while preserving their orientational information.
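One standard way to reorient diffusion tensors under a spatial transformation is the finite-strain strategy, sketched below: the rotational part R of the local Jacobian F is extracted by polar decomposition and applied as D' = R D Rᵀ, which preserves the diffusivities. This is offered as a generic illustration of tensor reorientation, not necessarily the exact strategy used by the authors.

```python
import numpy as np

def finite_strain_rotation(F):
    """Extract the rotational part R of a local affine (Jacobian) matrix F via
    polar decomposition: R = F (F^T F)^(-1/2), computed here through the SVD."""
    u, _, vt = np.linalg.svd(F)
    return u @ vt

def reorient_tensor(D, F):
    """Reorient a 3x3 diffusion tensor with the finite-strain rotation of F,
    preserving its eigenvalues (diffusivities)."""
    R = finite_strain_rotation(F)
    return R @ D @ R.T

# Hypothetical local Jacobian of the geometry-normalizing registration
F = np.array([[1.1, 0.2, 0.0],
              [0.0, 0.9, 0.1],
              [0.0, 0.0, 1.0]])
D = np.diag([1.7e-3, 0.4e-3, 0.2e-3])   # prolate tensor aligned with the fiber direction
D_atlas = reorient_tensor(D, F)
```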
Orbital bone segmentation in head and neck CT images using multi-gray level fully convolutional networks
Min Jin Lee, Helen Hong, Kyu Won Shim, et al.
Segmentation of the orbital bone is necessary for orbital wall reconstruction in cranio-maxillofacial surgery to support the eyeball position and restore the volume and shape of the orbit. However, orbital bone segmentation is challenging because the orbital bone is composed of high-intensity cortical bone and low-intensity trabecular and thin bones. In particular, the thin bones of the orbital medial wall and the orbital floor have intensity values that are nearly indistinguishable from the surrounding soft tissue due to the partial volume effect that occurs when CT images are generated. Thus, we propose an orbital bone segmentation method using multi-graylevel FCNs that segments cortical bone, trabecular bone, and thin bones with different intensities in head-and-neck CT images. To adjust the image properties of each dataset, pixel spacing normalization and intensity normalization are performed. To overcome under-segmentation of the thin bones of the orbital medial wall, the single orbital bone mask is divided into cortical and thin bone masks. The multi-graylevel FCNs are trained separately on the cortical and thin bone masks based on a 2D U-Net, and the cortical and thin bone segmentation results are integrated to obtain the whole orbital bone segmentation. Results show that the multi-graylevel FCNs improve the segmentation accuracy of the thin bones of the medial wall compared with a single-graylevel FCN and thresholding.
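The two normalization steps can be sketched as below, assuming isotropic resampling and clipping to a fixed HU range; the target spacing and HU bounds are illustrative values, not those reported by the authors.

```python
import numpy as np
from scipy.ndimage import zoom

def normalize_spacing(volume, spacing, target_spacing=(1.0, 1.0, 1.0)):
    """Resample a CT volume so that all datasets share the same voxel spacing."""
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    return zoom(volume, factors, order=1)

def normalize_intensity(volume, hu_min=-500.0, hu_max=2000.0):
    """Clip to a fixed HU range covering soft tissue through cortical bone and
    rescale to [0, 1] so that FCN inputs are comparable across datasets."""
    clipped = np.clip(volume, hu_min, hu_max)
    return (clipped - hu_min) / (hu_max - hu_min)

ct = np.random.default_rng(5).normal(0, 400, size=(40, 128, 128))
ct_normalized = normalize_intensity(normalize_spacing(ct, spacing=(2.5, 0.5, 0.5)))
```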
Semi-automatic segmentation of JIA-induced inflammation in MRI images of ankle joints
Anqi Wang, Andreas Franke, Stefan Wesarg
The autoimmune disease Juvenile Idiopathic Arthritis (JIA) affects children under 16 years of age and leads to inflamed synovial membranes in affected joints. In clinical practice, characteristics of these inflamed membranes are used to stage disease progression and to predict erosive bone damage. Manual outlining of inflammatory regions in each slice of an MRI dataset is still the gold standard for detection and quantification; however, this process is tedious and time-consuming. In addition, inter- and intra-observer variability is a known problem with human annotators. We have developed the first method to detect inflamed regions in and around major joints of the human ankle. First, we use an adapted coupled shape model framework to segment the ankle bones in an MRI dataset. Based on these segmentations, joints are defined as locations where two bones are particularly close to each other. A number of potential inflammation candidates are generated using multi-level thresholding. Since inflamed synovial membranes are known to occur in the proximity of joints, we filter out structures with similar intensities, such as vessels and tendon sheaths, using not only a vesselness filter but also their distance to the joints and their size. The method has been evaluated on a set of 10 manually annotated clinical MRI datasets and achieved the following results: precision 0.6785 ± 0.1584, recall 0.5388 ± 0.1213, Dice 0.5696 ± 0.0976.
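A minimal sketch of the candidate-generation and filtering idea is shown below, using multi-Otsu thresholding and discarding components that are too small or too far from the joint mask; the thresholds, distance, and size limits are illustrative, not the paper's values, and the vesselness filtering step is omitted.

```python
import numpy as np
from scipy import ndimage
from skimage.filters import threshold_multiotsu

def inflammation_candidates(mri, joint_mask, max_joint_dist=15, min_size=30):
    """Generate inflammation candidates by multi-level thresholding, then discard
    components that are too small or too far from the joints.

    `mri` is a 3D intensity volume, `joint_mask` a binary mask of the joint regions.
    """
    thresholds = threshold_multiotsu(mri, classes=3)
    bright = mri > thresholds[-1]                          # brightest class as candidate tissue
    dist_to_joint = ndimage.distance_transform_edt(~joint_mask)
    labeled, n = ndimage.label(bright)
    keep = np.zeros_like(bright)
    for comp in range(1, n + 1):
        region = labeled == comp
        if region.sum() < min_size:                        # too small: likely noise
            continue
        if dist_to_joint[region].min() > max_joint_dist:   # too far from any joint
            continue
        keep |= region
    return keep
```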
Obtaining the potential number of models/atlases needed for capturing anatomic variations in population images
Ze Jin, Jayaram K. Udupa, Drew A. Torigian
Many medical image processing and analysis operations can benefit a great deal from prior information encoded in the form of models/atlases to capture variations over a population in form, shape, anatomic layout, and image appearance of objects. However, the fundamental question "How many models/atlases are needed for optimally encoding prior information to address the differing body habitus factor in a given population?" has remained a difficult and open problem. We propose a method to seek an answer to this question, assuming that a set I of images representative of the population for the body region is given. Our approach, after the images in I are trimmed to the exact body region, is to create a partition of I into a specified number n of groups by optimizing the collective similarity of images in each group. We then ascertain how the overall goodness of the partition Pn(I) varies as n changes from 1 to |I|. Subsequently, the values of n at which there are significant changes in the goodness value are determined. These breakpoints are taken as the recommended number of groups/models/atlases. Our results on 284 thoracic computed tomography (CT) scans show that at least 8 groups are essential, and 15, 21, or 32 could be optimum numbers if a finer classification is needed for this population. This method may be helpful for constructing high-quality models/atlases with a proper grouping of the images from a sufficiently large population, and for optimally selecting the training image sets needed for each class in deep learning strategies.
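A minimal sketch of this grouping-and-breakpoint idea is given below, assuming each image is summarized by a feature vector, k-means is used to form the partition, and within-group compactness serves as the goodness measure; the feature vectors and scoring are illustrative stand-ins for the paper's similarity-based partitioning.

```python
import numpy as np
from sklearn.cluster import KMeans

def partition_goodness(features, n_groups):
    """Cluster image feature vectors into n groups and score the partition by the
    mean negative within-cluster distance (higher is better)."""
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit(features)
    dists = np.linalg.norm(features - km.cluster_centers_[km.labels_], axis=1)
    return -dists.mean()

# Hypothetical population: 120 images summarized by 32-dimensional feature vectors.
rng = np.random.default_rng(6)
features = rng.normal(size=(120, 32))
ns = range(1, 41)
goodness = np.array([partition_goodness(features, n) for n in ns])

# Breakpoints: values of n at which the goodness curve jumps the most.
gains = np.diff(goodness)
breakpoints = np.argsort(gains)[::-1][:4] + 2   # +2 maps a diff index to the larger n
print("candidate numbers of atlases:", sorted(breakpoints))
```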
Evaluating the impact of intensity normalization on MR image synthesis
Jacob C. Reinhold, Blake E. Dewey, Aaron Carass, et al.
Image synthesis learns a transformation from the intensity features of an input image to yield an output image with a different tissue contrast. This process has been shown to have application in many medical image analysis tasks, including imputation, registration, and segmentation. To carry out synthesis, the intensities of the input images are typically scaled (i.e., normalized), both in training to learn the transformation and in testing when applying the transformation, but it is not presently known what type of input scaling is optimal. In this paper, we consider seven different intensity normalization algorithms and three different synthesis methods to evaluate the impact of normalization. Our experiments demonstrate that intensity normalization as a preprocessing step improves the synthesis results across all investigated synthesis algorithms. Furthermore, we show evidence suggesting that intensity normalization is vital for successful deep learning-based MR image synthesis.
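Two of the simplest normalization schemes in this family are sketched below (z-score and percentile scaling); they are common examples of MR intensity normalization and are not claimed to be the specific seven algorithms evaluated in the paper.

```python
import numpy as np

def zscore_normalize(mri, mask=None):
    """Z-score normalization: zero mean, unit variance over a region of interest
    (e.g. a brain mask)."""
    region = mri[mask] if mask is not None else mri
    return (mri - region.mean()) / region.std()

def percentile_normalize(mri, p_low=1, p_high=99):
    """Percentile scaling: map the [p_low, p_high] intensity range to [0, 1]."""
    lo, hi = np.percentile(mri, [p_low, p_high])
    return np.clip((mri - lo) / (hi - lo), 0.0, 1.0)

# Hypothetical T1-weighted volume and brain mask
rng = np.random.default_rng(7)
t1 = rng.gamma(2.0, 200.0, size=(32, 64, 64))
brain = t1 > 100
t1_z = zscore_normalize(t1, brain)
t1_p = percentile_normalize(t1)
```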
The utility of deep learning: evaluation of a convolutional neural network for detection of intracranial bleeds on non-contrast head computed tomography studies
P. Ojeda, M. Zawaideh, M. Mossa-Basha, et al.
While rapid detection of intracranial hemorrhage (ICH) on computed tomography (CT) is a critical step in assessing patients with acute neurological symptoms in the emergency setting, prioritizing scans for radiologic interpretation by the acuity of imaging findings remains a challenge and can lead to delays in diagnosis at centers with heavy imaging volumes and limited staff resources. Deep learning has shown promise as a technique for aiding physicians in performing this task accurately and expeditiously, and may be especially useful in a resource-constrained context. Our group evaluated the performance of a convolutional neural network (CNN) model developed by Aidoc (Tel Aviv, Israel). This model is one of the first artificial intelligence devices to receive FDA clearance for enabling radiologists to triage patients after scan acquisition. The algorithm was tested on 7112 non-contrast head CTs acquired during 2016–2017 from two large urban academic and trauma centers. Ground truth labels were assigned to the test data based on PACS query and prior reports by expert neuroradiologists. No scans from these two hospitals had been used during the algorithm training process, and Aidoc staff were at all times blinded to the ground truth labels. Model output was reviewed by three radiologists, and manual error analysis was performed on discordant findings. Specificity was 99%, sensitivity was 95%, and overall accuracy was 98%. In summary, we report promising results for a scalable and clinically pragmatic deep learning model tested on a large set of real-world data from high-volume medical centers. This model holds promise for assisting clinicians in the identification and prioritization of exams suspicious for ICH, facilitating both the diagnosis and treatment of an emergent and life-threatening condition.
Offset regression networks for view plane estimation in 3D fetal ultrasound
Ultrasound (US) is the modality of choice for fetal screening, which includes the assessment of a variety of standardized growth measurements, such as the abdominal circumference (AC). Screening guidelines define criteria for the scan plane in which the measurement is taken. As US is increasingly becoming a 3D modality, approaches for automated determination of the optimal scan plane in a volumetric dataset would greatly improve the workflow. In this work, a novel framework for deep hyperplane learning is proposed and applied to view plane estimation in fetal US examinations. The approach is tightly integrated in the clinical workflow and consists of two main steps. First, the bounding box around the structure of interest is determined in the central slice (MPR). Second, offsets from the structure in the bounding box to the optimal view plane are estimated. By linear regression through the estimated offsets, the view plane coordinates can then be determined. The presented approach is successfully applied to clinical screening data for AC plane estimation, and a high accuracy is obtained, outperforming or comparable to recent publications on the same application.
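The final linear-regression step can be illustrated as a least-squares plane fit through 3D points obtained by adding the regressed offsets to their sample locations; the plane parameterization z = ax + by + c and the synthetic points below are illustrative assumptions.

```python
import numpy as np

def fit_view_plane(points):
    """Fit a plane z = a*x + b*y + c by least squares through 3D offset points
    and return its coefficients (a, b, c)."""
    A = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
    coeffs, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    return coeffs

# Hypothetical points: bounding-box sample locations plus regressed offsets,
# which should lie on the optimal abdominal-circumference plane.
rng = np.random.default_rng(8)
xy = rng.uniform(-20, 20, size=(50, 2))
z = 0.1 * xy[:, 0] - 0.05 * xy[:, 1] + 30 + rng.normal(0, 0.5, size=50)
plane_points = np.column_stack([xy, z])
a, b, c = fit_view_plane(plane_points)
print(f"estimated plane: z = {a:.2f}x + {b:.2f}y + {c:.2f}")
```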