Since the beginning of this decade, convolutional neural networks (CNNs) have been a very successful tool for computer vision tasks. The invention of the CNN was inspired by neuroscience, and it shares many anatomical similarities with our visual system. Inspired by the anatomy of the human visual system, we argue that the existing U-Net architecture can be improved in several ways. Because the human visual system uses an attention mechanism, we use attention concatenation in place of normal concatenation. Although a CNN is purely feed-forward in nature, anatomical evidence shows that our brain contains recurrent synapses, and they often outnumber feed-forward and top-down connections. This fact inspires us to use recurrent convolution connections in place of the normal convolution blocks in U-Net. This paper also addresses the class imbalance issue in the field of medical image analysis, resolving it with the help of state-of-the-art loss functions. We argue that our proposed architecture can be trained end to end with little training data and that it outperforms the other variants of U-Net.
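The abstract above does not specify how its recurrent convolution block is built. As a minimal sketch of one common formulation, assuming a 1D signal, a fixed feed-forward response, and a state that is repeatedly re-convolved and added back, the idea can be illustrated as follows. The function names, the kernels, and the number of steps are hypothetical and chosen only for illustration.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation that keeps the input length (odd kernel only)."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(signal))]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def recurrent_conv1d(signal, w_forward, w_recurrent, steps):
    """Assumed recurrent convolution: the feed-forward response is computed once,
    then the state is refined `steps` times using its own convolved output."""
    feed = conv1d_same(signal, w_forward)
    state = relu(feed)
    for _ in range(steps):
        recurrent = conv1d_same(state, w_recurrent)
        state = relu([f + r for f, r in zip(feed, recurrent)])
    return state


# Toy example: the recurrent kernel copies each position's left neighbour,
# so every extra step lets the first sample influence one more position.
signal = [1.0, 0.0, 0.0]
identity = [0.0, 1.0, 0.0]
shift_right = [1.0, 0.0, 0.0]
for steps in range(3):
    print(steps, recurrent_conv1d(signal, identity, shift_right, steps))
```

The toy run shows why recurrence is attractive: the effective receptive field grows with each step without adding new weights.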
Purpose: To evaluate nerve fiber layer (NFL) reflectance for glaucoma diagnosis. Methods: Participants were imaged with 4.5 × 4.5-mm volumetric disc scans using spectral-domain optical coherence tomography (OCT). The normalized NFL reflectance map was processed by an azimuthal filter to reduce directional reflectance bias due to variation of beam incidence angle. The peripapillary area of the map was divided into 160 superpixels. Average reflectance was the mean of superpixel reflectance. Low-reflectance superpixels were identified as those with NFL reflectance below the 5th-percentile normative cutoff. Focal reflectance loss was measured by summing loss in low-reflectance superpixels. Results: Thirty-five normal, 30 pre-perimetric glaucoma (PPG), and 35 perimetric glaucoma (PG) participants were enrolled. Azimuthal filtering improved the repeatability of the normalized NFL reflectance, as measured by the pooled superpixel standard deviation (SD), from 0.73 to 0.57 dB (p<0.001, paired t-test) and reduced the population SD from 2.14 to 1.78 dB (p<0.001, t-test). Most glaucomatous reflectance maps showed characteristic patterns of contiguous wedge or diffuse defects. Focal NFL reflectance loss had significantly higher diagnostic sensitivity than the best NFL thickness parameter (overall, inferior, or focal loss volume): 53% vs. 23% (p=0.027) in PPG eyes and 100% vs. 80% (p=0.023) in PG eyes, with the specificity fixed at 99%. Conclusions: Azimuthal filtering reduces the variability of NFL reflectance measurements. Focal NFL reflectance loss has excellent glaucoma diagnostic accuracy compared to the standard NFL thickness parameters. The reflectance map may be useful for localizing NFL defects.
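The superpixel summary described in the Methods can be sketched as below. This is only an illustration under stated assumptions: the abstract does not say how the per-superpixel loss is defined, so the sketch assumes it is the deficit relative to a normative reference value in dB. The function names, argument names, and the four-superpixel example (the study uses 160) are hypothetical.

```python
def average_reflectance(reflectance_db):
    """Mean of the superpixel reflectance values (dB)."""
    return sum(reflectance_db) / len(reflectance_db)


def focal_reflectance_loss(reflectance_db, reference_db, cutoff_db):
    """Sum the reflectance deficit over low-reflectance superpixels.

    Each argument holds one value per superpixel, in dB:
      reflectance_db: measured normalized NFL reflectance
      reference_db:   normative reference value (assumed definition of "normal")
      cutoff_db:      5th-percentile normative cutoff
    Returns (total_loss_db, indices_of_low_superpixels).
    """
    if not (len(reflectance_db) == len(reference_db) == len(cutoff_db)):
        raise ValueError("inputs must have one value per superpixel")
    total = 0.0
    low = []
    for i, (r, ref, cut) in enumerate(zip(reflectance_db, reference_db, cutoff_db)):
        if r < cut:             # low-reflectance superpixel
            low.append(i)
            total += ref - r    # assumed loss: deficit from the normative reference
    return total, low


# Toy map with four superpixels instead of 160.
measured = [0.0, -3.0, -0.5, -4.0]
reference = [0.0, 0.0, 0.0, 0.0]
cutoff = [-2.0, -2.0, -2.0, -2.0]
print(average_reflectance(measured))                        # -1.875
print(focal_reflectance_loss(measured, reference, cutoff))  # (7.0, [1, 3])
```

Restricting the sum to superpixels below the cutoff is what makes the measure focal: mild, diffuse variation within the normal range contributes nothing.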
The narwhal is one of the most mysterious marine mammals, due to its isolated habitat in the Arctic region. Tagging is a technology with the potential to reveal the activities of this species by collecting behavioral information from instrumented individuals. This includes accelerometer, diving, and acoustic data as well as GPS positioning. An essential element in understanding the ecological role of toothed whales is to characterize their feeding behavior and estimate the amount of food they consume. Buzzes are sounds emitted by toothed whales that are directly related to foraging behavior. It is therefore of interest to measure or estimate the rate of buzzing in order to estimate prey intake. The main goal of this paper is to find a way to detect prey capture attempts directly from accelerometer data, and thus to estimate food consumption without the need for the more demanding acoustic data. We develop three automated buzz detection methods based solely on accelerometer and depth data. We use a dataset from five narwhals instrumented in East Greenland in 2018 to train, validate, and test a logistic regression model and two machine learning algorithms, random forest and deep learning, using the buzzes detected from acoustic data as the ground truth. The deep learning algorithm performed best among the tested methods. We conclude that reliable buzz detectors can be derived from high-frequency-sampling, back-mounted accelerometer tags, thus providing an alternative tool for studies of the foraging ecology of marine mammals in their natural environments. We also compare buzz detection with certain movement patterns, such as sudden changes in acceleration (jerks), that are used in other marine mammal species to estimate prey capture. We find that narwhals do not seem to make big jerks when foraging and conclude that their hunting patterns differ in that respect from those of other marine mammals.
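The jerk signal mentioned above, a sudden change in acceleration, is commonly quantified as the norm of the sample-to-sample difference of the three-axis acceleration, scaled by the sampling rate. The sketch below illustrates that general definition and a simple threshold detector. It is not the detection pipeline of the study, and the function names, the threshold, and the sample values are hypothetical.

```python
import math


def jerk_norm(acc, fs_hz):
    """Norm of the jerk between consecutive samples.

    acc:   sequence of (x, y, z) acceleration samples
    fs_hz: sampling rate in Hz (dividing by the sample interval = multiplying by fs)
    Returns one value per pair of consecutive samples (len(acc) - 1 values).
    """
    out = []
    for (x0, y0, z0), (x1, y1, z1) in zip(acc, acc[1:]):
        diff = math.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2)
        out.append(fs_hz * diff)
    return out


def jerk_peaks(jerk, threshold):
    """Indices at which the jerk norm exceeds a chosen (hypothetical) threshold."""
    return [i for i, value in enumerate(jerk) if value > threshold]


# Toy trace sampled at 10 Hz with one abrupt change between samples 1 and 2.
acc = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (3.0, 4.0, 1.0), (3.0, 4.0, 1.0)]
jerk = jerk_norm(acc, fs_hz=10)
print(jerk)                  # [0.0, 50.0, 0.0]
print(jerk_peaks(jerk, 20))  # [1]
```

A detector of this kind finds few events when the animal does not make large, abrupt movements while feeding, which is the behavior the abstract reports for narwhals.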
The novel coronavirus disease 2019 (COVID-19) has been spreading rapidly around the world and has had a significant impact on public health and the economy. However, there is still a lack of studies on effectively quantifying the lung infection caused by COVID-19. As a basic but challenging task of the diagnostic framework, segmentation plays a crucial role in the accurate quantification of COVID-19 infection measured by computed tomography (CT) images. To this end, we propose a novel deep learning algorithm for automated segmentation of multiple COVID-19 infection regions. Specifically, we use Aggregated Residual Transformations to learn a robust and expressive feature representation and apply a soft attention mechanism to improve the capability of the model to distinguish a variety of symptoms of COVID-19. With a public CT image dataset, we validate the efficacy of the proposed algorithm in comparison with other competing methods. Experimental results demonstrate the outstanding performance of our algorithm for automated segmentation of COVID-19 chest CT images. Our study provides a promising deep learning-based segmentation tool that lays a foundation for the quantitative diagnosis of COVID-19 lung infection in CT images.
Fundus photography has routinely been used to document the presence and severity of retinal degenerative diseases such as age-related macular degeneration (AMD), glaucoma, and diabetic retinopathy (DR) in clinical practice, for which the fovea and optic disc (OD) are important retinal landmarks. However, the occurrence of lesions, drusen, and other retinal abnormalities during retinal degeneration severely complicates automatic landmark detection and segmentation. Here we propose HBA-U-Net: a U-Net backbone enriched with hierarchical bottleneck attention. The network consists of a novel bottleneck attention block that combines and refines self-attention, channel attention, and relative-position attention to highlight retinal abnormalities that may be important for fovea and OD segmentation in the degenerated retina. HBA-U-Net achieved state-of-the-art results on fovea detection across datasets and eye conditions (ADAM: Euclidean Distance (ED) of 25.4 pixels, REFUGE: 32.5 pixels, IDRiD: 32.1 pixels), on OD segmentation for AMD (ADAM: Dice Coefficient (DC) of 0.947), and on OD detection for DR (IDRiD: ED of 20.5 pixels). Our results suggest that HBA-U-Net may be well suited for landmark detection in the presence of a variety of retinal degenerative diseases.
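The two metrics reported above have standard definitions: the Euclidean distance (ED) between a predicted and a reference landmark, and the Dice coefficient (DC) between a predicted and a reference mask. A minimal sketch of both, assuming binary masks stored as nested lists of 0/1 and landmarks as (x, y) pixel coordinates, is given below. The function names and the convention of returning 1.0 for two empty masks are choices made for this illustration.

```python
import math


def euclidean_distance(pred_xy, true_xy):
    """Distance in pixels between a predicted and a reference landmark."""
    return math.hypot(pred_xy[0] - true_xy[0], pred_xy[1] - true_xy[1])


def dice_coefficient(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    intersection = pred_total = true_total = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += p * t
            pred_total += p
            true_total += t
    if pred_total + true_total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * intersection / (pred_total + true_total)


# Toy example: a fovea estimate 5 pixels off, and a mask with one extra pixel.
print(euclidean_distance((3, 4), (0, 0)))       # 5.0
pred = [[1, 1], [0, 0]]
truth = [[1, 0], [0, 0]]
print(round(dice_coefficient(pred, truth), 3))  # 0.667
```

Lower ED is better for detection, while DC ranges from 0 (no overlap) to 1 (identical masks) for segmentation.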
Automatic segmentation of multi-sequence (multi-modal) cardiac MR (CMR) images plays a significant role in the diagnosis and management of a variety of cardiac diseases. However, the performance of relevant algorithms depends heavily on the proper fusion of the multi-modal information. Furthermore, particular pathologies, such as myocardial infarction, display irregular shapes on images and occupy small regions at random locations. These facts make pathology segmentation of multi-modal CMR images a challenging task. In this paper, we present the Max-Fusion U-Net, which achieves improved pathology segmentation performance given aligned multi-modal images of the LGE, T2-weighted, and bSSFP modalities. Specifically, modality-specific features are extracted by dedicated encoders and then fused with the pixel-wise maximum operator. Together with the corresponding encoding features, these representations are propagated to the decoding layers through U-Net skip connections. Furthermore, a spatial-attention module is applied in the last decoding layer to encourage the network to focus on the small, semantically meaningful pathological regions that trigger relatively high responses from the network neurons. We also use a simple image patch extraction strategy to dynamically resample training examples with varying spatial and batch sizes. With limited GPU memory, this strategy reduces class imbalance and forces the model to focus on regions around the pathology of interest. It further improves segmentation accuracy and reduces the misclassification of pathology. We evaluate our methods using the myocardial pathology segmentation (MyoPS) dataset, which combines multi-sequence CMR images from three modalities. Extensive experiments demonstrate the effectiveness of the proposed model, which outperforms the related baselines.
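The fusion step described above, taking the pixel-wise maximum over modality-specific feature maps, can be illustrated with a minimal sketch. In the paper the inputs are features produced by learned encoders. Here plain nested lists with made-up values stand in for them, and the function name is hypothetical.

```python
def max_fuse(feature_maps):
    """Fuse equal-shaped 2D feature maps by taking the maximum at every pixel."""
    if not feature_maps:
        raise ValueError("at least one feature map is required")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    for fm in feature_maps:
        if len(fm) != height or any(len(row) != width for row in fm):
            raise ValueError("all feature maps must share one shape (aligned inputs)")
    return [[max(fm[i][j] for fm in feature_maps) for j in range(width)]
            for i in range(height)]


# Made-up 2x2 responses standing in for encoder features of the three modalities.
lge = [[0.1, 0.9], [0.3, 0.2]]
t2 = [[0.5, 0.4], [0.1, 0.8]]
bssfp = [[0.2, 0.2], [0.6, 0.1]]
print(max_fuse([lge, t2, bssfp]))  # [[0.5, 0.9], [0.6, 0.8]]
```

Because the maximum keeps the strongest response at each location regardless of which modality produced it, a pathology visible in only one sequence still survives fusion. The shape check reflects the abstract's requirement that the multi-modal images be aligned.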