REFUGE Challenge: A Unified Framework for Evaluating Automated Methods for Glaucoma Assessment from Fundus Photographs


Added by José Ignacio Orlando PhD

Publication date 2019

Fields: Informatics Engineering

Research language: English

Authors: Jose Ignacio Orlando, Huazhu Fu, João Barbossa Breda

Computer Vision and Pattern Recognition


Glaucoma is one of the leading causes of irreversible but preventable blindness in working-age populations. Color fundus photography (CFP) is the most cost-effective imaging modality to screen for retinal disorders. However, its application to glaucoma has been limited to the computation of a few related biomarkers such as the vertical cup-to-disc ratio. Deep learning approaches, although widely applied for medical image analysis, have not been extensively used for glaucoma assessment due to the limited size of the available data sets. Furthermore, the lack of a standardized benchmark strategy makes it difficult to compare existing methods in a uniform way. To overcome these issues, we set up the Retinal Fundus Glaucoma Challenge, REFUGE (https://refuge.grand-challenge.org), held in conjunction with MICCAI 2018. The challenge consisted of two primary tasks, namely optic disc/cup segmentation and glaucoma classification. As part of REFUGE, we have publicly released a data set of 1200 fundus images with ground truth segmentations and clinical glaucoma labels, currently the largest existing one. We have also built an evaluation framework to ease and ensure fairness in the comparison of different models, encouraging the development of novel techniques in the field. Twelve teams qualified and participated in the online challenge. This paper summarizes their methods and analyzes their corresponding results. In particular, we observed that two of the top-ranked teams outperformed two human experts in the glaucoma classification task. Furthermore, the segmentation results were in general consistent with the ground truth annotations, with complementary outcomes that can be further exploited by ensembling the results.
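As a rough illustration of the biomarker named in the abstract, the sketch below computes a vertical cup-to-disc ratio from binary optic cup and optic disc masks. It is not the challenge's evaluation code. The function names, the list-of-rows mask representation, and the toy masks are all assumptions made for this example. The Dice overlap is included only as one common way to compare a segmentation against ground truth; the abstract does not say which metrics REFUGE used.

```python
def vertical_extent(mask):
    """Number of rows spanned by the foreground of a binary mask (list of rows)."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0


def vertical_cup_to_disc_ratio(cup_mask, disc_mask):
    """Vertical cup diameter divided by vertical disc diameter, both in pixels."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("disc mask is empty")
    return vertical_extent(cup_mask) / disc_height


def dice(pred, truth):
    """Dice overlap between two binary masks of identical shape (assumed metric)."""
    inter = sum(p * t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    return 2 * inter / total if total else 1.0


# Hypothetical 6x6 toy masks: the disc covers rows 1..4, the cup rows 2..3.
disc = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)] for r in range(6)]
cup = [[1 if 2 <= r <= 3 and 2 <= c <= 3 else 0 for c in range(6)] for r in range(6)]

vcdr = vertical_cup_to_disc_ratio(cup, disc)  # cup height 2 / disc height 4
```

On real fundus images the masks would come from a segmentation model or from manual annotation, and the vertical axis would need to match the anatomical orientation of the photograph.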


Predicting Cardiovascular Risk Factors from Retinal Fundus Photographs using Deep Learning

Ryan Poplin, Avinash V. Varadarajan, Katy Blumer (2017)

Traditionally, medical discoveries are made by observing associations and then designing experiments to test these hypotheses. However, observing and quantifying associations in images can be difficult because of the wide variety of features, patterns, colors, values, and shapes in real data. In this paper, we use deep learning, a machine learning technique that learns its own features, to discover new knowledge from retinal fundus images. Using models trained on data from 284,335 patients, and validated on two independent datasets of 12,026 and 999 patients, we predict cardiovascular risk factors not previously thought to be present or quantifiable in retinal images, such as age (within 3.26 years), gender (0.97 AUC), smoking status (0.71 AUC), HbA1c (within 1.39%), systolic blood pressure (within 11.23 mmHg), as well as major adverse cardiac events (0.70 AUC). We further show that our models used distinct aspects of the anatomy to generate each prediction, such as the optic disc or blood vessels, opening avenues of further research.

Computer Vision and Pattern Recognition

A Fully Automated System for Sizing Nasal PAP Masks Using Facial Photographs

Benjamin Johnston, Philip de Chazal (2018)

We present a fully automated system for sizing nasal Positive Airway Pressure (PAP) masks. The system comprises a mix of HOG object detectors as well as multiple convolutional neural network stages for facial landmark detection. The models were trained using samples from the publicly available PUT and MUCT datasets, while transfer learning was also employed to improve the performance of the models on facial photographs of actual PAP mask users. The fully automated system demonstrated an overall accuracy of 64.71% in correctly selecting the appropriate mask size and 86.1% accuracy in sizing within one mask size.

Computer Vision and Pattern Recognition

A Framework for Evaluating Approximation Methods for Gaussian Process Regression

Krzysztof Chalupka, Christopher K. I. Williams, Iain Murray (2012)

Gaussian process (GP) predictors are an important component of many Bayesian approaches to machine learning. However, even a straightforward implementation of Gaussian process regression (GPR) requires O(n^2) space and O(n^3) time for a dataset of n examples. Several approximation methods have been proposed, but there is a lack of understanding of the relative merits of the different approximations, and in what situations they are most useful. We recommend assessing the quality of the predictions obtained as a function of the compute time taken, and comparing to standard baselines (e.g., Subset of Data and FITC). We empirically investigate four different approximation algorithms on four different prediction problems, and make our code available to encourage future comparisons.
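To make the cost figures in this abstract concrete, here is a minimal sketch of exact GP regression in plain Python. It is not the authors' released code. The RBF kernel, the noise level, the one-dimensional inputs, and all function names are assumptions chosen for illustration. The full kernel matrix accounts for the O(n^2) storage, and the Cholesky factorization for the O(n^3) time.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalars (assumed kernel choice)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A; the nested loops cost O(n^3) time."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward, then backward, substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_predict_mean(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean of a zero-mean GP with an RBF kernel."""
    n = len(x_train)
    # Storing the full n x n kernel matrix is the O(n^2) space cost.
    K = [[rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve_cholesky(cholesky(K), y_train)
    return [sum(rbf(xs, x_train[i]) * alpha[i] for i in range(n)) for xs in x_test]


# Hypothetical toy data: three training points, one test input.
x_train, y_train = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
mean_at_one = gp_predict_mean(x_train, y_train, [1.0])[0]
```

Under these assumptions, the Subset of Data baseline mentioned in the abstract corresponds to calling `gp_predict_mean` on only m < n of the training points, which reduces the factorization cost to O(m^3) at the price of discarding data.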

Machine Learning Machine Learning Computation

Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation

Mingkai Deng, Bowen Tan, Zhengzhong Liu (2021)

Natural language generation (NLG) spans a broad range of tasks, each of which serves specific objectives and requires different properties of the generated text. This complexity makes automatic evaluation of NLG particularly challenging. Previous work has typically focused on a single task and developed individual evaluation metrics based on specific intuitions. In this paper, we propose a unifying perspective based on the nature of information change in NLG tasks, including compression (e.g., summarization), transduction (e.g., text rewriting), and creation (e.g., dialog). Information alignment between input, context, and output text plays a common central role in characterizing the generation. With automatic alignment prediction models, we develop a family of interpretable metrics that are suitable for evaluating key aspects of different NLG tasks, often without the need for gold reference data. Experiments show that the uniformly designed metrics achieve stronger or comparable correlations with human judgement compared to state-of-the-art metrics in each of the diverse tasks, including text summarization, style transfer, and knowledge-grounded dialog.

Computation and Language Machine Learning

A Unified Multiscale Framework for Discrete Energy Minimization

Shai Bagon, Meirav Galun (2012)

Discrete energy minimization is a ubiquitous task in computer vision, yet is NP-hard in most cases. In this work we propose a multiscale framework for coping with the NP-hardness of discrete optimization. Our approach utilizes algebraic multiscale principles to efficiently explore the discrete solution space, yielding improved results on challenging, non-submodular energies for which current methods provide unsatisfactory approximations. In contrast to popular multiscale methods in computer vision, which build an image pyramid, our framework acts directly on the energy to construct an energy pyramid. Deriving a multiscale scheme from the energy itself makes our framework application independent and widely applicable. Our framework gives rise to two complementary energy coarsening strategies: one in which coarser scales involve fewer variables, and a more revolutionary one in which the coarser scales involve fewer discrete labels. We empirically evaluated our unified framework on a variety of both non-submodular and submodular energies, including energies from the Middlebury benchmark.

Computer Vision and Pattern Recognition Discrete Mathematics