Speed/accuracy trade-offs for modern convolutional object detectors


Added by Jonathan Huang

Publication date 2016

Fields Informatics Engineering

Language English

Authors Jonathan Huang - Vivek Rathod - Chen Sun

Computer Vision and Pattern Recognition


Abstract

The goal of this paper is to serve as a guide for selecting a detection architecture that achieves the right speed/memory/accuracy balance for a given application and platform. To this end, we investigate various ways to trade accuracy for speed and memory usage in modern convolutional object detection systems. A number of successful systems have been proposed in recent years, but apples-to-apples comparisons are difficult due to different base feature extractors (e.g., VGG, Residual Networks), different default image resolutions, as well as different hardware and software platforms. We present a unified implementation of the Faster R-CNN [Ren et al., 2015], R-FCN [Dai et al., 2016] and SSD [Liu et al., 2015] systems, which we view as meta-architectures, and trace out the speed/accuracy trade-off curve created by using alternative feature extractors and varying other critical parameters such as image size within each of these meta-architectures. On one extreme end of this spectrum, where speed and memory are critical, we present a detector that achieves real-time speeds and can be deployed on a mobile device. On the opposite end, where accuracy is critical, we present a detector that achieves state-of-the-art performance measured on the COCO detection task.
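As a rough illustration of what tracing out a speed/accuracy trade-off curve can mean in practice, the sketch below keeps only the configurations that are not beaten on both latency and accuracy by some other configuration. The `Detector` type, the `pareto_frontier` helper, the configuration names and all numbers are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import NamedTuple


class Detector(NamedTuple):
    name: str          # hypothetical configuration label
    latency_ms: float  # inference time per image (lower is better)
    map_score: float   # mean average precision (higher is better)


def pareto_frontier(detectors):
    """Return the detectors not dominated in (latency, accuracy), fastest first."""
    frontier = []
    best_map = float("-inf")
    # Sort by latency ascending; break latency ties by higher accuracy first.
    for d in sorted(detectors, key=lambda d: (d.latency_ms, -d.map_score)):
        # A slower detector is only worth keeping if it is more accurate
        # than every faster detector seen so far.
        if d.map_score > best_map:
            frontier.append(d)
            best_map = d.map_score
    return frontier


# Made-up numbers for illustration only; not results from the paper.
CANDIDATES = [
    Detector("config_a", 30.0, 0.20),
    Detector("config_b", 80.0, 0.28),
    Detector("config_c", 90.0, 0.25),  # slower and less accurate than config_b
    Detector("config_d", 400.0, 0.35),
]

FRONTIER_NAMES = [d.name for d in pareto_frontier(CANDIDATES)]
```

Under these assumed numbers, `config_c` drops out because `config_b` is both faster and more accurate. A practitioner would then pick the point on the remaining frontier that fits the latency budget of the target platform.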


Towards Better Accuracy-efficiency Trade-offs: Divide and Co-training

Shuai Zhao, Liguang Zhou, Wenxiao Wang 2020

The width of a neural network matters since increasing the width will necessarily increase the model capacity. However, the performance of a network does not improve linearly with the width and soon gets saturated. In this case, we argue that increasing the number of networks (ensemble) can achieve better accuracy-efficiency trade-offs than purely increasing the width. To prove it, one large network is divided into several small ones regarding its parameters and regularization components. Each of these small networks has a fraction of the original one's parameters. We then train these small networks together and make them see various views of the same data to increase their diversity. During this co-training process, networks can also learn from each other. As a result, small networks can achieve better ensemble performance than the large one with few or no extra parameters or FLOPs. Small networks can also achieve faster inference speed than the large one by running concurrently on different devices. We validate our argument with 8 different neural architectures on common benchmarks through extensive experiments. The code is available at https://github.com/mzhaoshuai/Divide-and-Co-training.

Computer Vision and Pattern Recognition

Diversity-enabled sweet spots in layered architectures and speed-accuracy trade-offs in sensorimotor control

Yorie Nakahira, Quanying Liu, Terrence J. Sejnowski 2019

Nervous systems sense, communicate, compute and actuate movement using distributed components with severe trade-offs in speed, accuracy, sparsity, noise and saturation. Nevertheless, brains achieve remarkably fast, accurate, and robust control performance due to a highly effective layered control architecture. Here we introduce a driving task to study how a mountain biker mitigates the immediate disturbance of trail bumps and responds to changes in trail direction. We manipulated the time delays and accuracy of the control input from the wheel as a surrogate for manipulating the characteristics of neurons in the control loop. The observed speed-accuracy trade-offs (SATs) motivated a theoretical framework consisting of layers of control loops with components having diverse speeds and accuracies within each physical level, such as nerve bundles containing axons with a wide range of sizes. Our model explains why the errors from two control loops -- one fast but inaccurate reflexive layer that corrects for bumps, and a planning layer that is slow but accurate -- are additive, and shows how the errors in each control loop can be decomposed into the errors caused by the limited speeds and accuracies of the components. These results demonstrate that an appropriate diversity in the properties of neurons across layers helps to create diversity-enabled sweet spots (DESSs) so that both fast and accurate control is achieved using slow or inaccurate components.

Optimization and Control, Information Theory, Systems and Control

Privacy-accuracy trade-offs in noisy digital exposure notifications

Abbas Hammoud, Yun William Yu 2020

Since the global spread of Covid-19 began to overwhelm the attempts of governments to conduct manual contact-tracing, there has been much interest in using the power of mobile phones to automate the contact-tracing process through the development of exposure notification applications. The rough idea is simple: use Bluetooth or other data-exchange technologies to record contacts between users, enable users to report positive diagnoses, and alert users who have been exposed to sick users. Of course, there are many privacy concerns associated with this idea. Much of the work in this area has been concerned with designing mechanisms for tracing contacts and alerting users that do not leak additional information about users beyond the existence of exposure events. However, although designing practical protocols is of crucial importance, it is essential to realize that notifying users about exposure events may itself leak confidential information (e.g. that a particular contact has been diagnosed). Luckily, while digital contact tracing is a relatively new task, the generic problem of privacy and data disclosure has been studied for decades. Indeed, the framework of differential privacy further permits provable query privacy by adding random noise. In this article, we translate two results from statistical privacy and social recommendation algorithms to exposure notification. We thus prove some naive bounds on the degree to which accuracy must be sacrificed if exposure notification frameworks are to be made more private through the injection of noise.
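To make the idea of buying privacy with random noise concrete, here is a minimal sketch of the standard Laplace mechanism applied to a count of exposure events. It is a textbook construction and not necessarily the mechanism analysed in the article; the function names, the choice of a count query with sensitivity 1, and the trial settings are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_exposure_count(true_count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism: release a count with epsilon-differential privacy.

    Assumes one user changes the count by at most `sensitivity`.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=20000, seed=0):
    """Estimate the accuracy cost of a given privacy level by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(noisy_exposure_count(true_count, epsilon, rng) - true_count)
    return total / trials
```

The expected absolute error of this mechanism equals `sensitivity / epsilon`, so a smaller `epsilon` (stronger privacy) directly costs accuracy, which is the kind of trade-off the abstract refers to.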

Cryptography and Security, Computers and Society

Interpretable Trade-offs Between Robot Task Accuracy and Compute Efficiency

Bineet Ghosh, Sandeep Chinchali, Parasara Sridhar Duggirala 2021

A robot can invoke heterogeneous computation resources such as CPUs, cloud GPU servers, or even human computation for achieving a high-level goal. The problem of invoking an appropriate computation model so that it will successfully complete a task while keeping its compute and energy costs within a budget is called a model selection problem. In this paper, we present an optimal solution to the model selection problem with two compute models, the first being fast but less accurate, and the second being slow but more accurate. The main insight behind our solution is that a robot should invoke the slower compute model only when the benefits from the gain in accuracy outweigh the computational costs. We show that such cost-benefit analysis can be performed by leveraging the statistical correlation between the accuracy of fast and slow compute models. We demonstrate the broad applicability of our approach to diverse problems such as perception using neural networks and safe navigation of a simulated Mars rover.
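The cost-benefit rule described above can be sketched as a simple comparison of expected losses. This is only an illustration of the stated insight, not the paper's optimal solution: the function name, the use of the fast model's confidence as a stand-in for its accuracy on the current input, and the `error_penalty` and `slow_cost` parameters are all assumptions.

```python
def should_invoke_slow(fast_confidence, slow_accuracy, error_penalty, slow_cost):
    """Return True if the expected loss reduction from the slow model exceeds its cost.

    fast_confidence: assumed probability that the fast model is right on this input
    slow_accuracy:   assumed probability that the slow model is right
    error_penalty:   cost of acting on a wrong answer (same units as slow_cost)
    slow_cost:       compute/energy cost of running the slow model
    """
    # Expected loss if the robot trusts the fast model's answer.
    fast_loss = (1.0 - fast_confidence) * error_penalty
    # Expected loss if the robot uses the slow model's answer instead.
    slow_loss = (1.0 - slow_accuracy) * error_penalty
    return (fast_loss - slow_loss) > slow_cost


# Illustrative values only: a confident fast model does not justify the extra cost,
# while an uncertain one does.
CONFIDENT_CASE = should_invoke_slow(0.95, 0.99, error_penalty=10.0, slow_cost=1.0)
UNCERTAIN_CASE = should_invoke_slow(0.60, 0.99, error_penalty=10.0, slow_cost=1.0)
```

With these assumed values, the slow model is skipped when the fast model is 95% confident and invoked when it is only 60% confident.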

Robotics

Understanding and Improving Fairness-Accuracy Trade-offs in Multi-Task Learning

Yuyan Wang, Xuezhi Wang, Alex Beutel 2021

As multi-task models gain popularity in a wider range of machine learning applications, it is becoming increasingly important for practitioners to understand the fairness implications associated with those models. Most existing fairness literature focuses on learning a single task more fairly, while how ML fairness interacts with multiple tasks in the joint learning setting is largely under-explored. In this paper, we are concerned with how group fairness (e.g., equal opportunity, equalized odds) as an ML fairness concept plays out in the multi-task scenario. In multi-task learning, several tasks are learned jointly to exploit task correlations for a more efficient inductive transfer. This presents a multi-dimensional Pareto frontier on (1) the trade-off between group fairness and accuracy with respect to each task, as well as (2) the trade-offs across multiple tasks. We aim to provide a deeper understanding of how group fairness interacts with accuracy in multi-task learning, and we show that traditional approaches that mainly focus on optimizing the Pareto frontier of multi-task accuracy might not perform well on fairness goals. We propose a new set of metrics to better capture the multi-dimensional Pareto frontier of fairness-accuracy trade-offs uniquely presented in a multi-task learning setting. We further propose a Multi-Task-Aware Fairness (MTA-F) approach to improve fairness in multi-task learning. Experiments on several real-world datasets demonstrate the effectiveness of our proposed approach.
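For readers unfamiliar with the group fairness notions mentioned above, the sketch below computes an equal opportunity gap for a single task, taken here as the largest difference in true positive rate between groups. It illustrates the standard single-task definition only; it is not the new multi-task metrics or the MTA-F method proposed in the paper, and the function names and toy data are hypothetical.

```python
def true_positive_rate(labels, preds):
    """Fraction of positive examples (label 1) that were predicted positive."""
    hits = [p for y, p in zip(labels, preds) if y == 1]
    if not hits:
        raise ValueError("no positive examples in this group")
    return sum(hits) / len(hits)


def equal_opportunity_gap(labels, preds, groups):
    """Largest TPR difference between any two groups (0.0 means equal opportunity)."""
    rates = []
    for g in sorted(set(groups)):
        # Select the examples belonging to group g.
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates.append(true_positive_rate([labels[i] for i in idx],
                                        [preds[i] for i in idx]))
    return max(rates) - min(rates)


# Toy binary data for one task; values are made up for illustration.
LABELS = [1, 1, 1, 1, 0, 0]
PREDS = [1, 1, 1, 0, 0, 1]
GROUPS = [0, 0, 1, 1, 0, 1]

GAP = equal_opportunity_gap(LABELS, PREDS, GROUPS)
```

In a multi-task setting one could compute such a gap per task and compare it against per-task accuracy, which gives a feel for the multi-dimensional trade-off the abstract describes.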

Machine Learning