ﻻ يوجد ملخص باللغة العربية
The width of a neural network matters since increasing the width will necessarily increase the model capacity. However, the performance of a network does not improve linearly with the width and soon gets saturated. In this case, we argue that increasing the number of networks (ensemble) can achieve better accuracy-efficiency trade-offs than purely increasing the width. To prove it, one large network is divided into several small ones regarding its parameters and regularization components. Each of these small networks has a fraction of the original ones parameters. We then train these small networks together and make them see various views of the same data to increase their diversity. During this co-training process, networks can also learn from each other. As a result, small networks can achieve better ensemble performance than the large one with few or no extra parameters or FLOPs. Small networks can also achieve faster inference speed than the large one by concurrent running on different devices. We validate our argument with 8 different neural architectures on common benchmarks through extensive experiments. The code is available at url{https://github.com/mzhaoshuai/Divide-and-Co-training}.
A robot can invoke heterogeneous computation resources such as CPUs, cloud GPU servers, or even human computation for achieving a high-level goal. The problem of invoking an appropriate computation model so that it will successfully complete a task w
The goal of this paper is to serve as a guide for selecting a detection architecture that achieves the right speed/memory/accuracy balance for a given application and platform. To this end, we investigate various ways to trade accuracy for speed and
This paper is a study of performance-efficiency trade-offs in pre-trained models for automatic speech recognition (ASR). We focus on wav2vec 2.0, and formalize several architecture designs that influence both the model performance and its efficiency.
Trade-offs between accuracy and efficiency are found in multiple non-computing domains, such as law and public health, which have developed rules and heuristics to guide how to balance the two in conditions of uncertainty. While accuracy-efficiency t
Since the global spread of Covid-19 began to overwhelm the attempts of governments to conduct manual contact-tracing, there has been much interest in using the power of mobile phones to automate the contact-tracing process through the development of