Recently, deep learning has been applied to video recognition problems because of its strong representation ability. Deep neural networks for video tasks are highly customized, and designing them requires domain expertise and costly trial and error. Recent advances in network architecture search have improved image recognition performance by a large margin. However, the automatic design of video recognition networks is less explored. In this study, we propose a practical solution, namely Practical Video Neural Architecture Search (PV-NAS). PV-NAS can efficiently search across a very large number of architectures in a novel spatial-temporal network search space using gradient-based search methods. To avoid getting stuck in sub-optimal solutions, we propose a novel learning rate scheduler that encourages sufficient diversity among the searched models. Extensive empirical evaluations show that the proposed PV-NAS achieves state-of-the-art performance with far fewer computational resources. 1) Among light-weight models, our PV-NAS-L achieves 78.7% and 62.5% Top-1 accuracy on Kinetics-400 and Something-Something V2, respectively, outperforming previous state-of-the-art methods (i.e., TSM) by a large margin (4.6% and 3.4% on the two datasets, respectively), and 2) among medium-weight models, our PV-NAS-M achieves the best performance (also a new record) on the Something-Something V2 dataset.
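To make "gradient-based search" concrete, the following is a minimal toy sketch of a generic continuous relaxation in the style of differentiable NAS; it is not the PV-NAS search space, supernet, or learning rate scheduler, none of which are specified in this abstract. The candidate operations, the scalar data, and the names `CANDIDATES`, `softmax`, and `search` are all hypothetical and chosen only for illustration. Each candidate is weighted by a softmax over architecture parameters, the parameters are updated by gradient descent on a squared-error loss, and the highest-weighted candidate is selected at the end.

```python
import math

# Hypothetical candidate operations for a single layer (toy scalar stand-ins;
# a real video search space would contain spatial/temporal convolutions).
CANDIDATES = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def search(data, steps=200, lr=0.5):
    """Gradient descent on architecture parameters alpha (one per candidate).

    The layer output is the softmax(alpha)-weighted sum of all candidate
    outputs, so the loss is differentiable with respect to alpha.
    """
    names = list(CANDIDATES)
    alpha = [0.0] * len(names)
    for _ in range(steps):
        grad = [0.0] * len(names)
        for x, y in data:
            probs = softmax(alpha)
            outs = [CANDIDATES[n](x) for n in names]
            mixed = sum(p * o for p, o in zip(probs, outs))
            dloss = 2.0 * (mixed - y)  # derivative of (mixed - y)^2 w.r.t. mixed
            for j in range(len(names)):
                # d(mixed)/d(alpha_j) = p_j * (o_j - mixed) for a softmax mixture
                grad[j] += dloss * probs[j] * (outs[j] - mixed)
        alpha = [a - lr * g / len(data) for a, g in zip(alpha, grad)]
    probs = softmax(alpha)
    best = max(range(len(names)), key=lambda i: probs[i])
    return names[best], dict(zip(names, probs))


if __name__ == "__main__":
    # Toy target: y = 2x, so the "double" candidate should be selected.
    toy_data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5)]
    chosen, weights = search(toy_data)
    print(chosen, weights)
```

In this toy setting every candidate is evaluated at every step, which is also why relaxations of this kind become memory-hungry as the number of candidates grows.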
In the field of complex action recognition in videos, the quality of the designed model plays a crucial role in the final performance. However, manually designed network structures often rely heavily on the researchers' knowledge and experience. A
Differential Neural Architecture Search (NAS) requires all layer choices to be held in memory simultaneously; this limits the size of both search space and final architecture. In contrast, Probabilistic NAS, such as PARSEC, learns a distribution over
We present BN-NAS, neural architecture search with Batch Normalization, to accelerate neural architecture search (NAS). BN-NAS can significantly reduce the time required by model training and evaluation in NAS. Specifically, for fast evaluat
Efficient search is a core issue in Neural Architecture Search (NAS). It is difficult for conventional NAS algorithms to directly search the architectures on large-scale tasks like ImageNet. In general, the cost of GPU hours for NAS grows with regard
The state-of-the-art object detection method is complicated with various modules such as backbone, feature fusion neck, RPN and RCNN head, where each module may have different designs and structures. How to leverage the computational cost and accurac