In this paper, we propose to learn an Unsupervised Single Object Tracker (USOT) from scratch. We identify three major challenges, i.e., moving object discovery, rich temporal variation exploitation, and online update, as the central causes of the performance bottleneck of existing unsupervised trackers. To narrow the gap between unsupervised trackers and their supervised counterparts, we propose an effective unsupervised learning approach composed of three stages. First, we sample sequentially moving objects with unsupervised optical flow and dynamic programming, instead of random cropping. Second, we train a naive Siamese tracker from scratch using single-frame pairs. Third, we continue training the tracker with a novel cycle memory learning scheme, which is conducted over longer temporal spans and also enables our tracker to update online. Extensive experiments show that the proposed USOT, learned from unlabeled videos, outperforms state-of-the-art unsupervised trackers by large margins and performs on par with recent supervised deep trackers. Code is available at https://github.com/VISION-SJTU/USOT.
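To make the first stage concrete, the snippet below is a minimal sketch (not the official USOT code) of picking one moving-object box per frame by dynamic programming: each candidate box carries a flow-based motion score, and a displacement penalty keeps the selected boxes temporally consistent. The candidate boxes, the scoring, and the penalty weight are illustrative assumptions.

# Minimal sketch (not the official USOT code): pick one candidate box per frame so that
# flow-based motion scores are high while boxes move smoothly across frames.
# Candidate boxes, motion scores, and the penalty weight are illustrative assumptions.
import numpy as np

def select_box_sequence(motion_scores, centers, smooth_weight=0.05):
    """motion_scores: (T, K) flow-magnitude score of each candidate box per frame.
    centers: (T, K, 2) box centers. Returns one candidate index per frame via DP."""
    T, K = motion_scores.shape
    dp = np.zeros((T, K))               # best cumulative score ending at (t, k)
    back = np.zeros((T, K), dtype=int)
    dp[0] = motion_scores[0]
    for t in range(1, T):
        # pairwise displacement penalty between candidates of consecutive frames
        disp = np.linalg.norm(centers[t][None, :, :] - centers[t - 1][:, None, :], axis=-1)
        trans = dp[t - 1][:, None] - smooth_weight * disp   # (K_prev, K_cur)
        back[t] = trans.argmax(axis=0)
        dp[t] = motion_scores[t] + trans.max(axis=0)
    # backtrack the best-scoring sequence of boxes
    seq = [int(dp[-1].argmax())]
    for t in range(T - 1, 0, -1):
        seq.append(int(back[t, seq[-1]]))
    return seq[::-1]

# toy usage: 5 frames, 3 candidate boxes each
rng = np.random.default_rng(0)
print(select_box_sequence(rng.random((5, 3)), rng.random((5, 3, 2)) * 100))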
190 - Ji-Wei He, Xin-Chao Ma, Yu Ye 2021
Let $A$ be a noetherian Koszul Artin-Schelter regular algebra, and let $f\in A_2$ be a central regular element of $A$. The quotient algebra $A/(f)$ is usually called a (noncommutative) quadric hypersurface. In this paper, we use the Clifford deformation to study the quadric hypersurfaces obtained from tensor products. We introduce a notion of simple graded isolated singularity and prove that, if $B/(g)$ is a simple graded isolated singularity of 0-type, then there is an equivalence of triangulated categories $\underline{\text{mcm}}\, A/(f)\cong\underline{\text{mcm}}\,(A\otimes B)/(f+g)$ between the stable categories of maximal Cohen-Macaulay modules. This result may be viewed as a generalization of Knörrer's periodicity theorem. As an application, we study the double branch cover $(A/(f))^{\#}=A[x]/(f+x^2)$ of a noncommutative conic $A/(f)$.
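For context, the result above generalizes the classical commutative form of Knörrer's periodicity theorem; a standard statement (for a power series ring $S=k[[x_0,\dots,x_n]]$ over a field $k$ with $\mathrm{char}\,k\neq 2$ and $0\neq f\in S$) is recalled below only as a reminder, not taken from the abstract:

$\underline{\text{mcm}}\, S/(f) \;\cong\; \underline{\text{mcm}}\, S[[u,v]]/(f+u^2+v^2),$

with the double branch cover $S[[u]]/(f+u^2)$ appearing as the intermediate step, the commutative analogue of $(A/(f))^{\#}=A[x]/(f+x^2)$ above.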
The explosive growth of image data facilitates the fast development of image processing and computer vision methods for emerging visual applications, while introducing novel distortions to the processed images. This poses a grand challenge to existing blind image quality assessment (BIQA) models, which fail to continually adapt to such subpopulation shift. Recent work suggests training BIQA methods on the combination of all available human-rated IQA datasets. However, this type of approach is not scalable to a large number of datasets, and makes it cumbersome to incorporate a newly created dataset. In this paper, we formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets, building on what was learned from previously seen data. We first identify five desiderata in the new setting, together with a measure to quantify the plasticity-stability trade-off. We then propose a simple yet effective method for learning BIQA models continually. Specifically, based on a shared backbone network, we add a prediction head for each new dataset, and enforce a regularizer that allows all prediction heads to evolve with new data while being resistant to catastrophic forgetting of old data. We compute the quality score by an adaptive weighted summation of the estimates from all prediction heads. Extensive experiments demonstrate the promise of the proposed continual learning method in comparison to standard training techniques for BIQA.
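A minimal sketch of the multi-head design described above follows; it is not the authors' implementation. The tiny backbone, the per-dataset linear heads, and the softmax gating used for the adaptive weighted summation are illustrative assumptions.

# Minimal sketch of the shared-backbone, multi-head idea (not the authors' code).
import torch
import torch.nn as nn

class MultiHeadBIQA(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        self.backbone = nn.Sequential(             # shared feature extractor (toy CNN)
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleList()                # one regression head per dataset
        self.gate = nn.ModuleList()                 # produces a weight (logit) per head
        self.feat_dim = feat_dim

    def add_head(self):                             # call when a new IQA dataset arrives
        self.heads.append(nn.Linear(self.feat_dim, 1))
        self.gate.append(nn.Linear(self.feat_dim, 1))

    def forward(self, x):
        f = self.backbone(x)
        scores = torch.cat([h(f) for h in self.heads], dim=1)    # (B, num_heads)
        weights = torch.softmax(torch.cat([g(f) for g in self.gate], dim=1), dim=1)
        return (weights * scores).sum(dim=1)        # adaptively weighted quality score

model = MultiHeadBIQA()
model.add_head(); model.add_head()                  # e.g., after seeing two datasets
print(model(torch.randn(4, 3, 224, 224)).shape)     # torch.Size([4])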
85 - Li Wang 2020
Using a unique data set containing about 15.06 million truck transportation records over five months, we investigate the highway freight transportation diversity of 338 Chinese cities based on the truck transportation probability $p_{ij}$ from one city to another. The transportation probabilities are calculated from the radiation model based on the geographic distance, and from its cost-based version based on the driving distance as a proxy of cost. For each model, we consider both the population and the gross domestic product, and find quantitatively very similar results. We find that the transportation probabilities have nice power-law tails with tail exponents close to 0.5 for all the models. The two transportation probabilities in each model fall around the diagonal $p_{ij}=p_{ji}$ but are often not the same. In addition, the corresponding transportation probabilities calculated from the raw radiation model and the cost-based radiation model also fluctuate around the diagonal $p_{ij}^{\rm geo}=p_{ij}^{\rm cost}$. We calculate four sets of highway truck transportation diversity according to the four sets of transportation probabilities, which are found to be close to each other for each city pair. Further, it is found that the population, the gross domestic product, the in-flux, and the out-flux scale as power laws with respect to the transportation diversity in the raw and cost-based radiation models. This implies that a more developed city usually has higher diversity in highway truck transportation, reflecting the fact that a more developed city usually has a more diverse economic structure.
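The abstract does not spell out the radiation-model probability itself; for reference, a sketch of the standard form (Simini et al.) is given below with toy city masses and distances. Per the abstract, the cost-based variant would simply replace the geographic-distance matrix with driving distances; whether the paper uses exactly this normalization is an assumption.

# Sketch of the standard radiation model, shown only to make p_ij concrete.
# Inputs (masses, distance matrix) are toy assumptions.
import numpy as np

def radiation_probability(mass, dist):
    """mass: (N,) city masses (population or GDP); dist: (N, N) pairwise distances.
    Returns p[i, j] = m_i * m_j / ((m_i + s_ij) * (m_i + m_j + s_ij)),
    where s_ij is the total mass within radius dist[i, j] of city i, excluding i and j."""
    N = len(mass)
    p = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            within = dist[i] < dist[i, j]           # cities strictly closer to i than j
            s_ij = mass[within].sum() - mass[i]     # exclude the source city itself
            p[i, j] = mass[i] * mass[j] / ((mass[i] + s_ij) * (mass[i] + mass[j] + s_ij))
    return p

# toy usage with 4 cities
rng = np.random.default_rng(1)
m = rng.uniform(1e5, 1e7, size=4)
xy = rng.uniform(0, 1000, size=(4, 2))
d = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)
print(radiation_probability(m, d))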
We propose first-order methods based on a level-set technique for convex constrained optimization that satisfies an error bound condition with unknown growth parameters. The proposed approach solves the original problem by solving a sequence of unconstrained subproblems defined with different level parameters. Different from the existing level-set methods where the subproblems are solved sequentially, our method applies a first-order method to solve each subproblem independently and simultaneously, which can be implemented with either a single or multiple processors. Once the objective value of one subproblem is reduced by a constant factor, a sequential restart is performed to update the level parameters and restart the first-order methods. When the problem is non-smooth, our method finds an $\epsilon$-optimal and $\epsilon$-feasible solution by computing at most $O(\frac{G^{2/d}}{\epsilon^{2-2/d}}\ln^3(\frac{1}{\epsilon}))$ subgradients, where $G>0$ and $d\geq 1$ are the growth rate and the exponent, respectively, in the error bound condition. When the problem is smooth, the complexity is improved to $O(\frac{G^{1/d}}{\epsilon^{1-1/d}}\ln^3(\frac{1}{\epsilon}))$. Our methods do not require knowing $G$, $d$, or any other problem-dependent parameters.
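The sketch below illustrates only the level-set subproblem that each parallel copy would solve; the restart schedule and the adaptive choice of level parameters described above are not reproduced. The toy objective, step size, and level grid are assumptions for illustration.

# For min f(x) s.t. g(x) <= 0, a level parameter r defines the unconstrained subproblem
# min_x max{f(x) - r, g(x)}; several levels are handled by independent subgradient-descent
# copies (the part that can run in parallel). Restarts from the paper are not shown.
import numpy as np

def solve_level(f, g, f_grad, g_grad, r, x0, step=1e-2, iters=2000):
    x = x0.copy()
    for _ in range(iters):
        # subgradient of h_r(x) = max{f(x) - r, g(x)}
        grad = f_grad(x) if f(x) - r >= g(x) else g_grad(x)
        x -= step * grad
    return x, max(f(x) - r, g(x))

# toy problem: minimize ||x||^2 subject to x_0 >= 1, i.e. g(x) = 1 - x_0 <= 0
f = lambda x: float(x @ x);      f_grad = lambda x: 2 * x
g = lambda x: float(1 - x[0]);   g_grad = lambda x: np.array([-1.0, 0.0])

levels = [0.0, 0.5, 1.0, 2.0]                       # candidate level parameters
results = [solve_level(f, g, f_grad, g_grad, r, np.zeros(2)) for r in levels]
for r, (x, h) in zip(levels, results):
    print(f"level r={r}: x={np.round(x, 3)}, h_r(x)={h:.3f}")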
92 - Pan Zhou, Jiashi Feng, Chao Ma 2020
It is not yet clear why ADAM-like adaptive gradient algorithms suffer from worse generalization performance than SGD despite their faster training speed. This work aims to provide an understanding of this generalization gap by analyzing their local convergence behaviors. Specifically, we observe heavy tails of gradient noise in these algorithms. This motivates us to analyze these algorithms through their Levy-driven stochastic differential equations (SDEs) because of the similar convergence behaviors of an algorithm and its SDE. We then establish the escaping time of these SDEs from a local basin. The result shows that (1) the escaping time of both SGD and ADAM depends on the Radon measure of the basin positively and on the heaviness of gradient noise negatively; (2) for the same basin, SGD enjoys smaller escaping time than ADAM, mainly because (a) the geometry adaptation in ADAM via adaptively scaling each gradient coordinate diminishes the anisotropic structure in gradient noise and results in a larger Radon measure of a basin; (b) the exponential gradient average in ADAM smooths its gradient and leads to lighter gradient noise tails than SGD. So SGD is more locally unstable than ADAM at sharp minima, defined as the minima whose local basins have small Radon measure, and can better escape from them to flatter ones with larger Radon measure. As flat minima, which often refer to the minima at flat or asymmetric basins/valleys, often generalize better than sharp ones \cite{keskar2016large,he2019asymmetric}, our result explains the better generalization performance of SGD over ADAM. Finally, experimental results confirm our heavy-tailed gradient noise assumption and theoretical findings.
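As a toy illustration of the escaping-time intuition (not the paper's analysis), the snippet below simulates an overdamped particle in a quadratic basin driven by alpha-stable noise: the heavier the noise tail (smaller alpha), the sooner the particle typically leaves the basin. The basin shape, noise scale, and Euler discretization are assumptions made for illustration.

# Toy Levy-driven escape simulation; parameters are illustrative assumptions.
import numpy as np
from scipy.stats import levy_stable

def escape_time(alpha, radius=2.0, dt=0.01, scale=0.3, max_steps=50000, seed=0):
    noise = levy_stable.rvs(alpha, 0.0, scale=scale, size=max_steps, random_state=seed)
    x = 0.0
    for t in range(max_steps):
        x += -x * dt + (dt ** (1.0 / alpha)) * noise[t]   # drift of basin 0.5*x^2 + Levy jump
        if abs(x) > radius:                               # particle has left the basin
            return t * dt
    return float("inf")

# heavier-tailed noise (smaller alpha) typically escapes much sooner than Gaussian (alpha=2)
for a in (1.2, 1.6, 2.0):
    print(f"alpha={a}: escape time ~ {escape_time(a):.2f}")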
96 - Chao Ma, Guohua Gu, Xin Miao 2020
Infrared target tracking plays an important role in both civil and military fields. The main challenges in designing a robust and high-precision tracker for infrared sequences include overlap, occlusion, and appearance change. To this end, this paper proposes an infrared target tracker based on a proximal robust principal component analysis method. First, the observation matrix is decomposed into a sparse occlusion matrix and a low-rank target matrix, and the constrained optimization is carried out with a proximal norm that performs better than the L1-norm. To solve this convex optimization problem, the Alternating Direction Method of Multipliers (ADMM) is employed to estimate the variables alternately. Finally, a particle filter framework with a model update strategy is exploited to locate the target. Through a series of experiments on real infrared target sequences, the effectiveness and robustness of our algorithm are demonstrated.
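For readers unfamiliar with the decomposition step, the sketch below implements the classic L1-based robust PCA via ADMM (singular value thresholding for the low-rank part, soft-thresholding for the sparse part). It only illustrates the low-rank/sparse split; the paper replaces the L1 term with a proximal norm, and the parameters and stopping rule here are simple assumptions.

# Classic L1-based RPCA via ADMM: min ||L||_* + lam ||S||_1 s.t. D = L + S.
import numpy as np

def rpca_admm(D, lam=None, mu=None, iters=200, tol=1e-7):
    m, n = D.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = mu or 0.25 * m * n / np.abs(D).sum()
    L = np.zeros_like(D); S = np.zeros_like(D); Y = np.zeros_like(D)
    for _ in range(iters):
        # L-update: singular value thresholding of D - S + Y/mu
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0)) @ Vt
        # S-update: elementwise soft-thresholding (the L1 proximal step)
        R = D - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0)
        Y += mu * (D - L - S)                        # dual variable update
        if np.linalg.norm(D - L - S) / (np.linalg.norm(D) + 1e-12) < tol:
            break
    return L, S

# toy usage: low-rank background plus sparse "occlusions"
rng = np.random.default_rng(0)
low = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 80))
sparse = (rng.random((60, 80)) < 0.05) * rng.standard_normal((60, 80)) * 10
L, S = rpca_admm(low + sparse)
print(np.linalg.matrix_rank(L), np.count_nonzero(np.abs(S) > 1e-6))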
102 - Hao Zhang, Sen Li, Yinchao Ma 2020
This paper aims to understand and improve the utility of the dropout operation from the perspective of game-theoretic interactions. We prove that dropout can suppress the strength of interactions between input variables of deep neural networks (DNNs). The theoretical proof is also verified by various experiments. Furthermore, we find that such interactions are strongly related to the over-fitting problem in deep learning. Thus, the utility of dropout can be regarded as decreasing interactions to alleviate the significance of over-fitting. Based on this understanding, we propose an interaction loss to further improve the utility of dropout. Experimental results show that the interaction loss can effectively improve the utility of dropout and boost the performance of DNNs.
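One common way to make "interaction between input variables" concrete is the marginal-contribution form sketched below, where absent variables are replaced by a baseline. Using a zero baseline and a single fixed context are simplifying assumptions; the paper's game-theoretic definition averages over contexts in a Shapley-style manner and may differ in details.

# Illustrative pairwise interaction: v(S+{i,j}) - v(S+{i}) - v(S+{j}) + v(S),
# where v masks absent variables with a baseline. A simplification, not the paper's exact metric.
import torch

def pairwise_interaction(model, x, i, j, context_mask, baseline=None):
    """x: (D,) input; context_mask: (D,) bool for variables kept in the context S."""
    baseline = torch.zeros_like(x) if baseline is None else baseline
    def v(keep):                                     # model output with only `keep` present
        masked = torch.where(keep, x, baseline)
        return model(masked.unsqueeze(0)).squeeze()
    s = context_mask.clone(); s[i] = False; s[j] = False
    si, sj, sij = s.clone(), s.clone(), s.clone()
    si[i] = True; sj[j] = True; sij[i] = True; sij[j] = True
    return (v(sij) - v(si) - v(sj) + v(s)).item()

# toy usage with a small MLP on 8 input variables
model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
x = torch.randn(8)
ctx = torch.ones(8, dtype=torch.bool)
print(pairwise_interaction(model, x, 0, 1, ctx))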
Deep generative models often perform poorly in real-world applications due to the heterogeneity of natural data sets. Heterogeneity arises from data containing different types of features (categorical, ordinal, continuous, etc.) and from features of the same type having different marginal distributions. We propose an extension of variational autoencoders (VAEs) called VAEM to handle such heterogeneous data. VAEM is a deep generative model that is trained in a two-stage manner such that the first stage provides a more uniform representation of the data to the second stage, thereby sidestepping the problems caused by heterogeneous data. We provide extensions of VAEM to handle partially observed data, and demonstrate its performance in data generation, missing data prediction, and sequential feature selection tasks. Our results show that VAEM broadens the range of real-world applications where deep generative models can be successfully deployed.
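A compact sketch of the two-stage training idea follows, under strong simplifying assumptions: every feature is treated as continuous with a Gaussian likelihood, stage one fits one tiny marginal VAE per feature, and stage two fits a VAE over the concatenated stage-one latents. VAEM's actual per-type likelihoods and training details are not reproduced here.

# Two-stage sketch (toy, heavily simplified relative to VAEM).
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, x_dim, z_dim, hidden=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, hidden), nn.ReLU(), nn.Linear(hidden, x_dim))
    def loss(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()    # reparameterization trick
        recon = ((self.dec(z) - x) ** 2).sum(-1)                # squared-error reconstruction
        kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(-1)
        return (recon + kl).mean()
    def encode(self, x):
        return self.enc(x).chunk(2, dim=-1)[0]

# stage 1: one tiny marginal VAE per feature, giving each feature a more uniform latent space
X = torch.randn(256, 5) * torch.tensor([1., 10., 0.1, 3., 100.])   # toy heterogeneous data
marginals = [TinyVAE(1, 1) for _ in range(X.shape[1])]
for d, vae in enumerate(marginals):
    opt = torch.optim.Adam(vae.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad(); vae.loss(X[:, d:d+1]).backward(); opt.step()

# stage 2: a dependency VAE trained on the concatenated stage-1 latents
Z1 = torch.cat([vae.encode(X[:, d:d+1]) for d, vae in enumerate(marginals)], dim=-1).detach()
dependency = TinyVAE(Z1.shape[1], 2)
opt = torch.optim.Adam(dependency.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad(); dependency.loss(Z1).backward(); opt.step()
print("stage-2 loss:", dependency.loss(Z1).item())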
Multi-agent reinforcement learning (MARL) under partial observability has long been considered challenging, primarily due to the requirement for each agent to maintain a belief over all other agents' local histories, a domain that generally grows exponentially over time. In this work, we investigate a partially observable MARL problem in which agents are cooperative. To enable the development of tractable algorithms, we introduce the concept of an information state embedding that serves to compress agents' histories. We quantify how the compression error influences the resulting value functions for decentralized control. Furthermore, we propose an instance of the embedding based on recurrent neural networks (RNNs). The embedding is then used as an approximate information state, and can be fed into any MARL algorithm. The proposed embed-then-learn pipeline opens the black box of existing (partially observable) MARL algorithms, allowing us to establish some theoretical guarantees (error bounds on value functions) while still achieving competitive performance with many end-to-end approaches.
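A minimal sketch of the RNN-based embedding for a single agent is given below: a GRU compresses the sequence of observations and previous actions into a fixed-size vector that a downstream MARL algorithm can consume instead of the full history. The dimensions and the GRU choice are illustrative assumptions, not the paper's exact architecture.

# Toy RNN history embedding standing in for an approximate information state.
import torch
import torch.nn as nn

class HistoryEmbedding(nn.Module):
    def __init__(self, obs_dim, act_dim, embed_dim=32):
        super().__init__()
        self.rnn = nn.GRU(obs_dim + act_dim, embed_dim, batch_first=True)
    def forward(self, obs_seq, act_seq):
        """obs_seq: (B, T, obs_dim); act_seq: (B, T, act_dim) previous actions.
        Returns (B, embed_dim), the compressed history after T steps."""
        _, h = self.rnn(torch.cat([obs_seq, act_seq], dim=-1))
        return h.squeeze(0)

# toy usage: the embedding replaces the growing history as input to e.g. a Q-network
embed = HistoryEmbedding(obs_dim=10, act_dim=4)
q_head = nn.Linear(32, 4)                               # per-action values from the embedding
obs = torch.randn(8, 15, 10); acts = torch.randn(8, 15, 4)
print(q_head(embed(obs, acts)).shape)                   # torch.Size([8, 4])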