ﻻ يوجد ملخص باللغة العربية
The discriminator from generative adversarial nets (GAN) has been used by researchers as a feature extractor in transfer learning and appeared worked well. However, there are also studies that believe this is the wrong research direction because intuitively the task of the discriminator focuses on separating the real samples from the generated ones, making features extracted in this way useless for most of the downstream tasks. To avoid this dilemma, we first conducted a thorough theoretical analysis of the relationship between the discriminator task and the features extracted. We found that the connection between the task of the discriminator and the feature is not as strong as was thought, for that the main factor restricting the feature learned by the discriminator is not the task, but is the need to prevent the entire GAN model from mode collapse during the training. From this perspective and combined with further analyses, we found that to avoid mode collapse, the features extracted by the discriminator are not guided to be different for the real samples, but divergence without noise is indeed allowed and occupies a large proportion of the feature space. This makes the features more robust and helps answer the question as to why the discriminator can succeed as a feature extractor in related research. Consequently, to expose the essence of the discriminator extractor as different from other extractors, we analyze the counterpart of the discriminator extractor, the classifier extractor that assigns the target samples to different categories. We found the performance of the discriminator extractor may be inferior to the classifier based extractor when the source classification task is similar to the target task, which is the common case, but the ability to avoid noise prevents the discriminator from being replaced by the classifier.
We propose a Variational Time Series Feature Extractor (VTSFE), inspired by the VAE-DMP model of Chen et al., to be used for action recognition and prediction. Our method is based on variational autoencoders. It improves VAE-DMP in that it has a bett
Given a sufficiently large amount of labeled data, the non-convex low-rank matrix recovery problem contains no spurious local minima, so a local optimization algorithm is guaranteed to converge to a global minimum starting from any initial guess. How
It is difficult to be able to imitate well in unknown states from a small amount of expert data and sampling data. Supervised learning methods such as Behavioral Cloning do not require sampling data, but usually suffer from distribution shift. The me
Deep neural networks are typically initialized with random weights, with variances chosen to facilitate signal propagation and stable gradients. It is also believed that diversity of features is an important property of these initializations. We cons
Since the early 1980s, the research community has developed ever more sophisticated algorithms for the problem of energy disaggregation, but despite decades of research, there is still a dearth of applications with demonstrated value. In this work, w