ترغب بنشر مسار تعليمي؟ اضغط هنا

Deep reinforcement learning (RL) has shown great empirical successes, but suffers from brittleness and sample inefficiency. A potential remedy is to use a previously-trained policy as a source of supervision. In this work, we refer to these policies as teachers and study how to transfer their expertise to new student policies by focusing on data usage. We propose a framework, Data CUrriculum for Reinforcement learning (DCUR), which first trains teachers using online deep RL, and stores the logged environment interaction history. Then, students learn by running either offline RL or by using teacher data in combination with a small amount of self-generated data. DCURs central idea involves defining a class of data curricula which, as a function of training time, limits the student to sampling from a fixed subset of the full teacher data. We test teachers and students using state-of-the-art deep RL algorithms across a variety of data curricula. Results suggest that the choice of data curricula significantly impacts student learning, and that it is beneficial to limit the data during early training stages while gradually letting the data availability grow over time. We identify when the student can learn offline and match teacher performance without relying on specialized offline RL algorithms. Furthermore, we show that collecting a small fraction of online data provides complementary benefits with the data curriculum. Supplementary material is available at https://tinyurl.com/teach-dcur.
Mixup is a recent regularizer for current deep classification networks. Through training a neural network on convex combinations of pairs of examples and their labels, it imposes locally linear constraints on the models input space. However, such str ict linear constraints often lead to under-fitting which degrades the effects of regularization. Noticeably, this issue is getting more serious when the resource is extremely limited. To address these issues, we propose the Adversarial Mixing Policy (AMP), organized in a min-max-rand formulation, to relax the Locally Linear Constraints in Mixup. Specifically, AMP adds a small adversarial perturbation to the mixing coefficients rather than the examples. Thus, slight non-linearity is injected in-between the synthetic examples and synthetic labels. By training on these data, the deep networks are further regularized, and thus achieve a lower predictive error rate. Experiments on five text classification benchmarks and five backbone models have empirically shown that our methods reduce the error rate over Mixup variants in a significant margin (up to 31.3%), especially in low-resource conditions (up to 17.5%).
We show the properties and characterization of coherence witnesses. We show methods for constructing coherence witnesses for an arbitrary coherent state. We investigate the problem of finding common coherence witnesses for certain class of states. We show that finitely many different witnesses $W_1, W_2, cdots, W_n$ can detect some common coherent states if and only if $sum_{i=1}^nt_iW_i$ is still a witnesses for any nonnegative numbers $t_i(i=1,2,cdots,n)$. We show coherent states play the role of high-level witnesses. Thus, the common state problem is changed into the question of when different high-level witnesses (coherent states) can detect the same coherence witnesses. Moreover, we show a coherent state and its robust state have no common coherence witness and give a general way to construct optimal coherence witnesses for any comparable states.
In this paper, we propose to learn an Unsupervised Single Object Tracker (USOT) from scratch. We identify that three major challenges, i.e., moving object discovery, rich temporal variation exploitation, and online update, are the central causes of t he performance bottleneck of existing unsupervised trackers. To narrow the gap between unsupervised trackers and supervised counterparts, we propose an effective unsupervised learning approach composed of three stages. First, we sample sequentially moving objects with unsupervised optical flow and dynamic programming, instead of random cropping. Second, we train a naive Siamese tracker from scratch using single-frame pairs. Third, we continue training the tracker with a novel cycle memory learning scheme, which is conducted in longer temporal spans and also enables our tracker to update online. Extensive experiments show that the proposed USOT learned from unlabeled videos performs well over the state-of-the-art unsupervised trackers by large margins, and on par with recent supervised deep trackers. Code is available at https://github.com/VISION-SJTU/USOT.
By using the quantum Ising chain as a test bed and treating the spin polarization along the external transverse field as the generalized density, we examine the performance of different levels of density functional approximations parallel to those wi dely used for interacting electrons, such as local density approximation (LDA) and generalized gradient approximation (GGA). We show that by adding the lowest-order and nearest-neighbor density variation correction to the simple LDA, a semi-local energy functional in the spirit of GGA is almost exact over a wide range of inhomogeneous density distribution. In addition, the LDA and GGA error structures bear a high level of resemblance to the quantum phase diagram of the system. These results provide insights into the triumph and failure of these approximations in a general context.
Most few-shot learning models utilize only one modality of data. We would like to investigate qualitatively and quantitatively how much will the model improve if we add an extra modality (i.e. text description of the image), and how it affects the le arning procedure. To achieve this goal, we propose four types of fusion method to combine the image feature and text feature. To verify the effectiveness of improvement, we test the fusion methods with two classical few-shot learning models - ProtoNet and MAML, with image feature extractors such as ConvNet and ResNet12. The attention-based fusion method works best, which improves the classification accuracy by a large margin around 30% comparing to the baseline result.
190 - Ji-Wei He , Xin-Chao Ma , Yu Ye 2021
Let $A$ be a noetherian Koszul Artin-Schelter regular algebra, and let $fin A_2$ be a central regular element of $A$. The quotient algebra $A/(f)$ is usually called a (noncommutative) quadric hypersurface. In this paper, we use the Clifford deformati on to study the quadric hypersurfaces obtained from the tensor products. We introduce a notion of simple graded isolated singularity and proved that, if $B/(g)$ is a simple graded isolated singularity of 0-type, then there is an equivalence of triangulated categories $underline{text{mcm}},A/(f)congunderline{text{mcm}},(Aotimes B)/(f+g)$ of the stable categories of maximal Cohen-Macaulay modules. This result may be viewed as a generalization of Kn{o}rrers periodicity theorem. As an application, we study the double branch cover $(A/(f))^#=A[x]/(f+x^2)$ of a noncommutative conic $A/(f)$.
Recently, physics-driven deep learning methods have shown particular promise for the prediction of physical fields, especially to reduce the dependency on large amounts of pre-computed training data. In this work, we target the physics-driven learnin g of complex flow fields with high resolutions. We propose the use of emph{Convolutional neural networks} (CNN) based U-net architectures to efficiently represent and reconstruct the input and output fields, respectively. By introducing Navier-Stokes equations and boundary conditions into loss functions, the physics-driven CNN is designed to predict corresponding steady flow fields directly. In particular, this prevents many of the difficulties associated with approaches employing fully connected neural networks. Several numerical experiments are conducted to investigate the behavior of the CNN approach, and the results indicate that a first-order accuracy has been achieved. Specifically for the case of a flow around a cylinder, different flow regimes can be learned and the adhered twin-vortices are predicted correctly. The numerical results also show that the training for multiple cases is accelerated significantly, especially for the difficult cases at low Reynolds numbers, and when limited reference solutions are used as supplementary learning targets.
Federated learning can be a promising solution for enabling IoT cybersecurity (i.e., anomaly detection in the IoT environment) while preserving data privacy and mitigating the high communication/storage overhead (e.g., high-frequency data from time-s eries sensors) of centralized over-the-cloud approaches. In this paper, to further push forward this direction with a comprehensive study in both algorithm and system design, we build FedIoT platform that contains FedDetect algorithm for on-device anomaly data detection and a system design for realistic evaluation of federated learning on IoT devices. Furthermore, the proposed FedDetect learning framework improves the performance by utilizing a local adaptive optimizer (e.g., Adam) and a cross-round learning rate scheduler. In a network of realistic IoT devices (Raspberry PI), we evaluate FedIoT platform and FedDetect algorithm in both model and system performance. Our results demonstrate the efficacy of federated learning in detecting a wider range of attack types occurred at multiple devices. The system efficiency analysis indicates that both end-to-end training time and memory cost are affordable and promising for resource-constrained IoT devices. The source code is publicly available at https://github.com/FedML-AI/FedIoT
Quantum key distribution (QKD) provides information theoretically secures key exchange requiring authentication of the classic data processing channel via pre-sharing of symmetric private keys. In previous studies, the lattice-based post-quantum digi tal signature algorithm Aigis-Sig, combined with public-key infrastructure (PKI) was used to achieve high-efficiency quantum security authentication of QKD, and its advantages in simplifying the MAN network structure and new user entry were demonstrated. This experiment further integrates the PQC algorithm into the commercial QKD system, the Jinan field metropolitan QKD network comprised of 14 user nodes and 5 optical switching nodes. The feasibility, effectiveness and stability of the post-quantum cryptography (PQC) algorithm and advantages of replacing trusted relays with optical switching brought by PQC authentication large-scale metropolitan area QKD network were verified. QKD with PQC authentication has potential in quantum-secure communications, specifically in metropolitan QKD networks.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا