Yongtao Li, Yuejian Peng (2021)
A graph $G$ is $k$-edge-Hamiltonian if any collection of vertex-disjoint paths with at most $k$ edges altogether belongs to a Hamiltonian cycle in $G$. A graph $G$ is $k$-Hamiltonian if for all $S \subseteq V(G)$ with $|S| \le k$, the subgraph induced by $V(G) \setminus S$ has a Hamiltonian cycle. These two concepts are classical extensions of the usual Hamiltonian graphs. In this paper, we present spectral sufficient conditions for a graph to be $k$-edge-Hamiltonian and $k$-Hamiltonian in terms of the adjacency spectral radius as well as the signless Laplacian spectral radius. Our results extend recent works of Li and Ning [Linear Multilinear Algebra 64 (2016)], Nikiforov [Czechoslovak Math. J. 66 (2016)], and Li, Liu and Peng [Linear Multilinear Algebra 66 (2018)]. Moreover, we prove a stability result for graphs being $k$-Hamiltonian, which can be viewed as a complement of two recent results of Füredi, Kostochka and Luo [Discrete Math. 340 (2017)] and [Discrete Math. 342 (2019)].
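For concreteness, the two notions can be restated in display form (our own restatement of the definitions above; note that both reduce to ordinary Hamiltonicity when $k = 0$):

```latex
\begin{align*}
&\text{$k$-edge-Hamiltonian:} && \text{every linear forest } F \subseteq G \text{ with } e(F) \le k \text{ lies on a Hamiltonian cycle of } G;\\
&\text{$k$-Hamiltonian:}      && G - S \text{ is Hamiltonian for every } S \subseteq V(G) \text{ with } |S| \le k.
\end{align*}
```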
The control variates (CV) method is widely used in policy gradient estimation to reduce the variance of the gradient estimators in practice. A control variate is applied by subtracting a baseline function from the state-action value estimates, and the variance-reduced policy gradient presumably leads to higher learning efficiency. Recent research on control variates with deep neural net policies mainly focuses on scalar-valued baseline functions; the effect of vector-valued baselines is under-explored. This paper investigates variance reduction with coordinate-wise and layer-wise control variates constructed from vector-valued baselines for neural net policies. We present experimental evidence suggesting that lower variance can be obtained with such baselines than with the conventional scalar-valued baseline. We demonstrate how to equip the popular Proximal Policy Optimization (PPO) algorithm with these new control variates, and show that the resulting algorithm with proper regularization can achieve higher sample efficiency than scalar control variates on continuous control benchmarks.
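As a toy illustration of the construction (our own synthetic sketch, not the paper's learned baselines): the variance-minimizing constant baseline for gradient coordinate j is b_j = E[g_j^2 Q] / E[g_j^2], where g = grad log pi(a|s), and a coordinate-wise control variate simply uses one such b_j per coordinate instead of a single shared scalar.

```python
import numpy as np

# Synthetic sketch of scalar vs. coordinate-wise baselines for a
# policy-gradient estimator. Data and setup are made up for illustration.
rng = np.random.default_rng(0)
n = 100_000
g = rng.normal(size=(n, 3))      # per-sample score vectors grad log pi(a|s)
q = 2.0 + rng.normal(size=n)     # state-action value estimates Q(s, a)

b_scalar = q.mean()              # one baseline shared by all coordinates
# Per-coordinate variance-minimizing constant baseline b_j = E[g_j^2 Q]/E[g_j^2]
b_vec = (g**2 * q[:, None]).mean(axis=0) / (g**2).mean(axis=0)

grad_scalar = g * (q[:, None] - b_scalar)
grad_vec = g * (q[:, None] - b_vec)
print("scalar baseline variance:", grad_scalar.var(axis=0).sum())
print("vector baseline variance:", grad_vec.var(axis=0).sum())
```

In this synthetic setup the gap between the two is small; the benefit of vector-valued baselines depends on how much the per-coordinate optima differ across coordinates.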
Although of practical importance, there is no established modeling framework to accurately predict high-temperature cyclic oxidation kinetics of multi-component alloys due to the inherent complexity. We present a data analytics approach to predict the oxidation rate constant of NiCr-based alloys as a function of composition and temperature, using a highly consistent and well-curated experimental dataset. Two characteristic oxidation models, i.e., a simple parabolic law and a statistical cyclic-oxidation model, have been chosen to numerically represent the high-temperature oxidation kinetics of commercial and model NiCr-based alloys. We have successfully trained machine learning (ML) models using highly ranked key input features identified by correlation analysis to accurately predict experimental parabolic rate constants (kp). This study demonstrates the potential of ML approaches to predict the oxidation kinetics of alloys over wide composition and temperature ranges. The approach can also serve as a basis for introducing more physically meaningful ML input features to predict the comprehensive cyclic oxidation behavior of multi-component high-temperature alloys, with proper constraints based on the known underlying mechanisms.
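For reference, the simple parabolic law mentioned above relates mass gain to exposure time as (delta w)^2 = kp * t. A minimal sketch of extracting kp from isothermal mass-gain data (synthetic numbers, not the curated dataset used in the paper) could look like:

```python
import numpy as np

# Illustrative only: estimate the parabolic rate constant k_p from
# isothermal mass-gain data via (dw)^2 = k_p * t. The paper's ML models
# instead predict k_p from alloy composition and temperature.
t = np.array([1., 2., 5., 10., 20., 50.])            # exposure time, h
dw = np.array([0.10, 0.14, 0.23, 0.32, 0.44, 0.71])  # mass gain, mg/cm^2

# Least-squares slope through the origin of (dw)^2 vs t gives k_p.
kp = (t @ dw**2) / (t @ t)
print(f"k_p ~ {kp:.3g} mg^2 cm^-4 h^-1")
```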
Reinforcement learning from self-play has recently reported many successes. Self-play, where the agents compete with themselves, is often used to generate training data for iterative policy improvement. In previous work, heuristic rules are designed to choose an opponent for the current learner; typical rules include choosing the latest agent, the best agent, or a random historical agent. However, these rules may be inefficient in practice, and sometimes do not guarantee convergence even in the simplest matrix games. In this paper, we propose a new algorithmic framework for competitive self-play reinforcement learning in two-player zero-sum games. We recognize that the Nash equilibrium coincides with the saddle point of the stochastic payoff function, which motivates us to borrow ideas from the classical saddle point optimization literature. Our method trains several agents simultaneously and pairs them as opponents according to simple adversarial rules derived from a principled perturbation-based saddle optimization method. We prove theoretically that our algorithm converges to an approximate equilibrium with high probability in convex-concave games under standard assumptions. Beyond the theory, we further show the empirical superiority of our method over baseline methods relying on the aforementioned opponent-selection heuristics in matrix games, grid-world soccer, Gomoku, and simulated robot sumo, with neural net policy function approximators.
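A minimal sketch of the adversarial opponent-selection idea, in our own construction rather than the paper's exact algorithm: a small population of strategies for a zero-sum matrix game, where each learner trains against whichever population member currently hurts it most, and the averaged iterates of the projected gradient dynamics approach the saddle point.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

A = np.array([[0., 1., -1.], [-1., 0., 1.], [1., -1., 0.]])  # rock-paper-scissors
rng = np.random.default_rng(0)
k, T = 4, 5000
X = rng.dirichlet(np.ones(3), k)   # row players, maximize x^T A y
Y = rng.dirichlet(np.ones(3), k)   # column players, minimize
avgX = np.zeros_like(X)

for t in range(T):
    lr = 0.5 / np.sqrt(t + 1.0)    # diminishing step size
    for i in range(k):
        y_adv = Y[np.argmin(X[i] @ A @ Y.T)]  # worst-case opponent for row i
        x_adv = X[np.argmax(X @ A @ Y[i])]    # worst-case opponent for column i
        X[i] = project_simplex(X[i] + lr * (A @ y_adv))
        Y[i] = project_simplex(Y[i] - lr * (A.T @ x_adv))
    avgX += X

print((avgX / T).round(2))   # each row should approach (1/3, 1/3, 1/3)
```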
In this paper, we propose an effective knowledge transfer framework to boost weakly supervised object detection accuracy with the help of an external fully-annotated source dataset, whose categories may not overlap with the target domain. This setting is of great practical value due to the existence of many off-the-shelf detection datasets. To utilize the source dataset more effectively, we propose to iteratively transfer knowledge from the source domain through a one-class universal detector while learning the target-domain detector. The box-level pseudo ground truths mined by the target-domain detector in each iteration effectively improve the one-class universal detector, so the knowledge in the source dataset is more thoroughly exploited and leveraged. Extensive experiments are conducted with Pascal VOC 2007 as the target weakly-annotated dataset and COCO/ImageNet as the source fully-annotated dataset. With the proposed solution, we achieve an mAP of $59.7\%$ on the VOC test set, and an mAP of $60.2\%$ after retraining a fully supervised Faster RCNN with the mined pseudo ground truths. This is significantly better than any previously known result in the related literature and sets a new state of the art for weakly supervised object detection under the knowledge transfer setting. Code: \url{https://github.com/mikuhatsune/wsod_transfer}.
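Schematically, the iterative transfer loop reads as follows (our paraphrase with placeholder function bodies; the released code at the URL above is the authoritative implementation):

```python
# Placeholder sketch of the iterative knowledge-transfer loop; every
# function body is a stub, not the authors' code.

def train_one_class_detector(source_data, pseudo_boxes):
    """Class-agnostic 'objectness' detector on source boxes + mined boxes."""
    return lambda image: []   # placeholder model: returns proposals

def train_weak_detector(target_data, proposals_fn):
    """Weakly supervised target detector using image-level labels only."""
    return lambda image: []   # placeholder model: returns scored boxes

def mine_pseudo_ground_truth(detector, target_data, score_thresh=0.8):
    """Keep confident target-domain detections as box-level pseudo GT."""
    return [b for img in target_data for b in detector(img)
            if getattr(b, "score", 0.0) >= score_thresh]

source_data, target_data = [], []   # COCO/ImageNet, VOC (placeholders)
pseudo_gt = []
for iteration in range(3):          # a few transfer rounds
    universal = train_one_class_detector(source_data, pseudo_gt)
    weak_det = train_weak_detector(target_data, universal)
    pseudo_gt = mine_pseudo_ground_truth(weak_det, target_data)
# Final step: retrain a fully supervised detector (e.g., Faster RCNN) on pseudo_gt.
```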
High-temperature alloy design requires concurrent consideration of multiple mechanisms at different length scales. We propose a workflow that couples highly relevant physics into machine learning (ML) to predict properties of complex high-temperature alloys, with the yield strength of 9-12 wt.% Cr steels as an example. We have incorporated synthetic alloy features that capture microstructure and phase transformations into the dataset. The high-impact features affecting the yield strength of 9Cr steels, identified by correlation analysis, agree well with generally accepted strengthening mechanisms. As part of the verification process, the consistency of sub-datasets has been extensively evaluated with respect to temperature and then refined for the boundary conditions of the trained ML models. The yield strength of 9Cr steels predicted by the ML models is in excellent agreement with experiments. The current approach introduces physically meaningful constraints for interrogating trained ML models to predict the properties of hypothetical alloys in data-driven materials design.
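A toy version of the correlation-then-train workflow might look like the following (entirely synthetic data and feature names; the paper's dataset and feature set are far richer):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the workflow: rank candidate features by
# correlation with yield strength, then train an ML model on the top ones.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Cr_wt_pct": rng.uniform(9, 12, n),
    "temperature_C": rng.uniform(20, 650, n),
    "M23C6_fraction": rng.uniform(0, 0.03, n),   # made-up phase feature
})
df["yield_MPa"] = (600 - 0.6 * df.temperature_C
                   + 3000 * df.M23C6_fraction + rng.normal(0, 15, n))

corr = df.corr()["yield_MPa"].drop("yield_MPa").abs().sort_values(ascending=False)
top = corr.index[:2]                 # keep the highest-impact features
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(df[top], df["yield_MPa"])
print(corr, "\nR^2 =", model.score(df[top], df["yield_MPa"]))
```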
We present an extensive first-principles database of solute-vacancy, homoatomic and heteroatomic solute-solute, and solute-solute-vacancy binding energies of relevant alloying elements in aluminum. We particularly focus on systems with the major alloying elements in aluminum, i.e., Cu, Mg, and Si. We consider physical factors such as solute size and the formation energies of intermetallic compounds to correlate with the binding energies. Systematic studies of the homoatomic solute-solute-vacancy and heteroatomic (Cu, Mg, or Si)-solute-vacancy complexes reveal the overarching effect of the vacancy in stabilizing solute-solute pairs. The computed binding energies of the solute-solute-vacancy triplets successfully explain several experimental observations that remained unexplained by the pair binding energies reported in the literature. The binding energy database presented here elucidates the interaction between solute clusters and vacancies in aluminum, and is expected to provide insight into the design of advanced Al alloys with tailored properties.
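For orientation, one common supercell convention for such binding energies (sign conventions vary, and the paper may adopt a different one; here positive values mean attraction) defines, e.g., the solute-vacancy binding energy from total energies of Al supercells containing the indicated defects:

```latex
% One common convention; positive E_b indicates binding (attraction).
\[
E_b(\mathrm{X\text{-}Va}) \;=\; E(\mathrm{X}) + E(\mathrm{Va})
  \;-\; E(\mathrm{X\text{-}Va}) \;-\; E(\mathrm{bulk}),
\]
```

where each term is the total energy of an otherwise identical Al supercell containing, respectively, the solute X alone, a vacancy alone, the X-vacancy pair, or no defect.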
Change detection is a basic task of remote sensing image processing. The research objective is to identify the change information of interest and filter out irrelevant change information as interference factors. Recently, the rise of deep learning has provided new tools for change detection, which have yielded impressive results. However, the available methods focus mainly on the difference information between multitemporal remote sensing images and lack robustness to pseudo-change information. To overcome this lack of resistance to pseudo-changes, in this paper we propose a new method, namely dual attentive fully convolutional Siamese networks (DASNet), for change detection in high-resolution images. Through the dual-attention mechanism, long-range dependencies are captured to obtain more discriminative feature representations, enhancing the recognition performance of the model. Moreover, sample imbalance is a serious problem in change detection, i.e., unchanged samples far outnumber changed samples, which is one of the main causes of pseudo-changes. We put forward a weighted double-margin contrastive loss to address this problem by down-weighting the attention paid to unchanged feature pairs and increasing the attention paid to changed feature pairs. The experimental results of our method on the change detection dataset (CDD) and the building change detection dataset (BCDD) demonstrate that, compared with other baseline methods, the proposed method realizes maximum improvements of 2.1% and 3.6%, respectively, in the F1 score. Our PyTorch implementation is available at https://github.com/lehaifeng/DASNet.
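A hedged sketch of what a weighted double-margin contrastive loss could look like in PyTorch (our reading of the abstract; the implementation at the URL above is authoritative, and the margin and weight values here are made up):

```python
import torch

def wdmc_loss(dist, label, m1=0.3, m2=2.2, w_unchanged=0.05, w_changed=1.0):
    """dist:  pixelwise feature-distance map, shape (B, H, W)
       label: 1.0 where changed, 0.0 where unchanged.
       Unchanged pairs are pulled together beyond margin m1, changed pairs
       pushed apart within margin m2; class weights counter sample imbalance."""
    unchanged = (1 - label) * torch.clamp(dist - m1, min=0) ** 2
    changed = label * torch.clamp(m2 - dist, min=0) ** 2
    return (w_unchanged * unchanged + w_changed * changed).mean()

dist = torch.rand(2, 64, 64) * 3
label = (torch.rand(2, 64, 64) > 0.9).float()   # changed pixels are rare
print(wdmc_loss(dist, label).item())
```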
In many vision-based reinforcement learning (RL) problems, the agent controls a movable object in its visual field, e.g., the player's avatar in video games or the robotic arm in visual grasping and manipulation. Leveraging action-conditioned video prediction, we propose an end-to-end learning framework to disentangle the controllable object from the observation signal. The disentangled representation is shown to be useful for RL when provided as additional observation channels to the agent. Experiments on a set of Atari games with the popular Double DQN algorithm demonstrate improved sample efficiency and game performance (from 222.8% to 261.4%, measured in normalized game scores, with the prediction bonus reward).
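As a conceptual sketch only (our simplification, not the paper's architecture): an action-conditioned predictor with a mask head intended to isolate the controllable object, whose mask is then appended to the observation as an extra channel for the RL agent:

```python
import torch
import torch.nn as nn

class DisentangledPredictor(nn.Module):
    """Toy action-conditioned predictor; all layer choices are our own."""
    def __init__(self, channels=4, n_actions=6):
        super().__init__()
        self.enc = nn.Conv2d(channels, 32, 3, padding=1)
        self.act_emb = nn.Embedding(n_actions, 32)
        self.mask_head = nn.Conv2d(32, 1, 1)        # controllable-object mask
        self.frame_head = nn.Conv2d(32, channels, 1)  # next-frame prediction

    def forward(self, obs, action):
        h = torch.relu(self.enc(obs))
        h = h + self.act_emb(action)[:, :, None, None]  # condition on action
        mask = torch.sigmoid(self.mask_head(h))
        return self.frame_head(h), mask

obs = torch.randn(8, 4, 84, 84)        # stacked Atari-style frames
act = torch.randint(0, 6, (8,))
next_frame, mask = DisentangledPredictor()(obs, act)
aug_obs = torch.cat([obs, mask], dim=1)  # extra observation channel for RL
print(aug_obs.shape)                     # torch.Size([8, 5, 84, 84])
```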
Jian Peng, Bo Tang, Hao Jiang (2019)
Artificial neural networks face the well-known problem of catastrophic forgetting. What's worse, the degradation of previously learned skills becomes more severe as the task sequence grows, a phenomenon known as long-term catastrophic forgetting. This is due to two facts: first, as the model learns more tasks, the intersection of the low-error parameter subspaces satisfying these tasks becomes smaller or may not even exist; second, when the model learns a new task, the cumulative error keeps increasing as the model tries to protect the parameter configurations of previous tasks from interference. Inspired by the memory consolidation mechanism in mammalian brains with synaptic plasticity, we propose a confrontation mechanism in which Adversarial Neural Pruning and synaptic Consolidation (ANPyC) is used to overcome the long-term catastrophic forgetting issue. The neural pruning acts as long-term depression to prune task-irrelevant parameters, while the novel synaptic consolidation acts as long-term potentiation to strengthen task-relevant parameters. During training, this confrontation achieves a balance in which only crucial parameters remain and non-significant parameters are freed to learn subsequent tasks. ANPyC avoids forgetting important information and makes the model efficient at learning a large number of tasks. Specifically, the neural pruning iteratively relaxes the current task's parameter conditions to expand the common parameter subspace of the tasks; the synaptic consolidation strategy, which consists of a structure-aware parameter-importance measurement and an element-wise parameter updating strategy, decreases the cumulative error when learning new tasks. The full source code is available at https://github.com/GeoX-Lab/ANPyC.
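A rough toy of the two opposing forces (our own construction, not the released ANPyC code at the URL above): an importance-weighted consolidation penalty that holds task-relevant weights in place, and a pruning step that zeroes the least-important weights to free capacity for later tasks:

```python
import torch

def consolidation_penalty(params, old_params, importance, lam=100.0):
    """Long-term potentiation: keep task-relevant weights near old values."""
    return lam * sum((imp * (p - p0) ** 2).sum()
                     for p, p0, imp in zip(params, old_params, importance))

def prune_irrelevant(params, importance, keep_ratio=0.5):
    """Long-term depression: zero the least-important weights."""
    with torch.no_grad():
        for p, imp in zip(params, importance):
            k = int(imp.numel() * (1 - keep_ratio))
            if k > 0:
                thresh = imp.flatten().kthvalue(k).values
                p[imp <= thresh] = 0.0

w = [torch.randn(10, 10, requires_grad=True)]
w_old = [p.detach().clone() for p in w]
imp = [p.detach().abs() for p in w]           # stand-in importance scores
with torch.no_grad():
    w[0] += 0.1 * torch.randn_like(w[0])      # pretend new-task update
print("penalty:", consolidation_penalty(w, w_old, imp).item())
prune_irrelevant(w, imp)
print("pruned fraction:", (w[0] == 0).float().mean().item())
```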