Non-stationarity is a thorny issue in multi-agent reinforcement learning, caused by the policy changes of agents during the learning procedure. Existing approaches to this problem, such as centralized critic and decentralized actor (CCDA), population-based self-play, and modeling of others, each have limitations in effectiveness and scalability. In this paper, we introduce a $\delta$-stationarity measurement to explicitly model the stationarity of a policy sequence, which is theoretically proved to be proportional to the joint policy divergence. However, simple policy factorizations such as the mean-field approximation lead to larger policy divergence, which can be regarded as a trust region decomposition dilemma. We model the joint policy as a general Markov random field and propose a trust region decomposition network based on message passing to estimate the joint policy divergence more accurately. The Multi-Agent Mirror descent policy algorithm with Trust region decomposition, called MAMT, is designed to satisfy $\delta$-stationarity. MAMT adjusts the trust regions of the local policies adaptively in an end-to-end manner, thereby approximately constraining the divergence of the joint policy and alleviating the non-stationarity problem. Our method brings noticeable and stable performance improvement over baselines in coordination tasks of different complexity.
When solving a complex task, humans spontaneously form teams that each complete a different part of the whole task, and cooperation between teammates improves efficiency. In current cooperative MARL methods, however, the cooperation team is constructed through either heuristics or end-to-end black-box optimization. To improve the efficiency of cooperation and exploration, we propose a structured diversification emergence MARL framework named Rochico, based on reinforced organization control and hierarchical consensus learning. Rochico first learns an adaptive grouping policy through the organization control module, which is built on independent multi-agent reinforcement learning. After team formation, a hierarchical consensus module based on hierarchical intentions with a consensus constraint is introduced. Using the hierarchical consensus module together with a decision module enhanced by a self-supervised intrinsic reward, Rochico outputs the final diversified multi-agent cooperative policy. All three modules are organically combined to promote structured diversification emergence. Comparative experiments on four large-scale cooperation tasks show that Rochico is significantly better than the current SOTA algorithms in terms of exploration efficiency and cooperation strength.
We report the measurement of spin-current-induced charge accumulation, the inverse Edelstein effect (IEE), on the surface of a single crystal of the candidate topological Kondo insulator SmB6. The robust surface conduction channel of SmB6 has been shown to exhibit a large degree of spin-momentum locking, and a spin-polarized current injected through an external ferromagnetic contact induces spin-dependent charge accumulation on the surface of SmB6. The dependences of the IEE signal on the bias current, the external magnetic field direction, and temperature are consistent with an anticlockwise spin texture in momentum space for the surface band of SmB6, and the direction and magnitude of the effect compared with the normal Edelstein signal are clearly explained by the Onsager reciprocal relation. Furthermore, we estimate the spin-to-charge conversion efficiency, the IEE length, as 4.46 nm, which is an order of magnitude larger than the efficiency found in other typical Rashba interfaces, implying that the Rashba contribution to the IEE signal could be small. Building upon existing reports on the nature of surface charge and spin conduction in this material, our results provide additional evidence that the surface of SmB6 supports a spin-polarized conduction channel.
We develop new perturbation techniques for the convergence analysis of various first-order algorithms for a class of nonsmooth optimization problems. We use the iteration scheme of an algorithm to construct a perturbed stationary point set-valued map, and define the perturbation parameter as the difference of two consecutive iterates. We then show that the calmness condition of the induced set-valued map, together with a local version of the proper separation of stationary value condition, is sufficient to ensure linear convergence of the algorithm. We prove that this calmness condition is equivalent to the one for the canonically perturbed stationary point set-valued map, and this equivalence allows us to derive sufficient conditions for calmness by using recent developments in variational analysis. These sufficient conditions differ from existing results (especially the error-bound-based ones) in that they can be easily verified for many concrete application models. Our analysis focuses on the fundamental proximal gradient (PG) method, and it enables us to show that any accumulation point of the sequence generated by the PG method must be a stationary point in terms of the proximal subdifferential, instead of the limiting subdifferential. This result reveals the surprising fact that the quality of the solutions found by the PG method is in general superior. Our analysis also leads to some improvement of the linear convergence results for the PG method in the convex case. The new perturbation technique can be conveniently used to derive linear convergence rates for a number of other first-order methods, including the well-known alternating direction method of multipliers and the primal-dual hybrid gradient method, under mild assumptions.
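For readers unfamiliar with the PG iteration discussed in the abstract above, the following is a minimal sketch on a toy problem that is our own assumption, not taken from the paper: minimize 0.5*||x - y||^2 + lam*||x||_1, whose proximal step is soft-thresholding. The function names, the step size and the stopping tolerance are hypothetical. The quantity `gap`, the difference of two consecutive iterates, plays the role of the perturbation parameter mentioned in the abstract and is used here only as a stopping test.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |.| applied to a scalar v."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def proximal_gradient(y, lam, step=0.5, max_iters=200, tol=1e-12):
    """Proximal gradient iteration for 0.5*||x - y||^2 + lam*||x||_1.

    Illustrative toy problem (an assumption); y must be non-empty.
    """
    x = [0.0] * len(y)
    for _ in range(max_iters):
        # Gradient of the smooth part 0.5*||x - y||^2.
        grad = [xi - yi for xi, yi in zip(x, y)]
        # Gradient step followed by the proximal (soft-thresholding) step.
        x_new = [soft_threshold(xi - step * gi, step * lam)
                 for xi, gi in zip(x, grad)]
        # Difference of two consecutive iterates.
        gap = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if gap < tol:
            break
    return x
```

On this toy problem the minimizer is the soft-thresholding of `y` at level `lam`, so the result of `proximal_gradient([3.0, 0.5, -2.0], 1.0)` can be compared against `soft_threshold` applied to each entry directly.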
The Kondo insulator compound SmB6 has emerged as a strong candidate for the realization of a topologically nontrivial state in a strongly correlated system, a topological Kondo insulator, which can be a novel platform for investigating the interplay between nontrivial topology and emergent correlation-driven phenomena in solid-state systems. Electronic transport measurements on this material, however, have so far shown only robust surface-dominated charge conduction at low temperatures, without evidence connecting it to the topological nature, for example spin polarization due to spin-momentum locking. Here, we find evidence for surface state spin polarization by electrical detection of a current-induced spin chemical potential difference on the surface of a SmB6 single crystal. We clearly observe a surface-dominated spin voltage, which is proportional to the projection of the spin polarization onto the contact magnetization, is determined by the direction and magnitude of the charge current, and is strongly temperature dependent due to the crossover from surface to bulk conduction. Based on a quantum transport model, we estimate the lower bound of the surface state net spin polarization as 15 percent, providing direct evidence that SmB6 supports metallic spin-helical surface states.
Wasserstein distance plays an increasingly important role in machine learning, stochastic programming and image processing. Major efforts have been under way to address its high computational complexity, some leading to approximate or regularized variations such as the Sinkhorn distance. However, as we will demonstrate, regularized variations with a large regularization parameter degrade performance in several important machine learning applications, while a small regularization parameter fails due to numerical stability issues with existing algorithms. We address this challenge by developing an Inexact Proximal point method for exact Optimal Transport problems (IPOT), with the proximal operator approximately evaluated at each iteration using projections onto the probability simplex. The algorithm (a) converges to the exact Wasserstein distance with a theoretical guarantee and robust regularization parameter selection, (b) alleviates the numerical stability issue, (c) has computational complexity similar to Sinkhorn, and (d) avoids the shrinking problem when applied to generative models. Furthermore, a new algorithm based on IPOT is proposed to obtain sharper Wasserstein barycenters.
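The abstract above does not spell out the iteration, so the following is only a sketch of how an inexact proximal point scheme for discrete optimal transport is commonly written: each outer step reweights a Gibbs kernel by the current plan and then runs a few Sinkhorn-style scaling updates. The function name, the parameter names (`beta`, `outer_iters`, `inner_iters`) and the defaults are assumptions for illustration, not the authors' reference implementation.

```python
import math


def ipot_sketch(mu, nu, cost, beta=1.0, outer_iters=200, inner_iters=1):
    """Approximate the optimal transport plan between histograms mu and nu.

    cost[i][j] is the cost of moving mass from bin i of mu to bin j of nu.
    Returns (plan, transport_cost). All names here are hypothetical.
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel G_ij = exp(-C_ij / beta).
    G = [[math.exp(-cost[i][j] / beta) for j in range(m)] for i in range(n)]
    T = [[1.0] * m for _ in range(n)]  # initial (unnormalised) plan
    b = [1.0 / m] * m                  # column scaling, kept across outer steps
    for _ in range(outer_iters):
        # Proximal step: reweight the kernel by the current plan.
        Q = [[G[i][j] * T[i][j] for j in range(m)] for i in range(n)]
        # Inexact inner solve: a few Sinkhorn-style scaling updates.
        for _ in range(inner_iters):
            a = [mu[i] / sum(Q[i][j] * b[j] for j in range(m))
                 for i in range(n)]
            b = [nu[j] / sum(Q[i][j] * a[i] for i in range(n))
                 for j in range(m)]
        T = [[a[i] * Q[i][j] * b[j] for j in range(m)] for i in range(n)]
    transport_cost = sum(cost[i][j] * T[i][j]
                         for i in range(n) for j in range(m))
    return T, transport_cost
```

Because the plan is multiplied into the kernel at every outer step, entries that the optimal plan does not use shrink geometrically, which is how this kind of scheme approaches the unregularized solution without requiring a very small `beta`.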
This paper presents an unsupervised learning approach for simultaneous sample and feature selection, in contrast to existing works, which mainly tackle these two problems separately. In fact, the two tasks are often interleaved with each other: noisy and high-dimensional features adversely affect sample selection, while informative or representative samples benefit feature selection. Specifically, we propose a framework to jointly conduct active learning and feature selection based on the CUR matrix decomposition. From the data reconstruction perspective, the selected samples and the selected features can each best approximate the original dataset, such that the selected samples characterized by the features are highly representative. In particular, our method runs in one shot, without the procedure of iterative sample selection for progressive labeling. Thus, our model is especially suitable when there are few labeled samples, or even in the absence of supervision, which is a particular challenge for existing methods. As the joint learning problem is NP-hard, the proposed formulation involves a convex but non-smooth optimization problem. We solve it efficiently by an iterative algorithm and prove its global convergence. Experimental results on publicly available datasets corroborate the efficacy of our method compared with the state of the art.
Topological insulators, with metallic boundary states protected against time-reversal-invariant perturbations, are a promising avenue for realizing exotic quantum states of matter, including various excitations of collective modes predicted in particle physics, such as Majorana fermions and axions. According to theoretical predictions, a topological insulating state can emerge not only from a weakly interacting system with strong spin-orbit coupling, but also in insulators driven by strong electron correlations. The Kondo insulator compound SmB6 is an ideal candidate for realizing this exotic state of matter, with hybridization between itinerant conduction electrons and localized $f$-electrons driving an insulating gap and metallic surface states at low temperatures. Here we exploit the existence of surface ferromagnetism in SmB6 to investigate the topological nature of metallic surface states by studying magnetotransport properties at very low temperatures. We find evidence of one-dimensional surface transport with a quantized conductance value of $e^2/h$ originating from the chiral edge channels of ferromagnetic domain walls, providing strong evidence that topologically non-trivial surface states exist in SmB6.