
Differentiable architecture search (DARTS) marks a milestone in Neural Architecture Search (NAS), offering simplicity and low search cost. However, DARTS still suffers from frequent performance collapse, which happens when some operations, such as skip connections, zero operations, and pooling, dominate the architecture. In this paper, we are the first to point out that this phenomenon is attributable to bi-level optimization. We propose Single-DARTS, which uses only single-level optimization, updating network weights and architecture parameters simultaneously with the same data batch. Although single-level optimization has been attempted before, no literature provides a systematic explanation of this essential point. By replacing the bi-level optimization, Single-DARTS markedly alleviates performance collapse and enhances the stability of architecture search. Experimental results show that Single-DARTS achieves state-of-the-art performance on mainstream search spaces. For instance, on NAS-Benchmark-201, the searched architectures are nearly optimal. We also validate that the single-level optimization framework is much more stable than the bi-level one. We hope that this simple yet effective method will give some insights into differentiable architecture search. The code is available at https://github.com/PencilAndBike/Single-DARTS.git.
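To make the idea of single-level optimization concrete, here is a minimal sketch on a toy problem; it is not the authors' implementation. It assumes a single mixed edge with two candidate operations (a learnable scaling `w * x` and a skip connection `x`) blended by a softmax over architecture logits `alpha`. The function names, the toy data, and the learning rates are all hypothetical. The point it illustrates is that the gradients for `w` and `alpha` are computed from the same batch and applied in the same step, rather than alternating between a training split and a validation split.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def single_level_step(w, alpha, batch, lr_w=0.05, lr_alpha=0.05):
    """One simultaneous update of weight w and architecture logits alpha.

    Both gradients come from the SAME batch (the single-level idea).
    Toy assumption: op 0 is a learnable scaling, op 1 is a skip connection.
    """
    p = softmax(alpha)
    grad_w = 0.0
    grad_alpha = [0.0, 0.0]
    loss = 0.0
    for x, y in batch:
        outs = [w * x, x]
        pred = p[0] * outs[0] + p[1] * outs[1]
        err = pred - y
        loss += err * err
        # d(pred)/dw = p[0] * x
        grad_w += 2.0 * err * p[0] * x
        # d(pred)/d(alpha_k) = p[k] * (outs[k] - pred) for a softmax mixture
        for k in range(2):
            grad_alpha[k] += 2.0 * err * p[k] * (outs[k] - pred)
    n = len(batch)
    new_w = w - lr_w * grad_w / n
    new_alpha = [a - lr_alpha * g / n for a, g in zip(alpha, grad_alpha)]
    return new_w, new_alpha, loss / n


def run_single_level(steps=300):
    """Fit y = 3x; only the scaling op can represent this target."""
    batch = [(x, 3.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]
    w, alpha = 1.0, [0.0, 0.0]
    losses = []
    for _ in range(steps):
        w, alpha, loss = single_level_step(w, alpha, batch)
        losses.append(loss)
    return w, alpha, losses


if __name__ == "__main__":
    w, alpha, losses = run_single_level()
    print("first loss:", losses[0], "last loss:", losses[-1])
    print("op probabilities:", softmax(alpha))
```

A bi-level variant of this toy would instead update `w` on one data split and `alpha` on another in alternating steps; the sketch deliberately omits that split to show the single-level form.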
This paper revisits human-object interaction (HOI) recognition at image level without using supervision of object location and human pose. We name it detection-free HOI recognition, in contrast to the existing detection-supervised approaches which rely on object and keypoint detections to achieve state of the art. With our method, not only is the detection supervision avoidable, but superior performance can be achieved by properly using image-text pre-training (such as CLIP) and the proposed Log-Sum-Exp Sign (LSE-Sign) loss function. Specifically, using text embeddings of class labels to initialize the linear classifier is essential for leveraging the CLIP pre-trained image encoder. In addition, LSE-Sign loss facilitates learning from multiple labels on an imbalanced dataset by normalizing gradients over all classes in a softmax format. Surprisingly, our detection-free solution achieves 60.5 mAP on the HICO dataset, outperforming the detection-supervised state of the art by 13.4 mAP.
It has long remained elusive whether CuCo$_{2}$S$_{4}$ thiospinel shows bulk superconductivity. Here we clarify the issue by studying samples of sulfur-deficient CuCo$_{2}$S$_{3.7}$ and sulfurized CuCo$_{2}$S$_{4}$. The sample CuCo$_{2}$S$_{3.7}$ has a smaller lattice constant of $a=9.454$ \AA, and it is not superconducting down to 1.8 K. After a full sulfurization, the $a$ axis of the thiospinel phase increases to 9.475 \AA, and the thiospinel becomes nearly stoichiometric CuCo$_{2}$S$_{4}$, although a secondary phase of slightly Cu-doped CoS$_2$ forms. Bulk superconductivity at 4.2 K and Pauli paramagnetism have been demonstrated for the sulfurized CuCo$_{2}$S$_{4}$ by measurements of electrical resistivity, magnetic susceptibility, and specific heat.
We study $\bar{Q}Q\bar{q}q$ and $\bar{Q}qQ\bar{q}$ molecular states as mixed states in QCD sum rules. By calculating the two-point correlation functions of pure states of their corresponding currents, we review the mass and coupling constant predictions of $J^{PC}=1^{++}$, $1^{--}$, $1^{-+}$ molecular states. By calculating the two-point mixed correlation functions of $\bar{Q}Q\bar{q}q$ and $\bar{Q}qQ\bar{q}$ molecular currents, we estimate the mass and coupling constants of the corresponding ``physical'' state that couples to both $\bar{Q}Q\bar{q}q$ and $\bar{Q}qQ\bar{q}$ currents. Our results suggest that $1^{++}$ states are more likely mixtures of $\bar{Q}Q\bar{q}q$ and $\bar{Q}qQ\bar{q}$ components, while for $1^{--}$ and $1^{-+}$ states, there is less mixing between $\bar{Q}Q\bar{q}q$ and $\bar{Q}qQ\bar{q}$. Our results suggest the $Y$ series of states have more complicated components.
Statistical uncertainty has many components, such as measurement errors, temporal variation, or sampling. Not all of these sources are relevant when considering a specific application, since practitioners might view some attributes of observations as fixed. We study the statistical inference problem arising when data is drawn conditionally on some attributes. These attributes are assumed to be sampled from a super-population but viewed as fixed when conducting uncertainty quantification. The estimand is thus defined as the parameter of a conditional distribution. We propose methods to construct conditionally valid p-values and confidence intervals for these conditional estimands based on asymptotically linear estimators. In this setting, a given estimator is conditionally unbiased for potentially many conditional estimands, which can be seen as parameters of different populations. Testing different populations raises questions of multiple testing. We discuss simple procedures that control novel conditional error rates. In addition, we introduce a bias correction technique that enables transfer of estimators across conditional distributions arising from the same super-population. This can be used to infer parameters and estimators on future datasets based on some new data. The validity and applicability of the proposed methods are demonstrated on simulated and real-world data.
Yingying Jin, Li-Hong Xie (2021)
The concept of gyrogroups is a generalization of groups which do not explicitly have associativity. Recently, Atiponrat extended the idea of topological (paratopological) groups to topological (paratopological) gyrogroups. In this paper, we prove that every regular (Hausdorff) locally gyroscopic invariant paratopological gyrogroup $G$ is completely regular (function Hausdorff). These results improve theorems of Banakh and Ravsky for paratopological groups. Also, we extend the Pontrjagin conditions of (para)topological groups to (para)topological gyrogroups.
The concept of gyrogroups is a generalization of groups which do not explicitly have associativity. Recently, Wattanapan et al. considered the construction of Hartman-Mycielski in strongly topological gyrogroups. In this paper, we extend their results to topological gyrogroups. Among other results, we prove that every Hausdorff topological gyrogroup $G$ can be embedded as a closed subgyrogroup of a Hausdorff path-connected and locally path-connected topological gyrogroup $G^\bullet$.
The mixed-logit model is a flexible tool in transportation choice analysis, which provides valuable insights into inter- and intra-individual behavioural heterogeneity. However, applications of mixed-logit models are limited by the high computational and data requirements for model estimation. When estimating on small samples, the Bayesian estimation approach becomes vulnerable to over- and under-fitting. This is problematic for investigating the behaviour of specific population sub-groups or market segments with low data availability. Similar challenges arise when transferring an existing model to a new location or time period, e.g., when estimating post-pandemic travel behaviour. We propose an Early Stopping Bayesian Data Assimilation (ESBDA) simulator for estimation of mixed-logit models, which combines a Bayesian statistical approach with Machine Learning methodologies. The aim is to improve the transferability of mixed-logit models and to enable the estimation of robust choice models with low data availability. This approach can provide new insights into choice behaviour where the traditional estimation of mixed-logit models was not possible due to low data availability, and open up new opportunities for investment and planning decision support. The ESBDA estimator is benchmarked against the Direct Application approach, a basic Bayesian simulator with random starting parameter values, and a Bayesian Data Assimilation (BDA) simulator without early stopping. The ESBDA approach is found to effectively overcome under- and over-fitting and non-convergence issues in simulation. Its resulting models clearly outperform those of the reference simulators in predictive accuracy. Furthermore, models estimated with ESBDA tend to be more robust, with significant parameters whose signs and values are consistent with behavioural theory, even when estimated on small samples.
We study offline reinforcement learning (RL), which aims to learn an optimal policy based on a dataset collected a priori. Due to the lack of further interactions with the environment, offline RL suffers from the insufficient coverage of the dataset, which eludes most existing theoretical analysis. In this paper, we propose a pessimistic variant of the value iteration algorithm (PEVI), which incorporates an uncertainty quantifier as the penalty function. Such a penalty function simply flips the sign of the bonus function for promoting exploration in online RL, which makes it easily implementable and compatible with general function approximators. Without assuming the sufficient coverage of the dataset, we establish a data-dependent upper bound on the suboptimality of PEVI for general Markov decision processes (MDPs). When specialized to linear MDPs, it matches the information-theoretic lower bound up to multiplicative factors of the dimension and horizon. In other words, pessimism is not only provably efficient but also minimax optimal. In particular, given the dataset, the learned policy serves as the best effort among all policies, as no other policies can do better. Our theoretical analysis identifies the critical role of pessimism in eliminating a notion of spurious correlation, which emerges from the irrelevant trajectories that are less covered by the dataset and not informative for the optimal policy.
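The pessimism principle described above can be sketched for a small tabular, finite-horizon MDP; this is not the paper's algorithm as specified. The count-based penalty `beta / sqrt(n)`, the choice to assign value 0 to unseen state-action pairs, the assumption that rewards lie in [0, 1], and all function and variable names are illustrative assumptions. The sketch only shows the core mechanism: the uncertainty term is subtracted from the estimated Bellman backup instead of being added as an exploration bonus.

```python
import math
from collections import defaultdict


def pevi_tabular(dataset, n_states, n_actions, horizon, beta=1.0):
    """Pessimistic value iteration on empirical estimates (toy tabular sketch).

    dataset: list of (h, s, a, r, s_next) tuples with h in range(horizon).
    Returns (policy, value), where policy[h][s] is the greedy action under
    the penalized Q-values and value[h][s] is the pessimistic value estimate.
    """
    counts = defaultdict(int)
    reward_sum = defaultdict(float)
    next_counts = defaultdict(lambda: defaultdict(int))
    for h, s, a, r, s_next in dataset:
        counts[(h, s, a)] += 1
        reward_sum[(h, s, a)] += r
        next_counts[(h, s, a)][s_next] += 1

    # value[horizon] stays at zero: nothing is collected after the last step.
    value = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]

    for h in reversed(range(horizon)):
        for s in range(n_states):
            best_q, best_a = None, 0
            for a in range(n_actions):
                n = counts.get((h, s, a), 0)
                if n == 0:
                    # Assumption: an unseen pair gets the most pessimistic value.
                    q = 0.0
                else:
                    r_hat = reward_sum[(h, s, a)] / n
                    expected_next = sum(
                        (c / n) * value[h + 1][s2]
                        for s2, c in next_counts[(h, s, a)].items()
                    )
                    # Penalty: the exploration bonus with its sign flipped.
                    penalty = beta / math.sqrt(n)
                    q = r_hat + expected_next - penalty
                    # Clip to the feasible range for rewards in [0, 1].
                    q = min(max(q, 0.0), float(horizon - h))
                if best_q is None or q > best_q:
                    best_q, best_a = q, a
            value[h][s] = best_q
            policy[h][s] = best_a
    return policy, value
```

As a usage illustration, consider a one-step problem with one state where action 0 appears 100 times with reward 0.6 and action 1 appears once with reward 1.0. A purely greedy estimate would prefer action 1, whereas the penalized estimate prefers the well-covered action 0, which is the kind of spurious preference for poorly covered actions that pessimism is meant to suppress.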
Deep learning applications in shaping ad hoc planning proposals are limited by the difficulty in integrating professional knowledge about cities with artificial intelligence. We propose a novel, complementary use of deep neural networks and planning guidance to automate street network generation that can be context-aware, example-based and user-guided. The model tests suggest that the incorporation of planning knowledge (e.g., road junctions and neighborhood types) in the model training leads to a more realistic prediction of street configurations. Furthermore, the new tool provides both professional and lay users an opportunity to systematically and intuitively explore benchmark proposals for comparisons and further evaluations.
