3D hand-mesh reconstruction from RGB images facilitates many applications, including augmented reality (AR). However, this requires not only real-time speed and accurate hand pose and shape but also plausible mesh-image alignment. While existing works already achieve promising results, meeting all three requirements is very challenging. This paper presents a novel pipeline that decouples the hand-mesh reconstruction task into three stages: a joint stage to predict hand joints and segmentation; a mesh stage to predict a rough hand mesh; and a refine stage to fine-tune it with an offset mesh for mesh-image alignment. With careful design of the network structure and the loss functions, we can promote high-quality finger-level mesh-image alignment and drive the models together to deliver real-time predictions. Extensive quantitative and qualitative results on benchmark datasets demonstrate that our results outperform the state-of-the-art methods on hand-mesh/pose precision and hand-image alignment. Finally, we showcase several real-time AR scenarios.
Over a complete Riemannian manifold of finite dimension, Greene and Wu introduced a convolution, known as Greene-Wu (GW) convolution. In this paper, we study properties of the GW convolution and apply it to non-Euclidean machine learning problems. In particular, we derive a new formula for how the curvature of the space would affect the curvature of the function through the GW convolution. Also, following the study of the GW convolution, a new method for gradient estimation over Riemannian manifolds is introduced.
In prescriptive analytics, the decision-maker observes historical samples of $(X, Y)$, where $Y$ is the uncertain problem parameter and $X$ is the concurrent covariate, without knowing the joint distribution. Given an additional covariate observation $x$, the goal is to choose a decision $z$ conditional on this observation to minimize the cost $\mathbb{E}[c(z,Y)|X=x]$. This paper proposes a new distributionally robust approach under Wasserstein ambiguity sets, in which the nominal distribution of $Y|X=x$ is constructed from the historical data using the Nadaraya-Watson kernel estimator. We show that the nominal distribution converges to the actual conditional distribution under the Wasserstein distance. We establish the out-of-sample guarantees and the computational tractability of the framework. Through synthetic and empirical experiments on the newsvendor problem and portfolio optimization, we demonstrate the strong performance and practical value of the proposed framework.
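As a rough illustration of the nominal distribution described above, the sketch below computes Nadaraya-Watson weights for historical samples and uses them to estimate the conditional cost $\mathbb{E}[c(z,Y)|X=x]$. It assumes a scalar covariate, a Gaussian kernel, and a newsvendor-style cost; the function names, the bandwidth, and the cost parameters are illustrative choices, not taken from the paper, and the Wasserstein robustification step is not shown.

```python
import math

def nadaraya_watson_weights(xs, x, bandwidth):
    """Normalized Gaussian-kernel weights of each historical covariate xs[i] relative to x."""
    # Assumption: Gaussian kernel; the abstract does not specify which kernel is used.
    raw = [math.exp(-0.5 * ((xi - x) / bandwidth) ** 2) for xi in xs]
    total = sum(raw)  # may underflow to 0 if x is extremely far from every sample
    return [r / total for r in raw]

def conditional_cost(z, xs, ys, x, bandwidth, cost):
    """Kernel-weighted estimate of E[c(z, Y) | X = x] from samples (xs[i], ys[i])."""
    weights = nadaraya_watson_weights(xs, x, bandwidth)
    return sum(w * cost(z, y) for w, y in zip(weights, ys))

def newsvendor_cost(z, y, underage=4.0, overage=1.0):
    """Hypothetical newsvendor cost: penalty for ordering too little or too much."""
    return underage * max(y - z, 0.0) + overage * max(z - y, 0.0)
```

Samples whose covariate lies close to the query $x$ receive larger weights, so the weighted empirical distribution of the $y_i$ acts as a stand-in for $Y|X=x$.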
In this paper, we tackle the problem of unsupervised 3D object segmentation from a point cloud without RGB information. In particular, we propose a framework, SPAIR3D, to model a point cloud as a spatial mixture model and jointly learn the multiple-object representation and segmentation in 3D via Variational Autoencoders (VAE). Inspired by SPAIR, we adopt an object-specification scheme that describes each object's location relative to its local voxel grid cell rather than the point cloud as a whole. To model the spatial mixture model on point clouds, we derive the Chamfer Likelihood, which fits naturally into the variational training pipeline. We further design a new spatially invariant graph neural network to generate a varying number of 3D points as a decoder within our VAE. Experimental results demonstrate that SPAIR3D is capable of detecting and segmenting a variable number of objects without appearance information across diverse scenes.
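For background on the name, the sketch below computes the standard symmetric Chamfer distance between two point sets. This is only the familiar distance that the term "Chamfer" usually refers to; the abstract does not define the Chamfer Likelihood itself, so nothing here should be read as the paper's formulation. The function names and the use of squared Euclidean distance are assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def chamfer_distance(A, B):
    """Symmetric Chamfer distance between non-empty point sets A and B.

    Each point is matched to its nearest neighbour in the other set, and the
    mean squared distances in both directions are added together.
    """
    a_to_b = sum(min(squared_distance(p, q) for q in B) for p in A) / len(A)
    b_to_a = sum(min(squared_distance(q, p) for p in A) for q in B) / len(B)
    return a_to_b + b_to_a
```

Because each point only needs a nearest neighbour, the two sets may contain different numbers of points, which suits decoders that generate a varying number of 3D points.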
This paper presents a method for learning logical task specifications and cost functions from demonstrations. Linear temporal logic (LTL) formulas are widely used to express complex objectives and constraints for autonomous systems. Yet, such specifications may be challenging to construct by hand. Instead, we consider demonstrated task executions, whose temporal logic structure and transition costs need to be inferred by an autonomous agent. We employ a spectral learning approach to extract a weighted finite automaton (WFA), approximating the unknown logic structure of the task. Thereafter, we define a product between the WFA for high-level task guidance and a labeled Markov decision process (L-MDP) for low-level control and optimize a cost function that matches the demonstrator's behavior. We demonstrate that our method is capable of generalizing the execution of the inferred task specification to new environment configurations.
Tianyu Wang, Bo Lin, Baxi Chong (2020)
Snake robots composed of alternating single-axis pitch and yaw joints have many internal degrees of freedom, which make them capable of versatile three-dimensional locomotion. In the motion planning process, snake robot motions are often designed kinematically by a chronological sequence of continuous backbone curves that capture desired macroscopic shapes of the robot. However, as the geometric arrangement of single-axis rotary joints creates constraints on the rotations in the robot, it is challenging for the robot to reconstruct an arbitrary 3D curve. When the robot configuration does not accurately achieve the desired shapes defined by these backbone curves, the robot can have unexpected contacts with the environment, such that the robot does not achieve the desired motion. In this work, we propose a method for snake robots to reconstruct desired backbone curves by posing an optimization problem that exploits the robot's geometric structure. We verified that our method enables fast and accurate curve-configuration
We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm in large (potentially continuous) state-action spaces. Our algorithm is based on optimistic one-step value iteration extended to maintain an adaptive discretization of the space. From a theoretical perspective, we provide worst-case regret bounds for our algorithm which are competitive with those of state-of-the-art model-based algorithms. Moreover, our bounds are obtained via a modular proof technique which can potentially extend to incorporate additional structure on the problem. From an implementation standpoint, our algorithm has much lower storage and computational requirements due to maintaining a more efficient partition of the state and action spaces. We illustrate this via experiments on several canonical control problems, which show that our algorithm empirically performs significantly better than fixed discretization in terms of both faster convergence and lower memory usage. Interestingly, we observe empirically that while fixed-discretization model-based algorithms vastly outperform their model-free counterparts, the two achieve comparable performance with adaptive discretization.
Stochastic Lipschitz bandit algorithms balance exploration and exploitation, and have been used for a variety of important task domains. In this paper, we present a framework for Lipschitz bandit methods that adaptively learns partitions of context- and arm-space. Due to this flexibility, the algorithm is able to efficiently optimize rewards and minimize regret, by focusing on the portions of the space that are most relevant. In our analysis, we link tree-based methods to Gaussian processes. In light of our analysis, we design a novel hierarchical Bayesian model for Lipschitz bandit problems. Our experiments show that our algorithms can achieve state-of-the-art performance in challenging real-world tasks such as neural network hyperparameter tuning.
All next-generation ground-based and space-based solar telescopes require a good quality assessment metric in order to evaluate their imaging performance. In this paper, a new image quality metric, the median filter gradient similarity (MFGS), is proposed for photospheric images. MFGS is a no-reference/blind objective image quality metric (IQM) whose value lies between 0 and 1. It has been applied to short-exposure photospheric images captured by the New Vacuum Solar Telescope (NVST) of the Fuxian Solar Observatory and by the Solar Optical Telescope (SOT) onboard the Hinode satellite. The results show that: (1) the measured value of MFGS changes monotonically from 1 to 0 with degradation of image quality; (2) there exists a linear correlation between the measured values of MFGS and the root-mean-square contrast (RMS-contrast) of granulation; (3) MFGS is less affected by the image contents than the granular RMS-contrast. Overall, MFGS is a good alternative for the quality assessment of photospheric images.
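For reference, the RMS-contrast that MFGS is compared against can be sketched as follows, assuming the common definition of the standard deviation of pixel intensity divided by the mean intensity. The abstract does not give the MFGS formula, so it is not reproduced here; the function name and the list-of-rows image representation are illustrative assumptions.

```python
import math

def rms_contrast(image):
    """RMS contrast of a grayscale image given as a list of rows of intensities.

    Assumes the common definition: population standard deviation of the pixel
    intensities divided by their mean. The mean intensity must be non-zero.
    """
    pixels = [value for row in image for value in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((value - mean) ** 2 for value in pixels) / len(pixels)
    return math.sqrt(variance) / mean
```

A perfectly uniform image has zero RMS contrast, and stronger intensity variation relative to the mean gives a larger value, which is why the measure depends on the image contents as well as on image quality.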