Reinforcement Learning without Ground-Truth State

335 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Xingyu Lin

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Xingyu Lin - Harjatin Singh Baweja - David Held

علم الروبوتات التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

To perform robot manipulation tasks, a low-dimensional state of the environment typically needs to be estimated. However, designing a state estimator can sometimes be difficult, especially in environments with deformable objects. An alternative is to learn an end-to-end policy that maps directly from high-dimensional sensor inputs to actions. However, if this policy is trained with reinforcement learning, then without a state estimator, it is hard to specify a reward function based on high-dimensional observations. To meet this challenge, we propose a simple indicator reward function for goal-conditioned reinforcement learning: we only give a positive reward when the robots observation exactly matches a target goal observation. We show that by relabeling the original goal with the achieved goal to obtain positive rewards (Andrychowicz et al., 2017), we can learn with the indicator reward function even in continuous state spaces. We propose two methods to further speed up convergence with indicator rewards: reward balancing and reward filtering. We show comparable performance between our method and an oracle which uses the ground-truth state for computing rewards. We show that our method can perform complex tasks in continuous state spaces such as rope manipulation from RGB-D images, without knowledge of the ground-truth state.

قيم البحث

130 - Henri Tiittanen , Emilia Oikarinen , Andreas Henelius 2019

Regression analysis is a standard supervised machine learning method used to model an outcome variable in terms of a set of predictor variables. In most real-world applications we do not know the true value of the outcome variable being predicted out side the training data, i.e., the ground truth is unknown. It is hence not straightforward to directly observe when the estimate from a model potentially is wrong, due to phenomena such as overfitting and concept drift. In this paper we present an efficient framework for estimating the generalization error of regression functions, applicable to any family of regression functions when the ground truth is unknown. We present a theoretical derivation of the framework and empirically evaluate its strengths and limitations. We find that it performs robustly and is useful for detecting concept drift in datasets in several real-world domains.

التعلم الالي التعلم الآلي

Reducing Magnetic Resonance Image Spacing by Learning Without Ground-Truth

381 - Kai Xuan , Liping Si , Lichi Zhang 2020

High-quality magnetic resonance (MR) image, i.e., with near isotropic voxel spacing, is desirable in various scenarios of medical image analysis. However, many MR acquisitions use large inter-slice spacing in clinical practice. In this work, we propo se a novel deep-learning-based super-resolution algorithm to generate high-resolution (HR) MR images with small slice spacing from low-resolution (LR) inputs of large slice spacing. Notice that most existing deep-learning-based methods need paired LR and HR images to supervise the training, but in clinical scenarios, usually no HR images will be acquired. Therefore, our unique goal herein is to design and train the super-resolution network with no real HR ground-truth. Specifically, two training stages are used in our method. First, HR images of reduced slice spacing are synthesized from real LR images using variational auto-encoder (VAE). Although these synthesized HR images are as realistic as possible, they may still suffer from unexpected morphing induced by VAE, implying that the synthesized HR images cannot be paired with the real LR images in terms of anatomical structure details. In the second stage, we degrade the synthesized HR images to generate corresponding LR images and train a super-resolution network based on these synthesized HR and degraded LR pairs. The underlying mechanism is that such a super-resolution network is less vulnerable to anatomical variability. Experiments on knee MR images successfully demonstrate the effectiveness of our proposed solution to reduce the slice spacing for better rendering.

معالجة الصور والفيديو

Training Image Estimators without Image Ground-Truth

86 - Zhihao Xia , Ayan Chakrabarti 2019

Deep neural networks have been very successful in image estimation applications such as compressive-sensing and image restoration, as a means to estimate images from partial, blurry, or otherwise degraded measurements. These networks are trained on a large number of corresponding pairs of measurements and ground-truth images, and thus implicitly learn to exploit domain-specific image statistics. But unlike measurement data, it is often expensive or impractical to collect a large training set of ground-truth images in many application settings. In this paper, we introduce an unsupervised framework for training image estimation networks, from a training set that contains only measurements---with two varied measurements per image---but no ground-truth for the full images desired as output. We demonstrate that our framework can be applied for both regular and blind image estimation tasks, where in the latter case parameters of the measurement model (e.g., the blur kernel) are unknown: during inference, and potentially, also during training. We evaluate our method for training networks for compressive-sensing and blind deconvolution, considering both non-blind and blind training for the latter. Our unsupervised framework yields models that are nearly as accurate as those from fully supervised training, despite not having access to any ground-truth images.

الرؤية الحاسوبية وتمييز الأنماط

Unsupervised Deep Basis Pursuit: Learning inverse problems without ground-truth data

103 - Jonathan I. Tamir , Stella X. Yu , Michael Lustig 2019

Basis pursuit is a compressed sensing optimization in which the l1-norm is minimized subject to model error constraints. Here we use a deep neural network prior instead of l1-regularization. Using known noise statistics, we jointly learn the prior an d reconstruct images without access to ground-truth data. During training, we use alternating minimization across an unrolled iterative network and jointly solve for the neural network weights and training set image reconstructions. At inference, we fix the weights and pass the measurements through the network. We compare reconstruction performance between unsupervised and supervised (i.e. with ground-truth) methods. We hypothesize this technique could be used to learn reconstruction when ground-truth data are unavailable, such as in high-resolution dynamic MRI.

معالجة الإشارات معالجة الصور والفيديو

SynSeg-Net: Synthetic Segmentation Without Target Modality Ground Truth

281 - Yuankai Huo , Zhoubing Xu , Hyeonsoo Moon 2018

A key limitation of deep convolutional neural networks (DCNN) based image segmentation methods is the lack of generalizability. Manually traced training images are typically required when segmenting organs in a new imaging modality or from distinct d isease cohort. The manual efforts can be alleviated if the manually traced images in one imaging modality (e.g., MRI) are able to train a segmentation network for another imaging modality (e.g., CT). In this paper, we propose an end-to-end synthetic segmentation network (SynSeg-Net) to train a segmentation network for a target imaging modality without having manual labels. SynSeg-Net is trained by using (1) unpaired intensity images from source and target modalities, and (2) manual labels only from source modality. SynSeg-Net is enabled by the recent advances of cycle generative adversarial networks (CycleGAN) and DCNN. We evaluate the performance of the SynSeg-Net on two experiments: (1) MRI to CT splenomegaly synthetic segmentation for abdominal images, and (2) CT to MRI total intracranial volume synthetic segmentation (TICV) for brain images. The proposed end-to-end approach achieved superior performance to two stage methods. Moreover, the SynSeg-Net achieved comparable performance to the traditional segmentation network using target modality labels in certain scenarios. The source code of SynSeg-Net is publicly available (https://github.com/MASILab/SynSeg-Net).

الرؤية الحاسوبية وتمييز الأنماط