Discrete-continuous hybrid action spaces arise naturally in many practical problems, such as robot control and game AI. However, most previous Reinforcement Learning (RL) works only demonstrate success with either discrete or continuous action spaces and seldom take the hybrid action space into account. One naive way to address hybrid-action RL is to convert the hybrid action space into a unified homogeneous action space by discretization or continualization, so that conventional RL algorithms can be applied. However, this ignores the underlying structure of the hybrid action space, induces scalability issues and additional approximation difficulties, and thus leads to degraded results. In this paper, we propose Hybrid Action Representation (HyAR) to learn a compact and decodable latent representation space for the original hybrid action space. HyAR constructs the latent space and embeds the dependence between the discrete action and its continuous parameter via an embedding table and a conditional Variational Auto-Encoder (VAE). To further improve effectiveness, the action representation is trained to be semantically smooth through unsupervised environmental dynamics prediction. Finally, the agent learns its policy with conventional DRL algorithms in the learned representation space and interacts with the environment by decoding the hybrid action embeddings into the original action space. We evaluate HyAR in a variety of environments with discrete-continuous action spaces. The results demonstrate the superiority of HyAR over previous baselines, especially for high-dimensional action spaces.
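The core construction can be sketched in a few lines: a learnable embedding table for the discrete action and a conditional VAE whose encoder and decoder are conditioned on that embedding. In the sketch below, the module names, layer sizes, and the simple MLP encoder/decoder are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of a hybrid-action latent space in the spirit of HyAR:
# discrete actions get rows of a learnable embedding table, and a conditional
# VAE encodes/decodes the continuous parameter conditioned on that embedding.
import torch
import torch.nn as nn

class HybridActionVAE(nn.Module):
    def __init__(self, n_discrete, param_dim, emb_dim=6, latent_dim=6, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(n_discrete, emb_dim)          # discrete-action table
        self.enc = nn.Sequential(nn.Linear(param_dim + emb_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.logvar = nn.Linear(hidden, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim + emb_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, param_dim))

    def forward(self, k, x):
        e = self.emb(k)                                        # condition on the discrete action
        h = self.enc(torch.cat([x, e], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterisation trick
        x_hat = self.dec(torch.cat([z, e], dim=-1))            # reconstruct continuous parameters
        return x_hat, mu, logvar

    def decode_action(self, k, z):
        """Map a latent pair (discrete id, continuous embedding) back to an executable action."""
        return k, self.dec(torch.cat([z, self.emb(k)], dim=-1))

# usage: the RL agent acts in the latent space; decode_action turns its output
# back into the original discrete-continuous hybrid action.
vae = HybridActionVAE(n_discrete=5, param_dim=3)
x_hat, mu, logvar = vae(torch.tensor([2]), torch.rand(1, 3))
```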
This paper strives to generate a synthetic computed tomography (CT) image from a magnetic resonance (MR) image. The synthetic CT image is valuable for radiotherapy planning when only an MR image is available. Recent approaches have made large strides in solving this challenging synthesis problem with convolutional neural networks that learn a mapping from MR inputs to CT outputs. In this paper, we find that all existing approaches share a common limitation: reconstruction breaks down in and around the high-frequency parts of CT images. To address this common limitation, we introduce frequency-supervised deep networks to explicitly enhance high-frequency MR-to-CT image reconstruction. We propose a frequency decomposition layer that learns to decompose predicted CT outputs into low- and high-frequency components, and we introduce a refinement module to improve high-frequency reconstruction through high-frequency adversarial learning. Experimental results on a new dataset with 45 pairs of 3D MR-CT brain images show the effectiveness and potential of the proposed approach. Code is available at https://github.com/shizenglin/Frequency-Supervised-MR-to-CT-Image-Synthesis.
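As a rough illustration of the frequency-decomposition idea, the sketch below splits a predicted CT slice into low- and high-frequency components with a fixed Gaussian blur. In the paper the decomposition is learned and the high-frequency part is refined adversarially, so the fixed kernel here is purely an assumption for illustration.

```python
# Minimal frequency-decomposition sketch: low-frequency = blurred prediction,
# high-frequency = residual that a refinement module could then sharpen.
import torch
import torch.nn.functional as F

def gaussian_kernel(size=5, sigma=1.0):
    ax = torch.arange(size) - size // 2
    g = torch.exp(-(ax.float() ** 2) / (2 * sigma ** 2))
    k = torch.outer(g, g)
    return (k / k.sum()).view(1, 1, size, size)

def frequency_decompose(ct_pred, sigma=1.0):
    """ct_pred: (B, 1, H, W) synthetic CT prediction."""
    k = gaussian_kernel(5, sigma).to(ct_pred)
    low = F.conv2d(ct_pred, k, padding=2)   # low-frequency component
    high = ct_pred - low                    # high-frequency residual to be refined
    return low, high

low, high = frequency_decompose(torch.randn(2, 1, 64, 64))
# a refinement network and a high-frequency adversarial loss would act on `high`.
```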
Integrated phase I-II clinical trial designs are efficient approaches to accelerate drug development. In cases where efficacy cannot be ascertained in a short period of time, two-stage approaches are usually employed. When different patient populations are involved across stages, the use of efficacy data collected from both stages merits careful consideration. In this paper, we focus on a two-stage design that aims to estimate safe dose combinations with a certain level of efficacy. In stage I, conditional escalation with overdose control (EWOC) is used to allocate successive cohorts of patients. The maximum tolerated dose (MTD) curve is estimated based on a Bayesian dose-toxicity model. In stage II, we consider an adaptive allocation of patients to drug combinations that have a high probability of being efficacious along the obtained MTD curve. A robust Bayesian hierarchical model is proposed to allow sharing of information on the efficacy parameters across stages, assuming the related parameters are either exchangeable or nonexchangeable. Under the assumption of exchangeability, a random-effects distribution is specified for the main-effects parameters to capture uncertainty about the between-stage differences. The proposed methodology is assessed with extensive simulations motivated by a real phase I-II drug combination trial using continuous doses.
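The stage-I allocation rule can be illustrated with a toy version of overdose control: draw from the posterior of the MTD and recommend the highest grid dose not exceeding its alpha-quantile, so the posterior probability of overdosing stays at or below the feasibility bound. The one-dimensional logistic dose-toxicity model and the importance-resampling posterior below are illustrative assumptions, not the paper's full dose-combination model.

```python
# Toy EWOC-style dose selection: next dose is capped by the alpha-quantile
# of the MTD posterior (overdose control).
import numpy as np

rng = np.random.default_rng(0)

def mtd_posterior_draws(doses, tox, target=0.33, n_draws=20000):
    """Illustrative posterior for a logistic model P(tox|d) = expit(a + b*d)."""
    a = rng.normal(0.0, 2.0, n_draws)
    b = rng.lognormal(0.0, 1.0, n_draws)                    # positive slope prior
    p = 1.0 / (1.0 + np.exp(-(a[:, None] + b[:, None] * doses[None, :])))
    loglik = (tox * np.log(p) + (1 - tox) * np.log(1 - p)).sum(axis=1)
    w = np.exp(loglik - loglik.max()); w /= w.sum()
    idx = rng.choice(n_draws, n_draws, p=w)                 # importance resampling
    return (np.log(target / (1 - target)) - a[idx]) / b[idx]  # MTD draws

def ewoc_next_dose(mtd_draws, alpha=0.25, dose_grid=np.linspace(0.1, 1.0, 10)):
    limit = np.quantile(mtd_draws, alpha)                   # overdose-control bound
    feasible = dose_grid[dose_grid <= limit]
    return feasible.max() if feasible.size else dose_grid.min()

draws = mtd_posterior_draws(np.array([0.1, 0.2, 0.3]), np.array([0, 0, 1]))
print(ewoc_next_dose(draws))
```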
Book covers are intentionally designed and provide an introduction to a book. However, designing and producing cover images typically requires professional skills. We therefore propose a generative neural network that produces book covers from an easy-to-use layout graph. The layout graph contains objects such as text, natural scene objects, and solid color spaces. This layout graph is embedded using a graph convolutional neural network and then passed to a mask proposal generator and a bounding-box generator, and the resulting regions are filled by an object proposal generator. Next, the objects are combined into a single image, and the entire network is trained using a combination of adversarial, perceptual, and reconstruction losses. Finally, a Style Retention Network (SRNet) is used to transfer the learned font style onto the desired text. The proposed method allows for easily controlled and unique book covers.
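A minimal sketch of the first step, embedding the layout graph, is given below. The single mean-aggregation graph-convolution layer and the node features are illustrative assumptions; the paper's full pipeline stacks the mask, bounding-box, and object generators on top of these node embeddings.

```python
# Minimal layout-graph embedding sketch: each node (text box, scene object,
# or solid-colour region) carries a feature vector, and one graph-convolution
# step mixes information along the layout edges.
import torch
import torch.nn as nn

class LayoutGraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.self_lin = nn.Linear(in_dim, out_dim)
        self.neigh_lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        """x: (N, in_dim) node features; adj: (N, N) 0/1 layout adjacency."""
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neigh = adj @ x / deg                          # mean over connected layout nodes
        return torch.relu(self.self_lin(x) + self.neigh_lin(neigh))

# 4 layout objects, e.g. title text, author text, a scene object, a colour block
x = torch.randn(4, 16)
adj = torch.tensor([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], dtype=torch.float)
emb = LayoutGraphConv(16, 32)(x, adj)                  # node embeddings fed to the generators
```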
For learned image compression, the autoregressive context model has proved effective in improving rate-distortion (RD) performance because it helps remove spatial redundancy among the latent representations. However, the decoding process must then be done in a strict scan order, which breaks parallelization. We propose a parallelizable checkerboard context model (CCM) to solve this problem. Our two-pass checkerboard context calculation eliminates such limitations on spatial locations by re-organizing the decoding order. Speeding up the decoding process by more than 40 times in our experiments, it achieves significantly improved computational efficiency with almost the same rate-distortion performance. To the best of our knowledge, this is the first exploration of a parallelization-friendly spatial context model for learned image compression.
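The two-pass idea can be sketched directly on the latent grid: anchor positions (one colour of the checkerboard) are decoded without spatial context, and the remaining positions are then decoded in one parallel pass that sees every neighbouring anchor. The layer names and tensor shapes below are illustrative assumptions rather than the paper's exact entropy model.

```python
# Minimal checkerboard-context sketch: pass 1 decodes anchors with no spatial
# context, pass 2 decodes the rest in parallel conditioned on the anchors.
import torch

def checkerboard_mask(h, w, device=None):
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    return ((yy + xx) % 2 == 0).float().to(device)       # 1 = anchor positions

y = torch.randn(1, 192, 16, 16)                          # quantized latents
mask = checkerboard_mask(16, 16)

# pass 1: anchors use no spatial context (their context features are zero)
anchors = y * mask

# pass 2: non-anchors see every neighbouring anchor through a single convolution
context_conv = torch.nn.Conv2d(192, 384, kernel_size=5, padding=2)
ctx = context_conv(anchors) * (1 - mask)                 # context only where it is needed
# an entropy-parameter network would now predict means/scales from `ctx`
# (plus the hyperprior) for all non-anchor positions in parallel.
```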
Yan Zheng, Yi Liu, Xiaofei Xie (2021)
Web testing has long been recognized as a notoriously difficult task. Even nowadays, web testing still heavily relies on manual effort, while automated web testing is far from achieving human-level performance. Key challenges in web testing include dynamic content updates and deep bugs hiding behind complicated user interactions and specific input values, which can only be triggered by certain action sequences in a huge search space. In this paper, we propose WebExplor, an automatic end-to-end web testing framework, to achieve adaptive exploration of web applications. WebExplor adopts curiosity-driven reinforcement learning to generate high-quality action sequences (test cases) satisfying temporal logical relations. In addition, WebExplor incrementally builds an automaton during the online testing process, which provides high-level guidance to further improve testing efficiency. We have conducted comprehensive evaluations of WebExplor on six real-world projects and a commercial SaaS web application, and performed an in-the-wild study of the top 50 web applications in the world. The results demonstrate that in most cases WebExplor achieves a significantly higher failure detection rate, code coverage, and efficiency than existing state-of-the-art web testing techniques. WebExplor also detected 12 previously unknown failures in the commercial web application, which have been confirmed and fixed by the developers. Furthermore, our in-the-wild study uncovered 3,466 exceptions and errors.
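One way to picture the curiosity signal is a count-based novelty reward over abstracted web states, as in the sketch below. The DOM-hashing state abstraction and the 1/visit-count reward are illustrative assumptions, not WebExplor's exact curiosity definition.

```python
# Count-based curiosity sketch: the agent earns a high reward for reaching
# web states it has rarely seen, pushing the policy toward unexplored pages.
from collections import defaultdict
import hashlib

class CuriosityReward:
    def __init__(self):
        self.visits = defaultdict(int)

    @staticmethod
    def state_key(dom_snapshot: str) -> str:
        """Abstract a page into a hashable state, e.g. a hash of its DOM structure."""
        return hashlib.sha1(dom_snapshot.encode()).hexdigest()

    def __call__(self, dom_snapshot: str) -> float:
        key = self.state_key(dom_snapshot)
        self.visits[key] += 1
        return 1.0 / self.visits[key]          # novel states earn high reward

curiosity = CuriosityReward()
print(curiosity("<html><body><button id='buy'/></body></html>"))   # 1.0 on first visit
print(curiosity("<html><body><button id='buy'/></body></html>"))   # 0.5 on revisit
```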
Digital payment volume has proliferated in recent years with the rapid growth of small businesses and online shops. When processing these digital transactions, recognizing each merchant's real identity (i.e., business type) is vital to ensure the integrity of payment processing systems. Conventionally, this problem is formulated as a time series classification problem using only the merchant's transaction history. However, given the large scale of the data and the changing behaviors of merchants and consumers over time, it is extremely challenging to achieve satisfactory performance with off-the-shelf classification methods. In this work, we approach the problem from a multi-modal learning perspective, where we use not only the merchant's time series data but also information on merchant-merchant relationships (i.e., affinity) to verify the self-reported business type (i.e., merchant category) of a given merchant. Specifically, we design two individual encoders, one responsible for encoding temporal information and the other for affinity information, together with a mechanism that fuses the outputs of the two encoders to accomplish the identification task. Our experiments on real-world credit card transaction data between 71,668 merchants and 433,772,755 customers demonstrate the effectiveness and efficiency of the proposed model.
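A minimal sketch of the two-encoder design follows: a recurrent encoder summarises the transaction time series, a second encoder pools affinity features over related merchants, and a fusion head outputs category logits. The specific layer choices (GRU, mean-pooled neighbour MLP, concatenation fusion) are assumptions for illustration, not the paper's exact architecture.

```python
# Two-encoder fusion sketch for merchant-category verification.
import torch
import torch.nn as nn

class MerchantVerifier(nn.Module):
    def __init__(self, ts_dim, aff_dim, n_categories, hidden=64):
        super().__init__()
        self.temporal = nn.GRU(ts_dim, hidden, batch_first=True)    # time-series encoder
        self.affinity = nn.Sequential(nn.Linear(aff_dim, hidden), nn.ReLU())
        self.fuse = nn.Linear(2 * hidden, n_categories)

    def forward(self, ts, neighbour_feats):
        _, h = self.temporal(ts)                        # (1, B, hidden)
        t = h.squeeze(0)
        a = self.affinity(neighbour_feats).mean(dim=1)  # mean-pool over related merchants
        return self.fuse(torch.cat([t, a], dim=-1))     # logits over merchant categories

model = MerchantVerifier(ts_dim=8, aff_dim=16, n_categories=20)
logits = model(torch.randn(4, 30, 8), torch.randn(4, 10, 16))
```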
This paper develops Bayesian sample size formulae for experiments comparing two groups. We assume the experimental data will be analysed in the Bayesian framework, where pre-experimental information from multiple sources can be represented by robust priors. In particular, such robust priors account for preliminary beliefs about the pairwise commensurability between parameters that underpin the historical and new experiments, to permit flexible borrowing of information. Averaged over the probability space of the new experimental data, appropriate sample sizes are found according to criteria that control certain aspects of the posterior distribution, such as the coverage probability or the length of a defined density region. Our Bayesian methodology can be applied to circumstances where the common variance in the new experiment is known or unknown. Exact solutions are available for most of the criteria considered for Bayesian sample size determination, while a search procedure is described for cases with no closed-form expressions. We illustrate the application of our Bayesian sample size formulae in the setting of designing a clinical trial. Hypothetical data examples, motivated by a rare-disease trial with elicitation of expert prior opinion, and a comprehensive performance evaluation of the proposed methodology are presented.
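For the known-variance case, one length-based criterion has a direct check: with a normal prior on the treatment difference, the posterior credible-interval length depends on the sample size but not on the observed data, so the smallest adequate size can be found by a simple search. The prior values and target length in the sketch below are illustrative assumptions only.

```python
# Length-criterion sketch for a two-group comparison with known common variance:
# find the smallest per-group n whose posterior credible interval for the
# treatment difference is no longer than the target.
import math
from statistics import NormalDist

def smallest_n_for_length(sigma, prior_sd, target_length, level=0.95, n_max=10000):
    z = NormalDist().inv_cdf(0.5 + level / 2)
    for n in range(2, n_max):
        # posterior variance of the difference: prior precision + data precision
        post_var = 1.0 / (1.0 / prior_sd**2 + n / (2 * sigma**2))
        if 2 * z * math.sqrt(post_var) <= target_length:
            return n
    return None

# e.g. unit-variance outcomes, weak prior (sd = 10), 95% interval of length <= 0.5
print(smallest_n_for_length(sigma=1.0, prior_sd=10.0, target_length=0.5))
```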
Incorporating preclinical animal data, which can be regarded as a special kind of historical data, into phase I clinical trials can improve decision making when very little about human toxicity is known. In this paper, we develop a robust hierarchical modelling approach to leverage animal data in new phase I clinical trials, where we bridge across non-overlapping, potentially heterogeneous patient subgroups. Translation parameters are used to bring both historical and contemporary data onto a common dosing scale. This leads to feasible exchangeability assumptions: the parameter vectors that underpin the dose-toxicity relationship in each study are assumed to be drawn from a common distribution. Moreover, the human dose-toxicity parameter vectors are assumed to be exchangeable either with the standardised, animal study-specific parameter vectors or between themselves. The possibility of non-exchangeability for each parameter vector is considered to avoid inferences for extreme subgroups being overly influenced by the others. We illustrate the proposed approach with several trial data examples and evaluate the operating characteristics of our model against several alternatives in a simulation study. Numerical results show that our approach yields robust inferences in circumstances where data from multiple sources are inconsistent and/or the bridging assumptions are incorrect.
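The exchangeable/non-exchangeable idea can be illustrated with a two-component mixture for a single subgroup parameter: the data update the prior weight between "drawn from the common distribution" and "drawn from its own robust prior". Fixing the common-distribution hyperparameters, as in the sketch below, is a simplifying assumption for illustration; the paper places a full hierarchical model over them.

```python
# Toy exchangeable (EX) vs non-exchangeable (NEX) weighting for one subgroup,
# using closed-form normal marginal likelihoods.
from statistics import NormalDist

def posterior_ex_weight(y, se, prior_w=0.8,
                        ex_mean=0.0, ex_sd=0.5,      # common distribution (exchangeable)
                        nex_mean=0.0, nex_sd=2.0):   # robust own prior (non-exchangeable)
    """y, se: estimate and standard error of the subgroup's dose-toxicity parameter."""
    m_ex = NormalDist(ex_mean, (ex_sd**2 + se**2) ** 0.5).pdf(y)     # marginal under EX
    m_nex = NormalDist(nex_mean, (nex_sd**2 + se**2) ** 0.5).pdf(y)  # marginal under NEX
    return prior_w * m_ex / (prior_w * m_ex + (1 - prior_w) * m_nex)

print(posterior_ex_weight(y=0.2, se=0.3))   # consistent subgroup -> weight stays high
print(posterior_ex_weight(y=3.0, se=0.3))   # outlying subgroup -> weight collapses
```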
Basket trials have emerged as a new class of efficient approaches in oncology to evaluate a new treatment in several patient subgroups simultaneously. In this paper, we extend the key ideas to disease areas outside of oncology, developing a robust Bayesian methodology for randomised, placebo-controlled basket trials with a continuous endpoint that enables borrowing of information across subtrials with similar treatment effects. After adjusting for covariates, information from a complementary subtrial can be represented by a commensurate prior for the parameter that underpins the subtrial under consideration. We propose using distributional discrepancy to characterise the commensurability between subtrials for appropriate borrowing of information through a spike-and-slab prior, which is placed on the prior precision factor. When the basket trial has at least three subtrials, commensurate priors for point-to-point borrowing are combined into a marginal predictive prior, according to weights transformed from the pairwise discrepancy measures. In this way, only information from the subtrial(s) with the most commensurate treatment effect is leveraged. The marginal predictive prior is updated to a robust posterior by the contemporary subtrial data to inform decision making. Operating characteristics of the proposed methodology are evaluated through simulations motivated by a real basket trial in chronic diseases. The proposed methodology has advantages over other Bayesian analysis models in (i) identifying the most commensurate source of information and (ii) gauging the degree of borrowing from specific subtrials. Numerical results also suggest that our methodology can improve the precision of estimates and, potentially, the statistical power for hypothesis testing.
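How pairwise discrepancy can be turned into borrowing weights is sketched below: each complementary subtrial's treatment-effect posterior is summarised as a normal, its discrepancy from the current subtrial is measured (with the Hellinger distance, purely as an illustrative choice), and the similarities are normalised into weights for the marginal predictive prior. This is a stand-in for the paper's spike-and-slab construction, not its exact formulation.

```python
# Discrepancy-to-weights sketch for combining commensurate priors.
import math

def hellinger_normal(m1, s1, m2, s2):
    bc = math.sqrt(2 * s1 * s2 / (s1**2 + s2**2)) * math.exp(-(m1 - m2) ** 2 / (4 * (s1**2 + s2**2)))
    return math.sqrt(1 - bc)

def borrowing_weights(current, others):
    """current, others: (mean, sd) summaries of subtrial treatment effects."""
    sims = [1 - hellinger_normal(*current, *o) for o in others]   # high similarity = low discrepancy
    total = sum(sims)
    return [s / total for s in sims]

# current subtrial vs three complementary subtrials
print(borrowing_weights((0.3, 0.2), [(0.28, 0.25), (0.9, 0.2), (0.35, 0.3)]))
# most of the weight goes to the subtrials whose effects look commensurate
```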