ﻻ يوجد ملخص باللغة العربية
Microfluidic devices are utilized to control and direct flow behavior in a wide variety of applications, particularly in medical diagnostics. A particularly popular form of microfluidics -- called inertial microfluidic flow sculpting -- involves placing a sequence of pillars to controllably deform an initial flow field into a desired one. Inertial flow sculpting can be formally defined as an inverse problem, where one identifies a sequence of pillars (chosen, with replacement, from a finite set of pillars, each of which produce a specific transformation) whose composite transformation results in a user-defined desired transformation. Endemic to most such problems in engineering, inverse problems are usually quite computationally intractable, with most traditional approaches based on search and optimization strategies. In this paper, we pose this inverse problem as a Reinforcement Learning (RL) problem. We train a DoubleDQN agent to learn from this environment. The results suggest that learning is possible using a DoubleDQN model with the success frequency reaching 90% in 200,000 episodes and the rewards converging. While most of the results are obtained by fixing a particular target flow shape to simplify the learning problem, we later demonstrate how to transfer the learning of an agent based on one target shape to another, i.e. from one design to another and thus be useful for a generic design of a flow shape.
We propose a new low-cost machine-learning-based methodology which assists designers in reducing the gap between the problem and the solution in the design process. Our work applies reinforcement learning (RL) to find the optimal task-oriented design
Conventional simulations on multi-exit indoor evacuation focus primarily on how to determine a reasonable exit based on numerous factors in a changing environment. Results commonly include some congested and other under-utilized exits, especially wit
Many real-world sequential decision making problems are partially observable by nature, and the environment model is typically unknown. Consequently, there is great need for reinforcement learning methods that can tackle such problems given only a st
Off-policy learning allows us to learn about possible policies of behavior from experience generated by a different behavior policy. Temporal difference (TD) learning algorithms can become unstable when combined with function approximation and off-po
The offline reinforcement learning (RL) setting (also known as full batch RL), where a policy is learned from a static dataset, is compelling as progress enables RL methods to take advantage of large, previously-collected datasets, much like how the