No Arabic abstract
This paper presents a new conditional GAN (named convex relaxing CGAN or crCGAN) to replicate the conventional constrained topology optimization algorithms in an extremely effective and efficient process. The proposed crCGAN consists of a generator and a discriminator, both of which are deep convolutional neural networks (CNN) and the topology design constraint can be conditionally set to both the generator and discriminator. In order to improve the training efficiency and accuracy due to the dependency between the training images and the condition, a variety of crCGAN formulation are introduced to relax the non-convex design space. These new formulations were evaluated and validated via a series of comprehensive experiments. Moreover, a minibatch discrimination technique was introduced in the crCGAN training process to stabilize the convergence and avoid the mode collapse problems. Additional verifications were conducted using the state-of-the-art MNIST digits and CIFAR-10 images conditioned by class labels. The experimental evaluations clearly reveal that the new objective formulation with the minibatch discrimination training provides not only the accuracy but also the consistency of the designs.
When developing and analyzing new hyperparameter optimization (HPO) methods, it is vital to empirically evaluate and compare them on well-curated benchmark suites. In this work, we list desirable properties and requirements for such benchmarks and propose a new set of challenging and relevant multifidelity HPO benchmark problems motivated by these requirements. For this, we revisit the concept of surrogate-based benchmarks and empirically compare them to more widely-used tabular benchmarks, showing that the latter ones may induce bias in performance estimation and ranking of HPO methods. We present a new surrogate-based benchmark suite for multifidelity HPO methods consisting of 9 benchmark collections that constitute over 700 multifidelity HPO problems in total. All our benchmarks also allow for querying of multiple optimization targets, enabling the benchmarking of multi-objective HPO. We examine and compare our benchmark suite with respect to the defined requirements and show that our benchmarks provide viable additions to existing suites.
Black-box problems are common in real life like structural design, drug experiments, and machine learning. When optimizing black-box systems, decision-makers always consider multiple performances and give the final decision by comprehensive evaluations. Motivated by such practical needs, we focus on constrained black-box problems where the objective and constraints lack known special structure, and evaluations are expensive and even with noise. We develop a novel constrained Bayesian optimization approach based on the knowledge gradient method ($c-rm{KG}$). A new acquisition function is proposed to determine the next batch of samples considering optimality and feasibility. An unbiased estimator of the gradient of the new acquisition function is derived to implement the $c-rm{KG}$ approach.
In this study, a novel topology optimization approach based on conditional Wasserstein generative adversarial networks (CWGAN) is developed to replicate the conventional topology optimization algorithms in an extremely computationally inexpensive way. CWGAN consists of a generator and a discriminator, both of which are deep convolutional neural networks (CNN). The limited samples of data, quasi-optimal planar structures, needed for training purposes are generated using the conventional topology optimization algorithms. With CWGANs, the topology optimization conditions can be set to a required value before generating samples. CWGAN truncates the global design space by introducing an equality constraint by the designer. The results are validated by generating an optimized planar structure using the conventional algorithms with the same settings. A proof of concept is presented which is known to be the first such illustration of fusion of CWGANs and topology optimization.
When training end-to-end learned models for lossy compression, one has to balance the rate and distortion losses. This is typically done by manually setting a tradeoff parameter $beta$, an approach called $beta$-VAE. Using this approach it is difficult to target a specific rate or distortion value, because the result can be very sensitive to $beta$, and the appropriate value for $beta$ depends on the model and problem setup. As a result, model comparison requires extensive per-model $beta$-tuning, and producing a whole rate-distortion curve (by varying $beta$) for each model to be compared. We argue that the constrained optimization method of Rezende and Viola, 2018 is a lot more appropriate for training lossy compression models because it allows us to obtain the best possible rate subject to a distortion constraint. This enables pointwise model comparisons, by training two models with the same distortion target and comparing their rate. We show that the method does manage to satisfy the constraint on a realistic image compression task, outperforms a constrained optimization method based on a hinge-loss, and is more practical to use for model selection than a $beta$-VAE.
Structure learning of directed acyclic graphs (DAGs) is a fundamental problem in many scientific endeavors. A new line of work, based on NOTEARS (Zheng et al., 2018), reformulates the structure learning problem as a continuous optimization one by leveraging an algebraic characterization of DAG constraint. The constrained problem is typically solved using the augmented Lagrangian method (ALM) which is often preferred to the quadratic penalty method (QPM) by virtue of its convergence result that does not require the penalty coefficient to go to infinity, hence avoiding ill-conditioning. In this work, we review the standard convergence result of the ALM and show that the required conditions are not satisfied in the recent continuous constrained formulation for learning DAGs. We demonstrate empirically that its behavior is akin to that of the QPM which is prone to ill-conditioning, thus motivating the use of second-order method in this setting. We also establish the convergence guarantee of QPM to a DAG solution, under mild conditions, based on a property of the DAG constraint term.