Do you want to publish a course? Click here

Benchmarking Invertible Architectures on Inverse Problems

85   0   0.0 ( 0 )
 Added by Jakob Kruse
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

Recent work demonstrated that flow-based invertible neural networks are promising tools for solving ambiguous inverse problems. Following up on this, we investigate how ten invertible architectures and related models fare on two intuitive, low-dimensional benchmark problems, obtaining the best results with coupling layers and simple autoencoders. We hope that our initial efforts inspire other researchers to evaluate their invertible architectures in the same setting and put forth additional benchmarks, so our evaluation may eventually grow into an official community challenge.



rate research

Read More

Recent efforts on solving inverse problems in imaging via deep neural networks use architectures inspired by a fixed number of iterations of an optimization method. The number of iterations is typically quite small due to difficulties in training networks corresponding to more iterations; the resulting solvers cannot be run for more iterations at test time without incurring significant errors. This paper describes an alternative approach corresponding to an infinite number of iterations, yielding a consistent improvement in reconstruction accuracy above state-of-the-art alternatives and where the computational budget can be selected at test time to optimize context-dependent trade-offs between accuracy and computation. The proposed approach leverages ideas from Deep Equilibrium Models, where the fixed-point iteration is constructed to incorporate a known forward model and insights from classical optimization-based reconstruction methods.
Photonic platforms represent a promising technology for the realization of several quantum communication protocols and for experiments of quantum simulation. Moreover, large-scale integrated interferometers have recently gained a relevant role for restricted models of quantum computing, specifically with Boson Sampling devices. Indeed, various linear optical schemes have been proposed for the implementation of unitary transformations, each one suitable for a specific task. Notwithstanding, so far a comprehensive analysis of the state of the art under broader and realistic conditions is still lacking. In the present work we address this gap, providing in a unified framework a quantitative comparison of the three main photonic architectures, namely the ones with triangular and square designs and the so-called fast transformations. All layouts have been analyzed in presence of losses and imperfect control over the reflectivities and phases of the inner structure. Our results represent a further step ahead towards the implementation of quantum information protocols on large-scale integrated photonic devices.
The trend towards highly parallel multi-processing is ubiquitous in all modern computer architectures, ranging from handheld devices to large-scale HPC systems; yet many applications are struggling to fully utilise the multiple levels of parallelism exposed in modern high-performance platforms. In order to realise the full potential of recent hardware advances, a mixed-mode between shared-memory programming techniques and inter-node message passing can be adopted which provides high-levels of parallelism with minimal overheads. For scientific applications this entails that not only the simulation code itself, but the whole software stack needs to evolve. In this paper, we evaluate the mixed-mode performance of PETSc, a widely used scientific library for the scalable solution of partial differential equations. We describe the addition of OpenMP threaded functionality to the library, focusing on sparse matrix-vector multiplication. We highlight key challenges in achieving good parallel performance, such as explicit communication overlap using task-based parallelism, and show how to further improve performance by explicitly load balancing threads within MPI processes. Using a set of matrices extracted from Fluidity, a CFD application code which uses the library as its linear solver engine, we then benchmark the parallel performance of mixed-mode PETSc across multiple nodes on several modern HPC architectures. We evaluate the parallel scalability on Uniform Memory Access (UMA) systems, such as the Fujitsu PRIMEHPC FX10 and IBM BlueGene/Q, as well as a Non-Uniform Memory Access (NUMA) Cray XE6 platform. A detailed comparison is performed which highlights the characteristics of each particular architecture, before demonstrating efficient strong scalability of sparse matrix-vector multiplication with significant speedups over the pure-MPI mode.
We introduce novel communication strategies in synchronous distributed Deep Learning consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors. These new techniques produce an optimal overlap between computation and communication and result in near-linear scaling (0.93) of distributed training up to 27,600 NVIDIA V100 GPUs on the Summit Supercomputer. We demonstrate our gradient reduction techniques in the context of training a Fully Convolutional Neural Network to approximate the solution of a longstanding scientific inverse problem in materials imaging. The efficient distributed training on a dataset size of 0.5 PB, produces a model capable of an atomically-accurate reconstruction of materials, and in the process reaching a peak performance of 2.15(4) EFLOPS$_{16}$.
Inverse problems arise in a number of domains such as medical imaging, remote sensing, and many more, relying on the use of advanced signal and image processing approaches -- such as sparsity-driven techniques -- to determine their solution. This paper instead studies the use of deep learning approaches to approximate the solution of inverse problems. In particular, the paper provides a new generalization bound, depending on key quantity associated with a deep neural network -- its Jacobian matrix -- that also leads to a number of computationally efficient regularization strategies applicable to inverse problems. The paper also tests the proposed regularization strategies in a number of inverse problems including image super-resolution ones. Our numerical results conducted on various datasets show that both fully connected and convolutional neural networks regularized using the regularization or proxy regularization strategies originating from our theory exhibit much better performance than deep networks regularized with standard approaches such as weight-decay.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا