Research papers and master's and doctoral theses published by Thomas Hofmann

Uniform Convergence, Adversarial Spheres and a Simple Remedy

47 - Gregor Bachmann , Seyed-Mohsen Moosavi-Dezfooli , Thomas Hofmann 2021

Previous work has cast doubt on the general framework of uniform convergence and its ability to explain generalization in neural networks. By considering a specific dataset, it was observed that a neural network completely misclassifies a projection of the training data (adversarial set), rendering any existing generalization bound based on uniform convergence vacuous. We provide an extensive theoretical investigation of the previously studied data setting through the lens of infinitely-wide models. We prove that the Neural Tangent Kernel (NTK) also suffers from the same phenomenon and we uncover its origin. We highlight the important role of the output bias and show theoretically as well as empirically how a sensible choice completely mitigates the problem. We identify sharp phase transitions in the accuracy on the adversarial set and study its dependency on the training sample size. As a result, we are able to characterize critical sample sizes beyond which the effect disappears. Moreover, we study decompositions of a neural network into a clean and noisy part by considering its canonical decomposition into its different eigenfunctions and show empirically that for too small bias the adversarial phenomenon still persists.

Machine Learning, Artificial Intelligence

Learning Generative Models of Textured 3D Meshes from Real-World Images

424 - Dario Pavllo , Jonas Kohler , Thomas Hofmann 2021

Recent advances in differentiable rendering have sparked an interest in learning generative models of textured 3D meshes from image collections. These models natively disentangle pose and appearance, enable downstream applications in computer graphics, and improve the ability of generative models to understand the concept of image formation. Although there has been prior work on learning such models from collections of 2D images, these approaches require a delicate pose estimation step that exploits annotated keypoints, thereby restricting their applicability to a few specific datasets. In this work, we propose a GAN framework for generating textured triangle meshes without relying on such annotations. We show that the performance of our approach is on par with prior work that relies on ground-truth keypoints, and more importantly, we demonstrate the generality of our method by setting new baselines on a larger set of categories from ImageNet - for which keypoints are not available - without any class-specific hyperparameter tuning. We release our code at https://github.com/dariopavllo/textured-3d-gan

Computer Vision and Pattern Recognition, Computer Graphics, Machine Learning

Evidence for temporary and local transition of sp2 graphite-type to sp3 diamond-type bonding induced by the tip of an atomic force microscope

217 - Thomas Hofmann , Xinguo Ren , Alexander Liebig 2021

Artificial diamond is created by exposing graphite to pressures on the order of 10 GPa and temperatures of about 2000 K. Here, we provide evidence that the pressure exerted by the tip of an atomic force microscope onto graphene over the carbon buffer layer of silicon carbide can lead to a temporary transition of graphite to diamond on the atomic scale. We perform atomic force microscopy with CO terminated tips and copper oxide (CuOx) tips to image graphene and to induce the structural transition. For a local transition, DFT predicts that a repulsive barrier of $\approx 13$ nN, followed by a force reduction by $\approx 4$ nN, is overcome when inducing the graphite-diamond transition. Experimental evidence for this transition is provided by the observation of third harmonics in the cantilever oscillation for relatively flexible CO terminated tips and a kink in the force versus distance curve for rigid CuOx tips. The experimental observation of the third harmonic with a magnitude of about 200 fm fits to a force with an amplitude of $\pm 3$ nN. The large repulsive overall force of $\approx 10$ nN is only compatible with the experiment if one assumes that the repulsive force acting on the tip when inducing the transition is compensated by an increased van-der-Waals attraction of the tip due to form fitting of tip and sample by local indentation. The transition changes flat sp$^2$ bonds to corrugated sp$^3$ bonds, resulting in a different height of the two basis atoms in the elementary cell of graphene. Both tip types show a strong asymmetry between the two basis atoms of the lattice when using large repulsive tip forces that induce the transition. Experimental data of tunneling current, frequency shift and dissipation are consistent with the proposed transition. The experiment also shows that atomic force microscopy allows one to perform high pressure physics on the atomic scale.

Materials Science, Mesoscale and Nanoscale Physics

Convolutional Generation of Textured 3D Meshes

359 - Dario Pavllo , Graham Spinks , Thomas Hofmann 2020

While recent generative models for 2D images achieve impressive visual results, they clearly lack the ability to perform 3D reasoning. This heavily restricts the degree of control over generated objects as well as the possible applications of such models. In this work, we bridge this gap by leveraging recent advances in differentiable rendering. We design a framework that can generate triangle meshes and associated high-resolution texture maps, using only 2D supervision from single-view natural images. A key contribution of our work is the encoding of the mesh and texture as 2D representations, which are semantically aligned and can be easily modeled by a 2D convolutional GAN. We demonstrate the efficacy of our method on Pascal3D+ Cars and CUB, both in an unconditional setting and in settings where the model is conditioned on class labels, attributes, and text. Finally, we propose an evaluation methodology that assesses the mesh and texture quality separately.

Computer Vision and Pattern Recognition, Computer Graphics, Machine Learning

Controlling Style and Semantics in Weakly-Supervised Image Generation

194 - Dario Pavllo , Aurelien Lucchi , Thomas Hofmann 2019

We propose a weakly-supervised approach for conditional image generation of complex scenes where a user has fine control over objects appearing in the scene. We exploit sparse semantic maps to control object shapes and classes, as well as textual descriptions or attributes to control both local and global style. In order to condition our model on textual descriptions, we introduce a semantic attention module whose computational cost is independent of the image resolution. To further augment the controllability of the scene, we propose a two-step generation scheme that decomposes background and foreground. The label maps used to train our model are produced by a large-vocabulary object detector, which enables access to unlabeled data and provides structured instance information. In such a setting, we report better FID scores compared to fully-supervised settings where the model is trained on ground-truth semantic maps. We also showcase the ability of our model to manipulate a scene on complex datasets such as COCO and Visual Genome.

Computer Vision and Pattern Recognition

Zero-Shot Dual Machine Translation

95 - Lierni Sestorain , Massimiliano Ciaramita , Christian Buck , Thomas Hofmann 2018

Neural Machine Translation (NMT) systems rely on large amounts of parallel data. This is a major challenge for low-resource languages. Building on recent work on unsupervised and semi-supervised methods, we present an approach that combines zero-shot and dual learning. The latter relies on reinforcement learning to exploit the duality of the machine translation task, and requires only monolingual data for the target language pair. Experiments show that a zero-shot dual system, trained on English-French and English-Spanish, outperforms by large margins a standard NMT system in zero-shot translation performance on Spanish-French (both directions). The zero-shot dual method approaches the performance, within 2.2 BLEU points, of a comparable supervised setting. Our method also yields improvements in the setting where a small amount of parallel data for the zero-shot language pair is available. When Russian is added, extending our experiments to jointly modeling 6 zero-shot translation directions, all directions improve by between 4 and 15 BLEU points, again reaching performance near that of the supervised setting.

Computation and Language, Neural and Evolutionary Computing
