Can a user create a deep generative model by sketching a single example? Traditionally, creating a GAN model has required the collection of a large-scale dataset of exemplars and specialized knowledge in deep learning. In contrast, sketching is possibly the most universally accessible way to convey a visual concept. In this work, we present a method, GAN Sketching, for rewriting GANs with one or more sketches, to make GAN training easier for novice users. In particular, we change the weights of an original GAN model according to user sketches. We encourage the model's output to match the user sketches through a cross-domain adversarial loss. Furthermore, we explore different regularization methods to preserve the original model's diversity and image quality. Experiments show that our method can mold GANs to match shapes and poses specified by sketches while maintaining realism and diversity. Finally, we demonstrate a few applications of the resulting GAN, including latent space interpolation and image editing.
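As a rough illustration of the objective described above, the following is a minimal sketch of one generator update, not the authors' code: it assumes a pretrained generator `G`, a photo-to-sketch mapping network `F`, a sketch-domain discriminator `D`, and a weight regularizer pulling the tuned model toward the original. All names and the hyperparameter `lam` are illustrative placeholders.

```python
import torch
import torch.nn.functional as F_nn  # F is taken by the photo-to-sketch net

def gan_sketching_step(G, F, D, sketches, orig_params, z_dim=128, lam=0.3):
    """One generator update: push F(G(z)) toward the sketch domain as judged
    by D, while regularizing the changed weights toward the original model
    to preserve its diversity and image quality. (D's own update on real
    user sketches vs. F(G(z)) is omitted here.)"""
    z = torch.randn(sketches.size(0), z_dim)
    fake_sketches = F(G(z))                       # map generated photos to sketches
    adv = F_nn.softplus(-D(fake_sketches)).mean() # non-saturating GAN loss
    reg = sum(((p - p0) ** 2).sum()
              for p, p0 in zip(G.parameters(), orig_params))
    return adv + lam * reg
```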
Employing ultrafast control of the electronic states of a semiconductor quantum dot in a cavity, we introduce a novel approach to achieve on-demand emission of single photons with almost perfect indistinguishability and photon pairs with near-ideal entanglement. Our scheme is based on optical excitation off-resonant to a cavity mode, followed by ultrafast control of the electronic states using the time-dependent quantum-confined Stark effect, which then allows for cavity-resonant emission. Our theoretical analysis takes into account cavity-loss mechanisms, the Stark effect, and phonon-induced dephasing, allowing for realistic predictions at finite temperatures.
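A schematic way to formalize such a scheme, stated here only as an assumption and not the authors' exact model, is a two-level exciton {|G>, |X>} coupled to one cavity mode, with the quantum-confined Stark effect entering as a time-dependent exciton energy (β is a polarizability, F(t) the applied field; biexciton levels, linear field terms, and the Lindblad dissipators for cavity loss and phonon dephasing are omitted for brevity):

```latex
\begin{align}
  H(t) &= \hbar\,\omega_X(t)\,|X\rangle\langle X|
        + \hbar\,\omega_c\, a^\dagger a
        + \hbar g \left( a^\dagger\,|G\rangle\langle X|
        + a\,|X\rangle\langle G| \right), \\
  \omega_X(t) &= \omega_X^{(0)} - \frac{\beta\,F(t)^2}{\hbar}.
\end{align}
```

Sweeping F(t) tunes the exciton into resonance with the cavity only at the desired emission time, which is the sense in which the Stark effect gates cavity-resonant emission.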
We investigate the problem of zero-shot semantic image painting. Instead of painting modifications into an image using only concrete colors or a finite set of semantic concepts, we ask how to create semantic paint based on open full-text descriptions: our goal is to be able to point to a location in a synthesized image and apply an arbitrary new concept such as "rustic", "opulent", or "happy dog". To do this, our method combines a state-of-the-art generative model of realistic images with a state-of-the-art text-image semantic similarity network. We find that, to make large changes, it is important to use non-gradient methods to explore latent space, and it is important to relax the computations of the GAN to target changes to a specific region. We conduct user studies to compare our methods to several baselines.
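To make the non-gradient latent exploration concrete, here is a minimal sketch under stated assumptions: `generate(z)` returns an image and `text_image_score(image, text, mask)` returns a CLIP-style text-image similarity evaluated on the masked region. Both callables, and the simple evolutionary search used here, are placeholders; the paper's actual optimizer may differ.

```python
import numpy as np

def paint_by_text(z0, text, mask, generate, text_image_score,
                  n_iter=200, pop=32, sigma=0.5):
    """(1+lambda)-style search: perturb the latent, keep the best candidate
    under the region-restricted text-image score. No gradients needed."""
    best_z = z0
    best_s = text_image_score(generate(z0), text, mask)
    for _ in range(n_iter):
        cands = best_z + sigma * np.random.randn(pop, *z0.shape)
        for z in cands:
            s = text_image_score(generate(z), text, mask)
            if s > best_s:
                best_z, best_s = z, s
        sigma *= 0.99  # gradually narrow the search
    return best_z
```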
Deep neural networks excel at finding hierarchical representations that solve complex tasks over large data sets. How can we humans understand these learned representations? In this work, we present network dissection, an analytic framework to systematically identify the semantics of individual hidden units within image classification and image generation networks. First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts. We find evidence that the network has learned many object classes that play crucial roles in classifying scene classes. Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes. By analyzing changes made when small sets of units are activated or deactivated, we find that objects can be added and removed from the output scenes while adapting to the context. Finally, we apply our analytic framework to understanding adversarial attacks and to semantic image editing.
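The core unit-labeling step can be summarized as an intersection-over-union test between a unit's thresholded activation map and a concept's segmentation mask. The sketch below illustrates this; the fixed-percentile threshold is an illustrative assumption standing in for the framework's per-unit quantile threshold.

```python
import numpy as np

def unit_concept_iou(activations, concept_masks, pct=99.5):
    """activations: (N, H, W) one unit's activations, upsampled to image
    resolution over N images. concept_masks: (N, H, W) binary masks for one
    concept. Returns IoU; a unit is labeled with its best-matching concept."""
    thresh = np.percentile(activations, pct)   # keep the top ~0.5% activations
    unit_masks = activations > thresh
    inter = np.logical_and(unit_masks, concept_masks).sum()
    union = np.logical_or(unit_masks, concept_masks).sum()
    return inter / max(union, 1)
```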
The quality of image generation and manipulation is reaching impressive levels, making it increasingly difficult for a human to distinguish between what is real and what is fake. However, deep networks can still pick up on the subtle artifacts in these doctored images. We seek to understand what properties of fake images make them detectable and identify what generalizes across different model architectures, datasets, and variations in training. We use a patch-based classifier with limited receptive fields to visualize which regions of fake images are more easily detectable. We further show a technique to exaggerate these detectable properties and demonstrate that, even when the image generator is adversarially finetuned against a fake image classifier, it is still imperfect and leaves detectable artifacts in certain image patches. Code is available at https://chail.github.io/patch-forensics/.
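The key architectural idea, a classifier whose receptive field is deliberately small, can be sketched as a fully convolutional network that emits one real/fake logit per patch. Layer sizes below are illustrative assumptions, not the paper's exact truncated-backbone architecture.

```python
import torch.nn as nn

# Each output logit depends only on a small local window of the input,
# so the prediction map doubles as a per-region detectability heatmap.
patch_classifier = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=1), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
    nn.Conv2d(64, 1, kernel_size=1),  # one real/fake logit per patch
)
```

Averaging the logit map yields an image-level score, while visualizing it shows which regions of a fake image are most easily detected.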
David Baum, 2020
By optimizing aesthetics, graph diagrams can be generated that are easier to read and understand. However, the challenge lies in identifying suitable aesthetics. We present a novel approach based on repertory grids to explore the design space of aesthetics systematically. We applied our approach with three independent groups of participants to systematically identify graph aesthetics. In all three cases, we were able to reproduce, without any prior knowledge, aesthetics whose influence on readability had been evaluated positively. We also applied our approach to two- and three-dimensional domain-specific software visualizations to demonstrate its versatility. Here, too, we were able to identify several aesthetics that are relevant for perceiving the visualization.
Software analytics in augmented reality (AR) is said to have great potential. One reason this potential is not yet fully exploited may be usability problems in AR user interfaces. We present an iterative, qualitative usability evaluation, with 15 subjects, of a state-of-the-art application for software analytics in AR. We identified and resolved numerous usability issues. Most of them were caused by applying conventional user interface elements, such as dialog windows, buttons, and scrollbars. The city visualization itself, however, did not cause any usability issues. We therefore argue that future work should focus on making conventional user interface elements in AR obsolete by integrating their functionality into the immersive visualization.
A deep generative model such as a GAN learns to model a rich set of semantic and physical rules about the target distribution, but until now it has remained unclear how such rules are encoded in the network or how a rule could be changed. In this paper, we introduce a new problem setting: manipulation of specific rules encoded by a deep generative model. To address the problem, we propose a formulation in which the desired rule is changed by manipulating a layer of a deep network as a linear associative memory. We derive an algorithm for modifying one entry of the associative memory, and we demonstrate that several interesting structural rules can be located and modified within the layers of state-of-the-art generative models. We present a user interface that enables users to interactively change the rules of a generative model to achieve desired effects, and we show several proof-of-concept applications. Finally, results on multiple datasets demonstrate the advantage of our method over standard fine-tuning methods and edit transfer algorithms.
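To give the associative-memory view some shape, here is a minimal sketch of one common closed-form rank-one edit, not necessarily the paper's exact optimization: treat a layer's weights W as a memory mapping keys to values, and insert a new key-value rule (k*, v*) while minimally disturbing the responses to the other stored keys K.

```python
import numpy as np

def rewrite_rule(W, K, k_star, v_star, eps=1e-4):
    """W: (d_out, d_in) layer weights; K: (d_in, n) stored keys.
    Returns W_new such that W_new @ k_star == v_star, with the change
    concentrated along the key-covariance-whitened direction C^{-1} k*."""
    C = K @ K.T + eps * np.eye(K.shape[0])  # key second-moment matrix
    u = np.linalg.solve(C, k_star)          # C^{-1} k*, the update direction
    W_new = W + np.outer(v_star - W @ k_star, u) / (k_star @ u)
    return W_new
```

By construction the edited layer returns v* for k* exactly, while keys roughly orthogonal to k* under C are nearly unaffected, which is why a single rule can be changed without retraining the whole model.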
We report a field-portable and cost-effective imaging flow cytometer that uses deep learning to accurately detect Giardia lamblia cysts in water samples at a volumetric throughput of 100 mL/h. This flow cytometer uses lensfree color holographic imaging to capture and reconstruct phase and intensity images of microscopic objects in a continuously flowing sample, and automatically identifies Giardia lamblia cysts in real-time without the use of any labels or fluorophores. The imaging flow cytometer is housed in an environmentally-sealed enclosure with dimensions of 19 cm x 19 cm x 16 cm and weighs 1.6 kg. We demonstrate that this portable imaging flow cytometer coupled to a laptop computer can detect and quantify, in real-time, low levels of Giardia contamination (e.g., <10 cysts per 50 mL) in both freshwater and seawater samples. The field-portable and label-free nature of this method has the potential to allow rapid and automated screening of drinking water supplies in resource-limited settings in order to detect waterborne parasites and monitor the integrity of the filters used for water treatment.
We introduce a simple but effective unsupervised method for generating realistic and diverse images. We train a class-conditional GAN model without using manually annotated class labels. Instead, our model is conditioned on labels automatically derived from clustering in the discriminator's feature space. Our clustering step automatically discovers diverse modes and explicitly requires the generator to cover them. Experiments on standard mode collapse benchmarks show that our method outperforms several competing methods when addressing mode collapse. Our method also performs well on large-scale datasets such as ImageNet and Places365, improving both image diversity and standard quality metrics, compared to previous methods.
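The self-labeling step can be illustrated with a short sketch: cluster discriminator features of real images to obtain pseudo-labels, then train the class-conditional GAN against them. The feature source `disc_features`, the cluster count, and the re-clustering schedule are all assumptions here, not the paper's exact settings.

```python
import numpy as np
from sklearn.cluster import KMeans

def pseudo_labels(disc_features, n_clusters=50, seed=0):
    """disc_features: (N, d) features of real images taken from an
    intermediate discriminator layer. Returns one cluster id per image,
    used in place of manual class labels for conditional training."""
    km = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
    return km.fit_predict(disc_features)
```

During training, the labels would be recomputed periodically as the discriminator's features evolve; requiring the conditional generator to cover every discovered cluster is what discourages mode collapse.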