Research papers, master's and doctoral theses published by Vladislav Golyanik

Adiabatic Quantum Graph Matching with Permutation Matrix Constraints

276 - Marcel Seelbach Benkner, Vladislav Golyanik, Christian Theobalt 2021

Matching problems on 3D shapes and images are challenging as they are frequently formulated as combinatorial quadratic assignment problems (QAPs) with permutation matrix constraints, which are NP-hard. In this work, we address such problems with emerging quantum computing technology and propose several reformulations of QAPs as unconstrained problems suitable for efficient execution on quantum hardware. We investigate several ways to inject permutation matrix constraints in a quadratic unconstrained binary optimization problem which can be mapped to quantum hardware. We focus on obtaining a sufficient spectral gap, which further increases the probability to measure optimal solutions and valid permutation matrices in a single run. We perform our experiments on the quantum computer D-Wave 2000Q (2^11 qubits, adiabatic). Despite the observed discrepancy between simulated adiabatic quantum computing and execution on real quantum hardware, our reformulation of permutation matrix constraints increases the robustness of the numerical computations over other penalty approaches in our experiments. The proposed algorithm has the potential to scale to higher dimensions on future quantum computing architectures, which opens up multiple new directions for solving matching problems in 3D computer vision and graphics.
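
As a rough illustration of what "injecting permutation matrix constraints" into a QUBO can look like, the sketch below adds a generic quadratic penalty on row and column sums to a small QAP cost matrix and brute-forces the result on a classical machine. This is the textbook penalty baseline, not the specific reformulation proposed in the paper. The function names, the random cost matrix `W`, and the penalty weight `lam` are assumptions made for this example.

```python
import itertools
import numpy as np

def permutation_penalty_qubo(W, lam):
    """Add lam * sum_g (sum_{a in g} x_a - 1)^2 over all rows and columns g.

    The constant term of the penalty is dropped; for binary x, x_a^2 == x_a.
    """
    n = int(round(np.sqrt(W.shape[0])))
    Q = W.astype(float).copy()
    rows = [[i * n + j for j in range(n)] for i in range(n)]
    cols = [[i * n + j for i in range(n)] for j in range(n)]
    for group in rows + cols:
        for a in group:
            Q[a, a] -= lam              # linear part: x_a^2 - 2 x_a = -x_a
            for b in group:
                if a != b:
                    Q[a, b] += lam      # pairwise part: x_a x_b
    return Q

def brute_force_minimum(Q):
    """Exhaustively minimise x^T Q x over binary vectors (tiny problems only)."""
    best_x, best_e = None, np.inf
    for bits in itertools.product((0, 1), repeat=Q.shape[0]):
        x = np.array(bits, dtype=float)
        e = x @ Q @ x
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e

rng = np.random.default_rng(0)
n = 3                                   # 3x3 permutation -> 9 binary variables
W = rng.random((n * n, n * n))
W = (W + W.T) / 2                       # hypothetical symmetric QAP costs
lam = 2 * np.abs(W).sum() + 1           # large enough to dominate any cost term
Q = permutation_penalty_qubo(W, lam)
x_best, _ = brute_force_minimum(Q)
X_best = x_best.reshape(n, n).astype(int)
print(X_best)
```

With a penalty weight this large, the unconstrained minimiser is forced to be a valid permutation matrix. An oversized penalty is also the kind of choice that tends to shrink the spectral gap on real annealers, which is the trade-off the abstract refers to.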

Computer Vision and Pattern Recognition

Fast Simultaneous Gravitational Alignment of Multiple Point Sets

66 - Vladislav Golyanik, Soshi Shimada, Christian Theobalt 2021

The problem of simultaneous rigid alignment of multiple unordered point sets which is unbiased towards any of the inputs has recently attracted increasing interest, and several reliable methods have been newly proposed. While being remarkably robust towards noise and clustered outliers, current approaches require sophisticated initialisation schemes and do not scale well to large point sets. This paper proposes a new resilient technique for simultaneous registration of multiple point sets by interpreting the latter as particle swarms rigidly moving in the mutually induced force fields. Thanks to the improved simulation with altered physical laws and acceleration of globally multiply-linked point interactions with a 2^D-tree (D is the space dimensionality), our Multi-Body Gravitational Approach (MBGA) is robust to noise and missing data while supporting more massive point sets than previous methods (with 10^5 points and more). In various experimental settings, MBGA is shown to outperform several baseline point set alignment approaches in terms of accuracy and runtime. We make our source code available for the community to facilitate the reproducibility of the results.

Computer Vision and Pattern Recognition

Q-Match: Iterative Shape Matching via Quantum Annealing

376 - Marcel Seelbach Benkner, Zorah Lahner, Vladislav Golyanik and Christof Wunderlich 2021

Finding shape correspondences can be formulated as an NP-hard quadratic assignment problem (QAP) that becomes infeasible for shapes with high sampling density. A promising research direction is to tackle such quadratic optimization problems over binary variables with quantum annealing, which allows for some problems a more efficient search in the solution space. Unfortunately, enforcing the linear equality constraints in QAPs via a penalty significantly limits the success probability of such methods on currently available quantum hardware. To address this limitation, this paper proposes Q-Match, i.e., a new iterative quantum method for QAPs inspired by the alpha-expansion algorithm, which allows solving problems of an order of magnitude larger than current quantum methods. It implicitly enforces the QAP constraints by updating the current estimates in a cyclic fashion. Further, Q-Match can be applied iteratively, on a subset of well-chosen correspondences, allowing us to scale to real-world problems. Using the latest quantum annealer, the D-Wave Advantage, we evaluate the proposed method on a subset of QAPLIB as well as on isometric shape matching problems from the FAUST dataset.

Computer Vision and Pattern Recognition

Neural Monocular 3D Human Motion Capture with Physical Awareness

141 - Soshi Shimada, Vladislav Golyanik, Weipeng Xu 2021

We present a new trainable system for physically plausible markerless 3D human motion capture, which achieves state-of-the-art results in a broad range of challenging scenarios. Unlike most neural methods for human motion capture, our approach, which we dub physionical, is aware of physical and environmental constraints. It combines in a fully differentiable way several key innovations, i.e., 1. a proportional-derivative controller, with gains predicted by a neural network, that reduces delays even in the presence of fast motions, 2. an explicit rigid body dynamics model and 3. a novel optimisation layer that prevents physically implausible foot-floor penetration as a hard constraint. The inputs to our system are 2D joint keypoints, which are canonicalised in a novel way so as to reduce the dependency on intrinsic camera parameters -- both at train and test time. This enables more accurate global translation estimation without generalisability loss. Our model can be finetuned only with 2D annotations when the 3D annotations are not available. It produces smooth and physically principled 3D motions in an interactive frame rate in a wide variety of challenging scenes, including newly recorded ones. Its advantages are especially noticeable on in-the-wild sequences that significantly differ from common 3D pose estimation benchmarks such as Human 3.6M and MPI-INF-3DHP. Qualitative results are available at http://gvv.mpi-inf.mpg.de/projects/PhysAware/
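
To make the proportional-derivative (PD) controller mentioned above more concrete, here is a minimal sketch of a PD control law driving joint angles towards target values. In the paper the gains are predicted by a neural network and coupled to a rigid body dynamics model. Here the gains `kp` and `kd` are fixed constants, the dynamics are reduced to unit-inertia joints, and all names are assumptions made for this example.

```python
import numpy as np

def pd_torque(q, q_dot, q_target, kp, kd):
    """PD control law: pull towards the target pose, damp the joint velocity."""
    return kp * (q_target - q) - kd * q_dot

def simulate(q0, q_target, kp, kd, dt=0.01, steps=500):
    """Integrate independent unit-inertia joints with semi-implicit Euler."""
    q = np.asarray(q0, dtype=float).copy()
    q_dot = np.zeros_like(q)
    for _ in range(steps):
        tau = pd_torque(q, q_dot, q_target, kp, kd)
        q_dot = q_dot + dt * tau    # unit inertia: acceleration equals torque
        q = q + dt * q_dot
    return q

# Hypothetical two-joint example with fixed, critically damped gains.
q_target = np.array([1.0, -0.2])
q_final = simulate(q0=[0.0, 0.5], q_target=q_target, kp=100.0, kd=20.0)
print(q_final)
```

Higher `kp` tracks the target faster but can overshoot, and `kd` suppresses oscillation. Letting a network choose the gains per frame is one way to balance responsiveness against stability.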

Computer Vision and Pattern Recognition, Graphics, Human-Computer Interaction

Differentiable Event Stream Simulator for Non-Rigid 3D Tracking

109 - Jalees Nehvi, Vladislav Golyanik, Franziska Mueller and Hans-Peter Seidel 2021

This paper introduces the first differentiable simulator of event streams, i.e., streams of asynchronous brightness change signals recorded by event cameras. Our differentiable simulator enables non-rigid 3D tracking of deformable objects (such as human hands, isometric surfaces and general watertight meshes) from event streams by leveraging an analysis-by-synthesis principle. So far, event-based tracking and reconstruction of non-rigid objects in 3D, like hands and body, has been either tackled using explicit event trajectories or large-scale datasets. In contrast, our method does not require any such processing or data, and can be readily applied to incoming event streams. We show the effectiveness of our approach for various types of non-rigid objects and compare to existing methods for non-rigid 3D tracking. In our experiments, the proposed energy-based formulations outperform competing RGB-based methods in terms of 3D errors. The source code and the new data are publicly available.

Computer Vision and Pattern Recognition

HumanGAN: A Generative Model of Humans Images

120 - Kripasindhu Sarkar, Lingjie Liu, Vladislav Golyanik and Christian Theobalt 2021

Generative adversarial networks achieve great performance in photorealistic image synthesis in various domains, including human images. However, they usually employ latent vectors that encode the sampled outputs globally. This does not allow convenient control of semantically-relevant individual parts of the image, and is not able to draw samples that only differ in partial aspects, such as clothing style. We address these limitations and present a generative model for images of dressed humans offering control over pose, local body part appearance and garment style. This is the first method to solve various aspects of human image generation such as global appearance sampling, pose transfer, parts and garment transfer, and parts sampling jointly in a unified framework. As our model encodes part-based latent appearance vectors in a normalized pose-independent space and warps them to different poses, it preserves body and clothing appearance under varying posture. Experiments show that our flexible and general generative method outperforms task-specific baselines for pose-conditioned image generation, pose transfer and part sampling in terms of realism and output resolution.

Computer Vision and Pattern Recognition

Style and Pose Control for Image Synthesis of Humans from a Single Monocular View

121 - Kripasindhu Sarkar, Vladislav Golyanik, Lingjie Liu and Christian Theobalt 2021

Photo-realistic re-rendering of a human from a single image with explicit control over body pose, shape and appearance enables a wide range of applications, such as human appearance transfer, virtual try-on, motion imitation, and novel view synthesis. While significant progress has been made in this direction using learning-based image generation tools, such as GANs, existing approaches yield noticeable artefacts such as blurring of fine details, unrealistic distortions of the body parts and garments as well as severe changes of the textures. We, therefore, propose a new method for synthesising photo-realistic human images with explicit control over pose and part-based appearance, i.e., StylePoseGAN, where we extend a non-controllable generator to accept conditioning of pose and appearance separately. Our network can be trained in a fully supervised way with human images to disentangle pose, appearance and body parts, and it significantly outperforms existing single image re-rendering methods. Our disentangled representation opens up further applications such as garment transfer, motion transfer, virtual try-on, head (identity) swap and appearance interpolation. StylePoseGAN achieves state-of-the-art image generation fidelity on common perceptual metrics compared to the current best-performing methods and convinces in a comprehensive user study.

Computer Vision and Pattern Recognition

Quantum Permutation Synchronization

440 - Tolga Birdal, Vladislav Golyanik, Christian Theobalt 2021

We present QuantumSync, the first quantum algorithm for solving a synchronization problem in the context of computer vision. In particular, we focus on permutation synchronization which involves solving a non-convex optimization problem in discrete variables. We start by formulating synchronization into a quadratic unconstrained binary optimization problem (QUBO). While such formulation respects the binary nature of the problem, ensuring that the result is a set of permutations requires extra care. Hence, we: (i) show how to insert permutation constraints into a QUBO problem and (ii) solve the constrained QUBO problem on the current generation of the adiabatic quantum computers D-Wave. Thanks to the quantum annealing, we guarantee global optimality with high probability while sampling the energy landscape to yield confidence estimates. Our proof-of-concepts realization on the adiabatic D-Wave computer demonstrates that quantum machines offer a promising way to solve the prevalent yet difficult synchronization problems.

Quantum Physics, Computer Vision and Pattern Recognition, Emerging Technologies

Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video

498 - Edgar Tretschk, Ayush Tewari, Vladislav Golyanik 2020

We present Non-Rigid Neural Radiance Fields (NR-NeRF), a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes. Our approach takes RGB images of a dynamic scene as input (e.g., from a monocular video recording), and creates a high-quality space-time geometry and appearance representation. We show that a single handheld consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views, e.g. a "bullet-time" video effect. NR-NeRF disentangles the dynamic scene into a canonical volume and its deformation. Scene deformation is implemented as ray bending, where straight rays are deformed non-rigidly. We also propose a novel rigidity network to better constrain rigid regions of the scene, leading to more stable results. The ray bending and rigidity network are trained without explicit supervision. Our formulation enables dense correspondence estimation across views and time, and compelling video editing applications such as motion exaggeration. Our code will be open sourced.
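
The ray bending idea can be sketched in a few lines: sample points along a straight camera ray, then displace each sample by a per-point offset before it would be looked up in the canonical volume, with a rigidity score scaling that offset. In NR-NeRF both the offsets and the rigidity scores come from learned networks. The closed-form `toy_offset` and `toy_rigidity` functions below are hypothetical stand-ins, and gating the offset by multiplying it with the rigidity score is an assumption of this example.

```python
import numpy as np

def sample_straight_ray(origin, direction, near, far, num_samples):
    """Evenly spaced 3D sample points along a straight ray."""
    t = np.linspace(near, far, num_samples)
    return origin[None, :] + t[:, None] * direction[None, :]

def toy_offset(points, time):
    """Hypothetical deformation: shift along x, growing with time."""
    offsets = np.zeros_like(points)
    offsets[:, 0] = 0.1 * time * np.sin(points[:, 2])
    return offsets

def toy_rigidity(points):
    """Hypothetical rigidity score: 0 = rigid (z < 1), 1 = non-rigid."""
    return (points[:, 2] >= 1.0).astype(float)

def bend_ray(points, time):
    """Displace ray samples; rigid regions keep their straight-ray positions."""
    return points + toy_rigidity(points)[:, None] * toy_offset(points, time)

origin = np.array([0.0, 0.0, 0.0])
direction = np.array([0.0, 0.0, 1.0])
straight = sample_straight_ray(origin, direction, near=0.5, far=2.0, num_samples=4)
bent = bend_ray(straight, time=1.0)
print(bent - straight)   # per-sample displacement
```

Samples in the rigid region stay on the straight ray, while the remaining samples are moved off it, so the ray is effectively bent through the deforming part of the scene.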

Computer Vision and Pattern Recognition, Graphics

High-Fidelity Neural Human Motion Transfer from Monocular Video

387 - Moritz Kappel, Vladislav Golyanik, Mohamed Elgharib 2020

Video-based human motion transfer creates video animations of humans following a source motion. Current methods show remarkable results for tightly-clad subjects. However, the lack of temporally consistent handling of plausible clothing dynamics, including fine and high-frequency details, significantly limits the attainable visual quality. We address these limitations for the first time in the literature and present a new framework which performs high-fidelity and temporally-consistent human motion transfer with natural pose-dependent non-rigid deformations, for several types of loose garments. In contrast to the previous techniques, we perform image generation in three subsequent stages, synthesizing human shape, structure, and appearance. Given a monocular RGB video of an actor, we train a stack of recurrent deep neural networks that generate these intermediate representations from 2D poses and their temporal derivatives. Splitting the difficult motion transfer problem into subtasks that are aware of the temporal motion context helps us to synthesize results with plausible dynamics and pose-dependent detail. It also allows artistic control of results by manipulation of individual framework stages. In the experimental results, we significantly outperform the state-of-the-art in terms of video realism. Our code and data will be made publicly available.

Computer Vision and Pattern Recognition, Graphics, Machine Learning
