
We introduce Multiresolution Deep Implicit Functions (MDIF), a hierarchical representation that can recover fine geometry detail while being able to perform global operations such as shape completion. Our model represents a complex 3D shape with a hierarchy of latent grids, which can be decoded into different levels of detail and also achieve better accuracy. For shape completion, we propose latent grid dropout to simulate partial data in the latent space and therefore defer the completion functionality to the decoder side. This, along with our multires design, significantly improves the shape completion quality under decoder-only latent optimization. To the best of our knowledge, MDIF is the first deep implicit function model that can at the same time (1) represent different levels of detail and allow progressive decoding; (2) support both encoder-decoder inference and decoder-only latent optimization, and fulfill multiple applications; (3) perform detailed decoder-only shape completion. Experiments demonstrate its superior performance against prior art in various 3D reconstruction tasks.
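The latent grid dropout mentioned above can be pictured with a small sketch. The snippet below is a minimal illustration, not the paper's implementation; the grid shape, the drop_prob rate, and the block size are made-up parameters. It zeroes random spatial blocks of a latent grid so that a decoder would be trained to complete shapes from partial latent codes.

    import torch
    import torch.nn.functional as F

    def latent_grid_dropout(latent_grid: torch.Tensor, drop_prob: float = 0.5,
                            block: int = 4) -> torch.Tensor:
        """Zero out random spatial blocks of a latent grid of shape (B, C, D, H, W)
        to mimic the missing regions of a partial scan during training."""
        B, _, D, H, W = latent_grid.shape
        # One keep/drop decision per coarse block, then upsampled to full resolution.
        keep = (torch.rand(B, 1, D // block, H // block, W // block,
                           device=latent_grid.device) > drop_prob).float()
        keep = F.interpolate(keep, size=(D, H, W), mode='nearest')
        return latent_grid * keep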
In this paper, we address the problem of building dense correspondences between human images under arbitrary camera viewpoints and body poses. Prior art either assumes small motion between frames or relies on local descriptors, which cannot handle large motion or visually ambiguous body parts, e.g., left vs. right hand. In contrast, we propose a deep learning framework that maps each pixel to a feature space, where the feature distances reflect the geodesic distances among pixels as if they were projected onto the surface of a 3D human scan. To this end, we introduce novel loss functions to push features apart according to their geodesic distances on the surface. Without any semantic annotation, the proposed embeddings automatically learn to differentiate visually similar parts and align different subjects into a unified feature space. Extensive experiments show that the learned embeddings can produce accurate correspondences between images, with remarkable generalization capabilities in both intra- and inter-subject cases.
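To make the geodesic-distance objective concrete, here is a minimal sketch of one plausible loss of this kind, written in PyTorch. The margin_scale parameter, the close_thresh contact threshold, and the exact pull/push form are illustrative assumptions, not the paper's loss functions.

    import torch

    def geodesic_embedding_loss(features: torch.Tensor, geodesic: torch.Tensor,
                                margin_scale: float = 1.0,
                                close_thresh: float = 1e-3) -> torch.Tensor:
        """features: (N, D) per-pixel embeddings; geodesic: (N, N) geodesic
        distances between the corresponding surface points on a 3D human scan.
        Pixels close on the surface are pulled together; distant pixels are
        pushed apart with a margin that grows with their geodesic distance."""
        feat_dist = torch.cdist(features, features)                  # (N, N)
        push = torch.clamp(margin_scale * geodesic - feat_dist, min=0.0)
        pull = feat_dist * (geodesic < close_thresh).float()
        return (push + pull).mean()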
We describe a novel approach for compressing truncated signed distance fields (TSDF) stored in 3D voxel grids, and their corresponding textures. To compress the TSDF, our method relies on a block-based neural network architecture trained end-to-end, achieving state-of-the-art rate-distortion trade-off. To prevent topological errors, we losslessly compress the signs of the TSDF, which also upper bounds the reconstruction error by the voxel size. To compress the corresponding texture, we designed a fast block-based UV parameterization, generating coherent texture maps that can be effectively compressed using existing video compression algorithms. We demonstrate the performance of our algorithms on two 4D performance capture datasets, reducing bitrate by 66% for the same distortion, or alternatively reducing the distortion by 50% for the same bitrate, compared to the state-of-the-art.
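The role of the losslessly compressed signs can be sketched as follows: after the lossy decoder reconstructs TSDF magnitudes, the exact signs are re-applied, so every zero crossing stays inside the correct voxel and the surface error cannot exceed the voxel size. The snippet below is only a schematic of that recombination step; the function and argument names are hypothetical.

    import numpy as np

    def apply_lossless_signs(decoded_tsdf: np.ndarray,
                             true_signs: np.ndarray) -> np.ndarray:
        """Combine lossy-decoded TSDF magnitudes with losslessly stored signs.
        decoded_tsdf: reconstructed TSDF values from the neural decoder.
        true_signs:   +1/-1 per voxel, stored without loss."""
        return np.copysign(decoded_tsdf, true_signs)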
In this work, we present a modified fuzzy decision forest for real-time 3D object pose estimation based on typical template representation. We employ an extra preemptive background rejector node in the decision forest framework to terminate the examination of background locations as early as possible, resulting in a significant improvement in efficiency. Our approach is also scalable to large datasets, since the tree structure naturally provides logarithmic time complexity in the number of objects. Finally, we further reduce the validation stage with a fast breadth-first scheme. The results show that our approach outperforms the state of the art in efficiency while maintaining comparable accuracy.
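A toy sketch of the preemptive background rejector idea follows, with hypothetical Node fields and test callables that are not the authors' code: the tree walk stops as soon as a rejector node classifies the current location as background, so background pixels never pay the full traversal cost.

    class Node:
        def __init__(self, split_test=None, bg_test=None,
                     left=None, right=None, prediction=None):
            self.split_test = split_test   # patch -> bool: go to the left child?
            self.bg_test = bg_test         # optional patch -> bool: background?
            self.left = left
            self.right = right
            self.prediction = prediction   # leaf pose vote; None for inner nodes

    def evaluate(root, patch):
        """Traverse one tree, terminating early at a background rejector node."""
        node = root
        while node.prediction is None:
            if node.bg_test is not None and node.bg_test(patch):
                return None                # early exit: background location
            node = node.left if node.split_test(patch) else node.right
        return node.prediction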
In this paper we present Latent-Class Hough Forests, a method for object detection and 6 DoF pose estimation in heavily cluttered and occluded scenarios. We adapt a state-of-the-art template matching feature into a scale-invariant patch descriptor and integrate it into a regression forest using a novel template-based split function. We train with positive samples only and treat class distributions at the leaf nodes as latent variables. During testing we infer by iteratively updating these distributions, providing accurate estimation of background clutter and foreground occlusions and, thus, a better detection rate. Furthermore, as a by-product, our Latent-Class Hough Forests can provide accurate occlusion-aware segmentation masks, even in the multi-instance scenario. In addition to an existing public dataset, which contains only single-instance sequences with large amounts of clutter, we have collected two more challenging datasets for multiple-instance detection containing heavy 2D and 3D clutter as well as foreground occlusions. We provide extensive experiments on the various parameters of the framework, such as patch size, number of trees, and number of iterations to infer class distributions at test time. We also evaluate the Latent-Class Hough Forests on all datasets, where we outperform state-of-the-art methods.
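The iterative inference over latent class distributions can be pictured as a small alternating update. The following sketch is one hypothetical formulation: the vote_consistency measure, the Bayes-style per-patch update, and all names are illustrative assumptions rather than the published algorithm. Each leaf keeps a latent foreground probability, each test patch combines that prior with how well its Hough votes agree with the current object hypothesis, and the leaf distribution is then re-estimated from its patches.

    import numpy as np

    def infer_leaf_distributions(leaf_of_patch, vote_consistency,
                                 n_leaves, n_iters=5, init_fg=0.5, eps=1e-8):
        """leaf_of_patch[i]: index of the leaf reached by test patch i.
        vote_consistency[i] in [0, 1]: agreement of patch i's Hough votes with
        the current object hypothesis. Returns per-leaf foreground probabilities."""
        leaf_fg = np.full(n_leaves, init_fg)
        for _ in range(n_iters):
            prior = leaf_fg[leaf_of_patch]
            # Per-patch foreground posterior from the leaf prior and vote agreement.
            fg = prior * vote_consistency
            bg = (1.0 - prior) * (1.0 - vote_consistency)
            patch_fg = fg / (fg + bg + eps)
            # Re-estimate each leaf's latent distribution from its patches.
            for leaf in range(n_leaves):
                members = patch_fg[leaf_of_patch == leaf]
                if members.size:
                    leaf_fg[leaf] = members.mean()
        return leaf_fg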
