In medical volume visualization, one of the most important tasks is to reveal clinically relevant details of a 3D scan (CT, MRI, etc.), such as the coronary arteries, without obscuring them with less significant parts. These volume datasets contain different materials that are difficult to extract and visualize with 1D transfer functions based solely on the attenuation coefficient. Multi-dimensional transfer functions allow a much more precise classification of the data, which makes it easier to separate different surfaces from each other. Unfortunately, setting up multi-dimensional transfer functions can become a fairly complex task, generally accomplished by trial and error. This paper explains neural networks and then presents an efficient way to speed up the visualization process through semi-automatic transfer function generation. We describe how to use neural networks to detect distinctive features shown in the 2D histogram of the volume data and how to use this information for data classification.
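As a rough illustration of the 2D histogram mentioned above, the following sketch bins each interior voxel by intensity and gradient magnitude. The abstract does not say which two axes the histogram uses, so this axis choice is an assumption, as are the function name `joint_histogram`, the bin count, and the toy volume. Intensities are assumed to be non-negative.

```python
import math

def joint_histogram(volume, bins=4):
    """2D histogram of (intensity, gradient magnitude) over interior voxels.

    volume is a nested list indexed as volume[z][y][x] (hypothetical layout).
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    samples = []
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                # Central differences approximate the gradient at this voxel.
                gx = (volume[z][y][x + 1] - volume[z][y][x - 1]) / 2.0
                gy = (volume[z][y + 1][x] - volume[z][y - 1][x]) / 2.0
                gz = (volume[z + 1][y][x] - volume[z - 1][y][x]) / 2.0
                grad = math.sqrt(gx * gx + gy * gy + gz * gz)
                samples.append((volume[z][y][x], grad))
    # Normalise both axes by their maxima; fall back to 1.0 for a flat volume.
    max_value = max(s[0] for s in samples) or 1.0
    max_grad = max(s[1] for s in samples) or 1.0
    hist = [[0] * bins for _ in range(bins)]
    for value, grad in samples:
        i = min(int(value / max_value * bins), bins - 1)
        j = min(int(grad / max_grad * bins), bins - 1)
        hist[i][j] += 1
    return hist

# Toy volume: two materials (0 and 100) separated by a plane at x = 3.
toy = [[[0.0 if x < 3 else 100.0 for x in range(6)]
        for y in range(6)] for z in range(6)]
hist = joint_histogram(toy)
```

In this toy case, voxels next to the material boundary fall into the high-gradient column of `hist`, while homogeneous regions stay in the zero-gradient column. Structure of this kind is what a classifier could pick up on.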
We present Neural Splines, a technique for 3D surface reconstruction that is based on random feature kernels arising from infinitely-wide shallow ReLU networks. Our method achieves state-of-the-art results, outperforming recent neural network-based techniques and widely used Poisson Surface Reconstruction (which, as we demonstrate, can also be viewed as a type of kernel method). Because our approach is based on a simple kernel formulation, it is easy to analyze and can be accelerated by general techniques designed for kernel-based learning. We provide explicit analytical expressions for our kernel and argue that our formulation can be seen as a generalization of cubic spline interpolation to higher dimensions. In particular, the RKHS norm associated with Neural Splines biases toward smooth interpolants.
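To make the link between wide ReLU networks and kernels concrete, here is a minimal sketch. It compares the average of products of random ReLU features with its closed form, which is the order-1 arc-cosine kernel up to a constant factor. It assumes bias-free features with standard Gaussian weights. This is a textbook setting and not necessarily the kernel derived in the paper, whose explicit expressions the abstract does not give. The function names are hypothetical.

```python
import math
import random

def relu_kernel_exact(x, y):
    """Closed form of E[relu(w.x) * relu(w.y)] for w ~ N(0, I)."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cos_t = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    theta = math.acos(max(-1.0, min(1.0, cos_t)))  # clamp for rounding safety
    angular = math.sin(theta) + (math.pi - theta) * math.cos(theta)
    return norm_x * norm_y * angular / (2.0 * math.pi)

def relu_kernel_random_features(x, y, width, rng):
    """Monte Carlo estimate using `width` random ReLU features."""
    total = 0.0
    for _ in range(width):
        w = [rng.gauss(0.0, 1.0) for _ in x]
        fx = max(0.0, sum(wi * a for wi, a in zip(w, x)))
        fy = max(0.0, sum(wi * b for wi, b in zip(w, y)))
        total += fx * fy
    return total / width

rng = random.Random(0)
x, y = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
exact = relu_kernel_exact(x, y)
approx = relu_kernel_random_features(x, y, width=100_000, rng=rng)
print(f"exact={exact:.4f} approx={approx:.4f}")
```

As the width grows, the random-feature estimate approaches the closed form. This is the sense in which an infinitely wide shallow network corresponds to a kernel method.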
We propose a method for constructing generative models of 3D objects from a single 3D mesh. Our method produces a 3D morphable model that represents shape and albedo in terms of Gaussian processes. We define the shape deformations in physical (3D) space and the albedo deformations as a combination of physical-space and color-space deformations. Whereas previous approaches have typically built 3D morphable models from multiple high-quality 3D scans through principal component analysis, we build 3D morphable models from a single scan or template. We demonstrate the utility of these models in the domain of face modeling through inverse rendering and registration tasks. Specifically, we show that our approach can be used to perform face recognition using only a single 3D scan (one scan total, not one per person), and further demonstrate how multiple scans can be incorporated to improve performance without requiring dense correspondence. Our approach enables the synthesis of 3D morphable models for 3D object categories where dense correspondence between multiple scans is unavailable. We demonstrate this by constructing additional 3D morphable models for fish and birds and use them to perform simple inverse rendering tasks.
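The idea of defining shape deformations through a Gaussian process can be sketched as follows: draw a smooth random displacement for every vertex of a template and add it to the vertex positions. The squared-exponential kernel, its parameters, the independence of the three coordinates, and all function names are assumptions made for illustration. The abstract does not state which kernels the method uses.

```python
import math
import random

def rbf_kernel(points, length_scale, variance):
    """Squared-exponential covariance between all pairs of vertices."""
    n = len(points)
    return [[variance * math.exp(-math.dist(points[i], points[j]) ** 2
                                 / (2.0 * length_scale ** 2))
             for j in range(n)] for i in range(n)]

def cholesky(matrix, jitter=1e-8):
    """Lower-triangular factor of (matrix + jitter * I)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] + jitter - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def sample_deformation(points, length_scale=1.0, variance=0.01, rng=random):
    """One 3D displacement per vertex; each coordinate is an independent GP draw."""
    lower = cholesky(rbf_kernel(points, length_scale, variance))
    n = len(points)
    columns = []
    for _ in range(3):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Multiplying white noise by the Cholesky factor gives correlated noise.
        columns.append([sum(lower[i][k] * z[k] for k in range(i + 1))
                        for i in range(n)])
    return [tuple(col[i] for col in columns) for i in range(n)]

# Toy template with four vertices; a real template mesh would have thousands.
template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
offsets = sample_deformation(template, rng=random.Random(0))
deformed = [tuple(p + d for p, d in zip(vertex, offset))
            for vertex, offset in zip(template, offsets)]
```

Each call produces a new plausible shape from the single template. A larger `length_scale` makes nearby vertices move together more strongly.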
This paper presents new designs of graph convolutional neural networks (GCNs) on 3D meshes for 3D object segmentation and classification. We use the faces of the mesh as basic processing units and represent a 3D mesh as a graph where each node corresponds to a face. To enhance the descriptive power of the graph, we introduce a 1-ring face neighbourhood structure to derive novel multi-dimensional spatial and structure features to represent the graph nodes. Based on this new graph representation, we then design a densely connected graph convolutional block which aggregates local and regional features as the key construction component to build effective and efficient practical GCN models for 3D object classification and segmentation. We present experimental results showing that our new technique outperforms the state of the art: our models have the smallest number of parameters and consistently achieve the highest accuracies across a number of benchmark datasets. We also present ablation studies to demonstrate the soundness of our design principles and the effectiveness of our practical models.
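The face-based graph representation can be sketched as below: every triangle becomes a node, and two nodes are connected when their triangles share an edge. Treating the 1-ring face neighbourhood as edge-sharing faces is an assumption, since the abstract does not define it precisely. The function name `face_adjacency` is hypothetical.

```python
from collections import defaultdict

def face_adjacency(faces):
    """Map each face index to the set of faces that share an edge with it.

    faces is a list of (a, b, c) vertex-index triples.
    """
    edge_to_faces = defaultdict(list)
    for f, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            # Sort the endpoints so both orientations map to the same key.
            edge_to_faces[(min(u, v), max(u, v))].append(f)
    neighbours = {f: set() for f in range(len(faces))}
    for shared in edge_to_faces.values():
        for f in shared:
            neighbours[f].update(g for g in shared if g != f)
    return neighbours

# A tetrahedron: four triangular faces, each adjacent to the other three.
tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (0, 2, 3)]
graph = face_adjacency(tetra)
```

On a closed manifold triangle mesh every node has exactly three neighbours, which gives the graph a regular structure that a convolution can exploit.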
We report on an exploratory project aimed at performing immersive 3D visualization of astronomical data, starting with spectral-line radio data cubes from galaxies. This work is done as a collaboration between the Department of Physics and Astronomy and the Department of Computer Science at the University of Manitoba. We are building our prototype using the 3D engine Unity, because of its ease of use for integration with advanced displays such as a CAVE environment, a zSpace tabletop, or virtual reality headsets. We address general issues regarding 3D visualization, such as loading and converting astronomy data, performing volume rendering on the GPU, and producing physically meaningful visualizations using principles of visual literacy. We discuss some challenges to be met when designing a user interface that allows us to take advantage of this new way of exploring data. We hope to lay the foundations for an innovative framework useful for all astronomers who use spectral-line data cubes, and encourage interested parties to join our efforts. This pilot project addresses the challenges presented by frontier astronomy experiments, such as the Square Kilometre Array and its precursors.
We propose a novel approach for 3D video synthesis that is able to represent multi-view video recordings of a dynamic real-world scene in a compact, yet expressive representation that enables high-quality view synthesis and motion interpolation. Our approach takes the high quality and compactness of static neural radiance fields in a new direction: to a model-free, dynamic setting. At the core of our approach is a novel time-conditioned neural radiance field that represents scene dynamics using a set of compact latent codes. To exploit the fact that changes between adjacent frames of a video are typically small and locally consistent, we propose two novel strategies for efficient training of our neural network: 1) an efficient hierarchical training scheme, and 2) an importance sampling strategy that selects the next rays for training based on the temporal variation of the input videos. In combination, these two strategies significantly boost the training speed, lead to fast convergence of the training process, and enable high-quality results. Our learned representation is highly compact and able to represent a 10-second, 30 FPS multi-view video recording by 18 cameras with a model size of just 28 MB. We demonstrate that our method can render high-fidelity wide-angle novel views at over 1K resolution, even for highly complex and dynamic scenes. We perform an extensive qualitative and quantitative evaluation that shows that our approach outperforms the current state of the art. We include additional video and information at: https://neural-3d-video.github.io/
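The importance sampling strategy described above can be illustrated with a small sketch that picks pixels (one ray per pixel) with probability proportional to how much they change over time. The weighting formula, the `floor` term that keeps static pixels from being ignored entirely, the choice of reference frame, and the function names are all assumptions. The abstract only says that rays are selected based on temporal variation.

```python
import random

def temporal_weights(frame, reference, floor=0.05):
    """Normalised per-pixel weights from the absolute difference to a reference.

    frame and reference are flat lists of grayscale intensities; the reference
    could be, for example, the previous frame (a hypothetical choice).
    """
    diffs = [abs(p - q) + floor for p, q in zip(frame, reference)]
    total = sum(diffs)
    return [d / total for d in diffs]

def sample_pixels(weights, count, rng):
    """Draw pixel indices (with replacement) in proportion to their weights."""
    return rng.choices(range(len(weights)), weights=weights, k=count)

# Toy example: eight pixels, only pixel 5 changes between the two frames.
reference = [0.2] * 8
frame = list(reference)
frame[5] = 1.0
weights = temporal_weights(frame, reference)
picked = sample_pixels(weights, count=1000, rng=random.Random(0))
```

Most of the drawn indices land on the changing pixel, so training effort concentrates on the dynamic parts of the scene while static regions are still visited occasionally.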