No Arabic abstract
This paper proposes a novel message passing neural (MPN) architecture Conv-MPN, which reconstructs an outdoor building as a planar graph from a single RGB image. Conv-MPN is specifically designed for cases where nodes of a graph have explicit spatial embedding. In our problem, nodes correspond to building edges in an image. Conv-MPN is different from MPN in that 1) the feature associated with a node is represented as a feature volume instead of a 1D vector; and 2) convolutions encode messages instead of fully connected layers. Conv-MPN learns to select a true subset of nodes (i.e., building edges) to reconstruct a building planar graph. Our qualitative and quantitative evaluations over 2,000 buildings show that Conv-MPN makes significant improvements over the existing fully neural solutions. We believe that the paper has a potential to open a new line of graph neural network research for structured geometry reconstruction.
This paper presents an explore-and-classify framework for structured architectural reconstruction from an aerial image. Starting from a potentially imperfect building reconstruction by an existing algorithm, our approach 1) explores the space of building models by modifying the reconstruction via heuristic actions; 2) learns to classify the correctness of building models while generating classification labels based on the ground-truth, and 3) repeat. At test time, we iterate exploration and classification, seeking for a result with the best classification score. We evaluate the approach using initial reconstructions by two baselines and two state-of-the-art reconstruction algorithms. Qualitative and quantitative evaluations demonstrate that our approach consistently improves the reconstruction quality from every initial reconstruction.
This study follows many classical approaches to multi-object tracking (MOT) that model the problem using dynamic graphical data structures, and adapts this formulation to make it amenable to modern neural networks. Our main contributions in this work are the creation of a framework based on dynamic undirected graphs that represent the data association problem over multiple timesteps, and a message passing graph neural network (MPNN) that operates on these graphs to produce the desired likelihood for every association therein. We also provide solutions and propositions for the computational problems that need to be addressed to create a memory-efficient, real-time, online algorithm that can reason over multiple timesteps, correct previous mistakes, update beliefs, and handle missed/false detections. To demonstrate the efficacy of our approach, we only use the 2D box location and object category ID to construct the descriptor for each object instance. Despite this, our model performs on par with state-of-the-art approaches that make use of additional sensors, as well as multiple hand-crafted and/or learned features. This illustrates that given the right problem formulation and model design, raw bounding boxes (and their kinematics) from any off-the-shelf detector are sufficient to achieve competitive tracking results on challenging MOT benchmarks.
Precise destination prediction of taxi trajectories can benefit many intelligent location based services such as accurate ad for passengers. Traditional prediction approaches, which treat trajectories as one-dimensional sequences and process them in single scale, fail to capture the diverse two-dimensional patterns of trajectories in different spatial scales. In this paper, we propose T-CONV which models trajectories as two-dimensional images, and adopts multi-layer convolutional neural networks to combine multi-scale trajectory patterns to achieve precise prediction. Furthermore, we conduct gradient analysis to visualize the multi-scale spatial patterns captured by T-CONV and extract the areas with distinct influence on the ultimate prediction. Finally, we integrate multiple local enhancement convolutional fields to explore these important areas deeply for better prediction. Comprehensive experiments based on real trajectory data show that T-CONV can achieve higher accuracy than the state-of-the-art methods.
Graph neural networks (GNNs) emerged recently as a standard toolkit for learning from data on graphs. Current GNN designing works depend on immense human expertise to explore different message-passing mechanisms, and require manual enumeration to determine the proper message-passing depth. Inspired by the strong searching capability of neural architecture search (NAS) in CNN, this paper proposes Graph Neural Architecture Search (GNAS) with novel-designed search space. The GNAS can automatically learn better architecture with the optimal depth of message passing on the graph. Specifically, we design Graph Neural Architecture Paradigm (GAP) with tree-topology computation procedure and two types of fine-grained atomic operations (feature filtering and neighbor aggregation) from message-passing mechanism to construct powerful graph network search space. Feature filtering performs adaptive feature selection, and neighbor aggregation captures structural information and calculates neighbors statistics. Experiments show that our GNAS can search for better GNNs with multiple message-passing mechanisms and optimal message-passing depth. The searched network achieves remarkable improvement over state-of-the-art manual designed and search-based GNNs on five large-scale datasets at three classical graph tasks. Codes can be found at https://github.com/phython96/GNAS-MP.
PET image reconstruction is challenging due to the ill-poseness of the inverse problem and limited number of detected photons. Recently deep neural networks have been widely and successfully used in computer vision tasks and attracted growing interests in medical imaging. In this work, we trained a deep residual convolutional neural network to improve PET image quality by using the existing inter-patient information. An innovative feature of the proposed method is that we embed the neural network in the iterative reconstruction framework for image representation, rather than using it as a post-processing tool. We formulate the objective function as a constraint optimization problem and solve it using the alternating direction method of multipliers (ADMM) algorithm. Both simulation data and hybrid real data are used to evaluate the proposed method. Quantification results show that our proposed iterative neural network method can outperform the neural network denoising and conventional penalized maximum likelihood methods.