Digital Twin is an emerging technology at the forefront of Industry 4.0, with the ultimate goal of combining the physical space and the virtual space. To date, the Digital Twin concept has been applied in many engineering fields, providing useful insights in engineering design, manufacturing, automation, and construction. While the nexus of various technologies opens up new opportunities with Digital Twin, the technology requires a framework to integrate the different technologies, such as the Building Information Model used in the building and construction industry. In this work, an Information Fusion framework is proposed to seamlessly fuse the heterogeneous components of a Digital Twin framework across the variety of technologies involved. This study aims to augment Digital Twin in buildings with the use of AI and 3D reconstruction empowered by unmanned aerial vehicles (UAVs). We propose a drone-based Digital Twin augmentation framework with reusable and customisable components. A proof of concept is also developed, and an extensive evaluation is conducted for 3D reconstruction and for applications of AI in defect detection.
In this paper, we present our solution for the IJCAI-PRICAI-20 3D AI Challenge: 3D Object Reconstruction from A Single Image. We develop a variant of AtlasNet that consumes single 2D images and generates 3D point clouds through 2D-to-3D mapping. To push the performance to the limit and present guidance on crucial implementation choices, we conduct extensive experiments to analyze the influence of decoder design and different settings on the normalization, projection, and sampling methods. Our method achieves 2nd place in the final track with a score of 70.88, a chamfer distance of 36.87, and a mean f-score of 59.18. The source code of our method will be available at https://github.com/em-data/Enhanced_AtlasNet_3DReconstruction.
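The chamfer distance and f-score reported in the preceding abstract are standard point-cloud reconstruction metrics. The following is a minimal sketch of one common formulation, not the challenge's official scoring code: the use of squared distances, the absence of any rescaling, and the `threshold` parameter are all assumptions, since the abstract does not state the exact definitions.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between every point in a (N,3) and b (M,3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff * diff, axis=-1)


def chamfer_distance(pred, gt):
    """Symmetric chamfer distance (assumed form: mean squared nearest-neighbour
    distance from pred to gt plus the same from gt to pred)."""
    d = pairwise_sq_dists(pred, gt)
    return d.min(axis=1).mean() + d.min(axis=0).mean()


def f_score(pred, gt, threshold):
    """Harmonic mean of precision and recall at a distance threshold.

    precision: fraction of predicted points within `threshold` of some gt point
    recall:    fraction of gt points within `threshold` of some predicted point
    The threshold value is a free parameter here (assumption).
    """
    d = np.sqrt(pairwise_sq_dists(pred, gt))
    precision = np.mean(d.min(axis=1) < threshold)
    recall = np.mean(d.min(axis=0) < threshold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

The brute-force distance matrix keeps the sketch short but needs O(NM) memory; for large clouds a k-d tree nearest-neighbour query would be the usual replacement.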
Explainable Artificial Intelligence (XAI) has in recent years become a well-suited framework for generating human-understandable explanations of black box models. In this paper, we present a novel XAI visual explanation algorithm, denoted SIDU, that can effectively localize the entire object regions responsible for a prediction to their full extent. We analyze its robustness and effectiveness through various computational and human subject experiments. In particular, we assess the SIDU algorithm using three different types of evaluation (Application, Human and Functionally-Grounded) to demonstrate its superior performance. The robustness of SIDU is further studied in the presence of adversarial attacks on black box models to better understand its performance.
Central to the concept of multi-domain operations (MDO) is the utilization of an intelligence, surveillance, and reconnaissance (ISR) network consisting of overlapping systems of remote and autonomous sensors, and human intelligence, distributed among multiple partners. Realising this concept requires advancement in both artificial intelligence (AI) for improved distributed data analytics and intelligence augmentation (IA) for improved human-machine cognition. The contribution of this paper is threefold: (1) we map the coalition situational understanding (CSU) concept to MDO ISR requirements, paying particular attention to the need for assured and explainable AI to allow robust human-machine decision-making where assets are distributed among multiple partners; (2) we present illustrative vignettes for AI and IA in MDO ISR, including human-machine teaming, dense urban terrain analysis, and enhanced asset interoperability; (3) we appraise the state-of-the-art in explainable AI in relation to the vignettes with a focus on human-machine collaboration to achieve more rapid and agile coalition decision-making. The union of these three elements is intended to show the potential value of a CSU approach in the context of MDO ISR, grounded in three distinct use cases, highlighting how the need for explainability in the multi-partner coalition setting is key.
It is counter-intuitive that multi-modality methods based on point clouds and images perform only marginally better, or sometimes worse, than approaches that use point clouds alone. This paper investigates the reason behind this phenomenon. Because multi-modality data augmentation must maintain consistency between the point cloud and the images, recent methods in this field typically use relatively insufficient data augmentation. This shortage leaves their performance below expectation. Therefore, we contribute a pipeline, named transformation flow, to bridge the gap between single- and multi-modality data augmentation with transformation reversing and replaying. In addition, considering occlusions, a point in different modalities may be occupied by different objects, making augmentations such as cut and paste non-trivial for multi-modality detection. We further present Multi-mOdality Cut and pAste (MoCa), which simultaneously considers occlusion and physical plausibility to maintain multi-modality consistency. Without using an ensemble of detectors, our multi-modality detector achieves new state-of-the-art performance on the nuScenes dataset and competitive performance on the KITTI 3D benchmark. Our method also wins the best PKL award in the 3rd nuScenes detection challenge. Code and models will be released at https://github.com/open-mmlab/mmdetection3d.
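One way to read "transformation reversing and replaying" in the preceding abstract is as bookkeeping over the geometric augmentations applied to the point cloud: each transform is recorded so that augmented points can be mapped back to the original sensor frame (where the camera calibration is valid) and so that the same sequence can be re-applied later. The sketch below illustrates that idea only. The class name `TransformFlow`, its methods, and the restriction to 4x4 homogeneous matrices are hypothetical choices made here, not the authors' implementation or the mmdetection3d API.

```python
import numpy as np


def apply_transform(matrix, points):
    """Apply a 4x4 homogeneous transform to an (N, 3) array of points."""
    hom = np.hstack([points, np.ones((points.shape[0], 1))])
    return (hom @ matrix.T)[:, :3]


def rotation_z(angle):
    """4x4 rotation about the z axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    m = np.eye(4)
    m[:2, :2] = [[c, -s], [s, c]]
    return m


def scaling(factor):
    """4x4 uniform scaling."""
    m = np.eye(4)
    m[:3, :3] *= factor
    return m


class TransformFlow:
    """Hypothetical recorder for point-cloud augmentations (illustration only)."""

    def __init__(self):
        self._matrices = []

    def record(self, matrix):
        """Remember a transform in the order it was applied."""
        self._matrices.append(np.asarray(matrix, dtype=float))

    def replay(self, points):
        """Re-apply the recorded transforms in their original order."""
        for m in self._matrices:
            points = apply_transform(m, points)
        return points

    def reverse(self, points):
        """Undo the recorded transforms: inverses applied in reverse order."""
        for m in reversed(self._matrices):
            points = apply_transform(np.linalg.inv(m), points)
        return points
```

Under this reading, a detector could reverse augmented points to the original frame, project them into the image with the unmodified calibration to fetch image features, and keep training on the augmented geometry.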
Drones, or general UAVs, equipped with a single camera have been widely deployed in a broad range of applications, such as aerial photography, fast goods delivery and, most importantly, surveillance. Despite the great progress achieved in computer vision, existing algorithms are usually not optimized for images or video sequences acquired by drones, due to challenges such as occlusion, fast camera motion and pose variation. In this paper, a drone-based multi-object tracking and 3D localization scheme is proposed, built on deep-learning-based object detection. We first apply a multi-object tracking method called TrackletNet Tracker (TNT), which utilizes temporal and appearance information, to track detected objects located on the ground for UAV applications. We then localize the tracked ground objects based on the ground plane estimated with the Multi-View Stereo technique. The system deployed on the drone can not only detect and track the objects in a scene, but can also localize their 3D coordinates in meters with respect to the drone camera. Experiments show that our tracker can reliably handle most of the detected objects captured by drones and achieves favorable 3D localization performance when compared with state-of-the-art methods.
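Localizing a tracked ground object from a single camera and an estimated ground plane, as described in the preceding abstract, can be illustrated by intersecting the pixel's viewing ray with the plane. This is a standard geometric sketch under stated assumptions, not the paper's pipeline: the function name `localize_on_ground`, the pinhole intrinsics `K`, and the plane parameterisation `n . X + d = 0` in camera coordinates (already in metric scale) are all assumed here.

```python
import numpy as np


def localize_on_ground(pixel, K, plane_normal, plane_offset):
    """Return the 3D point (camera frame) where a pixel's ray meets the ground.

    pixel        : (u, v) image coordinates, e.g. the bottom centre of a box
    K            : 3x3 pinhole intrinsic matrix (assumed known)
    plane_normal : normal n of the ground plane n . X + d = 0 (camera frame)
    plane_offset : offset d of that plane, in the same units as the result
    Returns None if the ray is parallel to the plane or hits it behind the camera.
    """
    u, v = pixel
    # Back-project the pixel to a ray direction: X = t * K^{-1} [u, v, 1]^T
    ray = np.linalg.solve(K, np.array([u, v, 1.0]))
    denom = float(np.dot(plane_normal, ray))
    if abs(denom) < 1e-9:
        return None  # ray parallel to the ground plane
    t = -plane_offset / denom
    if t <= 0:
        return None  # intersection lies behind the camera
    return t * ray
```

For example, with a camera whose y axis points down and a ground plane 2 m below it (`plane_normal = [0, 1, 0]`, `plane_offset = -2`), pixels below the horizon map to points with y = 2, while pixels at or above the horizon return `None`.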