No Arabic abstract
Suboptimal interaction with patient data and challenges in mastering 3D anatomy based on ill-posed 2D interventional images are essential concerns in image-guided therapies. Augmented reality (AR) has been introduced in the operating rooms in the last decade; however, in image-guided interventions, it has often only been considered as a visualization device improving traditional workflows. As a consequence, the technology is gaining minimum maturity that it requires to redefine new procedures, user interfaces, and interactions. The main contribution of this paper is to reveal how exemplary workflows are redefined by taking full advantage of head-mounted displays when entirely co-registered with the imaging system at all times. The proposed AR landscape is enabled by co-localizing the users and the imaging devices via the operating room environment and exploiting all involved frustums to move spatial information between different bodies. The awareness of the system from the geometric and physical characteristics of X-ray imaging allows the redefinition of different human-machine interfaces. We demonstrate that this AR paradigm is generic, and can benefit a wide variety of procedures. Our system achieved an error of $4.76pm2.91$ mm for placing K-wire in a fracture management procedure, and yielded errors of $1.57pm1.16^circ$ and $1.46pm1.00^circ$ in the abduction and anteversion angles, respectively, for total hip arthroplasty. We hope that our holistic approach towards improving the interface of surgery not only augments the surgeons capabilities but also augments the surgical teams experience in carrying out an effective intervention with reduced complications and provide novel approaches of documenting procedures for training purposes.
Augmented reality has the potential to improve operating room workflow by allowing physicians to see inside a patient through the projection of imaging directly onto the surgical field. For this to be useful the acquired imaging must be quickly and accurately registered with patient and the registration must be maintained. Here we describe a method for projecting a CT scan with Microsoft Hololens and then aligning that projection to a set of fiduciary markers. Radio-opaque stickers with unique QR-codes are placed on an object prior to acquiring a CT scan. The location of the markers in the CT scan are extracted and the CT scan is converted into a 3D surface object. The 3D object is then projected using the Hololens onto a table on which the same markers are placed. We designed an algorithm that aligns the markers on the 3D object with the markers on the table. To extract the markers and convert the CT into a 3D object took less than 5 seconds. To align three markers, it took $0.9 pm 0.2$ seconds to achieve an accuracy of $5 pm 2$ mm. These findings show that it is feasible to use a combined radio-opaque optical marker, placed on a patient prior to a CT scan, to subsequently align the acquired CT scan with the patient.
We design and develop a new shared Augmented Reality (AR) workspace for Human-Robot Interaction (HRI), which establishes a bi-directional communication between human agents and robots. In a prototype system, the shared AR workspace enables a shared perception, so that a physical robot not only perceives the virtual elements in its own view but also infers the utility of the human agent--the cost needed to perceive and interact in AR--by sensing the human agents gaze and pose. Such a new HRI design also affords a shared manipulation, wherein the physical robot can control and alter virtual objects in AR as an active agent; crucially, a robot can proactively interact with human agents, instead of purely passively executing received commands. In experiments, we design a resource collection game that qualitatively demonstrates how a robot perceives, processes, and manipulates in AR and quantitatively evaluates the efficacy of HRI using the shared AR workspace. We further discuss how the system can potentially benefit future HRI studies that are otherwise challenging.
In this paper, we present Retargetable AR, a novel AR framework that yields an AR experience that is aware of scene contexts set in various real environments, achieving natural interaction between the virtual and real worlds. To this end, we characterize scene contexts with relationships among objects in 3D space, not with coordinates transformations. A context assumed by an AR content and a context formed by a real environment where users experience AR are represented as abstract graph representations, i.e. scene graphs. From RGB-D streams, our framework generates a volumetric map in which geometric and semantic information of a scene are integrated. Moreover, using the semantic map, we abstract scene objects as oriented bounding boxes and estimate their orientations. With such a scene representation, our framework constructs, in an online fashion, a 3D scene graph characterizing the context of a real environment for AR. The correspondence between the constructed graph and an AR scene graph denoting the context of AR content provides a semantically registered content arrangement, which facilitates natural interaction between the virtual and real worlds. We performed extensive evaluations on our prototype system through quantitative evaluation of the performance of the oriented bounding box estimation, subjective evaluation of the AR content arrangement based on constructed 3D scene graphs, and an online AR demonstration. The results of these evaluations showed the effectiveness of our framework, demonstrating that it can provide a context-aware AR experience in a variety of real scenes.
We present a congestion-aware routing solution for indoor evacuation, which produces real-time individual-customized evacuation routes among multiple destinations while keeping tracks of all evacuees locations. A population density map, obtained on-the-fly by aggregating locations of evacuees from user-end Augmented Reality (AR) devices, is used to model the congestion distribution inside a building. To efficiently search the evacuation route among all destinations, a variant of A* algorithm is devised to obtain the optimal solution in a single pass. In a series of simulated studies, we show that the proposed algorithm is more computationally optimized compared to classic path planning algorithms; it generates a more time-efficient evacuation route for each individual that minimizes the overall congestion. A complete system using AR devices is implemented for a pilot study in real-world environments, demonstrating the efficacy of the proposed approach.
Patch-based methods and deep networks have been employed to tackle image inpainting problem, with their own strengths and weaknesses. Patch-based methods are capable of restoring a missing region with high-quality texture through searching nearest neighbor patches from the unmasked regions. However, these methods bring problematic contents when recovering large missing regions. Deep networks, on the other hand, show promising results in completing large regions. Nonetheless, the results often lack faithful and sharp details that resemble the surrounding area. By bringing together the best of both paradigms, we propose a new deep inpainting framework where texture generation is guided by a texture memory of patch samples extracted from unmasked regions. The framework has a novel design that allows texture memory retrieval to be trained end-to-end with the deep inpainting network. In addition, we introduce a patch distribution loss to encourage high-quality patch synthesis. The proposed method shows superior performance both qualitatively and quantitatively on three challenging image benchmarks, i.e., Places, CelebA-HQ, and Paris Street-View datasets.