Localizing dexterous surgical tools in X-ray for image-based navigation

114 0 0.0 ( 0 )

Download Cite

Added by Cong Gao

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors Cong Gao - Mathias Unberath - Russell Taylor

Computer Vision and Pattern Recognition

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

X-ray image based surgical tool navigation is fast and supplies accurate images of deep seated structures. Typically, recovering the 6 DOF rigid pose and deformation of tools with respect to the X-ray camera can be accurately achieved through intensity-based 2D/3D registration of 3D images or models to 2D X-rays. However, the capture range of image-based 2D/3D registration is inconveniently small suggesting that automatic and robust initialization strategies are of critical importance. This manuscript describes a first step towards leveraging semantic information of the imaged object to initialize 2D/3D registration within the capture range of image-based registration by performing concurrent segmentation and localization of dexterous surgical tools in X-ray images. We presented a learning-based strategy to simultaneously localize and segment dexterous surgical tools in X-ray images and demonstrate promising performance on synthetic and ex vivo data. We currently investigate methods to use semantic information extracted by the proposed network to reliably and robustly initialize image-based 2D/3D registration. While image-based 2D/3D registration has been an obvious focus of the CAI community, robust initialization thereof (albeit critical) has largely been neglected. This manuscript discusses learning-based retrieval of semantic information on imaged-objects as a stepping stone for such initialization and may therefore be of interest to the IPCAI community. Since results are still preliminary and only focus on localization, we target the Long Abstract category.

rate research

Image Compositing for Segmentation of Surgical Tools without Manual Annotations

143 - Luis C. Garcia-Peraza-Herrera , Lucas Fidon , Claudia DEttorre 2021

Producing manual, pixel-accurate, image segmentation labels is tedious and time-consuming. This is often a rate-limiting factor when large amounts of labeled images are required, such as for training deep convolutional networks for instrument-background segmentation in surgical scenes. No large datasets comparable to industry standards in the computer vision community are available for this task. To circumvent this problem, we propose to automate the creation of a realistic training dataset by exploiting techniques stemming from special effects and harnessing them to target training performance rather than visual appeal. Foreground data is captured by placing sample surgical instruments over a chroma key (a.k.a. green screen) in a controlled environment, thereby making extraction of the relevant image segment straightforward. Multiple lighting conditions and viewpoints can be captured and introduced in the simulation by moving the instruments and camera and modulating the light source. Background data is captured by collecting videos that do not contain instruments. In the absence of pre-existing instrument-free background videos, minimal labeling effort is required, just to select frames that do not contain surgical instruments from videos of surgical interventions freely available online. We compare different methods to blend instruments over tissue and propose a novel data augmentation approach that takes advantage of the plurality of options. We show that by training a vanilla U-Net on semi-synthetic data only and applying a simple post-processing, we are able to match the results of the same network trained on a publicly available manually labeled real dataset.

Computer Vision and Pattern Recognition

Surgical navigation systems based on augmented reality technologies

206 - Vladimir Ivanov , Anton Krivtsov , Sergey Strelkov 2021

This study considers modern surgical navigation systems based on augmented reality technologies. Augmented reality glasses are used to construct holograms of the patients organs from MRI and CT data, subsequently transmitted to the glasses. This, in addition to seeing the actual patient, the surgeon gains visualization inside the patients body (bones, soft tissues, blood vessels, etc.). The solutions developed at Peter the Great St. Petersburg Polytechnic University allow reducing the invasiveness of the procedure and preserving healthy tissues. This also improves the navigation process, making it easier to estimate the location and size of the tumor to be removed. We describe the application of developed systems to different types of surgical operations (removal of a malignant brain tumor, removal of a cyst of the cervical spine). We consider the specifics of novel navigation systems designed for anesthesia, for endoscopic operations. Furthermore, we discuss the construction of novel visualization systems for ultrasound machines. Our findings indicate that the technologies proposed show potential for telemedicine.

Human-Computer Interaction

Kinematic Parameter Optimization of a Miniaturized Surgical Instrument Based on Dexterous Workspace Determination

146 - Xin Zhi , Weibang Bai , 2021

Miniaturized instruments are highly needed for robot assisted medical healthcare and treatment, especially for less invasive surgery as it empowers more flexible access to restricted anatomic intervention. But the robotic design is more challenging due to the contradictory needs of miniaturization and the capability of manipulating with a large dexterous workspace. Thus, kinematic parameter optimization is of great significance in this case. To this end, this paper proposes an approach based on dexterous workspace determination for designing a miniaturized tendon-driven surgical instrument under necessary restraints. The workspace determination is achieved by boundary determination and volume estimation with partition and least-squares polynomial fitting methods. The final robotic configuration with optimized kinematic parameters is proved to be eligible with a large enough dexterous workspace and targeted miniature size.

Robotics

Mapping, Localization and Path Planning for Image-based Navigation using Visual Features and Map

190 - Janine Thoma , Danda Pani Paudel , Ajad Chhatkuli 2018

Building on progress in feature representations for image retrieval, image-based localization has seen a surge of research interest. Image-based localization has the advantage of being inexpensive and efficient, often avoiding the use of 3D metric maps altogether. That said, the need to maintain a large number of reference images as an effective support of localization in a scene, nonetheless calls for them to be organized in a map structure of some kind. The problem of localization often arises as part of a navigation process. We are, therefore, interested in summarizing the reference images as a set of landmarks, which meet the requirements for image-based navigation. A contribution of this paper is to formulate such a set of requirements for the two sub-tasks involved: map construction and self-localization. These requirements are then exploited for compact map representation and accurate self-localization, using the framework of a network flow problem. During this process, we formulate the map construction and self-localization problems as convex quadratic and second-order cone programs, respectively. We evaluate our methods on publicly available indoor and outdoor datasets, where they outperform existing methods significantly.

Computer Vision and Pattern Recognition

Memory-Augmented Reinforcement Learning for Image-Goal Navigation

342 - Lina Mezghani , Sainbayar Sukhbaatar , Thibaut Lavril 2021

In this work, we present a memory-augmented approach for image-goal navigation. Earlier attempts, including RL-based and SLAM-based approaches have either shown poor generalization performance, or are heavily-reliant on pose/depth sensors. Our method uses an attention-based end-to-end model that leverages an episodic memory to learn to navigate. First, we train a state-embedding network in a self-supervised fashion, and then use it to embed previously-visited states into the agents memory. Our navigation policy takes advantage of this information through an attention mechanism. We validate our approach with extensive evaluations, and show that our model establishes a new state of the art on the challenging Gibson dataset. Furthermore, we achieve this impressive performance from RGB input alone, without access to additional information such as position or depth, in stark contrast to related work.

Computer Vision and Pattern Recognition