
Surgical Gesture Recognition Based on Bidirectional Multi-Layer Independently RNN with Explainable Spatial Feature Extraction

Added by Dandan Zhang
Publication date: 2021
Language: English





Minimally invasive surgery mainly consists of a series of sub-tasks, which can be decomposed into basic gestures or contexts. As a prerequisite for autonomous operation, surgical gesture recognition can assist motion planning and decision-making and build up context-aware knowledge to improve the control quality of surgical robots. In this work, we aim to develop an effective surgical gesture recognition approach with an explainable feature extraction process. A Bidirectional Multi-Layer independently RNN (BML-indRNN) model is proposed in this paper, while spatial feature extraction is implemented by fine-tuning a Deep Convolutional Neural Network (DCNN) model built on the VGG architecture. To eliminate the black-box effects of the DCNN, Gradient-weighted Class Activation Mapping (Grad-CAM) is employed; it provides explainable results by highlighting the regions of the surgical images that are strongly related to the gesture classification results. The proposed method was evaluated on the suturing task with data from the publicly available JIGSAWS database. Comparative studies were conducted to verify the proposed framework. Results indicate that the testing accuracy for the suturing task based on our proposed method is 87.13%, which outperforms most state-of-the-art algorithms.
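The abstract gives no implementation details, so the following is a minimal, hypothetical sketch rather than the authors' code. It illustrates the two named ingredients: per-frame spatial features taken from a fine-tuned VGG backbone, and a bidirectional multi-layer stack of IndRNN-style layers, whose recurrent weight is element-wise so that each hidden unit only receives its own previous state. All class names, feature sizes and hyperparameters are assumptions.

```python
# Minimal sketch (not the authors' implementation): per-frame spatial features
# from a fine-tuned VGG backbone feeding a bidirectional multi-layer IndRNN.
# Feature sizes, layer counts and the number of gesture classes are assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

class IndRNNLayer(nn.Module):
    """IndRNN recurrence: h_t = relu(W x_t + u * h_{t-1} + b), where the
    recurrent weight u is element-wise, so each unit sees only its own history."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.inp = nn.Linear(input_size, hidden_size)
        self.u = nn.Parameter(torch.empty(hidden_size).uniform_(-1.0, 1.0))

    def forward(self, x):                          # x: (batch, time, input_size)
        h = x.new_zeros(x.size(0), self.u.numel())
        outputs = []
        for t in range(x.size(1)):
            h = torch.relu(self.inp(x[:, t]) + self.u * h)
            outputs.append(h)
        return torch.stack(outputs, dim=1)         # (batch, time, hidden_size)

class BidirectionalIndRNN(nn.Module):
    """Stacks forward/backward IndRNN layers and classifies the last time step."""
    def __init__(self, feat_size=4096, hidden_size=512, num_layers=2, num_gestures=10):
        super().__init__()
        self.layers = nn.ModuleList()
        in_size = feat_size
        for _ in range(num_layers):
            self.layers.append(nn.ModuleDict({
                "fwd": IndRNNLayer(in_size, hidden_size),
                "bwd": IndRNNLayer(in_size, hidden_size),
            }))
            in_size = 2 * hidden_size              # concatenated directions
        self.classifier = nn.Linear(in_size, num_gestures)

    def forward(self, feats):                      # feats: (batch, time, feat_size)
        x = feats
        for layer in self.layers:
            fwd = layer["fwd"](x)
            bwd = layer["bwd"](x.flip(1)).flip(1)  # run backwards, re-align in time
            x = torch.cat([fwd, bwd], dim=-1)
        return self.classifier(x[:, -1])           # gesture logits

# Hypothetical spatial feature extractor: VGG16 with the final classification
# layer removed, yielding a 4096-d feature per video frame after fine-tuning.
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
frame_features = nn.Sequential(vgg.features, vgg.avgpool, nn.Flatten(),
                               *list(vgg.classifier.children())[:-1])
```

Grad-CAM on the VGG backbone can then highlight the image regions that drive a frame-level prediction. The standard formulation (not paper-specific) weights the last convolutional feature maps by the spatial mean of the class-score gradients:

```python
def grad_cam(vgg_model, image, target_class):
    """Standard Grad-CAM sketch: weight the last convolutional feature maps by
    the spatial average of the class-score gradients, then apply ReLU."""
    feats = vgg_model.features(image)                       # (1, C, H, W)
    feats.retain_grad()
    pooled = vgg_model.avgpool(feats).flatten(1)
    score = vgg_model.classifier(pooled)[0, target_class]
    score.backward()
    weights = feats.grad.mean(dim=(2, 3), keepdim=True)     # per-channel weights
    cam = torch.relu((weights * feats).sum(dim=1))          # (1, H, W) heat map
    return cam / (cam.max() + 1e-8)                         # normalised to [0, 1]
```

Upsampling the heat map to the input resolution and overlaying it on the surgical frame gives the kind of region-level explanation described in the abstract.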



Related research

Jiali Duan, Shuai Zhou, Jun Wan (2016)
Recently, the popularity of depth sensors such as Kinect has made depth videos easily available, yet their advantages have not been fully exploited. For gesture recognition, this paper explores the spatial and temporal information complementarily embedded in RGB and depth sequences. We propose a convolutional two-stream consensus voting network (2SCVN) which explicitly models both the short-term and long-term structure of the RGB sequences. To alleviate distractions from the background, a 3D depth-saliency ConvNet stream (3DDSN) is aggregated in parallel to identify subtle motion characteristics. These two components in a unified framework significantly improve the recognition accuracy. On the challenging ChaLearn IsoGD benchmark, our proposed method outperforms the first place on the leaderboard by a large margin (10.29%) while also achieving the best result on the RGBD-HuDaAct dataset (96.74%). Both quantitative experiments and qualitative analysis show the effectiveness of the proposed framework, and the code will be released to facilitate future research.
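As a rough illustration of the consensus-voting idea only (not the 2SCVN/3DDSN architecture itself, whose details are not given here), each stream could score sampled snippets of a gesture video and the class probabilities could be averaged across snippets and streams; the network names and tensor shapes below are assumptions.

```python
# Minimal sketch of consensus voting over RGB and depth streams (assumed, not
# the paper's code): each stream scores sampled snippets of one gesture video
# and the class probabilities are averaged across snippets and then streams.
import torch
import torch.nn as nn

def consensus_vote(rgb_net: nn.Module, depth_net: nn.Module,
                   rgb_snippets: torch.Tensor, depth_snippets: torch.Tensor):
    # *_snippets: (num_snippets, channels, height, width) from one video
    rgb_probs = torch.softmax(rgb_net(rgb_snippets), dim=-1).mean(dim=0)
    depth_probs = torch.softmax(depth_net(depth_snippets), dim=-1).mean(dim=0)
    return (rgb_probs + depth_probs) / 2       # fused per-class probabilities
```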
Pedestrian detection methods have been significantly improved with the development of deep convolutional neural networks. Nevertheless, robustly detecting pedestrians with large variations in size and with occlusions remains a challenging problem. In this paper, we propose a gated multi-layer convolutional feature extraction method which can adaptively generate discriminative features for candidate pedestrian regions. The proposed gated feature extraction framework consists of squeeze units, gate units and a concatenation layer, which respectively perform feature dimension squeezing, feature element manipulation and combination of convolutional features from multiple CNN layers. We propose two different gate models which manipulate the regional feature maps in a channel-wise selection manner and a spatial-wise selection manner, respectively. Experiments on the challenging CityPersons dataset demonstrate the effectiveness of the proposed method, especially for detecting small-size and occluded pedestrians.
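A channel-wise gate in this spirit can be sketched as follows; this is an assumed minimal form (squeeze the channels with a 1x1 convolution, predict one weight per channel from pooled statistics, rescale, then concatenate gated features from several layers), not the paper's exact units.

```python
# Minimal sketch (assumed) of a channel-wise gate over a regional feature map:
# squeeze the channels, predict one weight per channel, and rescale.
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    def __init__(self, in_channels, squeezed_channels):
        super().__init__()
        self.squeeze = nn.Conv2d(in_channels, squeezed_channels, kernel_size=1)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                               # (N, C', 1, 1)
            nn.Conv2d(squeezed_channels, squeezed_channels, kernel_size=1),
            nn.Sigmoid(),                                          # channel weights
        )

    def forward(self, roi_feats):                                  # (N, C, H, W)
        squeezed = torch.relu(self.squeeze(roi_feats))
        return squeezed * self.gate(squeezed)                      # rescaled features

# Gated features from several CNN layers would then be concatenated, e.g.
# fused = torch.cat([gate3(f3), gate4(f4), gate5(f5)], dim=1)
```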
We propose a novel appearance-based gesture recognition algorithm using compressed-domain signal processing techniques. Gesture features are extracted directly from the compressed measurements, which are the block averages and the coded linear combinations of the image sensor's pixel values. We also improve both the computational efficiency and the memory requirements of previous DTW-based K-NN gesture classifiers. Both simulation testing and a hardware implementation strongly support the proposed algorithm.
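The DTW-based K-NN classifier mentioned above relies on a dynamic-time-warping distance between feature sequences; a textbook formulation (not the paper's optimised version) is sketched below. A K-NN classifier then assigns the label of the training sequences with the smallest DTW distance to the query.

```python
# Textbook dynamic-time-warping distance (not the paper's optimised variant),
# as used by DTW-based K-NN gesture classifiers.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    # a: (len_a, dim) and b: (len_b, dim) gesture feature sequences
    la, lb = len(a), len(b)
    cost = np.full((la + 1, lb + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, la + 1):
        for j in range(1, lb + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])     # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],        # insertion
                                 cost[i, j - 1],        # deletion
                                 cost[i - 1, j - 1])    # match
    return float(cost[la, lb])
```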
Feint Attack, as a new type of APT attack, has become a focus of attention. It adopts a multi-stage attack mode that can be summarized as a combination of virtual attacks and real attacks: under the cover of virtual attacks, the real attacks achieve the attacker's true purpose, often causing huge losses before they are noticed. However, to our knowledge, previous works use common methods such as causal-correlation or case-based approaches to detect outdated multi-stage attacks. Little attention has been paid to detecting the Feint Attack, because the difficulty of detection lies in the diverse definitions of Feint Attack and the lack of dedicated datasets, and many detection methods ignore the semantic relationships within the attack. Aiming at this challenge, this paper explores a new method to solve the problem. In the attack scenario, a fuzzy clustering method based on attribute similarity is used to mine multi-stage attack chains. We then use a few-shot deep learning algorithm (SMOTE & CNN-SVM) and a bidirectional Recurrent Neural Network model (Bi-RNN) to obtain the Feint Attack chains. A Feint Attack is simulated by inserting a real attack into a normal causal attack chain; the inserted attack destroys the causal relationship of the original chain, so we use Bi-RNN encoding to obtain the hidden features of the Feint Attack chain. In the end, our method accurately detects the Feint Attack on the LLDoS1.0 and LLDoS2.0 scenarios of DARPA2000 and on CICIDS2017 from the Canadian Institute for Cybersecurity.
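The Bi-RNN encoding step can be illustrated with a minimal sketch, assuming each attack chain is a sequence of alert embeddings; the layer sizes and the two-class head are assumptions, not the paper's configuration.

```python
# Minimal sketch (assumed) of a bidirectional RNN encoder for attack chains:
# the final hidden states of both directions are concatenated and classified.
import torch
import torch.nn as nn

class BiRNNChainEncoder(nn.Module):
    def __init__(self, alert_dim=64, hidden_size=128, num_classes=2):
        super().__init__()
        self.rnn = nn.GRU(alert_dim, hidden_size, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, chains):                  # chains: (batch, steps, alert_dim)
        _, h_n = self.rnn(chains)               # h_n: (2, batch, hidden_size)
        encoding = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.head(encoding)              # feint vs. normal chain logits
```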
Tianrui Liu, Wenhan Luo, Lin Ma (2019)
Pedestrian detection methods have been significantly improved with the development of deep convolutional neural networks. Nevertheless, detecting small-scale pedestrians and occluded pedestrians remains a challenging problem. In this paper, we propose a pedestrian detection method with a coupled network to simultaneously address these two issues. One of the sub-networks, the gated multi-layer feature extraction sub-network, aims to adaptively generate discriminative features for pedestrian candidates in order to robustly detect pedestrians with large variations in scale. The second sub-network handles the occlusion problem of pedestrian detection by using deformable regional RoI-pooling. We investigate two different gate units for the gated sub-network, namely the channel-wise gate unit and the spatio-wise gate unit, which can enhance the representation ability of the regional convolutional features along the channel dimensions or across the spatial domain, respectively. Ablation studies have validated the effectiveness of both the proposed gated multi-layer feature extraction sub-network and the deformable occlusion handling sub-network. With the coupled framework, our proposed pedestrian detector achieves state-of-the-art results on the Caltech and CityPersons pedestrian detection benchmarks.
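Complementing the channel-wise gate sketched earlier, a spatio-wise gate can be approximated as a per-location weight map; again this is an assumed minimal form, not the paper's exact unit.

```python
# Minimal sketch (assumed) of a spatio-wise gate: a 1x1 convolution predicts a
# per-location weight map that rescales the regional features across space.
import torch.nn as nn

class SpatialGate(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(in_channels, 1, kernel_size=1),   # (N, 1, H, W) weights
            nn.Sigmoid(),
        )

    def forward(self, roi_feats):                       # (N, C, H, W)
        return roi_feats * self.gate(roi_feats)         # location-wise rescale
```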