Curve-Structure Segmentation from Depth Maps: A CNN-based Approach and Its Application to Exploring Cultural Heritage Objects



Added by Yuhang Lu

Publication date 2017

Field: Informatics Engineering

Language: English

Authors Yuhang Lu - Jun Zhou - Jing Wang

Computer Vision and Pattern Recognition


Motivated by the important archaeological application of exploring cultural heritage objects, we study the challenging problem of automatically segmenting curve structures that are very weakly stamped or carved on an object surface and captured as a highly noisy depth map. Unlike most classical low-level image segmentation methods, which are known to be very sensitive to noise and occlusions, we propose a new supervised learning algorithm based on a Convolutional Neural Network (CNN) that implicitly learns and exploits curve geometry and pattern information. More specifically, we first propose a Fully Convolutional Network (FCN) to estimate the skeleton of the curve structures; at each skeleton pixel, a scale value is estimated to reflect the local curve width. We then propose a dense prediction network to refine the estimated curve skeletons. Finally, based on the estimated scale values, we develop an adaptive thresholding algorithm to produce the final segmentation of the curve structures. In experiments, we validate the proposed method on a dataset of depth images scanned from unearthed pottery sherds dating to the Woodland period of Southeastern North America.
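To make the role of the per-pixel scale value concrete, the following is a minimal sketch of how a skeleton plus a local width estimate can be expanded into a segmentation mask. It is not the authors' adaptive thresholding algorithm, whose details are not given in the abstract: it simply paints a disk of radius `scale[y, x]` around each skeleton pixel. The function name `mask_from_skeleton` and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def mask_from_skeleton(skeleton, scale):
    """Paint a disk of radius scale[y, x] around every skeleton pixel.

    Hypothetical stand-in for the final segmentation step: the paper's
    adaptive thresholding is not specified in the abstract.
    """
    skeleton = np.asarray(skeleton, dtype=bool)
    scale = np.asarray(scale, dtype=float)
    h, w = skeleton.shape
    yy, xx = np.mgrid[0:h, 0:w]           # pixel coordinate grids
    mask = np.zeros((h, w), dtype=bool)
    for y, x in zip(*np.nonzero(skeleton)):
        r = scale[y, x]                    # local curve half-width
        mask |= (yy - y) ** 2 + (xx - x) ** 2 <= r * r
    return mask

# Toy input: a short horizontal skeleton with a constant scale of 1 pixel.
skeleton = np.zeros((7, 7), dtype=bool)
skeleton[3, 1:6] = True
scale = np.full((7, 7), 1.0)
mask = mask_from_skeleton(skeleton, scale)
```

With a constant scale the result is a uniformly thickened curve; a spatially varying `scale` array would widen or narrow the mask along the skeleton.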


Design Identification of Curve Patterns on Cultural Heritage Objects: Combining Template Matching and CNN-based Re-Ranking

Jun Zhou, Yuhang Lu, Kang Zheng (2018)

The surfaces of many cultural heritage objects were embellished with various patterns, especially curve patterns. In practice, most unearthed cultural heritage objects are highly fragmented, e.g., sherds of pottery or vessels, and each shows only a very small portion of the underlying full design, with noise and deformations. The goal of this paper is to address the challenging problem of automatically identifying the underlying full design of curve patterns from such a sherd. Specifically, we formulate this problem as template matching: the curve structure segmented from the sherd is matched to each location, at each possible orientation, of each known full design. We propose a new two-stage matching algorithm with a different matching cost in each stage. In Stage 1, we apply traditional template matching, which is computationally efficient, over the whole search space and identify a small set of candidate matchings. In Stage 2, we derive a new matching cost by training a dual-source Convolutional Neural Network (CNN) and apply it to re-rank the candidate matchings identified in Stage 1. We collect 600 pottery sherds with 98 full designs from the Woodland Period in Southeastern North America for experiments, and the performance of the proposed algorithm is very competitive.
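The two-stage structure described above can be sketched as follows. This is an illustrative toy, not the authors' implementation: Stage 1 uses a pixel-mismatch count as the cheap matching cost and only searches rotations in multiples of 90 degrees, and Stage 2 replaces the trained dual-source CNN with a simple IoU-based cost (`iou_cost`). All function names and the toy design are assumptions.

```python
import numpy as np

def stage1_candidates(design, sherd, k=5):
    """Exhaustive search over locations and 90-degree rotations.

    Returns the k placements with the lowest pixel-mismatch cost as
    (cost, rotation, y, x) tuples.
    """
    results = []
    for rot in range(4):
        tpl = np.rot90(sherd, rot)
        th, tw = tpl.shape
        for y in range(design.shape[0] - th + 1):
            for x in range(design.shape[1] - tw + 1):
                patch = design[y:y + th, x:x + tw]
                results.append((int(np.sum(patch != tpl)), rot, y, x))
    results.sort()
    return results[:k]

def iou_cost(patch, tpl):
    """Placeholder for the learned Stage 2 cost: 1 - intersection over union."""
    inter = np.logical_and(patch, tpl).sum()
    union = np.logical_or(patch, tpl).sum()
    return float(1.0 - inter / union) if union else 0.0

def stage2_rerank(design, sherd, candidates, score_fn=iou_cost):
    """Re-score only the Stage 1 candidates with the costlier function."""
    rescored = []
    for _, rot, y, x in candidates:
        tpl = np.rot90(sherd, rot)
        th, tw = tpl.shape
        patch = design[y:y + th, x:x + tw]
        rescored.append((score_fn(patch, tpl), rot, y, x))
    rescored.sort()
    return rescored

# Toy design: an L-shaped curve; the "sherd" is a crop of that design.
design = np.zeros((8, 8), dtype=bool)
design[2, 3:6] = True
design[3:5, 3] = True
sherd = design[2:5, 3:6].copy()

candidates = stage1_candidates(design, sherd, k=5)
ranked = stage2_rerank(design, sherd, candidates)
```

The design choice illustrated here is that the expensive cost is evaluated only on the `k` survivors of the cheap exhaustive search, which keeps the overall search tractable.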

Machine Learning

A CNN Segmentation-Based Approach to Object Detection and Tracking in Ultrasound Scans with Application to the Vagus Nerve Detection

Abdullah F. Al-Battal, Yan Gong, Lu Xu (2021)

Ultrasound scanning is essential in several medical diagnostic and therapeutic applications. It is used to visualize and analyze anatomical features and structures that influence treatment plans. However, it is labor intensive, and its effectiveness is operator dependent. Real-time, accurate, and robust automatic detection and tracking of anatomical structures during scanning would make diagnostic and therapeutic procedures significantly more consistent and efficient. In this paper, we propose a deep learning framework to automatically detect and track a specific anatomical target structure in ultrasound scans. Our framework is designed to be accurate and robust across subjects and imaging devices, to operate in real time, and not to require a large training set. It maintains localization precision and recall above 90% when trained on sets as small as 20% of the original training set. The framework backbone is a weakly trained segmentation neural network based on U-Net. We tested the framework on two different ultrasound datasets with the aim of detecting and tracking the vagus nerve, where it outperformed current state-of-the-art real-time object detection networks.
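As a rough illustration of how a segmentation output can be turned into a detection and scored with precision and recall, consider the sketch below. The abstract does not state the matching criterion used in the paper, so the IoU threshold of 0.5, the one-target-per-frame setup, and all function names are assumptions.

```python
import numpy as np

def mask_to_box(mask):
    """Bounding box (y0, x0, y1, x1) of a binary mask, or None if it is empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return (int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1)

def box_iou(a, b):
    """Intersection over union of two (y0, x0, y1, x1) boxes."""
    h = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    w = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = h * w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(pred_boxes, true_boxes, iou_thresh=0.5):
    """Per-frame scoring with at most one target per frame (an assumption)."""
    tp = fp = fn = 0
    for p, t in zip(pred_boxes, true_boxes):
        if p is not None and t is not None and box_iou(p, t) >= iou_thresh:
            tp += 1
        else:
            fp += p is not None   # a detection that matched nothing
            fn += t is not None   # a target that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Three toy frames: a correct detection, a miss, and a wrong location.
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
pred = [mask_to_box(mask), None, (0, 0, 2, 2)]
truth = [(2, 2, 6, 6), (0, 0, 4, 4), (5, 5, 8, 8)]
precision, recall = precision_recall(pred, truth)
```

In this toy run one of two detections is correct and one of three targets is found, so precision is 1/2 and recall is 1/3.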

Computer Vision and Pattern Recognition, Image and Video Processing

Comprehensiveness of Archives: A Modern AI-enabled Approach to Build Comprehensive Shared Cultural Heritage

Abhishek Gupta, Montreal AI Ethics Institute (2020)

Archives play a crucial role in the construction and advancement of society. Humans place a great deal of trust in archives and depend on them to craft public policies and to preserve languages, cultures, self-identity, views and values. Yet certain voices and viewpoints remain elusive in the current processes used for the classification and discoverability of records and archives. In this paper, we explore the ramifications and effects of centralized, due process archival systems on marginalized communities. There is strong evidence of the need for progressive design and technological innovation in the pursuit of comprehensiveness, equity and justice. Intentionality and comprehensiveness are our greatest opportunity for improving archival practices and for the advancement and thrive-ability of societies at large today. Both are achievable with the support of technology and the Information Age we live in. Reopening, questioning and/or purposefully including others' voices in archival processes is the intention we present in our paper. We provide examples of marginalized communities who continue to lead community archive movements in efforts to reclaim and protect their cultural identity, knowledge, views and futures. In conclusion, we offer design and AI-dominant technological considerations worth further investigation in efforts to bridge systemic gaps and build robust archival processes.

Computers and Society, Digital Libraries, Social and Information Networks

Towards truly simultaneous PIXE and RBS analysis of layered objects in cultural heritage

C. Pascual-Izarra (2007)

RBS and PIXE techniques have long been used in the field of cultural heritage. Although the complementarity of the two techniques has long been acknowledged, its full potential has not yet been realized because of the lack of general-purpose software tools for analysing the data from both techniques in a coherent way. In this work we provide an example of how the recent addition of PIXE to the set of techniques supported by the DataFurnace code can significantly change this situation. We present a case in which a non-homogeneous sample (an oxidized metal from a photographic plate, a heliography, made by Niepce in 1827) is analysed using RBS and PIXE in a straightforward and powerful way that is only possible with a code that treats both techniques simultaneously as part of a single, coherent analysis. The optimization capabilities of DataFurnace allowed us to obtain the composition profiles for these samples in a very simple way.

Materials Science

C3: Concentrated-Comprehensive Convolution and its application to semantic segmentation

Hyojin Park, Youngjoon Yoo, Geonseok Seo (2018)

One practical way to build a lightweight semantic segmentation model is to combine a depth-wise separable convolution with a dilated convolution. However, simply combining the two yields an over-simplified operation that causes severe performance degradation, because information contained in the feature map is lost. To resolve this problem, we propose a new block called Concentrated-Comprehensive Convolution (C3), which applies asymmetric convolutions before the depth-wise separable dilated convolution to compensate for the information loss caused by dilated convolution. The C3 block consists of a concentration stage and a comprehensive convolution stage. The first stage uses two depth-wise asymmetric convolutions to compress information from neighboring pixels and so alleviate the information loss. The second stage increases the receptive field by applying a depth-wise separable dilated convolution to the feature map of the first stage. We applied the C3 block to various segmentation frameworks (ESPNet, DRN, ERFNet, ENet) to demonstrate the benefits of the proposed method. Experimental results show that the proposed method preserves the original accuracies on the Cityscapes dataset while reducing complexity. Furthermore, we modified ESPNet to achieve about 2% better performance while reducing the number of parameters by half and the number of FLOPs by 35% compared with the original ESPNet. Finally, experiments on the ImageNet classification task show that the C3 block can successfully replace dilated convolutions.
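A back-of-the-envelope parameter count helps show why this combination is lightweight. The sketch below assumes 3x3 kernels, ignores biases and normalization layers, and models the C3 block as two depth-wise asymmetric convolutions (k x 1 and 1 x k) followed by a depth-wise dilated k x k convolution and a 1 x 1 point-wise convolution. That decomposition is read from the abstract, not from the paper's exact layer configuration, and the function names are hypothetical.

```python
def dilated_extent(kernel, dilation):
    """Spatial extent covered by a dilated kernel along one axis."""
    return kernel + (kernel - 1) * (dilation - 1)

def standard_conv_params(c_in, c_out, k=3):
    """Weights in a standard (optionally dilated) k x k convolution."""
    return c_in * c_out * k * k

def ds_dilated_params(c_in, c_out, k=3):
    """Depth-wise k x k convolution plus 1 x 1 point-wise convolution."""
    return c_in * k * k + c_in * c_out

def c3_params(c_in, c_out, k=3):
    """Assumed C3 layout: concentration stage, then comprehensive stage."""
    concentration = 2 * c_in * k          # depth-wise k x 1 and 1 x k
    comprehensive = ds_dilated_params(c_in, c_out, k)
    return concentration + comprehensive

counts = {
    "standard": standard_conv_params(64, 64),
    "depthwise_separable": ds_dilated_params(64, 64),
    "c3": c3_params(64, 64),
}
```

Under these assumptions, for 64 input and 64 output channels the asymmetric stage adds only 384 weights on top of the depth-wise separable dilated convolution, while both remain far below the standard convolution. Dilation does not change the weight count; it only enlarges the extent returned by `dilated_extent`.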

Computer Vision and Pattern Recognition