DeepLab2: A TensorFlow Library for Deep Labeling

125 0 0.0 ( 0 )

Download Cite

Added by Liang-Chieh Chen

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Mark Weber - Huiyu Wang - Siyuan Qiao

Computer Vision and Pattern Recognition

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a state-of-the-art and easy-to-use TensorFlow codebase for general dense pixel prediction problems in computer vision. DeepLab2 includes all our recently developed DeepLab model variants with pretrained checkpoints as well as model training and evaluation code, allowing the community to reproduce and further improve upon the state-of-art systems. To showcase the effectiveness of DeepLab2, our Panoptic-DeepLab employing Axial-SWideRNet as network backbone achieves 68.0% PQ or 83.5% mIoU on Cityscaspes validation set, with only single-scale inference and ImageNet-1K pretrained checkpoints. We hope that publicly sharing our library could facilitate future research on dense pixel labeling tasks and envision new applications of this technology. Code is made publicly available at url{https://github.com/google-research/deeplab2}.

rate research

TensorFlow RiemOpt: a library for optimization on Riemannian manifolds

61 - Oleg Smirnov 2021

The adoption of neural networks and deep learning in non-Euclidean domains has been hindered until recently by the lack of scalable and efficient learning frameworks. Existing toolboxes in this space were mainly motivated by research and education use cases, whereas practical aspects, such as deploying and maintaining machine learning models, were often overlooked. We attempt to bridge this gap by proposing TensorFlow RiemOpt, a Python library for optimization on Riemannian manifolds in TensorFlow. The library is designed with the aim for a seamless integration with the TensorFlow ecosystem, targeting not only research, but also streamlining production machine learning pipelines.

Mathematical Software Computational Geometry Machine Learning

Kaolin: A PyTorch Library for Accelerating 3D Deep Learning Research

136 - Krishna Murthy Jatavallabhula , Edward Smith , Jean-Francois Lafleche 2019

We present Kaolin, a PyTorch library aiming to accelerate 3D deep learning research. Kaolin provides efficient implementations of differentiable 3D modules for use in deep learning systems. With functionality to load and preprocess several popular 3D datasets, and native functions to manipulate meshes, pointclouds, signed distance functions, and voxel grids, Kaolin mitigates the need to write wasteful boilerplate code. Kaolin packages together several differentiable graphics modules including rendering, lighting, shading, and view warping. Kaolin also supports an array of loss functions and evaluation metrics for seamless evaluation and provides visualization functionality to render the 3D results. Importantly, we curate a comprehensive model zoo comprising many state-of-the-art 3D deep learning architectures, to serve as a starting point for future research endeavours. Kaolin is available as open-source software at https://github.com/NVIDIAGameWorks/kaolin/.

Computer Vision and Pattern Recognition Machine Learning Robotics

Semi-Automatic Labeling for Deep Learning in Robotics

82 - Daniele De Gregorio , Alessio Tonioni , Gianluca Palli 2019

In this paper, we propose Augmented Reality Semi-automatic labeling (ARS), a semi-automatic method which leverages on moving a 2D camera by means of a robot, proving precise camera tracking, and an augmented reality pen to define initial object bounding box, to create large labeled datasets with minimal human intervention. By removing the burden of generating annotated data from humans, we make the Deep Learning technique applied to computer vision, that typically requires very large datasets, truly automated and reliable. With the ARS pipeline, we created effortlessly two novel datasets, one on electromechanical components (industrial scenario) and one on fruits (daily-living scenario), and trained robustly two state-of-the-art object detectors, based on convolutional neural networks, such as YOLO and SSD. With respect to the conventional manual annotation of 1000 frames that takes us slightly more than 10 hours, the proposed approach based on ARS allows annotating 9 sequences of about 35000 frames in less than one hour, with a gain factor of about 450. Moreover, both the precision and recall of object detection is increased by about 15% with respect to manual labeling. All our software is available as a ROS package in a public repository alongside the novel annotated datasets.

Computer Vision and Pattern Recognition

Progressive Representative Labeling for Deep Semi-Supervised Learning

89 - Xiaopeng Yan , Riquan Chen , Litong Feng 2021

Deep semi-supervised learning (SSL) has experienced significant attention in recent years, to leverage a huge amount of unlabeled data to improve the performance of deep learning with limited labeled data. Pseudo-labeling is a popular approach to expand the labeled dataset. However, whether there is a more effective way of labeling remains an open problem. In this paper, we propose to label only the most representative samples to expand the labeled set. Representative samples, selected by indegree of corresponding nodes on a directed k-nearest neighbor (kNN) graph, lie in the k-nearest neighborhood of many other samples. We design a graph neural network (GNN) labeler to label them in a progressive learning manner. Aided by the progressive GNN labeler, our deep SSL approach outperforms state-of-the-art methods on several popular SSL benchmarks including CIFAR-10, SVHN, and ILSVRC-2012. Notably, we achieve 72.1% top-1 accuracy, surpassing the previous best result by 3.3%, on the challenging ImageNet benchmark with only $10%$ labeled data.

Computer Vision and Pattern Recognition

Mesh-TensorFlow: Deep Learning for Supercomputers

206 - Noam Shazeer , Youlong Cheng , Niki Parmar 2018

Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers from problems including the inability to train very large models (due to memory constraints), high latency, and inefficiency at small batch sizes. All of these can be solved by more general distribution strategies (model-parallelism). Unfortunately, efficient model-parallel algorithms tend to be complicated to discover, describe, and to implement, particularly on large clusters. We introduce Mesh-TensorFlow, a language for specifying a general class of distributed tensor computations. Where data-parallelism can be viewed as splitting tensors and operations along the batch dimension, in Mesh-TensorFlow, the user can specify any tensor-dimensions to be split across any dimensions of a multi-dimensional mesh of processors. A Mesh-TensorFlow graph compiles into a SPMD program consisting of parallel operations coupled with collective communication primitives such as Allreduce. We use Mesh-TensorFlow to implement an efficient data-parallel, model-parallel version of the Transformer sequence-to-sequence model. Using TPU meshes of up to 512 cores, we train Transformer models with up to 5 billion parameters, surpassing state of the art results on WMT14 English-to-French translation task and the one-billion-word language modeling benchmark. Mesh-Tensorflow is available at https://github.com/tensorflow/mesh .

Machine Learning Distributed Parallel and Cluster Computing Machine Learning