ConTNet: Why not use convolution and transformer at the same time?


Published by: Haotian Yan

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Haotian Yan - Zhe Li - Weijian Li

Computer Vision and Pattern Recognition


Abstract

Although convolutional networks (ConvNets) have enjoyed great success in computer vision (CV), they struggle to capture the global information that is crucial to dense prediction tasks such as object detection and segmentation. In this work, we propose ConTNet (Convolution-Transformer Network), which combines transformer and ConvNet architectures to provide large receptive fields. Unlike recently proposed transformer-based models (e.g., ViT, DeiT), which are sensitive to hyper-parameters and depend heavily on extensive data augmentation when trained from scratch on a midsize dataset (e.g., ImageNet-1k), ConTNet can be optimized like a normal ConvNet (e.g., ResNet) and remains highly robust. It is also worth pointing out that, given identical strong data augmentations, the performance improvement of ConTNet is more remarkable than that of ResNet. We demonstrate its effectiveness on image classification and downstream tasks. For example, our ConTNet achieves 81.8% top-1 accuracy on ImageNet, the same as DeiT-B, with less than 40% of the computational complexity. ConTNet-M also outperforms ResNet50 as the backbone of both Faster-RCNN (by 2.6%) and Mask-RCNN (by 3.2%) on the COCO2017 dataset. We hope that ConTNet can serve as a useful backbone for CV tasks and bring new ideas for model design.

Read also

59 - Nicolas Quesada, J.E. Sipe 2017

We show that using the electric field as a quantization variable in nonlinear optics leads to incorrect expressions for the squeezing parameters in spontaneous parametric down-conversion and for the conversion rates in frequency conversion. This observation is related to the fact that if the electric field is written as a linear combination of bosonic creation and annihilation operators, one cannot satisfy Maxwell's equations in a nonlinear dielectric.

Quantum Physics, Optics

The Purcell question: why do all viscosities stop at the same place?

58 - K. Trachenko, V. Brazhkin 2020

In 1977, Purcell asked why liquid viscosities all stop at the same place. Liquids are hard to understand, yet today we can answer the Purcell question in terms of fundamental physical constants fixing viscosity minima. With the Planck constant setting the minimal viscosity, water and life appear to be well attuned to the degree of quantumness of the physical world.

Statistical Mechanics, Materials Science, Soft Condensed Matter

SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation

104 - Bing Li, Cheng Zheng, Silvio Giancola 2021

We propose a novel scene flow estimation approach to capture and infer 3D motions from point clouds. Estimating 3D motions for point clouds is challenging, since a point cloud is unordered and its density is significantly non-uniform. Such unstructured data poses difficulties in matching corresponding points between point clouds, leading to inaccurate flow estimation. We propose a novel architecture named Sparse Convolution-Transformer Network (SCTN) that equips the sparse convolution with the transformer. Specifically, by leveraging the sparse convolution, SCTN transforms irregular point clouds into locally consistent flow features for estimating continuous and consistent motions within an object or local object part. We further propose to explicitly learn point relations using a point transformer module, unlike existing methods. We show that the learned relation-based contextual information is rich and helpful for matching corresponding points, benefiting scene flow estimation. In addition, a novel loss function is proposed to adaptively encourage flow consistency according to feature similarity. Extensive experiments demonstrate that our proposed approach achieves a new state of the art in scene flow estimation. Our approach achieves an error (EPE3D) of 0.038 on FlyingThings3D and 0.037 on KITTI Scene Flow, outperforming previous methods by large margins.

Computer Vision and Pattern Recognition, Artificial Intelligence

Why Not Categorical Equivalence?

73 - James Owen Weatherall 2018

In recent years philosophers of science have explored categorical equivalence as a promising criterion for when two (physical) theories are equivalent. On the one hand, philosophers have presented several examples of theories whose relationships seem to be clarified using these categorical methods. On the other hand, philosophers and logicians have studied the relationships, particularly in the first order case, between categorical equivalence and other notions of equivalence of theories, including definitional equivalence and generalized definitional (aka Morita) equivalence. In this article, I will express some skepticism about this approach, both on technical grounds and conceptual ones. I will argue that category structure (alone) likely does not capture the structure of a theory, and discuss some recent work in light of this claim.

History and Philosophy of Physics

Approximate Summaries for Why and Why-not Provenance (Extended Version)

123 - Seokki Lee, Bertram Ludaescher, Boris Glavic 2020

Why and why-not provenance have been studied extensively in recent years. However, why-not provenance, and to a lesser degree why provenance, can be very large, resulting in severe scalability and usability challenges. In this paper, we introduce a novel approximate summarization technique for provenance which overcomes these challenges. Our approach uses patterns to encode (why-not) provenance concisely. We develop techniques for efficiently computing provenance summaries balancing informativeness, conciseness, and completeness. To achieve scalability, we integrate sampling techniques into provenance capture and summarization. Our approach is the first to scale to large datasets and to generate comprehensive and meaningful summaries.

Databases
