Fourier Series Expansion Based Filter Parametrization for Equivariant Convolutions

Published by: Qi Xie

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Qi Xie, Qian Zhao, Zongben Xu

Computer Vision and Pattern Recognition

Equivariant convolution has been shown to be very helpful for many types of computer vision tasks. Recently, 2D filter parametrization has come to play an important role in designing equivariant convolutions. However, current filter parametrization methods still have evident drawbacks, the most critical of which is the accuracy of the filter representation. To address this issue, in this paper we modify the classical Fourier series expansion for 2D filters and propose a new set of atomic basis functions for filter parametrization. The proposed filter parametrization method not only represents 2D filters with zero error when the filter is not rotated, but also substantially alleviates the quality degradation caused by the fence effect when the filter is rotated. Accordingly, we construct a new equivariant convolution method based on the proposed filter parametrization, named F-Conv. We prove that the equivariance of F-Conv is exact in the continuous domain and becomes approximate only after discretization. Extensive experiments show the superiority of the proposed method. In particular, we apply rotation equivariant convolution methods to the image super-resolution task, where F-Conv evidently outperforms the previous filter parametrization based method, reflecting its intrinsic capability of faithfully preserving rotation symmetries in local image features.
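As a rough illustration of what parametrizing a 2D filter by a Fourier series means, the sketch below uses the classical expansion only. It is not the modified basis or the F-Conv layer from the paper, whose details the abstract does not give. The function names, the 5 x 5 filter size, and the choice of symmetric integer frequencies are assumptions made for this example.

```python
import numpy as np

def fourier_coefficients(w):
    """Return DFT coefficients of a square filter, scaled as Fourier series weights."""
    n = w.shape[0]
    return np.fft.fft2(w) / (n * n)

def evaluate_filter(coeffs, xs, ys):
    """Evaluate the trigonometric series at arbitrary coordinates (in grid units)."""
    n = coeffs.shape[0]
    # Assumption: symmetric integer frequencies, e.g. [0, 1, 2, -2, -1] for n = 5.
    freqs = np.fft.fftfreq(n, d=1.0 / n)
    x = np.ravel(xs)
    y = np.ravel(ys)
    ex = np.exp(2j * np.pi * np.outer(freqs, x) / n)  # shape (n, P)
    ey = np.exp(2j * np.pi * np.outer(freqs, y) / n)  # shape (n, P)
    values = np.einsum("kl,kp,lp->p", coeffs, ex, ey)
    return values.real.reshape(np.shape(xs))

def rotated_filter(w, theta):
    """Resample the series on the filter grid rotated by theta about the filter centre."""
    n = w.shape[0]
    centre = (n - 1) / 2.0
    a, b = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    u, v = a - centre, b - centre
    xs = centre + np.cos(theta) * u - np.sin(theta) * v
    ys = centre + np.sin(theta) * u + np.cos(theta) * v
    return evaluate_filter(fourier_coefficients(w), xs, ys)

rng = np.random.default_rng(0)
w = rng.standard_normal((5, 5))            # hypothetical 5 x 5 filter
w_same = rotated_filter(w, 0.0)            # matches w up to floating-point error
w_quarter = rotated_filter(w, np.pi / 2)   # a 90-degree turn permutes the grid samples
w_eighth = rotated_filter(w, np.pi / 4)    # off-grid samples come from the series itself
```

With no rotation the series reproduces the filter at the grid points, which is the zero-error property the abstract mentions for the unrotated case. At other angles the values depend on which frequencies the series uses, which is where the choice of basis matters.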

Ze Wang, Zichen Miao, Jun Hu (2021)

Applying feature-dependent network weights has proved effective in many fields. However, in practice, restricted by the enormous size of model parameters and memory footprints, scalable and versatile dynamic convolutions with per-pixel adapted filters are yet to be fully explored. In this paper, we address this challenge by decomposing filters, adapted to each spatial position, over dynamic filter atoms generated by a light-weight network from local features. Adaptive receptive fields can be supported by further representing each filter atom over sets of pre-fixed multi-scale bases. As plug-and-play replacements for convolutional layers, the introduced adaptive convolutions with per-pixel dynamic atoms enable explicit modeling of intra-image variance while avoiding heavy computation, parameter, and memory costs. Our method preserves the appealing properties of conventional convolutions, being translation-equivariant and parametrically efficient. We present experiments to show that the proposed method delivers comparable or even better performance across tasks and is particularly effective in handling tasks with significant intra-image variance.
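The two-level decomposition described above (filters over atoms, atoms over fixed bases) can be sketched for a single spatial position as follows. This is only an illustration under stated assumptions: all sizes and variable names are hypothetical, the bases are random stand-ins for the pre-fixed multi-scale bases, and the atom weights are random stand-ins for the output of the light-weight network.

```python
import numpy as np

rng = np.random.default_rng(0)
num_bases, num_atoms, c_in, c_out, k = 6, 3, 4, 8, 3  # hypothetical sizes

# Stand-in for the pre-fixed bases (random here, fixed in the paper's setting).
bases = rng.standard_normal((num_bases, k, k))

# Stand-in for the per-position weights a light-weight network would produce.
atom_weights = rng.standard_normal((num_atoms, num_bases))

# Each filter atom is a linear combination of the fixed bases.
atoms = np.einsum("mb,bij->mij", atom_weights, bases)

# Each (output, input) channel pair mixes the atoms with its own coefficients.
coeffs = rng.standard_normal((c_out, c_in, num_atoms))
filters = np.einsum("oim,mjk->oijk", coeffs, atoms)

# Parameter counts for this one position: dense filter versus decomposition.
full_params = c_out * c_in * k * k
decomposed_params = c_out * c_in * num_atoms + num_atoms * num_bases
```

Only `atom_weights` would vary from pixel to pixel in this sketch, which is why the per-position cost stays small compared with storing a dense filter per pixel.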

Computer Vision and Pattern Recognition

Resolution-robust Large Mask Inpainting with Fourier Convolutions

Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin (2021)

Modern image inpainting systems, despite significant progress, often struggle with large missing areas, complex geometric structures, and high-resolution images. We find that one of the main reasons for this is the lack of an effective receptive field in both the inpainting network and the loss function. To alleviate this issue, we propose a new method called large mask inpainting (LaMa). LaMa is based on i) a new inpainting network architecture that uses fast Fourier convolutions, which have an image-wide receptive field; ii) a high receptive field perceptual loss; and iii) large training masks, which unlock the potential of the first two components. Our inpainting network improves the state of the art across a range of datasets and achieves excellent performance even in challenging scenarios, e.g. completion of periodic structures. Our model generalizes surprisingly well to resolutions that are higher than those seen at train time, and achieves this at lower parameter and compute costs than the competitive baselines. The code is available at https://github.com/saic-mdal/lama.
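To see why an operation in the Fourier domain has an image-wide receptive field, consider the minimal sketch below. It is not LaMa's fast Fourier convolution block; it only multiplies the image spectrum by arbitrary weights, which acts like a circular convolution whose kernel spans the whole image. The function name, image size, and random weights are assumptions for the example.

```python
import numpy as np

def spectral_conv(image, spectral_weights):
    """Multiply the image spectrum pointwise and transform back to the pixel domain."""
    spectrum = np.fft.rfft2(image)
    return np.fft.irfft2(spectrum * spectral_weights, s=image.shape)

rng = np.random.default_rng(0)
h, w = 16, 16                                   # hypothetical image size
image = rng.standard_normal((h, w))
weights = (rng.standard_normal((h, w // 2 + 1))
           + 1j * rng.standard_normal((h, w // 2 + 1)))

# Change a single pixel and measure how many output pixels respond.
perturbed = image.copy()
perturbed[0, 0] += 1.0
diff = spectral_conv(perturbed, weights) - spectral_conv(image, weights)
affected_fraction = np.mean(np.abs(diff) > 1e-9)
```

A 3 x 3 spatial convolution would change at most nine output pixels in the same experiment, whereas here the single-pixel change spreads across the output in one layer.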

Computer Vision and Pattern Recognition; Image and Video Processing

Relevance of Rotationally Equivariant Convolutions for Predicting Molecular Properties

Benjamin Kurt Miller, Mario Geiger, Tess E. Smidt (2020)

Equivariant neural networks (ENNs) are graph neural networks embedded in $\mathbb{R}^3$ and are well suited for predicting molecular properties. The ENN library e3nn has customizable convolutions, which can be designed to depend only on distances between points, or also on angular features, making them rotationally invariant, or equivariant, respectively. This paper studies the practical value of including angular dependencies for molecular property prediction directly via an ablation study with e3nn and the QM9 data set. We find that, for fixed network depth and parameter count, adding angular features decreased test error by an average of 23%. Meanwhile, increasing network depth decreased test error by only 4% on average, implying that rotationally equivariant layers are comparatively parameter efficient. We present an explanation of the accuracy improvement on the dipole moment, the target which benefited most from the introduction of angular features.
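The distinction between distance-only (invariant) and angular (equivariant) features can be checked numerically. The sketch below uses plain NumPy rather than e3nn, and the helper names and the random point cloud are assumptions for the example: pairwise distances stay the same when the points are rotated, while relative position vectors rotate along with them.

```python
import numpy as np

def pairwise_distances(points):
    """Return the matrix of Euclidean distances between all pairs of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rotation(rng):
    """Draw a 3 x 3 rotation matrix (orthogonal, determinant +1) via QR."""
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one column to turn a reflection into a rotation
    return q

rng = np.random.default_rng(0)
points = rng.standard_normal((5, 3))   # hypothetical atom positions
rot = random_rotation(rng)
rotated = points @ rot.T

# Invariant feature: distances do not change under rotation.
dist_before = pairwise_distances(points)
dist_after = pairwise_distances(rotated)

# Equivariant feature: a relative position vector rotates with the input.
rel_before = points[1] - points[0]
rel_after = rotated[1] - rotated[0]
```

A network that sees only `dist_before` cannot tell two orientations apart, which is adequate for scalar targets; direction-valued targets such as a dipole moment must rotate with the molecule, which is where angular features help.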

Machine Learning; Chemical Physics; Computational Physics

Power series as Fourier series

Debraj Chakrabarti, Anirban Dawn (2021)

An abstract theory of Fourier series in locally convex topological vector spaces is developed. An analog of Fejér's theorem is proved for these series. The theory is applied to distributional solutions of Cauchy-Riemann equations to recover basic results of complex analysis. Some classical results of function theory are also shown to be consequences of the series expansion.
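For orientation, the classical identity suggested by the title can be written out as follows. This is a standard fact of complex analysis, not the paper's abstract construction, and the symbols $a_n$, $r$, and $R$ are notation chosen here. If $f(z) = \sum_{n \ge 0} a_n z^n$ converges for $|z| < R$ and $0 \le r < R$, then restricting $f$ to the circle of radius $r$ gives a Fourier series in the angle $\theta$:

```latex
f\left(re^{i\theta}\right) = \sum_{n=0}^{\infty} a_n r^n e^{in\theta},
\qquad
a_n r^n = \frac{1}{2\pi} \int_0^{2\pi} f\left(re^{i\theta}\right) e^{-in\theta} \, d\theta .
```

The Fourier coefficients with negative index vanish, which reflects the holomorphy of $f$.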

Complex Variables

Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

Ziyu Liu, Hongwen Zhang, Zhenghao Chen (2020)

Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model human action dynamics. To capture robust movement patterns from these graphs, long-range and multi-scale context aggregation and spatial-temporal dependency modeling are critical aspects of a powerful feature extractor. However, existing methods have limitations in achieving (1) unbiased long-range joint relationship modeling under multi-scale operators and (2) unobstructed cross-spacetime information flow for capturing complex spatial-temporal dependencies. In this work, we present (1) a simple method to disentangle multi-scale graph convolutions and (2) a unified spatial-temporal graph convolutional operator named G3D. The proposed multi-scale aggregation scheme disentangles the importance of nodes in different neighborhoods for effective long-range modeling. The proposed G3D module leverages dense cross-spacetime edges as skip connections for direct information propagation across the spatial-temporal graph. By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400.

Computer Vision and Pattern Recognition
