SEACOW: Synopsis Embedded Array Compression using Wavelet Transform

Published by: Minsoo Kim

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Minsoo Kim, Hyubjin Lee

Databases

Abstract

Multidimensional data is now produced in many domains. Because large volumes of this data are often used in complex analytical tasks, it must be stored compactly and must support fast query responses. Existing compression schemes reduce storage well; however, they can increase the overall computational cost of performing queries. Querying compressed data effectively requires a compression scheme designed carefully for the target tasks. This study presents a novel compression scheme, SEACOW, for storing and querying multidimensional array data. The scheme is based on the wavelet transform and exploits the hierarchical relationship between sub-arrays in the transformed data to compress the array. The compressed result embeds a synopsis that acts as an index and improves query processing performance. For the experiments, we implemented an array database, SEACOW storage, and evaluated query processing performance on real data sets. Our experiments show that 1) SEACOW provides a high compression ratio comparable to existing compression schemes and 2) the synopsis improves analytical query processing performance.
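As a rough illustration of why a wavelet-transformed array can carry its own synopsis, the sketch below applies one level of an unnormalized 2D Haar transform to a small array. It is not SEACOW's actual encoding: the choice of the Haar filter, the single decomposition level, and the function names `haar2d_level` and `inverse_haar2d_level` are all assumptions made for illustration. The approximation band holds the mean of each 2x2 block, so block-aligned aggregates can be answered from it alone, while the detail bands allow exact reconstruction.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar2d_level(a: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """One level of an unnormalized 2D Haar transform (hypothetical helper).

    Returns (approx, horiz, vert, diag). Each approx cell is the mean of
    one 2x2 block of the input; the other bands hold block differences.
    """
    rows, cols = len(a), len(a[0])
    assert rows % 2 == 0 and cols % 2 == 0, "sketch assumes even dimensions"
    approx: Matrix = []
    horiz: Matrix = []
    vert: Matrix = []
    diag: Matrix = []
    for i in range(0, rows, 2):
        ra, rh, rv, rd = [], [], [], []
        for j in range(0, cols, 2):
            p, q = a[i][j], a[i][j + 1]
            r, s = a[i + 1][j], a[i + 1][j + 1]
            ra.append((p + q + r + s) / 4)  # block mean (the synopsis)
            rh.append((p - q + r - s) / 4)  # left column minus right column
            rv.append((p + q - r - s) / 4)  # top row minus bottom row
            rd.append((p - q - r + s) / 4)  # diagonal difference
        approx.append(ra)
        horiz.append(rh)
        vert.append(rv)
        diag.append(rd)
    return approx, horiz, vert, diag


def inverse_haar2d_level(approx: Matrix, horiz: Matrix,
                         vert: Matrix, diag: Matrix) -> Matrix:
    """Rebuild the original array from the four bands."""
    out: Matrix = []
    for ra, rh, rv, rd in zip(approx, horiz, vert, diag):
        top, bottom = [], []
        for A, H, V, D in zip(ra, rh, rv, rd):
            top.extend([A + H + V + D, A - H + V - D])
            bottom.extend([A + H - V - D, A - H - V + D])
        out.append(top)
        out.append(bottom)
    return out


data = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]

approx, horiz, vert, diag = haar2d_level(data)

# A block-aligned SUM query needs only the synopsis: each cell covers 4 values.
total_from_synopsis = 4 * sum(sum(row) for row in approx)
restored = inverse_haar2d_level(approx, horiz, vert, diag)
```

In this toy setting the approximation band is a quarter of the original size yet answers the whole-array sum exactly, which is the kind of index-like behavior the abstract attributes to the embedded synopsis. How SEACOW encodes the detail coefficients and exploits the sub-array hierarchy is not covered by this sketch.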

Manojit Roy (1999)

Submission withdrawn because the authors erroneously submitted a revised version as a new submission, see nlin.CD/0002028.

Chaotic Dynamics

Classifying Fonts and Calligraphy Styles Using Complex Wavelet Transform

Alican Bozkurt, Pinar Duygulu, A. Enis Cetin (2014)

Recognizing fonts has become an important task in document analysis, due to the increasing number of available digital documents in different fonts and emphases. A generic font-recognition system independent of language, script and content is desirable for processing various types of documents. At the same time, categorizing calligraphy styles in handwritten manuscripts is important for palaeographic analysis, but has not been studied sufficiently in the literature. We address the font-recognition problem as analysis and categorization of textures. We extract features using complex wavelet transform and use support vector machines for classification. Extensive experimental evaluations on different datasets in four languages and comparisons with state-of-the-art studies show that our proposed method achieves higher recognition accuracy while being computationally simpler. Furthermore, on a new dataset generated from Ottoman manuscripts, we show that the proposed method can also be used for categorizing Ottoman calligraphy with high accuracy.

Computer Vision and Pattern Recognition

Image Analysis Using a Dual-Tree $M$-Band Wavelet Transform

Caroline Chaux, Laurent Duval, Jean-Christophe Pesquet (2017)

We propose a 2D generalization to the $M$-band case of the dual-tree decomposition structure (initially proposed by N. Kingsbury and further investigated by I. Selesnick) based on a Hilbert pair of wavelets. We particularly address (i) the construction of the dual basis and (ii) the resulting directional analysis. We also revisit the necessary pre-processing stage in the $M$-band case. While several reconstructions are possible because of the redundancy of the representation, we propose a new optimal signal reconstruction technique, which minimizes potential estimation errors. The effectiveness of the proposed $M$-band decomposition is demonstrated via denoising comparisons on several image types (natural, texture, seismics), with various $M$-band wavelets and thresholding strategies. Significant improvements in terms of both overall noise reduction and direction preservation are observed.

Data Analysis, Statistics and Probability; Computer Vision and Pattern Recognition; Functional Analysis

Transform Quantization for CNN Compression

Sean I. Young, Wang Zhe, David Taubman (2020)

In this paper, we compress convolutional neural network (CNN) weights post-training via transform quantization. Previous CNN quantization techniques tend to ignore the joint statistics of weights and activations, producing sub-optimal CNN performance at a given quantization bit-rate, or consider their joint statistics during training only and do not facilitate efficient compression of already trained CNN models. We optimally transform (decorrelate) and quantize the weights post-training using a rate-distortion framework to improve compression at any given quantization bit-rate. Transform quantization unifies quantization and dimensionality reduction (decorrelation) techniques in a single framework to facilitate low bit-rate compression of CNNs and efficient inference in the transform domain. We first introduce a theory of rate and distortion for CNN quantization, and pose optimum quantization as a rate-distortion optimization problem. We then show that this problem can be solved using optimal bit-depth allocation following decorrelation by the optimal End-to-end Learned Transform (ELT) we derive in this paper. Experiments demonstrate that transform quantization advances the state of the art in CNN compression in both retrained and non-retrained quantization scenarios. In particular, we find that transform quantization with retraining is able to compress CNN models such as AlexNet, ResNet and DenseNet to very low bit-rates (1-2 bits).

Computer Vision and Pattern Recognition; Information Theory; Machine Learning

Adaptive Wavelet Clustering for Highly Noisy Data

Zengjian Chen, Jiayi Liu, Yihe Deng (2018)

In this paper we make progress on the unsupervised task of mining arbitrarily shaped clusters in highly noisy datasets, which is a task present in many real-world applications. Based on the fundamental work that first applies a wavelet transform to data clustering, we propose an adaptive clustering algorithm, denoted as AdaWave, which exhibits favorable characteristics for clustering. By a self-adaptive thresholding technique, AdaWave is parameter free and can handle data in various situations. It is deterministic, fast in linear time, order-insensitive, shape-insensitive, robust to highly noisy data, and requires no pre-knowledge on data models. Moreover, AdaWave inherits the ability from the wavelet transform to cluster data in different resolutions. We adopt the grid labeling data structure to drastically reduce the memory consumption of the wavelet transform so that AdaWave can be used for relatively high dimensional data. Experiments on synthetic as well as natural datasets demonstrate the effectiveness and efficiency of our proposed method.

Databases; Information Retrieval
