Phase space methods and psychoacoustic models in lossy transform coding


Published by: Matthew Cargo

Publication date: 2007

Research field: Informatics Engineering

Language: English

Author: Matthew Charles Cargo

I present a method for lossy transform coding of digital audio that uses the Weyl symbol calculus to construct the encoding and decoding transformations. The method establishes a direct connection between a time-frequency representation of the signal-dependent threshold of masked noise and the encode/decode pair. The formalism also offers a time-frequency measure of perceptual entropy.

Read also

Nonlinear Transform Coding

82 - Johannes Balle, Philip A. Chou, David Minnen, 2020

We review a class of methods that can be collected under the name nonlinear transform coding (NTC), which over the past few years have become competitive with the best linear transform codecs for images, and have superseded them in terms of rate-distortion performance under established perceptual quality metrics such as MS-SSIM. We assess the empirical rate-distortion performance of NTC with the help of simple example sources, for which the optimal performance of a vector quantizer is easier to estimate than with natural data sources. To this end, we introduce a novel variant of entropy-constrained vector quantization. We provide an analysis of various forms of stochastic optimization techniques for NTC models; review architectures of transforms based on artificial neural networks, as well as learned entropy models; and provide a direct comparison of a number of methods to parameterize the rate-distortion trade-off of nonlinear transforms, introducing a simplified one.

Information Theory, Image and Video Processing

Construction of optimal spectral methods in phase retrieval

70 - Antoine Maillard, Florent Krzakala, Yue M. Lu, 2020

We consider the phase retrieval problem, in which the observer wishes to recover an $n$-dimensional real or complex signal $\mathbf{X}^\star$ from the (possibly noisy) observation of $|\mathbf{\Phi} \mathbf{X}^\star|$, in which $\mathbf{\Phi}$ is a matrix of size $m \times n$. We consider a high-dimensional setting where $n,m \to \infty$ with $m/n = \mathcal{O}(1)$, and a large class of (possibly correlated) random matrices $\mathbf{\Phi}$ and observation channels. Spectral methods are a powerful tool to obtain approximate observations of the signal $\mathbf{X}^\star$ which can then be used as initialization for a subsequent algorithm, at a low computational cost. In this paper, we extend and unify previous results and approaches on spectral methods for the phase retrieval problem. More precisely, we combine the linearization of message-passing algorithms and the analysis of the Bethe Hessian, a classical tool of statistical physics. Using this toolbox, we show how to derive optimal spectral methods for arbitrary channel noise and right-unitarily invariant matrix $\mathbf{\Phi}$, in an automated manner (i.e. with no optimization over any hyperparameter or preprocessing function).

Information Theory, Disordered Systems and Neural Networks

Coding for Network Coding

106 - Andrea Montanari, Ruediger Urbanke, 2007

We consider communication over a noisy network under randomized linear network coding. Possible error mechanisms include node or link failures, Byzantine behavior of nodes, or an over-estimate of the network min-cut. Building on the work of Koetter and Kschischang, we introduce a probabilistic model for errors. We compute the capacity of this channel and we define an error-correction scheme based on random sparse graphs and a low-complexity decoding algorithm. By optimizing over the code degree profile, we show that this construction achieves the channel capacity in complexity which is jointly quadratic in the number of coded information bits and sublogarithmic in the error probability.

Information Theory, Networking and Internet Architecture

An Adaptive Psychoacoustic Model for Automatic Speech Recognition

79 - Peng Dai, Xue Teng, Frank Rudzicz, 2016

Compared with automatic speech recognition (ASR), the human auditory system is more adept at handling noise-adverse situations, including environmental noise and channel distortion. To mimic this adeptness, auditory models have been widely incorporated in ASR systems to improve their robustness. This paper proposes a novel auditory model which incorporates psychoacoustics and otoacoustic emissions (OAEs) into ASR. In particular, we successfully implement the frequency-dependent property of psychoacoustic models and effectively improve resulting system performance. We also present a novel double-transform spectrum-analysis technique, which can qualitatively predict ASR performance for different noise types. Detailed theoretical analysis is provided to show the effectiveness of the proposed algorithm. Experiments are carried out on the AURORA2 database and show that the word recognition rate using our proposed feature extraction method is significantly increased over the baseline. Given models trained with clean speech, our proposed method achieves up to 85.39% word recognition accuracy on noisy data.

Computation and Language, Sound

Information-Theoretic Bounds and Approximations in Neural Population Coding

284 - Wentao Huang, Kechen Zhang, 2016

While Shannon's mutual information has widespread applications in many disciplines, for practical applications it is often difficult to calculate its value accurately for high-dimensional variables because of the curse of dimensionality. This paper is focused on effective approximation methods for evaluating mutual information in the context of neural population coding. For large but finite neural populations, we derive several information-theoretic asymptotic bounds and approximation formulas that remain valid in high-dimensional spaces. We prove that optimizing the population density distribution based on these approximation formulas is a convex optimization problem which allows efficient numerical solutions. Numerical simulation results confirmed that our asymptotic formulas were highly accurate for approximating mutual information for large neural populations. In special cases, the approximation formulas are exactly equal to the true mutual information. We also discuss techniques of variable transformation and dimensionality reduction to facilitate computation of the approximations.

Information Theory, Machine Learning
