Time-frequency distributions (TFDs) play a vital role in the descriptive analysis of non-stationary signals encountered in realistic scenarios. It is well known that low time-frequency (TF) resolution and the emergence of cross-terms (CTs) are the two main issues that make it difficult to analyze and interpret practical signals using TFDs. To address these issues, we propose the U-Net aided iterative shrinkage-thresholding algorithm (U-ISTA) for reconstructing a near-ideal TFD by exploiting structured sparsity in the signal's TF domain. Specifically, the signal ambiguity function is first compressed, and the ISTA is then unfolded as a recurrent neural network. To account for the continuously distributed characteristics of signals, a structured sparsity constraint is incorporated into the unfolded ISTA by using the U-Net as an adaptive threshold block, in which structure-aware thresholds are learned from a large amount of training data to exploit the underlying dependencies among neighboring TF coefficients. The proposed U-ISTA model is trained on both non-overlapped and overlapped synthetic signals, including closely and far located non-stationary components. Experimental results demonstrate that U-ISTA is robust and achieves superior performance compared with state-of-the-art algorithms, attaining high TF resolution with CTs largely eliminated, even in low signal-to-noise ratio (SNR) environments.
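The iteration that U-ISTA unfolds can be illustrated with plain ISTA. The following is a minimal sketch under stated assumptions, not the proposed model: it solves a real-valued l1-regularized least-squares problem with a fixed scalar threshold, whereas the abstract describes structure-aware thresholds learned by a U-Net. The measurement matrix `A`, the weight `lam`, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(x, tau):
    """Element-wise shrinkage: the proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def lasso_objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

def ista(A, y, lam, n_iter=500):
    """Plain ISTA with a fixed threshold (assumption: no learned block)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)           # gradient of the data-fit term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Toy problem (hypothetical sizes): recover a 3-sparse vector from 40 measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [2.0, -1.5, 1.0]
y = A @ x_true
lam = 0.01
x_hat = ista(A, y, lam, n_iter=2000)
```

In an unfolded network, each loop iteration would become one layer, and the call to `soft_threshold` is the place where a learned, structure-aware threshold block would be substituted.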
The design of high-resolution and cross-term (CT) free time-frequency distributions (TFDs) remains an open problem. Classical kernel-based methods are limited by the trade-off between TFD resolution and CT suppression, even under optimally derived parameters. To overcome this limitation, we propose a data-driven kernel learning model based directly on the Wigner-Ville distribution (WVD). The proposed kernel-learning-based TFD (KL-TFD) model includes several stacked multi-channel learning convolutional kernels. Specifically, a skipping operator is utilized to maintain correct information transmission, and a weighted block is employed to exploit spatial and channel dependencies. Together, these two designs achieve high TFD resolution and CT elimination simultaneously. Numerical experiments on both synthetic and real-world data confirm the superiority of the proposed KL-TFD over traditional kernel function methods.
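The resolution versus cross-term trade-off starts from the WVD itself. The sketch below is a minimal discrete WVD under stated assumptions, not the KL-TFD model: it takes an analytic (complex) input, uses the largest symmetric lag window available at each time index, and applies no smoothing kernel. The function name `wigner_ville` and the test signals are hypothetical.

```python
import numpy as np

def wigner_ville(x):
    """Discrete WVD of a complex signal x.

    Returns W with shape (N, N); W[k, n] is the value at time index n and
    frequency k / (2 * N) cycles per sample (the factor 2 comes from the
    lag product x[n + m] * conj(x[n - m])).
    """
    x = np.asarray(x, dtype=complex)
    N = len(x)
    W = np.zeros((N, N))
    for n in range(N):
        max_lag = min(n, N - 1 - n)                 # keep n +/- m inside the signal
        lags = np.arange(-max_lag, max_lag + 1)
        kernel = np.zeros(N, dtype=complex)
        kernel[lags % N] = x[n + lags] * np.conj(x[n - lags])
        W[:, n] = np.fft.fft(kernel).real           # kernel is conjugate-symmetric
    return W

N = 64
n = np.arange(N)

# Single tone at 0.125 cycles/sample: energy concentrates at bin 2 * 0.125 * N.
W_one = wigner_ville(np.exp(2j * np.pi * 0.125 * n))

# Two tones: auto-terms at bins 8 and 24, plus a cross-term midway at bin 16.
two_tones = np.exp(2j * np.pi * 0.0625 * n) + np.exp(2j * np.pi * 0.1875 * n)
W_two = wigner_ville(two_tones)
```

At the central time index, the midway bin of `W_two` is larger than either auto-term even though no signal component lies at that frequency. This is the cross-term that kernel methods try to suppress.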
The linear part of transient evoked (TE) otoacoustic emission (OAE) is thought to be generated via coherent reflection near the characteristic place of constituent wave components. Because of the tonotopic organization of the cochlea, high-frequency emissions return earlier than low-frequency ones; however, due to the random nature of coherent reflection, the instantaneous frequency (IF) and amplitude envelope of TEOAEs both fluctuate. Multiple reflection components and synchronized spontaneous emissions can make it even more difficult to extract the IF with linear transforms. In this paper, we propose to model TEOAEs as a sum of intrinsic mode-type functions and to analyze them with a nonlinear-type time-frequency analysis technique called concentration of frequency and time (ConceFT). When tested with synthetic OAE signals with possibly multiple oscillatory components, the present method produces clearly visualized traces of individual components on the time-frequency plane. Further, when the signal is noisy, the proposed method is compared with existing linear and bilinear methods in its accuracy for estimating the fluctuating IF. Results suggest that ConceFT outperforms the best of these methods in terms of optimal transport distance, reducing the error by 10 to 21% when the signal-to-noise ratio is 10 dB or below.
This paper presents an explore-and-classify framework for structured architectural reconstruction from an aerial image. Starting from a potentially imperfect building reconstruction by an existing algorithm, our approach 1) explores the space of building models by modifying the reconstruction via heuristic actions; 2) learns to classify the correctness of building models while generating classification labels based on the ground truth; and 3) repeats. At test time, we iterate exploration and classification, seeking the result with the best classification score. We evaluate the approach using initial reconstructions by two baselines and two state-of-the-art reconstruction algorithms. Qualitative and quantitative evaluations demonstrate that our approach consistently improves the reconstruction quality from every initial reconstruction.
In audio signal processing, probabilistic time-frequency models have many benefits over their non-probabilistic counterparts. They adapt to the incoming signal, quantify uncertainty, and measure correlation between the signal's amplitude and phase information, making time-domain resynthesis straightforward. However, these models are still not widely used, because they come at a high computational cost and because they are formulated in a way that makes it difficult to interpret all the modelling assumptions. By showing their equivalence to Spectral Mixture Gaussian processes, we illuminate the underlying model assumptions and provide a general framework for constructing more complex models that better approximate real-world signals. Our interpretation makes it intuitive to inspect, compare, and alter the models, since all prior knowledge is encoded in the Gaussian process kernel functions. We utilise a state space representation to perform efficient inference via Kalman smoothing, and we demonstrate how our interpretation allows for efficient parameter learning in the frequency domain.
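The link between a Gaussian process kernel and a state space model can be checked numerically for one simple component. The sketch below is an illustration under stated assumptions, not the paper's model: it uses a single quasi-periodic kernel (exponential decay times cosine), chosen here because its discrete-time state space form is a damped rotation. The hyperparameter values and function names are hypothetical.

```python
import numpy as np

def quasi_periodic_kernel(tau, sigma2, ell, omega):
    """k(tau) = sigma2 * exp(-|tau| / ell) * cos(omega * tau)."""
    return sigma2 * np.exp(-np.abs(tau) / ell) * np.cos(omega * tau)

def state_space(dt, sigma2, ell, omega):
    """Discrete-time model x[k+1] = A x[k] + q[k], q ~ N(0, Q), y[k] = H x[k]."""
    decay = np.exp(-dt / ell)
    c, s = np.cos(omega * dt), np.sin(omega * dt)
    A = decay * np.array([[c, -s], [s, c]])    # damped rotation
    P_inf = sigma2 * np.eye(2)                 # stationary state covariance
    Q = (1.0 - decay ** 2) * P_inf             # keeps P_inf stationary
    H = np.array([[1.0, 0.0]])                 # observe the first state
    return A, Q, H, P_inf

# Hypothetical hyperparameters: 5 Hz component sampled every 10 ms.
dt, sigma2, ell, omega = 0.01, 1.5, 0.2, 2 * np.pi * 5.0
A, Q, H, P_inf = state_space(dt, sigma2, ell, omega)

# Covariance between y[k + n] and y[k] implied by the state space model.
lags = np.arange(50)
cov_ss = np.array(
    [(H @ np.linalg.matrix_power(A, int(n)) @ P_inf @ H.T).item() for n in lags]
)
cov_kernel = quasi_periodic_kernel(lags * dt, sigma2, ell, omega)
```

Because the two covariance sequences agree, a Kalman filter and smoother run on `A`, `Q`, and `H` would perform inference under this kernel at a cost linear in the number of samples. Summing several such components gives a mixture over frequencies.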
To handle time series with complicated oscillatory structure, we propose a novel time-frequency (TF) analysis tool that fuses the short-time Fourier transform (STFT) and the periodic transform (PT). Since many time series oscillate with time-varying frequency, amplitude, and non-sinusoidal oscillatory pattern, a direct application of the PT or the STFT might not be suitable. However, we show that by combining them in a proper way, we obtain a powerful TF analysis tool. We first combine the Ramanujan sums and $l_1$ penalization to implement the PT. We call the algorithm Ramanujan PT (RPT). The RPT is of independent interest for other applications, such as analyzing short signals composed of components with integer periods, but that is not the focus of this paper. Second, the RPT is applied to modify the STFT and generate a novel TF representation of the complicated time series that faithfully reflects the instantaneous frequency information of each oscillatory component. We coin the proposed TF analysis the Ramanujan de-shape (RDS) and vectorized RDS (vRDS). In addition to showing some preliminary analysis results on complicated biomedical signals, we provide a theoretical analysis of the RPT. Specifically, we show that the RPT is robust to three commonly encountered types of noise: envelope fluctuation, jitter, and additive noise.
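The Ramanujan sums that the RPT builds on are simple to compute directly from their definition. The sketch below covers only that building block, under stated assumptions: it does not implement the $l_1$-penalized fitting step or the RDS, and the function names are hypothetical.

```python
from math import cos, gcd, pi

def ramanujan_sum(q, n):
    """c_q(n) = sum of exp(2*pi*i*k*n/q) over 1 <= k <= q with gcd(k, q) = 1.

    The imaginary parts cancel, so only cosines are summed; the result is
    always an integer, hence the rounding.
    """
    total = sum(cos(2 * pi * k * n / q) for k in range(1, q + 1) if gcd(k, q) == 1)
    return round(total)

def ramanujan_sequence(q, length):
    """First `length` samples of the q-periodic sequence c_q(0), c_q(1), ..."""
    return [ramanujan_sum(q, n) for n in range(length)]

# One period of the sequences for a few small periods.
periods = {q: ramanujan_sequence(q, q) for q in (1, 2, 3, 4, 6)}
```

Each sequence `ramanujan_sequence(q, ...)` is integer-valued and periodic with period exactly `q`, which is what makes these sums a natural dictionary for signals whose components have integer periods.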