Noise pollution is one of the top quality-of-life issues for urban residents in the United States. Continued exposure to high levels of noise has proven effects on health, including acute effects such as sleep disruption and long-term effects such as hypertension, heart disease, and hearing loss. To investigate and ultimately aid in the mitigation of urban noise, a network of 55 sensor nodes has been deployed across New York City for over two years, collecting sound pressure level (SPL) and audio data. This network has cumulatively amassed over 75 years of calibrated, high-resolution SPL measurements and 35 years of audio data. In addition, high-frequency telemetry data has been collected that provides an indication of a sensor's health. This telemetry data was analyzed over an 18-month period across 31 of the sensors and has been used to develop a prototype model for pre-failure detection, which identifies sensors in a pre-failure state 69.1% of the time. The entire network infrastructure is outlined, including the operation of the sensors, followed by an analysis of its data yield, the development of the fault detection approach, and the plans for integrating this approach into the system.
This paper proposes a noise-type-classification-aided, attention-based neural network approach for monaural speech enhancement. The network builds on a previous work by introducing a noise classification subnetwork into the structure and feeding the classification embedding into the attention mechanism to guide the network toward better feature extraction. Specifically, to make the network end-to-end, an audio encoder and decoder built from temporal convolutions are used to transform between waveform and spectrogram. Additionally, our model is composed of two long short-term memory (LSTM) based encoders, two attention mechanisms, a noise classifier, and a speech mask generator. Experiments show that, compared with OM-LSA and the previous work, the proposed approach achieves better speech quality as measured by PESQ. More promisingly, our approach generalizes better to unseen noise conditions.
It remains a tough challenge to recover speech signals contaminated by various noises in real acoustic environments. To this end, we propose a novel denoising system for such complicated applications, which mainly comprises two pipelines, namely a two-stage network and a post-processing module. The first pipeline decouples the optimization problem w.r.t. magnitude and phase, i.e., only the magnitude is estimated in the first stage, and both magnitude and phase are further refined in the second stage. The second pipeline aims to further suppress the remaining unnatural distorted noise, which is shown to improve the subjective quality. In the ICASSP 2021 Deep Noise Suppression (DNS) Challenge, our submitted system ranked first in real-time track 1 in terms of Mean Opinion Score (MOS) under the ITU-T P.808 framework.
In metropolitan areas populated with commercial buildings, electric power supply is constrained, especially during business hours. Demand-side management using batteries is a promising solution to mitigate peak demands; however, long payback times create barriers to large-scale adoption. In this paper, we develop a design-phase battery life-cycle cost assessment tool and a runtime controller for building owners, taking battery degradation into account. In the design phase, perfect knowledge of the building load profile is assumed in order to estimate the ideal payback time. At runtime, stochastic programming and load predictions are applied to address load uncertainty and produce optimal battery operation. For validation, we performed numerical experiments using the real-life tariff model that serves New York City, a Zn/MnO2 battery, and a state-of-the-art building simulation tool. Experimental results show a small gap between design-phase assessment and runtime control. To further examine the proposed methods, we applied the same tariff model and performed numerical experiments on nine weather zones and three types of commercial buildings. Contrary to the common practice of shallow discharging a battery to prevent significant degradation, experimental results show that a promising payback time is achieved by optimally deep discharging the battery.
This technical report describes the system we submitted to the Deep Noise Suppression Challenge and presents the results for the non-real-time track. To refine the estimation results stage by stage, we utilize recursive learning, a training protocol that aggregates information across multiple stages with a memory mechanism. The attention generator network is designed to dynamically control the feature distribution of the noise reduction network. To improve the phase recovery accuracy, we adopt complex spectral mapping, decoding both the real and imaginary spectra. For the final blind test set, the average MOS improvements of the submitted system in the noreverb, reverb, and realrec categories are 0.49, 0.24, and 0.36, respectively.
In low signal-to-noise ratio conditions, it is difficult to effectively recover the magnitude and phase information simultaneously. To address this problem, this paper proposes a two-stage algorithm that decouples the joint optimization problem w.r.t. magnitude and phase into two sub-tasks. In the first stage, only the magnitude is optimized, and it is combined with the noisy phase to obtain a coarse estimate of the complex clean speech spectrum. In the second stage, both the magnitude and phase components are refined. The experiments are conducted on the WSJ0-SI84 corpus, and the results show that the proposed approach significantly outperforms previous baselines in terms of PESQ, ESTOI, and SDR.
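The first-stage step described above, pairing an estimated magnitude with the noisy phase to form a coarse complex spectrum, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `coarse_complex_spectrum` and the toy arrays are hypothetical, and the stage-1 network output is replaced by a hand-written stand-in array.

```python
import numpy as np

def coarse_complex_spectrum(est_magnitude, noisy_spectrum):
    """Pair an estimated magnitude with the phase of the noisy spectrum.

    Hypothetical helper for illustration only; in the abstract the
    magnitude would come from the first-stage network.
    """
    noisy_phase = np.angle(noisy_spectrum)           # phase of the noisy input
    return est_magnitude * np.exp(1j * noisy_phase)  # coarse complex estimate

# Toy spectrum: 2 frequency bins x 2 frames (made-up values).
noisy = np.array([[1 + 1j, -2 + 0j],
                  [0 + 3j, 1 - 1j]])
# Stand-in for the stage-1 magnitude estimate (assumption, not real output).
est_mag = np.array([[0.5, 1.0],
                    [2.0, 0.25]])

coarse = coarse_complex_spectrum(est_mag, noisy)
```

By construction the coarse estimate keeps the noisy phase unchanged, which is why a second stage that refines both magnitude and phase is still needed.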