No Arabic abstract
We present a unified translation of LTL formulas into deterministic Rabin automata, limit-deterministic Buchi automata, and nondeterministic Buchi automata. The translations yield automata of asymptotically optimal size (double or single exponential, respectively). All three translations are derived from one single Master Theorem of purely logical nature. The Master Theorem decomposes the language of a formula into a positive boolean combination of languages that can be translated into {omega}-automata by elementary means. In particular, Safras, ranking, and breakpoint constructions used in other translations are not needed.
Speech-to-text alignment is a critical component of neural textto-speech (TTS) models. Autoregressive TTS models typically use an attention mechanism to learn these alignments on-line. However, these alignments tend to be brittle and often fail to generalize to long utterances and out-of-domain text, leading to missing or repeating words. Most non-autoregressive endto-end TTS models rely on durations extracted from external sources. In this paper we leverage the alignment mechanism proposed in RAD-TTS as a generic alignment learning framework, easily applicable to a variety of neural TTS models. The framework combines forward-sum algorithm, the Viterbi algorithm, and a simple and efficient static prior. In our experiments, the alignment learning framework improves all tested TTS architectures, both autoregressive (Flowtron, Tacotron 2) and non-autoregressive (FastPitch, FastSpeech 2, RAD-TTS). Specifically, it improves alignment convergence speed of existing attention-based mechanisms, simplifies the training pipeline, and makes the models more robust to errors on long utterances. Most importantly, the framework improves the perceived speech synthesis quality, as judged by human evaluators.
Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and precise way quantization is performed. Robust quantization offers an alternative approach with improved tolerance to different classes of data-types and quantization policies. It opens up new exciting applications where the quantization process is not static and can vary to meet different circumstances and implementations. To address this issue, we propose a method that provides intrinsic robustness to the model against a broad range of quantization processes. Our method is motivated by theoretical arguments and enables us to store a single generic model capable of operating at various bit-widths and quantization policies. We validate our methods effectiveness on different ImageNet models.
Controller synthesis for general linear temporal logic (LTL) objectives is a challenging task. The standard approach involves translating the LTL objective into a deterministic parity automaton (DPA) by means of the Safra-Piterman construction. One of the challenges is the size of the DPA, which often grows very fast in practice, and can reach double exponential size in the length of the LTL formula. In this paper we describe a single exponential translation from limit-deterministic Buchi automata (LDBA) to DPA, and show that it can be concatenated with a recent efficient translation from LTL to LDBA to yield a double exponential, enquote{Safraless} LTL-to-DPA construction. We also report on an implementation, a comparison with the SPOT library, and performance on several sets of formulas, including instances from the 2016 SyntComp competition.
We study the link between baryons and dark matter in 240 galaxies with spatially resolved kinematic data. Our sample spans 9 dex in stellar mass and includes all morphological types. We consider (i) 153 late-type galaxies (LTGs; spirals and irregulars) with gas rotation curves from the SPARC database; (ii) 25 early-type galaxies (ETGs; ellipticals and lenticulars) with stellar and HI data from ATLAS^3D or X-ray data from Chandra; and (iii) 62 dwarf spheroidals (dSphs) with individual-star spectroscopy. We find that LTGs, ETGs, and classical dSphs follow the same radial acceleration relation: the observed acceleration (gobs) correlates with that expected from the distribution of baryons (gbar) over 4 dex. The relation coincides with the 1:1 line (no dark matter) at high accelerations but systematically deviates from unity below a critical scale of ~10^-10 m/s^2. The observed scatter is remarkably small (<0.13 dex) and largely driven by observational uncertainties. The residuals do not correlate with any global or local galaxy property (baryonic mass, gas fraction, radius, etc.). The radial acceleration relation is tantamount to a Natural Law: when the baryonic contribution is measured, the rotation curve follows, and vice versa. Including ultrafaint dSphs, the relation may extend by another 2 dex and possibly flatten at gbar<10^-12 m/s^2, but these data are significantly more uncertain. The radial acceleration relation subsumes and generalizes several well-known dynamical properties of galaxies, like the Tully-Fisher and Faber-Jackson relations, the baryon-halo conspiracies, and Renzos rule.
Deep learning-based video manipulation methods have become widely accessible to the masses. With little to no effort, people can quickly learn how to generate deepfake (DF) videos. While deep learning-based detection methods have been proposed to identify specific types of DFs, their performance suffers for other types of deepfake methods, including real-world deepfakes, on which they are not sufficiently trained. In other words, most of the proposed deep learning-based detection methods lack transferability and generalizability. Beyond detecting a single type of DF from benchmark deepfake datasets, we focus on developing a generalized approach to detect multiple types of DFs, including deepfakes from unknown generation methods such as DeepFake-in-the-Wild (DFW) videos. To better cope with unknown and unseen deepfakes, we introduce a Convolutional LSTM-based Residual Network (CLRNet), which adopts a unique model training strategy and explores spatial as well as the temporal information in deepfakes. Through extensive experiments, we show that existing defense methods are not ready for real-world deployment. Whereas our defense method (CLRNet) achieves far better generalization when detecting various benchmark deepfake methods (97.57% on average). Furthermore, we evaluate our approach with a high-quality DeepFake-in-the-Wild dataset, collected from the Internet containing numerous videos and having more than 150,000 frames. Our CLRNet model demonstrated that it generalizes well against high-quality DFW videos by achieving 93.86% detection accuracy, outperforming existing state-of-the-art defense methods by a considerable margin.