Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model

Publication date: 2021

Field: Informatics Engineering

Language: English

Authors: Jiangning Zhang, Chao Xu, Jian Li

Computer Vision and Pattern Recognition

Inspired by biological evolution, we explain the rationality of the Vision Transformer by analogy with the well-established Evolutionary Algorithm (EA) and derive that the two share a consistent mathematical representation. Analogous to the dynamic local population in EA, we improve the existing transformer structure and propose a more efficient EAT model, and we design task-related heads to handle different tasks more flexibly. Moreover, we introduce a space-filling curve into the current vision transformer to serialize image data into a uniform sequential format. This lets us design a unified EAT framework for multi-modal tasks that separates the network architecture from the data format adaptation. Our approach achieves state-of-the-art results on the ImageNet classification task compared with recent vision transformer works while using fewer parameters and delivering greater throughput. We further conduct multi-modal experiments, e.g., Text-Based Image Retrieval, to demonstrate the superiority of the unified EAT; our approach improves rank-1 by +3.7 points over the baseline on the CSS dataset.
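To illustrate how a space-filling curve can turn a 2D image into a 1D sequence, here is a minimal sketch. The abstract does not say which curve the authors use, so this example assumes a Z-order (Morton) curve; the function names `morton_index` and `serialize_z_order` are hypothetical and are not taken from the paper.

```python
def morton_index(row: int, col: int, bits: int) -> int:
    """Interleave the bits of (row, col) into one Z-order index.

    Column bits go to even positions, row bits to odd positions.
    """
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)
        index |= ((row >> b) & 1) << (2 * b + 1)
    return index


def serialize_z_order(image):
    """Flatten a square 2^k x 2^k grid (a list of rows) along a Z-order curve."""
    side = len(image)
    bits = side.bit_length() - 1
    if side == 0 or (1 << bits) != side or any(len(r) != side for r in image):
        raise ValueError("expected a square grid whose side is a power of two")
    coords = [(r, c) for r in range(side) for c in range(side)]
    # Visit cells in increasing Z-order index (assumed curve, see lead-in).
    coords.sort(key=lambda rc: morton_index(rc[0], rc[1], bits))
    return [image[r][c] for r, c in coords]


# A 4x4 "image" whose pixel values are their row-major positions.
grid = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
sequence = serialize_z_order(grid)
print(sequence)
# [0, 1, 4, 5, 2, 3, 6, 7, 8, 9, 12, 13, 10, 11, 14, 15]
```

Compared with plain row-major flattening, a curve like this keeps each 2x2 block of neighbouring pixels contiguous in the output sequence, which is the kind of locality a sequence model can exploit.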

A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis

Junjie Pan, Xiang Yin, Zhiling Zhang (2019)

In Mandarin text-to-speech (TTS) systems, the front-end text processing module significantly influences the intelligibility and naturalness of synthesized speech. Building a typical pipeline-based front-end, which consists of multiple individual components, requires extensive effort. In this paper, we propose a unified sequence-to-sequence front-end model for Mandarin TTS that converts raw text to linguistic features directly. Compared to the pipeline-based front-end, our unified front-end achieves comparable performance in polyphone disambiguation and prosody word prediction, and improves intonation phrase prediction by 0.0738 in F1 score. We also combined the unified front-end with Tacotron and WaveRNN to build a Mandarin TTS system. Speech synthesized by this system achieved a MOS (4.38) comparable to that of the pipeline-based front-end (4.37) and close to that of human recordings (4.49).

Computation and Language; Sound; Audio and Speech Processing

ISO LWS Spectroscopy of M82: A Unified Evolutionary Model

J.W. Colbert, M.A. Malkan, P.E. Clegg (1998)

We present the first complete far-infrared spectrum (43 to 197 um) of M82, the brightest infrared galaxy in the sky, taken with the Long Wavelength Spectrometer of the Infrared Space Observatory (ISO). We detected seven fine structure emission lines, [OI] 63 and 145 um, [OIII] 52 and 88 um, [NII] 122 um, [NIII] 57 um and [CII] 158 um, and fit their ratios to a combination starburst and photo-dissociation region (PDR) model. The best fit is obtained with HII regions with n = 250 cm^{-3} and an ionization parameter of 10^{-3.5} and PDRs with n = 10^{3.3} cm^{-3} and a far-ultraviolet flux of G_o = 10^{2.8}. We applied both continuous and instantaneous starburst models, with our best fit being a 3-5 Myr old instantaneous burst model with a 100 M_o cut-off. We also detected the ground state rotational line of OH in absorption at 119.4 um. No excited level OH transitions are apparent, indicating that the OH is almost entirely in its ground state with a column density ~ 4x10^{14} cm^{-2}. The spectral energy distribution over the LWS wavelength range is well fit with a 48 K dust temperature and an optical depth, tau_{Dust} proportional to lambda^{-1}.

A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments

Adam Foster, Martin Jankowiak, Matthew O'Meara (2019)

We introduce a fully stochastic gradient based approach to Bayesian optimal experimental design (BOED). Our approach utilizes variational lower bounds on the expected information gain (EIG) of an experiment that can be simultaneously optimized with respect to both the variational and design parameters. This allows the design process to be carried out through a single unified stochastic gradient ascent procedure, in contrast to existing approaches that typically construct a pointwise EIG estimator, before passing this estimator to a separate optimizer. We provide a number of different variational objectives including the novel adaptive contrastive estimation (ACE) bound. Finally, we show that our gradient-based approaches are able to provide effective design optimization in substantially higher dimensional settings than existing approaches.

Machine Learning; Computation

Designing Strassen's algorithm

Joshua A. Grochow, Cristopher Moore (2017)

In 1969, Strassen shocked the world by showing that two n x n matrices could be multiplied in time asymptotically less than $O(n^3)$. While the recursive construction in his algorithm is very clear, the key gain was made by showing that 2 x 2 matrix multiplication could be performed with only 7 multiplications instead of 8. The latter construction was arrived at by a process of elimination and appears to come out of thin air. Here, we give the simplest and most transparent proof of Strassen's algorithm that we are aware of, using only a simple unitary 2-design and a few easy lines of calculation. Moreover, using basic facts from the representation theory of finite groups, we use 2-designs coming from group orbits to generalize our construction to all n (although the resulting algorithms aren't optimal for n at least 3).
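For readers who want to see the 7-multiplication trick concretely, the sketch below uses the classical textbook form of Strassen's 2 x 2 identities. It is not the 2-design derivation described in the abstract; the function name `strassen_2x2` and the intermediate names `m1` to `m7` are illustrative choices.

```python
def strassen_2x2(a, b):
    """Multiply two 2x2 matrices (nested lists) using 7 multiplications."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # The seven products of the classical formulation.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine using additions and subtractions only.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


def naive_2x2(a, b):
    """Reference implementation with the usual 8 multiplications."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
product = strassen_2x2(A, B)
print(product)  # [[19, 22], [43, 50]]
```

Because the identities never rely on commutativity of the entries, the same seven products work when the entries are themselves matrix blocks, which is what makes the recursive construction beat cubic time.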

Data Structures and Algorithms; Computational Complexity; Symbolic Computation

Analogous Process Structure Induction for Sub-event Sequence Prediction

Hongming Zhang, Muhao Chen, Haoyu Wang (2020)

Computational and cognitive studies of event understanding suggest that identifying, comprehending, and predicting events depend on having structured representations of a sequence of events and on conceptualizing (abstracting) its components into (soft) event categories. Thus, knowledge about a known process such as buying a car can be used in the context of a new but analogous process such as buying a house. Nevertheless, most event understanding work in NLP is still at the ground level and does not consider abstraction. In this paper, we propose an Analogous Process Structure Induction (APSI) framework, which leverages analogies among processes and conceptualization of sub-event instances to predict the whole sub-event sequence of previously unseen open-domain processes. As our experiments and analysis indicate, APSI supports the generation of meaningful sub-event sequences for unseen processes and can help predict missing events.

Artificial Intelligence; Computation and Language