
141 - Chi Hu, Bei Li, Ye Lin (2021)
This paper describes the submissions of the NiuTrans Team to the WNGT 2020 Efficiency Shared Task. We focus on the efficient implementation of deep Transformer models (Wang et al., 2019; Li et al., 2019) using NiuTensor (https://github.com/NiuTrans/NiuTensor), a flexible toolkit for NLP tasks. We explored the combination of a deep encoder and a shallow decoder in Transformer models via model compression and knowledge distillation. Neural machine translation decoding also benefits from FP16 inference, attention caching, dynamic batching, and batch pruning. Our systems achieve promising results in both translation quality and efficiency; e.g., our fastest system can translate more than 40,000 tokens per second with an RTX 2080 Ti while maintaining 42.9 BLEU on newstest2018. The code, models, and docker images are available at NiuTrans.NMT (https://github.com/NiuTrans/NiuTrans.NMT).
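Below is a minimal NumPy sketch of attention (key/value) caching during autoregressive decoding, one of the inference optimizations listed above. All names are illustrative and do not reflect the NiuTensor or NiuTrans.NMT API; a real system would also run in FP16 and batch many sentences together.

```python
# Minimal sketch of key/value caching: keys and values of earlier target positions
# are stored once and reused at every later decoding step. Illustrative only.
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attend_with_cache(query, new_key, new_value, cache):
    """Append this step's key/value to the cache, then attend over all cached steps."""
    cache["k"].append(new_key)
    cache["v"].append(new_value)
    keys = np.stack(cache["k"])                      # (t, d), never recomputed
    values = np.stack(cache["v"])                    # (t, d)
    scores = keys @ query / np.sqrt(query.shape[-1])
    return softmax(scores) @ values                  # (d,) context for the current position

cache = {"k": [], "v": []}
d = 8
for step in range(5):                                # five fake decoding steps
    q, k, v = (np.random.randn(d) for _ in range(3))
    ctx = attend_with_cache(q, k, v, cache)
print(ctx.shape)                                     # (8,)
```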
156 - Ye Lin, Yanyang Li, Tong Xiao (2021)
Improving Transformer efficiency has become increasingly attractive recently. A wide range of methods has been proposed, e.g., pruning, quantization, and new architectures. But these methods are either sophisticated to implement or dependent on specific hardware. In this paper, we show that the efficiency of the Transformer can be improved by combining several simple, hardware-agnostic methods, including hyper-parameter tuning, better design choices, and improved training strategies. On the WMT news translation tasks, we improve the inference efficiency of a strong Transformer system by 3.80x on CPU and 2.52x on GPU. The code is publicly available at https://github.com/Lollipop321/mini-decoder-network.
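To make the hardware-agnostic argument concrete, here is a back-of-the-envelope sketch (not taken from the paper) of why a smaller decoder pays off at inference time: encoder layers run once per source sentence, while decoder layers run once per generated token. The FLOP model below is a rough assumption and ignores attention over the growing cache, so the real speed-up differs.

```python
# Rough per-layer cost model: attention projections plus the feed-forward block.
def layer_flops(seq_len, d_model=512, d_ffn=2048):
    attention = 4 * seq_len * d_model ** 2 + 2 * seq_len ** 2 * d_model
    feed_forward = 2 * seq_len * d_model * d_ffn
    return attention + feed_forward

def translation_cost(enc_layers, dec_layers, src_len=25, tgt_len=25):
    encoder = enc_layers * layer_flops(src_len)      # paid once per sentence
    decoder = dec_layers * tgt_len * layer_flops(1)  # paid once per generated token
    return encoder + decoder

baseline = translation_cost(enc_layers=6, dec_layers=6)
mini_decoder = translation_cost(enc_layers=6, dec_layers=1)
print(f"rough per-sentence speed-up: {baseline / mini_decoder:.1f}x")
```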
260 - Yanyang Li, Ye Lin, Tong Xiao (2021)
The large attention-based encoder-decoder network (Transformer) has become prevalent recently due to its effectiveness. But the high computational complexity of its decoder raises efficiency concerns. By examining the mathematical formulation of the decoder, we show that, under some mild conditions, the architecture can be simplified by compressing its sub-layers, the basic building blocks of the Transformer, while achieving higher parallelism. We thereby propose the Compressed Attention Network, whose decoder layer consists of only one sub-layer instead of three. Extensive experiments on 14 WMT machine translation tasks show that our model is 1.42x faster with performance on par with a strong baseline. This strong baseline is already 2x faster than the widely used standard baseline without loss in performance.
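The sketch below illustrates, in a hand-wavy way, how a decoder layer's three sub-layers could be folded into a single attention pass over the concatenated target prefix and encoder output. It is an illustration of the general idea under simplifying assumptions (no causal mask, no layer normalization), not the paper's actual Compressed Attention Network.

```python
# Toy fused decoder sub-layer: one attention pass over [target prefix; encoder output].
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def fused_decoder_layer(tgt, src, w_q, w_kv, w_o):
    """tgt: (t, d) target states, src: (s, d) encoder output, w_*: (d, d) weights."""
    memory = np.concatenate([tgt, src], axis=0)      # (t+s, d): one joint key/value set
    q = tgt @ w_q
    kv = memory @ w_kv
    scores = q @ kv.T / np.sqrt(tgt.shape[-1])       # (t, t+s)
    ctx = softmax(scores) @ kv                       # single attention sub-layer
    return tgt + ctx @ w_o                           # residual connection

d = 64
tgt, src = np.random.randn(3, d), np.random.randn(7, d)
weights = [0.1 * np.random.randn(d, d) for _ in range(3)]
out = fused_decoder_layer(tgt, src, *weights)
print(out.shape)                                     # (3, 64)
```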
186 - Yanyang Li, Yingfeng Luo, Ye Lin (2020)
Unsupervised Bilingual Dictionary Induction methods based on initialization and self-learning have achieved great success on similar language pairs, e.g., English-Spanish. But they still fail, with an accuracy of 0%, on many distant language pairs, e.g., English-Japanese. In this work, we show that this failure results from the gap between the actual initialization performance and the minimum initialization performance required for self-learning to succeed. We propose Iterative Dimension Reduction to bridge this gap. Our experiments show that this simple method does not hamper performance on similar language pairs and achieves accuracies of 13.64-55.53% between English and four distant languages, i.e., Chinese, Japanese, Vietnamese, and Thai.
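The following sketch shows one plausible reading of an iterative dimension-reduction schedule wrapped around a Procrustes-based self-learning loop. It is not the paper's exact procedure; the identity seed below is a crude placeholder for a real initializer.

```python
# Start self-learning on a low-dimensional projection of both embedding spaces,
# then grow the dimension step by step toward the full space.
import numpy as np

def truncate(emb, k):
    """Project embeddings onto their top-k principal directions via SVD."""
    centered = emb - emb.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def self_learning_step(x, y, pairs):
    """One Procrustes + nearest-neighbour refinement of the current dictionary."""
    u, _, vt = np.linalg.svd(x[pairs[:, 0]].T @ y[pairs[:, 1]])
    w = u @ vt                                   # orthogonal map from source to target space
    nn = np.argmax((x @ w) @ y.T, axis=1)        # re-induce dictionary by nearest neighbour
    return np.stack([np.arange(len(x)), nn], axis=1)

def iterative_dimension_reduction(src_emb, tgt_emb, dims=(50, 100, 200, 300)):
    pairs = np.stack([np.arange(len(src_emb))] * 2, axis=1)   # crude identity seed
    for k in dims:                               # dimension grows toward the full space
        x, y = truncate(src_emb, k), truncate(tgt_emb, k)
        for _ in range(3):                       # a few self-learning iterations per stage
            pairs = self_learning_step(x, y, pairs)
    return pairs

src, tgt = np.random.randn(500, 300), np.random.randn(500, 300)
print(iterative_dimension_reduction(src, tgt).shape)   # (500, 2)
```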
Intelligent reflecting surfaces (IRSs) have recently been employed to reshape wireless channels by controlling the phase shifts of individual scattering elements, namely, passive beamforming. Due to the large number of scattering elements, passive beamforming is typically challenged by high computational complexity and inexact channel information. In this article, we focus on machine learning (ML) approaches for performance maximization in IRS-assisted wireless networks. In general, ML approaches provide enhanced flexibility and robustness against uncertain information and imprecise modeling. Practical challenges remain, mainly due to the demand for large datasets in offline training and slow convergence in online learning. These observations motivate us to design a novel optimization-driven ML framework for IRS-assisted wireless networks, which combines the efficiency of model-based optimization with the robustness of model-free ML approaches. The decision variables are split into two parts: one part is obtained by the outer-loop ML approach, while the other is optimized efficiently by solving an approximate problem. Numerical results verify that the optimization-driven ML approach improves both convergence and reward performance compared to conventional model-free learning approaches.
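A toy sketch of the optimization-driven split is given below: an outer learner proposes one part of the decision (the IRS phase shifts), and the remaining variables (here, a trivial power allocation) are recovered by a model-based inner step. The channel model, reward, and the random-search stand-in for the learner are all assumptions for illustration, not the article's system model.

```python
# Outer loop proposes phase shifts; inner step solves the rest in closed form.
import numpy as np

rng = np.random.default_rng(0)
K, N = 4, 16                                         # users, IRS elements (made-up sizes)
h_d = rng.normal(size=K) + 1j * rng.normal(size=K)   # direct channels (stand-ins)
h_r = rng.normal(size=(K, N)) + 1j * rng.normal(size=(K, N))  # cascaded IRS channels

def inner_solve(theta, budget=1.0):
    """Model-based inner step: with phase shifts fixed, allocate all power to the
    strongest effective channel (a trivial closed form) and return the reward."""
    eff = np.abs(h_d + h_r @ np.exp(1j * theta)) ** 2
    p = np.zeros(K)
    p[np.argmax(eff)] = budget
    return np.log2(1.0 + eff @ p)

best_theta, best_rate = None, -np.inf
for episode in range(200):                           # outer loop: random search as a stand-in
    theta = rng.uniform(0.0, 2.0 * np.pi, size=N)    # a real system would use e.g. an RL agent
    rate = inner_solve(theta)
    if rate > best_rate:
        best_theta, best_rate = theta, rate
print(f"best reward found: {best_rate:.2f} bit/s/Hz")
```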
Third-party security apps are an integral part of the Android app ecosystem. Many users install them as an extra layer of protection for their devices. There are hundreds of such security apps, both free and paid, in the Google Play Store, and some of them have been downloaded millions of times. By installing security apps, smartphone users place a significant amount of trust in the security companies that developed them, because a fully functional mobile security app requires access to many smartphone resources, such as storage, text messages and email, browser history, and information about other installed applications. Often these resources contain highly sensitive personal information. As such, it is essential to understand the mobile security app ecosystem to assess whether it is indeed beneficial to install these apps. To this end, in this paper, we present the first empirical study of Android security apps. We analyse 100 Android security apps from multiple aspects, such as metadata, static analysis, and dynamic analysis, and present insights into their operations and behaviours. Our results show that 20% of the security apps we studied potentially resell the data they collect from smartphones to third parties, in some cases even without the user's consent. Also, our experiments show that around 50% of the security apps fail to identify malware installed on a smartphone.
We present a study of the environment of barred galaxies using a volume-limited sample of over 30,000 galaxies drawn from the Sloan Digital Sky Survey. We use four different statistics to quantify the environment: the projected two-point cross-correlation function, the background-subtracted number count of neighbor galaxies, the overdensity of the local environment, and the membership of our galaxies in galaxy groups to separate central and satellite systems. For barred galaxies as a whole, we find only a very weak difference in all of these quantities compared to the unbarred galaxies of the control sample. When we split our sample into early- and late-type galaxies, we see a weak but significant trend for early-type galaxies with a bar to be more strongly clustered on scales from a few hundred kpc to 1 Mpc when compared to unbarred early-type galaxies. This indicates that the presence of a bar in early-type galaxies depends on the location within their host dark matter halos. This is confirmed by the group catalog in the sense that, for early types, the fraction of central galaxies is smaller if they have a bar. For late-type galaxies, we find fewer neighbors within ~50 kpc around barred galaxies when compared to unbarred galaxies from the control sample, suggesting that tidal forces from close companions suppress the formation/growth of bars. Finally, we find no obvious correlation between overdensity and the presence of bars in our sample, showing that galactic bars are not obviously linked to the large-scale structure of the universe.
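As a toy illustration of one of these statistics, the sketch below counts tracer galaxies within a projected 50 kpc aperture around each target galaxy. Positions are assumed to be already-projected comoving coordinates in kpc, and the background subtraction used in the actual measurement is omitted.

```python
# Toy neighbour count within a projected 50 kpc aperture; fake catalogues only.
import numpy as np

rng = np.random.default_rng(1)
targets = rng.uniform(0, 50_000, size=(100, 2))      # (x, y) in kpc, fake barred sample
tracers = rng.uniform(0, 50_000, size=(20_000, 2))   # fake tracer catalogue

def neighbour_count(targets, tracers, r_max=50.0):
    """Number of tracers within r_max kpc (projected) of each target."""
    d2 = ((targets[:, None, :] - tracers[None, :, :]) ** 2).sum(axis=-1)
    return (d2 < r_max ** 2).sum(axis=1)

counts = neighbour_count(targets, tracers)
print(counts.mean())   # the real analysis compares barred vs. unbarred control samples
```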