
98 - Tianfang Zhu , Yue Guan , Anan Li 2021
Augmentation can benefit point cloud learning because large-scale public datasets are scarce. This paper proposes a mix-up augmentation approach, PointManifoldCut, which replaces the neural-network-embedded points rather than the Euclidean space coordinates. The approach exploits the fact that points at the higher levels of the neural network are already trained to embed their neighbors' relations, so mixing these representations will not muddle the relation between a point and its label. This allows the parameter space to be regularized as in other augmentation methods, but without worrying about the proper label of the replaced points. The experiments show that our proposed approach provides competitive performance on point cloud classification and segmentation when it is combined with cutting-edge vanilla point cloud networks. The results show a consistent performance boost compared to other state-of-the-art point cloud augmentation methods, such as PointMixup and PointCutMix. The code of this paper is available at: https://github.com/fun0515/PointManifoldCut.
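To make the idea of replacing embedded points concrete, the following is a minimal sketch, not the authors' implementation (which is in the repository linked above). It assumes two samples have already been passed through some network layer and are available as equal-length lists of per-point embeddings; the function name `manifold_cut`, the `ratio` parameter, and the choice of a uniformly random subset of points are all hypothetical choices made for illustration.

```python
import random
from typing import List, Tuple

Embedding = List[float]


def manifold_cut(
    feats_a: List[Embedding],
    feats_b: List[Embedding],
    ratio: float,
    rng: random.Random,
) -> Tuple[List[Embedding], List[int]]:
    """Replace a random fraction of sample A's point embeddings with sample B's.

    Hypothetical helper: returns the mixed embeddings and the replaced indices.
    """
    if len(feats_a) != len(feats_b):
        raise ValueError("both samples must have the same number of points")
    n_replace = round(ratio * len(feats_a))
    replaced = sorted(rng.sample(range(len(feats_a)), n_replace))
    mixed = [list(f) for f in feats_a]  # copy so the inputs stay untouched
    for i in replaced:
        mixed[i] = list(feats_b[i])
    return mixed, replaced


if __name__ == "__main__":
    rng = random.Random(0)
    a = [[0.0, 0.0] for _ in range(8)]  # toy embeddings of sample A
    b = [[1.0, 1.0] for _ in range(8)]  # toy embeddings of sample B
    mixed, replaced = manifold_cut(a, b, ratio=0.25, rng=rng)
    print(len(replaced), "of", len(mixed), "embeddings replaced")
```

The sketch only covers the replacement step; which layer to mix at and how the training targets of the two samples are combined are left to the paper and its code.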
106 - Haonan Li , Yeyun Gong , Jian Jiao 2021
Pre-trained language models have led to substantial gains over a broad range of natural language processing (NLP) tasks, but have been shown to have limitations for natural language generation tasks with high-quality requirements on the output, such as commonsense generation and ad keyword generation. In this work, we present a novel Knowledge Filtering and Contrastive learning Network (KFCNet) which references external knowledge and achieves better generation performance. Specifically, we propose a BERT-based filter model to remove low-quality candidates, and apply contrastive learning separately to each of the encoder and decoder, within a general encoder-decoder architecture. The encoder contrastive module helps to capture global target semantics during encoding, and the decoder contrastive module enhances the utility of retrieved prototypes while learning general features. Extensive experiments on the CommonGen benchmark show that our model outperforms the previous state of the art by a large margin: +6.6 points (42.5 vs. 35.9) for BLEU-4, +3.7 points (33.3 vs. 29.6) for SPICE, and +1.3 points (18.3 vs. 17.0) for CIDEr. We further verify the effectiveness of the proposed contrastive module on ad keyword generation, and show that our model has potential commercial value.
163 - Lianbo Ma , Nan Li 2021
In deploying deep neural models, it is fundamental to find feasible deep models effectively and automatically under diverse design objectives. Most existing neural architecture search (NAS) methods use surrogates to predict the detailed performance (e.g., accuracy and model size) of a candidate architecture during the search, which is complicated and inefficient. In contrast, we aim to learn an efficient Pareto classifier that simplifies the NAS search process by transforming the complex multi-objective NAS task into a simple Pareto-dominance classification task. To this end, we propose a classification-wise Pareto evolution approach for one-shot NAS, where an online classifier is trained to predict the dominance relationship between the candidate and constructed reference architectures, instead of using surrogates to fit the objective functions. The main contribution of this study is to change supernet adaptation into a Pareto classifier. In addition, we design two adaptive schemes, one to select the reference set of architectures for constructing the classification boundary and one to regulate the ratio of positive to negative samples. We compare the proposed evolution approach with state-of-the-art approaches on widely used benchmark datasets; experimental results indicate that it outperforms the other approaches and finds a number of neural architectures with model sizes ranging from 2M to 6M under diverse objectives and constraints.
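The Pareto-dominance relation that such a classifier would be trained to predict can be written down in a few lines. The sketch below is illustrative only and is not the paper's method: it assumes every objective is minimized, represents an architecture by a hypothetical tuple of objective values such as (error rate, model size in millions of parameters), and uses made-up helper names `dominates` and `make_dominance_labels`.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` Pareto-dominates `b`, assuming every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def make_dominance_labels(
    candidates: Sequence[Sequence[float]],
    references: Sequence[Sequence[float]],
) -> List[Tuple[int, int, int]]:
    """Build (candidate index, reference index, label) triples.

    The label is 1 when the candidate dominates the reference and 0 otherwise.
    This labeling scheme is an assumption made for illustration.
    """
    return [
        (i, j, int(dominates(c, r)))
        for i, c in enumerate(candidates)
        for j, r in enumerate(references)
    ]


if __name__ == "__main__":
    # Hypothetical objective vectors: (error rate, model size in M parameters).
    candidates = [(0.10, 2.0), (0.30, 5.0)]
    references = [(0.20, 3.0)]
    for triple in make_dominance_labels(candidates, references):
        print(triple)
```

Triples of this kind are the sort of binary targets a dominance classifier could learn from, which avoids regressing each objective value separately.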
A critical aspect of autonomous vehicles (AVs) is the object detection stage, which is increasingly being performed with sensor fusion models: multimodal 3D object detection models which utilize both 2D RGB image data and 3D data from a LIDAR sensor as inputs. In this work, we perform the first study to analyze the robustness of a high-performance, open-source sensor fusion model architecture against adversarial attacks and challenge the popular belief that the use of additional sensors automatically mitigates the risk of adversarial attacks. We find that despite the use of a LIDAR sensor, the model is vulnerable to our purposefully crafted image-based adversarial attacks, including disappearance, universal patch, and spoofing. After identifying the underlying reason, we explore some potential defenses and provide some recommendations for improved sensor fusion models.
111 - Hsiang-nan Li 2021
We develop an inverse matrix method to solve for resonance masses from a dispersion relation obeyed by a correlation function. Given the operator product expansion (OPE) of a correlation function in the deep Euclidean region, we obtain the nonperturbative spectral density, which exhibits resonance structures naturally. The value of the gluon condensate in the OPE is fixed by producing the $\rho$ meson mass in the formalism, and then input into the dispersion relations for the scalar, pseudoscalar and tensor glueballs. It is shown that the low-energy limit of the correlation function for the scalar glueball, derived from the spectral density, discriminates the lattice estimate for the triple-gluon condensate from the single-instanton estimate. The spectral densities for the scalar and pseudoscalar glueballs reveal a double-peak structure: the peak located at lower mass implies that the $f_0(500)$ and $f_0(980)$ ($\eta$ and $\eta'$) mesons contain a small amount of gluonium components, and should be included in scalar (pseudoscalar) mixing frameworks. Another peak determines the scalar (pseudoscalar) glueball mass around 1.50 (1.75) GeV with a broad width of about 200 MeV, suggesting that the $f_0(1370)$, $f_0(1500)$ and $f_0(1710)$ ($\eta(1760)$) mesons are the glue-rich states. We also predict the topological susceptibility $\chi_t^{1/4}=75$-$78$ MeV, deduced from the correlation function for the pseudoscalar glueball at zero momentum. Our analysis gives no resonance solution for the tensor glueball, which may be attributed to the insufficient nonperturbative condensate information in the currently available OPE.
146 - Runnan Liu , Liang Liu , Dazhi He 2021
Knowledge of channel covariance matrices is of paramount importance to the estimation of instantaneous channels and the design of beamforming vectors in multi-antenna systems. In practice, an abrupt change in channel covariance matrices may occur due to a change in the environment or the user location. Although several works have proposed efficient algorithms to estimate the channel covariance matrices after a change occurs, how to detect such a change accurately and quickly is still an open problem in the literature. In this paper, we focus on channel covariance change detection between a multi-antenna base station (BS) and a single-antenna user equipment (UE). To provide a theoretical performance limit, we first propose a genie-aided change detector based on the log-likelihood ratio (LLR) test, assuming the channel covariance matrix after the change is known, and characterize the corresponding missed detection and false alarm probabilities. Then, this paper considers the practical case where the channel covariance matrix after the change is unknown. The maximum likelihood (ML) estimation technique is used to estimate the covariance matrix based on the received pilot signals over a certain number of coherence blocks, building upon which the LLR-based change detector is employed. Numerical results show that our proposed scheme can detect the change with low error probability even when the number of channel samples is so small that the estimate of the covariance matrix is not very accurate. This result verifies that the channel covariance change can be detected both accurately and quickly in practice.
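The genie-aided case, where the covariance after the change is known, can be illustrated with a heavily simplified LLR test. The sketch below is not the paper's detector: it assumes real-valued, zero-mean Gaussian samples with diagonal covariances (so only per-antenna variances are needed) and observes the channel directly rather than through pilot signals. The names `llr` and `change_detected` and the default threshold of 0 are hypothetical.

```python
import math
from typing import Sequence


def llr(
    sample: Sequence[float],
    var_before: Sequence[float],
    var_after: Sequence[float],
) -> float:
    """Log-likelihood ratio log p_after(x) / p_before(x) for one sample.

    Assumes independent zero-mean real Gaussian entries with the given variances.
    """
    total = 0.0
    for x, v0, v1 in zip(sample, var_before, var_after):
        total += 0.5 * (math.log(v0 / v1) + x * x * (1.0 / v0 - 1.0 / v1))
    return total


def change_detected(
    samples: Sequence[Sequence[float]],
    var_before: Sequence[float],
    var_after: Sequence[float],
    threshold: float = 0.0,
) -> bool:
    """Declare a change when the summed LLR over all samples exceeds the threshold."""
    return sum(llr(s, var_before, var_after) for s in samples) > threshold


if __name__ == "__main__":
    before = [1.0, 1.0]  # per-antenna variances before the change (toy values)
    after = [4.0, 4.0]   # per-antenna variances after the change (toy values)
    quiet = [[0.1, -0.2], [0.0, 0.3]]
    loud = [[3.0, 3.0], [2.5, -2.5]]
    print(change_detected(quiet, before, after))
    print(change_detected(loud, before, after))
```

Raising the threshold trades false alarms for missed detections; in the unknown-covariance case described above, `var_after` would be replaced by an estimate computed from the received samples.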
Operational networks are increasingly using machine learning models for a variety of tasks, including detecting anomalies, inferring application performance, and forecasting demand. Accurate models are important, yet accuracy can degrade over time due to concept drift, whereby either the characteristics of the data change over time (data drift) or the relationship between the features and the target predictor changes over time (model drift). Drift is important to detect because changes in properties of the underlying data or relationships to the target prediction can require model retraining, which can be time-consuming and expensive. Concept drift occurs in operational networks for a variety of reasons, ranging from software upgrades to seasonality to changes in user behavior. Yet, despite the prevalence of drift in networks, its extent and effects on prediction accuracy have not been extensively studied. This paper presents an initial exploration into concept drift in a large cellular network in the United States for a major metropolitan area in the context of demand forecasting. We find that concept drift arises largely due to data drift, and it appears across different key performance indicators (KPIs), models, training set sizes, and time intervals. We identify the sources of concept drift for the particular problem of forecasting downlink volume. Weekly and seasonal patterns introduce both high- and low-frequency model drift, while disasters and upgrades result in sudden drift due to exogenous shocks. Regions with high population density, lower traffic volumes, and higher speeds also tend to correlate with more concept drift. The features that contribute most significantly to concept drift are User Equipment (UE) downlink packets, UE uplink packets, and Real-time Transport Protocol (RTP) total received packets.
102 - Yinan Lin , Zhenhua Lin 2021
We develop a unified approach to hypothesis testing for various types of widely used functional linear models, such as scalar-on-function, function-on-function and function-on-scalar models. In addition, the proposed test applies to models of mixed types, such as models with both functional and scalar predictors. In contrast with most existing methods that rest on the large-sample distributions of test statistics, the proposed method leverages the technique of bootstrapping max statistics and exploits the variance decay property that is an inherent feature of functional data, to improve the empirical power of tests, especially when the sample size is limited and the signal is relatively weak. Theoretical guarantees on the validity and consistency of the proposed test are provided uniformly for a class of test statistics.
Achieving metrological precision of quantum anomalous Hall resistance quantization at zero magnetic field so far remains limited to temperatures of the order of 20 mK, while the Curie temperature in the involved material is as high as 20 K. The reason for this discrepancy remains one of the biggest open questions surrounding the effect, and is the focus of this article. Here we show, through a careful analysis of the non-local voltages on a multi-terminal Corbino geometry, that the chiral edge channels continue to exist without applied magnetic field up to the Curie temperature of bulk ferromagnetism of the magnetic topological insulator, and that thermally activated bulk conductance is responsible for this quantization breakdown. Our results offer important insights on the nature of the topological protection of these edge channels, provide an encouraging sign for potential applications, and establish the multi-terminal Corbino geometry as a powerful tool for the study of edge channel transport in topological materials.
Pre-Trained Models (PTMs) have been widely applied and recently proved vulnerable to backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers. When the triggers are activated, even the fine-tuned model will predict pre-defined labels, causing a security threat. These backdoors generated by the poisoning methods can be erased by changing hyper-parameters during fine-tuning or detected by finding the triggers. In this paper, we propose a stronger weight-poisoning attack method that introduces a layerwise weight poisoning strategy to plant deeper backdoors; we also introduce a combinatorial trigger that cannot be easily detected. The experiments on text classification tasks show that previous defense methods cannot resist our weight-poisoning method, which indicates that our method can be widely applied and may provide hints for future model robustness studies.