
Similar question retrieval is a core task in community-based question answering (CQA) services. To balance effectiveness and efficiency, a question retrieval system is typically implemented as a cascade of rankers: the first-stage ranker recalls potentially relevant questions from a large repository, and the later stages re-rank the retrieved results. Most existing work on question retrieval has focused on the re-ranking stages, leaving the first-stage ranker to traditional term-based methods. However, term-based methods often suffer from the vocabulary mismatch problem, especially on short texts, which may cut the re-rankers off from relevant questions at the very beginning. An alternative is to employ embedding-based methods for the first-stage ranker, which compress texts into dense vectors to enhance semantic matching. However, these methods often lack the discriminative power of term-based methods, and thus introduce noise during retrieval and hurt recall. In this work, we aim to tackle this dilemma of the first-stage ranker, and propose a discriminative semantic ranker, namely DenseTrans, for high-recall retrieval. Specifically, DenseTrans is a densely connected Transformer, which learns semantic embeddings for texts based on Transformer layers. Meanwhile, DenseTrans promotes low-level features through dense connections to preserve the discriminative power of the learned representations. DenseTrans is inspired by DenseNet in computer vision (CV), but uses dense connectivity in a new way that is entirely different from its original design purpose. Experimental results on two question retrieval benchmark datasets show that our model obtains significant gains in recall over strong term-based methods as well as state-of-the-art embedding-based methods.
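As a rough illustration of what an embedding-based first-stage ranker does, the sketch below ranks a small repository by cosine similarity to a query vector and measures recall at k. It is not the DenseTrans model: the three-dimensional vectors are hand-made stand-ins for learned embeddings, and the names `first_stage` and `recall_at_k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def first_stage(query_vec, repository, k):
    """Return the ids of the k repository entries most similar to the query."""
    ranked = sorted(
        repository.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [question_id for question_id, _ in ranked[:k]]


def recall_at_k(retrieved, relevant):
    """Fraction of the relevant ids that appear among the retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


# Toy embeddings (assumed values, for illustration only).
repository = {
    "q1": [0.9, 0.1, 0.0],
    "q2": [0.8, 0.2, 0.1],
    "q3": [0.0, 0.1, 0.9],
    "q4": [0.1, 0.9, 0.1],
}
query = [1.0, 0.1, 0.0]

candidates = first_stage(query, repository, k=2)
print(candidates, recall_at_k(candidates, {"q1", "q2"}))
```

In a multi-stage system, only `candidates` would be passed on to the more expensive re-rankers, which is why recall at this stage bounds what the later stages can recover.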
Bo Peng, Hongxing Fan, Wei Wang, 2021
This paper presents a summary of the DFGC 2021 competition. DeepFake technology is developing fast, and realistic face-swaps are increasingly deceiving and hard to detect. At the same time, DeepFake detection methods are also improving. There is a two-party game between DeepFake creators and detectors. This competition provides a common platform for benchmarking the adversarial game between current state-of-the-art DeepFake creation and detection methods. In this paper, we present the organization, results and top solutions of this competition and also share our insights obtained during this event. We also release the DFGC-21 testing dataset collected from our participants to further benefit the research community.
Constructing stealthy malware has gained increasing popularity among cyber attackers as a way to conceal their malicious intent. Nevertheless, such stealthy malware still fails to survive reverse engineering by security experts. Therefore, this paper models a type of malware with an unbreakable security attribute, unbreakable malware (UBM), and makes a systematic study of this new type of threat through modeling, method analysis, experiments, evaluation and anti-defense capacity tests. Specifically, we first formalize the definition of UBM and analyze its security attributes, and put forward two core features that are essential for realizing the unbreakable security attribute, together with their relevant tetrad for evaluation. Then, we design and implement four algorithms for constructing UBM, and verify the unbreakable security attribute based on our evaluation of the two core features. After that, the four verified algorithms are employed to construct UBM instances, and by analyzing their volume increment and anti-defense capacity, we confirm the real-world applicability of UBM. Finally, to address the new threats that UBM poses to cyberspace, this paper explores some possible defense measures, with a view to establishing defense systems against UBM attacks.
Zheng Chen, Xing Fan, Yuan Ling, 2020
Query rewriting (QR) is an increasingly important technique to reduce customer friction caused by errors in a spoken language understanding pipeline, where the errors originate from various sources such as speech recognition errors, language understanding errors or entity resolution errors. In this work, we first propose a neural-retrieval based approach for query rewriting. Then, inspired by the wide success of pre-trained contextual language embeddings, and also as a way to compensate for insufficient QR training data, we propose a language-modeling (LM) based approach to pre-train query embeddings on historical user conversation data with a voice assistant. In addition, we propose to use the NLU hypotheses generated by the language understanding system to augment the pre-training. Our experiments show that pre-training provides rich prior information and helps the QR task achieve strong performance. We also show that joint pre-training with NLU hypotheses brings further benefit. Finally, after pre-training, we find that a small set of rewrite pairs is enough to fine-tune the QR model to outperform a strong baseline fully trained on all QR training data.
Voice-controlled household devices, like Amazon Echo or Google Home, face the problem of performing speech recognition of device-directed speech in the presence of interfering background speech, i.e., background noise and interfering speech from another person or media device in proximity need to be ignored. We propose two end-to-end models to tackle this problem with information extracted from the anchored segment. The anchored segment refers to the wake-up word part of an audio stream, which contains valuable speaker information that can be used to suppress interfering speech and background noise. The first method is called Multi-source Attention, where the attention mechanism takes both the speaker information and the decoder state into consideration. The second method directly learns a frame-level mask on top of the encoder output. We also explore a multi-task learning setup where we use the ground truth of the mask to guide the learner. Given that audio data with interfering speech is rare in our training data set, we also propose a way to synthesize noisy speech from clean speech to mitigate the mismatch between training and test data. Our proposed methods show up to 15% relative reduction in WER for Amazon Alexa live data with interfering background speech, without significant degradation on clean speech.
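The abstract does not say how noisy speech is synthesized from clean speech. One common generic approach, shown here only as an assumed illustration and not as the authors' procedure, is to scale an interfering signal so that the mixture has a chosen signal-to-noise ratio and then add it to the clean signal. The function names and the sine and cosine test signals are hypothetical.

```python
import math


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, interference, snr_db):
    """Add `interference` to `clean`, scaled so the clean-to-interference
    power ratio equals `snr_db` decibels. Returns (mixture, gain)."""
    if len(clean) != len(interference):
        raise ValueError("signals must have the same length")
    p_clean = power(clean)
    p_interf = power(interference)
    if p_clean == 0.0 or p_interf == 0.0:
        raise ValueError("signals must have non-zero power")
    # Solve p_clean / (gain**2 * p_interf) = 10 ** (snr_db / 10) for gain.
    gain = math.sqrt(p_clean / (p_interf * 10 ** (snr_db / 10)))
    mixture = [c + gain * n for c, n in zip(clean, interference)]
    return mixture, gain


# Stand-in signals; real use would load audio samples instead.
clean = [math.sin(0.1 * i) for i in range(1000)]
interference = [math.cos(0.37 * i) for i in range(1000)]

mixture, gain = mix_at_snr(clean, interference, snr_db=10.0)
achieved = 10 * math.log10(power(clean) / power([gain * n for n in interference]))
print(round(achieved, 6))
```

Sweeping `snr_db` over a range of values gives training mixtures of varying difficulty from the same pair of recordings.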
Xing Fan, Hao Luo, Xuan Zhang, 2018
Holistic person re-identification (ReID) has received extensive study in the past few years and has achieved impressive progress. However, persons are often occluded by obstacles or other persons in practical scenarios, which makes partial person re-identification non-trivial. In this paper, we propose a spatial-channel parallelism network (SCPNet) in which each channel in the ReID feature pays attention to a given spatial part of the body. The spatial-channel correspondence supervises the network to learn discriminative features for both holistic and partial person re-identification. A single model trained on four holistic ReID datasets achieves competitive accuracy on these four datasets, and also outperforms the state-of-the-art methods on two partial ReID datasets without training on them.
Single top quark production cross sections at hadron colliders are traditionally used to extract the modulus of the $V_{tb}$ element of the Cabibbo-Kobayashi-Maskawa matrix under the following assumption: $|V_{tb}| \gg |V_{td}|, |V_{ts}|$. For the first time, direct limits on $|V_{td}|$ and $|V_{ts}|$ are obtained using experimental data without the assumption of the unitarity of the CKM matrix. Limits on $|V_{td}|$, $|V_{ts}|$ and $|V_{tb}|$ are extracted from differential measurements of single top quark cross sections in the $t$-channel as a function of the rapidity and transverse momentum of the top quark and the light jet recoiling against the top quark. We show that the pseudorapidity of the forward jet in single top production is one of the most powerful observables for discriminating between the $|V_{td}|$ and $|V_{tb}|$ events. We perform a global fit of top quark related CKM elements to experimental data from the LHC Runs I and II and the Tevatron. Experimental data include inclusive and differential single top cross sections in the $t$-channel, the inclusive tW production cross section, and the top quark branching ratio to a b quark and a W boson. We present bounds on $|V_{tb}|$, $|V_{ts}|$ and $|V_{td}|$ using current data and project the results for future LHC data sets corresponding to luminosities of 300 and 3000 fb$^{-1}$.
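For orientation, the way the three CKM elements typically enter these observables at leading order can be sketched as follows. The notation is an assumption introduced here, not taken from the paper: $\sigma_q$ denotes the $t$-channel cross section initiated by a down-type quark $q$ computed with the corresponding CKM element set to one, and $R$ denotes the top quark branching fraction to $Wb$.

```latex
\sigma_{t\text{-ch}} \;\propto\; |V_{td}|^2\,\sigma_d + |V_{ts}|^2\,\sigma_s + |V_{tb}|^2\,\sigma_b ,
\qquad
R \;=\; \frac{\mathcal{B}(t \to Wb)}{\sum_{q=d,s,b} \mathcal{B}(t \to Wq)}
  \;=\; \frac{|V_{tb}|^2}{|V_{td}|^2 + |V_{ts}|^2 + |V_{tb}|^2} .
```

Under the traditional assumption $|V_{tb}| \gg |V_{td}|, |V_{ts}|$, the first expression reduces to the $|V_{tb}|^2\,\sigma_b$ term and $R \approx 1$. Because $\sigma_d$ is driven by valence $d$ quarks, its kinematic distributions differ from those of $\sigma_b$, which is consistent with the abstract's remark about the forward jet pseudorapidity.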
A generalized Heisenberg-Euler formula is given for an Abelian gauge theory having vector as well as axial vector couplings to a massive fermion. The formula is therefore applicable to a parity-violating theory. The gauge group is chosen to be $U(1)$. The formula is quite similar to that in quantum electrodynamics, with the complication that one factor (related to spin) is expressed in terms of an expectation value. The expectation value is evaluated by contraction with the one-dimensional propagator in a given background field. The formula provides a basis for vacuum magnetic birefringence experiments, which aim to probe the dark sector, where the interactions of light fermions with the gauge fields are not necessarily parity conserving.
A new experiment to measure vacuum magnetic birefringence (VMB), the OVAL experiment, is reported. We developed an original pulsed magnet that has a high repetition rate and applies the strongest magnetic field among VMB experiments. The vibration isolation design and feedback system enable the direct combination of the magnet with a Fabry-Perot cavity. To ensure the searching potential, a calibration measurement with dilute nitrogen gas and a prototype search for vacuum magnetic birefringence are performed. Based on the results, a strategy to observe vacuum magnetic birefringence is reported.
A new method of cooling positronium is proposed to realize Bose-Einstein condensation (BEC) of positronium. We perform detailed studies of three processes: (1) thermalization between positronium and the silica walls of a cavity, (2) Ps-Ps scattering and (3) laser cooling. The thermalization process is shown to be insufficient for BEC. Ps-Ps collisions are also shown to have a large effect on the cooling performance. We combine both methods and establish an efficient cooling scheme for BEC. We also propose a new optical laser system for the cooling.