During the COVID-19 epidemic, almost everyone wears a facial mask, which poses a huge challenge to deep face recognition. In this workshop, we organize the Masked Face Recognition (MFR) challenge and focus on benchmarking deep face recognition methods in the presence of facial masks. The MFR challenge has two main tracks: the InsightFace track and the WebFace260M track. For the InsightFace track, we manually collect a large-scale masked face test set with 7K identities. In addition, we collect a children test set including 14K identities and a multi-racial test set containing 242K identities. Using these three test sets, we build an online model testing system that gives a comprehensive evaluation of face recognition models. To avoid data privacy problems, no test image is released to the public. As the challenge is still ongoing, we will keep updating the top-ranked solutions as well as this report on arXiv.
119 - Xinzi He, Jia Guo, Xuzhe Zhang 2021
Unsupervised learning-based medical image registration approaches have witnessed rapid development in recent years. We propose to revisit a commonly ignored yet simple and well-established principle: recursive refinement of deformation vector fields across scales. We introduce a recursive refinement network (RRN) for unsupervised medical image registration, which extracts multi-scale features, constructs a normalized local cost correlation volume, and recursively refines volumetric deformation vector fields. RRN achieves state-of-the-art performance for 3D registration of expiratory-inspiratory pairs of CT lung scans. On the DirLab COPDGene dataset, RRN returns an average Target Registration Error (TRE) of 0.83 mm, which corresponds to a 13% error reduction from the best result presented in the leaderboard. In addition to comparison with conventional methods, RRN leads to an 89% error reduction compared to deep-learning-based peer approaches.
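To make the TRE metric quoted above concrete, here is a minimal sketch of how a mean landmark error in millimetres is commonly computed. It is an illustration under stated assumptions, not the evaluation code of the RRN authors: the function name `target_registration_error`, the landmark lists, and the `spacing` parameter (voxel size in mm per axis) are all hypothetical.

```python
import math


def target_registration_error(fixed_pts, warped_pts, spacing=(1.0, 1.0, 1.0)):
    """Mean Euclidean distance between corresponding landmarks.

    fixed_pts / warped_pts: equal-length lists of (x, y, z) voxel coordinates.
    spacing: assumed voxel size in mm along each axis, so the result is in mm.
    """
    if not fixed_pts or len(fixed_pts) != len(warped_pts):
        raise ValueError("need two non-empty landmark lists of equal length")
    total = 0.0
    for p, q in zip(fixed_pts, warped_pts):
        # Scale each coordinate difference by the voxel size before taking the norm.
        total += math.sqrt(sum(((a - b) * s) ** 2 for a, b, s in zip(p, q, spacing)))
    return total / len(fixed_pts)


# Hypothetical landmarks: one pair is off by a 3-4-5 triangle, the other matches.
fixed = [(0, 0, 0), (1, 1, 1)]
warped = [(3, 4, 0), (1, 1, 1)]
print(target_registration_error(fixed, warped))
```

A lower value means the warped landmarks sit closer to their targets; anisotropic CT voxels are handled by passing the real voxel size as `spacing`.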
Massive multiple-input multiple-output can obtain more performance gain by exploiting the downlink channel state information (CSI) at the base station (BS). Therefore, studying CSI feedback with limited communication resources in frequency-division duplexing systems is of great importance. Recently, deep learning (DL)-based CSI feedback has shown considerable potential. However, the existing DL-based explicit feedback schemes are difficult to deploy because current fifth-generation mobile communication protocols and systems are designed based on an implicit feedback mechanism. In this paper, we propose a DL-based implicit feedback architecture that inherits the low-overhead characteristic and uses neural networks (NNs) to replace the precoding matrix indicator (PMI) encoding and decoding modules. By using environment information, the NNs can achieve a more refined mapping between the precoding matrix and the PMI than codebooks can. The correlation between subbands is also used to further improve the feedback performance. Simulation results show that, for a single resource block (RB), the proposed architecture can save 25.0% and 40.0% of overhead compared with the Type I codebook under two antenna configurations, respectively. For a wideband system with 52 RBs, overhead can be reduced by 30.7% and 48.0% compared with the Type II codebook when ignoring and when exploiting subband correlation, respectively.
Although tremendous strides have been made in uncontrolled face detection, efficient face detection with a low computation cost as well as high precision remains an open challenge. In this paper, we point out that training data sampling and computation distribution strategies are the keys to efficient and accurate face detection. Motivated by these observations, we introduce two simple but effective methods: (1) Sample Redistribution (SR), which augments training samples for the most needed stages, based on the statistics of benchmark datasets; and (2) Computation Redistribution (CR), which reallocates the computation between the backbone, neck and head of the model, based on a meticulously defined search methodology. Extensive experiments conducted on WIDER FACE demonstrate the state-of-the-art efficiency-accuracy trade-off for the proposed SCRFD family across a wide range of compute regimes. In particular, SCRFD-34GF outperforms the best competitor, TinaFace, by $3.86\%$ (AP at hard set) while being more than $3\times$ faster on GPUs with VGA-resolution images. We also release our code to facilitate future research.
This paper describes an adaptive method in continuous time for the estimation of external fields by a team of $N$ agents. The agents $i$ each explore subdomains $\Omega^i$ of a bounded subset of interest $\Omega \subset X := \mathbb{R}^d$. Ideal adaptive estimates $\hat{g}^i_t$ are derived for each agent from a distributed parameter system (DPS) that takes values in the scalar-valued reproducing kernel Hilbert space $H_X$ of functions over $X$. Approximations of the evolution of the ideal local estimate $\hat{g}^i_t$ of agent $i$ are constructed solely using observations made by agent $i$ on a fine time scale. Since the local estimates on the fine time scale are constructed independently for each agent, we say that the method is strictly decentralized. On a coarse time scale, the individual local estimates $\hat{g}^i_t$ are fused via the expression $\hat{g}_t := \sum_{i=1}^N \Psi^i \hat{g}^i_t$ that uses a partition of unity $\{\Psi^i\}_{1 \leq i \leq N}$ subordinate to the cover $\{\Omega^i\}_{i=1,\ldots,N}$ of $\Omega$. Realizable algorithms are obtained by constructing finite dimensional approximations of the DPS in terms of scattered bases defined by each agent from samples along their trajectories. Rates of convergence of the error in the finite dimensional approximations are derived in terms of the fill distance of the samples that define the scattered centers in each subdomain. The qualitative performance of the convergence rates for the decentralized estimation method is illustrated via numerical simulations.
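The fusion step, in which local estimates are blended with a partition of unity subordinate to the cover, can be sketched in one dimension as follows. This is only an illustration of the formula in the abstract, not the authors' algorithm: the hat-shaped weights, the interval subdomains, and the names `hat_weight`, `partition_of_unity`, and `fuse` are assumptions made for the example.

```python
def hat_weight(x, lo, hi):
    """Non-negative weight supported on the open interval (lo, hi)."""
    if x <= lo or x >= hi:
        return 0.0
    mid = 0.5 * (lo + hi)
    half = 0.5 * (hi - lo)
    return 1.0 - abs(x - mid) / half


def partition_of_unity(x, subdomains):
    """Normalize the weights so they sum to 1 at x (one value per subdomain)."""
    weights = [hat_weight(x, lo, hi) for lo, hi in subdomains]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("x is not covered by any subdomain")
    return [w / total for w in weights]


def fuse(x, subdomains, local_estimates):
    """Fused estimate: sum over agents of Psi^i(x) * g^i(x)."""
    psi = partition_of_unity(x, subdomains)
    return sum(p * g(x) for p, g in zip(psi, local_estimates))


# Two hypothetical agents with overlapping subdomains covering [0, 1].
subdomains = [(-0.1, 0.6), (0.4, 1.1)]
local_estimates = [lambda x: 2.0 * x, lambda x: 2.0 * x + 1.0]
print(fuse(0.2, subdomains, local_estimates))  # only agent 1 contributes here
print(fuse(0.5, subdomains, local_estimates))  # both agents are blended here
```

Because each weight vanishes outside its own subdomain, an agent's estimate never influences regions it did not explore, and in overlaps the fused value is a convex combination of the local estimates.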
The active Brownian particle (ABP) model describes a swimmer, synthetic or living, whose direction of swimming is a Brownian motion. The swimming is due to a propulsion force, and the fluctuations are typically thermal in origin. We present a 2D model where the fluctuations arise from nonthermal noise in a propelling force acting at a single point, such as that due to a flagellum. We take the overdamped limit and find several modifications to the traditional ABP model. Since the fluctuating force causes a fluctuating torque, the diffusion tensor describing the process has a coupling between translational and rotational degrees of freedom. An anisotropic particle also exhibits a noise-induced drift. We show that these effects have measurable consequences for the long-time diffusivity of active particles, in particular adding a contribution that is independent of where the force acts.
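For readers unfamiliar with the baseline, the traditional 2D ABP model that this abstract modifies can be simulated with a simple Euler-Maruyama scheme. The sketch below covers only that traditional model, with isotropic translational noise and independent rotational noise; it does not include the translation-rotation coupling or noise-induced drift described above. The function name `simulate_abp` and all parameter names are assumptions for illustration.

```python
import math
import random


def simulate_abp(v0, d_t, d_r, dt, n_steps, rng, theta0=0.0):
    """Euler-Maruyama integration of a traditional 2D active Brownian particle.

    v0: swimming speed, d_t: translational diffusivity,
    d_r: rotational diffusivity, dt: time step, rng: random.Random instance.
    Returns the final (x, y, theta).
    """
    x = y = 0.0
    theta = theta0
    trans_amp = math.sqrt(2.0 * d_t * dt)
    rot_amp = math.sqrt(2.0 * d_r * dt)
    for _ in range(n_steps):
        # Self-propulsion along the current heading plus translational noise.
        x += v0 * math.cos(theta) * dt + trans_amp * rng.gauss(0.0, 1.0)
        y += v0 * math.sin(theta) * dt + trans_amp * rng.gauss(0.0, 1.0)
        # The heading itself performs Brownian motion.
        theta += rot_amp * rng.gauss(0.0, 1.0)
    return x, y, theta


# Hypothetical parameters; with a fixed seed the run is reproducible.
print(simulate_abp(v0=1.0, d_t=0.1, d_r=1.0, dt=0.01, n_steps=1000,
                   rng=random.Random(0)))
```

With both diffusivities set to zero the particle swims in a straight line, which is a convenient sanity check before adding noise.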
64 - Jia Guo, Chen Zhu, Yilun Zhao 2020
Multi-modal representation learning by pretraining has attracted increasing interest due to its ease of use and potential benefit for various Visual-and-Language (V-L) tasks. However, its requirement for a large volume of high-quality vision-language pairs greatly limits its value in practice. In this paper, we propose a novel label-augmented V-L pretraining model, named LAMP, to address this problem. Specifically, we leverage auto-generated labels of visual objects to enrich vision-language pairs with fine-grained alignment and design a corresponding novel pretraining task. We also find that such label augmentation in second-stage pretraining further benefits various downstream tasks. To evaluate LAMP, we compare it with several state-of-the-art models on four downstream tasks. The quantitative results and analysis demonstrate the value of labels in V-L pretraining and the effectiveness of LAMP.
70 - Yilun Zhao, Jia Guo 2020
Music annotation has always been one of the critical topics in the field of Music Information Retrieval (MIR). Traditional models use supervised learning for music annotation tasks. However, as supervised machine learning approaches increase in complexity, the growing need for annotated training data often cannot be met by the available data. In this paper, a new self-supervised music acoustic representation learning approach named MusiCoder is proposed. Inspired by the success of BERT, MusiCoder builds upon the architecture of self-attention bidirectional transformers. Two pre-training objectives, Contiguous Frames Masking (CFM) and Contiguous Channels Masking (CCM), are designed to adapt BERT-like masked reconstruction pre-training to the continuous acoustic frame domain. The performance of MusiCoder is evaluated on two downstream music annotation tasks. The results show that MusiCoder outperforms the state-of-the-art models in both music genre classification and auto-tagging tasks. The effectiveness of MusiCoder indicates the great potential of a new self-supervised learning approach to understanding music: first apply masked reconstruction tasks to pre-train a transformer-based model with massive unlabeled music acoustic data, and then fine-tune the model on specific downstream tasks with labeled data.
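The two masking objectives can be pictured on a small spectrogram-like array, stored as a list of frames with one value per channel. The sketch below is an assumption-laden illustration rather than MusiCoder's implementation: masking by zeroing, the span lengths, and the helper names `mask_contiguous_frames`, `mask_contiguous_channels`, and `random_start` are all hypothetical, and the abstract does not specify the actual masking policy.

```python
import random


def mask_contiguous_frames(spec, start, span):
    """Return a copy of spec with frames [start, start + span) set to zero."""
    masked = [row[:] for row in spec]
    for t in range(start, min(start + span, len(masked))):
        masked[t] = [0.0] * len(masked[t])
    return masked


def mask_contiguous_channels(spec, start, width):
    """Return a copy of spec with channels [start, start + width) zeroed in every frame."""
    masked = [row[:] for row in spec]
    for row in masked:
        for c in range(start, min(start + width, len(row))):
            row[c] = 0.0
    return masked


def random_start(length, span, rng):
    """Pick a start index so that the masked block fits inside the axis."""
    return rng.randrange(0, max(1, length - span + 1))


# A toy input with 4 frames and 3 channels, all ones.
spec = [[1.0, 1.0, 1.0] for _ in range(4)]
rng = random.Random(0)
frames_masked = mask_contiguous_frames(spec, random_start(len(spec), 2, rng), 2)
channels_masked = mask_contiguous_channels(spec, random_start(3, 1, rng), 1)
print(frames_masked)
print(channels_masked)
```

In a masked reconstruction setup, the model would receive the masked array and be trained to predict the original values at the masked positions.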
The transfer learning toolkit wraps the code of 17 transfer learning models and provides integrated interfaces, allowing users to use those models by calling a simple function. It is easy for primary researchers to use this toolkit and to choose proper models for real-world applications. The toolkit is written in Python and distributed under the MIT open source license. In this paper, the current state of the toolkit is described, and the necessary environment settings and usage are introduced.
In the bulk, LaCoO3 (LCO) is a paramagnet, yet in tensile-strained thin films at low temperature ferromagnetism (FM) is observed, and its origin remains unresolved. Polarized neutron reflectometry (PNR) is a powerful tool to determine the depth profiles of the structure and magnetization simultaneously, and thus the evolution of the interfacial FM with strain can be accurately revealed. Here we quantitatively measured the distribution of atomic density and magnetization in LCO films by PNR and found that the LCO layers near the heterointerfaces exhibit a reduced magnetization but an enhanced atomic density, whereas the interior shows the opposite trend. We attribute the nonuniformity to the symmetry mismatch at the interface, which induces a structural distortion related to the ferroelasticity of LCO. This assertion is tested by systematic application of hydrostatic pressure during the PNR experiments. These results provide unique insights into mechanisms driving FM in strained LCO films while offering a tantalizing observation that tunable deformation of the CoO6 octahedra in combination with the ferroelastic order parameter.