Kinship verification from facial images has been recognized as an emerging yet challenging technique in many potential computer vision applications. In this paper, we propose a novel cross-generation feature interaction learning (CFIL) framework for robust kinship verification. Particularly, an effective collaborative weighting strategy is constructed to explore the characteristics of cross-generation relations by jointly extracting features from parent and child image pairs. Specifically, we take parents and children as a whole to extract expressive local and non-local features. Unlike traditional works that measure similarity by distance, we incorporate the similarity calculations as interior auxiliary weights into the deep CNN architecture to learn the whole and natural features. These similarity weights not only involve corresponding single points but also capture the multiple relationships across points, where local and non-local features are calculated by using these two kinds of distance measurements. Importantly, instead of conducting similarity computation and feature extraction separately, we integrate similarity learning and feature extraction into one unified learning process. The integrated representations derived from local and non-local features can comprehensively express the informative semantics embedded in images and preserve abundant correlation knowledge from image pairs. Extensive experiments demonstrate the efficiency and superiority of the proposed model compared to some state-of-the-art kinship verification methods.
Kinship verification is a long-standing research challenge in computer vision. The visual differences present in faces have a significant effect on the recognition capabilities of kinship systems. We argue that aggregating multiple kinds of visual knowledge can better describe the characteristics of the subject for precise kinship identification. Typically, age-invariant features can represent more natural facial details. Such age-related transformations are essential for face recognition due to the biological effects of aging. However, existing methods mainly focus on employing single-view image features for kinship identification, while more meaningful visual properties such as race and age are ignored in the feature learning step. To this end, we propose a novel deep collaborative multi-modal learning (DCML) method that adaptively integrates the underlying information presented in facial properties to strengthen the facial details for effective unsupervised kinship verification. Specifically, we construct an adaptive feature fusion mechanism, which can jointly leverage the complementary properties from different visual perspectives to produce composite features and draw greater attention to the most informative components of spatial feature maps. Particularly, an adaptive weighting strategy is developed based on a novel attention mechanism, which can enhance the dependencies between different properties by decreasing the information redundancy in channels in a self-adaptive manner. Extensive experimental evaluations on four widely used datasets show that our DCML method consistently outperforms several state-of-the-art kinship verification methods.
In this paper, we performed thermodynamic and electron spin resonance (ESR) measurements to study low-energy magnetic excitations, which were significantly affected by crystalline electric field (CEF) excitations due to relatively small gaps between the CEF ground state and the excited states. Based on the CEF and mean-field (MF) theories, we systematically and consistently analyzed the ESR experiments and thermodynamic measurements including susceptibility, magnetization, and heat capacity. The CEF parameters were successfully extracted by fitting high-temperature (> 20 K) susceptibilities in the ab-plane and along the c-axis, allowing us to determine the Landé factors ($g_{ab,calc}$ = 5.98(7) and $g_{c,calc}$ = 2.73(3)). These values were consistent with the values of the Landé factors determined by ESR experiments ($g_{ab,exp}$ = 5.69 and $g_{c,exp}$ = 2.75). By applying the CEF and MF theories to the susceptibility and magnetization results, we estimated the anisotropic spin-exchange energies and found that the CEF excitations in \ce{KErTe2} played a decisive role in the magnetism above 3 K, while the low-temperature magnetism below 10 K was gradually correlated with the anisotropic spin-exchange interactions. The CEF excitations were demonstrated in the low-temperature heat capacity, where both the positions of two broad peaks and their magnetic field dependence corroborated our calculations well. The present study provides a basis to explore the enriched magnetic and electronic properties of the QSL family.
In dual-energy computed tomography (DECT), low- and high-kVp data are often collected over a full-angular range (FAR) of $360^\circ$. While there exists strong interest in DECT with low- and high-kVp data acquired over limited-angular ranges (LARs), there remains little investigation of image reconstruction in DECT with LAR data. Objective: We investigate image reconstruction with minimized LAR artifacts from low- and high-kVp data over LARs of $\le 180^\circ$ by using a directional-total-variation (DTV) algorithm. Methods: Image reconstruction from LAR data is formulated as a convex optimization problem in which data-$\ell_2$ is minimized with constraints on the image's DTVs along orthogonal axes. We then achieve image reconstruction by applying the DTV algorithm to solve the optimization problem. We conduct numerical studies from data generated over arcs of LARs, ranging from $14^\circ$ to $180^\circ$, and perform visual inspection and quantitative analysis of the reconstructed images. Results: Monochromatic images of interest obtained with the DTV algorithm from LAR data show substantially reduced artifacts compared with those often observed in images obtained with existing algorithms. The improved image quality also leads to accurate estimation of physical quantities of interest, such as effective atomic number and iodine-contrast concentration. Conclusion: Our study reveals that from LAR data of low- and high-kVp, monochromatic images can be obtained that are visually, and physical quantities can be estimated that are quantitatively, comparable to those obtained in FAR DECT. Significance: As LAR DECT is of high practical application interest, the results acquired in the work may engender insights into the design of DECT with LAR scanning configurations of practical application significance.
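The constrained formulation described in the Methods sentence can be sketched as follows. The notation is an assumption made for illustration and does not come from the abstract: $\mathbf{g}$ denotes the measured LAR data for one kVp setting, $\mathcal{H}$ the discrete system matrix, $\mathbf{f}$ the image, $\mathcal{D}_x$ and $\mathcal{D}_y$ finite-difference operators along the two orthogonal axes, and $t_x$, $t_y$ the DTV constraint bounds.

```latex
% Sketch only: symbols g, H, f, D_x, D_y, t_x, t_y are illustrative assumptions.
\[
\mathbf{f}^{\star}
  = \arg\min_{\mathbf{f}} \; \tfrac{1}{2}\,
    \bigl\lVert \mathbf{g} - \mathcal{H}\mathbf{f} \bigr\rVert_2^2
\quad \text{s.t.} \quad
\bigl\lVert \mathcal{D}_x \mathbf{f} \bigr\rVert_1 \le t_x,
\qquad
\bigl\lVert \mathcal{D}_y \mathbf{f} \bigr\rVert_1 \le t_y .
\]
```

Under this reading, using two separate bounds lets the constraint be tighter along one axis than the other, which is what distinguishes a directional TV from an isotropic one.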
Hao Zhou, Pei Ke, Zheng Zhang 2021
Although pre-trained language models have remarkably enhanced the generation ability of dialogue systems, open-domain Chinese dialogue systems are still limited by the dialogue data and the model size compared with English ones. In this paper, we propose EVA, a Chinese dialogue system that contains the largest Chinese pre-trained dialogue model with 2.8B parameters. To build this model, we collect the largest Chinese dialogue dataset named WDC-Dialogue from various public social media. This dataset contains 1.4B context-response pairs and is used as the pre-training corpus of EVA. Extensive experiments on automatic and human evaluation show that EVA outperforms other Chinese pre-trained dialogue models especially in the multi-turn interaction of human-bot conversations.
With the development of deep encoder-decoder architectures and large-scale annotated medical datasets, great progress has been achieved in automatic medical image segmentation. Due to the stacking of convolution layers and the consecutive sampling operations, existing standard models inevitably encounter the information recession problem of feature representations and thus fail to fully model the global contextual feature dependencies. To overcome these challenges, this paper proposes a novel Transformer-based medical image semantic segmentation framework called TransAttUnet, in which multi-level guided attention and multi-scale skip connections are jointly designed to enhance the functionality and flexibility of the traditional U-shaped architecture. Inspired by the Transformer, a novel self-aware attention (SAA) module with both Transformer Self Attention (TSA) and Global Spatial Attention (GSA) is incorporated into TransAttUnet to effectively learn the non-local interactions between encoder features. In particular, we also establish additional multi-scale skip connections between decoder blocks to aggregate upsampling features at different semantic scales. In this way, the representation ability of multi-scale context information is strengthened to generate discriminative features. Benefitting from these complementary components, the proposed TransAttUnet can effectively alleviate the loss of fine details caused by the information recession problem, improving the diagnostic sensitivity and segmentation quality of medical image analysis. Extensive experiments on multiple medical image segmentation datasets of different imaging modalities demonstrate that our method consistently outperforms the state-of-the-art baselines.
Yang Li, Yadan Luo, Zheng Zhang 2021
With the rapid growth of location-based social networks (LBSNs), Point-Of-Interest (POI) recommendation has been broadly studied in this decade. Recently, next POI recommendation, a natural extension of POI recommendation, has attracted much attention. It aims at suggesting the next POI to a user in a spatial and temporal context, which is a practical yet challenging task in various applications. Existing approaches mainly model the spatial and temporal information and memorize historical patterns through users' trajectories for recommendation. However, they suffer from the negative impact of missing and irregular check-in data, which significantly degrades model performance. In this paper, we propose an attention-based sequence-to-sequence generative model, namely POI-Augmentation Seq2Seq (PA-Seq2Seq), to address the sparsity of the training set by making check-in records evenly spaced. Specifically, the encoder summarises each check-in sequence and the decoder predicts the possible missing check-ins based on the encoded information. In order to learn time-aware correlations in user history, we employ a local attention mechanism to help the decoder focus on a specific range of context information when predicting a certain missing check-in point. Extensive experiments have been conducted on two real-world check-in datasets, Gowalla and Brightkite, for performance and effectiveness evaluation.
Mengde Xu, Zheng Zhang, Han Hu 2021
This paper presents an end-to-end semi-supervised object detection approach, in contrast to previous more complex multi-stage methods. The end-to-end training gradually improves pseudo label quality during the curriculum, and the increasingly accurate pseudo labels in turn benefit object detection training. We also propose two simple yet effective techniques within this framework: a soft teacher mechanism, where the classification loss of each unlabeled bounding box is weighted by the classification score produced by the teacher network, and a box jittering approach to select reliable pseudo boxes for the learning of box regression. On the COCO benchmark, the proposed approach outperforms previous methods by a large margin under various labeling ratios, i.e. 1%, 5% and 10%. Moreover, our approach also performs well when the amount of labeled data is relatively large. For example, it can improve a 40.9 mAP baseline detector trained using the full COCO training set by +3.6 mAP, reaching 44.5 mAP, by leveraging the 123K unlabeled images of COCO. On the state-of-the-art Swin Transformer based object detector (58.9 mAP on test-dev), it can still significantly improve the detection accuracy by +1.5 mAP, reaching 60.4 mAP, and improve the instance segmentation accuracy by +1.2 mAP, reaching 52.4 mAP. When further combined with the Object365 pre-trained model, the detection accuracy reaches 61.3 mAP and the instance segmentation accuracy reaches 53.0 mAP, setting a new state of the art.
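The soft teacher idea, where each unlabeled box's classification loss is weighted by a teacher-produced score, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `soft_weighted_cls_loss`, the normalization of the weights to sum to one, and the use of plain cross-entropy are all assumptions made for illustration.

```python
import math


def soft_weighted_cls_loss(student_probs, pseudo_labels, teacher_scores):
    """Cross-entropy over boxes, each weighted by a teacher score.

    student_probs:  per-box lists of class probabilities from the student.
    pseudo_labels:  per-box class indices used as pseudo labels.
    teacher_scores: per-box non-negative scores from the teacher.
    The weights are normalized to sum to 1 (an assumption of this sketch).
    """
    if not (len(student_probs) == len(pseudo_labels) == len(teacher_scores)):
        raise ValueError("all inputs must have one entry per box")
    total = sum(teacher_scores)
    if total <= 0:
        return 0.0  # no box is trusted, so nothing contributes to the loss
    loss = 0.0
    for probs, label, score in zip(student_probs, pseudo_labels, teacher_scores):
        # Clamp to avoid log(0) when the student assigns zero probability.
        cross_entropy = -math.log(max(probs[label], 1e-12))
        loss += (score / total) * cross_entropy
    return loss


# Toy example: two boxes, two classes.
probs = [[0.5, 0.5], [0.25, 0.75]]
labels = [0, 1]
uniform = soft_weighted_cls_loss(probs, labels, [1.0, 1.0])
first_only = soft_weighted_cls_loss(probs, labels, [1.0, 0.0])
```

With equal scores the result reduces to the ordinary mean cross-entropy, while a zero score removes a box from the loss entirely, which is the intended effect of down-weighting boxes the teacher considers unreliable.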
MLOps is about taking experimental ML models to production, i.e., serving the models to actual users. Unfortunately, existing ML serving systems do not adequately handle the dynamic environments in which online data diverges from offline training data, resulting in tedious model updating and deployment work. This paper implements a lightweight MLOps plugin, termed ModelCI-e (continuous integration and evolution), to address the issue. Specifically, it embraces continual learning (CL) and ML deployment techniques, providing end-to-end support for model updating and validation without serving engine customization. ModelCI-e includes 1) a model factory that allows CL researchers to prototype and benchmark CL models with ease, 2) a CL backend to automate and orchestrate the model updating efficiently, and 3) a web interface for an ML team to manage the CL service collaboratively. Our preliminary results demonstrate the usability of ModelCI-e, and indicate that eliminating the interference between model updating and inference workloads is crucial for higher system efficiency.
The energy consumption of the HVAC system accounts for a significant portion of the energy consumption of public buildings, and an efficient energy consumption prediction model can assist in carrying out effective energy-saving transformation. Unlike traditional energy consumption prediction models, this paper extracts features from large data sets using XGBoost, trains them separately to obtain multiple models, then fuses them with LightGBM's independent prediction results using MAE, infers energy consumption related variables, and successfully applies this model to a self-developed Internet of Things platform.
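The abstract does not say how MAE is used to fuse the models. One common reading is to weight each model's prediction inversely to its validation MAE, and the sketch below shows only that reading. The function names, the inverse-MAE rule, and the toy numbers are assumptions for illustration, and the model names in the dictionary are plain labels rather than calls into the XGBoost or LightGBM libraries.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def inverse_mae_weights(y_true, preds_by_model, eps=1e-9):
    """Give each model a weight proportional to 1 / (its MAE + eps)."""
    inverse = {
        name: 1.0 / (mean_absolute_error(y_true, preds) + eps)
        for name, preds in preds_by_model.items()
    }
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def fuse_predictions(preds_by_model, weights):
    """Weighted average of the per-model predictions, sample by sample."""
    n_samples = len(next(iter(preds_by_model.values())))
    return [
        sum(weights[name] * preds[i] for name, preds in preds_by_model.items())
        for i in range(n_samples)
    ]


# Toy validation data (made up): the first model is off by 1, the second by 2.
y_val = [10.0, 20.0, 30.0]
val_preds = {
    "xgboost": [11.0, 21.0, 31.0],
    "lightgbm": [12.0, 22.0, 32.0],
}
weights = inverse_mae_weights(y_val, val_preds)
fused = fuse_predictions(val_preds, weights)
```

In practice the weights would be estimated on a held-out validation set and then applied to predictions on new data, so that the fusion step does not overfit the samples it is scored on.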