Research papers and master's and doctoral theses published by Xiangyu Li

Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation

229 - Yilin Wen , Xiangyu Li , Hao Pan 2021

6D pose estimation of rigid objects from a single RGB image has seen tremendous improvements recently by using deep learning to combat complex real-world variations, but a majority of methods build models on the per-object level, failing to scale to multiple objects simultaneously. In this paper, we present a novel approach for scalable 6D pose estimation, by self-supervised learning on synthetic data of multiple objects using a single autoencoder. To handle multiple objects and generalize to unseen objects, we disentangle the latent object shape and pose representations, so that the latent shape space models shape similarities, and the latent pose code is used for rotation retrieval by comparison with canonical rotations. To encourage shape space construction, we apply contrastive metric learning and enable the processing of unseen objects by referring to similar training objects. The different symmetries across objects induce inconsistent latent pose spaces, which we capture with a conditioned block producing shape-dependent pose codebooks by re-entangling shape and pose representations. We test our method on two multi-object benchmarks with real data, T-LESS and NOCS REAL275, and show it outperforms existing RGB-based methods in terms of pose estimation accuracy and generalization.
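The "rotation retrieval by comparison with canonical rotations" step in the abstract above can be pictured as a nearest-neighbour lookup in a codebook of latent pose codes. The sketch below is only an illustration: the use of cosine similarity, the function names, and the toy two-dimensional codes are assumptions, not details taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve_rotation(query_code, codebook):
    """Return the label of the codebook entry most similar to query_code.

    codebook is a list of (rotation_label, latent_code) pairs; both the
    labels and the codes here are hypothetical placeholders.
    """
    best_label, _ = max(codebook, key=lambda entry: cosine(query_code, entry[1]))
    return best_label

# Toy codebook: three canonical rotations with made-up 2-D codes.
toy_codebook = [
    ("R0", [1.0, 0.0]),
    ("R90", [0.0, 1.0]),
    ("R180", [-1.0, 0.0]),
]
print(retrieve_rotation([0.1, 0.9], toy_codebook))
```

In a real system the codebook would hold one code per sampled rotation and, per the abstract, would depend on the object's shape code.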

Computer Vision and Pattern Recognition

Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

97 - Xiangyu Liu , Hangtian Jia , Ying Wen 2021

Measuring and promoting policy diversity is critical for solving games with strong non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors). With that in mind, maintaining a pool of diverse policies via open-ended learning is an attractive solution, which can generate auto-curricula to avoid being exploited. However, in conventional open-ended learning algorithms, there are no widely accepted definitions for diversity, making it hard to construct and evaluate the diverse policies. In this work, we summarize previous concepts of diversity and work towards offering a unified measure of diversity in multi-agent open-ended learning to include all elements in Markov games, based on both Behavioral Diversity (BD) and Response Diversity (RD). At the trajectory distribution level, we re-define BD in the state-action space as the discrepancies of occupancy measures. For the reward dynamics, we propose RD to characterize diversity through the responses of policies when encountering different opponents. We also show that many current diversity measures fall in one of the categories of BD or RD but not both. With this unified diversity measure, we design the corresponding diversity-promoting objective and population effectivity when seeking the best responses in open-ended learning. We validate our methods in both relatively simple games like matrix game, non-transitive mixture model, and the complex Google Research Football environment. The population found by our methods reveals the lowest exploitability, highest population effectivity in matrix game and non-transitive mixture model, as well as the largest goal difference when interacting with opponents of various levels in Google Research Football.
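To make the non-transitive Rock-Paper-Scissors example and the notion of exploitability concrete, here is a minimal sketch. It uses a simplified single-strategy definition (the payoff a best-responding opponent obtains in a symmetric zero-sum game whose value is 0); this is an assumption for illustration and is not the population-level measure used in the paper. The function names are hypothetical.

```python
# Row player's payoff matrix for Rock-Paper-Scissors (zero-sum).
# Rows and columns are ordered Rock, Paper, Scissors.
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def best_response_value(payoff, opponent_mix):
    """Highest expected payoff any pure row strategy earns against opponent_mix."""
    return max(
        sum(p * q for p, q in zip(row, opponent_mix))
        for row in payoff
    )

def exploitability(payoff, mix):
    """Gain of a best-responding opponent against `mix`.

    Assumes a symmetric zero-sum game, so the game value is 0 and the
    exploitability equals the best-response value itself.
    """
    return best_response_value(payoff, mix)

print(exploitability(RPS, [1 / 3, 1 / 3, 1 / 3]))  # uniform play
print(exploitability(RPS, [1.0, 0.0, 0.0]))        # always Rock
```

Uniform play cannot be exploited, while any pure strategy is beaten by its counter, which is the strategic cycle the abstract refers to.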

Multiagent Systems, Artificial Intelligence, Computer Science and Game Theory

Neural Auction: End-to-End Learning of Auction Mechanisms for E-Commerce Advertising

134 - Xiangyu Liu , Chuan Yu , Zhilin Zhang 2021

In e-commerce advertising, it is crucial to jointly consider various performance metrics, e.g., user experience, advertiser utility, and platform revenue. Traditional auction mechanisms, such as GSP and VCG auctions, can be suboptimal due to their fixed allocation rules to optimize a single performance metric (e.g., revenue or social welfare). Recently, data-driven auctions, learned directly from auction outcomes to optimize multiple performance metrics, have attracted increasing research interests. However, the procedure of auction mechanisms involves various discrete calculation operations, making it challenging to be compatible with continuous optimization pipelines in machine learning. In this paper, we design Deep Neural Auctions (DNAs) to enable end-to-end auction learning by proposing a differentiable model to relax the discrete sorting operation, a key component in auctions. We optimize the performance metrics by developing deep models to efficiently extract contexts from auctions, providing rich features for auction design. We further integrate the game theoretical conditions within the model design, to guarantee the stability of the auctions. DNAs have been successfully deployed in the e-commerce advertising system at Taobao. Experimental evaluation results on both large-scale data set as well as online A/B test demonstrated that DNAs significantly outperformed other mechanisms widely adopted in industry.
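As an illustration of what relaxing the discrete sorting operation can look like, the sketch below implements NeuralSort (Grover et al., 2019), one published continuous relaxation of sorting. The abstract above does not state which relaxation DNAs use, so treating NeuralSort as the example is an assumption, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def neuralsort(scores, tau=1.0):
    """Return a row-stochastic matrix that softly sorts `scores` in descending order.

    Row i (1-indexed) is softmax(((n + 1 - 2i) * s - A_s 1) / tau), where
    A_s[j][k] = |s_j - s_k|. As tau approaches 0, row i approaches a one-hot
    vector marking the position of the i-th largest score.
    """
    n = len(scores)
    # (A_s 1)_j: sum of absolute differences between score j and all scores.
    abs_diff_sums = [sum(abs(s_j - s_k) for s_k in scores) for s_j in scores]
    rows = []
    for i in range(1, n + 1):
        logits = [
            ((n + 1 - 2 * i) * s - d) / tau
            for s, d in zip(scores, abs_diff_sums)
        ]
        rows.append(softmax(logits))
    return rows

# Toy rank scores for three ads; a low temperature gives a nearly hard sort.
P = neuralsort([1.0, 3.0, 2.0], tau=0.1)
print([row.index(max(row)) for row in P])
```

Because every entry of the matrix is a smooth function of the scores, gradients can flow from an allocation-dependent objective back to the model that produces the scores, which is the property an end-to-end auction learner needs.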

Computer Science and Game Theory, Artificial Intelligence, Machine Learning

Atomistic metrics of BaSO$_4$ as an ultra-efficient radiative cooling material: a first-principles prediction

67 - Zhen Tong , Joseph Peoples , Xiangyu Li 2021

Radiative cooling has recently revived due to its significant potential as an environmentally friendly cooling technology. However, the design of particle-matrix cooling nanocomposites was generally carried out via tedious trial-and-error approaches, and the atomistic physics for efficient radiative cooling was not well understood. In this work, we identify the atomistic metrics of Barium Sulfate (BaSO$_4$) nanocomposite, which is an ultra-efficient radiative cooling material, using a predictive first-principles approach coupled with Monte Carlo simulations. Our results show that BaSO$_4$-acrylic nanocomposites not only attain high total solar reflectance of 92.5% (0.28 - 4.0 um), but also simultaneously demonstrate high normal emittance of 96.0% in the sky window region (8 - 13 um), outperforming the commonly used $\alpha$-quartz ($\alpha$-SiO$_2$). We identify two pertinent characters of ultra-efficient radiative cooling paints: i) a balanced band gap and refractive index, which enables strong scattering while negating absorption in the solar spectrum, and ii) a sufficient number of infrared-active optical resonance phonon modes resulting in abundant Reststrahlen bands and high emissivity in the sky window. The first principles approach and the resulted physical insights in this work pave the way for further search of ultra-efficient radiative cooling materials.

Materials Science

Trear: Transformer-based RGB-D Egocentric Action Recognition

234 - Xiangyu Li , Yonghong Hou , Pichao Wang 2021

In this paper, we propose a Transformer-based RGB-D egocentric action recognition framework, called Trear. It consists of two modules, inter-frame attention encoder and mutual-attentional fusion block. Instead of using optical flow or recurrent units, we adopt self-attention mechanism to model the temporal structure of the data from different modalities. Input frames are cropped randomly to mitigate the effect of the data redundancy. Features from each modality are interacted through the proposed fusion block and combined through a simple yet effective fusion operation to produce a joint RGB-D representation. Empirical experiments on two large egocentric RGB-D datasets, THU-READ and FPHA, and one small dataset, WCVS, have shown that the proposed method outperforms the state-of-the-art results by a large margin.

Computer Vision and Pattern Recognition

Regularized Attentive Capsule Network for Overlapped Relation Extraction

85 - Tianyi Liu , Xiangyu Lin , Weijia Jia 2020

Distantly supervised relation extraction has been widely applied in knowledge base construction due to its less requirement of human efforts. However, the automatically established training datasets in distant supervision contain low-quality instances with noisy words and overlapped relations, introducing great challenges to the accurate extraction of relations. To address this problem, we propose a novel Regularized Attentive Capsule Network (RA-CapNet) to better identify highly overlapped relations in each informal sentence. To discover multiple relation features in an instance, we embed multi-head attention into the capsule network as the low-level capsules, where the subtraction of two entities acts as a new form of relation query to select salient features regardless of their positions. To further discriminate overlapped relation features, we devise disagreement regularization to explicitly encourage the diversity among both multiple attention heads and low-level capsules. Extensive experiments conducted on widely used datasets show that our model achieves significant improvements in relation extraction.

Computation and Language

Transformer Guided Geometry Model for Flow-Based Unsupervised Visual Odometry

297 - Xiangyu Li , Yonghong Hou , Pichao Wang 2020

Existing unsupervised visual odometry (VO) methods either match pairwise images or integrate the temporal information using recurrent neural networks over a long sequence of images. They are either not accurate, time-consuming in training or error accumulative. In this paper, we propose a method consisting of two camera pose estimators that deal with the information from pairwise images and a short sequence of images respectively. For image sequences, a Transformer-like structure is adopted to build a geometry model over a local temporal window, referred to as Transformer-based Auxiliary Pose Estimator (TAPE). Meanwhile, a Flow-to-Flow Pose Estimator (F2FPE) is proposed to exploit the relationship between pairwise images. The two estimators are constrained through a simple yet effective consistency loss in training. Empirical evaluation has shown that the proposed method outperforms the state-of-the-art unsupervised learning-based methods by a large margin and performs comparably to supervised and traditional ones on the KITTI and Malaga dataset.

Computer Vision and Pattern Recognition

Optimizing Multiple Performance Metrics with Deep GSP Auctions for E-commerce Advertising

82 - Zhilin Zhang , Xiangyu Liu , Zhenzhe Zheng 2020

In e-commerce advertising, the ad platform usually relies on auction mechanisms to optimize different performance metrics, such as user experience, advertiser utility, and platform revenue. However, most of the state-of-the-art auction mechanisms only focus on optimizing a single performance metric, e.g., either social welfare or revenue, and are not suitable for e-commerce advertising with various, dynamic, difficult to estimate, and even conflicting performance metrics. In this paper, we propose a new mechanism called Deep GSP auction, which leverages deep learning to design new rank score functions within the celebrated GSP auction framework. These new rank score functions are implemented via deep neural network models under the constraints of monotone allocation and smooth transition. The requirement of monotone allocation ensures Deep GSP auction nice game theoretical properties, while the requirement of smooth transition guarantees the advertiser utilities would not fluctuate too much when the auction mechanism switches among candidate mechanisms to achieve different optimization objectives. We deployed the proposed mechanisms in a leading e-commerce ad platform and conducted comprehensive experimental evaluations with both offline simulations and online A/B tests. The results demonstrated the effectiveness of the Deep GSP auction compared to the state-of-the-art auction mechanisms.

Computer Science and Game Theory, Information Retrieval, Machine Learning

Remarkable Daytime Sub-ambient Radiative Cooling in BaSO4 Nanoparticle Films and Paints

112 - Xiangyu Li , Joseph Peoples , Peiyan Yao 2020

Radiative cooling is a passive cooling technology that offers great promises to reduce space cooling cost, combat the urban island effect and alleviate the global warming. To achieve passive daytime radiative cooling, current state-of-the-art solutions often utilize complicated multilayer structures or a reflective metal layer, limiting their applications in many fields. Attempts have been made to achieve passive daytime radiative cooling with single-layer paints, but they often require a thick coating or show partial daytime cooling. In this work, we experimentally demonstrate remarkable full daytime sub-ambient cooling performance with both BaSO4 nanoparticle films and BaSO4 nanocomposite paints. BaSO4 has a high electron bandgap for low solar absorptance and phonon resonance at 9 um for high sky window emissivity. With an appropriate particle size and a broad particle size distribution, BaSO4 nanoparticle film reaches an ultra-high solar reflectance of 97.6% and high sky window emissivity of 0.96. During field tests, BaSO4 film stays more than 4.5C below ambient temperature or achieves average cooling power of 117 W/m2. BaSO4-acrylic paint is developed with 60% volume concentration to enhance the reliability in outdoor applications, achieving solar reflectance of 98.1% and sky window emissivity of 0.95. Field tests indicate similar cooling performance to the BaSO4 films. Overall, our BaSO4-acrylic paint shows standard figure of merit of 0.77 which is among the highest of radiative cooling solutions, while providing great reliability, the convenient paint form, ease of use and the compatibility with commercial paint fabrication process.

Applied Physics

Concentrated Radiative Cooling

79 - Joseph Peoples , Yu-Wei Hung , Xiangyu Li 2020

A fundamental limit of current radiative cooling systems is that only the top surface facing deep-space can provide the radiative cooling effect, while the bottom surface cannot. Here, we propose and experimentally demonstrate a concept of concentrated radiative cooling by nesting a radiative cooling system in a mid-infrared reflective trough, so that the lower surface, which does not contribute to radiative cooling in previous systems, can radiate heat to deep-space via the reflective trough. Field experiments show that the temperature drop of a radiative cooling pipe with the trough is more than double that of the standalone radiative cooling pipe. Furthermore, by integrating the concentrated radiative cooling system as a preconditioner in an air conditioning system, we predict electricity savings of $>75\%$ in Phoenix, AZ, and $>80\%$ in Reno, NV, for a single-story commercial building.

Applied Physics
