
Embedding-based Product Retrieval in Taobao Search

 Added by Fuyu Lv
 Publication date 2021
Research language: English





Nowadays, the product search service of e-commerce platforms has become a vital shopping channel in people's lives. The retrieval phase determines the search system's quality and has gradually attracted researchers' attention. Retrieving the most relevant products from a large-scale corpus while preserving personalized user characteristics remains an open question. Recent approaches in this domain have mainly focused on embedding-based retrieval (EBR) systems. However, after a long period of practice on Taobao, we find that the performance of the EBR system is dramatically degraded due to its (1) low relevance to a given query and (2) discrepancy between the training and inference phases. Therefore, we propose a novel and practical embedding-based product retrieval model, named Multi-Grained Deep Semantic Product Retrieval (MGDSPR). Specifically, we first identify the inconsistency between the training and inference stages, and then use the softmax cross-entropy loss as the training objective, which achieves better performance and faster convergence. Two efficient methods are further proposed to improve retrieval relevance: smoothing noisy training data and generating relevance-improving hard negative samples without requiring extra knowledge or training procedures. We evaluate MGDSPR on Taobao Product Search and observe significant metric gains in offline experiments and online A/B tests. MGDSPR has been successfully deployed to the existing multi-channel retrieval system in Taobao Search. We also introduce the online deployment scheme and share practical lessons from our retrieval system to contribute to the community.
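To make the training objective named in the abstract more concrete, the following is a minimal sketch of a softmax cross-entropy loss over query-item similarity scores. It is not the authors' implementation: the function names, the dot-product scoring, the toy two-dimensional vectors, and the `temperature` parameter are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax_cross_entropy(query_vec, item_vecs, positive_index, temperature=1.0):
    """Negative log-probability of the positive item under a softmax
    over query-item dot-product scores.

    `temperature` is a hypothetical knob added for illustration; the
    abstract does not specify how scores are scaled.
    """
    scores = [dot(query_vec, item) / temperature for item in item_vecs]
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    log_norm = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_norm - scores[positive_index]


# Toy example: one query and three candidate items (values are made up).
query = [1.0, 0.0]
items = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

# Loss when the labeled positive is the item most similar to the query.
loss_good = softmax_cross_entropy(query, items, positive_index=0)
# Loss when the labeled positive is the item least similar to the query.
loss_bad = softmax_cross_entropy(query, items, positive_index=2)
```

In this sketch the loss is small when the positive item already scores highest against the query and large otherwise, so minimizing it pushes the positive item's score above those of the other candidates, which is how candidates are ranked against each other at retrieval time.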




Product search is an important way for people to browse and purchase items on e-commerce platforms. While customers tend to make choices based on their personal tastes and preferences, analysis of commercial product search logs has shown that personalization does not always improve product search quality. Most existing product search techniques, however, conduct undifferentiated personalization across search sessions. They either use a fixed coefficient to control the influence of personalization or let personalization take effect all the time with an attention mechanism. The only notable exception is the recently proposed zero-attention model (ZAM), which can adaptively adjust the effect of personalization by allowing the query to attend to a zero vector. Nonetheless, in ZAM, personalization can be at most as important as the query, and the representations of items are static across the collection regardless of the items co-occurring in the user's historical purchases. Aware of these limitations, we propose a transformer-based embedding model (TEM) for personalized product search, which can dynamically control the influence of personalization by encoding the sequence of the query and the user's purchase history with a transformer architecture. Personalization can have a dominant impact when necessary, and interactions between items can be taken into consideration when computing attention weights. Experimental results show that TEM significantly outperforms state-of-the-art personalized product retrieval models.
Product search is one of the most popular methods for customers to discover products online. Most existing studies on product search focus on developing effective retrieval models that rank items by their likelihood of being purchased. They, however, ignore the gap between how systems and customers perceive the relevance of items. Without explanations, users may not understand why product search engines retrieve certain items for them, which consequently leads to an imperfect user experience and suboptimal system performance in practice. In this work, we tackle this problem by constructing explainable retrieval models for product search. Specifically, we propose to model the search and purchase behavior as a dynamic relation between users and items, and create a dynamic knowledge graph based on both the multi-relational product data and the context of the search session. Ranking is conducted based on the relationship between users and items in the latent space, and explanations are generated with logic inferences and entity soft matching on the knowledge graph. Empirical experiments show that our model, which we refer to as the Dynamic Relation Embedding Model (DREM), significantly outperforms the state-of-the-art baselines and can produce reasonable explanations for search results.
Passage retrieval addresses the problem of locating relevant passages, usually from a large corpus, given a query. In practice, lexical term-matching algorithms like BM25 are popular choices for retrieval owing to their efficiency. However, term-based matching algorithms often miss relevant passages that have no lexical overlap with the query and cannot be fine-tuned to downstream datasets. In this work, we consider the embedding-based two-tower architecture as our neural retrieval model. Since labeled data can be scarce and neural retrieval models require vast amounts of data to train, we propose a novel method for generating synthetic training data for retrieval. Our system produces remarkable results, significantly outperforming BM25 on 5 out of 6 datasets tested, by an average of 2.45 points for Recall@1. In some cases, our model trained on synthetic data can even outperform the same model trained on real data.
Intelligent assistants change the way people interact with computers and make it possible for people to search for products through conversations when they have purchase needs. During the interactions, the system can ask questions about certain aspects of the ideal products to clarify the users' needs. For example, previous work proposed asking users for the exact characteristics of their ideal items before showing results. However, users may not have clear ideas about what an ideal item looks like, especially when they have not seen any items. It is therefore more feasible to facilitate conversational search by showing example items and asking for feedback instead. In addition, when users provide negative feedback on the presented items, it is easier to collect their detailed feedback on certain properties (aspect-value pairs) of the non-relevant items. By breaking down the item-level negative feedback into fine-grained feedback on aspect-value pairs, more information is available to help clarify users' intents. In this paper, we propose a conversational paradigm for product search driven by non-relevant items, based on which fine-grained feedback is collected and used to show better results in the next iteration. We then propose an aspect-value likelihood model to incorporate both positive and negative feedback on fine-grained aspect-value pairs of the non-relevant items. Experimental results show that our model is significantly better than state-of-the-art product search baselines that use no feedback and baselines that use item-level negative feedback.
Hanqing Lu, Youna Hu, Tong Zhao (2021)
Nowadays, with many e-commerce platforms conducting global business, e-commerce search systems are required to handle product retrieval under multilingual scenarios. Moreover, compared with maintaining country-specific e-commerce search systems, having a universal system across countries can further reduce operational and computational costs and facilitate business expansion to new countries. In this paper, we introduce a universal end-to-end multilingual retrieval system, and discuss our learnings and technical details from training and deploying the system to serve billion-scale product retrieval for e-commerce search. In particular, we propose a multilingual graph attention based retrieval network that leverages recent advances in transformer-based multilingual language models and graph neural network architectures to capture the interactions between search queries and items in e-commerce search. Offline experiments on data from five countries show that our algorithm outperforms the state-of-the-art baselines by 35% in recall and 25% in mAP on average. Moreover, the proposed model shows a significant increase in conversion and revenue in online A/B experiments and has been deployed in production for multiple countries.