Do you want to publish a course? Click here

Controllable Gradient Item Retrieval

131   0   0.0 ( 0 )
 Added by Haonan Wang
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

In this paper, we identify and study an important problem of gradient item retrieval. We define the problem as retrieving a sequence of items with a gradual change on a certain attribute, given a reference item and a modification text. For example, after a customer saw a white dress, she/he wants to buy a similar one but more floral on it. The extent of more floral is subjective, thus prompting one floral dress is hard to satisfy the customers needs. A better way is to present a sequence of products with increasingly floral attributes based on the white dress, and allow the customer to select the most satisfactory one from the sequence. Existing item retrieval methods mainly focus on whether the target items appear at the top of the retrieved sequence, but ignore the demand for retrieving a sequence of products with gradual change on a certain attribute. To deal with this problem, we propose a weakly-supervised method that can learn a disentangled item representation from user-item interaction data and ground the semantic meaning of attributes to dimensions of the item representation. Our method takes a reference item and a modification as a query. During inference, we start from the reference item and walk along the direction of the modification in the item representation space to retrieve a sequence of items in a gradient manner. We demonstrate our proposed method can achieve disentanglement through weak supervision. Besides, we empirically show that an item sequence retrieved by our method is gradually changed on an indicated attribute and, in the item retrieval task, our method outperforms existing approaches on three different datasets.



rate research

Read More

179 - Lu Yu , Junming Huang , Chuang Liu 2014
Interactions between search and recommendation have recently attracted significant attention, and several studies have shown that many potential applications involve with a joint problem of producing recommendations to users with respect to a given query, termed $Collaborative$ $Retrieval$ (CR). Successful algorithms designed for CR should be potentially flexible at dealing with the sparsity challenges since the setup of collaborative retrieval associates with a given $query$ $times$ $user$ $times$ $item$ tensor instead of traditional $user$ $times$ $item$ matrix. Recently, several works are proposed to study CR task from users perspective. In this paper, we aim to sufficiently explore the sophisticated relationship of each $query$ $times$ $user$ $times$ $item$ triple from items perspective. By integrating item-based collaborative information for this joint task, we present an alternative factorized model that could better evaluate the ranks of those items with sparse information for the given query-user pair. In addition, we suggest to employ a recently proposed scalable ranking learning algorithm, namely BPR, to optimize the state-of-the-art approach, $Latent$ $Collaborative$ $Retrieval$ model, instead of the original learning algorithm. The experimental results on two real-world datasets, (i.e. emph{Last.fm}, emph{Yelp}), demonstrate the efficiency and effectiveness of our proposed approach.
While current information retrieval systems are effective for known-item retrieval where the searcher provides a precise name or identifier for the item being sought, systems tend to be much less effective for cases where the searcher is unable to express a precise name or identifier. We refer to this as tip of the tongue (TOT) known-item retrieval, named after the cognitive state of not being able to retrieve an item from memory. Using movie search as a case study, we explore the characteristics of questions posed by searchers in TOT states in a community question answering website. We analyze how searchers express their information needs during TOT states in the movie domain. Specifically, what information do searchers remember about the item being sought and how do they convey this information? Our results suggest that searchers use a combination of information about: (1) the content of the item sought, (2) the context in which they previously engaged with the item, and (3) previous attempts to find the item using other resources (e.g., search engines). Additionally, searchers convey information by sometimes expressing uncertainty (i.e., hedging), opinions, emotions, and by performing relative (vs. absolute) comparisons with attributes of the item. As a result of our analysis, we believe that searchers in TOT states may require specialized query understanding methods or document representations. Finally, our preliminary retrieval experiments show the impact of each information type presented in information requests on retrieval performance.
Both reviews and user-item interactions (i.e., rating scores) have been widely adopted for user rating prediction. However, these existing techniques mainly extract the latent representations for users and items in an independent and static manner. That is, a single static feature vector is derived to encode her preference without considering the particular characteristics of each candidate item. We argue that this static encoding scheme is difficult to fully capture the users preference. In this paper, we propose a novel context-aware user-item representation learning model for rating prediction, named CARL. Namely, CARL derives a joint representation for a given user-item pair based on their individual latent features and latent feature interactions. Then, CARL adopts Factorization Machines to further model higher-order feature interactions on the basis of the user-item pair for rating prediction. Specifically, two separate learning components are devised in CARL to exploit review data and interaction data respectively: review-based feature learning and interaction-based feature learning. In review-based learning component, with convolution operations and attention mechanism, the relevant features for a user-item pair are extracted by jointly considering their corresponding reviews. However, these features are only review-driven and may not be comprehensive. Hence, interaction-based learning component further extracts complementary features from interaction data alone, also on the basis of user-item pairs. The final rating score is then derived with a dynamic linear fusion mechanism. Experiments on five real-world datasets show that CARL achieves significantly better rating prediction accuracy than existing state-of-the-art alternatives. Also, with attention mechanism, we show that the relevant information in reviews can be highlighted to interpret the rating prediction.
Session-based recommendation aims at predicting the next item given a sequence of previous items consumed in the session, e.g., on e-commerce or multimedia streaming services. Specifically, session data exhibits some unique characteristics, i.e., session consistency and sequential dependency over items within the session, repeated item consumption, and session timeliness. In this paper, we propose simple-yet-effective linear models for considering the holistic aspects of the sessions. The comprehensive nature of our models helps improve the quality of session-based recommendation. More importantly, it provides a generalized framework for reflecting different perspectives of session data. Furthermore, since our models can be solved by closed-form solutions, they are highly scalable. Experimental results demonstrate that the proposed linear models show competitive or state-of-the-art performance in various metrics on several real-world datasets.
When video collections become huge, how to explore both within and across videos efficiently is challenging. Video summarization is one of the ways to tackle this issue. Traditional summarization approaches limit the effectiveness of video exploration because they only generate one fixed video summary for a given input video independent of the information need of the user. In this work, we introduce a method which takes a text-based query as input and generates a video summary corresponding to it. We do so by modeling video summarization as a supervised learning problem and propose an end-to-end deep learning based method for query-controllable video summarization to generate a query-dependent video summary. Our proposed method consists of a video summary controller, video summary generator, and video summary output module. To foster the research of query-controllable video summarization and conduct our experiments, we introduce a dataset that contains frame-based relevance score labels. Based on our experimental result, it shows that the text-based query helps control the video summary. It also shows the text-based query improves our model performance. Our code and dataset: https://github.com/Jhhuangkay/Query-controllable-Video-Summarization.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا