ﻻ يوجد ملخص باللغة العربية
Search is at the heart of modern e-commerce. As a result, the task of ranking search results automatically (learning to rank) is a multibillion dollar machine learning problem. Traditional models optimize over a few hand-constructed features based on the items text. In this paper, we introduce a multimodal learning to rank model that combines these traditional features with visual semantic features transferred from a deep convolutional neural network. In a large scale experiment using data from the online marketplace Etsy, we verify that moving to a multimodal representation significantly improves ranking quality. We show how image features can capture fine-grained style information not available in a text-only representation. In addition, we show concrete examples of how image information can successfully disentangle pairs of highly different items that are ranked similarly by a text-only model.
The complexity of the visual world creates significant challenges for comprehensive visual understanding. In spite of recent successes in visual recognition, todays vision systems would still struggle to deal with visual queries that require a deeper
We present a learning-based method for detecting real and fake deepfake multimedia content. To maximize information for learning, we extract and analyze the similarity between the two audio and visual modalities from within the same video. Additional
While deep convolutional neural networks (CNN) have been successfully applied for 2D image analysis, it is still challenging to apply them to 3D anisotropic volumes, especially when the within-slice resolution is much higher than the between-slice re
Observing a set of images and their corresponding paragraph-captions, a challenging task is to learn how to produce a semantically coherent paragraph to describe the visual content of an image. Inspired by recent successes in integrating semantic top
Traditional feature-based image stitching technologies rely heavily on feature detection quality, often failing to stitch images with few features or low resolution. The learning-based image stitching solutions are rarely studied due to the lack of l