ﻻ يوجد ملخص باللغة العربية
We propose an image representation and matching approach that substantially improves visual-based location estimation for images. The main novelty of the approach, called distinctive visual element matching (DVEM), is its use of representations that are specific to the query image whose location is being predicted. These representations are based on visual element clouds, which robustly capture the connection between the query and visual evidence from candidate locations. We then maximize the influence of visual elements that are geo-distinctive because they do not occur in images taken at many other locations. We carry out experiments and analysis for both geo-constrained and geo-unconstrained location estimation cases using two large-scale, publicly-available datasets: the San Francisco Landmark dataset with $1.06$ million street-view images and the MediaEval 15 Placing Task dataset with $5.6$ million geo-tagged images from Flickr. We present examples that illustrate the highly-transparent mechanics of the approach, which are based on common sense observations about the visual patterns in image collections. Our results show that the proposed method delivers a considerable performance improvement compared to the state of the art.
With the proliferation of online social networking services and mobile smart devices equipped with mobile communications module and position sensor module, massive amount of multimedia data has been collected, stored and shared. This trend has put fo
Nowadays, users are encouraged to activate across multiple online social networks simultaneously. Anchor link prediction, which aims to reveal the correspondence among different accounts of the same user across networks, has been regarded as a fundam
Audio-visual (AV) lip biometrics is a promising authentication technique that leverages the benefits of both the audio and visual modalities in speech communication. Previous works have demonstrated the usefulness of AV lip biometrics. However, the l
Distributed visual analysis applications, such as mobile visual search or Visual Sensor Networks (VSNs) require the transmission of visual content on a bandwidth-limited network, from a peripheral node to a processing unit. Traditionally, a Compress-
Due to the rapid development of mobile Internet techniques, cloud computation and popularity of online social networking and location-based services, massive amount of multimedia data with geographical information is generated and uploaded to the Int