ترغب بنشر مسار تعليمي؟ اضغط هنا

Constructing Hierarchical Image-tags Bimodal Representations for Word Tags Alternative Choice

104   0   0.0 ( 0 )
 نشر من قبل Ruifan Li
 تاريخ النشر 2013
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

This paper describes our solution to the multi-modal learning challenge of ICML. This solution comprises constructing three-level representations in three consecutive stages and choosing correct tag words with a data-specific strategy. Firstly, we use typical methods to obtain level-1 representations. Each image is represented using MPEG-7 and gist descriptors with additional features released by the contest organizers. And the corresponding word tags are represented by bag-of-words model with a dictionary of 4000 words. Secondly, we learn the level-2 representations using two stacked RBMs for each modality. Thirdly, we propose a bimodal auto-encoder to learn the similarities/dissimilarities between the pairwise image-tags as level-3 representations. Finally, during the test phase, based on one observation of the dataset, we come up with a data-specific strategy to choose the correct tag words leading to a leap of an improved overall performance. Our final average accuracy on the private test set is 100%, which ranks the first place in this challenge.



قيم البحث

اقرأ أيضاً

This paper proposes direct learning of image classification from user-supplied tags, without filtering. Each tag is supplied by the user who shared the image online. Enormous numbers of these tags are freely available online, and they give insight ab out the image categories important to users and to image classification. Our approach is complementary to the conventional approach of manual annotation, which is extremely costly. We analyze of the Flickr 100 Million Image dataset, making several useful observations about the statistics of these tags. We introduce a large-scale robust classification algorithm, in order to handle the inherent noise in these tags, and a calibration procedure to better predict objective annotations. We show that freely available, user-supplied tags can obtain similar or superior results to large databases of costly manual annotations.
Tagging facilitates information retrieval in social media and other online communities by allowing users to organize and describe online content. Researchers found that the efficiency of tagging systems steadily decreases over time, because tags beco me less precise in identifying specific documents, i.e., they lose their descriptiveness. However, previous works did not answer how or even whether community managers can improve the efficiency of tags. In this work, we use information-theoretic measures to track the descriptive and retrieval efficiency of tags on Stack Overflow, a question-answering system that strictly limits the number of tags users can specify per question. We observe that tagging efficiency stabilizes over time, while tag content and descriptiveness both increase. To explain this observation, we hypothesize that limiting the number of tags fosters novelty and diversity in tag usage, two properties which are both beneficial for tagging efficiency. To provide qualitative evidence supporting our hypothesis, we present a statistical model of tagging that demonstrates how novelty and diversity lead to greater tag efficiency in the long run. Our work offers insights into policies to improve information organization and retrieval in online communities.
We present AirCode, a technique that allows the user to tag physically fabricated objects with given information. An AirCode tag consists of a group of carefully designed air pockets placed beneath the object surface. These air pockets are easily pro duced during the fabrication process of the object, without any additional material or postprocessing. Meanwhile, the air pockets affect only the scattering light transport under the surface, and thus are hard to notice to our naked eyes. But, by using a computational imaging method, the tags become detectable. We present a tool that automates the design of air pockets for the user to encode information. AirCode system also allows the user to retrieve the information from captured images via a robust decoding algorithm. We demonstrate our tagging technique with applications for metadata embedding, robotic grasping, as well as conveying object affordances.
Determinations of the chemical composition of red giants in a large sample of open clusters show that the abundances of the heavy elements La, Ce, Nd and Sm but not so obviously Y and Eu vary from one cluster to another across a sample all having abo ut the solar metallicity. For La, Ce, Nd and Sm the amplitudes of the variations at solar metallicity scale approximately with the main s-process contribution to solar system material. Consideration of published abundances of field stars suggest that such a spread in heavy element abundances is present for the thin and thick disk stars of different metallicity. This new result provides an opportunity to chemically tag stars by their heavy elements and to reconstruct dissolved open clusters from the field star population.
The rise of Web 2.0 is signaled by sites such as Flickr, del.icio.us, and YouTube, and social tagging is essential to their success. A typical tagging action involves three components, user, item (e.g., photos in Flickr), and tags (i.e., words or phr ases). Analyzing how tags are assigned by certain users to certain items has important implications in helping users search for desired information. In this paper, we explore common analysis tasks and propose a dual mining framework for social tagging behavior mining. This framework is centered around two opposing measures, similarity and diversity, being applied to one or more tagging components, and therefore enables a wide range of analysis scenarios such as characterizing similar users tagging diverse items with similar tags, or diverse users tagging similar items with diverse tags, etc. By adopting different concrete measures for similarity and diversity in the framework, we show that a wide range of concrete analysis problems can be defined and they are NP-Complete in general. We design efficient algorithms for solving many of those problems and demonstrate, through comprehensive experiments over real data, that our algorithms significantly out-perform the exact brute-force approach without compromising analysis result quality.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا