An Effective Way to Improve YouTube-8M Classification Accuracy in Google Cloud Platform


Published by: Shujiao Huang

Publication date: 2017

Research field: Mathematical Statistics, Informatics Engineering

Language: English

Authors: Zhenzhen Zhong, Shujiao Huang, Cheng Zhan

Machine Learning


Abstract

Large-scale datasets have played a significant role in the progress of neural networks and deep learning. YouTube-8M is one such benchmark dataset for general multi-label video classification. It was created from over 7 million YouTube videos (450,000 hours of video) and includes video labels from a vocabulary of 4716 classes (3.4 labels per video on average). It also comes with pre-extracted audio and visual features for every second of video (3.2 billion feature vectors in total). Google Cloud recently released the dataset and organized the Google Cloud & YouTube-8M Video Understanding Challenge on Kaggle. Competitors are challenged to develop classification algorithms that assign video-level labels using the new and improved YouTube-8M V2 dataset. Inspired by the competition, we began exploring audio understanding and classification using deep learning algorithms and ensemble methods. We built several baseline predictions following the benchmark paper and the public TensorFlow code on GitHub. Furthermore, we improved global prediction accuracy (GAP) from a baseline of 77% to 80.7% through ensemble approaches.
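As a rough illustration of the metric and the ensemble idea mentioned in the abstract, the sketch below computes GAP the way the YouTube-8M challenge describes it (Global Average Precision over each video's top 20 predictions, with recall measured against all ground-truth labels) and combines several models with a plain, optionally weighted, average of their scores. The abstract does not say which ensemble scheme the authors used, so the averaging step, the function names, and the toy arrays are assumptions for illustration only.

```python
import numpy as np

def global_average_precision(scores, labels, top_k=20):
    """GAP over the top_k predictions of every video.

    scores: (n_videos, n_classes) array of confidences.
    labels: (n_videos, n_classes) binary ground-truth array.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    k = min(top_k, scores.shape[1])

    # Keep only each video's k most confident (score, label) pairs.
    top_idx = np.argsort(-scores, axis=1)[:, :k]
    top_scores = np.take_along_axis(scores, top_idx, axis=1).ravel()
    top_hits = np.take_along_axis(labels, top_idx, axis=1).ravel()

    # Pool all pairs and rank them globally by confidence.
    order = np.argsort(-top_scores, kind="stable")
    hits = top_hits[order]

    total_positives = labels.sum()
    if total_positives == 0:
        return 0.0

    # Sum precision at every rank that is a hit, i.e. sum of p(i) * delta r(i).
    precision_at_i = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(np.sum(precision_at_i * hits) / total_positives)

def average_ensemble(prediction_list, weights=None):
    """Hypothetical ensemble: (weighted) mean of per-model score matrices."""
    stacked = np.stack([np.asarray(p, dtype=float) for p in prediction_list])
    return np.average(stacked, axis=0, weights=weights)

# Toy data: 2 videos, 3 classes, two made-up models.
labels = np.array([[1, 0, 1], [0, 1, 0]])
model_a = np.array([[0.9, 0.1, 0.8], [0.2, 0.7, 0.3]])
model_b = np.array([[0.7, 0.3, 0.6], [0.4, 0.9, 0.1]])
ensemble = average_ensemble([model_a, model_b])
gap = global_average_precision(ensemble, labels)
```

Averaging scores is only one of many ways to build an ensemble; it is used here because it needs no extra training and keeps the example short.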


M. Landoni, G. Taffoni, A. Bignamini, 2019

The availability of the new Cloud Platform offered by Google motivated us to propose nine Proofs of Concept (PoCs) aiming to demonstrate and test the capabilities of the platform in the context of scientifically driven tasks and requirements. We review the status of our initiative by illustrating 3 of the 9 successfully closed PoCs that we implemented on Google Cloud Platform. In particular, we illustrate a cloud architecture for deploying scientific software as microservices, coupling Google Compute Engine with Docker and Pub/Sub to dispatch heavily parallel simulations. We also detail an experiment on HPC-based simulation and workflow execution of data reduction pipelines (for the TNG-GIANO-B spectrograph) deployed on GCP. We compare and contrast our experience with on-site facilities, weighing advantages and disadvantages in terms of both total cost of ownership and achieved performance.

Instrumentation and Methods for Astrophysics, Distributed, Parallel, and Cluster Computing

Constrained-size Tensorflow Models for YouTube-8M Video Understanding Challenge

Tianqi Liu, Bo Liu, 2018

This paper presents our 7th place solution to the second YouTube-8M video understanding competition, which challenges participants to build a constrained-size model to classify millions of YouTube videos into thousands of classes. Our final model consists of four single models aggregated into one TensorFlow graph. For each single model, we use the same network architecture as in the winning solution of the first YouTube-8M video understanding competition, namely Gated NetVLAD. We train the single models separately in TensorFlow's default float32 precision, then replace the weights with float16 precision and ensemble them in the evaluation and inference stages, achieving a 48.5% compression rate without loss of precision. Our best model achieved 88.324% GAP on the private leaderboard. The code is publicly available at https://github.com/boliu61/youtube-8m
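The train-in-float32, store-in-float16, then ensemble idea described above can be sketched in a few lines. This is a NumPy toy, not the authors' TensorFlow code: the single-layer sigmoid "models" stand in for Gated NetVLAD, and every name and array here is hypothetical. It only shows that casting weight arrays to float16 halves their storage while the averaged predictions barely move; it does not reproduce the paper's 48.5% figure.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def to_float16(weights):
    """Cast every array in a {name: array} dict to float16 for storage."""
    return {name: w.astype(np.float16) for name, w in weights.items()}

def total_bytes(weights):
    return sum(w.nbytes for w in weights.values())

def predict(weights, features):
    # Stored weights may be float16; upcast so the arithmetic runs in float32.
    w = weights["w"].astype(np.float32)
    b = weights["b"].astype(np.float32)
    return sigmoid(features @ w + b)

def ensemble_predict(models, features):
    # Plain mean of the single-model outputs (an assumed ensemble rule).
    return np.mean([predict(m, features) for m in models], axis=0)

rng = np.random.default_rng(0)
features = rng.random((3, 16), dtype=np.float32)

# Four toy "single models" trained (here: randomly initialised) in float32.
models32 = [
    {"w": rng.normal(0.0, 0.1, (16, 4)).astype(np.float32),
     "b": np.zeros(4, dtype=np.float32)}
    for _ in range(4)
]
models16 = [to_float16(m) for m in models32]

size_ratio = sum(map(total_bytes, models16)) / sum(map(total_bytes, models32))
max_diff = float(np.max(np.abs(
    ensemble_predict(models32, features) - ensemble_predict(models16, features)
)))
```

Comparing `max_diff` against the score gaps that matter for ranking is one simple way to check whether the cast is safe for a given model.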

Computer Vision and Pattern Recognition

Information-theoretic Classification Accuracy: A Criterion that Guides Data-driven Combination of Ambiguous Outcome Labels in Multi-class Classification

Chihao Zhang, Yiling Elaine Chen, Shihua Zhang, 2021

Outcome labeling ambiguity and subjectivity are ubiquitous in real-world datasets. While practitioners commonly combine ambiguous outcome labels in an ad hoc way to improve the accuracy of multi-class classification, a principled approach that guides label combination by an optimality criterion is lacking. To address this problem, we propose the information-theoretic classification accuracy (ITCA), a criterion of outcome information conditional on outcome prediction, to guide practitioners on how to combine ambiguous outcome labels. ITCA indicates a balance in the trade-off between prediction accuracy (how well predicted labels agree with actual labels) and prediction resolution (how many labels are predictable). To find the optimal label combination indicated by ITCA, we develop two search strategies: greedy search and breadth-first search. Notably, ITCA and the two search strategies are adaptive to all machine-learning classification algorithms. Coupled with a classification algorithm and a search strategy, ITCA has two uses: to improve prediction accuracy and to identify ambiguous labels. We first verify that ITCA achieves high accuracy with both search strategies in finding the correct label combinations on synthetic and real data. Then we demonstrate the effectiveness of ITCA in diverse applications including medical prognosis, cancer survival prediction, user demographics prediction, and cell type classification.

Machine Learning, Methodology

An Empirical Accuracy Law for Sequential Machine Translation: the Case of Google Translate

Lucas Nunes Sequeira, Bruno Moreschi, Fabio Gagliardi Cozman and Bernardo Fontes, 2020

In this research, we have established, through empirical testing, a law that relates the number of translating hops to translation accuracy in sequential machine translation in Google Translate. Both accuracy and size decrease with the number of hops; the former displays a decrease closely following a power law. Such a law allows one to predict the behavior of translation chains that may be built as society increasingly depends on automated devices.
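To make the power-law claim concrete, one common way to estimate such a law is a straight-line fit in log-log space. The sketch below assumes the form accuracy ≈ a · hops^(−b); the abstract does not give the fitted form or any coefficients, so the functional form, the function name, and the synthetic data points are all illustrative assumptions, not results from the paper.

```python
import numpy as np

def fit_power_law(hops, accuracy):
    """Fit accuracy ~ a * hops**(-b) by least squares on log-log values.

    Returns (a, b). Both inputs must be strictly positive.
    """
    hops = np.asarray(hops, dtype=float)
    accuracy = np.asarray(accuracy, dtype=float)
    # log(accuracy) = log(a) - b * log(hops) is linear in log(hops).
    slope, intercept = np.polyfit(np.log(hops), np.log(accuracy), 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic, noise-free data generated from made-up coefficients.
hops = np.arange(1, 11).astype(float)
synthetic_accuracy = 0.9 * hops ** -0.3

a_hat, b_hat = fit_power_law(hops, synthetic_accuracy)

# With the fitted law, accuracy for a longer chain can be extrapolated.
predicted_at_20_hops = a_hat * 20.0 ** -b_hat
```

On real measurements the points would be noisy, and a log-log fit weights relative rather than absolute errors, which may or may not match how the authors fitted their law.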

Computation and Language, Machine Learning

MLModelCI: An Automatic Cloud Platform for Efficient MLaaS

Huaizheng Zhang, Yuanming Li, Yizheng Huang, 2020

MLModelCI provides multimedia researchers and developers with a one-stop platform for efficient machine learning (ML) services. The system leverages DevOps techniques to optimize, test, and manage models. It also containerizes and deploys these optimized and validated models as cloud services (MLaaS). In essence, MLModelCI serves as a housekeeper to help users publish models. The models are first automatically converted to optimized formats for production purposes and then profiled under different settings (e.g., batch size and hardware). The profiling information can be used as a guideline for balancing the trade-off between performance and cost of MLaaS. Finally, the system dockerizes the models for ease of deployment to cloud environments. A key feature of MLModelCI is the implementation of a controller that allows elastic evaluation, which uses only idle workers while maintaining online service quality. Our system bridges the gap between current ML training and serving systems and thus frees developers from the manual and tedious work often associated with service deployment. We release the platform as an open-source project on GitHub under the Apache 2.0 license, with the aim that it will facilitate and streamline more large-scale ML applications and research projects.

Distributed, Parallel, and Cluster Computing, Machine Learning