Learning to score the figure skating sports videos

Published by: Chengming Xu

Publication date: 2018

Research field: Informatics Engineering

Paper language: English

Authors: Chengming Xu - Yanwei Fu - Bing Zhang

Multimedia - Computer Vision and Pattern Recognition

Abstract

This paper addresses the task of learning to score figure skating sports videos. To this end, we propose a deep architecture with two complementary components: a Self-Attentive LSTM and a Multi-scale Convolutional Skip LSTM. Together, these components efficiently learn the local and global sequential information in each video. Furthermore, we present a large-scale figure skating sports video dataset, the FisV dataset. It includes 500 figure skating videos with an average length of 2 minutes and 50 seconds. Each video is annotated with two scores from nine different referees: the Total Element Score (TES) and the Total Program Component Score (PCS). Our proposed model is validated on the FisV and MIT-skate datasets. The experimental results show the effectiveness of our models in learning to score figure skating videos.
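The abstract does not spell out how the self-attentive component weights a video's time steps. The following is a minimal sketch of generic attention-weighted pooling over a sequence of per-clip features, not the authors' architecture. The function name `attention_pool`, the parameters `w1` and `w2`, and all dimensions are assumptions chosen for illustration; in a real model the parameters would be learned rather than drawn at random.

```python
import numpy as np

def attention_pool(features, w1, w2):
    """Pool a (T, d) feature sequence into one d-vector.

    features: (T, d) array, one row per time step (hypothetical clip features).
    w1: (d, h) projection, w2: (h,) scoring vector (both assumed, not from the paper).
    Returns the pooled vector (d,) and the attention weights (T,).
    """
    scores = np.tanh(features @ w1) @ w2           # one scalar score per time step
    scores = scores - scores.max()                 # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over time
    pooled = weights @ features                    # weighted average of the rows
    return pooled, weights

# Illustrative usage with random inputs.
rng = np.random.default_rng(0)
T, d, h = 12, 8, 4
features = rng.normal(size=(T, d))
w1 = rng.normal(size=(d, h))
w2 = rng.normal(size=h)
pooled, weights = attention_pool(features, w1, w2)
```

The weights are non-negative and sum to one, so the pooled vector is a convex combination of the per-step features; a regressor for TES or PCS could then be applied to it.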

100 - Tamami Nakano, Atsuya Sakata, Akihiro Kishimoto 2020

Highlight detection in sports videos has a broad viewership and huge commercial potential. It is thus imperative to detect highlight scenes that better match human interest, with high temporal accuracy. Since people instinctively suppress blinks during attention-grabbing events and synchronously generate blinks at attention break points in videos, the instantaneous blink rate can be utilized as a highly accurate temporal indicator of human interest. Therefore, in this study, we propose a novel, automatic highlight detection method based on the blink rate. The method trains a one-dimensional convolutional network (1D-CNN) to assess the blink rate at each video frame from the spatio-temporal pose features of figure skating videos. Experiments show that the method successfully estimates the blink rate in 94% of the video clips and predicts the temporal change in the blink rate around a jump event with high accuracy. Moreover, the method detects not only the representative athletic action but also the distinctive artistic expression of figure skating performance as key frames. This suggests that the blink-rate-based supervised learning approach enables high-accuracy highlight detection that more closely matches human sensibility.

Computer Vision and Pattern Recognition - Multimedia

Trajectory tracing in figure skating

152 - Meghan Rhodes, Vakhtang Putkaradze 2021

In this work, we model the movement of a figure skater gliding on ice by the Chaplygin sleigh, a classic pedagogical example of a nonholonomic mechanical system. The Chaplygin sleigh is controlled by a movable added mass, modeling the movable center of mass of the figure skater. The position and velocity of the added mass act as controls that can be used to steer the skater in order to produce prescribed patterns. For any piecewise smooth prescribed curve, this model can be used to determine the controls needed to reproduce that curve by approximating the curve with circular arcs. Tracing of the circular arcs is exact in our control procedure, so the accuracy of the method depends solely on the accuracy of the approximation of a trajectory by circular arcs. To reproduce the individual elements of a pattern, we employ an optimization algorithm. We conclude by reproducing a classical double flower figure skating pattern and computing the resulting controls.
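To illustrate the idea of approximating a curve with circular arcs, here is a small sketch of one elementary building block: the circle passing through three sample points of a planar curve. This is only one simple way to obtain an arc and is not claimed to be the authors' procedure, which relies on an optimization algorithm; the function name `circle_through` is hypothetical.

```python
import math

def circle_through(p1, p2, p3):
    """Return ((cx, cy), r) for the circle through three planar points.

    Uses the standard circumcenter formula; raises ValueError when the
    points are (numerically) collinear and no finite circle exists.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        raise ValueError("points are collinear")
    s1, s2, s3 = x1**2 + y1**2, x2**2 + y2**2, x3**2 + y3**2
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    r = math.hypot(x1 - cx, y1 - cy)
    return (cx, cy), r

# Three points sampled from the unit circle centered at the origin.
center, radius = circle_through((1.0, 0.0), (0.0, 1.0), (-1.0, 0.0))
```

Applying this to consecutive triples of samples along a prescribed curve yields a sequence of arcs whose fit improves as the sampling becomes denser.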

Optimization and Control

Integrability and Chaos in Figure Skating

228 - Vaughn Gzenda, Vakhtang Putkaradze 2018

We derive and analyze a three-dimensional model of a figure skater. We model the skater as a three-dimensional body moving in space subject to a non-holonomic constraint enforcing movement along the skate's direction and holonomic constraints of continuous contact with ice and pitch constancy of the skate. For a static (non-articulated) skater, we show that the system is integrable if and only if the projection of the center of mass on the skate's direction coincides with the contact point with ice, under some mild (and realistic) assumptions on the directions of the inertia axes. The integrability is proved by showing the existence of two new constants of motion linear in momenta, providing a new and highly nontrivial example of an integrable non-holonomic mechanical system. We also consider the case when the projection of the center of mass on the skate's direction does not coincide with the contact point and show, by studying the divergence of nearby trajectories, that this non-integrable case exhibits apparent chaotic behavior. We also demonstrate the intricate behavior during the transition from the integrable to the chaotic case. Our model shows many features of real-life skating, especially figure skating, and we conjecture that real-life skaters may intuitively use the discovered mechanical properties of the system for the control of the performance on ice.

Exactly Solvable and Integrable Systems - Chaotic Dynamics

Visual Framing of Science Conspiracy Videos: Integrating Machine Learning with Communication Theories to Study the Use of Color and Brightness

201 - Kaiping Chen, Sang Jung Kim, Sebastian Raschka 2021

Recent years have witnessed an explosion of science conspiracy videos on the Internet, challenging science epistemology and public understanding of science. Scholars have started to examine the persuasion techniques used in conspiracy messages, such as uncertainty and fear, yet little is understood about the visual narratives, especially how visual narratives differ in videos that debunk conspiracies versus those that propagate conspiracies. This paper addresses this gap in understanding visual framing in conspiracy videos by analyzing millions of frames from conspiracy and counter-conspiracy YouTube videos using computational methods. We found that conspiracy videos tended to use lower color variance and brightness, especially in thumbnails and earlier parts of the videos. This paper also demonstrates how researchers can integrate textual and visual features for identifying conspiracies on social media and discusses the implications of computational modeling for scholars interested in studying visual manipulation in the digital era.
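The abstract does not define how brightness and color variance are measured. As a rough sketch under stated assumptions, the snippet below treats brightness as the mean Rec. 601 luma of an RGB frame and color variance as the per-channel pixel variance averaged over the three channels. Both definitions and the function name `frame_stats` are illustrative choices, not the paper's.

```python
import numpy as np

def frame_stats(frame):
    """Return (brightness, color_variance) for an (H, W, 3) RGB frame.

    brightness: mean Rec. 601 luma (assumed definition).
    color_variance: variance over pixels in each channel, averaged
    across the three channels (assumed definition).
    """
    rgb = np.asarray(frame, dtype=float)
    luma = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    brightness = float(luma.mean())
    color_variance = float(rgb.reshape(-1, 3).var(axis=0).mean())
    return brightness, color_variance

# A flat gray frame and a frame that is half black, half white.
gray = np.full((4, 4, 3), 128, dtype=np.uint8)
split = np.zeros((2, 2, 3), dtype=np.uint8)
split[0] = 255
gray_stats = frame_stats(gray)
split_stats = frame_stats(split)
```

Computing such statistics per frame and aggregating them per video would allow the kind of group comparison the abstract describes.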

Multimedia - Computer Vision and Pattern Recognition - Machine Learning

Quality Assessment of In-the-Wild Videos

87 - Dingquan Li, Tingting Jiang, Ming Jiang 2019

Quality assessment of in-the-wild videos is a challenging problem because of the absence of reference videos and the presence of shooting distortions. Knowledge of the human visual system can help establish methods for objective quality assessment of in-the-wild videos. In this work, we show that two eminent effects of the human visual system, namely content-dependency and temporal-memory effects, can be used for this purpose. We propose an objective no-reference video quality assessment method by integrating both effects into a deep neural network. For content-dependency, we extract features from a pre-trained image classification neural network for its inherent content-aware property. For temporal-memory effects, long-term dependencies, especially the temporal hysteresis, are integrated into the network with a gated recurrent unit and a subjectively-inspired temporal pooling layer. To validate the performance of our method, experiments are conducted on three publicly available in-the-wild video quality assessment databases: KoNViD-1k, CVD2014, and LIVE-Qualcomm. Experimental results demonstrate that our proposed method outperforms five state-of-the-art methods by a large margin, specifically, 12.39%, 15.71%, 15.45%, and 18.09% overall performance improvements over the second-best method VBLIINDS, in terms of SROCC, KROCC, PLCC, and RMSE, respectively. Moreover, the ablation study verifies the crucial role of both the content-aware features and the modeling of temporal-memory effects. The PyTorch implementation of our method is released at https://github.com/lidq92/VSFA.
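For readers unfamiliar with the four criteria named above, the following sketch computes them with NumPy: Spearman rank-order correlation (SROCC), Kendall rank-order correlation (KROCC), Pearson linear correlation (PLCC), and root mean squared error (RMSE). It is not taken from the VSFA repository. The helper names and the toy scores are invented for illustration, the rank-based measures assume there are no tied scores, and any fitting step a paper may apply before computing PLCC or RMSE is omitted.

```python
import numpy as np

def plcc(x, y):
    """Pearson linear correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def _ranks(x):
    # Ordinal ranks 0..n-1; valid only when x has no ties (assumption).
    return np.argsort(np.argsort(x)).astype(float)

def srocc(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return plcc(_ranks(x), _ranks(y))

def krocc(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    # The sum runs over ordered pairs, so divide by n * (n - 1).
    return float((dx * dy).sum() / (n * (n - 1)))

def rmse(x, y):
    """Root mean squared error between two score vectors."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy example: predictions that are monotonic in the subjective scores
# but not linearly related to them.
predicted = [1.0, 2.0, 3.0, 4.0]
subjective = [1.0, 4.0, 9.0, 16.0]
results = {
    "SROCC": srocc(predicted, subjective),
    "KROCC": krocc(predicted, subjective),
    "PLCC": plcc(predicted, subjective),
    "RMSE": rmse(predicted, subjective),
}
```

In this toy case the rank-based measures reach their maximum because the ordering is preserved, while PLCC stays below one because the relationship is nonlinear. That contrast is why the rank and linear criteria are usually reported together.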

Multimedia - Computer Vision and Pattern Recognition - Image and Video Processing