ﻻ يوجد ملخص باللغة العربية
Video transmission applications (e.g., conferencing) are gaining momentum, especially in times of global health pandemic. Video signals are transmitted over lossy channels, resulting in low-quality received signals. To restore videos on recipient edge devices in real-time, we introduce an efficient video restoration network, EVRNet. EVRNet efficiently allocates parameters inside the network using alignment, differential, and fusion modules. With extensive experiments on video restoration tasks (deblocking, denoising, and super-resolution), we demonstrate that EVRNet delivers competitive performance to existing methods with significantly fewer parameters and MACs. For example, EVRNet has 260 times fewer parameters and 958 times fewer MACs than enhanced deformable convolution-based video restoration network (EDVR) for 4 times video super-resolution while its SSIM score is 0.018 less than EDVR. We also evaluated the performance of EVRNet under multiple distortions on unseen dataset to demonstrate its ability in modeling variable-length sequences under both camera and object motion.
In this paper, we explore the spatial redundancy in video recognition with the aim to improve the computational efficiency. It is observed that the most informative region in each frame of a video is usually a small image patch, which shifts smoothly
Endoscopy is a routine imaging technique used for both diagnosis and minimally invasive surgical treatment. While the endoscopy video contains a wealth of information, tools to capture this information for the purpose of clinical reporting are rather
The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems. In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a promising re
Anomaly detection in videos is a problem that has been studied for more than a decade. This area has piqued the interest of researchers due to its wide applicability. Because of this, there has been a wide array of approaches that have been proposed
Existing unsupervised video-to-video translation methods fail to produce translated videos which are frame-wise realistic, semantic information preserving and video-level consistent. In this work, we propose UVIT, a novel unsupervised video-to-video