No Arabic abstract
Image retargeting effectively resizes images by preserving the recognizability of important image regions. Most of retargeting methods rely on good importance maps as a cue to retain or remove certain regions in the input image. In addition, the traditional evaluation exhaustively depends on user ratings. There is a legitimate need for a methodological approach for evaluating retargeted results. Therefore, in this paper, we conduct a study and analysis on the prominent method in image retargeting, Seam Carving. First, we introduce two novel evaluation metrics which can be considered as the proxy of user ratings. Second, we exploit salient object dataset as a benchmark for this task. We then investigate different types of importance maps for this particular problem. The experiments show that humans in general agree with the evaluation metrics on the retargeted results and some importance map methods are consistently more favorable than others.
Image retargeting is a new image processing task that renders the change of aspect ratio in images. One of the most famous image-retargeting algorithms is seam-carving. Although seam-carving is fast and straightforward, it usually distorts the images. In this paper, we introduce a new seam-carving algorithm that not only has the simplicity of the original seam-carving but also lacks the usual unwanted distortion existed in the original method. The positional distribution of seams is introduced. We show that the proposed method outperforms the original seam-carving in terms of retargeted image quality assessment and seam coagulation measures.
We present two new metrics for evaluating generative models in the class-conditional image generation setting. These metrics are obtained by generalizing the two most popular unconditional metrics: the Inception Score (IS) and the Frechet Inception Distance (FID). A theoretical analysis shows the motivation behind each proposed metric and links the novel metrics to their unconditional counterparts. The link takes the form of a product in the case of IS or an upper bound in the FID case. We provide an extensive empirical evaluation, comparing the metrics to their unconditional variants and to other metrics, and utilize them to analyze existing generative models, thus providing additional insights about their performance, from unlearned classes to mode collapse.
Purpose: Surgical task-based metrics (rather than entire procedure metrics) can be used to improve surgeon training and, ultimately, patient care through focused training interventions. Machine learning models to automatically recognize individual tasks or activities are needed to overcome the otherwise manual effort of video review. Traditionally, these models have been evaluated using frame-level accuracy. Here, we propose evaluating surgical activity recognition models by their effect on task-based efficiency metrics. In this way, we can determine when models have achieved adequate performance for providing surgeon feedback via metrics from individual tasks. Methods: We propose a new CNN-LSTM model, RP-Net-V2, to recognize the 12 steps of robotic-assisted radical prostatectomies (RARP). We evaluated our model both in terms of conventional methods (e.g. Jaccard Index, task boundary accuracy) as well as novel ways, such as the accuracy of efficiency metrics computed from instrument movements and system events. Results: Our proposed model achieves a Jaccard Index of 0.85 thereby outperforming previous models on robotic-assisted radical prostatectomies. Additionally, we show that metrics computed from tasks automatically identified using RP-Net-V2 correlate well with metrics from tasks labeled by clinical experts. Conclusions: We demonstrate that metrics-based evaluation of surgical activity recognition models is a viable approach to determine when models can be used to quantify surgical efficiencies. We believe this approach and our results illustrate the potential for fully automated, post-operative efficiency reports.
In this paper, we present a method of clothes retargeting; generating the potential poses and deformations of a given 3D clothing template model to fit onto a person in a single RGB image. The problem is fundamentally ill-posed as attaining the ground truth data is impossible, i.e., images of people wearing the different 3D clothing template model at exact same pose. We address this challenge by utilizing large-scale synthetic data generated from physical simulation, allowing us to map 2D dense body pose to 3D clothing deformation. With the simulated data, we propose a semi-supervised learning framework that validates the physical plausibility of the 3D deformation by matching with the prescribed body-to-cloth contact points and clothing silhouette to fit onto the unlabeled real images. A new neural clothes retargeting network (CRNet) is designed to integrate the semi-supervised retargeting task in an end-to-end fashion. In our evaluation, we show that our method can predict the realistic 3D pose and deformation field needed for retargeting clothes models in real-world examples.
Image retargeting is the task of making images capable of being displayed on screens with different sizes. This work should be done so that high-level visual information and low-level features such as texture remain as intact as possible to the human visual system, while the output image may have different dimensions. Thus, simple methods such as scaling and cropping are not adequate for this purpose. In recent years, researchers have tried to improve the existing retargeting methods and introduce new ones. However, a specific method cannot be utilized to retarget all types of images. In other words, different images require different retargeting methods. Image retargeting has a close relationship to image saliency detection, which is relatively a new image processing task. Earlier saliency detection methods were based on local and global but low-level image information. These methods are called bottom-up methods. On the other hand, newer approaches are top-down and mixed methods that consider the high level and semantic information of the image too. In this paper, we introduce the proposed methods in both saliency detection and retargeting. For the saliency detection, the use of image context and semantic segmentation are examined, and a novel mixed bottom-up, and top-down saliency detection method is introduced. After saliency detection, a modified version of an existing retargeting method is utilized for retargeting the images. The results suggest that the proposed image retargeting pipeline has excellent performance compared to other tested methods. Also, the subjective evaluations on the Pascal dataset can be used as a retargeting quality assessment dataset for further research.