Research papers and master's and doctoral theses published by Hyun Soo Park

105 - Benjamin Hayden, Hyun Soo Park, Jan Zimmermann 2021

Understanding primate behavior is a mission-critical goal of both biology and biomedicine. Despite the importance of behavior, our ability to rigorously quantify it has heretofore been limited to low-information measures like preference, looking time, and reaction time, or to non-scalable measures like ethograms. However, recent technological advances have led to a major revolution in behavioral measurement. Specifically, digital video cameras and automated pose tracking software can provide detailed measures of full body position (i.e., pose) of multiple primates over time (i.e., behavior) with high spatial and temporal resolution. Pose-tracking technology in turn can be used to detect behavioral states, such as eating, sleeping, and mating. The availability of such data has in turn spurred developments in data analysis techniques. Together, these changes are poised to lead to major advances in scientific fields that rely on behavior as a dependent variable. In this review, we situate the tracking revolution in the history of the study of behavior, argue for investment in and development of analytical and research techniques that can profit from the advent of the era of big behavior, and propose that zoos will have a central role to play in this era.

Quantitative Methods

Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos

101 - Yasamin Jafarian, Hyun Soo Park 2021

A key challenge of learning the geometry of dressed humans lies in the limited availability of ground truth data (e.g., 3D scanned models), which degrades the performance of 3D human reconstruction when applied to real-world imagery. We address this challenge by leveraging a new data resource: a number of social media dance videos that span diverse appearance, clothing styles, performances, and identities. Each video depicts dynamic movements of the body and clothes of a single person while lacking the 3D ground truth geometry. To utilize these videos, we present a new method that uses a local transformation to warp the predicted local geometry of the person from one image to that of another image at a different time instant. This allows self-supervision by enforcing temporal coherence over the predictions. In addition, we jointly learn the depth along with the surface normals, which are highly responsive to local texture, wrinkle, and shade, by maximizing their geometric consistency. Our method is end-to-end trainable, resulting in high fidelity depth estimation that predicts fine geometry faithful to the input real image. We demonstrate that our method outperforms the state-of-the-art human depth estimation and human shape recovery approaches on both real and rendered images.

Computer Vision and Pattern Recognition

Multiview Cross-supervision for Semantic Segmentation

89 - Yuan Yao, Hyun Soo Park 2018

This paper presents a semi-supervised learning framework for a customized semantic segmentation task using multiview image streams. A key challenge of the customized task lies in the limited accessibility of labeled data due to the prohibitive manual annotation effort required. We hypothesize that it is possible to leverage multiview image streams that are linked through the underlying 3D geometry, which can provide an additional supervisory signal to train a segmentation model. We formulate a new cross-supervision method using a shape belief transfer: the segmentation belief in one image is used to predict that of the other image through epipolar geometry, analogous to shape-from-silhouette. The shape belief transfer provides upper and lower bounds on the segmentation for the unlabeled data, and the gap between them asymptotically approaches zero as the number of labeled views increases. We integrate this theory to design a novel network that is agnostic to camera calibration, network model, and semantic category and bypasses the intermediate process of suboptimal 3D reconstruction. We validate this network by recognizing a customized semantic category per pixel from real-world visual data, including non-human species and a subject of interest in social videos, where attaining large-scale annotation data is infeasible.

Computer Vision and Pattern Recognition
