
KIT MOMA: A Mobile Machines Dataset

Published by: Yusheng Xiang
Publication date: 2020
Research field: Informatics Engineering
Language: English





Mobile machines, which typically work in a closed site, have a high potential to utilize autonomous driving technology. However, vigorous development and innovation are happening mostly in the area of passenger cars. In contrast, although there is also much research on autonomous driving or autonomous working for mobile machines, a consensus on the state-of-the-art (SOTA) solution has not yet been reached. We believe that the most urgent problem to solve is the absence of a public and challenging visual dataset, which would make the results of different studies comparable. To address this problem, we publish the KIT MOMA dataset, covering eight classes of commonly used mobile machines, which can serve as a benchmark for evaluating SOTA algorithms for detecting mobile construction machines. The gathered images are captured from viewpoints outside the mobile machines, since we believe fixed cameras on the ground are more suitable when all machines of interest work in a closed site. Most of the images in KIT MOMA show real scenes, whereas some are taken from the official websites of top construction machine companies. We have also evaluated the performance of YOLO v3 on our dataset, indicating that SOTA computer vision algorithms already perform excellently at detecting mobile machines in a specific working site. Together with the dataset, we upload the trained weights, which can be used directly by engineers from the construction machine industry. The dataset, trained weights, and updates can be found on our GitHub; a demo is available on our YouTube channel.
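As a rough illustration of how published YOLOv3 Darknet weights like these could be used, the following is a minimal sketch of loading such a model with OpenCV's DNN module and detecting machines in a single frame from a fixed site camera. The file names (kit_moma_yolov3.cfg, kit_moma_yolov3.weights, kit_moma.names, site_camera_frame.jpg) are placeholders, not necessarily the names used in the KIT MOMA repository.

```python
# Hypothetical sketch: running YOLOv3 weights for the eight KIT MOMA classes
# with OpenCV's DNN module. File names below are placeholders.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("kit_moma_yolov3.cfg", "kit_moma_yolov3.weights")
classes = open("kit_moma.names").read().strip().split("\n")  # the eight machine classes

img = cv2.imread("site_camera_frame.jpg")
h, w = img.shape[:2]

# YOLOv3 expects a square RGB blob (e.g. 416x416) scaled to [0, 1]
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())

# Report detections above a confidence threshold
for output in outputs:
    for det in output:
        scores = det[5:]
        class_id = int(np.argmax(scores))
        conf = float(scores[class_id])
        if conf > 0.5:
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            x, y = int(cx - bw / 2), int(cy - bh / 2)
            print(f"{classes[class_id]}: {conf:.2f} at ({x}, {y}, {int(bw)}, {int(bh)})")
```

In a real deployment one would also apply non-maximum suppression (e.g. cv2.dnn.NMSBoxes) to merge overlapping boxes before using the detections.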




Read also

Face tracking serves as the crucial initial step in mobile applications trying to analyse target faces over time in mobile settings. However, this problem has received little attention, mainly due to the scarcity of dedicated face tracking benchmarks. In this work, we introduce MobiFace, the first dataset for single face tracking in mobile situations. It consists of 80 unedited live-streaming mobile videos captured by 70 different smartphone users in fully unconstrained environments. Over 95K bounding boxes are manually labelled. The videos are carefully selected to cover typical smartphone usage. The videos are also annotated with 14 attributes, including 6 newly proposed attributes and 8 commonly seen in object tracking. 36 state-of-the-art trackers, including facial landmark trackers, generic object trackers and trackers that we have fine-tuned or improved, are evaluated. The results suggest that mobile face tracking cannot be solved through existing approaches. In addition, we show that fine-tuning on the MobiFace training data significantly boosts the performance of deep learning-based trackers, suggesting that MobiFace captures the unique characteristics of mobile face tracking. Our goal is to offer the community a diverse dataset to enable the design and evaluation of mobile face trackers. The dataset, annotations and the evaluation server will be available at https://mobiface.github.io/.
The use of datasets is gaining relevance in surgical robotics, since they can be used to recognise and automate tasks. This also allows common datasets to be used to compare different algorithms and methods. The objective of this work is to provide a complete dataset of three common training tasks that surgeons perform to improve their skills. For this purpose, 12 subjects teleoperated the da Vinci Research Kit to perform these tasks. The resulting dataset includes all the kinematic and dynamic information provided by the da Vinci robot (both master and slave side) together with the associated camera video. All the information has been carefully timestamped and provided in a readable CSV format. A MATLAB interface integrated with ROS for using and replicating the data is also provided.
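As a quick illustration, a timestamped CSV recording of this kind could be inspected in a few lines of Python. The file name and column names below ("timestamp", "psm1_pos_x", ...) are assumptions for illustration only; the actual schema is defined by the dataset's documentation.

```python
# Minimal sketch of loading one timestamped kinematics recording.
# File and column names are hypothetical placeholders.
import pandas as pd

kin = pd.read_csv("subject01_task1_kinematics.csv")

# Sort by timestamp and estimate the sampling period of the recording
kin = kin.sort_values("timestamp")
dt = kin["timestamp"].diff().mean()
print(f"samples: {len(kin)}, mean sample period: {dt:.4f} s")

# Example: extract the slave-side tool-tip trajectory for plotting or learning
trajectory = kin[["psm1_pos_x", "psm1_pos_y", "psm1_pos_z"]].to_numpy()
```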
Current optimization approaches for construction machinery are mainly based on internal sensors. However, the choice of a reasonable strategy is determined not only by the machine's intrinsic signals but also, very strongly, by environmental information, especially the terrain. Due to the dynamically changing construction site and the consequent absence of a high-definition map, Simultaneous Localization and Mapping (SLAM), which can provide terrain information for construction machines, remains challenging. Current SLAM technologies proposed for mobile machines depend strongly on costly or computationally expensive sensors, such as RTK GPS and cameras, so commercial use is rare. In this study, we propose an affordable SLAM method that creates a multi-layer grid map of the construction site, so that the machine has environmental information and can be optimized accordingly. Concretely, after the machine passes by, we obtain the local information and record it. Combined with positioning technology, we then create a map of the places of interest on the construction site. Based on results gathered in Gazebo, we show that a suitable sensor layout is the combination of 1 IMU and 2 differential GPS antennas fused with an unscented Kalman filter, which keeps the average distance error below 2 m and the mapping error below 1.3% in a harsh environment. As an outlook, our SLAM technology provides the cornerstone for many efficiency-improvement approaches.
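The fusion idea described above can be illustrated with a heavily simplified sketch: an unscented Kalman filter (here from the filterpy library) fusing GPS position fixes under a planar constant-velocity motion model. The state layout, noise values, and measurement sequence are illustrative assumptions, not the configuration used in the study, which additionally fuses IMU data and a second GPS antenna for heading.

```python
# Hypothetical sketch of GPS fusion with an unscented Kalman filter (filterpy).
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

dt = 0.1  # filter step in seconds

def fx(x, dt):
    # state [px, py, vx, vy]: constant-velocity prediction
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]])
    return F @ x

def hx(x):
    # a GPS antenna observes position only
    return x[:2]

points = MerweScaledSigmaPoints(n=4, alpha=0.1, beta=2.0, kappa=0.0)
ukf = UnscentedKalmanFilter(dim_x=4, dim_z=2, dt=dt, fx=fx, hx=hx, points=points)
ukf.x = np.zeros(4)
ukf.P *= 10.0
ukf.R = np.diag([1.5, 1.5]) ** 2   # GPS measurement noise (m^2), assumed
ukf.Q = np.eye(4) * 0.05           # process noise, assumed

gps_measurements = [[0.0, 0.0], [0.5, 0.1], [1.1, 0.2]]  # synthetic fixes (east, north) in metres
for z in gps_measurements:
    ukf.predict()
    ukf.update(np.asarray(z))
    # ukf.x[:2] is the fused position estimate used to stamp the local grid map
```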
The first mobile camera phone was sold only 20 years ago, when taking pictures with one's phone was an oddity and sharing pictures online was unheard of. Today, the smartphone is more camera than phone. How did this happen? This transformation was enabled by advances in computational photography: the science and engineering of making great images from small-form-factor mobile cameras. Modern algorithmic and computing advances, including machine learning, have changed the rules of photography, bringing to it new modes of capture, post-processing, storage, and sharing. In this paper, we give a brief history of mobile computational photography and describe some of the key technological components, including burst photography, noise reduction, and super-resolution. At each step, we draw parallels to the human visual system.
We relate the Riemann curvature of a holographic spacetime to an entanglement property of the dual CFT state: the Berry curvature of its modular Hamiltonians. The modular Berry connection encodes the relative bases of nearby CFT subregions while its bulk dual, restricted to the code subspace, relates the edge-mode frames of the corresponding entanglement wedges. At leading order in 1/N and for sufficiently smooth HRRT surfaces, the modular Berry connection simply sews together the orthonormal coordinate systems covering neighborhoods of HRRT surfaces. This geometric perspective on entanglement is a promising new tool for connecting the dynamics of entanglement and gravitation.