ﻻ يوجد ملخص باللغة العربية
Among several road hazards that are present in any paved way in the world, potholes are one of the most annoying and also involving higher maintenance costs. There exists an increasing interest on the automated detection of these hazards enabled by technological and research progress. Our research work tackled the challenge of pothole detection from images of real world road scenes. The main novelty resides on the application of the latest progress in AI to learn the visual appearance of potholes. We built a large dataset of images with pothole annotations. They contained road scenes from different cities in the world, taken with different cameras, vehicles and viewpoints under varied environmental conditions. Then, we fine-tuned four different object detection models based on Faster R-CNN and SSD deep neural networks. We achieved high average precision and the pothole detector was tested on the Nvidia DrivePX2 platform with GPGPU capability, which can be embedded on vehicles. Moreover, it was deployed on a real vehicle to notify the detected potholes to a given IoT platform as part of AUTOPILOT H2020 project.
Text detection in natural scene images for content analysis is an interesting task. The research community has seen some great developments for English/Mandarin text detection. However, Urdu text extraction in natural scene images is a task not well
Despite the huge progress in scene graph generation in recent years, its long-tail distribution in object relationships remains a challenging and pestering issue. Existing methods largely rely on either external knowledge or statistical bias informat
An unsupervised image-to-image translation (UI2I) task deals with learning a mapping between two domains without paired images. While existing UI2I methods usually require numerous unpaired images from different domains for training, there are many s
Endoscopy is a widely used imaging modality to diagnose and treat diseases in hollow organs as for example the gastrointestinal tract, the kidney and the liver. However, due to varied modalities and use of different imaging protocols at various clini
Face images are subject to many different factors of variation, especially in unconstrained in-the-wild scenarios. For most tasks involving such images, e.g. expression recognition from video streams, having enough labeled data is prohibitively expen