
An Automatic Image Content Retrieval Method for better Mobile Device Display User Experiences

Posted by: Dr Alessandro Bruno
Publication date: 2021
Research field: Informatics Engineering
Paper language: English
Author: Alessandro Bruno





A growing number of commercially available mobile phones come with integrated high-resolution digital cameras, enabling a new class of applications dedicated to image analysis, such as mobile visual search, image cropping, object detection, content-based image retrieval, and image classification. In this paper, a new mobile application for image content retrieval and classification on mobile device displays is proposed to enrich the visual experience of users. The application extracts a set of image crops from a given photograph using visual saliency methods that detect the most relevant regions from a perceptual viewpoint. First, the most salient areas are located at the local maxima of a 2D saliency function. Next, each salient region is cropped with a bounding box centred on a local maximum of the thresholded saliency map. Then, each crop is fed into an image classification system based on SVM and SIFT descriptors to identify the class of the object it contains, with the ImageNet repository used as the reference for semantic category classification. The application was implemented on the Android platform with a client-server architecture: the mobile client sends the photo taken by the camera to the server, which processes the image and returns the results (image crops and their target classes) to the client. The application was run on thousands of pictures and showed encouraging results towards a better visual experience on mobile displays.
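The saliency-driven cropping stage can be pictured with a short sketch. The snippet below is a minimal illustration of the described idea, not the authors' code: it assumes OpenCV's contrib saliency module (spectral residual) as the 2D saliency function, and the crop size, threshold, and maximum number of crops are illustrative parameters. The resulting crops would then be passed to the SIFT/SVM classification stage.

```python
# Minimal sketch: crops centred on local maxima of a thresholded saliency map.
# Assumes opencv-contrib-python; parameter values are illustrative only.
import cv2
import numpy as np

def extract_salient_crops(image, crop_size=128, threshold=0.6, max_crops=5):
    """Return image crops centred on the strongest saliency-map maxima."""
    saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, sal_map = saliency.computeSaliency(image)
    if not ok:
        return []
    sal_map = cv2.GaussianBlur(sal_map.astype(np.float32), (9, 9), 0)
    sal_map /= sal_map.max() + 1e-8          # normalise to [0, 1]

    h, w = sal_map.shape
    half = crop_size // 2
    crops = []
    work = sal_map.copy()
    # Greedily pick the strongest maxima, zeroing a neighbourhood each time
    # so the same salient region is not cropped twice.
    for _ in range(max_crops):
        y, x = np.unravel_index(np.argmax(work), work.shape)
        if work[y, x] < threshold:
            break
        x0, y0 = max(0, x - half), max(0, y - half)
        x1, y1 = min(w, x + half), min(h, y + half)
        crops.append(image[y0:y1, x0:x1].copy())
        work[y0:y1, x0:x1] = 0.0
    return crops
```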




Read also

We address the problem of unpaired geometric image-to-image translation. Rather than transferring the style of an image as a whole, our goal is to translate the geometry of an object as depicted in different domains while preserving its appearance characteristics. Our model is trained in an unpaired fashion, i.e. without the need for paired images during training. It performs all steps of the shape transfer within a single model and without additional post-processing stages. Extensive experiments on the VITON, CMU-Multi-PIE and our own FashionStyle datasets show the effectiveness of the method. In addition, we show that despite their low dimensionality, the features learned by our model are useful for the item retrieval task.
This paper proposes direct learning of image classification from user-supplied tags, without filtering. Each tag is supplied by the user who shared the image online. Enormous numbers of these tags are freely available online, and they give insight about the image categories important to users and to image classification. Our approach is complementary to the conventional approach of manual annotation, which is extremely costly. We analyze the Flickr 100 Million Image dataset, making several useful observations about the statistics of these tags. We introduce a large-scale robust classification algorithm to handle the inherent noise in these tags, and a calibration procedure to better predict objective annotations. We show that freely available, user-supplied tags can yield results similar or superior to those obtained with large databases of costly manual annotations.
In this paper, we developed a system for recognizing orchid species from images of flowers. We used the MSRM (Maximal Similarity based on Region Merging) method to segment the flower object from the background and extracted shape features such as the distance from the edge to the centroid point of the flower, aspect ratio, roundness, moment invariants, and fractal dimension, as well as color features. We used the HSV color feature, ignoring the V value. To retrieve the images, we used the Support Vector Machine (SVM) method. The orchid is a unique flower: it has a part called the lip (labellum) that distinguishes it from other flowers and even from other types of orchids. Thus, in this paper, we propose to perform feature extraction not only on the flower region but also on the lip (labellum) region. The results show that our proposed method can increase the accuracy of content-based flower image retrieval for orchid species by approximately 14%. The most dominant features are Centroid Contour Distance, Moment Invariants and HSV Color. The system accuracy is 85.33% in the validation phase and 79.33% in the testing phase.
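As a rough illustration of the colour-feature step described above (not the authors' implementation), the sketch below computes an HSV histogram that deliberately ignores the V channel and feeds such vectors to an SVM, assuming OpenCV and scikit-learn; bin counts, image lists and labels are hypothetical placeholders.

```python
# Illustrative sketch only: Hue-Saturation colour feature (V channel ignored)
# classified with an SVM. Assumes OpenCV and scikit-learn are installed.
import cv2
import numpy as np
from sklearn.svm import SVC

def hs_histogram(bgr_image, h_bins=30, s_bins=32):
    """Normalised 2D Hue-Saturation histogram; the V channel is not used."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [h_bins, s_bins], [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

# Hypothetical usage with lists of training images and integer species labels:
# features = np.array([hs_histogram(img) for img in train_images])
# clf = SVC(kernel="rbf").fit(features, train_labels)
# prediction = clf.predict([hs_histogram(query_image)])
```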
Hai Phan, Dang Huynh, Yihui He (2019)
MobileNet and Binary Neural Networks are two of the most widely used techniques for constructing deep learning models that perform a variety of tasks on mobile and embedded platforms. In this paper, we present a simple yet efficient scheme for binarizing MobileNet at both the activation functions and the model weights. However, training a binary network from scratch with separable depth-wise and point-wise convolutions, as in MobileNet, is not trivial and is prone to divergence. To tackle this training issue, we propose a novel neural network architecture, MoBiNet (Mobile Binary Network), in which skip connections are manipulated to prevent information loss and vanishing gradients, thus facilitating the training process. More importantly, while existing binary neural networks often rely on cumbersome backbones such as AlexNet, ResNet or VGG-16 with float-type pre-trained weight initialization, MoBiNet focuses on binarizing already-compressed networks like MobileNet without the need for a pre-trained model to start from. Our proposal therefore results in an effectively small model while keeping the accuracy comparable to existing ones. Experiments on the ImageNet dataset show the potential of MoBiNet, as it achieves 54.40% top-1 accuracy while dramatically reducing the computational cost through binary operators.
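For readers unfamiliar with weight binarization, the fragment below is a generic sketch of the usual sign-based binarization with a straight-through estimator in PyTorch; it illustrates the general mechanism only and is not the MoBiNet architecture itself.

```python
# Generic weight binarization with a straight-through estimator (STE);
# a sketch of the common mechanism, not the MoBiNet implementation.
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)                  # forward pass uses {-1, +1} weights

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # STE: pass gradients through only where the latent weight lies in [-1, 1]
        return grad_output * (w.abs() <= 1).float()

class BinaryConv2d(torch.nn.Conv2d):
    """Conv layer whose weights are binarized on the fly during the forward pass."""
    def forward(self, x):
        return torch.nn.functional.conv2d(
            x, BinarizeSTE.apply(self.weight), self.bias,
            self.stride, self.padding, self.dilation, self.groups)
```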
Existing research in scene image classification has focused on either content features (e.g., visual information) or context features (e.g., annotations). Since they capture complementary information that is useful for discriminating images of different classes, we expect that fusing them will improve classification results. In this paper, we propose new techniques to compute content features and context features, and then fuse them. For content features, we design multi-scale deep features based on background and foreground information in images. For context features, we use annotations of similar images available on the web to design a codebook of filter words. Our experiments on three widely used benchmark scene datasets, using a support vector machine classifier, reveal that the proposed context and content features produce better results than existing context and content features, respectively. The fusion of the two proposed types of features significantly outperforms numerous state-of-the-art features.
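The fusion idea can be summarised in a few lines. The sketch below simply concatenates hypothetical content and context feature vectors and trains an SVM on the joint representation, assuming scikit-learn; it illustrates feature-level fusion in general, not the authors' pipeline, and the array names stand in for the deep content and codebook context descriptors.

```python
# Feature-level fusion sketch: concatenate per-image content and context
# descriptors and train an SVM on the joint vector (scikit-learn assumed).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def fuse(content_feats, context_feats):
    """Stack content (n, d1) and context (n, d2) vectors into (n, d1 + d2)."""
    return np.hstack([content_feats, context_feats])

# Hypothetical usage:
# fused = fuse(content_feats, context_feats)
# clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(fused, labels)
```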
