Pragmatic Image Compression for Human-in-the-Loop Decision-Making

Published by Siddharth Reddy

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Siddharth Reddy, Anca D. Dragan, Sergey Levine

Standard lossy image compression algorithms aim to preserve an image's appearance while minimizing the number of bits needed to transmit it. However, the amount of information actually needed by a user for downstream tasks (e.g., deciding which product to click on in a shopping website) is likely much lower. To achieve this lower bitrate, we would ideally only transmit the visual features that drive user behavior, while discarding details irrelevant to the user's decisions. We approach this problem by training a compression model through human-in-the-loop learning as the user performs tasks with the compressed images. The key insight is to train the model to produce a compressed image that induces the user to take the same action that they would have taken had they seen the original image. To approximate the loss function for this model, we train a discriminator that tries to distinguish whether a user's action was taken in response to the compressed image or the original. We evaluate our method through experiments with human participants on four tasks: reading handwritten digits, verifying photos of faces, browsing an online shopping catalogue, and playing a car racing video game. The results show that our method learns to match the user's actions with and without compression at lower bitrates than baseline methods, and adapts the compression model to the user's behavior: it preserves the digit number and randomizes handwriting style in the digit reading task, preserves hats and eyeglasses while randomizing faces in the photo verification task, preserves the perceived price of an item while randomizing its color and background in the online shopping task, and preserves upcoming bends in the road in the car racing game.
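
The discriminator idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each user action is summarized by a single number, uses a hand-written 1-D logistic discriminator, and the names `train_discriminator` and `compressor_loss` are hypothetical. The point is only that when actions under compression match actions under the original image, the discriminator cannot tell them apart and the compressor's loss stays low.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_discriminator(actions_original, actions_compressed,
                        steps=2000, lr=0.1, seed=0):
    """Fit D(a) = P(action a was taken in response to a compressed image).

    Assumption: actions are scalars and D is a 1-D logistic model.
    """
    rng = random.Random(seed)
    data = ([(a, 0.0) for a in actions_original]
            + [(a, 1.0) for a in actions_compressed])
    w, b = 0.0, 0.0
    for _ in range(steps):
        a, y = rng.choice(data)
        p = sigmoid(w * a + b)
        # Stochastic gradient step on the binary cross-entropy.
        w -= lr * (p - y) * a
        b -= lr * (p - y)
    return w, b


def compressor_loss(w, b, actions_compressed):
    """Average -log(1 - D(a)) over actions taken on compressed images.

    Low when the discriminator cannot flag those actions as
    'compressed', i.e. when they look like original-image actions.
    """
    eps = 1e-12
    total = sum(-math.log(1.0 - sigmoid(w * a + b) + eps)
                for a in actions_compressed)
    return total / len(actions_compressed)


if __name__ == "__main__":
    original = [0.0] * 50
    for label, compressed in (("mismatched", [1.0] * 50),
                              ("matched", [0.0] * 50)):
        w, b = train_discriminator(original, compressed)
        print(label, round(compressor_loss(w, b, compressed), 3))
```

In this toy setup the mismatched case yields a clearly higher loss than the matched case, which is the signal a compression model would be trained to reduce.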

85 - Jia Quan Shen , Luo-Luo Jiang 2017

What happens in the brain when human beings play games with computers? Here, a simple zero-sum game was conducted to investigate how people make decisions even when they know that their opponent is a computer. The human subject has two choices (a low or a high number) and the computer has two strategies (red or green). When the number selected by the human subject meets red, the subject loses a score equal to that number. Conversely, the subject gains a score equal to the number if the computer chooses green. Both the human subject and the computer make their choices at the same time, and subjects are told that the computer chooses randomly between red and green. During the experiments, electroencephalograph (EEG) signals from the subjects' brains were recorded. From the analysis of the EEG, we find that people mind losses more than gains, and that this effect becomes more pronounced as the gap between loss and gain grows. In addition, the EEG signal is clearly distinguishable before different decisions are made. Significant negative waves are observed across the entire brain region when the participant has a greater expectation for the outcome, and these negative waves are mainly concentrated in the forebrain region.

Neurons and Cognition, Human-Computer Interaction

Inertial Sensor Data To Image Encoding For Human Action Recognition

226 - Zeeshan Ahmad , Naimul Khan 2021

Convolutional Neural Networks (CNNs) are successful deep learning models in the field of computer vision. To get the maximum advantage of CNN models for Human Action Recognition (HAR) using inertial sensor data, in this paper we use four types of spatial domain methods for transforming inertial sensor data to activity images, which are then utilized in a novel fusion framework. These four types of activity images are Signal Images (SI), Gramian Angular Field (GAF) Images, Markov Transition Field (MTF) Images and Recurrence Plot (RP) Images. Furthermore, to create a multimodal fusion framework and to exploit the activity images, we made each type of activity image multimodal by convolving it with two spatial domain filters: the Prewitt filter and the High-boost filter. ResNet-18, a CNN model, is used to learn deep features from the multiple modalities. Learned features are extracted from the last pooling layer of each ResNet and then fused by canonical correlation based fusion (CCF) to improve the accuracy of human action recognition. These highly informative features serve as input to a multiclass Support Vector Machine (SVM). Experimental results on three publicly available inertial datasets show the superiority of the proposed method over the current state-of-the-art.
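
To make the idea of turning a 1-D sensor signal into an image concrete, here is a minimal sketch of a Gramian Angular Field in its common summation form, where entry (i, j) is cos(phi_i + phi_j) after rescaling the series to [-1, 1]. The abstract does not say which GAF variant or preprocessing the authors use, so the function name `gramian_angular_field` and the min-max rescaling are assumptions made for illustration.

```python
import math


def gramian_angular_field(series):
    """Return the summation GAF matrix for a 1-D series.

    Assumption: min-max rescaling to [-1, 1] followed by
    G[i][j] = cos(phi_i + phi_j), with phi = arccos(rescaled value).
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Rescale so every value is a valid cosine.
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]
    # Clamp to guard against floating-point drift outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


if __name__ == "__main__":
    # A short, made-up accelerometer-like trace.
    for row in gramian_angular_field([0.0, 1.0, 2.0]):
        print([round(v, 3) for v in row])
```

The resulting square matrix can be treated as a single-channel image; a real pipeline would apply this per sensor axis and per time window before feeding a CNN.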

Computer Vision and Pattern Recognition, Human-Computer Interaction, Machine Learning

Augmenting Decision Making via Interactive What-If Analysis

226 - Sneha Gathani , Madelon Hulsebos , James Gale 2021

The fundamental goal of business data analysis is to improve business decisions using data. Business users such as sales, marketing, product, or operations managers often make decisions to achieve key performance indicator (KPI) goals such as increasing customer retention, decreasing cost, and increasing sales. To discover the relationship between data attributes hypothesized to be drivers and those corresponding to KPIs of interest, business users currently need to perform lengthy exploratory analyses, considering multitudes of combinations and scenarios, slicing, dicing, and transforming the data accordingly. Examples include analyzing customer retention across quarters of the year or suggesting optimal media channels across strata of customers. However, the increasing complexity of datasets combined with the cognitive limitations of humans makes it challenging to carry over multiple hypotheses, even for simple datasets. Mentally performing such analyses is therefore hard. Existing commercial tools either provide partial solutions whose effectiveness remains unclear or fail to cater to business users. Here we argue for four functionalities that we believe are necessary to enable business users to interactively learn and reason about the relationships (functions) between sets of data attributes, facilitating data-driven decision making. We implement these functionalities in SystemD, an interactive visual analysis system enabling business users to experiment with the data by asking what-if questions. We evaluate the system through three business use cases: marketing mix modeling analysis, customer retention analysis, and deal closing analysis, and report on feedback from multiple business users. Overall, business users find SystemD intuitive and useful for quickly testing and validating their hypotheses around KPIs of interest, as well as for making effective and fast data-driven decisions.

Databases, Human-Computer Interaction, Machine Learning

Human-Understandable Decision Making for Visual Recognition

104 - Xiaowei Zhou , Jie Yin , Ivor Tsang 2021

The widespread use of deep neural networks has achieved substantial success in many tasks. However, there still exists a huge gap between the operating mechanism of deep learning models and human-understandable decision making, so that humans cannot fully trust the predictions made by these models. To date, little work has been done on how to align the behaviors of deep learning models with human perception in order to train a human-understandable model. To fill this gap, we propose a new framework to train a deep neural network by incorporating the prior of human perception into the model learning process. Our proposed model mimics the process of perceiving conceptual parts from images and assessing their relative contributions towards the final recognition. The effectiveness of our proposed model is evaluated on two classical visual recognition tasks. The experimental results and analysis confirm that our model not only provides interpretable explanations for its predictions but also maintains competitive recognition accuracy.

Artificial Intelligence

Empirically Evaluating Creative Arc Negotiation for Improvisational Decision-making

79 - Mikhail Jacob , Brian Magerko 2021

Action selection from many options with few constraints is crucial for improvisation and co-creativity. Our previous work proposed creative arc negotiation to solve this problem, i.e., selecting actions to follow an author-defined "creative arc" or trajectory over estimates of novelty, unexpectedness, and quality for potential actions. The CARNIVAL agent architecture demonstrated this approach for playing the Props game from improv theatre in the Robot Improv Circus installation. This article evaluates the creative arc negotiation experience with CARNIVAL through two crowdsourced observer studies and one improviser laboratory study. The studies focus on subjects' ability to identify creative arcs in performance and their preference for creative arc negotiation compared to a random selection baseline. Our results show empirically that observers successfully identified creative arcs in performances. Both groups also preferred creative arc negotiation in agent creativity and logical coherence, while observers enjoyed it more too.

Artificial Intelligence, Human-Computer Interaction