Scatterplots are used for a variety of visual analytics tasks, including cluster identification, and the visual encodings used on a scatterplot play a decisive role in the level of visual separation of clusters. For visualization designers, optimizing the visual encodings is crucial to maximizing the clarity of data. This requires accurately modeling human perception of cluster separation, which remains challenging. We present a multi-stage user study focusing on four factors (distribution size of clusters, number of points, size of points, and opacity of points) that influence cluster identification in scatterplots. From these parameters, we have constructed two models, a distance-based model and a density-based model, using the merge tree data structure from Topological Data Analysis. Our analysis demonstrates that these factors play an important role in the number of clusters perceived, and it verifies that the distance-based and density-based models can reasonably estimate the number of clusters a user observes. Finally, we demonstrate how these models can be used to optimize visual encodings on real-world data.
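As a rough illustration of the distance-based idea, the sketch below records the distances at which groups of points merge under single linkage (the merge events of a merge tree built on pairwise distance) and counts the groups that remain at a chosen threshold. It is a minimal sketch under stated assumptions, not the model from the study: the names `merge_events`, `count_clusters`, and `threshold` are hypothetical, and the factors the study examines (point size, opacity, distribution size) are not modeled here.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def merge_events(points: Sequence[Point]) -> List[float]:
    """Return, in ascending order, the distances at which two components merge.

    Hypothetical helper: a single-linkage pass over all pairwise distances
    using a union-find structure. Each recorded distance corresponds to one
    merge in a distance-based merge tree.
    """
    n = len(points)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean distance.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )

    merges: List[float] = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # Join the two components.
            merges.append(d)
    return merges


def count_clusters(points: Sequence[Point], threshold: float) -> int:
    """Count components that remain when points within `threshold` are joined.

    `threshold` is an assumed stand-in for a perceptual merge distance.
    """
    merged = sum(1 for d in merge_events(points) if d <= threshold)
    return len(points) - merged


if __name__ == "__main__":
    demo = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    print(count_clusters(demo, threshold=0.5))
```

Sweeping `threshold` from small to large traces how the estimated cluster count drops as groups merge, which is the kind of quantity a perceptual model could be fitted against.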
Visual quality measures (VQMs) are designed to support analysts by automatically detecting and quantifying patterns in visualizations. We propose a new data-driven technique called ClustRank that ranks scatterplots according to visible grouping patterns. Our model first encodes scatterplots in the parametric space of a Gaussian Mixture Model, and then uses a classifier trained on human judgment data to estimate the perceptual complexity of grouping patterns. The numbers of initial mixture components and final combined groups determine the rank of the scatterplot. ClustRank improves on existing VQM techniques by mimicking human judgments on two-Gaussian cluster patterns and is more accurate when ranking general cluster patterns in scatterplots. We demonstrate its benefit by analyzing kinship data for genome-wide association studies, a domain in which experts rely on the visual analysis of large sets of scatterplots. We make the three benchmark datasets and the ClustRank VQM available for practical use and further improvements.
Well-designed data visualizations can lead to more powerful and intuitive processing by a viewer. To help a viewer intuitively compare values to quickly generate key takeaways, visualization designers can manipulate how data values are arranged in a chart to afford particular comparisons. Using simple bar charts as a case study, we empirically tested the comparison affordances of four common arrangements: vertically juxtaposed, horizontally juxtaposed, overlaid, and stacked. We asked participants to type out what patterns they perceived in a chart, and coded their takeaways into types of comparisons. In a second study, we asked data visualization design experts to predict which arrangement they would use to afford each type of comparison and found both alignments and mismatches with our findings. These results provide concrete guidelines for how both human designers and automatic chart recommendation systems can make visualizations that help viewers extract the right takeaway.
Scatterplots are frequently scaled to fit display areas in multi-view and multi-device data analysis environments. A common scaling method is to enlarge or shrink the entire scatterplot, together with the points inside it, synchronously and proportionally. This process is called geometric scaling. However, geometric scaling of scatterplots may cause a perceptual bias; that is, the perceived values of visual features may diverge from their physical values as the scatterplot is scaled. For example, if a scatterplot is projected from a laptop to a large projector screen, then observers may feel that the scatterplot shown on the projector has fewer points than that viewed on the laptop. This paper presents an evaluation study on the perceptual bias of visual features in scatterplots caused by geometric scaling. The study focuses on three fundamental visual features (i.e., numerosity, correlation, and cluster separation) and three hypotheses that are formulated on the basis of our experience. We carefully design three controlled experiments using well-prepared synthetic data and recruit participants to complete the experiments on the basis of their subjective experience. With a detailed analysis of the experimental results, we obtain a set of instructive findings. First, geometric scaling causes a bias that has a linear relationship with the scale ratio. Second, no significant difference exists between the biases measured from normally and uniformly distributed scatterplots. Third, changing the point radius can correct the bias to a certain extent. These findings can inform scatterplot design decisions in various scenarios.
Visual analytics systems enable highly interactive exploratory data analysis. Across a range of fields, these technologies have been successfully employed to help users learn from complex data. However, these same exploratory visualization techniques make it easy for users to discover spurious findings. This paper proposes new methods to monitor a user's analytic focus during visual analysis of structured datasets and use it to surface relevant articles that contextualize the visualized findings. Motivated by interactive analyses of electronic health data, this paper introduces a formal model of analytic focus, a computational approach to dynamically update the focus model at the time of user interaction, and a prototype application that leverages this model to surface relevant medical publications to users during visual analysis of a large corpus of medical records. Evaluation results with 24 users show that the modeling approach has high levels of accuracy and is able to surface highly relevant medical abstracts.
Recent research has proposed teleoperation of robotic and aerial vehicles using head motion tracked by a head-mounted display (HMD). First-person views of the vehicles are usually captured by onboard cameras and presented to users through the display panels of HMDs. This provides users with a direct, immersive and intuitive interface for viewing and control. However, a typically overlooked factor in such designs is the latency introduced by the vehicle dynamics. As head motion is coupled with visual updates in such applications, visual and control latency always exists between the issue of control commands by head movements and the visual feedback received at the completion of the attitude adjustment. This causes a discrepancy between the intended motion, the vestibular cue and the visual cue and may potentially result in simulator sickness. No research has been conducted on how various levels of visual and control latency introduced by dynamics in robots or aerial vehicles affect users' performance and the degree of simulator sickness elicited. Thus, it is uncertain how much performance is degraded by latency and whether such designs are comfortable from the perspective of users. To address these issues, we studied a prototyped scenario of a head-motion-controlled quadcopter using an HMD. We present a virtual reality (VR) paradigm to systematically assess the effects of visual and control latency in simulated drone control scenarios.