Large-scale interaction networks of human communication are often modeled as complex graph structures, obscuring temporal patterns within individual conversations. To facilitate the understanding of such conversational dynamics, episodes with low or high communication activity, as well as breaks in communication, need to be detected so that temporal interaction patterns can be identified. Traditional episode detection approaches are highly dependent on the choice of parameters, such as window size or binning resolution. In this paper, we present a novel technique for the identification of relevant episodes in bi-directional interaction sequences from abstract communication networks. We model communication as a continuous density function, allowing for a more robust segmentation into individual episodes and estimation of communication volume. Additionally, we define a tailored feature set to characterize conversational dynamics and enable a user-steered classification of communication behavior. We apply our technique to a real-world corpus of email data from a large European research institution. The results show that our technique allows users to effectively define, identify, and analyze relevant communication episodes.
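The idea of treating message timestamps as a continuous density rather than as binned counts can be illustrated with a minimal sketch. This is not the technique from the paper, whose details are not given here; it assumes a plain Gaussian kernel density estimate and a fixed density threshold. The function names, `bandwidth`, `threshold`, and `step` are all hypothetical choices made for illustration.

```python
import math


def gaussian_kde(timestamps, bandwidth):
    """Return a function t -> estimated message density at time t.

    Assumption: a Gaussian kernel with one global bandwidth.
    """
    n = len(timestamps)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(t):
        return norm * sum(
            math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in timestamps
        )

    return density


def detect_episodes(timestamps, bandwidth, threshold, step):
    """Segment the time axis into (start, end) episodes.

    An episode is a maximal run of sampled time points whose
    estimated density is at least `threshold` (hypothetical criterion).
    """
    density = gaussian_kde(timestamps, bandwidth)
    t = min(timestamps) - 3 * bandwidth
    end = max(timestamps) + 3 * bandwidth
    episodes = []
    start = None
    while t <= end:
        active = density(t) >= threshold
        if active and start is None:
            start = t                      # an episode begins
        elif not active and start is not None:
            episodes.append((start, t))    # the episode ends
            start = None
        t += step
    if start is not None:
        episodes.append((start, end))
    return episodes


# Two bursts of messages separated by a long break (toy data).
messages = [0, 1, 2, 3, 100, 101, 102]
episodes = detect_episodes(messages, bandwidth=2.0, threshold=0.01, step=0.5)
print(episodes)
```

In this sketch the bandwidth plays a role similar to a window size, so it does not remove parameter sensitivity by itself; it only shows how a smooth density makes breaks in communication appear as stretches of near-zero density.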
A common network analysis task is the comparison of two networks to identify characteristics unique to one network with respect to the other. For example, when comparing protein interaction networks derived from normal and cancer tissues, one essential task is to discover protein-protein interactions unique to cancer tissues. However, this task is challenging when the networks contain complex structural (and semantic) relations. To address this problem, we design ContraNA, a visual analytics framework that leverages both the power of machine learning for uncovering unique characteristics in networks and the effectiveness of visualization for understanding such uniqueness. The basis of ContraNA is cNRL, which integrates two machine learning schemes, network representation learning (NRL) and contrastive learning (CL), to generate a low-dimensional embedding that reveals the uniqueness of one network when compared to another. ContraNA provides an interactive visualization interface to help analyze the uniqueness by relating embedding results to network structures and by explaining the features learned by cNRL. We demonstrate the usefulness of ContraNA with two case studies using real-world datasets. We also evaluate ContraNA through a controlled user study with 12 participants on network comparison tasks. The results show that participants were able both to effectively identify unique characteristics from complex networks and to interpret the results obtained from cNRL.
Real-time tweets can provide useful information on evolving events and situations. Geotagged tweets are especially useful, as they indicate the location of origin and provide geographic context. However, only a small portion of tweets are geotagged, limiting their use for situational awareness. In this paper, we adapt, improve, and evaluate a state-of-the-art deep learning model for city-level geolocation prediction, and integrate it with a visual analytics system tailored for real-time situational awareness. We provide computational evaluations to demonstrate the superiority and utility of our geolocation prediction model within an interactive system.
Communication consists of both meta-information and content. Currently, the automated analysis of such data often focuses either on the network aspects via social network analysis or on the content, utilizing methods from text mining. However, the first category of approaches does not leverage the rich content information, while the latter ignores the conversation environment and the temporal evolution, as evident in the meta-information. In contrast to communication research, which stresses the importance of a holistic approach, both aspects are rarely applied simultaneously, and consequently, their combination has not yet received enough attention in automated analysis systems. In this work, we address this challenge by discussing the difficulties and design decisions of such a path and by contributing CommAID, a blueprint for a holistic strategy to communication analysis. It features an integrated visual analytics design to analyze communication networks through dynamics modeling, semantic pattern retrieval, and a user-adaptable and problem-specific machine learning-based retrieval system. An interactive multi-level matrix-based visualization facilitates a focused analysis of both network and content using inline visuals that support cross-checks and reduce context switches. We evaluate our approach both in a case study and through a formative evaluation with eight law enforcement experts using a real-world communication corpus. Results show that our solution surpasses existing techniques in terms of integration level and applicability. With this contribution, we aim to pave the path for a more holistic approach to communication analysis.
Despite being a critical communication skill, grasping humor is challenging: a successful use of humor requires a mixture of both engaging content build-up and an appropriate vocal delivery (e.g., pause). Prior studies on computational humor emphasize the textual and audio features immediately next to the punchline, yet overlook the longer-term context setup. Moreover, the theories are usually too abstract for understanding each concrete humor snippet. To fill this gap, we develop DeHumor, a visual analytical system for analyzing humorous behaviors in public speaking. To intuitively reveal the building blocks of each concrete example, DeHumor decomposes each humorous video into multimodal features and provides inline annotations of them on the video script. In particular, to better capture the build-ups, we introduce content repetition as a complement to features introduced in theories of computational humor and visualize them in a context linking graph. To help users locate the punchlines that have the desired features to learn, we summarize the content (with keywords) and humor feature statistics on an augmented time matrix. With case studies on stand-up comedy shows and TED talks, we show that DeHumor is able to highlight various building blocks of humor examples. In addition, expert interviews with communication coaches and humor researchers demonstrate the effectiveness of DeHumor for multimodal humor analysis of speech content and vocal delivery.
Concept drift is a phenomenon in which the distribution of a data stream changes over time in unforeseen ways, causing prediction models built on historical data to become inaccurate. While a variety of automated methods have been developed to identify when concept drift occurs, there is limited support for analysts who need to understand and correct their models when drift is detected. In this paper, we present a visual analytics method, DriftVis, to support model builders and analysts in the identification and correction of concept drift in streaming data. DriftVis combines a distribution-based drift detection method with a streaming scatterplot to support the analysis of drift caused by the distribution changes of data streams and to explore the impact of these changes on the model's accuracy. A quantitative experiment and two case studies on weather prediction and text classification have been conducted to demonstrate our proposed tool and illustrate how visual analytics can be used to support the detection, examination, and correction of concept drift.
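As a rough illustration of what a distribution-based drift check can look like, the sketch below compares a reference window of a one-dimensional stream with each following window using the two-sample Kolmogorov-Smirnov statistic. This is a swapped-in, generic technique and not the detection method used by DriftVis, which the abstract does not specify. The function names, the window size, and the threshold are hypothetical.

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest gap between the empirical CDFs of a and b.
    """
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def detect_drift(stream, window, threshold):
    """Return start indices of windows that differ from the reference.

    Assumption: non-overlapping windows; after a drift is flagged,
    the drifted window becomes the new reference.
    """
    reference = stream[:window]
    drift_points = []
    for start in range(window, len(stream) - window + 1, window):
        recent = stream[start:start + window]
        if ks_statistic(reference, recent) > threshold:
            drift_points.append(start)
            reference = recent
    return drift_points


# Toy stream: values in [0, 1) followed by values in [5, 6).
stream = [(i % 10) / 10 for i in range(40)]
stream += [5 + (i % 10) / 10 for i in range(40)]
drift_points = detect_drift(stream, window=20, threshold=0.5)
print(drift_points)
```

A check like this only reports when the input distribution shifts. Understanding why it shifted and how the shift affects model accuracy is the part the abstract assigns to the visual analysis.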