Visual query systems (VQSs) empower users to interactively search for line charts with desired visual patterns, typically specified using intuitive sketch-based interfaces. Despite decades of past work on VQSs, these efforts have not translated to adoption in practice, possibly because VQSs are largely evaluated in unrealistic lab-based settings. To remedy this gap in adoption, we collaborated with experts from three diverse domains---astronomy, genetics, and material science---via a year-long user-centered design process to develop a VQS that supports their workflow and analytical needs, and evaluate how VQSs can be used in practice. Our study results reveal that ad-hoc sketch-only querying is not as commonly used as prior work suggests, since analysts are often unable to precisely express their patterns of interest. In addition, we characterize three essential sensemaking processes supported by our enhanced VQS. We discover that participants employ all three processes, but in different proportions, depending on the analytical needs in each domain. Our findings suggest that all three sensemaking processes must be integrated in order to make future VQSs useful for a wide range of analytical inquiries.
Mobile applications (hereafter, apps) collect a plethora of information about user behavior and the user's device through third-party analytics libraries. However, the collection and usage of such data have raised several privacy concerns, mainly because the end-user, i.e., the actual owner of the data, is out of the loop in this collection process. Also, the existing privacy-enhanced solutions that have emerged in recent years follow an all-or-nothing approach, leaving the user the sole option to accept or completely deny access to privacy-related data. This work has the two-fold objective of assessing the privacy implications of the usage of analytics libraries in mobile apps and proposing a data anonymization methodology that enables a trade-off between the utility and privacy of the collected data and gives the user complete control over the sharing process. To achieve that, we present an empirical privacy assessment of the analytics libraries contained in the 4500 most-used Android apps of the Google Play Store between November 2020 and January 2021. Then, we propose an empowered anonymization methodology, based on MobHide, that gives the end-user complete control over the collection and anonymization process. Finally, we empirically demonstrate the applicability and effectiveness of this anonymization methodology through HideDroid, a fully-fledged anonymization app for the Android ecosystem.
GW190412 is the first observation of a black hole binary with definitively unequal masses. GW190412's mass asymmetry, along with the measured positive effective inspiral spin, allowed for inference of a component black hole spin: the primary black hole in the system was found to have a dimensionless spin magnitude between 0.17 and 0.59 (90% credible range). We investigate how the choice of priors for the spin magnitudes and tilts of the component black holes affects the robustness of parameter estimates for GW190412, and report Bayes factors across a suite of prior assumptions. Depending on the waveform family used to describe the signal, we find either marginal to moderate (2:1-6:1) or strong ($\gtrsim$ 20:1) support for a spinning primary black hole compared to cases where only the secondary is allowed to have spin. We show how these choices influence parameter estimates, and find the asymmetric masses and positive effective inspiral spin of GW190412 to be qualitatively, but not quantitatively, robust to prior assumptions. Our results highlight the importance of both considering astrophysically motivated or population-based priors in interpreting observations and considering their relative support from the data.
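As a reading aid for the odds quoted above, the following minimal sketch shows how a Bayes factor follows from two natural-log evidences. The function name and the log-evidence values are hypothetical and chosen only for illustration; they are not results from the GW190412 analysis.

```python
import math


def bayes_factor(ln_evidence_a: float, ln_evidence_b: float) -> float:
    """Bayes factor of hypothesis A over hypothesis B, given natural-log evidences."""
    return math.exp(ln_evidence_a - ln_evidence_b)


# Hypothetical log evidences (illustrative only, not taken from the analysis).
ln_z_primary_spinning = -100.0
ln_z_secondary_only = -103.0

odds = bayes_factor(ln_z_primary_spinning, ln_z_secondary_only)
print(f"Bayes factor in favor of a spinning primary: {odds:.1f}:1")
```

A difference of about 3 in natural-log evidence corresponds to odds of roughly 20:1, which is the scale the abstract labels as strong support.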
We study the problem of unlearning datapoints from a learnt model. The learner first receives a dataset $S$ drawn i.i.d. from an unknown distribution, and outputs a model $\widehat{w}$ that performs well on unseen samples from the same distribution. However, at some point in the future, any training datapoint $z \in S$ can request to be unlearned, thus prompting the learner to modify its output model while still ensuring the same accuracy guarantees. We initiate a rigorous study of generalization in machine unlearning, where the goal is to perform well on previously unseen datapoints. Our focus is on both computational and storage complexity. For the setting of convex losses, we provide an unlearning algorithm that can unlearn up to $O(n/d^{1/4})$ samples, where $d$ is the problem dimension. In comparison, in general, differentially private learning (which implies unlearning) only guarantees deletion of $O(n/d^{1/2})$ samples. This demonstrates a novel separation between differential privacy and machine unlearning.
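To make the size of this separation concrete, the sketch below compares the two asymptotic rates for one choice of $n$ and $d$. It assumes the hidden constants in both $O(\cdot)$ bounds are 1, which the abstract does not state, so the numbers indicate scaling only; the function name and the values of `n` and `d` are illustrative.

```python
def deletion_capacity(n: int, d: int, exponent: float) -> float:
    """Scaling of the number of deletable samples, n / d**exponent (constants ignored)."""
    return n / d ** exponent


n, d = 1_000_000, 10_000  # illustrative dataset size and problem dimension

unlearning_rate = deletion_capacity(n, d, 0.25)  # n / d^(1/4), unlearning algorithm
dp_rate = deletion_capacity(n, d, 0.5)           # n / d^(1/2), differentially private learning

print(f"unlearning: ~{unlearning_rate:,.0f} samples")
print(f"differential privacy: ~{dp_rate:,.0f} samples")
print(f"ratio: ~{unlearning_rate / dp_rate:.1f}")
```

Under this assumption the ratio of the two rates is $d^{1/4}$, so the gap widens as the problem dimension grows.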
Being able to control the acoustic events (AEs) to which we want to listen would allow the development of more controllable hearable devices. This paper addresses the AE sound selection (or removal) problem, which we define as the extraction (or suppression) of all the sounds that belong to one or multiple desired AE classes. Although this problem could be addressed with a combination of source separation followed by AE classification, this is a sub-optimal way of solving the problem. Moreover, source separation usually requires knowing the maximum number of sources, which may not be practical when dealing with AEs. In this paper, we propose instead a universal sound selection neural network that directly selects AE sounds from a mixture given user-specified target AE classes. The proposed framework can be explicitly optimized to simultaneously select sounds from multiple desired AE classes, independently of the number of sources in the mixture. We experimentally show that the proposed method achieves promising AE sound selection performance and could be generalized to mixtures with a number of sources unseen during training.
This paper proposes the Differential-Critic Generative Adversarial Network (DiCGAN) to learn the distribution of user-desired data when only part of the dataset, rather than the entire dataset, possesses the desired property. DiCGAN generates desired data that meets the user's expectations and can assist in designing biological products with desired properties. Existing approaches select the desired samples first and train regular GANs on the selected samples to derive the user-desired data distribution. However, the selection of the desired data relies on an expert criterion and supervision over the entire dataset. DiCGAN introduces a differential critic that can learn the preference direction from pairwise preferences, which are amateur knowledge and can be defined on part of the training data. The resultant critic guides the generation of the desired data instead of the whole data. Specifically, apart from the Wasserstein GAN loss, a ranking loss of the pairwise preferences is defined over the critic. It endows the difference of critic values between each pair of samples with the pairwise preference relation. A higher critic value indicates that the sample is preferred by the user. Thus, training the generative model for higher critic values encourages the generation of user-preferred samples. Extensive experiments show that our DiCGAN achieves state-of-the-art performance in learning the user-desired data distributions, especially in the cases of insufficient desired data and limited supervision.
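The combination of a Wasserstein critic loss with a pairwise ranking term can be sketched as follows. The abstract does not give the exact form of the ranking loss, so the hinge (margin) form, the `margin` and `weight` parameters, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence, Tuple


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def pairwise_ranking_loss(
    critic_values: Sequence[float],
    preferences: Sequence[Tuple[int, int]],
    margin: float = 1.0,
) -> float:
    """Mean hinge loss over pairs (i, j), where sample i is preferred over sample j.

    A pair contributes zero loss once critic_values[i] exceeds
    critic_values[j] by at least `margin` (assumed form).
    """
    if not preferences:
        return 0.0
    return mean(
        [max(0.0, margin - (critic_values[i] - critic_values[j])) for i, j in preferences]
    )


def critic_loss(
    real_scores: Sequence[float],
    fake_scores: Sequence[float],
    critic_values: Sequence[float],
    preferences: Sequence[Tuple[int, int]],
    weight: float = 1.0,
) -> float:
    """Wasserstein critic loss plus a weighted ranking term (assumed combination)."""
    # The WGAN critic maximizes E[D(real)] - E[D(fake)], so we minimize its negative.
    wgan_term = mean(fake_scores) - mean(real_scores)
    return wgan_term + weight * pairwise_ranking_loss(critic_values, preferences)


# Toy critic outputs for three samples; sample 0 and sample 2 are each preferred to sample 1.
scores = [2.0, 0.5, 1.0]
pairs = [(0, 1), (2, 1)]
ranking = pairwise_ranking_loss(scores, pairs)
total = critic_loss(real_scores=[1.0, 3.0], fake_scores=[0.0, 1.0],
                    critic_values=scores, preferences=pairs)
```

In this toy case only the pair (2, 1) violates the margin, so the ranking term pushes the critic value of sample 2 further above that of sample 1, which is the preference direction the abstract describes.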