Musical preferences have been considered a mirror of the self. In this age of Big Data, online music streaming services allow us to capture ecologically valid music listening behavior and provide a rich source of information for identifying user-specific traits. Studies have shown musical engagement to be an indirect representation of internal states, including internalized symptomatology and depression. The current study aims at unearthing patterns and trends in individuals at risk for depression as they manifest in naturally occurring music listening behavior. Mental well-being scores, musical engagement measures, and listening histories of Last.fm users (N=541) were acquired. Social tags associated with each listener's most popular tracks were analyzed to unearth the moods/emotions and genres associated with the users. Results revealed that the social tags prevalent among users at risk for depression were predominantly related to emotions depicting Sadness, associated with genre tags representing neo-psychedelic, avant-garde, and dream pop. This study opens up avenues for an MIR-based approach to characterizing and predicting risk for depression, which can be helpful in early detection and can additionally provide a basis for designing music recommendations accordingly.
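A minimal sketch of the kind of tag analysis described above, assuming the listening histories have already been reduced to per-user tag lists plus a binary at-risk label derived from the well-being scores (the data layout and example tags are hypothetical; the study's actual tagging pipeline is not specified here):

```python
from collections import Counter

# Hypothetical input: for each user, the social tags collected from their
# most-listened tracks, plus a flag marking whether their well-being score
# falls in the at-risk range (layout assumed for illustration).
users = [
    {"tags": ["sad", "dream pop", "melancholic"], "at_risk": True},
    {"tags": ["happy", "dance", "pop"], "at_risk": False},
    {"tags": ["avant-garde", "sad", "neo-psychedelic"], "at_risk": True},
]

def tag_frequencies(group):
    """Relative frequency of each tag within a group of users."""
    counts = Counter(tag for user in group for tag in user["tags"])
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

at_risk = tag_frequencies(u for u in users if u["at_risk"])
no_risk = tag_frequencies(u for u in users if not u["at_risk"])

# Tags over-represented in the at-risk group relative to the rest.
overrep = {t: f - no_risk.get(t, 0.0) for t, f in at_risk.items()}
for tag, diff in sorted(overrep.items(), key=lambda kv: -kv[1]):
    print(f"{tag:20s} {diff:+.3f}")
```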
Supervised music representation learning has been performed mainly using semantic labels such as music genres. However, annotating music with semantic labels is time-consuming and costly. In this work, we investigate the use of factual metadata such as artist, album, and track information, which comes naturally annotated with songs, for supervised music representation learning. The results show that each type of metadata captures its own conceptual characteristics, and that using them jointly improves overall performance.
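As an illustration of metadata-supervised training, the following PyTorch sketch attaches separate artist, album, and track classification heads to a shared encoder and sums their cross-entropy losses; the encoder, vocabulary sizes, and input features are hypothetical placeholders, not the paper's architecture:

```python
import torch
import torch.nn as nn

class MetadataNet(nn.Module):
    """Shared audio encoder with one classification head per metadata type."""
    def __init__(self, n_in=128, n_hidden=256,
                 n_artists=1000, n_albums=5000, n_tracks=20000):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
        )
        self.heads = nn.ModuleDict({
            "artist": nn.Linear(n_hidden, n_artists),
            "album": nn.Linear(n_hidden, n_albums),
            "track": nn.Linear(n_hidden, n_tracks),
        })

    def forward(self, x):
        z = self.encoder(x)  # z is the learned representation
        return {name: head(z) for name, head in self.heads.items()}

model = MetadataNet()
criterion = nn.CrossEntropyLoss()
x = torch.randn(8, 128)  # a batch of placeholder audio features
targets = {"artist": torch.randint(0, 1000, (8,)),
           "album": torch.randint(0, 5000, (8,)),
           "track": torch.randint(0, 20000, (8,))}

logits = model(x)
# Joint supervision: the losses from all three metadata heads are summed,
# so the shared encoder must satisfy every labeling concept at once.
loss = sum(criterion(logits[k], targets[k]) for k in logits)
loss.backward()
```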
This paper describes computational methods for the visual display and analysis of music information. We provide a concise description of software, music descriptors, and data visualization techniques commonly used in music information retrieval. Finally, we present use cases in which the described software, descriptors, and visualizations are showcased.
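A small example of the kind of descriptor extraction and visualization discussed, using librosa and matplotlib; the audio path is a placeholder, and the two descriptors shown are common choices rather than necessarily the ones covered in the paper:

```python
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

y, sr = librosa.load("song.wav")  # placeholder audio file

# Two common music descriptors: a mel spectrogram and a chromagram.
mel = librosa.feature.melspectrogram(y=y, sr=sr)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6))
librosa.display.specshow(librosa.power_to_db(mel, ref=np.max),
                         sr=sr, x_axis="time", y_axis="mel", ax=ax1)
ax1.set_title("Mel spectrogram")
librosa.display.specshow(chroma, sr=sr, x_axis="time", y_axis="chroma", ax=ax2)
ax2.set_title("Chromagram")
fig.tight_layout()
plt.show()
```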
Research on mid-level image representations has conventionally concentrated on relatively obvious attributes and overlooked non-obvious attributes, i.e., characteristics that are not readily observable when images are viewed independently of their context or function. Non-obvious attributes are not necessarily easily nameable, but they nonetheless play a systematic role in people's interpretation of images. Clusters of related non-obvious attributes, called interpretation dimensions, emerge when people are asked to compare images, and provide important insight into the aspects of social images that are considered relevant. In contrast to aesthetic or affective approaches to image analysis, non-obvious attributes are not related to the personal perspective of the viewer. Instead, they encode a conventional understanding of the world, which is tacit rather than explicitly expressed. This paper introduces a procedure for discovering non-obvious attributes using crowdsourcing. We discuss this procedure using a concrete example of a crowdsourcing task on Amazon Mechanical Turk carried out in the domain of fashion. An analysis comparing the discovered non-obvious attributes with user tags demonstrated the added value delivered by our procedure.
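One simple way to quantify the comparison between discovered attributes and user tags is vocabulary overlap; the sketch below uses Jaccard similarity on hypothetical term sets (the example terms and the choice of measure are assumptions, not the paper's analysis):

```python
def jaccard(a, b):
    """Jaccard overlap between two term sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Hypothetical vocabularies for one fashion-image collection.
discovered = {"formal", "vintage", "office-appropriate"}
user_tags = {"dress", "red", "vintage", "fashion"}

print(f"overlap = {jaccard(discovered, user_tags):.2f}")
# A low overlap suggests the crowdsourcing procedure surfaces attributes
# that user tags alone do not capture, i.e., its added value.
```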
In recent years, deep learning techniques have received intense attention owing to their great success in image recognition, and they are increasingly being adopted across information processing fields, including music information retrieval (MIR). In this paper, we conduct a comprehensive study of music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet), which have been demonstrated to perform better than Residual Networks (ResNet), to music audio tagging. Additionally, two data augmentation approaches, time overlapping and pitch shifting, are proposed to address the shortage of labelled data in MIR. Moreover, a stacking ensemble based on an SVM is employed. We believe that the proposed combination of DenseNet's strong representations and data augmentation can be adapted to other audio processing tasks.
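The two augmentation strategies named above could look roughly like the following librosa-based sketch; the window length, hop, and pitch-shift range are assumptions, and "time overlapping" is interpreted here as cutting overlapping excerpts from each clip:

```python
import librosa

def time_overlap_crops(y, sr, win_s=3.0, hop_s=1.5):
    """Cut overlapping fixed-length excerpts from one clip, multiplying the
    number of training examples (one interpretation of 'time overlapping')."""
    win, hop = int(win_s * sr), int(hop_s * sr)
    return [y[i:i + win] for i in range(0, len(y) - win + 1, hop)]

def pitch_shift_variants(y, sr, steps=(-2, -1, 1, 2)):
    """Pitch-shifted copies of a clip (in semitones); labels stay unchanged."""
    return [librosa.effects.pitch_shift(y, sr=sr, n_steps=s) for s in steps]

y, sr = librosa.load("clip.wav")  # placeholder audio file
augmented = time_overlap_crops(y, sr)
# Apply pitch shifting on top of the crops to compound the augmentation.
for crop in list(augmented):
    augmented.extend(pitch_shift_variants(crop, sr))
```

A stacking stage combining the CNN outputs through an SVM meta-learner could then be added on top, e.g. with scikit-learn's StackingClassifier.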
Music is an integral part of our lives; it is not only a source of entertainment but also plays an important role in mental well-being by impacting moods, emotions, and other affective states. Music preferences and listening strategies have been shown to be associated with the psychological well-being of listeners, including internalized symptomatology and depression. However, to date, no studies have examined time-varying music consumption, in terms of acoustic content, and its association with users' well-being. In the current study, we aim at unearthing static and dynamic patterns prevalent in the active listening behavior of individuals, which may serve as indicators of risk for depression. Mental well-being scores and listening histories of 541 Last.fm users were examined. Static and dynamic acoustic and emotion-related features were extracted from each user's listening history and correlated with their mental well-being scores. Results revealed that individuals at greater risk for depression exhibit a higher dependency on music, with greater repetitiveness in their listening activity. Furthermore, the affinity of depressed individuals towards music that can be perceived as sad was found to be resistant to change over time. This study has large implications for future work on assessing mental illness risk by exploiting users' digital footprints on online music streaming platforms.
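For instance, one simple dynamic measure consistent with the description above is the repetitiveness of a user's play history, correlated against well-being scores; the repetitiveness definition and the toy data below are assumptions for illustration, not the study's actual features:

```python
from scipy.stats import spearmanr

def repetitiveness(history):
    """Fraction of plays that repeat an already-played track
    (one simple way to quantify repetitive listening)."""
    return 1.0 - len(set(history)) / len(history)

# Hypothetical per-user data: track-ID play sequences and well-being scores
# (lower score = greater depression risk in this toy setup).
histories = [["a", "a", "b", "a"], ["c", "d", "e", "f"], ["g", "g", "g", "h"]]
wellbeing = [9.0, 21.0, 7.0]

rep = [repetitiveness(h) for h in histories]
rho, p = spearmanr(rep, wellbeing)
print(f"Spearman rho={rho:.2f}, p={p:.3f}")
```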