
Low-dimensional Embodied Semantics for Music and Language


Added by Francisco Afonso Raposo

Publication date 2019

fields Biology Informatics Engineering

Research language: English

Authors Francisco Afonso Raposo - David Martins de Matos - Ricardo Ribeiro


Embodied cognition states that semantics is encoded in the brain as firing patterns of neural circuits, which are learned according to the statistical structure of human multimodal experience. However, each human brain is idiosyncratically biased, according to its subjective experience history, making this biological semantic machinery noisy with respect to the overall semantics inherent to media artifacts, such as music and language excerpts. We propose to represent shared semantics using low-dimensional vector embeddings by jointly modeling several brains from human subjects. We show these unsupervised efficient representations outperform the original high-dimensional fMRI voxel spaces in proxy music genre and language topic classification tasks. We further show that joint modeling of several subjects increases the semantic richness of the learned latent vector spaces.
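The abstract does not say which algorithm produces the shared embedding, so the following is only a minimal sketch of the general idea, assuming a simple group PCA: each subject's voxel responses to the same stimuli are standardized, concatenated along the voxel axis, and reduced with an SVD. The function name `shared_embedding`, the synthetic data, and all dimensions are hypothetical and are not taken from the paper.

```python
import numpy as np

def shared_embedding(subject_data, n_components):
    """Project stimuli into a shared low-dimensional space (group PCA stand-in).

    subject_data: list of arrays, one per subject, each of shape
                  (n_stimuli, n_voxels_for_that_subject).
    Returns an array of shape (n_stimuli, n_components).
    """
    standardized = []
    for X in subject_data:
        X = np.asarray(X, dtype=float)
        centered = X - X.mean(axis=0)
        std = X.std(axis=0)
        std[std == 0] = 1.0  # avoid division by zero for constant voxels
        standardized.append(centered / std)
    # Concatenate all subjects' voxels so the SVD sees every brain at once.
    stacked = np.hstack(standardized)
    U, S, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

# Synthetic stand-in for fMRI data: a common latent signal per stimulus,
# mixed differently and corrupted by noise in each (hypothetical) subject.
rng = np.random.default_rng(0)
n_stimuli, latent_dim = 40, 3
latent = rng.normal(size=(n_stimuli, latent_dim))
subjects = []
for n_voxels in (200, 150, 180):
    mixing = rng.normal(size=(latent_dim, n_voxels))
    noise = rng.normal(size=(n_stimuli, n_voxels))
    subjects.append(latent @ mixing + noise)

embedding = shared_embedding(subjects, n_components=latent_dim)
print(embedding.shape)  # (40, 3)
```

The resulting rows could then be fed to any off-the-shelf classifier for a proxy task such as genre or topic classification, in place of the raw voxel vectors.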


Learning Embodied Semantics via Music and Dance Semiotic Correlations

Francisco Afonso Raposo, David Martins de Matos, Ricardo Ribeiro (2019)

Music semantics is embodied, in the sense that meaning is biologically mediated by and grounded in the human body and brain. This embodied cognition perspective also explains why music structures modulate kinetic and somatosensory perception. We leverage this aspect of cognition by considering dance as a proxy for music perception in a statistical computational model that learns semiotic correlations between music audio and dance video. We evaluate the ability of this model to effectively capture underlying semantics in a cross-modal retrieval task. Quantitative results, validated with statistical significance testing, strengthen the body of evidence for embodied cognition in music and show that the model can recommend music audio for dance video queries and vice versa.

Computer Vision and Pattern Recognition Machine Learning Sound

Physics of the mind: Concepts, emotions, language, cognition, consciousness, beauty, music, and symbolic culture

Leonid Perlovsky (2010)

Mathematical approaches to modeling the mind since the 1950s are reviewed. Difficulties faced by these approaches are related to the fundamental incompleteness of logic discovered by K. Gödel. A recent mathematical advancement, dynamic logic (DL), overcame these past difficulties. DL is described conceptually and related to neuroscience, psychology, cognitive science, and philosophy. DL models higher cognitive functions: concepts, emotions, instincts, understanding, imagination, intuition, consciousness. DL is related to the knowledge instinct that drives our understanding of the world and serves as a foundation for higher cognitive functions. Aesthetic emotions and perception of beauty are related to everyday functioning of the mind. The article reviews mechanisms of human symbolic ability, language and cognition, joint evolution of the mind, consciousness, and cultures. It touches on a manifold of aesthetic emotions in music, their cognitive function, origin, and evolution. The article concentrates on elucidating the first principles and reviews aspects of the theory proven in laboratory research.

Neurons and Cognition

Score-informed Networks for Music Performance Assessment

Jiawen Huang, Yun-Ning Hung, Ashis Pati (2020)

The assessment of music performances in most cases takes into account the underlying musical score being performed. While there have been several automatic approaches for objective music performance assessment (MPA) based on features extracted from both the performance audio and the score, deep neural network-based methods incorporating score information into MPA models have not yet been investigated. In this paper, we introduce three different models capable of score-informed performance assessment. These are (i) a convolutional neural network that utilizes a simple time-series input comprising aligned pitch contours and score, (ii) a joint embedding model which learns a joint latent space for pitch contours and scores, and (iii) a distance matrix-based convolutional neural network which utilizes patterns in the distance matrix between pitch contours and musical score to predict assessment ratings. Our results provide insights into the suitability of different architectures and input representations and demonstrate the benefits of score-informed models as compared to score-independent models.

Audio and Speech Processing Information Retrieval Machine Learning
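To make the distance-matrix input of model (iii) in the entry above more concrete, here is a minimal sketch of a pairwise distance matrix between a performed pitch contour and a score. The abstract does not specify the distance measure or the pitch representation, so the absolute pitch difference, the MIDI-style values, and the name `distance_matrix` are all assumptions made for illustration.

```python
import numpy as np

def distance_matrix(pitch_contour, score):
    """Pairwise absolute pitch difference between performance and score.

    pitch_contour: sequence of performed pitch values (length T).
    score:         sequence of notated pitch values (length N).
    Returns an array of shape (T, N); entry [t, n] is |contour[t] - score[n]|.
    """
    p = np.asarray(pitch_contour, dtype=float)[:, None]
    s = np.asarray(score, dtype=float)[None, :]
    return np.abs(p - s)

# Hypothetical values on a MIDI-like pitch scale.
performance = [60.0, 60.2, 62.1, 64.0, 63.8]
score = [60, 62, 64]

D = distance_matrix(performance, score)
print(D.shape)  # (5, 3)
```

A matrix like `D` is a 2-D array, which is why it can be treated as an image-like input to a convolutional network: a performance that follows the score closely leaves a band of small values along the alignment path.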

Multi-scale Embedded CNN for Music Tagging (MsE-CNN)

Nima Hamidi, Mohsen Vahidzadeh, Stephen Baek (2019)

Convolutional neural networks (CNNs) have recently gained notable traction in a variety of machine learning tasks, including music classification and style tagging. In this work, we propose adding intermediate connections to the CNN architecture to facilitate the transfer of multi-scale/level knowledge between different layers. Our novel model for music tagging shows significant improvement over approaches previously proposed in the literature, due to its ability to carry low-level timbral features to the last layer.

Sound Information Retrieval Machine Learning

Metric Learning vs Classification for Disentangled Music Representation Learning

Jongpil Lee, Nicholas J. Bryan, Justin Salamon (2020)

Deep representation learning offers a powerful paradigm for mapping input data onto an organized embedding space and is useful for many music information retrieval tasks. Two central methods for representation learning include deep metric learning and classification, both having the same goal of learning a representation that can generalize well across tasks. Along with generalization, the emerging concept of disentangled representations is also of great interest, where multiple semantic concepts (e.g., genre, mood, instrumentation) are learned jointly but remain separable in the learned representation space. In this paper we present a single representation learning framework that elucidates the relationship between metric learning, classification, and disentanglement in a holistic manner. For this, we (1) outline past work on the relationship between metric learning and classification, (2) extend this relationship to multi-label data by exploring three different learning approaches and their disentangl

Sound Information Retrieval Machine Learning

