
Discovering Global Patterns in Linguistic Networks through Spectral Analysis: A Case Study of the Consonant Inventories

Published by Animesh Mukherjee
Publication date: 2009
Language: English





Recent research has shown that language and the socio-cognitive phenomena associated with it can be aptly modeled and visualized through networks of linguistic entities. However, most existing work on linguistic networks focuses only on the local properties of the networks. This study is an attempt to analyze the structure of languages via a purely structural technique, namely spectral analysis, which is ideally suited for discovering the global correlations in a network. Application of this technique to PhoNet, the co-occurrence network of consonants, not only reveals several natural linguistic principles governing the structure of the consonant inventories, but is also able to quantify their relative importance. We believe that this powerful technique can be successfully applied, in general, to study the structure of natural languages.
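As a rough illustration of the idea in the abstract, the sketch below builds a small consonant co-occurrence network and extracts its leading eigenpair. Everything in it is an assumption made for illustration: the toy inventories are invented rather than drawn from the paper's data, the names `INVENTORIES`, `cooccurrence_matrix`, and `principal_eigenpair` are hypothetical, and power iteration stands in for whatever eigen-decomposition procedure the authors actually use, which the abstract does not specify.

```python
from itertools import combinations
import math

# Toy consonant inventories (hypothetical data, not the paper's dataset).
INVENTORIES = [
    {"p", "t", "k", "m", "n"},
    {"p", "t", "k", "b", "d", "g"},
    {"p", "t", "k", "m", "n", "s"},
    {"t", "k", "s", "n"},
]


def cooccurrence_matrix(inventories):
    """Return (sorted consonants, symmetric matrix of co-occurrence counts)."""
    consonants = sorted(set().union(*inventories))
    index = {c: i for i, c in enumerate(consonants)}
    n = len(consonants)
    matrix = [[0.0] * n for _ in range(n)]
    for inventory in inventories:
        # Each pair of consonants sharing an inventory adds 1 to its edge weight.
        for a, b in combinations(sorted(inventory), 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1.0
            matrix[j][i] += 1.0
    return consonants, matrix


def principal_eigenpair(matrix, iterations=500, tol=1e-12):
    """Approximate the largest eigenvalue and its eigenvector by power iteration."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, vec
        nxt = [x / norm for x in nxt]
        # Rayleigh quotient of the normalised vector estimates the eigenvalue.
        new_value = sum(
            nxt[i] * sum(matrix[i][j] * nxt[j] for j in range(n))
            for i in range(n)
        )
        converged = abs(new_value - value) < tol
        vec, value = nxt, new_value
        if converged:
            break
    return value, vec


consonants, matrix = cooccurrence_matrix(INVENTORIES)
value, vector = principal_eigenpair(matrix)
# Consonants ranked by their weight in the leading eigenvector.
ranked = sorted(zip(consonants, vector), key=lambda pair: -pair[1])
```

Here `ranked` orders the consonants by their component in the leading eigenvector, a global measure of how central each one is to the co-occurrence structure. A fuller spectral analysis would also inspect the remaining eigenvalues and eigenvectors, which this sketch does not compute.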




Read also

In this paper, we attempt to explain the emergence of the linguistic diversity that exists across the consonant inventories of some of the major language families of the world through a complex network based growth model. There is only a single parameter for this model that is meant to introduce a small amount of randomness in the otherwise preferential attachment based growth process. The experiments with this model parameter indicate that the choice of consonants among the languages within a family is far more preferential than it is across the families. The implications of this result are twofold -- (a) there is an innate preference of the speakers towards acquiring certain linguistic structures over others and (b) shared ancestry propels the stronger preferential connection between the languages within a family than across them. Furthermore, our observations indicate that this parameter might bear a correlation with the period of existence of the language families under investigation.
We study the self-organization of the consonant inventories through a complex network approach. We observe that the distribution of occurrence as well as co-occurrence of the consonants across languages follows a power-law behavior. The co-occurrence network of consonants exhibits a high clustering coefficient. We propose four novel synthesis models for these networks (each of which is a refinement of the earlier) so as to successively match with higher accuracy (a) the above-mentioned topological properties as well as (b) the linguistic property of feature economy exhibited by the consonant inventories. We conclude by arguing that a possible interpretation of this mechanism of network growth is the process of child language acquisition. Such models essentially increase our understanding of the structure of languages that is influenced by their evolutionary dynamics and this, in turn, can be extremely useful for building future NLP applications.
Online reviews play an integral part in the success or failure of businesses. Prior to purchasing services or goods, customers first review the online comments submitted by previous customers. However, it is possible to superficially boost or hinder some businesses through posting counterfeit and fake reviews. This paper explores a natural language processing approach to identify fake reviews. We present a detailed analysis of linguistic features for distinguishing fake and trustworthy online reviews. We study 15 linguistic features and measure their significance and importance towards the classification schemes employed in this study. Our results indicate that fake reviews tend to include more redundant terms and pauses, and generally contain longer sentences. The application of several machine learning classification algorithms revealed that we were able to discriminate fake from real reviews with high accuracy using these linguistic features.
We present a Gaussian regression method for time series with missing data and stationary residuals of unknown power spectral density (PSD). The missing data are efficiently estimated by their conditional expectation as in universal Kriging, based on the circulant approximation of the complete data covariance. After initialization with an autoregressive fit of the noise, a few iterations of estimation/reconstruction steps are performed until convergence of the regression and PSD estimates, in a way similar to the expectation-conditional-maximization algorithm. The estimation can be performed for an arbitrary PSD provided that it is sufficiently smooth. The algorithm is developed in the framework of the MICROSCOPE space mission whose goal is to test the weak equivalence principle (WEP) with a precision of $10^{-15}$. We show by numerical simulations that the developed method allows us to meet three major requirements: to maintain the targeted precision of the WEP test in spite of the loss of data, to calculate a reliable estimate of this precision and of the noise level, and finally to provide consistent and faithful reconstructed data to the scientific community.
Speech sounds of the languages all over the world show remarkable patterns of co-occurrence. In this work, we attempt to automatically capture the patterns of co-occurrence of the consonants across languages and at the same time figure out the nature of the force leading to the emergence of such patterns. For this purpose we define a weighted network where the consonants are the nodes and an edge between two nodes (read consonants) signifies their co-occurrence likelihood over the consonant inventories. Through this network we identify communities of consonants that essentially reflect their patterns of co-occurrence across languages. We test the goodness of the communities and observe that the constituent consonants frequently occur in such groups in real languages also. Interestingly, the consonants forming these communities reflect strong correlations in terms of their features, which indicates that the principle of feature economy acts as a driving force towards community formation. In order to measure the strength of this force we propose an information theoretic definition of feature economy and show that the feature economy exhibited by the consonant communities is indeed substantially better than it would be if the consonant inventories had evolved just by chance.