
Syntax is from Mars while Semantics from Venus! Insights from Spectral Analysis of Distributional Similarity Networks

Published by Animesh Mukherjee
Publication date: 2009
Paper language: English




We study the global topology of the syntactic and semantic distributional similarity networks for English through the technique of spectral analysis. We observe that while the syntactic network has a hierarchical structure with strong communities and their mixtures, the semantic network has several tightly knit communities along with a large core without any such well-defined community structure.
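
Concretely, the spectral analysis referred to here examines the eigenvalue spectrum (and the leading eigenvectors) of a network's adjacency matrix, whose shape reflects community organization. The following is a minimal sketch of that computation on a toy word-similarity graph; the words, the edges, and the use of networkx are illustrative assumptions, not material from the paper.

```python
# Minimal sketch (not the authors' code): adjacency spectrum of a small,
# hypothetical word-similarity network.
import numpy as np
import networkx as nx

# Toy graph: nodes are words, edges join distributionally similar pairs.
G = nx.Graph()
G.add_edges_from([
    ("run", "walk"), ("walk", "stroll"), ("run", "sprint"),
    ("cat", "dog"), ("dog", "wolf"), ("cat", "wolf"),
])

# Eigen-decomposition of the adjacency matrix; gaps in the spectrum and the
# sign patterns of the top eigenvectors are the usual signatures of strong
# community structure.
A = nx.to_numpy_array(G)
eigenvalues, eigenvectors = np.linalg.eigh(A)
print(np.round(eigenvalues[::-1], 3))  # eigenvalues, largest first
```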



Read also

A major challenge in both neuroscience and machine learning is the development of useful tools for understanding complex information processing systems. One such tool is probes, i.e., supervised models that relate features of interest to activation patterns arising in biological or artificial neural networks. Neuroscience has paved the way in using such models through numerous studies conducted in recent decades. In this work, we draw insights from neuroscience to help guide probing research in machine learning. We highlight two important design choices for probes, direction and expressivity, and relate these choices to research goals. We argue that specific research goals play a paramount role when designing a probe and encourage future probing studies to be explicit in stating these goals.
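
For readers unfamiliar with probes, the sketch below trains a low-expressivity (linear) probe on synthetic activations; the data shapes, the probed feature, and the use of scikit-learn are assumptions made purely for illustration and do not come from the paper.

```python
# Minimal illustrative probe: a logistic regression that predicts a feature
# of interest from (synthetic) activation patterns.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
activations = rng.normal(size=(1000, 64))                  # one row per example
labels = (activations[:, :3].sum(axis=1) > 0).astype(int)  # hypothetical feature

X_train, X_test, y_train, y_test = train_test_split(
    activations, labels, test_size=0.2, random_state=0)

# Swapping in a deeper classifier here would change the expressivity choice
# discussed above; a linear model keeps the probe deliberately simple.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```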
Recent studies have shown that a system composed of several randomly interdependent networks is extremely vulnerable to random failure. However, real interdependent networks are usually not randomly interdependent; rather, pairs of dependent nodes are coupled according to some regularity, which we coin inter-similarity. For example, we study a system composed of an interdependent worldwide port network and a worldwide airport network and show that well-connected ports tend to couple with well-connected airports. We introduce two quantities for measuring the level of inter-similarity between networks: (i) the inter degree-degree correlation (IDDC) and (ii) the inter-clustering coefficient (ICC). We then show, both by simulation models and by analyzing the port-airport system, that as the networks become more inter-similar, the system becomes significantly more robust to random failure.
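
One plausible reading of the inter degree-degree correlation is the Pearson correlation between the degrees of coupled node pairs; the sketch below uses that reading on two synthetic scale-free networks standing in for the port and airport networks. Both the definition used and the toy networks are assumptions for illustration, not the authors' exact construction.

```python
# Sketch: IDDC under random coupling vs. degree-sorted ("inter-similar") coupling.
import numpy as np
import networkx as nx

n = 200
net_a = nx.barabasi_albert_graph(n, 3, seed=1)  # stand-in for the port network
net_b = nx.barabasi_albert_graph(n, 3, seed=2)  # stand-in for the airport network

deg_a = np.array([net_a.degree(i) for i in range(n)])
deg_b = np.array([net_b.degree(i) for i in range(n)])

# Random coupling: node i of net_a depends on node i of net_b.
iddc_random = np.corrcoef(deg_a, deg_b)[0, 1]

# Inter-similar coupling: the k-th best-connected node of one network is
# paired with the k-th best-connected node of the other.
iddc_similar = np.corrcoef(np.sort(deg_a), np.sort(deg_b))[0, 1]

print(f"IDDC, random coupling:        {iddc_random:+.2f}")
print(f"IDDC, inter-similar coupling: {iddc_similar:+.2f}")
```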
Building robust natural language understanding systems will require a clear characterization of whether and how various linguistic meaning representations complement each other. To perform a systematic comparative analysis, we evaluate the mapping between meaning representations from different frameworks using two complementary methods: (i) a rule-based converter, and (ii) a supervised delexicalized parser that parses to one framework using only information from the other as features. We apply these methods to convert the STREUSLE corpus (with syntactic and lexical semantic annotations) to UCCA (a graph-structured full-sentence meaning representation). Both methods yield surprisingly accurate target representations, close to fully supervised UCCA parser quality, indicating that UCCA annotations are partially redundant with STREUSLE annotations. Despite this substantial convergence between frameworks, we find several important areas of divergence.
Inspired by humans' remarkable ability to master arithmetic and generalize to unseen problems, we present a new dataset, HINT, to study machines' capability of learning generalizable concepts at three different levels: perception, syntax, and semantics. In particular, concepts in HINT, including both digits and operators, must be learned in a weakly supervised fashion: only the final results of handwritten expressions are provided as supervision. Learning agents need to infer how concepts are perceived from raw signals such as images (i.e., perception), how multiple concepts are structurally combined to form a valid expression (i.e., syntax), and how concepts are realized to afford various reasoning tasks (i.e., semantics). With a focus on systematic generalization, we carefully design a five-fold test set to evaluate both the interpolation and the extrapolation of learned concepts. To tackle this challenging problem, we propose a neural-symbolic system that integrates neural networks with grammar parsing and program synthesis, learned by a novel deduction-abduction strategy. In experiments, the proposed neural-symbolic system demonstrates strong generalization capability and significantly outperforms end-to-end neural methods such as RNNs and Transformers. The results also indicate the significance of recursive priors for extrapolation on syntax and semantics.
Constructions in type-driven compositional distributional semantics associate large collections of matrices of size $D$ with linguistic corpora. We develop the proposal of analysing the statistical characteristics of these data in the framework of permutation invariant matrix models. The observables in this framework are permutation invariant polynomial functions of the matrix entries, which correspond to directed graphs. Using the recently solved general 13-parameter permutation invariant Gaussian matrix models, we find, on a dataset of matrices constructed via standard techniques in distributional semantics, that the expectation values of a large class of cubic and quartic observables show high Gaussianity at levels between 90 and 99 percent. Beyond expectation values, which are averages over words, the dataset allows the computation of a standard deviation for each observable, which can be viewed as a measure of typicality for that observable. There is a wide range of magnitudes in the measures of typicality. The permutation invariant matrix models, considered as functions of random couplings, give a very good prediction of the magnitude of the typicality for different observables. We find evidence that observables with similar matrix model characteristics of Gaussianity and typicality also have high degrees of correlation between the ranked lists of words associated with these observables.
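
To make the notion of an observable concrete, the sketch below evaluates a few simple permutation-invariant functions (trace-type polynomials) over a collection of random matrices and reports their mean over words (the expectation value) and standard deviation (the typicality measure mentioned above). The random matrices and the particular observables are illustrative stand-ins, not the distributional-semantics dataset or the full observable set used in the paper.

```python
# Sketch: expectation values and typicality of simple permutation-invariant
# observables over a set of word matrices (here: random stand-ins).
import numpy as np

rng = np.random.default_rng(0)
D, num_words = 50, 300
word_matrices = rng.normal(size=(num_words, D, D))  # one D x D matrix per word

observables = {
    "trace(M)":    lambda M: np.trace(M),
    "sum_ij M_ij": lambda M: M.sum(),
    "trace(M^2)":  lambda M: np.trace(M @ M),      # quadratic observable
    "trace(M^3)":  lambda M: np.trace(M @ M @ M),  # cubic observable
}

for name, f in observables.items():
    values = np.array([f(M) for M in word_matrices])
    # Mean over words = expectation value; std over words = typicality measure.
    print(f"{name:12s} mean={values.mean():12.2f} std={values.std():12.2f}")
```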