بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Equilibrium (Zipf) and Dynamic (Grasseberg-Procaccia) method based analyses of human texts. A comparison of natural (english) and artificial (esperanto) languages

343 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Marcel Ausloos

تاريخ النشر 2008

مجال البحث فيزياء الهندسة المعلوماتية

والبحث باللغة English

تأليف M. Ausloos

الفيزياء والمجتمع الحساب واللغة تحليل البيانات والإحصاءات والاحتمال

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

A comparison of two english texts from Lewis Carroll, one (Alice in wonderland), also translated into esperanto, the other (Through a looking glass) are discussed in order to observe whether natural and artificial languages significantly differ from each other. One dimensional time series like signals are constructed using only word frequencies (FTS) or word lengths (LTS). The data is studied through (i) a Zipf method for sorting out correlations in the FTS and (ii) a Grassberger-Procaccia (GP) technique based method for finding correlations in LTS. Features are compared : different power laws are observed with characteristic exponents for the ranking properties, and the {it phase space attractor dimensionality}. The Zipf exponent can take values much less than unity ($ca.$ 0.50 or 0.30) depending on how a sentence is defined. This non-universality is conjectured to be a measure of the author $style$. Moreover the attractor dimension $r$ is a simple function of the so called phase space dimension $n$, i.e., $r = n^{lambda}$, with $lambda = 0.79$. Such an exponent should also conjecture to be a measure of the author $creativity$. However, even though there are quantitative differences between the original english text and its esperanto translation, the qualitative differences are very minutes, indicating in this case a translation relatively well respecting, along our analysis lines, the content of the author writing.

قيم البحث

390 - J. Gillet , M. Ausloos 2008

We present a comparison of two english texts, written by Lewis Carroll, one (Alice in wonderland) and the other (Through a looking glass), the former translated into esperanto, in order to observe whether natural and artificial languages significantl y differ from each other. We construct one dimensional time series like signals using either word lengths or word frequencies. We use the multifractal ideas for sorting out correlations in the writings. In order to check the robustness of the methods we also write the corresponding shuffled texts. We compare characteristic functions and e.g. observe marked differences in the (far from parabolic) f(alpha) curves, differences which we attribute to Tsallis non extensive statistical features in the frequency time series and length time series. The esperanto text has more extreme vallues. A very rough approximation consists in modeling the texts as a random Cantor set if resulting from a binomial cascade of long and short words (or words and blanks). This leads to parameters characterizing the text style, and most likely in fine the author writings.

الحساب واللغة تحليل البيانات والإحصاءات والاحتمال

Automatic Alignment of English-Chinese Bilingual Texts of CNS News

66 - Donghua Xu , Chew Lim Tan 1996

In this paper we address a method to align English-Chinese bilingual news reports from China News Service, combining both lexical and satistical approaches. Because of the sentential structure differences between English and Chinese, matching at the sentence level as in many other works may result in frequent matching of several sentences en masse. In view of this, the current work also attempts to create shorter alignment pairs by permitting finer matching between clauses from both texts if possible. The current method is based on statiscal correlation between sentence or clause length of both texts and at the same time uses obvious anchors such as numbers and place names appearing frequently in the news reports as lexcial cues.

الحساب واللغة

Positivity of the English language

572 - Isabel M. Kloumann , Christopher M. Danforth , Kameron Decker Harris 2011

Over the last million years, human language has emerged and evolved as a fundamental instrument of social communication and semiotic representation. People use language in part to convey emotional information, leading to the central and contingent qu estions: (1) What is the emotional spectrum of natural language? and (2) Are natural languages neutrally, positively, or negatively biased? Here, we report that the human-perceived positivity of over 10,000 of the most frequently used English words exhibits a clear positive bias. More deeply, we characterize and quantify distributions of word positivity for four large and distinct corpora, demonstrating that their form is broadly invariant with respect to frequency of word use.

الفيزياء والمجتمع الحساب واللغة

Internal and external dynamics in language: Evidence from verb regularity in a historical corpus of English

559 - Christine F. Cuskley , Martina Pugliese , Claudio Castellano 2014

Human languages are rule governed, but almost invariably these rules have exceptions in the form of irregularities. Since rules in language are efficient and productive, the persistence of irregularity is an anomaly. How does irregularity linger in t he face of internal (endogenous) and external (exogenous) pressures to conform to a rule? Here we address this problem by taking a detailed look at simple past tense verbs in the Corpus of Historical American English. The data show that the language is open, with many new verbs entering. At the same time, existing verbs might tend to regularize or irregularize as a consequence of internal dynamics, but overall, the amount of irregularity sustained by the language stays roughly constant over time. Despite continuous vocabulary growth, and presumably, an attendant increase in expressive power, there is no corresponding growth in irregularity. We analyze the set of irregulars, showing they may adhere to a set of minority rules, allowing for increased stability of irregularity over time. These findings contribute to the debate on how language systems become rule governed, and how and why they sustain exceptions to rules, providing insight into the interplay between the emergence and maintenance of rules and exceptions in language.

الفيزياء والمجتمع الحساب واللغة

Understanding the complexity of the Levy-walk nature of human mobility with a multi-scale cost/benefit model

429 - Nicola Scafetta 2012

Probability distributions of human displacements has been fit with exponentially truncated Levy flights or fat tailed Pareto inverse power law probability distributions. Thus, people usually stay within a given location (for example, the city of resi dence), but with a non-vanishing frequency they visit nearby or far locations too. Herein, we show that an important empirical distribution of human displacements (range: from 1 to 1000 km) can be well fit by three consecutive Pareto distributions with simple integer exponents equal to 1, 2 and ($gtrapprox$) 3. These three exponents correspond to three displacement range zones of about 1 km $lesssim Delta r lesssim$ 10 km, 10 km $lesssim Delta r lesssim$ 300 km and 300 km $lesssim Delta r lesssim $ 1000 km, respectively. These three zones can be geographically and physically well determined as displacements within a city, visits to nearby cities that may occur within just one-day trips, and visit to far locations that may require multi-days trips. The incremental integer values of the three exponents can be easily explained with a three-scale mobility cost/benefit model for human displacements based on simple geometrical constrains. Essentially, people would divide the space into three major regions (close, medium and far distances) and would assume that the travel benefits are randomly/uniformly distributed mostly only within specific urban-like areas.

الفيزياء والمجتمع الميكانيكا الإحصائية تحليل البيانات والإحصاءات والاحتمال

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة الافتراضية السورية

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Equilibrium (Zipf) and Dynamic (Grasseberg-Procaccia) method based analyses of human texts. A comparison of natural (english) and artificial (esperanto) languages

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً