أوراق بحثية, رسائل ماجستير ودكتوراه حول تحليل البيانات والإحصاءات والاحتمال

Complex networks and public funding: the case of the 2007-2013 Italian program

550 - Stefano Nicotri , Eufemia Tinelli , Nicola Amoroso 2015

In this paper we apply techniques of complex network analysis to data sources representing public funding programs and discuss the importance of the considered indicators for program evaluation. Starting from the Open Data repository of the 2007-2013 Italian Program Programma Operativo Nazionale Ricerca e Competitivit`a (PON R&C), we build a set of data models and perform network analysis over them. We discuss the obtained experimental results outlining interesting new perspectives that emerge from the application of the proposed methods to the socio-economical evaluation of funded programs.

الفيزياء والمجتمع الشبكات الاجتماعية والمعلومات تحليل البيانات والإحصاءات والاحتمال

Local observability of state variables and parameters in nonlinear modeling quantified by delay reconstruction

336 - Ulrich Parlitz , Jan Schumann-Bischoff , Stefan Luther 2015

Features of the Jacobian matrix of the delay coordinates map are exploited for quantifying the robustness and reliability of state and parameter estimations for a given dynamical model using an observed time series. Relevant concepts of this approach are introduced and illustrated for discrete and continuous time systems employing a filtered Henon map and a Rossler system.

ديناميات الفوضوية تحليل البيانات والإحصاءات والاحتمال

Traffic incident analysis on urban arterials using ESE: A method for moderate length of time window

461 - Zhen-zhen Yang , Liang Gao , Zi-you Gao 2015

Moderate length of time window can get the best accurate result in detecting the key incident time using extended spectral envelope. This paper presents a method to calculate the moderate length of time window. Two factors are mainly considered: (1) The significant vertical lines consist of negative elements of eigenvectors; (2) the least amount of interruption. The elements of eigenvectors are transformed into binary variable to eliminate the interruption of positive elements. Sine transform is introduced to highlight the significant vertical lines of negative elements. A novel Quality Index (QI) is proposed to measure the effect of different lengths of time window. Empirical studies on four real traffic incidents in Beijing verify the validity of this method.

الفيزياء والمجتمع تحليل البيانات والإحصاءات والاحتمال

Traffic Incident Analysis on Urban Arterials Using Extended Spectral Envelope Method

547 - Zhen-zhen Yang , Liang Gao , Zi-you Gao 2015

A traffic incident analysis method based on extended spectral envelope (ESE) method is presented to detect the key incident time. Sensitivity analysis of parameters (the length of time window, the length of sliding window and the study period) are di scussed on four real traffic incidents in Beijing. The results show that: (1) Moderate length of time window got the best accurate in detection. (2) The shorter the sliding window is, the more accurate the key incident time are detected. (3) If the study period is too short, the end time of an incident cannot be detected. Empirical studies show that the proposed method can effectively discover the key incident time, which can provide a theoretic basis for traffic incident management.

الفيزياء والمجتمع تحليل البيانات والإحصاءات والاحتمال

Bounds of memory strength for power-law series

448 - Fangjian Guo , Dan Yang , Zimo Yang 2015

Many time series produced by complex systems are empirically found to follow power-law distributions with different exponents $alpha$. By permuting the independently drawn samples from a power-law distribution, we present non-trivial bounds on the me mory strength (1st-order autocorrelation) as a function of $alpha$, which are markedly different from the ordinary $pm 1$ bounds for Gaussian or uniform distributions. When $1 < alpha leq 3$, as $alpha$ grows bigger, the upper bound increases from 0 to +1 while the lower bound remains 0; when $alpha > 3$, the upper bound remains +1 while the lower bound descends below 0. Theoretical bounds agree well with numerical simulations. Based on the posts on Twitter, ratings of MovieLens, calling records of the mobile operator Orange, and browsing behavior of Taobao, we find that empirical power-law distributed data produced by human activities obey such constraints. The present findings explain some observed constraints in bursty time series and scale-free networks, and challenge the validity of measures like autocorrelation and assortativity coefficient in heterogeneous systems.

تحليل البيانات والإحصاءات والاحتمال الفيزياء والمجتمع

A conjugate subgradient algorithm with adaptive preconditioning for LASSO minimization

471 - Alessandro Mirone , Pierre Paleo 2015

This paper describes a new efficient conjugate subgradient algorithm which minimizes a convex function containing a least squares fidelity term and an absolute value regularization term. This method is successfully applied to the inversion of ill-con ditioned linear problems, in particular for computed tomography with the dictionary learning method. A comparison with other state-of-art methods shows a significant reduction of the number of iterations, which makes this algorithm appealing for practical use.

تحليل البيانات والإحصاءات والاحتمال التحليل العددي

Non-parametric causal inference for bivariate time series

438 - James M. McCracken , Robert S. Weigel 2015

We introduce new quantities for exploratory causal inference between bivariate time series. The quantities, called penchants and leanings, are computationally straightforward to apply, follow directly from assumptions of probabilistic causality, do n ot depend on any assumed models for the time series generating process, and do not rely on any embedding procedures; these features may provide a clearer interpretation of the results than those from existing time series causality tools. The penchant and leaning are computed based on a structured method for computing probabilities.

تحليل البيانات والإحصاءات والاحتمال المنهجية

Detecting somatic mutations in genomic sequences by means of Kolmogorov-Arnold analysis

406 - V.G. Gurzadyan , H. Yan , G. Vlahovic 2015

The Kolmogorov-Arnold stochasticity parameter technique is applied for the first time to the study of cancer genome sequencing, to reveal mutations. Using data generated by next generation sequencing technologies, we have analyzed the exome sequences of brain tumor patients with matched tumor and normal blood. We show that mutations contained in sequencing data can be revealed using this technique thus providing a new methodology for determining subsequences of given length containing mutations i.e. its value differs from those of subsequences without mutations. A potential application for this technique involves simplifying the procedure of finding segments with mutations, speeding up genomic research, and accelerating its implementation in clinical diagnostic. Moreover, the prediction of a mutation associated to a family of frequent mutations in numerous types of cancers based purely on the value of the Kolmogorov function, indicates that this applied marker may recognize genomic sequences that are in extremely low abundance and can be used in revealing new types of mutations.

الجينوم تحليل البيانات والإحصاءات والاحتمال

Bayesian semiparametric power spectral density estimation with applications in gravitational wave data analysis

912 - Matthew C. Edwards , Renate Meyer , Nelson Christensen 2015

The standard noise model in gravitational wave (GW) data analysis assumes detector noise is stationary and Gaussian distributed, with a known power spectral density (PSD) that is usually estimated using clean off-source data. Real GW data often depar t from these assumptions, and misspecified parametric models of the PSD could result in misleading inferences. We propose a Bayesian semiparametric approach to improve this. We use a nonparametric Bernstein polynomial prior on the PSD, with weights attained via a Dirichlet process distribution, and update this using the Whittle likelihood. Posterior samples are obtained using a blocked Metropolis-within-Gibbs sampler. We simultaneously estimate the reconstruction parameters of a rotating core collapse supernova GW burst that has been embedded in simulated Advanced LIGO noise. We also discuss an approach to deal with non-stationary data by breaking longer data streams into smaller and locally stationary components.

النسبية العامة وهدية الكونيات الكم تحليل البيانات والإحصاءات والاحتمال تطبيقات الإحصاء

Untangling the roles of parasites in food webs with generative network models

406 - Abigail Z. Jacobs , Jennifer A. Dunne , Cristopher Moore 2015

Food webs represent the set of consumer-resource interactions among a set of species that co-occur in a habitat, but most food web studies have omitted parasites and their interactions. Recent studies have provided conflicting evidence on whether inc luding parasites changes food web structure, with some suggesting that parasitic interactions are structurally distinct from those among free-living species while others claim the opposite. Here, we describe a principled method for understanding food web structure that combines an efficient optimization algorithm from statistical physics called parallel tempering with a probabilistic generalization of the empirically well-supported food web niche model. This generative model approach allows us to rigorously estimate the degree to which interactions that involve parasites are statistically distinguishable from interactions among free-living species, whether parasite niches behave similarly to free-living niches, and the degree to which existing hypotheses about food web structure are naturally recovered. We apply this method to the well-studied Flensburg Fjord food web and show that while predation on parasites, concomitant predation of parasites, and parasitic intraguild trophic interactions are largely indistinguishable from free-living predation interactions, parasite-host interactions are different. These results provide a powerful new tool for evaluating the impact of classes of species and interactions on food web structure to shed new light on the roles of parasites in food webs

السكان والتطور تحليل البيانات والإحصاءات والاحتمال الأساليب الكمية

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد