
gwpcorMapper: an interactive mapping tool for exploring geographically weighted correlation and partial correlation in high-dimensional geospatial datasets

Published by Narumasa Tsutsumida
Publication date: 2021
Research field: Mathematical Statistics
Paper language: English





Exploratory spatial data analysis (ESDA) plays a key role in research that includes geographic data. In ESDA, analysts often want to be able to visualize observations and local relationships on a map. However, software dedicated to visualizing local spatial relations between multiple variables in high-dimensional datasets remains undeveloped. This paper introduces gwpcorMapper, a newly developed software application for mapping geographically weighted correlation and partial correlation in large multivariate datasets. gwpcorMapper facilitates ESDA by giving researchers the ability to interact with map components that describe local correlative relationships. We built gwpcorMapper using the R Shiny framework. The software inherits its core algorithm from GWpcor, an R library for calculating the geographically weighted correlation and partial correlation statistics. We demonstrate the application of gwpcorMapper by using it to explore census data to find meaningful relationships that describe the work-life environment in the 23 special wards of Tokyo, Japan. We show that gwpcorMapper is useful in both variable selection and parameter tuning for geographically weighted statistics. gwpcorMapper highlights that there are strong, statistically significant local variations in the relationship between the number of commuters and the total number of hours worked when considering the total population in each district across the 23 special wards of Tokyo. Our application demonstrates that the ESDA process with high-dimensional geospatial data using gwpcorMapper has applications across multiple fields.
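To make the statistic concrete, the following is a minimal sketch in base R of how a geographically weighted correlation and partial correlation can be computed at a single focal location: observations are weighted by a Gaussian kernel of their distance to the focal point, and the weighted correlation matrix is inverted to obtain partial correlations. This is only an illustration of the statistic under simplified assumptions, not the GWpcor implementation; the function name, the inputs (coords, X, bw), and the bandwidth value are hypothetical.

# Minimal sketch (not GWpcor itself): geographically weighted correlation
# and partial correlation at one focal location, using base R only.
gw_pcor_at <- function(focal, coords, X, bw) {
  # Gaussian kernel weights from distances to the focal location
  d <- sqrt(rowSums(sweep(coords, 2, focal)^2))
  w <- exp(-0.5 * (d / bw)^2)
  # Geographically weighted correlation matrix (cov.wt normalises weights)
  R <- cov.wt(X, wt = w, cor = TRUE)$cor
  # Partial correlations from the inverse (precision) matrix:
  # pcor_ij = -P_ij / sqrt(P_ii * P_jj)
  P <- solve(R)
  pcor <- -P / sqrt(outer(diag(P), diag(P)))
  diag(pcor) <- 1
  list(cor = R, pcor = pcor)
}

# Hypothetical usage: 3 variables observed at 100 random locations
set.seed(1)
coords <- matrix(runif(200), ncol = 2)
X <- matrix(rnorm(300), ncol = 3)
res <- gw_pcor_at(c(0.5, 0.5), coords, X, bw = 0.2)
res$pcor[1, 2]  # local partial correlation between variables 1 and 2

Repeating such a computation over a grid of focal points, for each pair of variables and a chosen kernel bandwidth, is the kind of map-based exploration that gwpcorMapper exposes interactively.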




Read also

Sequential, multiple assignment, randomized trial (SMART) designs have become increasingly popular in the field of precision medicine by providing a means for comparing sequences of treatments tailored to the individual patient, i.e., a dynamic treatment regime (DTR). The construction of evidence-based DTRs promises a replacement for the ad hoc, one-size-fits-all decisions pervasive in patient care. However, there are substantial statistical challenges in sizing SMART designs due to the complex correlation structure between the DTRs embedded in the design. Since the primary goal of SMARTs is the construction of an optimal DTR, investigators are interested in sizing SMARTs based on the ability to screen out DTRs inferior to the optimal DTR by a given amount, which cannot be done using existing methods. In this paper, we fill this gap by developing a rigorous power analysis framework that leverages the multiple comparisons with the best methodology. Our method employs Monte Carlo simulation to compute the minimum number of individuals to enroll in an arbitrary SMART. We evaluate our method through extensive simulation studies and illustrate it by retrospectively computing the power in the Extending Treatment Effectiveness of Naltrexone SMART study.
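The paper's multiple-comparisons-with-the-best framework is considerably more involved, but the underlying Monte Carlo sizing idea can be sketched with a toy two-arm analogue, assuming a target power of 80% and a hypothetical effect size; none of the values below reflect actual SMART design parameters.

# Toy analogue of Monte Carlo power-based sizing (not the paper's MCB
# framework): find the smallest n whose simulated power reaches a target.
mc_power <- function(n, delta, nsim = 2000, alpha = 0.05) {
  mean(replicate(nsim, {
    x <- rnorm(n)                 # control arm
    y <- rnorm(n, mean = delta)   # treatment arm with assumed effect
    t.test(x, y)$p.value < alpha  # did this simulated trial reject?
  }))
}

n <- 10
while (mc_power(n, delta = 0.5) < 0.80) n <- n + 10
n  # smallest multiple of 10 with estimated power >= 80%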
133 - Pete Philipson 2021
Assessing the relative merits of sportsmen and women whose careers took place far apart in time via a suitable statistical model is a complex task as any comparison is compromised by fundamental changes to the sport and society and often handicapped by the popularity of inappropriate traditional metrics. In this work we focus on cricket and the ranking of Test match bowlers using bowling data from the first Test in 1877 onwards. A truncated, mean-parameterised Conway-Maxwell-Poisson model is developed to handle the under- and overdispersed nature of the data, which are in the form of small counts, and to extract the innate ability of individual bowlers. Inferences are made using a Bayesian approach by deploying a Markov Chain Monte Carlo algorithm to obtain parameter estimates and confidence intervals. The model offers a good fit and indicates that the commonly used bowling average is a flawed measure.
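For reference, the standard (untruncated) Conway-Maxwell-Poisson probability mass function is

$$ \Pr(Y = y \mid \lambda, \nu) = \frac{\lambda^{y}}{(y!)^{\nu}\, Z(\lambda, \nu)}, \qquad Z(\lambda, \nu) = \sum_{j=0}^{\infty} \frac{\lambda^{j}}{(j!)^{\nu}}, \qquad y = 0, 1, 2, \ldots $$

where $\nu < 1$ yields overdispersion, $\nu > 1$ underdispersion, and $\nu = 1$ recovers the Poisson distribution. The mean-parameterised, truncated variant used in the paper reparameterises $\lambda$ in terms of the mean $\mu$ and restricts the support; the exact parameterisation there may differ from this textbook form.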
151 - Hai Shu, Zhe Qu, Hongtu Zhu 2020
Modern biomedical studies often collect multiple types of high-dimensional data on a common set of objects. A popular model for the joint analysis of multi-type datasets decomposes each data matrix into a low-rank common-variation matrix generated by latent factors shared across all datasets, a low-rank distinctive-variation matrix corresponding to each dataset, and an additive noise matrix. We propose decomposition-based generalized canonical correlation analysis (D-GCCA), a novel decomposition method that appropriately defines those matrices on the L2 space of random variables, whereas most existing methods are developed on its approximation, the Euclidean dot product space. Moreover, to well calibrate common latent factors, we impose a desirable orthogonality constraint on distinctive latent factors. Existing methods inadequately consider such orthogonality and can thus suffer from substantial loss of undetected common variation. Our D-GCCA takes one step further than GCCA by separating common and distinctive variations among canonical variables, and enjoys an appealing interpretation from the perspective of principal component analysis. Consistent estimators of our common-variation and distinctive-variation matrices are established with good finite-sample numerical performance, and have closed-form expressions leading to efficient computation especially for large-scale datasets. The superiority of D-GCCA over state-of-the-art methods is also corroborated in simulations and real-world data examples.
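In symbols (notation illustrative, not necessarily the paper's), the decomposition described above is

$$ Y_k = C_k + D_k + E_k, \qquad k = 1, \ldots, K, $$

where $Y_k$ is the $k$-th data matrix, $C_k$ is the low-rank common-variation matrix generated by latent factors shared across all $K$ datasets, $D_k$ is the low-rank distinctive-variation matrix specific to dataset $k$, and $E_k$ is additive noise; D-GCCA additionally constrains the distinctive latent factors to be orthogonal to the common ones.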
124 - R. E. Ryan Jr 2011
We present a suite of IDL routines to interactively run GALFIT whereby the various surface brightness profiles (and their associated parameters) are represented by regions, which the user is expected to place. The regions may be saved and/or loaded from the ASCII format used by ds9 or in the Hierarchical Data Format (version 5). The software has been tested to run stably on Mac OS X and Linux with IDL 7.0.4. In addition to its primary purpose of modeling galaxy images with GALFIT, this package has several ancillary uses, including flexible image display routines, several basic photometry functions, and qualitative assessment of Source Extractor. We distribute the package freely and without any implicit or explicit warranties, guarantees, or assurance of any kind. We kindly ask users to report any bugs, errors, or suggestions to us directly (as opposed to fixing them themselves) to ensure version control and uniformity.
We consider alignment of sparse graphs, which consists in finding a mapping between the nodes of two graphs which preserves most of the edges. Our approach is to compare local structures in the two graphs, matching two nodes if their neighborhoods are close enough: for correlated Erdős-Rényi random graphs, this problem can be locally rephrased in terms of testing whether a pair of branching trees is drawn from either a product distribution or a correlated distribution. We design an optimal test for this problem which gives rise to a message-passing algorithm for graph alignment, which provably returns in polynomial time a positive fraction of correctly matched vertices, and a vanishing fraction of mismatches. With an average degree $\lambda = O(1)$ in the graphs, and a correlation parameter $s \in [0,1]$, this result holds with $\lambda s$ large enough, and $1-s$ small enough, completing the recent state-of-the-art diagram. Tighter conditions for determining whether partial graph alignment (or correlation detection in trees) is feasible in polynomial time are given in terms of Kullback-Leibler divergences.
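The local test sketched above can be written schematically (notation illustrative) as

$$ H_0 : (T, T') \sim \mathbb{P} \otimes \mathbb{P}' \ \text{(independent branching trees)} \qquad \text{vs.} \qquad H_1 : (T, T') \sim \mathbb{P}_{\mathrm{corr}} \ \text{(correlated trees)}, $$

and, as the abstract notes, the feasibility of polynomial-time partial alignment is characterised through Kullback-Leibler divergences between these two laws.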
