Hypothesis Testing for Topological Data Analysis

Published by Katharine Turner

Publication date: 2013

Research field: Mathematical Statistics, Informatics Engineering

Language: English

Authors: Andrew Robinson, Katharine Turner

Statistics Applications, Computational Geometry, Algebraic Topology

Abstract

Persistent homology is a vital tool for topological data analysis. Previous work has developed some statistical estimators for characteristics of collections of persistence diagrams. However, tools that provide statistical inference for observations that are persistence diagrams are limited. Specifically, there is a need for tests that can assess the strength of evidence against a claim that two samples arise from the same population or process. We propose the use of randomization-style null hypothesis significance tests (NHST) for these situations. The test is based on a loss function that comprises pairwise distances between the elements of each sample and all the elements in the other sample. We use this method to analyze a range of simulated and experimental data. Through these examples we experimentally explore the power of the p-values. Our results show that the randomization-style NHST based on pairwise distances can distinguish between samples from different processes, which suggests that its use for hypothesis tests upon persistence diagrams is reasonable. We demonstrate its application on a real dataset of fMRI data of patients with ADHD.
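The randomization-style test described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the loss used here (the mean distance between each element of one sample and every element of the other) is one reading of the abstract, and the names `cross_sample_loss` and `randomization_test` are hypothetical. The observations are plain numbers with absolute difference as the distance, a stand-in for persistence diagrams and a diagram metric, so that the example needs only the standard library.

```python
import random

def cross_sample_loss(dist, labels):
    """Mean distance from each element of sample 0 to every element of sample 1."""
    a = [i for i, g in enumerate(labels) if g == 0]
    b = [j for j, g in enumerate(labels) if g == 1]
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

def randomization_test(dist, labels, n_perm=999, seed=0):
    """Estimate a p-value by reshuffling the sample labels.

    A large cross-sample distance is treated as evidence against the
    claim that both samples come from the same process (assumption).
    """
    rng = random.Random(seed)
    observed = cross_sample_loss(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random relabelling under the null hypothesis
        if cross_sample_loss(dist, shuffled) >= observed:
            hits += 1
    # Count the observed labelling as one of the relabellings.
    return observed, (hits + 1) / (n_perm + 1)

# Stand-in data: two samples of 15 numbers with clearly different means.
rng = random.Random(1)
values = [rng.gauss(0.0, 1.0) for _ in range(15)] + [rng.gauss(3.0, 1.0) for _ in range(15)]
dist = [[abs(x - y) for y in values] for x in values]  # pairwise distance matrix
labels = [0] * 15 + [1] * 15

observed, p_value = randomization_test(dist, labels)
print(f"observed loss = {observed:.3f}, p-value = {p_value:.4f}")
```

Because the function only needs a pairwise distance matrix, the same sketch would apply to persistence diagrams once `dist` is filled with distances between diagrams computed by whatever metric is chosen.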

Michelle Feng, Abigail Hickok, 2021

In this chapter, we discuss applications of topological data analysis (TDA) to spatial systems. We briefly review the recently proposed level-set construction of filtered simplicial complexes, and we then examine persistent homology in two case studies: street networks in Shanghai and hotspots of COVID-19 infections. We then summarize our results and provide an outlook on TDA in spatial systems.

Social and Information Networks, Computational Geometry, Algebraic Topology

Breaking hypothesis testing for failure rates

Rohit Pandey, Yingnong Dang, Gil Lapid Shafriri, 2020

We describe the utility of point processes and failure rates and the most common point process for modeling failure rates, the Poisson point process. Next, we describe the uniformly most powerful test for comparing the rates of two Poisson point processes for a one-sided test (henceforth referred to as the rate test). A common argument against using this test is that real-world data rarely follows the Poisson point process. We thus investigate what happens when the distributional assumptions of tests like these are violated and the test is still applied. We find a non-pathological example (using the rate test on a Compound Poisson distribution with Binomial compounding) where violating the distributional assumptions of the rate test makes it perform better (lower error rates). We also find that if we replace the distribution of the test statistic under the null hypothesis with any other arbitrary distribution, the performance of the test (described in terms of the false negative rate to false positive rate trade-off) remains exactly the same. Next, we compare the performance of the rate test to a version of the Wald test customized to the Negative Binomial point process and find it to perform very similarly while being much more general and versatile. Finally, we discuss the applications to Microsoft Azure. The code for all experiments performed is open source and linked in the introduction.
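A one-sided comparison of two Poisson rates is commonly carried out as a conditional binomial test, and the sketch below follows that textbook form. It is an assumption that this matches the "rate test" of the abstract, and the function name `poisson_rate_test` and the example counts are hypothetical. The idea: given the total number of events, the count from the second process is binomial with success probability equal to its share of the total observation time when the two rates are equal.

```python
from math import comb

def poisson_rate_test(n1, t1, n2, t2):
    """One-sided p-value for H0: rate2 <= rate1 against H1: rate2 > rate1.

    n1, n2 are event counts observed over exposure times t1, t2.
    Conditional on n1 + n2, n2 ~ Binomial(n1 + n2, t2 / (t1 + t2)) at the
    boundary of H0, so the p-value is the upper binomial tail at n2.
    """
    n = n1 + n2
    p = t2 / (t1 + t2)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n2, n + 1))

# Hypothetical counts: equal failures, then three times as many failures.
p_equal = poisson_rate_test(10, 100.0, 10, 100.0)
p_higher = poisson_rate_test(10, 100.0, 30, 100.0)
print(f"equal counts: p = {p_equal:.3f}; tripled count: p = {p_higher:.4f}")
```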

Statistics Applications, Machine Learning

An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists

Frederic Chazal, 2017

Topological Data Analysis is a recent and fast-growing field providing a set of new topological and geometric tools to infer relevant features for possibly complex data. This paper is a brief introduction, through a few selected topics, to basic fundamental and practical aspects of TDA for non-experts.

Statistics Theory, Machine Learning, Algebraic Topology

Topological Data Analysis for True Step Detection in Piecewise Constant Signals

Firas A. Khasawneh, Elizabeth Munch, 2018

This paper introduces a simple yet powerful approach based on topological data analysis (TDA) for detecting the true steps in a piecewise constant (PWC) signal. The signal is a two-state square wave with randomly varying in-between-pulse spacing, and subject to spurious steps at the rising or falling edges which we refer to as digital ringing. We use persistent homology to derive mathematical guarantees for the resulting change detection which enables accurate identification and counting of the true pulses. The approach is described and tested using both synthetic and experimental data obtained using an engine lathe instrumented with a laser tachometer. The described algorithm enables the accurate calculation of the spindle speed with the appropriate error bounds. The results of the described approach are compared to the frequency domain approach via Fourier transform. It is found that both our approach and the Fourier analysis yield comparable results for numerical and experimental pulses with regular spacing and digital ringing. However, the described approach significantly outperforms Fourier analysis when the spacing between the peaks is varied. We also generalize the approach to higher dimensional PWC signals, although utilizing this extension remains an interesting question for future research.

Signal Processing, Computational Geometry

Statistical hypothesis testing versus machine-learning binary classification: distinctions and guidelines

Jingyi Jessica Li, Xin Tong, 2020

Making binary decisions is a common data analytical task in scientific research and industrial applications. In data sciences, there are two related but distinct strategies: hypothesis testing and binary classification. In practice, how to choose between these two strategies can be unclear and rather confusing. Here we summarize key distinctions between these two strategies in three aspects and list five practical guidelines for data analysts to choose the appropriate strategy for specific analysis needs. We demonstrate the use of those guidelines in a cancer driver gene prediction example.

Statistics Applications, Quantitative Methods
