
Multiple-Instance Logistic Regression with LASSO Penalty

Added by Sheng-Mao Chang
Publication date: 2016
Research language: English





In this work, we consider a manufacturing process that can be described by a multiple-instance logistic regression model. To compute the maximum likelihood estimate of the unknown coefficients, an expectation-maximization (EM) algorithm is proposed. The modeling approach can further be extended to identify the important covariates by adding a penalty term on the coefficients to the likelihood function. In addition to the essential technical details, we demonstrate the usefulness of the proposed method through simulations and real examples.
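For readers who want to experiment, below is a minimal sketch, assuming the standard multiple-instance rule that a bag is positive when at least one of its instances is positive. It fits the penalized bag-level likelihood by proximal gradient descent rather than the EM algorithm proposed in the paper; the function name `milr_lasso`, the toy data, and all tuning constants are illustrative rather than taken from the paper.

```python
# Minimal sketch: multiple-instance logistic regression with a LASSO penalty,
# fitted by proximal gradient descent (not the paper's EM algorithm).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def milr_lasso(bags, y, lam=0.1, lr=0.01, n_iter=2000):
    """bags: list of (n_i, p) instance matrices; y: 0/1 bag labels; lam: L1 weight."""
    p = bags[0].shape[1]
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = np.zeros(p)
        for Xi, yi in zip(bags, y):
            pij = sigmoid(Xi @ beta)               # instance-level success probabilities
            Si = 1.0 - np.prod(1.0 - pij)          # bag-level P(bag label = 1)
            Si = np.clip(Si, 1e-10, 1.0 - 1e-10)
            grad += (Si - yi) / Si * (pij @ Xi)    # gradient of this bag's negative log-likelihood
        beta -= lr * grad / len(bags)
        # proximal (soft-thresholding) step for the LASSO penalty
        beta = np.sign(beta) * np.maximum(np.abs(beta) - lr * lam, 0.0)
    return beta

# toy example: 50 bags of 5 instances, 10 covariates, only the first 2 truly active
rng = np.random.default_rng(0)
true_beta = np.array([2.0, -2.0] + [0.0] * 8)
bags = [rng.normal(size=(5, 10)) for _ in range(50)]
y = [int((rng.uniform(size=5) < sigmoid(Xi @ true_beta)).any()) for Xi in bags]
print(milr_lasso(bags, y).round(2))
```

The soft-thresholding step is what drives unimportant coefficients exactly to zero, which is the covariate-selection effect the abstract refers to.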



Related research

When we are interested in a high-dimensional system and focus on classification performance, $\ell_1$-penalized logistic regression becomes important and popular. However, the Lasso estimates can be problematic when the penalties on different coefficients are all the same and unrelated to the data. We propose two types of weighted Lasso estimates whose covariate-dependent weights are derived from the McDiarmid inequality. Given sample size $n$ and covariate dimension $p$, the finite-sample behavior of our proposed methods with a diverging number of predictors is characterized by non-asymptotic oracle inequalities, such as bounds on the $\ell_1$-estimation error and the squared prediction error of the unknown parameters. We compare the performance of our methods with earlier weighted estimates on simulated data, and then apply these methods to real data analysis.
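The covariate-dependent weights in that paper come from the McDiarmid inequality; the sketch below only illustrates, with placeholder weights, the generic column-rescaling trick for fitting a weighted $\ell_1$ penalty with an off-the-shelf solver: penalizing $\sum_j w_j|\beta_j|$ is equivalent to running a plain $\ell_1$ fit on columns divided by $w_j$ and rescaling the fitted coefficients.

```python
# Weighted l1-penalized logistic regression via column rescaling (generic trick,
# not the paper's McDiarmid-based weighting scheme).
import numpy as np
from sklearn.linear_model import LogisticRegression

def weighted_l1_logistic(X, y, weights, C=1.0):
    """Penalize sum_j w_j * |beta_j| by rescaling column j by 1/w_j,
    fitting a plain l1 model, then dividing the fitted coefficients by w_j."""
    Xw = X / weights
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(Xw, y)
    return model.coef_[0] / weights, model.intercept_[0]

# toy data; these weights are placeholders, not derived from the McDiarmid inequality
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 20))
beta = np.concatenate([np.array([1.5, -1.5, 1.0]), np.zeros(17)])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta))))
w = 1.0 / (np.abs(X).mean(axis=0) + 1e-8)   # example covariate-dependent weights
print(weighted_l1_logistic(X, y, w)[0].round(2))
```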
HaiYing Wang (2020)
This paper studies binary logistic regression for rare events data, or imbalanced data, where the number of events (observations in one class, often called cases) is significantly smaller than the number of nonevents (observations in the other class, often called controls). We first derive the asymptotic distribution of the maximum likelihood estimator (MLE) of the unknown parameter, which shows that the asymptotic variance converges to zero at the rate of the inverse of the number of events rather than the inverse of the full data sample size. This indicates that the available information in rare events data is on the scale of the number of events, not the full data sample size. Furthermore, we prove that when only a small proportion of the nonevents is retained by under-sampling, the resulting under-sampled estimator may have an asymptotic distribution identical to that of the full-data MLE. This demonstrates the advantage of under-sampling nonevents for rare events data, because the procedure may significantly reduce computation and/or data collection costs. Another common practice in analyzing rare events data is to over-sample (replicate) the events, which has a higher computational cost. We show that this procedure may even result in efficiency loss in terms of parameter estimation.
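As a rough illustration of the under-sampling procedure discussed here, the sketch below keeps all events and a small random fraction of the nonevents, fits an ordinary logistic regression, and applies the standard case-control intercept correction (adding the log of the nonevent sampling rate back to the fitted intercept). The data-generating parameters and sampling rate are illustrative assumptions, not values from the paper.

```python
# Under-sampling nonevents for rare-events logistic regression, with the
# standard case-control intercept correction.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, p, intercept = 200_000, 5, -8.0              # events form only a small fraction of the sample
beta = rng.normal(size=p)
X = rng.normal(size=(n, p))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(intercept + X @ beta))))

rho = 0.01                                      # keep all events, ~1% of nonevents
keep = (y == 1) | (rng.uniform(size=n) < rho)
model = LogisticRegression(C=1e6, max_iter=1000).fit(X[keep], y[keep])  # large C ~ no penalty

# slopes are consistent as fitted; only the intercept needs log(rho) added back
intercept_hat = model.intercept_[0] + np.log(rho)
print("slopes:", model.coef_[0].round(2), " intercept:", round(intercept_hat, 2))
```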
While Multiple Instance (MI) data are point patterns -- sets or multi-sets of unordered points -- appropriate statistical point pattern models have not been used in MI learning. This article proposes a framework for model-based MI learning using point process theory. Likelihood functions for point pattern data derived from point process theory enable principled yet conceptually transparent extensions of learning tasks, such as classification, novelty detection and clustering, to point pattern data. Furthermore, tractable point pattern models as well as solutions for learning and decision making from point pattern data are developed.
In this paper we consider exact tests of multiple logistic regression, where the levels of the covariates are equally spaced, via Markov bases. In the usual application of multiple logistic regression, the sample size is positive for each combination of levels of the covariates. In this case we do not need a whole Markov basis, which guarantees connectivity of all fibers. We first give an explicit Markov basis for multiple Poisson regression. By the Lawrence lifting of this basis, in the case of bivariate logistic regression, we show a simple subset of the Markov basis which connects all fibers with a positive sample size for each combination of levels of the covariates.
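For reference, the Lawrence lifting mentioned here is the standard construction from algebraic statistics: given the configuration matrix $A \in \mathbb{Z}^{d \times n}$ of the Poisson regression model, its Lawrence lifting is

$$\Lambda(A) \;=\; \begin{pmatrix} A & 0 \\ I_n & I_n \end{pmatrix},$$

where $I_n$ is the $n \times n$ identity matrix. Roughly, the added identity rows fix the total count (successes plus failures) in each cell, which is how a basis for the Poisson regression model relates to the logistic regression setting with a positive sample size in every cell.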
Variational Bayes (VB) is a popular scalable alternative to Markov chain Monte Carlo for Bayesian inference. We study a mean-field spike and slab VB approximation of widely used Bayesian model selection priors in sparse high-dimensional logistic regression. We provide non-asymptotic theoretical guarantees for the VB posterior in both $\ell_2$ and prediction loss for a sparse truth, giving optimal (minimax) convergence rates. Since the VB algorithm does not depend on the unknown truth to achieve optimality, our results shed light on effective prior choices. We confirm the improved performance of our VB algorithm over common sparse VB approaches in a numerical study.
