ﻻ يوجد ملخص باللغة العربية
Faced with massive data, is it possible to trade off (statistical) risk, and (computational) space and time? This challenge lies at the heart of large-scale machine learning. Using k-means clustering as a prototypical unsupervised learning problem, we show how we can strategically summarize the data (control space) in order to trade off risk and time when data is generated by a probabilistic model. Our summarization is based on coreset constructions from computational geometry. We also develop an algorithm, TRAM, to navigate the space/time/data/risk tradeoff in practice. In particular, we show that for a fixed risk (or data size), as the data size increases (resp. risk increases) the running time of TRAM decreases. Our extensive experiments on real data sets demonstrate the existence and practical utility of such tradeoffs, not only for k-means but also for Gaussian Mixture Models.
Learning from unlabeled and noisy data is one of the grand challenges of machine learning. As such, it has seen a flurry of research with new ideas proposed continuously. In this work, we revisit a classical idea: Steins Unbiased Risk Estimator (SURE
We present a faster symbolic algorithm for the following central problem in probabilistic verification: Compute the maximal end-component (MEC) decomposition of Markov decision processes (MDPs). This problem generalizes the SCC decomposition problem
In recent years much effort has been concentrated towards achieving polynomial time lower bounds on algorithms for solving various well-known problems. A useful technique for showing such lower bounds is to prove them conditionally based on well-stud
In function inversion, we are given a function $f: [N] mapsto [N]$, and want to prepare some advice of size $S$, such that we can efficiently invert any image in time $T$. This is a well studied problem with profound connections to cryptography, data
We explore bounds of {em time-space tradeoffs} in language recognition on {em two-way finite automata} for some special languages. We prove: (1) a time-space tradeoff upper bound for recognition of the languages $L_{EQ}(n)$ on {em two-way probabilist