Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

On Conditional Branches in Optimal Decision Trees

60 0 0.0 ( 0 )

Download Cite

Added by Michael Baer

Publication date 2006

fields Informatics Engineering

and research's language is English

Authors Michael B. Baer

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The decision tree is one of the most fundamental programming abstractions. A commonly used type of decision tree is the alphabetic binary tree, which uses (without loss of generality) ``less than versus greater than or equal to tests in order to determine one of $n$ outcome events. The process of finding an optimal alphabetic binary tree for a known probability distribution on outcome events usually has the underlying assumption that the cost (time) per decision is uniform and thus independent of the outcome of the decision. This assumption, however, is incorrect in the case of software to be optimized for a given microprocessor, e.g., in compiling switch statements or in fine-tuning program bottlenecks. The operation of the microprocessor generally means that the cost for the more likely decision outcome can or will be less -- often far less -- than the less likely decision outcome. Here we formulate a variety of $O(n^3)$-time $O(n^2)$-space dynamic programming algorithms to solve such optimal binary decision tree problems, optimizing for the behavior of processors with predictive branch capabilities, both static and dynamic. In the static case, we use existing results to arrive at entropy-based performance bounds. Solutions to this formulation are often faster in practice than ``optimal decision trees as formulated in the literature, and, for small problems, are easily worth the extra complexity in finding the better solution. This can be applied in fast implementation of decoding Huffman codes.

rate research

Generalized and Scalable Optimal Sparse Decision Trees

72 - Jimmy Lin , Chudi Zhong , Diane Hu 2020

Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been made that have allowed practical algorithms to find optimal decision trees. These new techniques have the potential to trigger a paradigm shift where it is possible to construct sparse decision trees to efficiently optimize a variety of objective functions without relying on greedy splitting and pruning heuristics that often lead to suboptimal solutions. The contribution in this work is to provide a general framework for decision tree optimization that addresses the two significant open problems in the area: treatment of imbalanced data and fully optimizing over continuous variables. We present techniques that produce optimal decision trees over a variety of objectives including F-score, AUC, and partial area under the ROC convex hull. We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables and speeds up decision tree construction by several orders of magnitude relative to the state-of-the art.

Machine Learning Machine Learning

The Missing Piece Syndrome in Peer-to-Peer Communication

394 - Bruce Hajek , Ji Zhu 2010

Typical protocols for peer-to-peer file sharing over the Internet divide files to be shared into pieces. New peers strive to obtain a complete collection of pieces from other peers and from a seed. In this paper we investigate a problem that can occur if the seeding rate is not large enough. The problem is that, even if the statistics of the system are symmetric in the pieces, there can be symmetry breaking, with one piece becoming very rare. If peers depart after obtaining a complete collection, they can tend to leave before helping other peers receive the rare piece. Assuming that peers arrive with no pieces, there is a single seed, random peer contacts are made, random useful pieces are downloaded, and peers depart upon receiving the complete file, the system is stable if the seeding rate (in pieces per time unit) is greater than the arrival rate, and is unstable if the seeding rate is less than the arrival rate. The result persists for any piece selection policy that selects from among useful pieces, such as rarest first, and it persists with the use of network coding.

Performance Information Theory Information Theory

Detecting State Transitions of a Markov Source: Sampling Frequency and Age Trade-off

97 - Jaya Prakash Champati , Mikael Skoglund , 2020

We consider a finite-state Discrete-Time Markov Chain (DTMC) source that can be sampled for detecting the events when the DTMC transits to a new state. Our goal is to study the trade-off between sampling frequency and staleness in detecting the events. We argue that, for the problem at hand, using Age of Information (AoI) for quantifying the staleness of a sample is conservative and therefore, introduce textit{age penalty} for this purpose. We study two optimization problems: minimize average age penalty subject to an average sampling frequency constraint, and minimize average sampling frequency subject to an average age penalty constraint; both are Constrained Markov Decision Problems. We solve them using linear programming approach and compute Markov policies that are optimal among all causal policies. Our numerical results demonstrate that the computed Markov policies not only outperform optimal periodic sampling policies, but also achieve sampling frequencies close to or lower than that of an optimal clairvoyant (non-causal) sampling policy, if a small age penalty is allowed.

Performance Information Theory Information Theory

Distributed Chernoff Test: Optimal decision systems over networks

90 - Anshuka Rangi , Massimo Franceschetti , Stefano Marano 2018

We study active decision making over sensor networks where the sensors sequential probing actions are actively chosen by continuously learning from past observations. We consider two network settings: with and without central coordination. In the first case, the network nodes interact with each other through a central entity, which plays the role of a fusion center. In the second case, the network nodes interact in a fully distributed fashion. In both of these scenarios, we propose sequential and adaptive hypothesis tests extending the classic Chernoff test. We compare the performance of the proposed tests to the optimal sequential test. In the presence of a fusion center, our test achieves the same asymptotic optimality of the Chernoff test, minimizing the risk, expressed by the expected cost required to reach a decision plus the expected cost of making a wrong decision, when the observation cost per unit time tends to zero. The test is also asymptotically optimal in the higher moments of the time required to reach a decision. Additionally, the test is parsimonious in terms of communications, and the expected number of channel uses per network node tends to a small constant. In the distributed setup, our test achieves the same asymptotic optimality of Chernoffs test, up to a multiplicative constant in terms of both risk and the higher moments of the decision time. Additionally, the test is parsimonious in terms of communications in comparison to state-of-the-art schemes proposed in the literature. The analysis of these tests is also extended to account for message quantization and communication over channels with random erasures.

Methodology Information Theory Multiagent Systems

Multiscale Optimal Filtering on the Sphere

245 - Adeem Aslam , Zubair Khalid , Jason D. McEwen 2020

We present a framework for the optimal filtering of spherical signals contaminated by realizations of an additive, zero-mean, uncorrelated and anisotropic noise process on the sphere. Filtering is performed in the wavelet domain given by the scale-discretized wavelet transform on the sphere. The proposed filter is optimal in the sense that it minimizes the mean square error between the filtered wavelet representation and wavelet representation of the noise-free signal. We also present a simplified formulation of the filter for the case when azimuthally symmetric wavelet functions are used. We demonstrate the use of the proposed optimal filter for denoising of an Earth topography map in the presence of additive, zero-mean, uncorrelated and white Gaussian noise, and show that the proposed filter performs better than the hard thresholding method and weighted spherical harmonic~(weighted-SPHARM) signal estimation framework.

Signal Processing Information Theory Information Theory

comments

Fetching comments

Higher Institute for Demographic Studies and Researches

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

On Conditional Branches in Optimal Decision Trees

Ask ChatGPT about the research

No Arabic abstract

Read More