An Infra-Structure for Performance Estimation and Experimental Comparison of Predictive Models in R

149 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Luis Torgo

تاريخ النشر 2014

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Luis Torgo

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This document describes an infra-structure provided by the R package performanceEstimation that allows to estimate the predictive performance of different approaches (workflows) to predictive tasks. The infra-structure is generic in the sense that it can be used to estimate the values of any performance metrics, for any workflow on different predictive tasks, namely, classification, regression and time series tasks. The package also includes several standard workflows that allow users to easily set up their experiments limiting the amount of work and information they need to provide. The overall goal of the infra-structure provided by our package is to facilitate the task of estimating the predictive performance of different modeling approaches to predictive tasks in the R environment.

قيم البحث

75 - Paula Branco , Rita P. Ribeiro , Luis Torgo 2016

This document describes the R package UBL that allows the use of several methods for handling utility-based learning problems. Classification and regression problems that assume non-uniform costs and/or benefits pose serious challenges to predictive analytic tasks. In the context of meteorology, finance, medicine, ecology, among many other, specific domain information concerning the preference bias of the users must be taken into account to enhance the models predictive performance. To deal with this problem, a large number of techniques was proposed by the research community for both classification and regression tasks. The main goal of UBL package is to facilitate the utility-based predictive analytic task by providing a set of methods to deal with this type of problems in the R environment. It is a versatile tool that provides mechanisms to handle both regression and classification (binary and multiclass) tasks. Moreover, UBL package allows the user to specify his domain preferences, but it also provides some automatic methods that try to infer those preference bias from the domain, considering some common known settings.

البرمجيات الرياضية التعلم الآلي التعلم الالي

Elements of Design for Containers and Solutions in the LinBox Library

491 - Brice Boyer , Jean-Guillaume Dumas 2014

We describe in this paper new design techniques used in the cpp exact linear algebra library linbox, intended to make the library safer and easier to use, while keeping it generic and efficient. First, we review the new simplified structure for conta iners, based on our emph{founding scope allocation} model. We explain design choices and their impact on coding: unification of our matrix classes, clearer model for matrices and submatrices, etc Then we present a variation of the emph{strategy} design pattern that is comprised of a controller--plugin system: the controller (solution) chooses among plug-ins (algorithms) that always call back the controllers for subtasks. We give examples using the solution mul. Finally we present a benchmark architecture that serves two purposes: Providing the user with easier ways to produce graphs; Creating a framework for automatically tuning the library and supporting regression testing.

البرمجيات الرياضية الحساب الرمزي هندسة البرمجيات

An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation

87 - Conrad Sanderson , Ryan Curtin 2017

Modelling of multivariate densities is a core component in many signal processing, pattern recognition and machine learning applications. The modelling is often done via Gaussian mixture models (GMMs), which use computationally expensive and potentia lly unstable training algorithms. We provide an overview of a fast and robust implementation of GMMs in the C++ language, employing multi-thread

البرمجيات الرياضية التعلم الآلي

Comparison of hazard rate estimation in R

91 - Yolanda Hagar , Vanja Dukic 2015

We give an overview of eight different software packages and functions available in R for semi- or non-parametric estimation of the hazard rate for right-censored survival data. Of particular interest is the accuracy of the estimation of the hazard r ate in the presence of covariates, as well as the user-friendliness of the packages. In addition, we investigate the ability to incorporate covariates under both the proportional and the non-proportional hazards assumptions. We contrast the robustness, variability and precision of the functions through simulations, and then further compare differences between the functions by analyzing the cancer and TRACE survival data sets available in R, including covariates under the proportional and non-proportional hazards settings.

حساب

LightSeq: A High Performance Inference Library for Transformers

76 - Xiaohui Wang , Ying Xiong , Yang Wei 2020

Transformer, BERT and their variants have achieved great success in natural language processing. Since Transformer models are huge in size, serving these models is a challenge for real industrial applications. In this paper, we propose LightSeq, a hi ghly efficient inference library for models in the Transformer family. LightSeq includes a series of GPU optimization techniques to to streamline the computation of neural layers and to reduce memory footprint. LightSeq can easily import models trained using PyTorch and Tensorflow. Experimental results on machine translation benchmarks show that LightSeq achieves up to 14x speedup compared with TensorFlow and 1.4x compared with FasterTransformer, a concurrent CUDA implementation. The code is available at https://github.com/bytedance/lightseq.

البرمجيات الرياضية التعلم الآلي