INODE: Building an End-to-End Data Exploration System in Practice [Extended Vision]

Added by Srividya Subramanian

Publication date 2021

Field: Informatics Engineering

Research language: English

Authors Sihem Amer-Yahia

A full-fledged data exploration system must combine different access modalities with a powerful concept of guiding the user in the exploration process, by being reactive and anticipative both for data discovery and for data linking. Such systems are a real opportunity for our community to cater to users with different domain and data science expertise. We introduce INODE, an end-to-end data exploration system that leverages, on the one hand, Machine Learning and, on the other hand, semantics for the purpose of Data Management (DM). Our vision is to develop a classic unified, comprehensive platform that provides extensive access to open datasets, and we demonstrate it in three significant use cases in the fields of Cancer Biomarker Research, Research and Innovation Policy Making, and Astrophysics. INODE offers sustainable services in (a) data modeling and linking, (b) integrated query processing using natural language, (c) guidance, and (d) data exploration through visualization, thus helping the user discover new insights. We demonstrate that our system is uniquely accessible to a wide range of users from larger scientific communities to the public. Finally, we briefly illustrate how this work paves the way for new research opportunities in DM.

PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning

100 - Yuening Li , Daochen Zha , Praveen Kumar Venugopal 2020

Outlier detection is an important task for various data mining applications. Current outlier detection techniques are often manually designed for specific domains, requiring large human efforts of database setup, algorithm selection, and hyper-parameter tuning. To fill this gap, we present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support, which automatically optimizes an outlier detection pipeline for a new data source at hand. Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space. PyODDS enables end-to-end executions based on an Apache Spark backend server and a light-weight database. It also provides unified interfaces and visualizations for users with or without data science or machine learning background. In particular, we demonstrate PyODDS on several real-world datasets, with quantification analysis and visualization results.

Machine Learning Artificial Intelligence

PyODDS: An End-to-End Outlier Detection System

109 - Yuening Li , Daochen Zha , Na Zou 2019

PyODDS is an end-to-end Python system for outlier detection with database support. PyODDS provides outlier detection algorithms that meet the demands of users in different fields, with or without a data science or machine learning background. PyODDS can execute machine learning algorithms in-database, without moving data out of the database server or over the network. It also provides access to a wide range of outlier detection algorithms, including statistical analysis and more recent deep learning based approaches. PyODDS is released under the MIT open-source license and is currently available at https://github.com/datamllab/pyodds, with official documentation at https://pyodds.github.io/.

Machine Learning Databases

Putting An End to End-to-End: Gradient-Isolated Learning of Representations

91 - Sindy Lowe , Peter O'Connor , Bastiaan S. Veeling 2019

We propose a novel deep learning method for local self-supervised representation learning that does not require labels nor end-to-end backpropagation but exploits the natural order in data instead. Inspired by the observation that biological neural networks appear to learn without backpropagating a global error signal, we split a deep neural network into a stack of gradient-isolated modules. Each module is trained to maximally preserve the information of its inputs using the InfoNCE bound from Oord et al. [2018]. Despite this greedy training, we demonstrate that each module improves upon the output of its predecessor, and that the representations created by the top module yield highly competitive results on downstream classification tasks in the audio and visual domain. The proposal enables optimizing modules asynchronously, allowing large-scale distributed training of very deep neural networks on unlabelled datasets.

Machine Learning Artificial Intelligence

An End-to-End ML System for Personalized Conversational Voice Models in Walmart E-Commerce

159 - Rahul Radhakrishnan Iyer , Praveenkumar Kanumala , Stephen Guo 2020

Searching for and making decisions about products is becoming increasingly easier in the e-commerce space, thanks to the evolution of recommender systems. Personalization and recommender systems have gone hand-in-hand to help customers fulfill their shopping needs and improve their experiences in the process. With the growing adoption of conversational platforms for shopping, it has become important to build personalized models at scale to handle the large influx of data and perform inference in real-time. In this work, we present an end-to-end machine learning system for personalized conversational voice commerce. We include components for implicit feedback to the model, model training, evaluation on update, and a real-time inference engine. Our system personalizes voice shopping for Walmart Grocery customers and is currently available via Google Assistant, Siri and Google Home devices.

Machine Learning Artificial Intelligence Information Retrieval

Avalanche: an End-to-End Library for Continual Learning

150 - Vincenzo Lomonaco , Lorenzo Pellegrini , Andrea Cossu 2021

Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms.

Machine Learning Artificial Intelligence Computer Vision and Pattern Recognition
