RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation

Published by Ju Fan

Publication date: 2020

Research field: Informatics Engineering

Paper language: English

Authors: Nan Tang, Ju Fan, Fangyi Li

Machine Learning, Databases

Can AI help automate human-easy but computer-hard data preparation tasks that burden data scientists, practitioners, and crowd workers? We answer this question by presenting RPT, a denoising auto-encoder for tuple-to-X models (X could be tuple, token, label, JSON, and so on). RPT is pre-trained for a tuple-to-tuple model by corrupting the input tuple and then learning a model to reconstruct the original tuple. It adopts a Transformer-based neural translation architecture that consists of a bidirectional encoder (similar to BERT) and a left-to-right autoregressive decoder (similar to GPT), leading to a generalization of both BERT and GPT. The pre-trained RPT can already support several common data preparation tasks such as data cleaning, auto-completion and schema matching. Better still, RPT can be fine-tuned on a wide range of data preparation tasks, such as value normalization, data transformation, data annotation, etc. To complement RPT, we also discuss several appealing techniques such as collaborative training and few-shot learning for entity resolution, and few-shot learning and NLP question-answering for information extraction. In addition, we identify a series of research opportunities to advance the field of data preparation.
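The corrupt-then-reconstruct pre-training described above can be illustrated with a minimal sketch. It only builds a (corrupted input, original target) text pair from one tuple; no model is trained. The serialization markers `[A]` and `[V]`, the mask token `[M]`, and the function names are hypothetical choices made for this illustration, not details given in the abstract.

```python
import random

MASK = "[M]"  # hypothetical mask token, assumed for illustration


def serialize(row):
    """Flatten a tuple (dict of attribute -> value) into one token string.

    The [A]/[V] markers are an assumed convention for separating
    attribute names from values.
    """
    return " ".join(f"[A] {name} [V] {value}" for name, value in row.items())


def corrupt(row, rng):
    """Return a copy of the tuple with one randomly chosen value masked."""
    masked_attr = rng.choice(list(row))
    noisy = dict(row)
    noisy[masked_attr] = MASK
    return noisy, masked_attr


def training_pair(row, rng):
    """Build (source, target): the corrupted tuple and the original tuple."""
    noisy, _ = corrupt(row, rng)
    return serialize(noisy), serialize(row)


if __name__ == "__main__":
    row = {"name": "Michael Jordan", "expertise": "Machine Learning", "city": "Berkeley"}
    source, target = training_pair(row, random.Random(0))
    print("source:", source)
    print("target:", target)
```

In a setup like this, an encoder-decoder model would read `source` and be trained to emit `target`, so that filling in a masked value resembles the data cleaning and auto-completion tasks the abstract mentions.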

Read also

Is Fast Adaptation All You Need?

Khurram Javed, Hengshuai Yao, Martha White, 2019

Gradient-based meta-learning has proven to be highly effective at learning model initializations, representations, and update rules that allow fast adaptation from a few samples. The core idea behind these approaches is to use fast adaptation and generalization -- two second-order metrics -- as training signals on a meta-training dataset. However, little attention has been given to other possible second-order metrics. In this paper, we investigate a different training signal -- robustness to catastrophic interference -- and demonstrate that representations learned by directly minimizing interference are more conducive to incremental learning than those learned by just maximizing fast adaptation.

Machine Learning

MemGEN: Memory is All You Need

Sylvain Gelly, Karol Kurach, Marcin Michalski, 2018

We propose a new learning paradigm called Deep Memory. It has the potential to completely revolutionize the Machine Learning field. Surprisingly, this paradigm has not been reinvented yet, unlike Deep Learning. At the core of this approach is the \textit{Learning By Heart} principle, well studied in primary schools all over the world. Inspired by poem recitation, or by $\pi$ decimal memorization, we propose a concrete algorithm that mimics human behavior. We implement this paradigm on the task of generative modeling, and apply to images, natural language and even the $\pi$ decimals as long as one can print them as text. The proposed algorithm even generated this paper, in a one-shot learning setting. In carefully designed experiments, we show that the generated samples are indistinguishable from the training examples, as measured by any statistical tests or metrics.

Machine Learning

Logarithmic Pruning is All You Need

Laurent Orseau, Marcus Hutter, Omar Rivasplata, 2020

The Lottery Ticket Hypothesis is a conjecture that every large neural network contains a subnetwork that, when trained in isolation, achieves comparable performance to the large network. An even stronger conjecture has been proven recently: Every sufficiently overparameterized network contains a subnetwork that, at random initialization, but without training, achieves comparable accuracy to the trained large network. This latter result, however, relies on a number of strong assumptions and guarantees a polynomial factor on the size of the large network compared to the target function. In this work, we remove the most limiting assumptions of this previous work while providing significantly tighter bounds: the overparameterized network only needs a logarithmic factor (in all variables but depth) number of neurons per weight of the target subnetwork.

Machine Learning

Categorical Representation Learning: Morphism is All You Need

Artan Sheshmani, Yizhuang You, 2021

We provide a construction for categorical representation learning and introduce the foundations of $\textit{categorifier}$. The central theme in representation learning is the idea of $\textbf{everything to vector}$. Every object in a dataset $\mathcal{S}$ can be represented as a vector in $\mathbb{R}^n$ by an $\textit{encoding map}$ $E: \mathcal{O}bj(\mathcal{S})\to\mathbb{R}^n$. More importantly, every morphism can be represented as a matrix $E: \mathcal{H}om(\mathcal{S})\to\mathbb{R}^{n}_{n}$. The encoding map $E$ is generally modeled by a $\textit{deep neural network}$. The goal of representation learning is to design appropriate tasks on the dataset to train the encoding map (assuming that an encoding is optimal if it universally optimizes the performance on various tasks). However, the latter is still a $\textit{set-theoretic}$ approach. The goal of the current article is to promote the representation learning to a new level via a $\textit{category-theoretic}$ approach. As a proof of concept, we provide an example of a text translator equipped with our technology, showing that our categorical learning model outperforms the current deep learning models by 17 times. The content of the current article is part of the recent US patent proposal (patent application number: 63110906).

Machine Learning, Disordered Systems and Neural Networks, Artificial Intelligence

Segmentation is All You Need

Zehua Cheng, Yuxiang Wu, Zhenghua Xu, 2019

Region proposal mechanisms are essential for existing deep learning approaches to object detection in images. Although they can generally achieve a good detection performance under normal circumstances, their recall in a scene with extreme cases is unacceptably low. This is mainly because bounding box annotations contain much environment noise information, and non-maximum suppression (NMS) is required to select target boxes. Therefore, in this paper, we propose the first anchor-free and NMS-free object detection model called weakly supervised multimodal annotation segmentation (WSMA-Seg), which utilizes segmentation models to achieve an accurate and robust object detection without NMS. In WSMA-Seg, multimodal annotations are proposed to achieve an instance-aware segmentation using weakly supervised bounding boxes; we also develop a run-data-based following algorithm to trace contours of objects. In addition, we propose a multi-scale pooling segmentation (MSP-Seg) as the underlying segmentation model of WSMA-Seg to achieve a more accurate segmentation and to enhance the detection accuracy of WSMA-Seg. Experimental results on multiple datasets show that the proposed WSMA-Seg approach outperforms the state-of-the-art detectors.

Computer Vision and Pattern Recognition