Exploring dual information in distance metric learning for clustering

112 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Rodrigo Alves Randel

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Rodrigo Randel - Daniel Aloise - Alain Hertz

التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Distance metric learning algorithms aim to appropriately measure similarities and distances between data points. In the context of clustering, metric learning is typically applied with the assist of side-information provided by experts, most commonly expressed in the form of cannot-link and must-link constraints. In this setting, distance metric learning algorithms move closer pairs of data points involved in must-link constraints, while pairs of points involved in cannot-link constraints are moved away from each other. For these algorithms to be effective, it is important to use a distance metric that matches the expert knowledge, beliefs, and expectations, and the transformations made to stick to the side-information should preserve geometrical properties of the dataset. Also, it is interesting to filter the constraints provided by the experts to keep only the most useful and reject those that can harm the clustering process. To address these issues, we propose to exploit the dual information associated with the pairwise constraints of the semi-supervised clustering problem. Experiments clearly show that distance metric learning algorithms benefit from integrating this dual information.

قيم البحث

61 - Panpan Yu , Qingna Li 2019

Image ranking is to rank images based on some known ranked images. In this paper, we propose an improved linear ordinal distance metric learning approach based on the linear distance metric learning model. By decomposing the distance metric $A$ as $L ^TL$, the problem can be cast as looking for a linear map between two sets of points in different spaces, meanwhile maintaining some data structures. The ordinal relation of the labels can be maintained via classical multidimensional scaling, a popular tool for dimension reduction in statistics. A least squares fitting term is then introduced to the cost function, which can also maintain the local data structure. The resulting model is an unconstrained problem, and can better fit the data structure. Extensive numerical results demonstrate the improvement of the new approach over the linear distance metric learning model both in speed and ranking performance.

التعلم الآلي التحسين والتحكم التعلم الالي

Exploring Adversarial Robustness of Deep Metric Learning

77 - Thomas Kobber Panum , Zi Wang , Pengyu Kan 2021

Deep Metric Learning (DML), a widely-used technique, involves learning a distance metric between pairs of samples. DML uses deep neural architectures to learn semantic embeddings of the input, where the distance between similar examples is small whil e dissimilar ones are far apart. Although the underlying neural networks produce good accuracy on naturally occurring samples, they are vulnerable to adversarially-perturbed samples that reduce performance. We take a first step towards training robust DML models and tackle the primary challenge of the metric losses being dependent on the samples in a mini-batch, unlike standard losses that only depend on the specific input-output pair. We analyze this dependence effect and contribute a robust optimization formulation. Using experiments on three commonly-used DML datasets, we demonstrate 5-76 fold increases in adversarial accuracy, and outperform an existing DML model that sought out to be robust.

التعلم الآلي الذكاء الاصطناعي

Graph Neural Distance Metric Learning with Graph-Bert

70 - Jiawei Zhang 2020

Graph distance metric learning serves as the foundation for many graph learning problems, e.g., graph clustering, graph classification and graph matching. Existing research works on graph distance metric (or graph kernels) learning fail to maintain t he basic properties of such metrics, e.g., non-negative, identity of indiscernibles, symmetry and triangle inequality, respectively. In this paper, we will introduce a new graph neural network based distance metric learning approaches, namely GB-DISTANCE (GRAPH-BERT based Neural Distance). Solely based on the attention mechanism, GB-DISTANCE can learn graph instance representations effectively based on a pre-trained GRAPH-BERT model. Different from the existing supervised/unsupervised metrics, GB-DISTANCE can be learned effectively in a semi-supervised manner. In addition, GB-DISTANCE can also maintain the distance metric basic properties mentioned above. Extensive experiments have been done on several benchmark graph datasets, and the results demonstrate that GB-DISTANCE can out-perform the existing baseline methods, especially the recent graph neural network model based graph metrics, with a significant gap in computing the graph distance.

التعلم الآلي الحوسبة العصبية والتطورية التعلم الالي

Gravity Dual of Quantum Information Metric

334 - Masamichi Miyaji , Tokiro Numasawa , Noburo Shiba 2015

We study a quantum information metric (or fidelity susceptibility) in conformal field theories with respect to a small perturbation by a primary operator. We argue that its gravity dual is approximately given by a volume of maximal time slice in an A dS spacetime when the perturbation is exactly marginal. We confirm our claim in several examples.

الفيزياء عالية الطاقة - النظرية الإلكترونات المرتبطة بشدة فيزياء الكم

Trust in AutoML: Exploring Information Needs for Establishing Trust in Automated Machine Learning Systems

175 - Jaimie Drozdal , Justin Weisz , Dakuo Wang 2020

We explore trust in a relatively new area of data science: Automated Machine Learning (AutoML). In AutoML, AI methods are used to generate and optimize machine learning models by automatically engineering features, selecting models, and optimizing hy perparameters. In this paper, we seek to understand what kinds of information influence data scientists trust in the models produced by AutoML? We operationalize trust as a willingness to deploy a model produced using automated methods. We report results from three studies -- qualitative interviews, a controlled experiment, and a card-sorting task -- to understand the information needs of data scientists for establishing trust in AutoML systems. We find that including transparency features in an AutoML tool increased user trust and understandability in the tool; and out of all proposed features, model performance metrics and visualizations are the most important information to data scientists when establishing their trust with an AutoML tool.

التعلم الآلي أجهزة الكمبيوتر والمجتمع تفاعل الإنسان والحاسوب