Research papers and master's and doctoral theses published by Yang Tan

Sharper bounds on the Fourier concentration of DNFs

76 - Victor Lecomte , Li-Yang Tan 2021

In 1992 Mansour proved that every size-$s$ DNF formula is Fourier-concentrated on $s^{O(\log\log s)}$ coefficients. We improve this to $s^{O(\log\log k)}$ where $k$ is the read number of the DNF. Since $k$ is always at most $s$, our bound matches Mansour's for all DNFs and strengthens it for small-read ones. The previous best bound for read-$k$ DNFs was $s^{O(k^{3/2})}$. For $k$ up to $\tilde{\Theta}(\log\log s)$, we further improve our bound to the optimal $\mathrm{poly}(s)$; previously no such bound was known for any $k = \omega_s(1)$. Our techniques involve new connections between the term structure of a DNF, viewed as a set system, and its Fourier spectrum.
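For readers unfamiliar with the terminology, the standard notion of Fourier concentration can be written out as below. This is a sketch of the textbook definition, not a statement from the paper; the symbols $f$, $\mathcal{F}$, and $\varepsilon$ are notation introduced here for illustration.

```latex
% Fourier expansion of a Boolean function f : \{-1,1\}^n \to \{-1,1\}
f(x) = \sum_{S \subseteq [n]} \hat{f}(S) \prod_{i \in S} x_i ,
\qquad
\sum_{S \subseteq [n]} \hat{f}(S)^2 = 1 \quad \text{(Parseval)}.

% f is \varepsilon-concentrated on a collection \mathcal{F} of subsets of [n] if
\sum_{S \notin \mathcal{F}} \hat{f}(S)^2 \le \varepsilon .
```

Under this reading, the bounds quoted in the abstract control how large the collection $\mathcal{F}$ needs to be for a fixed accuracy parameter.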

Computational Complexity

Probeable DARTS with Application to Computational Pathology

103 - Sheyang Tang , Mahdi S. Hosseini , Lina Chen 2021

AI technology has made remarkable achievements in computational pathology (CPath), especially with the help of deep neural networks. However, network performance depends heavily on architecture design, which commonly requires human experts with domain knowledge. In this paper, we address this challenge with recent advances in neural architecture search (NAS) to find an optimal network for CPath applications. In particular, we use differentiable architecture search (DARTS) for its efficiency. We first adopt a probing metric to show that the original DARTS lacks proper hyperparameter tuning on the CIFAR dataset, and to show how the generalization issue can be addressed using an adaptive optimization strategy. We then apply our search framework to CPath applications by searching for the optimal network architecture on a histological tissue type dataset (ADP). Results show that the searched network outperforms state-of-the-art networks in terms of prediction accuracy and computational complexity. We further conduct extensive experiments to demonstrate the transferability of the searched network to new CPath applications, its robustness against downscaled inputs, and the reliability of its predictions.

Computer Vision and Pattern Recognition

Unsupervised Monocular Depth Estimation in Highly Complex Environments

132 - Chaoqiang Zhao , Yang Tang , Qiyu Sun 2021

Previous unsupervised monocular depth estimation methods mainly focus on the day-time scenario, and their frameworks are driven by warped photometric consistency. However, in some challenging environments, such as night, rainy night, or snowy winter, the photometry of the same pixel across frames is inconsistent because of complex lighting and reflections, so day-time unsupervised frameworks cannot be directly applied to these scenarios. In this paper, we investigate the problem of unsupervised monocular depth estimation in certain highly complex scenarios. We address this challenging problem using domain adaptation, proposing a unified image-transfer-based adaptation framework built on monocular videos. The depth model trained on day-time scenarios is adapted to different complex scenarios. Instead of adapting the whole depth network, we adapt only the encoder network to lower the computational complexity. The depth models adapted by the proposed framework to different scenarios share the same decoder, which is practical. Constraints on both the feature space and the output space encourage the framework to learn the key features for depth decoding, and a smoothness loss is introduced into the adaptation framework for better depth estimation performance. Extensive experiments show the effectiveness of the proposed unsupervised framework in estimating dense depth maps from night-time, rainy night-time, and snowy winter images.

Computer Vision and Pattern Recognition

Practical Transferability Estimation for Image Classification Tasks

95 - Yang Tan , Yang Li , Shao-Lun Huang 2021

Transferability estimation is an essential problem in transfer learning: predicting how well a source model (or source task) will perform when transferred to a target task. Recent analytical transferability metrics have been widely used for source model selection and multi-task learning. A major challenge is how to make transferability estimation robust under cross-domain, cross-task settings. The recently proposed OTCE score solves this problem by considering both domain and task differences, with the help of transfer experiences on auxiliary tasks, which causes an efficiency overhead. In this work, we propose a practical transferability metric called the JC-NCE score that dramatically improves the robustness of the task difference estimation in OTCE, thus removing the need for auxiliary tasks. Specifically, we build the joint correspondences between source and target data by solving an optimal transport problem with a ground cost that considers both the sample distance and the label distance, and then compute the transferability score as the negative conditional entropy of the matched labels. Extensive validations under the intra-dataset and inter-dataset transfer settings demonstrate that our JC-NCE score outperforms the auxiliary-task-free version of OTCE by 7% and 12%, respectively, and is also more robust than other existing transferability metrics on average.

Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning

Multi-modal Entity Alignment in Hyperbolic Space

80 - Hao Guo , Jiuyang Tang , Weixin Zeng 2021

Many AI-related tasks involve the interactions of data in multiple modalities. It has been a new trend to merge multi-modal information into knowledge graphs (KGs), resulting in multi-modal knowledge graphs (MMKGs). However, MMKGs usually suffer from low coverage and incompleteness. To mitigate this problem, a viable approach is to integrate complementary knowledge from other MMKGs. To this end, although existing entity alignment approaches could be adopted, they operate in Euclidean space, and the resulting Euclidean entity representations can lead to large distortion of a KG's hierarchical structure. Besides, the visual information has not yet been well exploited. In response to these issues, we propose a novel multi-modal entity alignment approach, Hyperbolic Multi-modal Entity Alignment (HMEA), which extends the Euclidean representation to the hyperboloid manifold. We first adopt Hyperbolic Graph Convolutional Networks (HGCNs) to learn structural representations of entities. Regarding the visual information, we generate image embeddings using the DenseNet model, which are also projected into hyperbolic space using HGCNs. Finally, we combine the structural and visual representations in hyperbolic space and use the aggregated embeddings to predict potential alignment results. Extensive experiments and ablation studies demonstrate the effectiveness of our proposed model and its components.

Artificial Intelligence

Learning stochastic decision trees

67 - Guy Blanc , Jane Lange , Li-Yang Tan 2021

We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an $\eta$-corrupted set of uniform random samples labeled by a size-$s$ stochastic decision tree, our algorithm runs in time $n^{O(\log(s/\varepsilon)/\varepsilon^2)}$ and returns a hypothesis with error within an additive $2\eta + \varepsilon$ of the Bayes optimal. An additive $2\eta$ is the information-theoretic minimum. Previously no non-trivial algorithm with a guarantee of $O(\eta) + \varepsilon$ was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.

Machine Learning, Data Structures and Algorithms

End-to-End Mandarin Tone Classification with Short Term Context Information

183 - Jiyang Tang , Ming Li 2021

In this paper, we propose an end-to-end Mandarin tone classification method for continuous speech utterances that uses both the spectrogram and short-term context information as input. Both spectrograms and context segment features are used to train the tone classifier. We first divide the spectrogram frames into syllable segments using forced-alignment results produced by an ASR model. Then we extract short-term segment features to capture the context information across multiple syllables. Feeding both the spectrogram and the short-term context segment features into an end-to-end model significantly improves performance. Experiments are performed on a large-scale open-source Mandarin speech dataset to evaluate the proposed method. Results show that this method improves the classification accuracy from 79.5% to 92.6% on the AISHELL3 database.

Computer Sound Systems, Machine Learning, Audio and Speech Processing

OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations

142 - Yang Tan , Yang Li , Shao-Lun Huang 2021

Transfer learning across heterogeneous data distributions (a.k.a. domains) and distinct tasks is a more general and challenging problem than conventional transfer learning, where either domains or tasks are assumed to be the same. While neural-network-based feature transfer is widely used in transfer learning applications, finding the optimal transfer strategy still requires time-consuming experiments and domain knowledge. We propose a transferability metric called Optimal Transport based Conditional Entropy (OTCE), to analytically predict the transfer performance for supervised classification tasks in such cross-domain and cross-task feature transfer settings. Our OTCE score characterizes transferability as a combination of domain difference and task difference, and explicitly evaluates them from data in a unified framework. Specifically, we use optimal transport to estimate domain difference and the optimal coupling between source and target distributions, which is then used to derive the conditional entropy of the target task (task difference). Experiments on the largest cross-domain dataset DomainNet and Office31 demonstrate that OTCE shows an average of 21% gain in the correlation with the ground truth transfer accuracy compared to state-of-the-art methods. We also investigate two applications of the OTCE score including source model selection and multi-source feature fusion.
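The task-difference step described above, deriving a conditional entropy from a coupling between source and target samples, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `conditional_entropy`, the list-of-lists coupling format, and the toy labels are all hypothetical, and the coupling is taken as given rather than computed by an optimal transport solver.

```python
import math
from collections import defaultdict


def conditional_entropy(coupling, source_labels, target_labels):
    """Return H(Y_t | Y_s) in nats from a sample-level coupling.

    coupling[i][j] is the mass assigned to the pair (source sample i,
    target sample j); all entries are assumed to sum to 1.
    """
    joint = defaultdict(float)     # estimate of P(y_s, y_t)
    marginal = defaultdict(float)  # estimate of P(y_s)
    for i, row in enumerate(coupling):
        for j, mass in enumerate(row):
            joint[(source_labels[i], target_labels[j])] += mass
            marginal[source_labels[i]] += mass

    entropy = 0.0
    for (y_s, _y_t), p in joint.items():
        if p > 0.0:  # skip empty label pairs, since 0 * log 0 is taken as 0
            entropy -= p * math.log(p / marginal[y_s])
    return entropy


# Toy example (hypothetical data): two source and two target samples.
aligned = [[0.5, 0.0], [0.0, 0.5]]      # each source label maps to one target label
uniform = [[0.25, 0.25], [0.25, 0.25]]  # source labels say nothing about target labels
h_aligned = conditional_entropy(aligned, [0, 1], ["a", "b"])
h_uniform = conditional_entropy(uniform, [0, 1], ["a", "b"])
```

A lower value means the source labels are more informative about the target labels under the coupling, which is why a score built from the negative conditional entropy is higher for better-matched tasks.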

Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition

Fooling Gaussian PTFs via Local Hyperconcentration

105 - Ryan O'Donnell , Rocco A. Servedio , Li-Yang Tan 2021

We give a pseudorandom generator that fools degree-$d$ polynomial threshold functions over $n$-dimensional Gaussian space with seed length $\mathrm{poly}(d)\cdot \log n$. All previous generators had a seed length with at least a $2^d$ dependence on $d$. The key new ingredient is a Local Hyperconcentration Theorem, which shows that every degree-$d$ Gaussian polynomial is hyperconcentrated almost everywhere at scale $d^{-O(1)}$.

Computational Complexity

Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds

76 - Yihao Feng , Ziyang Tang , Na Zhang 2021

Off-policy evaluation (OPE) is the task of estimating the expected reward of a given policy based on offline data previously collected under different policies. Therefore, OPE is a key step in applying reinforcement learning to real-world domains such as medical treatment, where interactive data collection is expensive or even unsafe. As the observed data tends to be noisy and limited, it is essential to provide rigorous uncertainty quantification, not just a point estimate, when applying OPE to make high-stakes decisions. This work considers the problem of constructing non-asymptotic confidence intervals in infinite-horizon off-policy evaluation, which remains a challenging open question. We develop a practical algorithm through a primal-dual optimization-based approach, which leverages the kernel Bellman loss (KBL) of Feng et al. (2019) and a new martingale concentration inequality of KBL applicable to time-dependent data with unknown mixing conditions. Our algorithm makes minimal assumptions on the data and the function class of the Q-function, and works for behavior-agnostic settings where the data is collected under a mix of arbitrary unknown behavior policies. We present empirical results that clearly demonstrate the advantages of our approach over existing methods.

Machine Learning
