أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Ziyun Wang

Cross-lingual Text Classification with Heterogeneous Graph Neural Network

269 - Ziyun Wang , Xuan Liu , Peiji Yang 2021

Cross-lingual text classification aims at training a classifier on the source language and transferring the knowledge to target languages, which is very useful for low-resource languages. Recent multilingual pretrained language models (mPLM) achieve impressive results in cross-lingual classification tasks, but rarely consider factors beyond semantic similarity, causing performance degradation between some language pairs. In this paper we propose a simple yet effective method to incorporate heterogeneous information within and across languages for cross-lingual text classification using graph convolutional networks (GCN). In particular, we construct a heterogeneous graph by treating documents and words as nodes, and linking nodes with different relations, which include part-of-speech roles, semantic similarity, and document translations. Extensive experiments show that our graph-based method significantly outperforms state-of-the-art models on all tasks, and also achieves consistent performance gain over baselines in low-resource settings where external tools like translators are unavailable.

الحساب واللغة

Learning to Generate Cost-to-Go Functions for Efficient Motion Planning

151 - Jinwook Huh , Galen Xing , Ziyun Wang 2020

Traditional motion planning is computationally burdensome for practical robots, involving extensive collision checking and considerable iterative propagation of cost values. We present a novel neural network architecture which can directly generate t he cost-to-go (c2g) function for a given configuration space and a goal configuration. The output of the network is a continuous function whose gradient in configuration space can be directly used to generate trajectories in motion planning without the need for protracted iterations or extensive collision checking. This higher order function (i.e. a function generating another function) representation lies at the core of our motion planning architecture, c2g-HOF, which can take a workspace as input, and generate the cost-to-go function over the configuration space map (C-map). Simulation results for 2D and 3D environments show that c2g-HOF can be orders of magnitude faster at execution time than methods which explore the configuration space during execution. We also present an implementation of c2g-HOF which generates trajectories for robot manipulators directly from an overhead image of the workspace.

علم الروبوتات

Geodesic-HOF: 3D Reconstruction Without Cutting Corners

174 - Ziyun Wang , Eric A. Mitchell , Volkan Isler 2020

Single-view 3D object reconstruction is a challenging fundamental problem in computer vision, largely due to the morphological diversity of objects in the natural world. In particular, high curvature regions are not always captured effectively by met hods trained using only set-based loss functions, resulting in reconstructions short-circuiting the surface or cutting corners. In particular, high curvature regions are not always captured effectively by methods trained using only set-based loss functions, resulting in reconstructions short-circuiting the surface or cutting corners. To address this issue, we propose learning an image-conditioned mapping function from a canonical sampling domain to a high dimensional space where the Euclidean distance is equal to the geodesic distance on the object. The first three dimensions of a mapped sample correspond to its 3D coordinates. The additional lifted components contain information about the underlying geodesic structure. Our results show that taking advantage of these learned lifted coordinates yields better performance for estimating surface normals and generating surfaces than using point cloud reconstructions alone. Further, we find that this learned geodesic embedding space provides useful information for applications such as unsupervised object decomposition.

الرؤية الحاسوبية وتمييز الأنماط

Surface HOF: Surface Reconstruction from a Single Image Using Higher Order Function Networks

123 - Ziyun Wang , Volkan Isler , Daniel D. Lee 2019

We address the problem of generating a high-resolution surface reconstruction from a single image. Our approach is to learn a Higher Order Function (HOF) which takes an image of an object as input and generates a mapping function. The mapping functio n takes samples from a canonical domain (e.g. the unit sphere) and maps each sample to a local tangent plane on the 3D reconstruction of the object. Each tangent plane is represented as an origin point and a normal vector at that point. By efficiently learning a continuous mapping function, the surface can be generated at arbitrary resolution in contrast to other methods which generate fixed resolution outputs. We present the Surface HOF in which both the higher order function and the mapping function are represented as neural networks, and train the networks to generate reconstructions of PointNet objects. Experiments show that Surface HOF is more accurate and uses more efficient representations than other state of the art methods for surface reconstruction. Surface HOF is also easier to train: it requires minimal input pre-processing and output post-processing and generates surface representations that are more parameter efficient. Its accuracy and convenience make Surface HOF an appealing method for single image reconstruction.

الرؤية الحاسوبية وتمييز الأنماط

Motion Equivariance OF Event-based Camera Data with the Temporal Normalization Transform

61 - Ziyun Wang 2019

In this work, we focus on using convolution neural networks (CNN) to perform object recognition on the event data. In object recognition, it is important for a neural network to be robust to the variations of the data during testing. For traditional cameras, translations are well handled because CNNs are naturally equivariant to translations. However, because event cameras record the change of light intensity of an image, the geometric shape of event volumes will not only depend on the objects but also on their relative motions with respect to the camera. The deformation of the events caused by motions causes the CNN to be less robust to unseen motions during inference. To address this problem, we would like to explore the equivariance property of CNNs, a well-studied area that demonstrates to produce predictable deformation of features under certain transformations of the input image.

الرؤية الحاسوبية وتمييز الأنماط معالجة الصور والفيديو

Modeling question asking using neural program generation

102 - Ziyun Wang , Brenden M. Lake 2019

People ask questions that are far richer, more informative, and more creative than current AI systems. We propose a neuro-symbolic framework for modeling human question asking, which represents questions as formal programs and generates programs with an encoder-decoder based deep neural network. From extensive experiments using an information-search game, we show that our method can predict which questions humans are likely to ask in unconstrained settings. We also propose a novel grammar-based question generation framework trained with reinforcement learning, which is able to generate creative questions without supervised human data.

الحساب واللغة

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد