Research papers and master's and doctoral theses published by Yi Cheng

Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud

116 - Michaela Hardt , Xiaoguang Chen , Xiaoyi Cheng 2021

Understanding the predictions made by machine learning (ML) models and their potential biases remains a challenging and labor-intensive task that depends on the application, the dataset, and the specific model. We present Amazon SageMaker Clarify, an explainability feature for Amazon SageMaker that launched in December 2020, providing insights into data and ML models by identifying biases and explaining predictions. It is deeply integrated into Amazon SageMaker, a fully managed service that enables data scientists and developers to build, train, and deploy ML models at any scale. Clarify supports bias detection and feature importance computation across the ML lifecycle, during data preparation, model evaluation, and post-deployment monitoring. We outline the desiderata derived from customer input, the modular architecture, and the methodology for bias and explanation computations. Further, we describe the technical challenges encountered and the tradeoffs we had to make. For illustration, we discuss two customer use cases. We present our deployment results including qualitative customer feedback and a quantitative evaluation. Finally, we summarize lessons learned, and discuss best practices for the successful adoption of fairness and explanation tools in practice.

Machine Learning

Curvature-free linear length bounds on geodesics in closed Riemannian surfaces

87 - Herng Yi Cheng 2021

This paper proves that in any closed Riemannian surface $M$ with diameter $d$, the length of the $k^\text{th}$-shortest geodesic between two given points $p$ and $q$ is at most $8kd$. This bound can be tightened further to $6kd$ if $p = q$. This improves prior estimates by A. Nabutovsky and R. Rotman.

Differential Geometry

Learning to Adversarially Blur Visual Object Tracking

125 - Qing Guo , Ziyi Cheng , Felix Juefei-Xu 2021

Motion blur caused by the moving of the object or camera during the exposure can be a key challenge for visual object tracking, affecting tracking accuracy significantly. In this work, we explore the robustness of visual object trackers against motion blur from a new angle, i.e., adversarial blur attack (ABA). Our main objective is to online transfer input frames to their natural motion-blurred counterparts while misleading the state-of-the-art trackers during the tracking process. To this end, we first design the motion blur synthesizing method for visual tracking based on the generation principle of motion blur, considering the motion information and the light accumulation process. With this synthetic method, we propose optimization-based ABA (OP-ABA) by iteratively optimizing an adversarial objective function against the tracking w.r.t. the motion and light accumulation parameters. The OP-ABA is able to produce natural adversarial examples but the iteration can cause heavy time cost, making it unsuitable for attacking real-time trackers. To alleviate this issue, we further propose one-step ABA (OS-ABA) where we design and train a joint adversarial motion and accumulation predictive network (JAMANet) with the guidance of OP-ABA, which is able to efficiently estimate the adversarial motion and accumulation parameters in a one-step way. The experiments on four popular datasets (e.g., OTB100, VOT2018, UAV123, and LaSOT) demonstrate that our methods are able to cause significant accuracy drops on four state-of-the-art trackers with high transferability. Please find the source code at https://github.com/tsingqguo/ABA

Computer Vision and Pattern Recognition; Artificial Intelligence

ScaleHLS: Scalable High-Level Synthesis through MLIR

132 - Hanchen Ye , Cong Hao , Jianyi Cheng 2021

High-level Synthesis (HLS) has been widely adopted as it significantly improves the hardware design productivity and enables efficient design space exploration (DSE). HLS tools can be used to deliver solutions for many different kinds of design problems, which are often better solved with different levels of abstraction. While existing HLS tools are built using compiler infrastructures largely based on a single-level abstraction (e.g., LLVM), we propose ScaleHLS, a next-generation HLS compilation flow, on top of a multi-level compiler infrastructure called MLIR, for the first time. By using an intermediate representation (IR) that can be better tuned to particular algorithms at different representation levels, we are able to build this new HLS tool that is more scalable and customizable towards various applications coming with intrinsic structural or functional hierarchies. ScaleHLS is able to represent and optimize HLS designs at multiple levels of abstraction and provides an HLS-dedicated transform and analysis library to solve the optimization problems at the suitable representation levels. On top of the library, we also build an automated DSE engine to explore the multi-dimensional design space efficiently. In addition, we develop an HLS C front-end and a C/C++ emission back-end to translate HLS designs into/from MLIR for enabling the end-to-end ScaleHLS flow. Experimental results show that, comparing to the baseline designs only optimized by Xilinx Vivado HLS, ScaleHLS improves the performances with amazing quality-of-results -- up to 768.1x better on computation kernel level programs and up to 3825.0x better on neural network models.

Programming Languages; Hardware Architecture

Structural Design Recommendations in the Early Design Phase using Machine Learning

197 - Spyridon Ampanavos , Mehdi Nourbakhsh , Chin-Yi Cheng 2021

Structural engineering knowledge can be of significant importance to the architectural design team during the early design phase. However, architects and engineers do not typically work together during the conceptual phase; in fact, structural engineers are often called late into the process. As a result, updates in the design are more difficult and time-consuming to complete. At the same time, there is a lost opportunity for better design exploration guided by structural feedback. In general, the earlier in the design process the iteration happens, the greater the benefits in cost efficiency and informed design exploration, which can lead to higher-quality creative results. In order to facilitate an informed exploration in the early design stage, we suggest the automation of fundamental structural engineering tasks and introduce ApproxiFramer, a Machine Learning-based system for the automatic generation of structural layouts from building plan sketches in real-time. The system aims to assist architects by presenting them with feasible structural solutions during the conceptual phase so that they proceed with their design with adequate knowledge of its structural implications. In this paper, we describe the system and evaluate the performance of a proof-of-concept implementation in the domain of orthogonal, metal, rigid structures. We trained a Convolutional Neural Net to iteratively generate structural design solutions for sketch-level building plans using a synthetic dataset and achieved an average error of 2.2% in the predicted positions of the columns.

Machine Learning; Computational Engineering, Finance, and Science

Contact Mode Guided Motion Planning for Quasidynamic Dexterous Manipulation in 3D

146 - Xianyi Cheng , Eric Huang , Yifan Hou 2021

This paper presents Contact Mode Guided Manipulation Planning (CMGMP) for general 3D quasistatic and quasidynamic rigid body motion planning in dexterous manipulation. The CMGMP algorithm generates hybrid motion plans including both continuous state transitions and discrete contact mode switches, without the need for pre-specified contact sequences or pre-designed motion primitives. The key idea is to use automatically enumerated contact modes to guide the tree expansions during the search. Contact modes automatically synthesize manipulation primitives, while the sampling-based planning framework sequences those primitives into a coherent plan. We test our algorithm on many simulated 3D manipulation tasks, and validate our models by executing the plans open-loop on a real robot-manipulator system.

Robotics

Guiding the Growth: Difficulty-Controllable Question Generation through Step-by-Step Rewriting

86 - Yi Cheng , Siyao Li , Bang Liu 2021

This paper explores the task of Difficulty-Controllable Question Generation (DCQG), which aims at generating questions with required difficulty levels. Previous research on this task mainly defines the difficulty of a question as whether it can be correctly answered by a Question Answering (QA) system, lacking interpretability and controllability. In our work, we redefine question difficulty as the number of inference steps required to answer it and argue that Question Generation (QG) systems should have stronger control over the logic of generated questions. To this end, we propose a novel framework that progressively increases question difficulty through step-by-step rewriting under the guidance of an extracted reasoning chain. A dataset is automatically constructed to facilitate the research, on which extensive experiments are conducted to test the performance of our method.

Computation and Language; Artificial Intelligence

Building-GAN: Graph-Conditioned Architectural Volumetric Design Generation

128 - Kai-Hung Chang , Chin-Yi Cheng , Jieliang Luo 2021

Volumetric design is the first and critical step for professional building design, where architects not only depict the rough 3D geometry of the building but also specify the programs to form a 2D layout on each floor. Though 2D layout generation for a single story has been widely studied, there is no developed method for multi-story buildings. This paper focuses on volumetric design generation conditioned on an input program graph. Instead of outputting dense 3D voxels, we propose a new 3D representation named voxel graph that is both compact and expressive for building geometries. Our generator is a cross-modal graph neural network that uses a pointer mechanism to connect the input program graph and the output voxel graph, and the whole pipeline is trained using the adversarial framework. The generated designs are evaluated qualitatively by a user study and quantitatively using three metrics: quality, diversity, and connectivity accuracy. We show that our model generates realistic 3D volumetric designs and outperforms previous methods and baselines.

Machine Learning

DeepMix: Online Auto Data Augmentation for Robust Visual Object Tracking

135 - Ziyi Cheng , Xuhong Ren , Felix Juefei-Xu 2021

Online updating of the object model via samples from historical frames is of great importance for accurate visual object tracking. Recent works mainly focus on constructing effective and efficient updating methods while neglecting the training samples for learning discriminative object models, which is also a key part of a learning problem. In this paper, we propose the DeepMix that takes historical samples embeddings as input and generates augmented embeddings online, enhancing the state-of-the-art online learning methods for visual object tracking. More specifically, we first propose the online data augmentation for tracking that online augments the historical samples through object-aware filtering. Then, we propose MixNet which is an offline trained network for performing online data augmentation within one-step, enhancing the tracking accuracy while preserving high speeds of the state-of-the-art online learning methods. The extensive experiments on three different tracking frameworks, i.e., DiMP, DSiam, and SiamRPN++, and three large-scale and challenging datasets, i.e., OTB-2015, LaSOT, and VOT, demonstrate the effectiveness and advantages of the proposed method.

Computer Vision and Pattern Recognition

Dual-View Distilled BERT for Sentence Embedding

74 - Xingyi Cheng 2021

Recently, BERT realized significant progress for sentence matching via word-level cross sentence attention. However, the performance significantly drops when using siamese BERT-networks to derive two sentence embeddings, which fall short in capturing the global semantic since the word-level attention between two sentences is absent. In this paper, we propose a Dual-view distilled BERT (DvBERT) for sentence matching with sentence embeddings. Our method deals with a sentence pair from two distinct views, i.e., Siamese View and Interaction View. Siamese View is the backbone where we generate sentence embeddings. Interaction View integrates the cross sentence interaction as multiple teachers to boost the representation ability of sentence embeddings. Experiments on six STS tasks show that our method outperforms the state-of-the-art sentence embedding methods significantly.

Artificial Intelligence; Computation and Language
