Research papers and master's and doctoral theses published by UK

Phrase Retrieval Learns Passage Retrieval, Too

276 - Jinhyuk Lee, Alexander Wettig, Danqi Chen 2021

Dense retrieval methods have shown great promise over sparse retrieval methods in a range of NLP problems. Among them, dense phrase retrieval, the most fine-grained retrieval unit, is appealing because phrases can be directly used as the output for question answering and slot filling tasks. In this work, we follow the intuition that retrieving phrases naturally entails retrieving larger text blocks and study whether phrase retrieval can serve as the basis for coarse-level retrieval including passages and documents. We first observe that a dense phrase-retrieval system, without any retraining, already achieves better passage retrieval accuracy (+3-5% in top-5 accuracy) compared to passage retrievers, which also helps achieve superior end-to-end QA performance with fewer passages. Then, we provide an interpretation for why phrase-level supervision helps learn better fine-grained entailment compared to passage-level supervision, and also show that phrase retrieval can be improved to achieve competitive performance in document-retrieval tasks such as entity linking and knowledge-grounded dialogue. Finally, we demonstrate how phrase filtering and vector quantization can reduce the size of our index by 4-10x, making dense phrase retrieval a practical and versatile solution in multi-granularity retrieval.
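The intuition that retrieving phrases entails retrieving larger text blocks can be illustrated with a small sketch. It assumes each phrase belongs to exactly one passage and scores a passage by its best-scoring phrase; the function name, the identifiers and the toy scores are hypothetical, and the abstract does not specify the aggregation rule the authors use.

```python
from typing import Dict, List, Tuple


def passages_from_phrases(
    phrase_hits: List[Tuple[str, float]],
    phrase_to_passage: Dict[str, str],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Rank passages by the best score among their retrieved phrases.

    Assumption for illustration: max-aggregation over phrase scores.
    """
    best: Dict[str, float] = {}
    for phrase_id, score in phrase_hits:
        passage_id = phrase_to_passage[phrase_id]
        # Keep only the highest phrase score seen for each passage.
        if passage_id not in best or score > best[passage_id]:
            best[passage_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical phrase-level results for one query.
hits = [("ph1", 0.9), ("ph2", 0.8), ("ph3", 0.7), ("ph4", 0.4)]
mapping = {
    "ph1": "passage_A",
    "ph2": "passage_A",
    "ph3": "passage_B",
    "ph4": "passage_C",
}
top_passages = passages_from_phrases(hits, mapping, top_k=2)
print(top_passages)
```

Because two of the four phrase hits fall in the same passage, the phrase list collapses to three distinct passages, which is why a phrase index can serve passage-level queries without retraining.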

Computation and Language, Information Retrieval

A Machine Learning Framework for Automatic Prediction of Human Semen Motility

249 - Sandra Ottl, Maurice Gerczuk, Shahin Amiriparian 2021

In the field of reproductive health, a vital aspect for the detection of male fertility issues is the analysis of human semen quality. Two factors of importance are the morphology and motility of the sperm cells. While the former describes defects in different parts of a spermatozoon, the latter measures the efficient movement of cells. For many non-human species, so-called Computer-Aided Sperm Analysis systems work well for assessing these characteristics from microscopic video recordings but struggle with human sperm samples which generally show higher degrees of debris and dead spermatozoa, as well as lower overall sperm motility. Here, machine learning methods that harness large amounts of training data to extract salient features could support physicians with the detection of fertility issues or in vitro fertilisation procedures. In this work, the overall motility of given sperm samples is predicted with the help of a machine learning framework integrating unsupervised methods for feature extraction with downstream regression models. The models evaluated herein improve on the state-of-the-art for video-based sperm-motility prediction.

Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition

Real Time Monocular Vehicle Velocity Estimation using Synthetic Data

115 - Robert McCraith, Lukas Neumann, Andrea Vedaldi 2021

Vision is one of the primary sensing modalities in autonomous driving. In this paper we look at the problem of estimating the velocity of road vehicles from a camera mounted on a moving car. Contrary to prior methods that train end-to-end deep networks that estimate the vehicle's velocity from the video pixels, we propose a two-step approach where first an off-the-shelf tracker is used to extract vehicle bounding boxes and then a small neural network is used to regress the vehicle velocity from the tracked bounding boxes. Surprisingly, we find that this still achieves state-of-the-art estimation performance with the significant benefit of separating perception from dynamics estimation via a clean, interpretable and verifiable interface which allows us to distill the statistics which are crucial for velocity estimation. We show that the latter can be used to easily generate synthetic training data in the space of bounding boxes and use this to improve the performance of our method further.

Computer Vision and Pattern Recognition

Lifting 2D Object Locations to 3D by Discounting LiDAR Outliers across Objects and Views

124 - Robert McCraith, Eldar Insafudinov, Lukas Neumann 2021

We present a system for automatically converting 2D mask object predictions and raw LiDAR point clouds into full 3D bounding boxes of objects. Because the LiDAR point clouds are partial, directly fitting bounding boxes to the point clouds is meaningless. Instead, we suggest that obtaining good results requires sharing information between all objects in the dataset jointly, over multiple frames. We then make three improvements to the baseline. First, we address ambiguities in predicting the object rotations via direct optimization in this space while still backpropagating rotation prediction through the model. Second, we explicitly model outliers and task the network with learning their typical patterns, thus better discounting them. Third, we enforce temporal consistency when video data is available. With these contributions, our method significantly outperforms previous work despite the fact that those methods use significantly more complex pipelines, 3D models and additional human-annotated external sources of prior information.

Computer Vision and Pattern Recognition, Machine Learning

Small PCSPs that reduce to large CSPs

145 - Alexandr Kazda, Peter Mayr, Dmitriy Zhuk 2021

For relational structures A, B of the same signature, the Promise Constraint Satisfaction Problem PCSP(A,B) asks whether a given input structure maps homomorphically to A or does not even map to B. We are promised that the input satisfies exactly one of these two cases. If there exists a structure C with homomorphisms $A \to C \to B$, then PCSP(A,B) reduces naturally to CSP(C). To the best of our knowledge all known tractable PCSPs reduce to tractable CSPs in this way. However Barto showed that some PCSPs over finite structures A, B require solving CSPs over infinite C. We show that even when such a reduction to finite C is possible, this structure may become arbitrarily large. For every integer $n>1$ and every prime p we give A, B of size n with a single relation of arity $n^p$ such that PCSP(A, B) reduces via a chain of homomorphisms $A \to C \to B$ to a tractable CSP over some C of size p but not over any smaller structure. In a second family of examples, for every prime $p \geq 7$ we construct A, B of size $p-1$ with a single ternary relation such that PCSP(A, B) reduces via $A \to C \to B$ to a tractable CSP over some C of size p but not over any smaller structure. In contrast we show that if A, B are graphs and PCSP(A,B) reduces to tractable CSP(C) for some finite C, then already A or B has tractable CSP. This extends results and answers a question of Deng et al.
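The chain of homomorphisms $A \to C \to B$ can be made concrete on small undirected graphs. The sketch below is a brute-force search, not a method from the paper; the graphs (a 7-cycle, a 5-cycle and a triangle) and all function names are illustrative assumptions, and nothing here says anything about tractability.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, int]


def cycle(n: int) -> List[Edge]:
    """Undirected n-cycle, stored as edges in both directions."""
    forward = [(i, (i + 1) % n) for i in range(n)]
    return forward + [(v, u) for u, v in forward]


def is_homomorphism(mapping: Dict[int, int], src: List[Edge], dst: List[Edge]) -> bool:
    """Check that every source edge is sent to a destination edge."""
    dst_set = set(dst)
    return all((mapping[u], mapping[v]) in dst_set for u, v in src)


def find_homomorphism(
    src_size: int, src: List[Edge], dst_size: int, dst: List[Edge]
) -> Optional[Dict[int, int]]:
    """Try every vertex map; exponential, so only for tiny examples."""
    for images in product(range(dst_size), repeat=src_size):
        mapping = dict(enumerate(images))
        if is_homomorphism(mapping, src, dst):
            return mapping
    return None


# Illustrative structures: A = C7, C = C5, B = K3 (K3 is the 3-cycle).
A, C, B = cycle(7), cycle(5), cycle(3)
a_to_c = find_homomorphism(7, A, 5, C)
c_to_b = find_homomorphism(5, C, 3, B)

# Composing the two maps gives a homomorphism A -> B directly.
a_to_b = {a: c_to_b[a_to_c[a]] for a in range(7)}

# In the other direction no homomorphism exists: K3 does not map to C5.
b_to_c = find_homomorphism(3, B, 5, C)
print(a_to_c, c_to_b, a_to_b, b_to_c)
```

Any input that maps to A also maps to C by composition, and any input that maps to C also maps to B, which is why deciding CSP(C) answers the promise problem.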

Computational Complexity, Logic in Computer Science, Rings and Algebras

An application of spectral localization to critical SQG on a ball

153 - Tsukasa Iwabuchi 2021

We study the Cauchy problem for the quasi-geostrophic equations in a unit ball of the two-dimensional space with the homogeneous Dirichlet boundary condition. We show the existence and uniqueness of the strong solution in the framework of Besov spaces. We establish a spectral localization technique and commutator estimates.

Analysis of PDEs

Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning

84 - Shikha Dubey, Farrukh Olimov, Muhammad Aasim Rafique 2021

Automatic transcription of scene understanding in images and videos is a step towards artificial general intelligence. Image captioning is a nomenclature for describing meaningful information in an image using computer vision techniques. Automated image captioning techniques utilize encoder and decoder architecture, where the encoder extracts features from an image and the decoder generates a transcript. In this work, we investigate two unexplored ideas for image captioning using transformers: First, we demonstrate the enforcement of using objects' relevance in the surrounding environment. Second, we learn an explicit association between labels and language constructs. We propose a label-attention Transformer with geometrically coherent objects (LATGeO). The proposed technique acquires a proposal of geometrically coherent objects using a deep neural network (DNN) and generates captions by investigating their relationships using a label-attention module. Object coherence is defined using the localized ratio of the geometrical properties of the proposals. The label-attention module associates the extracted object classes to the available dictionary using self-attention layers. The experimental results show that objects' relevance in their surroundings and the binding of their visual features with their geometrically localized ratios, combined with their associated labels, help in defining meaningful captions. The proposed framework is tested on the MSCOCO dataset, and a thorough evaluation resulting in overall better quantitative scores pronounces its superiority.

Computer Vision and Pattern Recognition, Artificial Intelligence

Secure Transmission for Hierarchical Information Accessibility in Downlink MU-MIMO

211 - Kanguk Lee, Jinseok Choi, Dong Ku Kim 2021

Physical layer security is a useful tool to protect confidential information from wiretapping. In this paper, we consider a generalized model of conventional physical layer security, referred to as hierarchical information accessibility (HIA). A main feature of the HIA model is that a network has a hierarchy in information accessibility, wherein decoding feasibility is determined by a priority of users. Under this HIA model, we formulate a sum secrecy rate maximization problem with regard to precoding vectors. This problem is challenging since multiple non-smooth functions are involved in the secrecy rate to fulfill the HIA conditions and also the problem is non-convex. To address the challenges, we approximate the minimum function by using the LogSumExp technique, and thereafter obtain the first-order optimality condition. One key observation is that the derived condition is cast as a functional eigenvalue problem, where the eigenvalue is equivalent to the approximated objective function of the formulated problem. Accordingly, we show that finding a principal eigenvector is equivalent to finding a local optimal solution. To this end, we develop a novel method called generalized power iteration for HIA (GPI-HIA). Simulations demonstrate that the GPI-HIA significantly outperforms other baseline methods in terms of the secrecy rate.
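The LogSumExp approximation of a minimum can be sketched in a few lines. The form below, $\min_i x_i \approx -\frac{1}{\alpha}\log\sum_i e^{-\alpha x_i}$, is the standard smooth minimum; the abstract does not give the exact expression or the value of the sharpness parameter used by the authors, so `alpha` and the sample values are assumptions for illustration.

```python
import math
from typing import Sequence


def smooth_min(values: Sequence[float], alpha: float) -> float:
    """Smooth lower approximation of min(values) via LogSumExp.

    Computes -(1/alpha) * log(sum(exp(-alpha * v))) in a numerically
    stable way by factoring out the true minimum first.
    """
    m = min(values)
    total = sum(math.exp(-alpha * (v - m)) for v in values)
    return m - math.log(total) / alpha


# Hypothetical per-user rates; a larger alpha gives a tighter approximation.
rates = [1.0, 2.0, 3.5]
for alpha in (1.0, 10.0, 100.0):
    print(alpha, smooth_min(rates, alpha))
```

The approximation never exceeds the true minimum and is at most $\log(n)/\alpha$ below it for $n$ values, so it is differentiable everywhere while staying close to the non-smooth function it replaces.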

Signal Processing, Information Theory

Rota-Baxter $C^{*}$-algebras

162 - Zhonghua Li, Shukun Wang 2021

This paper introduces the notion of Rota-Baxter $C^{*}$-algebras. Here a Rota-Baxter $C^{*}$-algebra is a $C^{*}$-algebra with a Rota-Baxter operator. Symmetric Rota-Baxter operators, as special cases of Rota-Baxter operators on $C^{*}$-algebras, are defined and studied. A theorem on Rota-Baxter operators on concrete $C^{*}$-algebras is given, deriving the relationship between two kinds of Rota-Baxter algebras. As a corollary, some connection between $*$-representations and Rota-Baxter operators is given. The notion of representations of Rota-Baxter $C^{*}$-algebras is constructed, and a theorem on representations of direct sums of Rota-Baxter representations is derived. Finally, using Rota-Baxter operators, the notion of quasidiagonal operators on $C^{*}$-algebras is reconstructed.

Operator Algebras, Rings and Algebras

R-PCC: A Baseline for Range Image-based Point Cloud Compression

82 - Sukai Wang, Jianhao Jiao, Peide Cai 2021

In autonomous vehicles or robots, point clouds from LiDAR can provide accurate depth information of objects compared with 2D images, but they also suffer from a large volume of data, which is inconvenient for data storage or transmission. In this paper, we propose a Range image-based Point Cloud Compression method, R-PCC, which can reconstruct the point cloud with uniform or non-uniform accuracy loss. We segment the original large-scale point cloud into small and compact regions for spatial redundancy and salient region classification. Compared with other voxel-based or image-based compression methods, our method can keep and align all points from the original point cloud in the reconstructed point cloud. It can also control the maximum reconstruction error for each point through a quantization module. In the experiments, we show that our easier FPS-based segmentation method can achieve better performance than instance-based segmentation methods such as DBSCAN. To verify the advantages of our proposed method, we evaluate the reconstruction quality and fidelity for 3D object detection and SLAM, as the downstream tasks. The experimental results show that our elegant framework can achieve a $30\times$ compression ratio without affecting downstream tasks, and our non-uniform compression framework shows a great improvement on the downstream tasks compared with the state-of-the-art large-scale point cloud compression methods. Our real-time method is efficient and effective enough to act as a baseline for range image-based point cloud compression. The code is available on https://github.com/StevenWang30/R-PCC.git.
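How a quantization step can cap the per-point reconstruction error is easy to show on range values alone. The sketch below is not taken from the R-PCC repository: it assumes plain uniform quantization with a step of twice the error budget, and the function names, the sample points and the 2 cm budget are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def to_range(point: Point) -> float:
    """Distance from the sensor origin, i.e. one range-image pixel value."""
    x, y, z = point
    return math.sqrt(x * x + y * y + z * z)


def quantize(ranges: List[float], max_error: float) -> List[int]:
    """Uniform quantization with step 2 * max_error (an assumed scheme)."""
    step = 2.0 * max_error
    return [round(r / step) for r in ranges]


def dequantize(codes: List[int], max_error: float) -> List[float]:
    """Map integer codes back to range values on the quantization grid."""
    step = 2.0 * max_error
    return [c * step for c in codes]


# Hypothetical LiDAR returns in metres and a 2 cm error budget.
points = [(1.0, 2.0, 2.0), (3.0, 4.0, 0.0), (10.03, 0.0, 0.0)]
budget = 0.02
original = [to_range(p) for p in points]
codes = quantize(original, budget)
restored = dequantize(codes, budget)
errors = [abs(a - b) for a, b in zip(original, restored)]
print(codes, errors)
```

Rounding to the nearest multiple of the step moves each range by at most half a step, so every restored value stays within the chosen budget while the integer codes are much cheaper to entropy-code than floating-point ranges.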

Robotics
