We introduce DatasetGAN: an automatic procedure to generate massive datasets of high-quality semantically segmented images requiring minimal human effort. Current deep networks are extremely data-hungry, benefiting from training on large-scale datasets, which are time-consuming to annotate. Our method relies on the power of recent GANs to generate realistic images. We show how the GAN latent code can be decoded to produce a semantic segmentation of the image. Training the decoder only needs a few labeled examples to generalize to the rest of the latent space, resulting in an infinite annotated dataset generator! These generated datasets can then be used for training any computer vision architecture just as real datasets are. As only a few images need to be manually segmented, it becomes possible to annotate images in extreme detail and generate datasets with rich object and part segmentations. To showcase the power of our approach, we generated datasets for 7 image segmentation tasks which include pixel-level labels for 34 human face parts and 32 car parts. Our approach outperforms all semi-supervised baselines significantly and is on par with fully supervised methods, which in some cases require as much as 100x more annotated data than our method.
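To make the decoding idea concrete, the sketch below illustrates one way such a pipeline could look: per-pixel features collected from a pretrained GAN generator are fed to a small label decoder, which is then used to synthesize (image, mask) pairs. This is a minimal sketch under assumptions, not the authors' implementation; the `gan(z)` interface returning an image plus intermediate feature maps, and names like `PixelLabelDecoder` and `generate_labeled_dataset`, are hypothetical.

```python
# Minimal sketch of the DatasetGAN idea. Assumes a hypothetical `gan(z)`
# that returns a synthesized image and its intermediate feature maps
# (e.g., from a pretrained StyleGAN-like generator).
import torch
import torch.nn as nn
import torch.nn.functional as F


class PixelLabelDecoder(nn.Module):
    """Small MLP mapping a per-pixel GAN feature vector to part labels."""

    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, feats):            # feats: (B, C, H, W)
        x = feats.permute(0, 2, 3, 1)    # (B, H, W, C)
        return self.mlp(x).permute(0, 3, 1, 2)  # (B, num_classes, H, W)


def pixel_features(feature_maps, out_size):
    """Upsample each intermediate feature map to the output resolution
    and concatenate them channel-wise into one per-pixel feature tensor."""
    ups = [F.interpolate(f, size=out_size, mode="bilinear", align_corners=False)
           for f in feature_maps]
    return torch.cat(ups, dim=1)


def generate_labeled_dataset(gan, decoder, num_samples, out_size, device="cpu"):
    """Sample latents, synthesize images, and decode segmentation masks,
    yielding an arbitrarily large annotated dataset."""
    dataset = []
    with torch.no_grad():
        for _ in range(num_samples):
            z = torch.randn(1, gan.z_dim, device=device)
            image, feature_maps = gan(z)          # hypothetical interface
            feats = pixel_features(feature_maps, out_size)
            mask = decoder(feats).argmax(dim=1)   # (1, H, W) label map
            dataset.append((image.cpu(), mask.cpu()))
    return dataset
```

In this reading, the decoder would first be trained on the handful of manually segmented synthesized images, after which `generate_labeled_dataset` can produce as many labeled pairs as a downstream segmentation network needs.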
Quality estimation aims to measure the quality of translated content without access to a reference translation. This is crucial for machine translation systems in real-world scenarios where high-quality translation is needed. While many approaches ex
Pre-trained language models achieve high performance on machine reading comprehension (MRC) tasks, but the results are hard to explain. An appealing approach to making models explainable is to provide rationales for their decisions. To facilitate supervis
In a wide array of areas, algorithms are matching and surpassing the performance of human experts, leading to consideration of the roles of human judgment and algorithmic prediction in these domains. The discussion around these developments, however,
Historical documents present many challenges for offline handwriting recognition systems, among them the segmentation and labeling steps. Carefully annotated text lines are needed to train a handwritten text recognition (HTR) system. In some scenarios, transcripts are only availa
Scheduling problems are generally NP-hard combinatorial problems, and much research has been devoted to solving them heuristically. However, most previous approaches are problem-specific, and research into the development of a general