Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Sketchforme: Composing Sketched Scenes from Text Descriptions for Interactive Applications

451 0 0.0 ( 0 )

Download Cite

Added by Forrest Huang

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors Forrest Huang - John F. Canny

Human-Computer Interaction Machine Learning

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Sketching and natural languages are effective communication media for interactive applications. We introduce Sketchforme, the first neural-network-based system that can generate sketches based on text descriptions specified by users. Sketchforme is capable of gaining high-level and low-level understanding of multi-object sketched scenes without being trained on sketched scene datasets annotated with text descriptions. The sketches composed by Sketchforme are expressive and realistic: we show in our user study that these sketches convey descriptions better than human-generated sketches in multiple cases, and 36.5% of those sketches are considered to be human-generated. We develop multiple interactive applications using these generated sketches, and show that Sketchforme can significantly improve language learning applications and support intelligent language-based sketching assistants.

rate research

Text2Scene: Generating Compositional Scenes from Textual Descriptions

79 - Fuwen Tan , Song Feng , Vicente Ordonez 2018

In this paper, we propose Text2Scene, a model that generates various forms of compositional scene representations from natural language descriptions. Unlike recent works, our method does NOT use Generative Adversarial Networks (GANs). Text2Scene instead learns to sequentially generate objects and their attributes (location, size, appearance, etc) at every time step by attending to different parts of the input text and the current status of the generated scene. We show that under minor modifications, the proposed framework can handle the generation of different forms of scene representations, including cartoon-like scenes, object layouts corresponding to real images, and synthetic images. Our method is not only competitive when compared with state-of-the-art GAN-based methods using automatic metrics and superior based on human judgments but also has the advantage of producing interpretable results.

Computer Vision and Pattern Recognition Computation and Language

Kori: Interactive Synthesis of Text and Charts in Data Documents

115 - Shahid Latif , Zheng Zhou , Yoon Kim 2021

Charts go hand in hand with text to communicate complex data and are widely adopted in news articles, online blogs, and academic papers. They provide graphical summaries of the data, while text explains the message and context. However, synthesizing information across text and charts is difficult; it requires readers to frequently shift their attention. We investigated ways to support the tight coupling of text and charts in data documents. To understand their interplay, we analyzed the design space of chart-text references through news articles and scientific papers. Informed by the analysis, we developed a mixed-initiative interface enabling users to construct interactive references between text and charts. It leverages natural language processing to automatically suggest references as well as allows users to manually construct other references effortlessly. A user study complemented with algorithmic evaluation of the system suggests that the interface provides an effective way to compose interactive data documents.

Human-Computer Interaction

Text2App: A Framework for Creating Android Apps from Text Descriptions

97 - Masum Hasan , Kazi Sajeed Mehrab , Wasi Uddin Ahmad 2021

We present Text2App -- a framework that allows users to create functional Android applications from natural language specifications. The conventional method of source code generation tries to generate source code directly, which is impractical for creating complex software. We overcome this limitation by transforming natural language into an abstract intermediate formal language representing an application with a substantially smaller number of tokens. The intermediate formal representation is then compiled into target source codes. This abstraction of programming details allows seq2seq networks to learn complex application structures with less overhead. In order to train sequence models, we introduce a data synthesis method grounded in a human survey. We demonstrate that Text2App generalizes well to unseen combination of app components and it is capable of handling noisy natural language instructions. We explore the possibility of creating applications from highly abstract instructions by coupling our system with GPT-3 -- a large pretrained language model. We perform an extensive human evaluation and identify the capabilities and limitations of our system. The source code, a ready-to-run demo notebook, and a demo video are publicly available at url{https://github.com/text2app/Text2App}.

Computation and Language Artificial Intelligence

CrowdTSC: Crowd-based Neural Networks for Text Sentiment Classification

82 - Keyu Yang , Yunjun Gao , Lei Liang 2020

Sentiment classification is a fundamental task in content analysis. Although deep learning has demonstrated promising performance in text classification compared with shallow models, it is still not able to train a satisfying classifier for text sentiment. Human beings are more sophisticated than machine learning models in terms of understanding and capturing the emotional polarities of texts. In this paper, we leverage the power of human intelligence into text sentiment classification. We propose Crowd-based neural networks for Text Sentiment Classification (CrowdTSC for short). We design and post the questions on a crowdsourcing platform to collect the keywords in texts. Sampling and clustering are utilized to reduce the cost of crowdsourcing. Also, we present an attention-based neural network and a hybrid neural network, which incorporate the collected keywords as human beings guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTSC outperforms state-of-the-art models, justifying the effectiveness of crowd-based keyword guidance.

Human-Computer Interaction Machine Learning

OoDAnalyzer: Interactive Analysis of Out-of-Distribution Samples

60 - Changjian Chen , Jun Yuan , Yafeng Lu 2020

One major cause of performance degradation in predictive models is that the test samples are not well covered by the training data. Such not well-represented samples are called OoD samples. In this paper, we propose OoDAnalyzer, a visual analysis approach for interactively identifying OoD samples and explaining them in context. Our approach integrates an ensemble OoD detection method and a grid-based visualization. The detection method is improved from deep ensembles by combining more features with algorithms in the same family. To better analyze and understand the OoD samples in context, we have developed a novel kNN-based grid layout algorithm motivated by Halls theorem. The algorithm approximates the optimal layout and has $O(kN^2)$ time complexity, faster than the grid layout algorithm with overall best performance but $O(N^3)$ time complexity. Quantitative evaluation and case studies were performed on several datasets to demonstrate the effectiveness and usefulness of OoDAnalyzer.

Human-Computer Interaction Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Sketchforme: Composing Sketched Scenes from Text Descriptions for Interactive Applications

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions