App reviews convey user opinions and emerging issues (e.g., new bugs) about app releases. Due to the dynamic nature of app reviews, the topics and sentiment of the reviews change along with app relea
Millions of mobile apps are available in app stores, such as Apple's App Store and Google Play. It is increasingly challenging for a mobile app to stand out from its enormous number of competitors and become popular among users. Good user experience and well-designed functionality are the keys to a successful app. To achieve this, popular apps usually release updates frequently. If the critical app issues faced by users can be captured in a timely and accurate manner, developers can respond with prompt updates and a good user experience can be ensured. Prior studies have analyzed reviews to detect emerging app issues. These studies are usually based on topic modeling or clustering techniques. However, they have not considered the short length and the sentiment of user reviews. In this paper, we propose a novel emerging issue detection approach named MERIT that takes these two characteristics into consideration. Specifically, we propose an Adaptive Online Biterm Sentiment-Topic (AOBST) model for jointly modeling topics and corresponding sentiments that takes into consideration a
Recent studies showed that dialogs between app developers and app users on app stores are important for increasing user satisfaction and apps' overall ratings. However, the large volume of reviews and limited resources discourage app developers from engaging with customers through this channel. One solution to this problem is to develop an Automated Responding System that lets developers respond to app reviews in a manner that is as similar as possible to a human response. Toward designing such a system, we have conducted an empirical study of the characteristics of mobile app reviews and their human-written responses. We found that an app review can have multiple fragments at the sentence level with different topics and intentions. Similarly, a response can also be divided into multiple fragments with unique intentions, each answering certain parts of the review (e.g., complaints, requests, or information seeking). We have also identified several characteristics of a review (rating, topics, intentions, quantitative text features) that can be used to rank reviews by how urgently they need a response. In addition, we found that the degree of reusability of past responses depends on their context (a single app, apps of the same category, and their common features). Last but not least, a response can be reused for another review if some parts of it can be replaced by a placeholder that is either a named entity or a hyperlink. Based on these findings, we discuss the implications for developing an Automated Responding System that helps mobile app developers write responses to users' reviews more effectively.
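The placeholder finding above can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names `to_template` and `fill_template`, the placeholder names `{URL}` and `{APP_NAME}`, and the example strings are all hypothetical, and the named entities are assumed to be supplied by the caller rather than detected automatically.

```python
import re

# Matches http(s) links but leaves trailing sentence punctuation outside the match.
URL_PATTERN = re.compile(r"https?://[^\s<>]*[^\s<>.,;:!?)]")


def to_template(response, entities):
    """Turn a past response into a reusable template.

    `entities` maps a placeholder name to the surface string it replaces,
    e.g. {"APP_NAME": "PhotoFix"}. This mapping is an assumption of the
    sketch; a real system would need a named-entity recognizer.
    """
    template = URL_PATTERN.sub("{URL}", response)
    for name, surface in entities.items():
        template = template.replace(surface, "{" + name + "}")
    return template


def fill_template(template, values):
    """Instantiate a template for a new review by filling each placeholder."""
    filled = template
    for name, value in values.items():
        filled = filled.replace("{" + name + "}", value)
    return filled


past_response = (
    "Hi, thanks for using PhotoFix! Please see https://example.com/help. "
    "Regards, PhotoFix team"
)
template = to_template(past_response, {"APP_NAME": "PhotoFix"})
reused = fill_template(
    template, {"APP_NAME": "NotePad", "URL": "https://example.org/faq"}
)
```

Plain string replacement is used instead of `str.format` so that a response containing literal braces does not raise an error.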
A mobile app interface usually consists of a set of user interface modules. Properly designing these user interface modules is vital to achieving user satisfaction with a mobile app. However, there are few methods for determining the design variables of user interface modules other than relying on the judgment of designers. Usually, a laborious post-processing step is necessary to verify the key change of each design variable. Therefore, only a very limited number of design solutions can be tested. Because there are many modules, finding the best design solutions is time-consuming and almost impossible. To this end, we introduce FEELER, a framework for quickly and intelligently exploring design solutions for user interface modules with a collective machine learning approach. FEELER can help designers quantitatively measure the preference scores of different design solutions, so that they can adjust user interface modules conveniently and quickly. We conducted extensive experimental evaluations on two real-life datasets to demonstrate its applicability to real-life user interface module design in the Baidu App, which is one of the most popular mobile apps in China.
User reviews of mobile apps often contain complaints or suggestions that are valuable for app developers seeking to improve user experience and satisfaction. However, due to the large volume and noisy nature of those reviews, manually analyzing them for useful opinions is inherently challenging. To address this problem, we propose MARK, a keyword-based framework for semi-automated review analysis. MARK allows an analyst to describe their interests in one or more mobile apps with a set of keywords. It then finds and lists the reviews most relevant to those keywords for further analysis. It can also plot the trends of those keywords over time and detect sudden changes in them, which might indicate the occurrence of serious issues. To help analysts describe their interests more effectively, MARK can automatically extract keywords from raw reviews and rank them by their association with negative reviews. In addition, based on a vector-based semantic representation of keywords, MARK can divide a large set of keywords into more cohesive subsets, or suggest keywords similar to the selected ones.
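Two of the steps described above, ranking keywords by their association with negative reviews and detecting sudden changes in a keyword's trend, can be sketched in Python as follows. This is only an illustration under stated assumptions, not MARK's actual method: the abstract does not specify the scoring or detection formulas, so the sketch swaps in the share of negative reviews per keyword and a moving-window z-score. The function names, the rating threshold, and the sample data are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def rank_keywords_by_negativity(reviews, negative_max_rating=2, min_count=2):
    """Rank words by the share of their containing reviews that are negative.

    `reviews` is a list of (text, star_rating) pairs. Treating ratings of
    `negative_max_rating` or lower as negative is an assumption of this sketch.
    """
    total = Counter()
    negative = Counter()
    for text, rating in reviews:
        for word in set(text.lower().split()):
            total[word] += 1
            if rating <= negative_max_rating:
                negative[word] += 1
    scored = [
        (word, negative[word] / total[word])
        for word in total
        if total[word] >= min_count
    ]
    # Highest negative share first; ties broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def sudden_changes(counts, window=3, threshold=2.0):
    """Return indices where a count spikes above the preceding window.

    A point is flagged when it exceeds the window mean by more than
    `threshold` standard deviations (a simple stand-in detector).
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = pstdev(history) or 1.0  # avoid division by zero on flat history
        if (counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


sample_reviews = [
    ("app crashes on login", 1),
    ("crashes after update", 2),
    ("love the update", 5),
    ("love this app", 5),
]
ranking = rank_keywords_by_negativity(sample_reviews)
spikes = sudden_changes([2, 3, 2, 3, 12, 3])
```

In this toy data, "crashes" appears only in low-rated reviews and therefore ranks first, and the jump to 12 mentions in the daily counts is the only flagged point.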
In this paper, we propose the Brand-Topic Model (BTM), which aims to detect brand-associated polarity-bearing topics from product reviews. Unlike existing models for sentiment-topic extraction, which assume topics are grouped under discrete sentiment categories such as 'positive', 'negative' and 'neutral', BTM automatically infers real-valued brand-associated sentiment scores and generates fine-grained sentiment-topics in which we can observe continuous changes of words under a certain topic (e.g., 'shaver' or 'cream') as its associated sentiment gradually varies from negative to positive. BTM is built on the Poisson factorisation model with the incorporation of adversarial learning. It has been evaluated on a dataset constructed from Amazon reviews. Experimental results show that BTM outperforms a number of competitive baselines in brand ranking, achieving a better balance of topic coherence and uniqueness, and extracting better-separated polarity-bearing topics.