No Arabic abstract
Due to time constraints, course instructors often need to selectively participate in student discussion threads, due to their limited bandwidth and lopsided student--instructor ratio on online forums. We propose the first deep learning models for this binary prediction problem. We propose novel attention based models to infer the amount of latent context necessary to predict instructor intervention. Such models also allow themselves to be tuned to instructors preference to intervene early or late. Our three proposed attentive model variants to infer the latent context improve over the state-of-the-art by a significant, large margin of 11% in F1 and 10% in recall, on average. Further, introspection of attention help us better understand what aspects of a discussion post propagate through the discussion thread that prompts instructor intervention.
With large student enrollment, MOOC instructors face the unique challenge in deciding when to intervene in forum discussions with their limited bandwidth. We study this problem of instructor intervention. Using a large sample of forum data culled from 61 courses, we design a binary classifier to predict whether an instructor should intervene in a discussion thread or not. By incorporating novel information about a forums type into the classification process, we improve significantly over the previous state-of-the-art. We show how difficult this decision problem is in the real world by validating against indicative human judgment, and empirically show the problems sensitivity to instructors intervention preferences. We conclude this paper with our take on the future research issues in intervention.
Social learning, i.e., students learning from each other through social interactions, has the potential to significantly scale up instruction in online education. In many cases, such as in massive open online courses (MOOCs), social learning is facilitated through discussion forums hosted by course providers. In this paper, we propose a probabilistic model for the process of learners posting on such forums, using point processes. Different from existing works, our method integrates topic modeling of the post text, timescale modeling of the decay in post activity over time, and learner topic interest modeling into a single model, and infers this information from user data. Our method also varies the excitation levels induced by posts according to the thread structure, to reflect typical notification settings in discussion forums. We experimentally validate the proposed model on three real-world MOOC datasets, with the largest one containing up to 6,000 learners making 40,000 posts in 5,000 threads. Results show that our model excels at thread recommendation, achieving significant improvement over a number of baselines, thus showing promise of being able to direct learners to threads that they are interested in more efficiently. Moreover, we demonstrate analytics that our model parameters can provide, such as the timescales of different topic categories in a course.
Many context-sensitive data flow analyses can be formulated as a variant of the all-pairs Dyck-CFL reachability problem, which, in general, is of sub-cubic time complexity and quadratic space complexity. Such high complexity significantly limits the scalability of context-sensitive data flow analysis and is not affordable for analyzing large-scale software. This paper presents textsc{Flare}, a reduction from the CFL reachability problem to the conventional graph reachability problem for context-sensitive data flow analysis. This reduction allows us to benefit from recent advances in reachability indexing schemes, which often consume almost linear space for answering reachability queries in almost constant time. We have applied our reduction to a context-sensitive alias analysis and a context-sensitive information-flow analysis for C/C++ programs. Experimental results on standard benchmarks and open-source software demonstrate that we can achieve orders of magnitude speedup at the cost of only moderate space to store the indexes. The implementation of our approach is publicly available.
Conversational context information, higher-level knowledge that spans across sentences, can help to recognize a long conversation. However, existing speech recognition models are typically built at a sentence level, and thus it may not capture important conversational context information. The recent progress in end-to-end speech recognition enables integrating context with other available information (e.g., acoustic, linguistic resources) and directly recognizing words from speech. In this work, we present a direct acoustic-to-word, end-to-end speech recognition model capable of utilizing the conversational context to better process long conversations. We evaluate our proposed approach on the Switchboard conversational speech corpus and show that our system outperforms a standard end-to-end speech recognition system.
Underground online forums are platforms that enable trades of illicit services and stolen goods. Carding forums, in particular, are known for being focused on trading financial information. However, little evidence exists about the sellers that are present on carding forums, the precise types of products they advertise, and the prices buyers pay. Existing literature mainly focuses on the organisation and structure of the forums. Furthermore, studies on carding forums are usually based on literature review, expert interviews, or data from forums that have already been shut down. This paper provides first-of-its-kind empirical evidence on active forums where stolen financial data is traded. We monitored 5 out of 25 discovered forums, collected posts from the forums over a three-month period, and analysed them quantitatively and qualitatively. We focused our analyses on products, prices, seller prolificacy, seller specialisation, and seller reputation.