
Topic Modeling the Reading and Writing Behavior of Information Foragers

Published by: Jaimie Murdock
Publication date: 2019
Research field: Informatics Engineering
Language: English
Author: Jaimie Murdock





The general problem of information foraging in an environment about which agents have incomplete information has been explored in many fields, including cognitive psychology, neuroscience, economics, finance, ecology, and computer science. In all of these areas, the searcher aims to enhance future performance by surveying enough existing knowledge to orient themselves in the information space. Individuals can be viewed as conducting a cognitive search in which they must balance exploration of ideas that are novel to them against exploitation of knowledge in domains in which they are already expert. In this dissertation, I present several case studies that demonstrate how reading and writing behaviors interact to construct personal knowledge bases. These studies use LDA topic modeling to represent the information environment of the texts each author read and wrote. Three studies revolve around Charles Darwin. Darwin left detailed records of every book he read for 23 years, from disembarking from the H.M.S. Beagle to just after publication of The Origin of Species. Additionally, he left copies of his drafts before publication. I characterize his reading behavior, then show how that reading behavior interacted with the drafts and subsequent revisions of The Origin of Species, and expand the dataset to include later readings and writings. Then, through a study of Thomas Jefferson's correspondence, I extend the analysis to non-book data. Finally, through an examination of neuroscience citation data, I move from individual behavior to collective behavior in constructing an information environment. Together, these studies reveal the interplay between individual and collective phenomena where innovation takes place (Tria et al. 2014).
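The balance between exploration and exploitation described above can be made concrete once each text is represented as a distribution over topics. The following is a minimal sketch, not the dissertation's implementation. It assumes each text has already been reduced to a probability vector over topics (for example by an LDA model), and it uses Kullback-Leibler divergence to measure how far a newly read text departs from the one read before it. The choice of KL divergence, the function names, and the toy topic vectors are all assumptions made for illustration; the abstract does not name a specific measure.

```python
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) in bits between two topic distributions.

    Terms where p[i] == 0 contribute nothing. q is assumed to be
    strictly positive wherever p is positive.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of topics")
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def text_to_text_surprise(reading_sequence):
    """Divergence of each text from the one read just before it."""
    return [
        kl_divergence(current, previous)
        for previous, current in zip(reading_sequence, reading_sequence[1:])
    ]

# Hypothetical topic mixtures for three books read in order (3 topics).
readings = [
    [0.5, 0.25, 0.25],
    [0.5, 0.25, 0.25],   # same mixture as the previous book
    [0.25, 0.25, 0.5],   # weight shifted toward a different topic
]
surprise = text_to_text_surprise(readings)
```

Under these assumptions, a value near zero suggests that the reader stayed close to familiar material (exploitation), while a larger value suggests a move toward less familiar topics (exploration).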




Read also

Ancient Chinese texts present an area of enormous challenge and opportunity for humanities scholars interested in exploiting computational methods to assist in the development of new insights and interpretations of culturally significant materials. In this paper we describe a collaborative effort between Indiana University and Xi'an Jiaotong University to support exploration and interpretation of a digital corpus of over 18,000 ancient Chinese documents, which we refer to as the Handian ancient classics corpus (Hàn diǎn gǔ jí, i.e., the Han canon or Chinese classics). It contains classics of ancient Chinese philosophy, documents of historical and biographical significance, and literary works. We begin by describing the Digital Humanities context of this joint project, and the advances in humanities computing that made this project feasible. We describe the corpus and introduce our application of probabilistic topic modeling to this corpus, with attention to the particular challenges posed by modeling ancient Chinese documents. We give a specific example of how the software we have developed can be used to aid discovery and interpretation of themes in the corpus. We outline more advanced forms of computer-aided interpretation that are also made possible by the programming interface provided by our system, and the general implications of these methods for understanding the nature of meaning in these texts.
The abolitionist movement of the nineteenth-century United States remains among the most significant social and political movements in US history. Abolitionist newspapers played a crucial role in spreading information and shaping public opinion around a range of issues relating to the abolition of slavery. These newspapers also serve as a primary source of information about the movement for scholars today, resulting in powerful new accounts of the movement and its leaders. This paper supplements recent qualitative work on the role of women in abolition's vanguard, as well as the role of the Black press, with a quantitative text modeling approach. Using diachronic word embeddings, we identify which newspapers tended to lead lexical semantic innovations -- the introduction of new usages of specific words -- and which newspapers tended to follow. We then aggregate the evidence across hundreds of changes into a weighted network with the newspapers as nodes; directed edge weights represent the frequency with which each newspaper led the other in the adoption of a lexical semantic change. Analysis of this network reveals pathways of lexical semantic influence, distinguishing leaders from followers, as well as others who stood apart from the semantic changes that swept through this period. More specifically, we find that two newspapers edited by women -- THE PROVINCIAL FREEMAN and THE LILY -- led a large number of semantic changes in our corpus, lending additional credence to the argument that a multiracial coalition of women led the abolitionist movement in terms of both thought and action. It also contributes additional complexity to the scholarship that has sought to tease apart the relation of the abolitionist movement to the women's suffrage movement, and the vexed racial politics that characterized their relation.
The highest-density magnetic storage media will code data in single-atom bits. To date, the smallest individually addressable bistable magnetic bits on surfaces consist of 5-12 atoms. Long magnetic relaxation times were demonstrated in molecular magnets containing one lanthanide atom, and recently in ensembles of single holmium (Ho) atoms supported on magnesium oxide (MgO). Those experiments indicated the possibility for data storage at the fundamental limit, but it remained unclear how to access the individual magnetic centers. Here we demonstrate the reading and writing of individual Ho atoms on MgO, and show that they independently retain their magnetic information over many hours. We read the Ho states by tunnel magnetoresistance and write with current pulses using a scanning tunneling microscope. The magnetic origin of the long-lived states is confirmed by single-atom electron paramagnetic resonance (EPR) on a nearby Fe sensor atom, which shows that Ho has a large out-of-plane moment of $(10.1 \pm 0.1)\,\mu_{\rm B}$ on this surface. In order to demonstrate independent reading and writing, we built an atomic scale structure with two Ho bits to which we write the four possible states and which we read out remotely by EPR. The high magnetic stability combined with electrical reading and writing shows that single-atom magnetic memory is possible.
We use a combination of charge writing and scanning gate microscopy to map and modify the local charge neutrality point of graphene field-effect devices. We give a demonstration of the technique by writing remote charge in a thin dielectric layer over the graphene-metal interface and detecting the resulting shift in local charge neutrality point. We perform electrostatic simulations to characterize the gating effect of a realistic scanning probe tip on a graphene bilayer and find a good agreement with the experimental results.
This work focuses on combining nonparametric topic models with Auto-Encoding Variational Bayes (AEVB). Specifically, we first propose iTM-VAE, where the topics are treated as trainable parameters and the document-specific topic proportions are obtained by a stick-breaking construction. The inference of iTM-VAE is modeled by neural networks such that it can be computed in a simple feed-forward manner. We also describe how to introduce a hyper-prior into iTM-VAE so as to model the uncertainty of the prior parameter. The hyper-prior technique is quite general, and we show that it can be applied to other AEVB based models to alleviate the collapse-to-prior problem elegantly. Moreover, we also propose HiTM-VAE, where the document-specific topic distributions are generated in a hierarchical manner. HiTM-VAE is even more flexible and can generate topic distributions with better variability. Experimental results on 20News and Reuters RCV1-V2 datasets show that the proposed models outperform the state-of-the-art baselines significantly. The advantages of the hyper-prior technique and the hierarchical model construction are also confirmed by experiments.
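The stick-breaking construction mentioned in the last abstract above can be sketched in a few lines. This is a generic, truncated illustration and not the iTM-VAE code: the function name and the hand-picked break fractions are hypothetical, and in the model described the fractions would come from an inference network rather than being supplied directly.

```python
def stick_breaking(fractions):
    """Turn break fractions v_k in [0, 1] into topic proportions.

    pi_k = v_k * prod_{j<k} (1 - v_j). Whatever is left of the unit
    stick after the last break is returned separately as the remainder.
    """
    remaining = 1.0
    proportions = []
    for v in fractions:
        if not 0.0 <= v <= 1.0:
            raise ValueError("each fraction must lie in [0, 1]")
        proportions.append(v * remaining)  # break off a share of what is left
        remaining *= 1.0 - v               # shrink the stick for the next break
    return proportions, remaining

# Hypothetical fractions: each break takes half of the remaining stick.
proportions, remainder = stick_breaking([0.5, 0.5, 0.5])
```

The proportions and the remainder always sum to one, which is why the construction yields a valid topic distribution without fixing the number of topics in advance.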