The majority of scientific papers are distributed in PDF, which poses challenges for accessibility, especially for blind and low vision (BLV) readers. We characterize the scope of this problem by assessing the accessibility of 11,397 PDFs published 2010--2019 sampled across various fields of study, finding that only 2.4% of these PDFs satisfy all of our defined accessibility criteria. We introduce the SciA11y system to offset some of the issues around inaccessibility. SciA11y incorporates several machine learning models to extract the content of scientific PDFs and render this content as accessible HTML, with added novel navigational features to support screen reader users. An intrinsic evaluation of extraction quality indicates that the majority of HTML renders (87%) produced by our system have no or only some readability issues. We perform a qualitative user study to understand the needs of BLV researchers when reading papers, and to assess whether the SciA11y system could address these needs. We summarize our user study findings into a set of five design recommendations for accessible scientific reader systems. User response to SciA11y was positive, with all users saying they would be likely to use the system in the future, and some stating that the system, if available, would become their primary workflow. We successfully produce HTML renders for over 12M papers, of which an open access subset of 1.5M are available for browsing at https://scia11y.org/
We present a bibliographic analysis of Chandra, Hubble, and Spitzer publications. We find that (a) archival data are used in >60% of the publication output and (b) archives for these missions enable a much broader set of institutions and countries to scientifically use data from these missions. Specifically, we find that authors from institutions that have published few papers from a given mission publish 2/3 of their papers as archival publications, while those with many publications typically publish 1/3 as archival publications. We also show that countries with lower GDP per capita overwhelmingly produce archival publications, while countries with higher GDP per capita produce guest observer and archival publications in equal amounts. We argue that robust archives are thus critical not only for the scientific productivity of mission data, but also for the scientific accessibility of mission data. We argue that the astronomical community should support archives to maximize the overall scientific and societal impact of astronomy, and that archives represent an excellent investment in astronomy's future.
Accessibility research sits at the junction of several disciplines, drawing influence from HCI, disability studies, psychology, education, and more. To characterize the influences and extensions of accessibility research, we undertake a study of citation trends for accessibility and related HCI communities. We assess the diversity of venues and fields of study represented among the referenced and citing papers of 836 accessibility research papers from ASSETS and CHI, finding that though publications in computer science dominate these citation relationships, the relative proportion of citations from papers on psychology and medicine has grown over time. Though ASSETS is a more niche venue than CHI in terms of citational diversity, both conferences display standard levels of diversity among their incoming and outgoing citations when analyzed in the context of 53K papers from 13 accessibility and HCI conference venues.
We present a novel system providing summaries for Computer Science publications. Through a qualitative user study, we identified the most valuable scenarios for discovery, exploration and understanding of scientific documents. Based on these findings, we built a system that retrieves and summarizes scientific documents for a given information need, expressed either as a free-text query or by choosing categorized values such as scientific tasks, datasets and more. Our system ingested 270,000 papers, and its summarization module aims to generate concise yet detailed summaries. We validated our approach with human experts.
Scientific publishing is the means by which we communicate and share scientific knowledge, but this process currently often lacks transparency and machine-interpretable representations. Scientific articles are published as long coarse-grained text with complicated structures, and they are optimized for human readers and not for automated means of organization and access. Peer reviewing is the main method of quality assessment, but these peer reviews are nowadays rarely published, and their own complicated structure and linking to the respective articles are not accessible. In order to address these problems and to better align scientific publishing with the principles of the Web and Linked Data, we propose here an approach that uses nanopublications as a unifying model to represent in a semantic way the elements of publications, their assessments, as well as the involved processes, actors, and provenance in general. To evaluate our approach, we present a dataset of 627 nanopublications representing an interlinked network of the elements of articles (such as individual paragraphs) and their reviews (such as individual review comments). Focusing on the specific scenario of editors performing a meta-review, we introduce seven competency questions and show how they can be executed as SPARQL queries. We then present a prototype of a user interface for that scenario that shows different views on the set of review comments provided for a given manuscript, and we show in a user study that editors find the interface useful to answer their competency questions. In summary, we demonstrate that a unified and semantic publication model based on nanopublications can make scientific communication more effective and user-friendly.
In the same way that ecosystems tend to increase maturity by decreasing the flow of energy per unit biomass, we should move towards a more mature science by publishing fewer but higher-quality papers and getting away from joining large teams in small roles. That is, we should decrease our scientific productivity for good.