ﻻ يوجد ملخص باللغة العربية
A central challenge in science is to understand how systems behaviors emerge from complex networks. This often requires aggregating, reusing, and integrating heterogeneous information. Supplementary spreadsheets to articles are a key data source. Spreadsheets are popular because they are easy to read and write. However, spreadsheets are often difficult to reanalyze because they capture data ad hoc without schemas that define the objects, relationships, and attributes that they represent. To help researchers reuse and compose spreadsheets, we developed ObjTables, a toolkit that makes spreadsheets human- and machine-readable by combining spreadsheets with schemas and an object-relational mapping system. ObjTables includes a format for schemas; markup for indicating the class and attribute represented by each spreadsheet and column; numerous data types for scientific information; and high-level software for using schemas to read, write, validate, compare, merge, revision, and analyze spreadsheets. By making spreadsheets easier to reuse, ObjTables could enable unprecedented secondary meta-analyses. By making it easy to build new formats and associated software for new types of data, ObjTables can also accelerate emerging scientific fields.
Digital data is a gold mine for modern journalism. However, datasets which interest journalists are extremely heterogeneous, ranging from highly structured (relational databases), semi-structured (JSON, XML, HTML), graphs (e.g., RDF), and text. Journ
Nowadays, journalism is facilitated by the existence of large amounts of digital data sources, including many Open Data ones. Such data sources are extremely heterogeneous, ranging from highly struc-tured (relational databases), semi-structured (JSON
Current metagenomic analysis algorithms require significant computing resources, can report excessive false positives (type I errors), may miss organisms (type II errors / false negatives), or scale poorly on large datasets. This paper explores using
Spreadsheets are end-user programs and domain models that are heavily employed in administration, financial forecasting, education, and science because of their intuitive, flexible, and direct approach to computation. As a result, institutions are sw
Data metrology -- the assessment of the quality of data -- particularly in scientific and industrial settings, has emerged as an important requirement for the UK National Physical Laboratory (NPL) and other national metrology institutes. Data provena