
A Tiled-Table Convention for Compressing FITS Binary Tables

Published by Robert Seaman
Publication date: 2012
Language: English





This document describes a convention for compressing FITS binary tables that is modeled after the FITS tiled-image compression method (White et al. 2009), which has been in use for about a decade. The input table is first optionally subdivided into tiles, each containing an equal number of rows. Every column of data within each tile is then compressed and stored as a variable-length array of bytes in the output FITS binary table. All the header keywords from the input table are copied to the header of the output table and remain uncompressed for efficient access. The output compressed table contains the same number and order of columns as the input uncompressed binary table, and it has one row for each tile of rows in the input table. In principle, each column of data can be compressed using a different algorithm that is optimized for the type of data within that column. In the prototype implementation described here, however, the gzip algorithm is used to compress every column.
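The tiling scheme in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not read or write FITS files: it assumes each column is already available as a contiguous byte string with a fixed number of bytes per row, and the names `compress_tiles`, `decompress_tiles`, `widths`, and `tile_rows` are hypothetical. It only shows the row-per-tile layout, with every column of every tile compressed independently using gzip.

```python
import gzip
import struct


def compress_tiles(columns, widths, n_rows, tile_rows):
    """Compress each column of each tile separately (illustrative only).

    columns:   dict mapping column name -> bytes holding that column's values
    widths:    dict mapping column name -> bytes per row in that column
    n_rows:    number of rows in the input table
    tile_rows: number of rows per tile (the last tile may be shorter)
    Returns a list with one dict per tile: column name -> gzip byte stream.
    """
    tiles = []
    for start in range(0, n_rows, tile_rows):
        stop = min(start + tile_rows, n_rows)
        row = {}
        for name, data in columns.items():
            w = widths[name]
            # Slice out this tile's rows for this column, then gzip them.
            row[name] = gzip.compress(data[start * w:stop * w])
        tiles.append(row)
    return tiles


def decompress_tiles(tiles):
    """Rebuild the original column byte strings from the compressed tiles."""
    names = list(tiles[0].keys())
    return {
        name: b"".join(gzip.decompress(tile[name]) for tile in tiles)
        for name in names
    }


# Hypothetical two-column table with 1000 rows, stored big-endian.
n_rows = 1000
columns = {
    "ID": struct.pack(">1000i", *range(n_rows)),
    "FLUX": struct.pack(">1000d", *(0.5 * i for i in range(n_rows))),
}
widths = {"ID": 4, "FLUX": 8}

tiles = compress_tiles(columns, widths, n_rows, tile_rows=250)
restored = decompress_tiles(tiles)
```

With 1000 rows and 250 rows per tile, the output has four rows, each holding one compressed byte stream per input column, in the same column order as the input.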



Read also

This document describes a convention for compressing n-dimensional images and storing the resulting byte stream in a variable-length column in a FITS binary table. The FITS file structure outlined here is independent of the specific data compression algorithm that is used. The implementation details for 4 widely used compression algorithms are described here, but any other compression technique could also be supported by this convention. The general principle used in this convention is to first divide the n-dimensional image into a rectangular grid of subimages or tiles. Each tile is then compressed as a block of data, and the resulting compressed byte stream is stored in a row of a variable length column in a FITS binary table. By dividing the image into tiles it is generally possible to extract and uncompress subsections of the image without having to uncompress the whole image.
This document describes a FITS convention developed by the IRAF Group (D. Tody, R. Seaman, and N. Zarate) at the National Optical Astronomy Observatory (NOAO). This convention is implemented by the fgread/fgwrite tasks in the IRAF fitsutil package. It was first used in May 1999 to encapsulate preview PNG-format graphics files into FITS files in the NOAO High Performance Pipeline System. A FITS extension of type FOREIGN provides a mechanism for storing an arbitrary file or tree of files in FITS, allowing it to be restored to disk at a later time.
The checksum keywords described here provide an integrity check on the information contained in FITS HDUs. (Header and Data Units are the basic components of FITS files, consisting of header keyword records followed by optional associated data records.) The CHECKSUM keyword is defined to have a value that forces the 32-bit 1's complement checksum accumulated over all the 2880-byte FITS logical records in the HDU to equal negative 0. (Note that 1's complement arithmetic has both positive and negative zero elements.) Verifying that the accumulated checksum is still equal to -0 provides a fast and fairly reliable way to determine that the HDU has not been modified by subsequent data processing operations or corrupted while copying or storing the file on physical media.
Many data we collect today are in tabular form, with rows as records and columns as attributes associated with each record. Understanding the structural relationship in tabular data can greatly facilitate the data science process. Traditionally, much of this relational information is stored in table schema and maintained by its creators, usually domain experts. In this paper, we develop automated methods to uncover deep relationships in a single data table without expert or domain knowledge. Our method can decompose a data table into layers of smaller tables, revealing its deep structure. The key to our approach is a computationally lightweight forward addition algorithm that we developed to recursively extract the functional dependencies between table columns that are scalable to tables with many columns. With our solution, data scientists will be provided with automatically generated, data-driven insights when exploring new data sets.
We propose a new end-to-end method for extending a Knowledge Graph (KG) from tables. Existing techniques tend to interpret tables by focusing on information that is already in the KG, and therefore tend to extract many redundant facts. Our method aims to find more novel facts. We introduce a new technique for table interpretation based on a scalable graphical model using entity similarities. Our method further disambiguates cell values using KG embeddings as an additional ranking method. Other distinctive features are the lack of assumptions about the underlying KG and the enabling of a fine-grained tuning of the precision/recall trade-off of extracted facts. Our experiments show that our approach has a higher recall during the interpretation process than the state-of-the-art, and is more resistant against the bias observed in extracting mostly redundant facts since it produces more novel extractions.