Number Parsing at a Gigabyte per Second

Published by: Daniel Lemire

Publication date: 2021

Research field: Informatics engineering

Paper language: English

Author: Daniel Lemire

Topics: Data structures and algorithms; Mathematical software

With disks and networks providing gigabytes per second, parsing decimal numbers from strings becomes a bottleneck. We consider the problem of parsing decimal numbers to the nearest binary floating-point value. The general problem requires variable-precision arithmetic. However, we need at most 17 digits to represent 64-bit standard floating-point numbers (IEEE 754). Thus we can represent the decimal significand with a single 64-bit word. By combining the significand and precomputed tables, we can compute the nearest floating-point number using as few as one or two 64-bit multiplications. Our implementation can be several times faster than conventional functions present in standard C libraries on modern 64-bit systems (Intel, AMD, ARM and POWER9). Our work is available as open source software used by major systems such as Apache Arrow and Yandex ClickHouse. The Go standard library has adopted a version of our approach.
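As a rough illustration of holding the decimal significand in a single machine word, the sketch below splits a decimal string into an integer significand w and a power-of-ten exponent q, then applies the classic exact fast path: when w is at most 2^53 and |q| is at most 22, both w and 10^|q| are exactly representable as doubles, so one correctly rounded multiplication or division yields the nearest value. This is not the table-based 64-bit multiplication algorithm the abstract describes, only the simpler special case; the names `parse_decimal`, `fast_path` and `POW10` are hypothetical, and sign handling is omitted.

```python
# Powers of ten 10**0 .. 10**22 are all exactly representable as IEEE 754 doubles.
POW10 = [float(10 ** k) for k in range(23)]


def parse_decimal(text):
    """Split an unsigned decimal string such as '3.1416e2' into (w, q),
    where the represented value is w * 10**q. Hypothetical helper."""
    mantissa, sep, exp_part = text.lower().partition("e")
    if sep and not exp_part:
        raise ValueError(f"missing exponent digits in {text!r}")
    int_part, _, frac_part = mantissa.partition(".")
    digits = int_part + frac_part
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a decimal number: {text!r}")
    w = int(digits)
    # Each fractional digit shifts the decimal exponent down by one.
    q = (int(exp_part) if exp_part else 0) - len(frac_part)
    return w, q


def fast_path(text):
    """Return the nearest double when the exact fast path applies, else None."""
    w, q = parse_decimal(text)
    if w > 2 ** 53 or abs(q) > 22:
        return None  # a general algorithm is needed for this input
    if q >= 0:
        return float(w) * POW10[q]
    return float(w) / POW10[-q]
```

Inputs outside the fast-path range, such as "1e23" or a significand above 2^53, return None here; those are the cases where a more general method, like the one the paper proposes, has to take over.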

Read also

Parsing Gigabytes of JSON per Second

Geoff Langdale, Daniel Lemire (2019)

JavaScript Object Notation or JSON is a ubiquitous data exchange format on the Web. Ingesting JSON documents can become a performance bottleneck due to the sheer volume of data. We are thus motivated to make JSON parsing as fast as possible. Despite the maturity of the problem of JSON parsing, we show that substantial speedups are possible. We present the first standard-compliant JSON parser to process gigabytes of data per second on a single core, using commodity processors. We can use a quarter or fewer instructions than a state-of-the-art reference parser like RapidJSON. Unlike other validating parsers, our software (simdjson) makes extensive use of Single Instruction, Multiple Data (SIMD) instructions. To ensure reproducibility, simdjson is freely available as open-source software under a liberal license.

Topics: Databases; Performance

A New Test for Hamming-Weight Dependencies

David Blackman, Sebastiano Vigna (2021)

We describe a new statistical test for pseudorandom number generators (PRNGs). Our test can find bias induced by dependencies among the Hamming weights of the outputs of a PRNG, even for PRNGs that pass state-of-the-art tests of the same kind from the literature, and in particular for generators based on F_2-linear transformations such as the dSFMT, xoroshiro128+, and WELL512.

Topics: Data structures and algorithms; Mathematical software

Megaverse: Simulating Embodied Agents at One Million Experiences per Second

Aleksei Petrenko, Erik Wijmans, Brennan Shacklett (2021)

We present Megaverse, a new 3D simulation platform for reinforcement learning and embodied AI research. The efficient design of our engine enables physics-based simulation with high-dimensional egocentric observations at more than 1,000,000 actions per second on a single 8-GPU node. Megaverse is up to 70x faster than DeepMind Lab in fully-shaded 3D scenes with interactive objects. We achieve this high simulation performance by leveraging batched simulation, thereby taking full advantage of the massive parallelism of modern GPUs. We use Megaverse to build a new benchmark that consists of several single-agent and multi-agent tasks covering a variety of cognitive challenges. We evaluate model-free RL on this benchmark to provide baselines and facilitate future research. The source code is available at https://www.megaverse.info

Topics: Machine learning; Artificial intelligence

Second-Order Unsupervised Neural Dependency Parsing

Songlin Yang, Yong Jiang, Wenjuan Han (2020)

Most unsupervised dependency parsers are based on first-order probabilistic generative models that only consider local parent-child information. Inspired by second-order supervised dependency parsing, we propose a second-order extension of unsupervised neural dependency models that incorporates grandparent-child or sibling information. We also propose a novel design of the neural parameterization and optimization methods of the dependency models. In second-order models, the number of grammar rules grows cubically with the increase of vocabulary size, making it difficult to train lexicalized models that may contain thousands of words. To circumvent this problem while still benefiting from both second-order parsing and lexicalization, we use the agreement-based learning framework to jointly train a second-order unlexicalized model and a first-order lexicalized model. Experiments on multiple datasets show the effectiveness of our second-order models compared with recent state-of-the-art methods. Our joint model achieves a 10% improvement over the previous state-of-the-art parser on the full WSJ test set.

Topics: Computation and language

High-resolution single-shot ultrafast imaging at ten trillion frames per second

Xuanke Zeng, Yi Cai, Shuiqin Zheng (2018)

Ultrafast imaging is a powerful tool for studying space-time dynamics in photonic material, plasma physics, living cells, and neural activity. Pushing the imaging speed to the quantum limit could reveal extraordinary scenes about the questionable quantization of life and intelligence, or the wave-particle duality of light. However, previous designs of ultrafast photography are intrinsically limited by framing speed. Here, we introduce a new technique based on a multiple non-collinear optical parametric amplifier principle (MOPA), which readily pushes the frame rate into the area of ten trillion frames per second with higher spatial resolution than 30 line pairs per millimeter. The MOPA imaging is applied to record the femtosecond early evolution of laser-induced plasma grating in air for the first time. Our approach avoids the intrinsic limitations of previous methods and thus can potentially be optimized for higher speed and resolution, opening the way of approaching quantum limits to test the fundamentals.

Topics: Instrumentation and detectors (physics); Optics
