
Short Note on Costs of Floating Point Operations on current x86-64 Architectures: Denormals, Overflow, Underflow, and Division by Zero

Published by Markus Wittmann
Publication date: 2015
Research field: Informatics Engineering
Paper language: English





Simple floating-point operations like addition or multiplication on normalized floating-point values can be computed by current AMD and Intel processors in three to five cycles. This is different for denormalized numbers, which appear when an underflow occurs and the value can no longer be represented as a normalized floating-point value. Here the costs are about two orders of magnitude higher.
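A minimal benchmark sketch that makes this penalty visible (not from the paper; the loop length, the multiplier, and the compiler flags are illustrative assumptions): the same multiply loop is timed once on a normalized start value and once on a denormalized one.

/* denorm.c -- a minimal sketch, assuming gcc or clang on x86-64 Linux:
   gcc -O1 denorm.c -o denorm
   The multiplier (1 - 1e-10) keeps each start value in its range
   (normal stays normal, denormal stays denormal) for the whole loop. */
#include <stdio.h>
#include <time.h>
#include <float.h>

static double time_loop(double start) {
    volatile double x = start;          /* volatile keeps the FP ops alive */
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < 100000000L; ++i)
        x = x * (1.0 - 1e-10);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
}

int main(void) {
    /* DBL_MIN is the smallest normalized double; DBL_MIN * 1e-8 is denormal. */
    printf("normalized:   %.3f s\n", time_loop(1.0));
    printf("denormalized: %.3f s\n", time_loop(DBL_MIN * 1e-8));
    return 0;
}

If the denormalized loop is not dramatically slower, flush-to-zero/denormals-are-zero (FTZ/DAZ) may already be enabled (for example via -ffast-math); _MM_SET_FLUSH_ZERO_MODE from <xmmintrin.h> toggles FTZ explicitly, trading the assist cost for a loss of gradual underflow.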




Read also

Neil J. Gunther (2012)
The electrical power consumed by typical magnetic hard disk drives (HDD) not only increases linearly with the number of spindles but, more significantly, it increases as very fast power-laws of speed (RPM) and diameter. Since the theoretical basis for this relationship is neither well-known nor readily accessible in the literature, we show how these exponents arise from aerodynamic disk drag and discuss their import for green storage capacity planning.
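A hedged sketch of where such fast power-laws can come from (simple dimensional analysis of turbulent disk drag, not necessarily the paper's derivation): the aerodynamic drag torque on a disk of radius $R$ spinning at angular speed $\omega$ in air of density $\rho$ scales as $\tau \sim \rho\,\omega^2 R^5$, so the dissipated power scales as $P = \tau\omega \sim \rho\,\omega^3 R^5$. In this idealized limit power grows with the cube of RPM and the fifth power of diameter; Reynolds-number corrections shift the measured exponents somewhat.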
Useful models of loop kernel runtimes on out-of-order architectures require an analysis of the in-core performance behavior of instructions and their dependencies. While an instruction throughput prediction sets a lower bound to the kernel runtime, the critical path defines an upper bound. Such predictions are an essential part of analytic (i.e., white-box) performance models like the Roofline and Execution-Cache-Memory (ECM) models. They enable a better understanding of the performance-relevant interactions between hardware architecture and loop code. The Open Source Architecture Code Analyzer (OSACA) is a static analysis tool for predicting the execution time of sequential loops. It previously supported only x86 (Intel and AMD) architectures and simple, optimistic full-throughput execution. We have heavily extended OSACA to support ARM instructions and critical path prediction including the detection of loop-carried dependencies, which turns it into a versatile cross-architecture modeling tool. We show runtime predictions for code on Intel Cascade Lake, AMD Zen, and Marvell ThunderX2 micro-architectures based on machine models from available documentation and semi-automatic benchmarking. The predictions are compared with actual measurements.
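As an illustration of the two bounds, here is a toy sketch with made-up latencies, ports, and dependencies (not OSACA's actual machine model or API): the throughput bound is the busiest port's cycle count per iteration, and the critical-path bound is the longest latency chain through the dependency graph.

/* bounds.c -- toy in-core model: throughput lower bound vs.
   critical-path upper bound for a hypothetical 3-instruction loop body. */
#include <stdio.h>

#define N 3

int main(void) {
    /* Hypothetical per-instruction data: latency in cycles, execution
       port used, and index of the producing instruction (-1 = none). */
    int latency[N] = {4, 4, 1};   /* e.g. mul, add, store */
    int port[N]    = {0, 1, 1};
    int dep[N]     = {-1, 0, 1};  /* dependency chain: 0 -> 1 -> 2 */

    /* Throughput bound: cycles per iteration on the busiest port,
       assuming each instruction occupies its port for one cycle. */
    double port_cycles[2] = {0.0, 0.0};
    for (int i = 0; i < N; ++i) port_cycles[port[i]] += 1.0;
    double tp = port_cycles[0] > port_cycles[1] ? port_cycles[0] : port_cycles[1];

    /* Critical path: longest latency chain; instructions are in program
       order, so a single forward pass suffices here. */
    int finish[N], cp = 0;
    for (int i = 0; i < N; ++i) {
        int start = dep[i] >= 0 ? finish[dep[i]] : 0;
        finish[i] = start + latency[i];
        if (finish[i] > cp) cp = finish[i];
    }

    printf("throughput lower bound:    %.1f cy/iter\n", tp);
    printf("critical path upper bound: %d cy/iter\n", cp);
    return 0;
}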
The aim of this work is to quantitatively evaluate the impact of computation on the energy consumption of ARM MPSoC platforms, exploiting CPUs, embedded GPUs, and FPGAs. One of them possibly represents the future of High Performance Computing systems: a prototype of an Exascale supercomputer. Performance and energy measurements are made using a state-of-the-art direct $N$-body code from the astrophysical domain. We provide a comparison of the time-to-solution and energy-delay-product metrics for different software configurations. We have shown that FPGA technologies can be used for application kernel acceleration and are emerging as a promising alternative to traditional HPC technologies, which focus purely on peak performance rather than on power efficiency.
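For reference, the energy-delay product combines the two measured quantities into one figure of merit: with energy to solution $E$ and time to solution $T$, the common definition is $\mathrm{EDP} = E \cdot T$ (variants that weight delay more heavily, such as $E \cdot T^2$, also appear in the literature).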
Branko J. Malesevic (2007)
In this paper we consider successive iterations of the first-order differential operations in space $\mathbf{R}^3$.
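Assuming the first-order operations in question are the usual gradient, curl, and divergence, the second-order compositions reduce to the classical identities $\nabla \times (\nabla f) = 0$, $\nabla \cdot (\nabla \times \mathbf{F}) = 0$, $\nabla \cdot (\nabla f) = \Delta f$, and $\nabla \times (\nabla \times \mathbf{F}) = \nabla(\nabla \cdot \mathbf{F}) - \Delta \mathbf{F}$, which is what makes higher iterates tractable.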
Common meadows are fields expanded with a total inverse function. Division by zero produces an additional value denoted with $a$ that propagates through all operations of the meadow signature (this additional value can be interpreted as an error element). We provide a basis theorem for so-called common cancellation meadows of characteristic zero, that is, common meadows of characteristic zero that admit a certain cancellation law.
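A minimal sketch of the idea in code (an illustration with hypothetical names, not the paper's formal construction): inversion is totalized by mapping $0^{-1}$ to the error element $a$, which then absorbs through addition and multiplication.

/* meadow.c -- totalized inverse with an absorbing error element `a`. */
#include <stdio.h>
#include <stdbool.h>

typedef struct { bool is_a; double v; } Meadow;  /* is_a marks the element a */

static Meadow mk(double v) { return (Meadow){ false, v }; }
static Meadow err(void)    { return (Meadow){ true, 0.0 }; }

static Meadow m_add(Meadow x, Meadow y) {
    if (x.is_a || y.is_a) return err();          /* a + y = a */
    return mk(x.v + y.v);
}
static Meadow m_mul(Meadow x, Meadow y) {
    if (x.is_a || y.is_a) return err();          /* a * y = a */
    return mk(x.v * y.v);
}
static Meadow m_inv(Meadow x) {
    if (x.is_a || x.v == 0.0) return err();      /* 0^{-1} = a */
    return mk(1.0 / x.v);
}

int main(void) {
    Meadow z = m_mul(mk(3.0), m_inv(mk(0.0)));   /* 3 * 0^{-1} */
    printf("%s\n", z.is_a ? "a (error element)" : "ordinary number");
    return 0;
}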