ﻻ يوجد ملخص باللغة العربية
There is an explosive growth in the size of the input and/or intermediate data used and generated by modern and emerging applications. Unfortunately, modern computing systems are not capable of handling large amounts of data efficiently. Major concepts and components (e.g., the virtual memory system) and predominant execution models (e.g., the processor-centric execution model) used in almost all computing systems are designed without having modern applications overwhelming data demand in mind. As a result, accessing, moving, and processing large amounts of data faces important challenges in todays systems, making data a first-class concern and a prime performance and energy bottleneck in such systems. This thesis studies the root cause of inefficiency in modern computing systems when handling modern applications data demand, and aims to fundamentally address such inefficiencies, with a focus on two directions. First, we design SIMDRAM, an end-to-end processing-using-DRAM framework that aids the widespread adoption of processing-using-DRAM, a data-centric computation paradigm that improves the overall performance and efficiency of the system when computing large amounts of data by minimizing the cost of data movement and enabling computation where the data resides. Second, we introduce the Virtual Block Interface (VBI), a novel virtual memory framework that 1) eliminates the inefficiencies of the conventional virtual memory frameworks when handling the high memory demand in modern applications, and 2) is built from the ground up to understand, convey, and exploit data properties, to create opportunities for performance and efficiency improvements.
We aim to implement a Big Data/Extreme Computing (BDEC) capable system infrastructure as we head towards the era of Exascale computing - termed SAGE (Percipient StorAGe for Exascale Data Centric Computing). The SAGE system will be capable of storing
Next Generation Sequencing (NGS) technology has resulted in massive amounts of proteomics and genomics data. This data is of no use if it is not properly analyzed. ETL (Extraction, Transformation, Loading) is an important step in designing data analy
An emerging class of data-intensive applications involve the geographically dispersed extraction of complex scientific information from very large collections of measured or computed data. Such applications arise, for example, in experimental physics
Reliability is a fundamental requirement in any microprocessor to guarantee correct execution over its lifetime. The design rules related to reliability depend on the process technology being used and the expected operating conditions of the device.
The Payload Data Handling System (PDHS) of Gaia is a technological challenge, since it will have to process a huge amount of data with limited resources. Its main tasks include the optimal codification of science data, its packetisation and its compr