Do you want to publish a course? Click here

On the operating unit size of load/store architectures

107   0   0.0 ( 0 )
 Added by Kees Middelburg
 Publication date 2008
and research's language is English




Ask ChatGPT about the research

We introduce a strict version of the concept of a load/store instruction set architecture in the setting of Maurer machines. We take the view that transformations on the states of a Maurer machine are achieved by applying threads as considered in thread algebra to the Maurer machine. We study how the transformations on the states of the main memory of a strict load/store instruction set architecture that can be achieved by applying threads depend on the operating unit size, the cardinality of the instruction set, and the maximal number of states of the threads.



rate research

Read More

226 - Tie Hou 2009
In this paper, we study how certain conditions can affect the transformations on the states of the memory of a strict load-store Maurer ISA, when half of the data memory serves as the part of the operating unit.
424 - Shaoshan Liu , Yuhao Zhu , Bo Yu 2021
The commercialization of autonomous machines is a thriving sector, and likely to be the next major computing demand driver, after PC, cloud computing, and mobile computing. Nevertheless, a suitable computer architecture for autonomous machines is missing, and many companies are forced to develop ad hoc computing solutions that are neither scalable nor extensible. In this article, we analyze the demands of autonomous machine computing, and argue for the promise of dataflow architectures in autonomous machines.
Independent of the technology, it is generally expected that future nanoscale devices will be built from vast numbers of densely arranged devices that exhibit high failure rates. Other than that, there is little consensus on what type of technology and computing architecture holds most promises to go far beyond todays top-down engineered silicon devices. Cellular automata (CA) have been proposed in the past as a possible class of architectures to the von Neumann computing architecture, which is not generally well suited for future parallel and fine-grained nanoscale electronics. While the top-down engineered semi-conducting technology favors regular and locally interconnected structures, future bottom-up self-assembled devices tend to have irregular structures because of the current lack precise control over these processes. In this paper, we will assess random dynamical networks, namely Random Boolean Networks (RBNs) and Random Threshold Networks (RTNs), as alternative computing architectures and models for future information processing devices. We will illustrate that--from a theoretical perspective--they offer superior properties over classical CA-based architectures, such as inherent robustness as the system scales up, more efficient information processing capabilities, and manufacturing benefits for bottom-up designed devices, which motivates this investigation. We will present recent results on the dynamic behavior and robustness of such random dynamical networks while also including manufacturing issues in the assessment.
413 - Jia Yu , Wei Wu , Xi Chen 2007
With the scaling of technology and higher requirements on performance and functionality, power dissipation is becoming one of the major design considerations in the development of network processors. In this paper, we use an assertion-based methodology for system-level power/performance analysis to study two dynamic voltage scaling (DVS) techniques, traffic-based DVS and execution-based DVS, in a network processor model. Using the automatically generated distribution analyzers, we analyze the power and performance distributions and study their trade-offs for the two DVS policies with different parameter settings such as threshold values and window sizes. We discuss the optimal configurations of the two DVS policies under different design requirements. By a set of experiments, we show that the assertion-based trace analysis methodology is an efficient tool that can help a designer easily compare and study optimal architectural configurations in a large design space.
Commodity memory interfaces have difficulty in scaling memory capacity to meet the needs of modern multicore and big data systems. DRAM device density and maximum device count are constrained by technology, package, and signal in- tegrity issues that limit total memory capacity. Synchronous DRAM protocols require data to be returned within a fixed latency, and thus memory extension methods over commodity DDRx interfaces fail to support scalable topologies. Current extension approaches either use slow PCIe interfaces, or require expensive changes to the memory interface, which limits commercial adoptability. Here we propose twin-load, a lightweight asynchronous memory access mechanism over the synchronous DDRx interface. Twin-load uses two special loads to accomplish one access request to extended memory, the first serves as a prefetch command to the DRAM system, and the second asynchronously gets the required data. Twin-load requires no hardware changes on the processor side and only slight soft- ware modifications. We emulate this system on a prototype to demonstrate the feasibility of our approach. Twin-load has comparable performance to NUMA extended memory and outperforms a page-swapping PCIe-based system by several orders of magnitude. Twin-load thus enables instant capacity increases on commodity platforms, but more importantly, our architecture opens opportunities for the design of novel, efficient, scalable, cost-effective memory subsystems.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا