ﻻ يوجد ملخص باللغة العربية
Data intensive applications on clusters often require requests quickly be sent to the node managing the desired data. In many applications, one must look through a sorted tree structure to determine the responsible node for accessing or storing the data. Examples include object tracking in sensor networks, packet routing over the internet, request processing in publish-subscribe middleware, and query processing in database systems. When the tree structure is larger than the CPU cache, the standard implementation potentially incurs many cache misses for each lookup; one cache miss at each successive level of the tree. As the CPU-RAM gap grows, this performance degradation will only become worse in the future. We propose a solution that takes advantage of the growing speed of local area networks for clusters. We split the sorted tree structure among the nodes of the cluster. We assume that the structure will fit inside the aggregation of the CPU caches of the entire cluster. We then send a word over the network (as part of a larger packet containing other words) in order to examine the tree structure in another nodes CPU cache. We show that this is often faster than the standard solution, which locally incurs multiple cache misses while accessing each successive level of the tree.
Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be employed additionally, a
With the advancement of technology, the data generated in our lives is getting faster and faster, and the amount of data that various applications need to process becomes extremely huge. Therefore, we need to put more effort into analyzing data and e
Power flow analysis plays a fundamental and critical role in the energy management system (EMS). It is required to well accommodate large and complex power system. To achieve a high performance and accurate power flow analysis, a graph computing base
The FFT of three-dimensional (3D) input data is an important computational kernel of numerical simulations and is widely used in High Performance Computing (HPC) codes running on a large number of processors. Performance of many scientific applicatio
Wireless sensor networks become integral part of our life. These networks can be used for monitoring the data in various domain due to their flexibility and functionality. Query processing and optimization in the WSN is a very challenging task becaus