
Cache Replacement Algorithm

Added by: Sarwan Ali
Publication date: 2021
Language: English
Authors: Sarwan Ali





Cache replacement algorithms optimize the time a processor takes to obtain information: data the processor needs now, and will likely need in the future, is kept in the cache so it can be supplied immediately on request. A number of techniques (LIFO, FIFO, LRU, MRU, and hybrids) are used to organize this information so that the processor stays busy almost all the time, but every technique has limitations. We tried to overcome those limitations using a Probabilistic Graphical Model (PGM), which expresses conditional dependencies between random variables as a directed or undirected graph. In our research, we exploited the Bayesian network technique to predict the processor's future requests. The main goal of the research was to increase the cache hit rate without increasing the cache size, while reducing or at least maintaining the overhead. In the best case, the PGM technique achieved 7% more cache hits than the classical algorithms, which demonstrates the success of our technique as far as cache hits are concerned. Pre-eviction also proves to be a useful technique for obtaining more cache hits, and combining pre-eviction with pre-fetching using the PGM gives the results this research set out to achieve.
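
The abstract does not include code, but the core idea can be sketched. In the minimal sketch below, the class name PGMCache and all implementation details are hypothetical, not taken from the paper: a first-order conditional frequency table P(next block | current block), learned online, stands in for the Bayesian network. The most probable successor of the current request is pre-fetched, and the resident block least likely to follow it is pre-evicted, with LRU as a fallback when no statistics exist yet.

```python
from collections import defaultdict, OrderedDict

class PGMCache:
    """Minimal sketch: a fixed-size cache whose pre-fetch and pre-eviction
    decisions are driven by a learned table P(next block | current block),
    a one-edge stand-in for the paper's Bayesian network."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()   # block -> True; insertion order doubles as LRU
        self.counts = defaultdict(lambda: defaultdict(int))  # counts[a][b]: b followed a
        self.prev = None
        self.hits = self.misses = 0

    def _predict_next(self, block):
        succ = self.counts[block]
        return max(succ, key=succ.get) if succ else None

    def _evict_one(self, keep=None):
        # Pre-eviction: drop the resident block least likely to follow the
        # current request; fall back to LRU when no statistics exist yet.
        candidates = [b for b in self.cache if b != keep]
        if not candidates:
            return
        if self.prev is not None and self.counts[self.prev]:
            victim = min(candidates, key=lambda b: self.counts[self.prev].get(b, 0))
        else:
            victim = candidates[0]   # oldest entry (LRU fallback)
        del self.cache[victim]

    def access(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)
        else:
            self.misses += 1
            if len(self.cache) >= self.capacity:
                self._evict_one(keep=block)
            self.cache[block] = True
        if self.prev is not None:
            self.counts[self.prev][block] += 1   # online model update
        self.prev = block
        nxt = self._predict_next(block)          # pre-fetch the likeliest successor
        if nxt is not None and nxt not in self.cache:
            if len(self.cache) >= self.capacity:
                self._evict_one(keep=block)
            self.cache[nxt] = True

trace = [1, 2, 3, 1, 2, 3, 1, 2, 4, 1, 2, 3] * 20
c = PGMCache(capacity=3)
for b in trace:
    c.access(b)
print(f"hit rate: {c.hits / (c.hits + c.misses):.2f}")
```

Running it prints the hit rate on a small synthetic trace; the paper's evaluation is, of course, far more extensive than this toy setup.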

Related Research

Given an undirected, weighted graph, the minimum spanning tree (MST) is a tree that connects all of the vertices of the graph with minimum sum of edge weights. In real-world applications, network designers often seek to quickly find a replacement edge for each edge in the MST. For example, when a traffic accident closes a road in a transportation network, or a line goes down in a communication network, the replacement edge can reconnect the MST at lowest cost. In this paper, we consider the problem of finding the lowest-cost replacement edge for each edge of the MST. A previous algorithm by Tarjan takes $O(m \alpha(m, n))$ time, where $\alpha(m, n)$ is the inverse Ackermann function. Given the MST and sorted non-tree edges, our algorithm is the first that runs in $O(m+n)$ time and $O(m+n)$ space to find all replacement edges. Moreover, it is easy to implement, and our experimental study demonstrates fast performance on several types of graphs. Additionally, since the most vital edge is the tree edge whose removal causes the highest cost, our algorithm also finds it in linear time.
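
To make the problem concrete, here is a brute-force reference sketch, not the paper's linear-time algorithm: for each MST edge, delete it, find the two resulting components, and take the first non-tree edge (in sorted order) that crosses between them. This is $O(nm)$; the paper's contribution is doing the same job in $O(m+n)$ given the same inputs. All names below are illustrative.

```python
from collections import deque

def replacement_edges(n, tree_edges, nontree_edges_sorted):
    """For each MST edge, find the cheapest non-tree edge reconnecting the
    two components created by deleting it. Brute-force O(n*m) reference."""
    adj = {v: [] for v in range(n)}
    for u, v, w in tree_edges:
        adj[u].append(v)
        adj[v].append(u)

    def component(start, banned):
        # Vertices reachable from `start` without crossing the banned edge.
        seen, q = {start}, deque([start])
        while q:
            x = q.popleft()
            for y in adj[x]:
                if (x, y) != banned and (y, x) != banned and y not in seen:
                    seen.add(y)
                    q.append(y)
        return seen

    result = {}
    for u, v, w in tree_edges:
        side = component(u, (u, v))
        # First crossing edge in sorted order is the cheapest replacement.
        result[(u, v)] = next(
            ((a, b, wr) for a, b, wr in nontree_edges_sorted
             if (a in side) != (b in side)), None)
    return result

# Tiny example: a path 0-1-2-3 as the MST, plus two non-tree edges.
tree = [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
nontree = sorted([(0, 3, 4), (1, 3, 5)], key=lambda e: e[2])
for edge, rep in replacement_edges(4, tree, nontree).items():
    print(edge, "->", rep)
```
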
Program execution speed critically depends on increasing cache hits, as cache hits are orders of magnitude faster than misses. To increase cache hits, we focus on the problem of cache replacement: choosing which cache line to evict upon inserting a new line. This is challenging because it requires planning far ahead, and currently there is no known practical solution. As a result, current replacement policies typically resort to heuristics designed for specific common access patterns, which fail on more diverse and complex access patterns. In contrast, we propose an imitation learning approach to automatically learn cache access patterns by leveraging Belady's, an oracle policy that computes the optimal eviction decision given the future cache accesses. While directly applying Belady's is infeasible since the future is unknown, we train a policy conditioned only on past accesses that accurately approximates Belady's even on diverse and complex access patterns, and call this approach Parrot. When evaluated on 13 of the most memory-intensive SPEC applications, Parrot increases cache hit rates by 20% over the current state of the art. In addition, on a large-scale web search benchmark, Parrot increases cache hit rates by 61% over a conventional LRU policy. We release a Gym environment to facilitate research in this area, as data is plentiful, and further advancements can have significant real-world impact.
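
Parrot itself is a learned neural policy and is not reproduced here, but the oracle it imitates is simple to state: evict the cached line whose next use lies farthest in the future. A minimal sketch of that oracle, computable only when the full trace is known in advance, follows; the function names are illustrative.

```python
def belady_evict(cache, future):
    """Belady's oracle: evict the cached line whose next use lies
    farthest in the future (or never occurs again)."""
    def next_use(line):
        try:
            return future.index(line)
        except ValueError:
            return float("inf")          # never used again: perfect victim
    return max(cache, key=next_use)

def simulate(trace, capacity):
    cache, hits = set(), 0
    for i, line in enumerate(trace):
        if line in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                cache.remove(belady_evict(cache, trace[i + 1:]))
            cache.add(line)
    return hits / len(trace)

trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(f"Belady hit rate: {simulate(trace, capacity=3):.2f}")
```

Since the future is unknown at run time, Parrot trains on traces where this oracle's decisions can be computed offline and learns to approximate them from past accesses alone.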
A mesh is a graph that divides physical space into regularly-shaped regions. Mesh computations form the basis of many applications, e.g. finite-element methods, image rendering, and collision detection. In one important mesh primitive, called a mesh update, each mesh vertex stores a value and repeatedly updates this value based on the values stored in all neighboring vertices. The performance of a mesh update depends on the layout of the mesh in memory. This paper shows how to find a memory layout that guarantees that the mesh update has asymptotically optimal memory performance for any set of memory parameters. Such a memory layout is called cache-oblivious. Formally, for a $d$-dimensional mesh $G$, block size $B$, and cache size $M$ (where $M=\Omega(B^d)$), the mesh update of $G$ uses $O(1+|G|/B)$ memory transfers. The paper also shows how the mesh-update performance degrades for smaller caches, where $M=o(B^d)$. The paper then gives two algorithms for finding cache-oblivious mesh layouts. The first layout algorithm runs in time $O(|G|\log^2|G|)$ both in expectation and with high probability on a RAM. It uses $O(1+|G|\log^2(|G|/M)/B)$ memory transfers in expectation and $O(1+(|G|/B)(\log^2(|G|/M) + \log|G|))$ memory transfers with high probability in the cache-oblivious and disk-access machine (DAM) models. The layout is obtained by finding a fully balanced decomposition tree of $G$ and then performing an in-order traversal of the leaves of the tree. The second algorithm runs faster by almost a $\log|G|/\log\log|G|$ factor in all three memory models, both in expectation and with high probability. The layout is obtained by finding a relax-balanced decomposition tree of $G$ and then performing an in-order traversal of the leaves of the tree.
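
The following sketch illustrates the two ingredients named above, not the paper's algorithms: the layout here comes from plain recursive coordinate bisection, a simplified stand-in for the fully balanced decomposition trees the paper constructs, and the mesh update is one Jacobi-style averaging pass over vertices visited in layout order. All identifiers are hypothetical.

```python
import numpy as np

def bisection_layout(points):
    """Order mesh vertices by recursive coordinate bisection: split along
    the widest axis, lay out each half contiguously, and concatenate.
    A simplified stand-in for the decomposition-tree leaf order."""
    def recurse(ids):
        if len(ids) <= 2:
            return list(ids)
        pts = points[ids]
        axis = np.argmax(pts.max(axis=0) - pts.min(axis=0))
        order = ids[np.argsort(pts[:, axis], kind="stable")]
        mid = len(order) // 2
        return recurse(order[:mid]) + recurse(order[mid:])
    return recurse(np.arange(len(points)))

def mesh_update(values, neighbors, layout):
    """One mesh-update pass: each vertex, visited in layout order,
    replaces its value with the average of its neighbors' values."""
    new = values.copy()
    for v in layout:
        if neighbors[v]:
            new[v] = sum(values[u] for u in neighbors[v]) / len(neighbors[v])
    return new

# 4x4 grid mesh: vertices are grid points, edges join Manhattan-distance-1 pairs.
coords = np.array([(x, y) for x in range(4) for y in range(4)], dtype=float)
nbrs = {i: [j for j in range(16)
            if abs(coords[i] - coords[j]).sum() == 1] for i in range(16)}
layout = bisection_layout(coords)
vals = mesh_update(np.arange(16.0), nbrs, layout)
print("layout order:", layout)
```

The layout changes only the memory-access order, not the update's result; the paper's point is that a decomposition-tree-derived order makes that access pattern cache-oblivious.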
This report describes an implementation of a non-blocking concurrent shared-memory hash trie based on single-word compare-and-swap instructions. Insert, lookup, and remove operations that modify different parts of the hash trie can run independently of each other and do not contend. Remove operations ensure that unneeded memory is freed and that the trie is kept compact. Pseudocode for these operations is presented and a proof of correctness is given: we show that the implementation is linearizable and lock-free. Finally, benchmarks compare concurrent hash trie operations against the corresponding operations on other concurrent data structures, showing their performance and scalability.
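
A heavily simplified sketch of the CAS-retry pattern follows. It is not the report's Ctrie: the real structure CASes individual indirection nodes so that operations on different parts of the trie do not contend, whereas this sketch path-copies and publishes every insert with a single CAS on the root, and since Python exposes no hardware compare-and-swap, the AtomicRef class below emulates one with a lock. All names are illustrative.

```python
import threading

class AtomicRef:
    """Emulates a single-word compare-and-swap; real implementations use
    the hardware CAS instruction (e.g. AtomicReference on the JVM)."""
    def __init__(self, value):
        self._value, self._lock = value, threading.Lock()
    def get(self):
        return self._value
    def cas(self, expected, new):
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

BITS, WIDTH = 4, 16   # 4-bit branching for brevity (the report uses 5)

class CNode:          # immutable internal node with WIDTH slots
    __slots__ = ("slots",)
    def __init__(self, slots=None):
        self.slots = slots or (None,) * WIDTH

class SNode:          # immutable leaf holding one key/value pair
    __slots__ = ("key", "value")
    def __init__(self, key, value):
        self.key, self.value = key, value

def _inserted(node, key, value, shift):
    """Return a copy of the subtree with (key, value) inserted."""
    i = (hash(key) >> shift) & (WIDTH - 1)
    slot = node.slots[i]
    if slot is None:
        new_slot = SNode(key, value)
    elif isinstance(slot, SNode):
        if slot.key == key:
            new_slot = SNode(key, value)          # overwrite existing key
        else:                                     # expand collision downward
            sub = _inserted(CNode(), slot.key, slot.value, shift + BITS)
            new_slot = _inserted(sub, key, value, shift + BITS)
    else:
        new_slot = _inserted(slot, key, value, shift + BITS)
    slots = list(node.slots)
    slots[i] = new_slot
    return CNode(tuple(slots))

class HashTrie:
    def __init__(self):
        self.root = AtomicRef(CNode())
    def insert(self, key, value):
        while True:                     # lock-free retry loop
            old = self.root.get()
            if self.root.cas(old, _inserted(old, key, value, 0)):
                return
    def lookup(self, key):
        node, shift = self.root.get(), 0
        while isinstance(node, CNode):
            node = node.slots[(hash(key) >> shift) & (WIDTH - 1)]
            shift += BITS
        return node.value if isinstance(node, SNode) and node.key == key else None

t = HashTrie()
for k in range(100):
    t.insert(k, k * k)
print(t.lookup(7))   # 49
```
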
Graph search is one of the most successful algorithmic trends in near neighbor search. Several of the most popular and empirically successful algorithms are, at their core, a simple walk along a pruned near neighbor graph. Such algorithms consistently perform at the top of industrial speed benchmarks for applications such as embedding search. However, graph traversal applications often suffer from poor memory access patterns, and near neighbor search is no exception to this rule. Our measurements show that popular search indices such as the hierarchical navigable small-world graph (HNSW) can exhibit poor cache performance. To address this problem, we apply graph reordering algorithms to near neighbor graphs. Graph reordering is a memory layout optimization that groups commonly accessed nodes together in memory. We present exhaustive experiments applying several reordering algorithms to a leading graph-based near neighbor method based on the HNSW index. We find that reordering improves the query time by up to 40%, and we demonstrate that the time needed to reorder the graph is negligible compared to the time required to construct the index.
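
The paper evaluates several reordering algorithms; the sketch below shows only the simplest representative of that family, not the paper's specific methods: relabel vertices in BFS order from a hub so that vertices visited close together in a traversal also sit close together in memory. The function name and toy graph are illustrative.

```python
from collections import deque

def bfs_reorder(adjacency, source=0):
    """Relabel vertices in BFS order from a hub so that neighbors end up
    close together in memory -- a basic graph reordering heuristic."""
    order, seen, q = [], {source}, deque([source])
    while q:
        u = q.popleft()
        order.append(u)
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                q.append(v)
    order += [v for v in range(len(adjacency)) if v not in seen]  # disconnected rest
    new_id = {old: new for new, old in enumerate(order)}
    # Rebuild the adjacency lists under the new labels, in the new order.
    return [[new_id[v] for v in adjacency[old]] for old in order]

# Toy near-neighbor graph: vertex i links to its two ring neighbors.
graph = [[(i + 1) % 10, (i - 1) % 10] for i in range(10)]
print(bfs_reorder(graph, source=0))
```
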