
MetaFlow: a Scalable Metadata Lookup Service for Distributed File Systems in Data Centers

Published by: Peng Sun
Publication date: 2016
Research field: Informatics Engineering
Paper language: English





In large-scale distributed file systems, efficient metadata operations are critical since most file operations have to interact with metadata servers first. In existing distributed hash table (DHT) based metadata management systems, the lookup service can be a performance bottleneck due to its significant CPU overhead. Our investigations showed that the lookup service could reduce system throughput by up to 70%, and increase system latency by a factor of up to 8 compared to ideal scenarios. In this paper, we present MetaFlow, a scalable metadata lookup service that utilizes software-defined networking (SDN) techniques to distribute the lookup workload over network components. MetaFlow tackles the lookup bottleneck by leveraging a B-tree, constructed over the physical topology, to manage the flow tables of SDN-enabled switches; metadata requests can therefore be forwarded to the appropriate servers using only switches. Extensive performance evaluations in both simulations and a testbed showed that MetaFlow increases system throughput by a factor of up to 3.2 and reduces system latency by a factor of up to 5 compared to DHT-based systems. We also deployed MetaFlow in a distributed file system and demonstrated significant performance improvement.
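The forwarding idea is compact enough to sketch. Below is a minimal, illustrative Python model of range-based routing through a tree of switches, assuming metadata keys are hashed into a flat integer space and each metadata server owns one contiguous range; the `Switch` class, `route` method, and server names are stand-ins for real SDN flow-table entries, not MetaFlow's actual implementation.

```python
# Minimal sketch of range-based forwarding through a "B-tree over the
# topology". Names and key-space layout are assumptions for illustration.
import bisect
import hashlib

class Switch:
    """An SDN switch whose flow table maps key ranges to next hops."""
    def __init__(self, boundaries, next_hops):
        # boundaries[i] is the exclusive upper bound of range i;
        # next_hops[i] is the child switch or metadata server for range i.
        self.boundaries = boundaries
        self.next_hops = next_hops

    def route(self, key):
        # One flow-table match: find the range containing the key.
        i = bisect.bisect_right(self.boundaries, key)
        return self.next_hops[i]

def metadata_key(path, space=2**20):
    """Hash a file path into the flat key space."""
    digest = hashlib.sha1(path.encode()).digest()
    return int.from_bytes(digest[:8], "big") % space

# Two-level topology: a core switch splits the key space between two
# ToR switches, each of which splits its half between two servers.
tor0 = Switch([2**18], ["mds-0", "mds-1"])
tor1 = Switch([3 * 2**18], ["mds-2", "mds-3"])
core = Switch([2**19], [tor0, tor1])

key = metadata_key("/home/alice/data.bin")
hop = core
while isinstance(hop, Switch):
    hop = hop.route(key)
print(hop)  # one of "mds-0".."mds-3": the server handling this request
```

The point of the sketch is that no general-purpose server touches the request in flight: every hop is a single range match, which is exactly what a switch flow table can do in hardware.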




Read also

Context information has emerged as an important resource for enabling autonomy and flexibility in pervasive applications. The widespread use of context information necessitates efficient wide-area lookup services. In this paper, we present the design and implementation of a peer-to-peer context lookup system that supports context-aware applications across multiple smart spaces. Our system provides a distributed repository for context storage and a semantic peer-to-peer network for context lookup. Collaborative context-aware applications that utilize different context information in multiple smart spaces can be easily built by invoking the pull or push services provided by our system. We outline the design and implementation of our system and validate it through the development of cross-domain applications.
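As a rough illustration of the pull/push services mentioned above, the following sketch models a context repository that applications can query (pull) or subscribe to (push); the `ContextRepository` class and its method names are assumptions for demonstration, not the paper's API.

```python
# Illustrative pull/push context repository; not the paper's actual system.
from collections import defaultdict

class ContextRepository:
    """Stores (entity, attribute) -> value and notifies subscribers."""
    def __init__(self):
        self.store = {}
        self.subscribers = defaultdict(list)

    def publish(self, entity, attribute, value):
        self.store[(entity, attribute)] = value
        for callback in self.subscribers[(entity, attribute)]:
            callback(value)  # push service: notify interested apps

    def query(self, entity, attribute):
        return self.store.get((entity, attribute))  # pull service

repo = ContextRepository()
repo.subscribers[("alice", "location")].append(
    lambda loc: print("alice moved to", loc))
repo.publish("alice", "location", "meeting-room-2")
print(repo.query("alice", "location"))  # meeting-room-2
```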
In the current era of Big Data, data engineering has become an essential field of study across many branches of science. Advances in Artificial Intelligence (AI) have broadened the scope of data engineering and opened up new applications in both enterprise and research communities. Aggregations (also termed reduce in functional programming) are an integral functionality in these applications. They are traditionally aimed at generating meaningful information from large datasets, and today they are also used to engineer more effective features for complex AI models. Aggregations are usually carried out on top of data abstractions such as tables/arrays and are combined with other operations such as grouping of values. There are frameworks that excel in these domains individually, but we believe there is an essential need for a data analytics tool that can universally integrate with existing frameworks and thereby increase the productivity and efficiency of the entire data analytics pipeline. Cylon endeavors to fill this void. In this paper, we present Cylon's fast and scalable aggregation operations, implemented on top of a distributed in-memory table structure that universally integrates with existing frameworks.
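The distributed group-by aggregation pattern underlying such systems is easy to sketch: each worker aggregates its local partition, then the partial results are merged. The code below is a framework-agnostic Python illustration of that pattern, not Cylon's actual API.

```python
# Local-aggregate then merge: the core pattern of distributed group-by sums.
from collections import defaultdict

def local_aggregate(rows):
    """Each worker sums its own partition, grouped by key."""
    partial = defaultdict(float)
    for key, value in rows:
        partial[key] += value
    return partial

def merge(partials):
    """Combine per-worker partial sums into the global result."""
    total = defaultdict(float)
    for partial in partials:
        for key, value in partial.items():
            total[key] += value
    return dict(total)

# Two workers, each holding one partition of a (key, value) table.
partition0 = [("a", 1.0), ("b", 2.0), ("a", 3.0)]
partition1 = [("b", 4.0), ("c", 5.0)]
print(merge([local_aggregate(partition0), local_aggregate(partition1)]))
# {'a': 4.0, 'b': 6.0, 'c': 5.0}
```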
J. Lowell Wofford, 2021
As distributed systems grow in scale and complexity, the need for flexible automation of systems-management functions grows with them. We outline a framework for building tools that provide distributed, scalable, declarative, modular, and continuous automation for distributed systems. We focus on four points of design: 1) a state-management approach that prescribes a source of truth for configured and discovered system states; 2) a technique for solving the declarative unification problem for a class of automation problems, providing state convergence and modularity; 3) an eventual-consistency approach to state synchronization that provides automation at scale; 4) an event-driven architecture that provides always-on state enforcement. We describe the methodology, the software architecture of the framework, and the constraints under which these techniques apply to an automation problem. We overview a reference application built on this framework that provides state-aware system provisioning and node lifecycle management, highlighting its key advantages. We conclude with a discussion of current and future applications.
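The state-convergence idea in points 1 and 2 can be sketched as a reconciliation loop that compares configured (desired) state against discovered (actual) state and emits corrective actions; the function name and state keys below are illustrative, not the framework's interface.

```python
# Hedged sketch of a declarative convergence loop: diff desired vs.
# discovered state and emit the mutations needed to close the gap.
def converge(desired, discovered):
    """Yield (key, have, want) mutations that move the system toward desired."""
    for key, want in desired.items():
        have = discovered.get(key)
        if have != want:
            yield (key, have, want)

desired = {"node1/power": "on", "node1/image": "v2"}
discovered = {"node1/power": "on", "node1/image": "v1"}
for key, have, want in converge(desired, discovered):
    print(f"enforce {key}: {have} -> {want}")
# enforce node1/image: v1 -> v2
```

Run continuously on state-change events rather than on a timer, a loop like this gives the "always-on enforcement" of point 4.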
Cloud service providers are distributing data centers geographically to minimize energy costs through intelligent workload distribution. With increasing data volumes in emerging cloud workloads, it is critical to factor in the network costs of transferring workloads across data centers. For geo-distributed data centers, many researchers have explored strategies for energy-cost minimization and intelligent inter-data-center workload distribution separately. However, prior work does not comprehensively and simultaneously consider data center energy costs, data transfer costs, and data center queueing delay. In this paper, we propose a novel game-theory-based workload management framework that takes a holistic approach to the cloud operating-cost minimization problem by making intelligent scheduling decisions aware of data transfer costs and data center queueing delay. Our framework performs intelligent workload management that considers heterogeneity in data center compute capability, cooling power, interference effects from task co-location in servers, time-of-use electricity pricing, renewable energy, net metering, peak demand pricing distribution, and network pricing. Our simulations show that the proposed game-theoretic technique minimizes the cloud operating cost more effectively than existing approaches.
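To make the holistic trade-off concrete, here is a toy per-task cost comparison across data centers that sums energy, transfer, and queueing-delay terms; the cost terms, weights, and numbers are invented for illustration and are not the paper's game-theoretic formulation.

```python
# Toy placement cost: energy + network transfer + queueing-delay penalty.
# All parameters below are made-up illustrative values.
def placement_cost(dc, task_size_gb):
    energy = dc["price_kwh"] * dc["kwh_per_gb"] * task_size_gb
    transfer = dc["net_price_gb"] * task_size_gb
    delay_penalty = dc["queue_delay_s"] * dc["delay_weight"]
    return energy + transfer + delay_penalty

data_centers = [
    {"name": "dc-east", "price_kwh": 0.12, "kwh_per_gb": 0.5,
     "net_price_gb": 0.02, "queue_delay_s": 4.0, "delay_weight": 0.01},
    {"name": "dc-west", "price_kwh": 0.09, "kwh_per_gb": 0.5,
     "net_price_gb": 0.05, "queue_delay_s": 1.0, "delay_weight": 0.01},
]
best = min(data_centers, key=lambda dc: placement_cost(dc, task_size_gb=10))
print(best["name"])  # dc-east: its cheaper network outweighs pricier energy
```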
Runyu Zhang, Chaoshu Yang, 2019
Existing path lookup routines in file systems need to construct an auxiliary index in memory or traverse the dentries of the directory file sequentially, which brings either heavy writes or a large timing cost. This paper designs a novel path lookup mechanism, Content-Indexed Browsing (CIB), for file systems on persistent memory, in which the structure of directory files is itself an exclusive index that can be searched in $O(\log n)$ time. We implement CIB in a real persistent memory file system, PMFS, yielding CIB-PMFS. Comprehensive evaluations show that CIB achieves a multi-fold performance improvement over the conventional lookup schemes in PMFS and brings a 20.4% improvement in the overall performance of PMFS. Furthermore, CIB reduces writes to persistent memory by orders of magnitude compared with existing extra-index schemes.
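The $O(\log n)$ claim follows from keeping dentries in searchable order inside the directory file itself, so no auxiliary index is needed. The sketch below shows the binary-search idea on an in-memory stand-in, assuming sorted dentry names; the real CIB structure lives in persistent memory within PMFS.

```python
# In-memory stand-in for a content-indexed directory: dentry names are
# kept sorted, so lookup is binary search instead of a linear scan.
import bisect

class Directory:
    def __init__(self):
        self.names = []   # sorted dentry names (the "content index")
        self.inodes = []  # inode numbers, parallel to names

    def insert(self, name, inode):
        i = bisect.bisect_left(self.names, name)
        self.names.insert(i, name)
        self.inodes.insert(i, inode)

    def lookup(self, name):
        i = bisect.bisect_left(self.names, name)  # O(log n) probes
        if i < len(self.names) and self.names[i] == name:
            return self.inodes[i]
        return None

d = Directory()
for n, ino in [("etc", 2), ("usr", 3), ("var", 4)]:
    d.insert(n, ino)
print(d.lookup("usr"))  # 3, found via binary search
```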