
Monitoring, Analyzing, and Controlling Internet-scale Systems with ACME

Added by David Oppenheimer
Publication date: 2004
Language: English





Analyzing and controlling large distributed services under a wide range of conditions is difficult. Yet these capabilities are essential to a number of important development and operational tasks such as benchmarking, testing, and system management. To facilitate these tasks, we have built the Application Control and Monitoring Environment (ACME), a scalable, flexible infrastructure for monitoring, analyzing, and controlling Internet-scale systems. ACME consists of two parts. ISING, the Internet Sensor In-Network agGregator, queries sensors and aggregates the results as they are routed through an overlay network. ENTRIE, the ENgine for TRiggering Internet Events, uses the data streams supplied by ISING, in combination with a user's XML configuration file, to trigger actuators such as killing processes during a robustness benchmark or paging a system administrator when predefined anomalous conditions are observed. In this paper we describe the design, implementation, and evaluation of ACME and its constituent parts. We find that for a 512-node system running atop an emulated Internet topology, ISING's use of in-network aggregation can reduce end-to-end query-response latency by more than 50% compared to using either direct network connections or the same overlay network without aggregation. We also find that an untuned implementation of ACME can invoke an actuator on one or all nodes in response to a discrete or aggregate event in less than four seconds, and we illustrate ACME's applicability to concrete benchmarking and monitoring scenarios.
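
To give a concrete feel for the in-network aggregation that ISING performs, the minimal Python sketch below models an overlay tree in which each node merges its children's partial results with its own sensor reading before replying. The `OverlayNode` class, the `aggregate` callback, and the sensor functions are illustrative assumptions for this sketch, not the ACME/ISING API.

```python
# Illustrative sketch (assumed design, not the ACME/ISING API): in-network
# aggregation over an overlay tree. Each node merges its children's partial
# results with its own sensor reading and forwards a single value upward,
# rather than every node answering the query root directly.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class OverlayNode:
    name: str
    read_sensor: Callable[[], float]               # local sensor, e.g. CPU load
    children: List["OverlayNode"] = field(default_factory=list)

    def query(self, aggregate: Callable[[List[float]], float]) -> float:
        # Recursively collect partial aggregates from the children, then fold
        # in the local reading; only one value per subtree travels upward.
        partials = [child.query(aggregate) for child in self.children]
        return aggregate(partials + [self.read_sensor()])


if __name__ == "__main__":
    leaves = [OverlayNode(f"leaf{i}", read_sensor=lambda i=i: float(i)) for i in range(4)]
    root = OverlayNode("root", read_sensor=lambda: 0.0, children=leaves)
    print(root.query(max))   # MAX over all sensors -> 3.0
```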



Related research

Serverless computing is increasingly popular because of its lower cost and easier deployment. Several cloud service providers (CSPs) offer serverless computing on their public clouds, but this may bring a risk of vendor lock-in. To avoid this limitation, many open-source serverless platforms have emerged that allow developers to freely deploy and manage functions on self-hosted clouds. However, building effective functions requires considerable expertise and a thorough comprehension of the platform frameworks and features that affect performance. It is a challenge for a service developer to differentiate between serverless platforms and select the appropriate one for different demands and scenarios. Thus, we elaborate on the frameworks and event processing models of four popular open-source serverless platforms and identify their salient idiosyncrasies. We analyze the root causes of performance differences between different service exporting and auto-scaling modes on those platforms. Further, we provide several insights for future work, such as auto-scaling and metric collection.
Network Traffic Monitoring and Analysis (NTMA) represents a key component of network management, especially to guarantee the correct operation of large-scale networks such as the Internet. As the complexity of Internet services and the volume of traffic continue to increase, it becomes difficult to design scalable NTMA applications. Applications such as traffic classification and policing require real-time and scalable approaches. Anomaly detection and security mechanisms require quickly identifying and reacting to unpredictable events while processing millions of heterogeneous events. Lastly, the system has to collect, store, and process massive sets of historical data for post-mortem analysis. Those are precisely the challenges faced by general big data approaches: Volume, Velocity, Variety, and Veracity. This survey brings together NTMA and big data. We catalog previous work on NTMA that adopts big data approaches to understand to what extent the potential of big data is being explored in NTMA. This survey mainly focuses on approaches and technologies to manage big NTMA data, and additionally briefly discusses big data analytics (e.g., machine learning) for the sake of NTMA. Finally, we provide guidelines for future work, discussing lessons learned and research directions.
A Range-Skyline Query (RSQ) is the combination of a range query and a skyline query. It is a practical query type for multi-criteria decision services: it can include both spatial and non-spatial information, making the results more useful than a plain skyline search when location matters. Furthermore, a Continuous Range-Skyline Query (CRSQ) is an extension of RSQ in which the system continuously reports the skyline results for a query within a given search range. This work focuses on RSQ and CRSQ within a specific range in Internet of Mobile Things (IoMT) applications. Many server-client approaches for CRSQ have been proposed, but they are sensitive to the number of moving objects. We propose an effective and non-centralized approach, the Distributed Continuous Range-Skyline Query process (DCRSQ process), for supporting RSQ and CRSQ in mobile environments. By considering mobility, the proposed approach can predict when an object will fall in the query range and ignore irrelevant information when deriving the results, thus saving computation overhead. The proposed approach, the DCRSQ process, is analyzed in terms of cost and validated with extensive simulation experiments. The results show that the DCRSQ process outperforms the existing approaches in different scenarios and aspects.
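
For intuition about what a range-skyline query computes, the short Python sketch below evaluates a plain, centralized range-skyline over static points: keep the in-range points that no other in-range point dominates. It is an illustrative assumption-laden sketch, not the distributed DCRSQ process described in the abstract; the `Point` layout and function names are invented for this example.

```python
# Illustrative sketch (a plain, centralized range-skyline query, not DCRSQ):
# a point is in the range-skyline if it lies inside the query range and no
# other in-range point is at least as good on every attribute and strictly
# better on at least one (attributes are minimized here).

from typing import List, Tuple

Point = Tuple[float, float, Tuple[float, ...]]    # (x, y, attributes to minimize)


def in_range(p: Point, center: Tuple[float, float], radius: float) -> bool:
    dx, dy = p[0] - center[0], p[1] - center[1]
    return dx * dx + dy * dy <= radius * radius


def dominates(a: Point, b: Point) -> bool:
    return all(x <= y for x, y in zip(a[2], b[2])) and any(x < y for x, y in zip(a[2], b[2]))


def range_skyline(points: List[Point], center: Tuple[float, float], radius: float) -> List[Point]:
    cands = [p for p in points if in_range(p, center, radius)]
    return [p for p in cands if not any(dominates(q, p) for q in cands if q is not p)]


if __name__ == "__main__":
    pts = [(1, 1, (3.0, 2.0)), (2, 2, (1.0, 5.0)), (9, 9, (0.5, 0.5)), (1, 2, (4.0, 4.0))]
    # (9, 9, ...) falls outside the range; (1, 2, ...) is dominated by (1, 1, ...).
    print(range_skyline(pts, center=(0.0, 0.0), radius=5.0))
```
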
Randomized exponential backoff is a widely deployed technique for coordinating access to a shared resource. A good backoff protocol should, arguably, satisfy three natural properties: (i) it should provide constant throughput, wasting as little time as possible; (ii) it should require few failed access attempts, minimizing the amount of wasted effort; and (iii) it should be robust, continuing to work efficiently even if some of the access attempts fail for spurious reasons. Unfortunately, exponential backoff has some well-known limitations in two of these areas: it provides poor (sub-constant) throughput in the worst case, and it is not robust to resource acquisition failures. The goal of this paper is to fix exponential backoff by making it scalable, particularly focusing on the case where processes arrive in an on-line, worst-case fashion. We present a relatively simple backoff protocol, Re-Backoff, that has, at its heart, a version of exponential backoff. It guarantees expected constant throughput with dynamic process arrivals and requires only an expected polylogarithmic number of access attempts per process. Re-Backoff is also robust to periods during which the shared resource is unavailable. If the resource is unavailable for $D$ time slots, Re-Backoff provides the following guarantees. When the number of packets is a finite $n$, the average expected number of access attempts for successfully sending a packet is $O(\log^2(n + D))$. In the infinite case, the average expected number of access attempts for successfully sending a packet is $O(\log^2(\eta) + \log^2(D))$, where $\eta$ is the maximum number of processes that are ever in the system concurrently.
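
For reference, the Python sketch below shows plain randomized exponential backoff, the primitive that Re-Backoff builds on. It is a toy illustration only: it does not implement Re-Backoff's throughput or robustness guarantees, and the function and parameter names are assumptions made for this example.

```python
# Toy sketch of plain randomized exponential backoff, the primitive that
# Re-Backoff builds on; it does not capture Re-Backoff's guarantees, and all
# names and parameters here are illustrative assumptions.

import random
import time


def acquire_with_backoff(try_acquire, max_attempts=20, base_delay=0.01, cap=2.0):
    """Call try_acquire() until it succeeds; after each failure, sleep for a
    random delay drawn from a window that doubles per attempt (capped)."""
    for attempt in range(max_attempts):
        if try_acquire():
            return attempt + 1                    # number of attempts used
        window = min(cap, base_delay * (2 ** attempt))
        time.sleep(random.uniform(0.0, window))
    raise TimeoutError("resource not acquired within max_attempts")


if __name__ == "__main__":
    # Toy shared resource that succeeds with 30% probability per attempt.
    print("attempts used:", acquire_with_backoff(lambda: random.random() < 0.3))
```
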
Marine pollution is a growing worldwide concern, affecting the health of marine ecosystems, human health, climate change, and weather patterns. To reduce underwater pollution, it is critical to have access to accurate information about the extent of marine pollutants, as otherwise appropriate countermeasures and cleanup measures cannot be chosen. Currently such information is difficult to acquire, as existing monitoring solutions are highly laborious or costly, limited to specific pollutants, and have limited spatial and temporal resolution. In this article, we present a research vision of large-scale autonomous marine pollution monitoring that uses coordinated groups of autonomous underwater vehicles (AUVs) to monitor the extent and characteristics of marine pollutants. We highlight key requirements and reference technologies to establish a research roadmap for realizing this vision. We also address the feasibility of our vision, carrying out controlled experiments that address classification of pollutants and collaborative underwater processing, two key research challenges for our vision.