No Arabic abstract
Disaggregating resources in data centers is an emerging trend. Recent work has begun to explore memory disaggregation, but suffers limitations including lack of consideration of the complexity of cloud-based deployment, including heterogeneous hardware and APIs for cloud users and operators. In this paper, we present FluidMem, a complete system to realize disaggregated memory in the datacenter. Going beyond simply demonstrating remote memory is possible, we create an entire Memory as a Service. We define the requirements of Memory as a Service and build its implementation in Linux as FluidMem. We present a performance analysis of FluidMem and demonstrate that it transparently supports remote memory for standard applications such as MongoDB and genome sequencing applications.
Web application performance is heavily reliant on the hit rate of memory-based caches. Current DRAM-based web caches statically partition their memory across multiple applications sharing the cache. This causes under utilization of memory which negatively impacts cache hit rates. We present Memshare, a novel web memory cache that dynamically manages memory across applications. Memshare provides a resource sharing model that guarantees private memory to different applications while dynamically allocating the remaining shared memory to optimize overall hit rate. Todays high cost of DRAM storage and the availability of high performance CPU and memory bandwidth, make web caches memory capacity bound. Memshares log-structured design allows it to provide significantly higher hit rates and dynamically partition memory among applications at the expense of increased CPU and memory bandwidth consumption. In addition, Memshare allows applications to use their own eviction policy for their objects, independent of other applications. We implemented Memshare and ran it on a week-long trace from a commercial memcached provider. We demonstrate that Memshare increases the combined hit rate of the applications in the trace by an 6.1% (from 84.7% hit rate to 90.8% hit rate) and reduces the total number of misses by 39.7% without affecting system throughput or latency. Even for single-tenant applications, Memshare increases the average hit rate of the current state-of-the-art memory cache by an additional 2.7% on our real-world trace.
Secure Computation (SC) is a family of cryptographic primitives for computing on encrypted data in single-party and multi-party settings. SC is being increasingly adopted by industry for a variety of applications. A significant obstacle to using SC for practical applications is the memory overhead of the underlying cryptography. We develop MAGE, an execution engine for SC that efficiently runs SC computations that do not fit in memory. We observe that, due to their intended security guarantees, SC schemes are inherently oblivious -- their memory access patterns are independent of the input data. Using this property, MAGE calculates the memory access pattern ahead of time and uses it to produce a memory management plan. This formulation of memory management, which we call memory programming, is a generalization of paging that allows MAGE to provide a highly efficient virtual memory abstraction for SC. MAGE outperforms the OS virtual memory system by up to an order of magnitude, and in many cases, runs SC computations that do not fit in memory at nearly the same speed as if the underlying machines had unbounded physical memory to fit the entire computation.
It is widely acknowledged that the forthcoming 5G architecture will be highly heterogeneous and deployed with a high degree of density. These changes over the current 4G bring many challenges on how to achieve an efficient operation from the network management perspective. In this article, we introduce a revolutionary vision of the future 5G wireless networks, in which the network is no longer limited by hardware or even software. Specifically, by the idea of virtualizing the wireless networks, which has recently gained increasing attention, we introduce the Everything-as-a-Service (XaaS) taxonomy to light the way towards designing the service-oriented wireless networks. The concepts, challenges along with the research opportunities for realizing XaaS in wireless networks are overviewed and discussed.
Electroencephalography (EEG) is an extensively-used and well-studied technique in the field of medical diagnostics and treatment for brain disorders, including epilepsy, migraines, and tumors. The analysis and interpretation of EEGs require physicians to have specialized training, which is not common even among most doctors in the developed world, let alone the developing world where physician shortages plague society. This problem can be addressed by teleEEG that uses remote EEG analysis by experts or by local computer processing of EEGs. However, both of these options are prohibitively expensive and the second option requires abundant computing resources and infrastructure, which is another concern in developing countries where there are resource constraints on capital and computing infrastructure. In this work, we present a cloud-based deep neural network approach to provide decision support for non-specialist physicians in EEG analysis and interpretation. Named `neurology-as-a-service, the approach requires almost no manual intervention in feature engineering and in the selection of an optimal architecture and hyperparameters of the neural network. In this study, we deploy a pipeline that includes moving EEG data to the cloud and getting optimal models for various classification tasks. Our initial prototype has been tested only in developed world environments to-date, but our intention is to test it in developing world environments in future work. We demonstrate the performance of our proposed approach using the BCI2000 EEG MMI dataset, on which our service attains 63.4% accuracy for the task of classifying real vs. imaginary activity performed by the subject, which is significantly higher than what is obtained with a shallow approach such as support vector machines.
Co-location and memory sharing between latency-critical services, such as key-value store and web search, and best-effort batch jobs is an appealing approach to improving memory utilization in multi-tenant datacenter systems. However, we find that the very diverse goals of job co-location and the GNU/Linux system stack can lead to severe performance degradation of latency-critical services under memory pressure in a multi-tenant system. We address memory pressure for latency-critical services via fast memory allocation and proactive reclamation. We find that memory allocation latency dominates the overall query latency, especially under memory pressure. We analyze the default memory management mechanism provided by GNU/Linux system stack and identify the reasons why it is inefficient for latency-critical services in a multi-tenant system. We present Hermes, a fast memory allocation mechanism in user space that adaptively reserves memory for latency-critical services. It advises Linux OS to proactively reclaim memory of batch jobs. We implement Hermes in GNU C Library. Experimental result shows that Hermes reduces the average and the $99^{th}$ percentile memory allocation latency by up to 54.4% and 62.4% for a micro benchmark, respectively. For two real-world latency-critical services, Hermes reduces both the average and the $99^{th}$ percentile tail query latency by up to 40.3%. Compared to the default Glibc, jemalloc and TCMalloc, Hermes reduces Service Level Objective violation by up to 84.3% under memory pressure.