No Arabic abstract
The serverless computing model strengthens the cloud computing tendency to abstract resource management. Serverless platforms are responsible for deploying and scaling the developers applications. Serverless also incorporated the pay-as-you-go billing model, which only considers the time spent processing client requests. Such a decision created a natural incentive for improving the platforms efficient resource usage. This search for efficiency can lead to the cold start problem, which represents a delay to execute serverless applications. Among the solutions proposed to deal with the cold start, those based on the snapshot method stand out. Despite the rich exploration of the technique, there is a lack of research that evaluates the solutions trade-offs. In this direction, this work compares two solutions to mitigate the cold start: Prebaking and SEUSS. We analyzed the solutions performance with functions of different levels of complexity: NoOp, a function that renders Markdown to HTML, and a function that loads 41 MB of dependencies. Preliminary results indicated that Prebaking showed a 33% and 25% superior performance to startup the NoOp and Markdown functions, respectively. Further analysis also revealed that Prebakings warmup mechanism reduced the Markdown first request processing time by 69%.
Additive Runge-Kutta methods designed for preserving highly accurate solutions in mixed-precision computation were proposed and analyzed in [8]. These specially designed methods use reduced precision or the implicit computations and full precision for the explicit computations. We develop a FORTRAN code to solve a nonlinear system of ordinary differential equations using the mixed precision additive Runge-Kutta (MP-ARK) methods on IBM POWER9 and Intel x86_64 chips. The convergence, accuracy, runtime, and energy consumption of these methods is explored. We show that these MP-ARK methods efficiently produce accurate solutions with significant reductions in runtime (and by extension energy consumption).
In this paper, we propose the first optimum process scheduling algorithm for an increasingly prevalent type of heterogeneous multicore (HEMC) system that combines high-performance big cores and energy-efficient small cores with the same instruction-set architecture (ISA). Existing algorithms are all heuristics-based, and the well-known IPC-driven approach essentially tries to schedule high scaling factor processes on big cores. Our analysis shows that, for optimum solutions, it is also critical to consider placing long running processes on big cores. Tests of SPEC 2006 cases on various big-small core combinations show that our proposed optimum approach is up to 34% faster than the IPC-driven heuristic approach in terms of total workload completion time. The complexity of our algorithm is O(NlogN) where N is the number of processes. Therefore, the proposed optimum algorithm is practical for use.
Serverless computing is increasingly popular because of its lower cost and easier deployment. Several cloud service providers (CSPs) offer serverless computing on their public clouds, but it may bring the vendor lock-in risk. To avoid this limitation, many open-source serverless platforms come out to allow developers to freely deploy and manage functions on self-hosted clouds. However, building effective functions requires much expertise and thorough comprehension of platform frameworks and features that affect performance. It is a challenge for a service developer to differentiate and select the appropriate serverless platform for different demands and scenarios. Thus, we elaborate the frameworks and event processing models of four popular open-source serverless platforms and identify their salient idiosyncrasies. We analyze the root causes of performance differences between different service exporting and auto-scaling modes on those platforms. Further, we provide several insights for future work, such as auto-scaling and metric collection.
The rigid MPI programming model and batch scheduling dominate high-performance computing. While clouds brought new levels of elasticity into the world of computing, supercomputers still suffer from low resource utilization rates. To enhance supercomputing clusters with the benefits of serverless computing, a modern cloud programming paradigm for pay-as-you-go execution of stateless functions, we present rFaaS, the first RDMA-aware Function-as-a-Service (FaaS) platform. With hot invocations and decentralized function placement, we overcome the major performance limitations of FaaS systems and provide low-latency remote invocations in multi-tenant environments. We evaluate the new serverless system through a series of microbenchmarks and show that remote functions execute with negligible performance overheads. We demonstrate how serverless computing can bring elastic resource management into MPI-based high-performance applications. Overall, our results show that MPI applications can benefit from modern cloud programming paradigms to guarantee high performance at lower resource costs.
Recently, embedding techniques have achieved impressive success in recommender systems. However, the embedding techniques are data demanding and suffer from the cold-start problem. Especially, for the cold-start item which only has limited interactions, it is hard to train a reasonable item ID embedding, called cold ID embedding, which is a major challenge for the embedding techniques. The cold item ID embedding has two main problems: (1) A gap is existing between the cold ID embedding and the deep model. (2) Cold ID embedding would be seriously affected by noisy interaction. However, most existing methods do not consider both two issues in the cold-start problem, simultaneously. To address these problems, we adopt two key ideas: (1) Speed up the model fitting for the cold item ID embedding (fast adaptation). (2) Alleviate the influence of noise. Along this line, we propose Meta Scaling and Shifting Networks to generate scaling and shifting functions for each item, respectively. The scaling function can directly transform cold item ID embeddings into warm feature space which can fit the model better, and the shifting function is able to produce stable embeddings from the noisy embeddings. With the two meta networks, we propose Meta Warm Up Framework (MWUF) which learns to warm up cold ID embeddings. Moreover, MWUF is a general framework that can be applied upon various existing deep recommendation models. The proposed model is evaluated on three popular benchmarks, including both recommendation and advertising datasets. The evaluation results demonstrate its superior performance and compatibility.