This paper has been withdrawn by the author
Cloud storage plays an important role in social computing. This paper develops a cloud storage management system for mobile devices that supports an extended set of file operations. Because of limited storage, bandwidth, power, and other resources, most existing cloud storage apps for smartphones do not keep local copies of files. This design is efficient, but it limits what the apps can do. We therefore extend the file operations available for cloud storage services to better serve smartphone users. We develop an efficient and secure file management system, Skyfiles, to support more advanced file operations. The basic idea of our design is to use cloud instances to assist file operations. In particular, Skyfiles supports downloading, compressing, encrypting, and converting operations, as well as file transfer between two smartphone users' cloud storage spaces. In addition, we propose a protocol for users to share their idle instances. All file operations supported by Skyfiles can be accomplished efficiently and securely with either a self-created instance or a shared instance.
Vehicular Cloud Computing (VCC) is a new technological shift that exploits the computation and storage resources on vehicles for computational service provisioning. Spare on-board resources are pooled by a VCC operator, e.g., a roadside unit, to complete task requests using the vehicle-as-a-resource framework. In this paper, we investigate timely service provisioning for deadline-constrained tasks in VCC systems by leveraging task replication (i.e., allowing one task to be executed by several server vehicles). A learning-based algorithm, called DATE-V (Deadline-Aware Task rEplication for Vehicular Cloud), is proposed to address the special issues in VCC systems, including uncertain vehicle movements, volatile vehicle membership, and a large vehicle population. The proposed algorithm is based on a novel Contextual-Combinatorial Multi-Armed Bandit (CC-MAB) learning framework. DATE-V is "contextual" because it uses side information (context) about vehicles and tasks to infer the completion probability of a task replication under random vehicle movements. DATE-V is "combinatorial" because it replicates the received task and sends the replications to multiple server vehicles to guarantee service timeliness. We rigorously prove that our learning algorithm achieves a sublinear regret bound compared to an oracle algorithm that knows the exact completion probability of any task replication. Simulations are carried out based on real-world vehicle movement traces, and the results show that DATE-V significantly outperforms benchmark solutions.
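The contextual-combinatorial idea described above can be illustrated with a minimal Python sketch. This is not the DATE-V algorithm itself: the class name `ReplicationBandit`, the uniform partition of a normalized context space into hypercubes, the `min_samples` exploration rule, and the independence assumption behind `success_probability` are all assumptions made here for illustration.

```python
from collections import defaultdict
from math import prod


class ReplicationBandit:
    """Toy contextual-combinatorial bandit (hypothetical, not DATE-V)."""

    def __init__(self, cells_per_dim=4, min_samples=3):
        self.cells_per_dim = cells_per_dim  # grid resolution per context dimension
        self.min_samples = min_samples      # cells with fewer samples are explored first
        self.counts = defaultdict(int)      # observations per context cell
        self.successes = defaultdict(int)   # on-time completions per context cell

    def _cell(self, context):
        # Map a context in [0, 1]^d to the index of the hypercube containing it.
        top = self.cells_per_dim - 1
        return tuple(min(int(x * self.cells_per_dim), top) for x in context)

    def estimate(self, context):
        # Empirical completion probability of the cell (0.0 if never observed).
        cell = self._cell(context)
        n = self.counts[cell]
        return self.successes[cell] / n if n else 0.0

    def select(self, contexts, budget):
        # contexts: dict mapping vehicle id -> context tuple.
        # Under-explored cells rank first, then cells with the highest estimate.
        def rank(vehicle_id):
            cell = self._cell(contexts[vehicle_id])
            under_explored = self.counts[cell] < self.min_samples
            return (under_explored, self.estimate(contexts[vehicle_id]))

        return sorted(contexts, key=rank, reverse=True)[:budget]

    def update(self, context, completed):
        # Record whether a replica sent to a vehicle with this context met the deadline.
        cell = self._cell(context)
        self.counts[cell] += 1
        self.successes[cell] += int(completed)


def success_probability(probs):
    # Chance that at least one replica finishes, assuming independent outcomes.
    return 1.0 - prod(1.0 - p for p in probs)
```

Under the independence assumption, two replicas that each finish on time with probability 0.5 give the task a 0.75 chance of meeting its deadline, which is why replicating to several vehicles helps when individual completion probabilities are modest.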
Can cloud computing infrastructures provide HPC-competitive performance for scientific applications broadly? Despite prolific related literature, this question remains open. Answers are crucial for designing future systems and democratizing high-performance computing. We present a multi-level approach to investigate the performance gap between HPC and cloud computing, isolating different variables that contribute to this gap. Our experiments are divided into (i) hardware and system microbenchmarks and (ii) user application proxies. The results show that today's high-end cloud computing can deliver HPC-competitive performance not only for computationally intensive applications but also for memory- and communication-intensive applications, at least at modest scales, thanks to the high-speed memory systems and interconnects and dedicated batch scheduling now available on some cloud platforms.
Businesses have increasingly adopted cloud technology and incorporated it into internal processes over the last decade. Cloud-based deployment provides on-demand availability without active management. More recently, the concept of the cloud-native application has been proposed; it represents an invaluable step toward helping organizations develop software faster and update it more frequently to achieve dramatic business outcomes. Cloud-native is an approach to building and running applications that exploits the advantages of the cloud computing delivery model. It is more about how applications are created and deployed than where. Container-based virtualization technologies, such as Docker and Kubernetes, serve as the foundation for cloud-native applications. This paper investigates the performance of two popular computation-intensive applications, big data and deep learning, in a cloud-native environment. We analyze the system overhead and resource usage for these applications. Through extensive experiments, we show that the completion time is reduced by up to 79.4% by changing the default setting and increases by up to 96.7% due to different resource management schemes on two platforms. Additionally, the resource release is delayed by up to 116.7% across different systems. Our work can guide developers, administrators, and researchers to better design and deploy their applications by selecting and configuring a hosting platform.
With the emergence of the big data age, the issue of how to obtain valuable knowledge from a dataset efficiently and accurately has attracted increasing attention from both academia and industry. This paper presents a Parallel Random Forest (PRF) algorithm for big data on the Apache Spark platform. The PRF algorithm is optimized based on a hybrid approach combining data-parallel and task-parallel optimization. From the perspective of data-parallel optimization, a vertical data-partitioning method is performed to reduce the data communication cost effectively, and a data-multiplexing method is performed to allow the training dataset to be reused and diminish the volume of data. From the perspective of task-parallel optimization, a dual parallel approach is carried out in the training process of RF, and a task Directed Acyclic Graph (DAG) is created according to the parallel training process of PRF and the dependence of the Resilient Distributed Datasets (RDD) objects. Then, different task schedulers are invoked for the tasks in the DAG. Moreover, to improve the algorithm's accuracy for large, high-dimensional, and noisy data, we perform a dimension-reduction approach in the training process and a weighted voting approach in the prediction process prior to parallelization. Extensive experimental results indicate the superiority and notable advantages of the PRF algorithm over the relevant algorithms implemented by Spark MLlib and other studies in terms of classification accuracy, performance, and scalability.
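The weighted voting step mentioned above can be sketched in a few lines of plain Python. The abstract does not say how tree weights are derived, so this sketch assumes each tree carries a precomputed non-negative weight (for example, an accuracy estimate); the function name `weighted_vote` is hypothetical and this is not the PRF implementation.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the class label with the largest total weight.

    predictions: one predicted label per tree.
    weights: one non-negative weight per tree (assumed precomputed).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")

    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight  # each tree adds its weight to the label it predicts

    return max(totals, key=totals.get)
```

With predictions `["spam", "ham", "ham"]` and weights `[0.9, 0.3, 0.4]`, plain majority voting would pick `"ham"`, while the weighted vote picks `"spam"` because the single high-weight tree outweighs the two low-weight ones.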