ﻻ يوجد ملخص باللغة العربية
Systematic checkpointing of the machine state makes restart of execution from a safe state possible upon detection of an error. The time and energy overhead of checkpointing, however, grows with the frequency of checkpointing. Amortizing this overhead becomes especially challenging, considering the growth of expected error rates, as checkpointing frequency tends to increase with increasing error rates. Based on the observation that due to imbalanced technology scaling, recomputing a data value can be more energy efficient than retrieving (i.e., loading) a stored copy, this paper explores how recomputation of data values (which otherwise would be read from a checkpoint from memory or secondary storage) can reduce the machine state to be checkpointed, and thereby reduce the checkpointing overhead. Specifically, the resulting amnesic checkpointing framework AmnesiCHK can reduce the storage overhead by up to 23.91%; time overhead, by 11.92%; and energy overhead, by 12.53%, respectively, even in a relatively small scale system.
Software Defined Networks (SDNs) have dramatically simplified network management. However, enabling pure SDNs to respond in real-time while handling massive amounts of data still remains a challenging task. In contrast, fog computing has strong poten
A non-invasive, cloud-agnostic approach is demonstrated for extending existing cloud platforms to include checkpoint-restart capability. Most cloud platforms currently rely on each application to provide its own fault tolerance. A uniform mechanism w
Mobile edge computing (MEC) has become a promising solution to utilize distributed computing resources for supporting computation-intensive vehicular applications in dynamic driving environments. To facilitate this paradigm, the onsite resource tradi
Transparently checkpointing MPI for fault tolerance and load balancing is a long-standing problem in HPC. The problem has been complicated by the need to provide checkpoint-restart services for all combinations of an MPI implementation over all netwo
Distributed Stream Processing systems are becoming an increasingly essential part of Big Data processing platforms as users grow ever more reliant on their ability to provide fast access to new results. As such, making timely decisions based on these