This study investigates fault tolerance in large distributed environments, such as computing grids and clusters, in order to find the most effective ways to handle failures caused by the crash of a device in the environment or by a network disconnection, so that an application can continue running in the presence of faults. In this paper, we study a model of the distributed environment and of the parallel applications that run within it. We then present a checkpoint mechanism that ensures continuity of the work. The mechanism relies on an abstract representation of the application (macro dataflow) and is suited to applications that use a work-stealing algorithm to distribute tasks and that run in heterogeneous, dynamic environments. It adds a small overhead to the cost of parallel execution, because part of the work is saved during fault-free execution. The study also provides a mathematical model for calculating the time complexity, i.e. the cost, of the proposed mechanism.