
Public Cluster : parallel machine with multi-block approach

 Added by L.T. Handoko
 Publication date 2007
Research language: English





We introduce a new approach to enable an open and public parallel machine that is accessible to multiple users, with multiple jobs belonging to different blocks running at the same time. The concept is required especially for parallel machines dedicated to public use, as implemented at the LIPI Public Cluster. We have deployed the simplest technique: running multiple daemons of the parallel processing engine, each with a different configuration file specified for the user assigned to access the system. We have also developed an integrated system to fully control and monitor the whole system over the web. A brief performance analysis is also given for the Message Passing Interface (MPI) engine. It is shown that the proposed approach is quite reliable and affects overall performance only slightly.
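The idea of one daemon configuration per user can be illustrated with a minimal sketch. It assumes that a block is simply a disjoint set of node names plus a distinct port for that block's daemons; the names `BlockConfig`, `assign_blocks`, `render_config`, the port numbers, and the text format are all hypothetical and are not taken from the LIPI Public Cluster software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockConfig:
    """Hypothetical per-user configuration for one block of nodes."""
    user: str
    nodes: tuple
    port: int


def assign_blocks(nodes, requests, base_port=9000):
    """Give each user a disjoint slice of nodes and a distinct port.

    `requests` maps a user name to the number of nodes requested.
    Raises ValueError if the cluster has too few nodes.
    """
    if sum(requests.values()) > len(nodes):
        raise ValueError("not enough free nodes for all requests")
    configs = []
    cursor = 0
    # Sort by user name so the assignment is deterministic.
    for offset, (user, count) in enumerate(sorted(requests.items())):
        block_nodes = tuple(nodes[cursor:cursor + count])
        configs.append(BlockConfig(user, block_nodes, base_port + offset))
        cursor += count
    return configs


def render_config(cfg):
    """Render one block as plain text (an invented format, one setting per line)."""
    lines = [f"user={cfg.user}", f"port={cfg.port}"]
    lines += [f"host={name}" for name in cfg.nodes]
    return "\n".join(lines)


nodes = [f"node{i:02d}" for i in range(1, 7)]
configs = assign_blocks(nodes, {"alice": 2, "bob": 3})
for cfg in configs:
    print(render_config(cfg))
    print()
```

Because no node name appears in two configurations, daemons started from different files would not share nodes, which is the isolation property the multi-block approach relies on.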





Z. Akbar, L.T. Handoko, 2007
We present an extended multi-block approach in the LIPI Public Cluster. The multi-block approach enables a cluster to be divided into several independent blocks that run jobs owned by different users simultaneously. Previously, we maintained the blocks using a single master node for all blocks because of efficiency and resource limitations. Following recent advancements and an increase in the number of nodes, we have modified the multi-block approach to use multiple master nodes, each responsible for a single block. We argue that this approach improves overall performance significantly, especially for data-intensive computational work.
Z. Akbar, L.T. Handoko, 2007
A web-based interface dedicated to a cluster computer that is publicly accessible for free is introduced. The interface plays an important role in enabling secure public access, while providing a user-friendly computational environment for end users and easy maintenance for administrators. The whole architecture, which integrates both hardware and software aspects, is briefly explained. It is argued that the public cluster is a globally unique approach and could be a new kind of e-learning system, especially for parallel programming communities.
Z. Akbar, L.T. Handoko, 2007
We introduce an optimization algorithm for resource allocation in the LIPI Public Cluster to optimize its usage according to incoming requests from users. The tool is an extended and modified genetic algorithm developed to match the specific nature of a public cluster. We present a detailed analysis of the optimization and compare the results with the exact calculation. We show that it would be very useful and could realize an automatic decision-making system for public clusters.
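To make the comparison between a genetic algorithm and an exact calculation concrete, here is a minimal sketch. It is a plain textbook genetic algorithm, not the extended and modified one described in the abstract, and the objective (accept a subset of node requests that fills the cluster as fully as possible without exceeding its capacity) is an assumption chosen for illustration. All function names and parameter values are hypothetical.

```python
import itertools
import random


def used_nodes(mask, requests, capacity):
    """Fitness: nodes used by the accepted requests, or 0 if over capacity."""
    total = sum(r for bit, r in zip(mask, requests) if bit)
    return total if total <= capacity else 0


def exact_best(requests, capacity):
    """Brute-force optimum over every accept/reject combination."""
    return max(used_nodes(mask, requests, capacity)
               for mask in itertools.product((0, 1), repeat=len(requests)))


def genetic_best(requests, capacity, pop_size=20, generations=40, seed=0):
    """Plain GA: keep the fitter half, one-point crossover, one bit-flip mutation."""
    rng = random.Random(seed)
    n = len(requests)

    def fitness(mask):
        return used_nodes(mask, requests, capacity)

    population = [tuple(rng.randint(0, 1) for _ in range(n))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # crossover point
            child = list(a[:cut] + b[cut:])
            child[rng.randrange(n)] ^= 1       # flip exactly one bit
            children.append(tuple(child))
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)


requests = [3, 5, 2, 7, 4, 6]   # nodes requested by six hypothetical users
capacity = 16                   # nodes available in the cluster
exact = exact_best(requests, capacity)
best_mask, ga_used = genetic_best(requests, capacity)
print("exact optimum:", exact)
print("GA result:", ga_used, "with mask", best_mask)
```

The brute-force search is only feasible for a handful of requests, since it enumerates all 2^n subsets; that is the usual reason a heuristic such as a genetic algorithm is compared against it on small cases and then used alone on larger ones.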
Z. Akbar, L.T. Handoko, 2009
An architecture is introduced that enables blocks, each consisting of several nodes in a public cluster, to be connected to different grid collaborations. It is realized by inserting a web service in addition to the standard Globus Toolkit. The new web service performs two main tasks: it authenticates the digital certificate contained in an incoming request and forwards the request to the designated block. The appropriate block is identified by the username of the block's owner contained in the digital certificate. It is argued that this algorithm opens an opportunity for any block in a public cluster to join various global grids.
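The two tasks of the web service, authentication followed by forwarding based on the owner's username, can be sketched as below. This is not Globus Toolkit code: a request is modelled as a plain dictionary, "authentication" is reduced to checking the issuer against a trusted set, and the username is assumed to sit in the CN field of a slash-separated subject string. Every name and value here is hypothetical.

```python
def common_name(subject_dn):
    """Extract the CN field from a slash-separated subject string."""
    for part in subject_dn.strip("/").split("/"):
        key, _, value = part.partition("=")
        if key == "CN":
            return value
    raise ValueError("subject has no CN field")


def route_request(request, trusted_issuers, block_of_owner):
    """Return the block that should receive `request`, or raise if rejected."""
    # Step 1: stand-in for certificate authentication (assumption: issuer check only).
    if request["issuer"] not in trusted_issuers:
        raise PermissionError("certificate issuer is not trusted")
    # Step 2: map the owner's username to the block that user owns.
    owner = common_name(request["subject"])
    if owner not in block_of_owner:
        raise LookupError(f"no block is owned by {owner!r}")
    return block_of_owner[owner]


trusted_issuers = {"/O=ExampleGrid/CN=Example CA"}
block_of_owner = {"alice": "block-1", "bob": "block-2"}
request = {
    "issuer": "/O=ExampleGrid/CN=Example CA",
    "subject": "/O=ExampleGrid/OU=users/CN=bob",
}
target = route_request(request, trusted_issuers, block_of_owner)
print(target)
```

Keeping the owner-to-block table separate from the authentication step means a block can join or leave a grid collaboration by editing one mapping, without touching the certificate check.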
CP tensor decomposition with alternating least squares (ALS) is dominated in cost by the matricized-tensor times Khatri-Rao product (MTTKRP) kernel that is necessary to set up the quadratic optimization subproblems. State-of-the-art parallel ALS implementations use dimension trees to avoid redundant computations across MTTKRPs within each ALS sweep. In this paper, we propose two new parallel algorithms to accelerate CP-ALS. We introduce the multi-sweep dimension tree (MSDT) algorithm, which requires the contraction between an order-N input tensor and the first-contracted input matrix once every (N-1)/N sweeps. This algorithm reduces the leading-order computational cost by a factor of 2(N-1)/N relative to the best previously known approach. In addition, we introduce a more communication-efficient approach to parallelizing an approximate CP-ALS algorithm, pairwise perturbation. This technique uses perturbative corrections to the subproblems rather than recomputing the contractions, and asymptotically accelerates ALS. Our benchmark results show that the per-sweep time achieves a 1.25X speed-up for MSDT and a 1.94X speed-up for pairwise perturbation compared to state-of-the-art dimension trees running on 1024 processors on the Stampede2 supercomputer.
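For readers unfamiliar with the MTTKRP kernel named above, the following is a naive reference sketch for an order-3 tensor, written with nested lists so that it needs no external library. It computes M[i][r] as the sum over j and k of X[i][j][k] * B[j][r] * C[k][r]. It shows only the definition of the kernel; it does not implement dimension trees, MSDT, or pairwise perturbation, and the function name is hypothetical.

```python
def mttkrp_mode0(X, B, C):
    """Naive mode-0 MTTKRP for an order-3 tensor.

    X is an I x J x K nested list, B is J x R, C is K x R.
    Returns the I x R matrix M with
    M[i][r] = sum over j, k of X[i][j][k] * B[j][r] * C[k][r].
    """
    I, J, K = len(X), len(B), len(C)
    R = len(B[0])
    M = [[0.0] * R for _ in range(I)]
    for i in range(I):
        for j in range(J):
            for k in range(K):
                x = X[i][j][k]
                for r in range(R):
                    M[i][r] += x * B[j][r] * C[k][r]
    return M


X = [[[1.0, 2.0], [3.0, 4.0]],
     [[5.0, 6.0], [7.0, 8.0]]]
B = [[1.0, 0.0], [1.0, 2.0]]
C = [[1.0, 1.0], [1.0, 0.0]]
M = mttkrp_mode0(X, B, C)
print(M)
```

The four nested loops cost on the order of I·J·K·R multiply-adds, which is why this kernel dominates each ALS sweep and why avoiding repeated contractions with the full tensor matters.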