No Arabic abstract
WorldGRID is an intercontinental testbed spanning Europe and the US integrating architecturally different Grid implementations based on the Globus toolkit. The WorldGRID testbed has been successfully demonstrated during the WorldGRID demos at SuperComputing 2002 (Baltimore) and IST2002 (Copenhagen) where real HEP application jobs were transparently submitted from US and Europe using native mechanisms and run where resources were available, independently of their location. To monitor the behavior and performance of such testbed and spot problems as soon as they arise, DataTAG has developed the EDT-Monitor tool based on the Nagios package that allows for Virtual Organization centric views of the Grid through dynamic geographical maps. The tool has been used to spot several problems during the WorldGRID operations, such as malfunctioning Resource Brokers or Information Servers, sites not correctly configured, job dispatching problems, etc. In this paper we give an overview of the package, its features and scalability solutions and we report on the experience acquired and the benefit that a GRID operation center would gain from such a tool.
We outline design and lines of development of autonomous tools for the computing Grid management, monitoring and optimization. The management is proposed to be based on the notion of utility. Grid optimization is considered to be application-oriented. A generic Grid simulator is proposed as an optimization tool for Grid structure and functionality.
We describe R-GMA (Relational Grid Monitoring Architecture) which has been developed within the European DataGrid Project as a Grid Information and Monitoring System. Is is based on the GMA from GGF, which is a simple Consumer-Producer model. The special strength of this implementation comes from the power of the relational model. We offer a global view of the information as if each Virtual Organisation had one large relational database. We provide a number of different Producer types with different characteristics; for example some support streaming of information. We also provide combined Consumer/Producers, which are able to combine information and republish it. At the heart of the system is the mediator, which for any query is able to find and connect to the best Producers for the job. We have developed components to allow a measure of inter-working between MDS and R-GMA. We have used it both for information about the grid (primarily to find out about what services are available at any one time) and for application monitoring. R-GMA has been deployed in various testbeds; we describe some preliminary results and experiences of this deployment.
Selecting optimal resources for submitting jobs on a computational Grid or accessing data from a data grid is one of the most important tasks of any Grid middleware. Most modern Grid software today satisfies this responsibility and gives a best-effort performance to solve this problem. Almost all decisions regarding scheduling and data access are made by the software automatically, giving users little or no control over the entire process. To solve this problem, a more interactive set of services and middleware is desired that provides users more information about Grid weather, and gives them more control over the decision making process. This paper presents a set of services that have been developed to provide more interactive resource management capabilities within the Grid Analysis Environment (GAE) being developed collaboratively by Caltech, NUST and several other institutes. These include a steering service, a job monitoring service and an estimator service that have been designed and written using a common Grid-enabled Web Services framework named Clarens. The paper also presents a performance analysis of the developed services to show that they have indeed resulted in a more interactive and powerful system for user-centric Grid-enabled physics analysis.
The GPCALMA (Grid Platform for Computer Assisted Library for MAmmography) collaboration involves several departments of physics, INFN sections, and italian hospitals. The aim of this collaboration is developing a tool that can help radiologists in early detection of breast cancer. GPCALMA has built a large distributed database of digitised mammographic images (about 5500 images corresponding to 1650 patients) and developed a CAD (Computer Aided Detection) software which is integrated in a station that can also be used for acquire new images, as archive and to perform statistical analysis. The images are completely described: pathological ones have a consistent characterization with radiologists diagnosis and histological data, non pathological ones correspond to patients with a follow up at least three years. The distributed database is realized throught the connection of all the hospitals and research centers in GRID tecnology. In each hospital local patients digital images are stored in the local database. Using GRID connection, GPCALMA will allow each node to work on distributed database data as well as local database data. Using its database the GPCALMA tools perform several analysis. A texture analysis, i.e. an automated classification on adipose, dense or glandular texture, can be provided by the system. GPCALMA software also allows classification of pathological features, in particular massive lesions analysis and microcalcification clusters analysis. The performance of the GPCALMA system will be presented in terms of the ROC (Receiver Operating Characteristic) curves. The results of GPCALMA system as second reader will also be presented.
Grid based systems require a database access mechanism that can provide seamless homogeneous access to the requested data through a virtual data access system, i.e. a system which can take care of tracking the data that is stored in geographically distributed heterogeneous databases. This system should provide an integrated view of the data that is stored in the different repositories by using a virtual data access mechanism, i.e. a mechanism which can hide the heterogeneity of the backend databases from the client applications. This paper focuses on accessing data stored in disparate relational databases through a web service interface, and exploits the features of a Data Warehouse and Data Marts. We present a middleware that enables applications to access data stored in geographically distributed relational databases without being aware of their physical locations and underlying schema. A web service interface is provided to enable applications to access this middleware in a language and platform independent way. A prototype implementation was created based on Clarens [4], Unity [7] and POOL [8]. This ability to access the data stored in the distributed relational databases transparently is likely to be a very powerful one for Grid users, especially the scientific community wishing to collate and analyze data distributed over the Grid.