Adarsh Grid

Research

If we knew what we were doing, it wouldn't be called research, would it? -- Albert Einstein
Invention is the mother of necessity -- Thorstein Veblen

What is Grid?

This question is asked by everyone who is starting, has started, or is still doing research in Grid Computing. I was in the same boat when I began my own research. From then until now I have come across many dynamically changing "Grid definitions". What I currently say is that the Grid is a social being. To add to that, in my personal opinion Cloud Computing, Green Grid and other new "terminologies" are only marketing hype, created to attract funding for research projects, to sell new textbooks and to create a new buzz in the IT world. It is like processors: Core 2 Duo and Dual Core are two names for what is, in both cases, a chip with two cores.

Grid in need is Grid indeed

The question "What is Grid?" is as volatile as the Grid environment itself. We should build a scheduler and load balancer to maintain the proper definition of Grid, from the time it is coined until someone comes up with a new definition that is accepted by researchers in both academia and industry. As you might have noticed, the definitions of Grid on my website vary from time to time.
Cluster computing is not Grid Computing, because you need a user account on a cluster to use it. In a Grid, one cannot have a user account on every machine shared in the Grid environment. This is overcome by using trust, in the form of certificates. One might ask how everyone can get a certificate. If one can get a credit, debit, ration or social security card, why not a Grid usage card or certificate?
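
To make the certificate idea a little more concrete, here is a minimal sketch of the last step of that trust chain: mapping an already authenticated certificate subject to a local account, in the style of the grid-mapfile used by Globus. The subjects, account names and function names below are made up for illustration, and the sketch assumes the certificate itself has already been verified elsewhere.

```python
import shlex

# Hypothetical entries in the style of a Globus grid-mapfile:
# a quoted certificate subject (DN) followed by a local account name.
GRID_MAPFILE = '''
# comment lines and blank lines are ignored
"/O=Grid/OU=ExampleOrg/CN=Alice Example" alice
"/O=Grid/OU=ExampleOrg/CN=Bob Example" grid001
'''


def parse_gridmap(text):
    """Return a dict mapping certificate subject -> local account name."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # shlex honours the double quotes around the subject,
        # so spaces inside the DN do not split it.
        parts = shlex.split(line)
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {line!r}")
        subject, account = parts
        mapping[subject] = account
    return mapping


def local_account(mapping, subject):
    """Look up the local account for an authenticated subject, or None."""
    return mapping.get(subject)


if __name__ == "__main__":
    gridmap = parse_gridmap(GRID_MAPFILE)
    print(local_account(gridmap, "/O=Grid/OU=ExampleOrg/CN=Alice Example"))
    print(local_account(gridmap, "/O=Grid/OU=ExampleOrg/CN=Mallory Unknown"))
```

The point of the design is that the user never needs a personal account on each machine: the site only has to trust the issuer of the certificate and keep one mapping table.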

Grid is a volatile, heterogeneous distributed system in which resources spread across multiple virtual organizations are able to select, share, collaborate and integrate based on common rules (trust).
I am administering a sub-cluster of 8 machines running Debian Testing, Globus 2.4, the Distributed Interactive Engineering Toolbox (DIET) and MPICH-G2. The hardware is Dell Optiplex 170 machines, each with a Pentium 4 at 2.6 GHz, 1 GB RAM, a 40 GB HDD and Gigabit LAN, all on public IPs. Padraig, one of my colleagues, is also involved in tweaking and fine-tuning the cluster machines. Debian was chosen because it is easy to maintain (apt-get package management).

The word "Grid" is used by analogy with the electrical grid. There is no single proper definition of Grid; what gets called a Grid depends on each person's understanding. Every organization has its own terminology and its own understanding of what counts as a Grid. A union of 3 clusters within one educational institution is a Grid, a union of 3 clusters in different parts of a country is also a Grid, and a union of clusters across different parts of the world is a Grid too. Grid is a term for the unification of resources per se: they need not be clusters, and might be standalone machines or databases scattered across the globe.

Initially, I started my work by exploring different Job Management Systems (JMSs) and resource management systems. You can treat job management and resource management systems as one and the same; different communities use different terminology, but the underlying workings are the same. We installed most of the JMSs on our machines and tested them by executing some jobs and tasks. We detailed their characteristics (the software they are built with, platform dependence, centralized or decentralized control, and many others) and compared the systems against each other. The job management systems included Codine (Sun Grid Engine), LSF and PBS. With the help and suggestions of our supervisors, my colleague Brian and I wrote a survey paper comparing various JMSs with our in-house research system WebCom (a metacomputer), which has since been updated and gridified as WebCom-G. This was my first paper. Its title was "Comparison of WebCom in the context of Job Management Systems", and it was published in Proceedings of the 2nd International Symposium on Parallel and Distributed Computing (ISPDC), Iasi, Romania, July 17-20, 2002.

After surveying and trying out many JMSs, we found that most of them were starting to add a "G" to the end of their names, e.g. somename-G. In most cases the "G" refers to Globus. The Globus Toolkit (a bag of services; please refer to the Globus website) is software for building Grid environments. Systems suffixed in this way included Nimrod-G and Condor-G. So we researched and came up with the idea of having WebCom interoperate with and leverage existing JMSs (middlewares were coming into existence around then), and we were able to interoperate with those systems. We wrote a paper on this titled "WebCom-G" (here the G refers to Grid), which was published in Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2003), Las Vegas, Nevada, U.S.A., June 23-26, 2003. Then work started on core middlewares, user-level middlewares, web services and application development.

I was working on the information gathering module for a grid running Globus and on its integration with WebCom's information services. WebCom is a core middleware, or metacomputer, developed by our group, the Centre for Unified Computing. In a Grid environment, resources such as PCs, clusters, databases and remote instruments owned by multiple domains are distributed across the world, and we come to know about them through information services hosted by their respective domains. Everything involved in a Grid is heterogeneous in architecture, so the real-time information gathered by the core middleware is important. The results of this research were published in the paper titled "The information gathering module of the WebCom-G operating system", in Proceedings of the 2nd International Symposium on Parallel and Distributed Computing (ISPDC) - IEEE Press, Ljubljana, Slovenia, September 2003.

At present I am experimenting with the grid to collect information about the resources hosted by multiple administrative domains and to check the quality of that information, which has several aspects depending on time, availability and how up to date the gathered information is. I am interested in resource forecasting for a particular grid, similar to a weather forecast for the next time interval; this would give an overall Quality of Service prediction for that Grid. This information would be incorporated into the WebCom system.
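
As a rough illustration of the forecasting idea, the sketch below drops load samples that are too old to trust and then predicts the load for the next interval with simple exponential smoothing. This is only one possible technique, chosen here for brevity; the Sample record, the field names and the alpha and max_age parameters are assumptions made for this example and are not part of WebCom or Globus.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float  # seconds since some agreed epoch (assumed)
    cpu_load: float   # fraction of CPU in use, 0.0 to 1.0 (assumed)


def fresh_samples(samples, now, max_age):
    """Keep only samples gathered within the last max_age seconds."""
    return [s for s in samples if now - s.timestamp <= max_age]


def forecast_load(samples, alpha=0.5):
    """Predict the next-interval load by simple exponential smoothing.

    alpha close to 1 trusts recent samples more; close to 0 trusts history.
    Returns None when there is nothing to forecast from.
    """
    if not samples:
        return None
    ordered = sorted(samples, key=lambda s: s.timestamp)
    estimate = ordered[0].cpu_load
    for s in ordered[1:]:
        estimate = alpha * s.cpu_load + (1 - alpha) * estimate
    return estimate


if __name__ == "__main__":
    history = [Sample(0, 0.2), Sample(60, 0.4), Sample(120, 0.8)]
    recent = fresh_samples(history, now=130, max_age=100)
    print(forecast_load(history))  # uses all three samples
    print(forecast_load(recent))   # ignores the stale first sample
```

Filtering on age before forecasting reflects the point above that a prediction is only as good as the freshness of the information it is built on.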

Alongside these research interests, Padraig and I did an intensive survey of MPI and its variants. We studied and tried out MPI, MPICH, MPICH-G, MPICH-G2, LAM-MPI, PACX-MPI and MPICH-GX. The main reason for the survey was to find out how these MPI flavours work and what setup they require. We wanted to connect MPI processes running on a cluster with public IPs to MPI processes on a cluster whose worker nodes sit behind a head node and have private IPs. We found that some researchers in Asia (the MPICH-GX researchers) are working on this. Our aim was to provide the communication layer without needing a root user or administrator to install, configure or run the message-passing processes.

Grid is an open, secure shop for processor cycles, memory or storage space, and computation, and at present it is a driving force for major players like IBM, HP and Sun. I am interested in using WebCom, with economic models built into it, to provide quality of service for end users.

In conjunction with the rest, we are looking into modules that will interoperate with various middlewares, and with various versions of the Globus Toolkit in particular. Globus seems to be moving from command-line tools to web services; our grid group is looking into interoperability between the various layers of the Globus architecture and our WebCom system. Globus is the de facto toolkit for building Grid systems, but it only builds the skeleton for us; the rest of the flesh, such as fault tolerance, load balancing and scheduling, will be provided using WebCom as middleware.

Please do share your research ideas with me. I would be interested in collaborative work, so do write to me with your comments and research interests.

Techy

1970-80s -> Internet
1980-90s -> Distributed
1990-95 -> Initial Grid, I-WAY at SC'95
1995 to date -> maturing Grid

Gilder's Law

The total bandwidth of communication systems triples every twelve months

Moore's Law

The number of transistors on a chip doubles roughly every two years; the figure is often quoted as every 18 months.
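
To see how different the two growth rates are, here is a small worked calculation that treats both laws as plain compound growth. It assumes the figures quoted above (tripling every 12 months for bandwidth, doubling every 18 months for transistors); the function names are just for this example.

```python
def growth_factor(multiplier, period_months, elapsed_months):
    """Compound growth: multiply by `multiplier` once per `period_months`."""
    return multiplier ** (elapsed_months / period_months)


def gilder_bandwidth_factor(elapsed_months):
    # Assumes bandwidth triples every 12 months.
    return growth_factor(3, 12, elapsed_months)


def moore_transistor_factor(elapsed_months):
    # Assumes transistor count doubles every 18 months.
    return growth_factor(2, 18, elapsed_months)


if __name__ == "__main__":
    months = 36  # three years
    print(f"bandwidth:   x{gilder_bandwidth_factor(months):.1f}")   # x27.0
    print(f"transistors: x{moore_transistor_factor(months):.1f}")   # x4.0
```

Over three years bandwidth would grow 27-fold while transistor counts grow 4-fold under these assumptions, which is one way of seeing why moving work to remote resources over the network keeps getting more attractive.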

Site Design by Adarsh Patil
Copyrights reserved © CUC & Adarsh