So what does this mean?
In the first place, computing is about achieving a piece of work, what the work consists of is irrelevant for the definition.
Grid computing is about completing a humongous piece of work that, if given to a single computer, would take an inordinately long time and would in all probability lock up the machine's CPU cycles, leading to that notorious reaction: "My computer froze."
For people who are perhaps not technically minded but are aware of SETI@home (the Search for Extra-Terrestrial Intelligence): this is a volunteer computing project that uses the idle CPU cycles of volunteers' home and work PCs to analyze radio signals from space for signs of intelligent life. It seeks to answer that philosophical question: "Are we the only ones here? It cannot be – there must be someone out there in the vast reaches of the universe."
What this implies is that each volunteer machine downloads a set of radio signal data, analyzes it and sends the results back to the SETI@home project server. The SETI@home application runs as a screen saver on the client machine.
This is what is known in technical terms as CPU scavenging, or volunteer computing.
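The volunteer-computing loop described above can be sketched in a few lines of Python. This is only an illustration: the local queue below stands in for the remote project server, and the `analyze` function is a made-up placeholder for the real signal analysis.

```python
import queue

# A local queue stands in for the project server; a real client would
# download each work unit over the network and upload its result.
work_units = queue.Queue()
for signal in ([1, 5, 2], [9, 9, 1], [0, 3, 3]):
    work_units.put(signal)

def analyze(signal):
    # Stand-in analysis: flag any "signal" whose peak exceeds a threshold.
    return max(signal) > 8

results = []
while not work_units.empty():
    unit = work_units.get()        # download a set of data
    results.append(analyze(unit))  # crunch it with spare CPU cycles
    work_units.task_done()         # report the result back

print(results)  # [False, True, False]
```

The key property is that each work unit is independent, so a volunteer machine can process it whenever it happens to be idle.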
The term grid computing originated in the early 1990s as a metaphor for making computing power as easy to access as an electric power grid; the idea was laid out in Ian Foster's and Carl Kesselman's seminal work, "The Grid: Blueprint for a New Computing Infrastructure".
The crux of grid computing is very similar to what you do when you wish to complete a large chunk of work: you break it down into smaller tasks and sub-tasks that are more controllable and more easily achievable. Grid computing works on the same premise. The work to be performed is broken into smaller discrete pieces that are parceled out to registered machines – machines designated to be part of the computing grid.
Once the work on each registered machine is completed, the results are communicated back to the master (server) machine. Grid computing assumes a master-slave relationship between machines at various levels; a slave may in turn be a master to machines below it.
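The split-then-aggregate premise can be shown in a minimal Python sketch. Note the hedge: the `process_chunk` task here is just a sum, and the dispatch step is a plain local loop standing in for sending the pieces to remote grid machines.

```python
def process_chunk(chunk):
    # Worker task: sum the numbers in the chunk. On a real grid this
    # would run on a remote machine, not in the same process.
    return sum(chunk)

def split_work(data, n_pieces):
    # Break the full workload into smaller discrete pieces.
    size = max(1, len(data) // n_pieces)
    return [data[i:i + size] for i in range(0, len(data), size)]

# The "master" parcels out the pieces and aggregates the results.
data = list(range(1000))
pieces = split_work(data, 8)
partial_results = [process_chunk(p) for p in pieces]  # dispatch step
total = sum(partial_results)
print(total)  # 499500 – same answer as summing the data in one go
```

Because each piece is processed independently, swapping the local loop for a pool of networked workers changes where the work runs, not the answer.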
Grid computing is about distributed computing – work is distributed to computers either within or across departments, enterprises or globally.
Grid computing is about scalability – a grid scales very well to solve really large computing problems. Grid computing is a type of parallel computing, except that parallelism is achieved over different machines of different configurations – the heterogeneity of machines is not a significant barrier to creating a grid infrastructure.
IBM defines grid computing as “the ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity and a vast array of other computing resources over the Internet. A grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across ‘multiple’ administrative domains based on their (resources) availability, capacity, performance, cost and users’ quality-of-service requirements”.
Grid computing is also about quality of service, an important factor to consider when implementing and/or using a grid computing service or infrastructure.
A grid computing infrastructure can be commoditized – each node or machine can be purchased as commodity hardware, and together the nodes form a huge supercomputer of heterogeneous computers working in parallel. The one constraint is that each piece of work parceled out to an individual machine should be self-contained, so that there is less reliance on network connectivity, which can be several notches lower in reliability and speed.
So why is grid computing gaining in popularity?
It has also been used for scientific modeling. In current technical literature, however, it is more associated with how search engines do their automated tabulation of the various sites on the internet. Google's software, called MapReduce, runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines, and upwards of one thousand MapReduce jobs are executed on Google's clusters every day. An open-source implementation of MapReduce is part of the Apache Hadoop project.
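The two phases that give MapReduce its name can be sketched with the classic word-count example. This is a toy single-machine version, assuming the usual (key, value) pair convention; a real MapReduce framework would run each map task on a different machine and shuffle the pairs between phases.

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input chunk.
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    # Reduce: group the pairs by key and sum the counts for each word.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

documents = ["the grid the cloud", "the cloud"]
all_pairs = []
for doc in documents:  # each map task could run on its own machine
    all_pairs.extend(map_phase(doc))
print(reduce_phase(all_pairs))  # {'the': 3, 'grid': 1, 'cloud': 2}
```

Because map tasks never communicate with each other, adding more machines simply means processing more chunks at once.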
Grid computing, utility computing and cloud computing are all inter-linked. They cannot be separated from each other.
Grid computing now touches our everyday lives. The connectivity between different computing devices, and their ability to communicate their status and information over networks, wireless or otherwise, lends itself to building our very own intelligent grids that serve our own unique needs. Imagine being able, from the comfort of your armchair, to warm up your car 30 minutes before you set out to drive!
That sums up this discussion of grid computing! Have a great week!