It is no surprise that enterprise-level grid computing (GC) – spreading the organisation's processing requirements across a potentially heterogeneous collection of computer systems in order to minimise the time processors spend idle – is becoming increasingly popular. In a previous RTFM, we looked at enterprise grids but there's another aspect of GC that has been with us for a while. As you might expect, that's going through something of an evolutionary period too.
This is the area of public grid computing – utilising the spare processor capacity of the millions of PCs in people's homes and on workers' desks around the world. Most of us have come across the screensavers that do "useful" work by downloading data from the Internet and processing it instead of just bouncing pretty pictures around the screen: we're talking about things like SETI@home (which analyses radio-telescope data in the search for signals that indicate there's intelligent life out there in the universe) or the multitude of medical applications that analyse cell data for cancerous signs.
The complications of public grids
Public grid computing has a few complications that private grids don't. In a private installation you have some measure of control over what's going on in the network. You can be pretty sure about the availability and progress of each processor in the network, since if it's not in your own building, it's in the building of someone you know and can speak to at will. With a public grid application, you have no such guarantees – and so you can't actually be sure that if the system dishes out some work to do, it'll ever come back completed. Hence you have to devise an algorithm for re-allocating work after a sensible timeout in order to avoid stuff falling down a black hole.
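To make the timeout idea concrete, here is a minimal sketch of a server-side scheduler that reissues a work unit once its deadline has passed. It is an illustration only, not how any particular public grid project does it: the class and method names (Scheduler, request_work, report_result) and the single fixed timeout are assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkUnit:
    unit_id: int
    assigned_to: Optional[str] = None   # hypothetical client identifier
    deadline: Optional[float] = None    # after this time the unit may be reissued
    done: bool = False


class Scheduler:
    """Hands out work units and reissues any that are not returned in time."""

    def __init__(self, unit_ids, timeout_seconds):
        self.timeout = timeout_seconds
        self.units = {uid: WorkUnit(uid) for uid in unit_ids}

    def request_work(self, client, now=None):
        """Return an unassigned unit, or one whose deadline has expired."""
        now = time.time() if now is None else now
        for unit in self.units.values():
            if unit.done:
                continue
            expired = unit.deadline is not None and now >= unit.deadline
            if unit.assigned_to is None or expired:
                unit.assigned_to = client
                unit.deadline = now + self.timeout
                return unit.unit_id
        return None  # everything is either finished or still in progress

    def report_result(self, client, unit_id):
        """Accept the first result for a unit; ignore later duplicates."""
        unit = self.units[unit_id]
        if unit.done:
            return False
        unit.done = True
        return True
```

Note that a result arriving late from the original client is still accepted if it gets in first; the reissued copy then simply becomes a duplicate that the server ignores.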
The other main issue with a public grid, unlike a private one, is that you have to assume that the power of each component on the network is minimal. In a private grid you'd probably have a number of meaty machines connected with links running at hundreds of Mbit/sec at the very least; in a public network you have a depressingly large number of 300MHz PIIs with maybe 64MB RAM, connected over dial-up links.
This performance issue has a huge influence over your choice of data unit size. Instead of chucking several megs of stuff to each processor, you'll probably only want to send a couple of hundred kilobytes of work to each client machine, or it'll annoy the user. Also, you'll want to make sure that the volume of processing to do is such that it should only take a few days – given that there's every chance of a lump of work vanishing forever (e.g. if someone's hard disk crashes) the last thing you want is to have to wait six months before you can be sure the work segment needs to be reallocated.
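A quick back-of-the-envelope calculation shows why unit size matters so much on a slow link. The sketch below assumes a nominal 56 kbit/sec modem and ignores protocol overhead and line noise, so real transfers would be slower; the function names are made up for the example.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Idealised time to move size_bytes over a link, ignoring overhead."""
    return size_bytes * 8 / link_bits_per_second


def largest_unit_bytes(link_bits_per_second, max_seconds):
    """Biggest work unit that fits in a given download-time budget."""
    return int(link_bits_per_second * max_seconds / 8)


DIALUP = 56_000  # assumed nominal modem speed in bits per second

if __name__ == "__main__":
    # Decimal units are used here: 1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes.
    for label, size in [("200 KB", 200_000), ("5 MB", 5_000_000), ("200 MB", 200_000_000)]:
        minutes = transfer_seconds(size, DIALUP) / 60
        print(f"{label}: about {minutes:.1f} minutes")
```

On those assumptions a 200 KB unit takes roughly half a minute to fetch, 5 MB takes about 12 minutes and 200 MB ties up the phone line for nearly eight hours.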
Traditionally, public grid applications have been single-purpose programs. You install the software on your computer and it understands how to do its specific job; every so often it will fetch another chunk of data from the main server and will get on with processing that chunk of work.
This approach was fine when there were only a handful of public grid applications and the concept was quite new, as there was a reasonable pool of computer users out there for each new application. This isn't the case any longer – it's next to impossible to start a new proprietary public grid application because most of the people who might choose to run your particular instance are already occupied with someone else's – and it's too much like hard work to change.
The answer's obvious: try to bring the public grid applications onto a common processing platform that dishes out not just the data to be processed but the code as well. So each client machine has a common platform that can be given work to do on any problem, not just some specific task that its proprietary algorithm is designed to do.
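As a toy illustration of the idea, the sketch below has the server ship the code along with the data, so one generic client can work for any project. Everything here is hypothetical: the Task structure, the convention that the shipped code defines a function called process, and the project names are invented for the example, and this is not a description of how BOINC or any other platform actually works. In particular, running downloaded code with exec is not sandboxed; a real platform would have to verify and isolate whatever it runs.

```python
from dataclasses import dataclass


@dataclass
class Task:
    project: str   # which project the work belongs to (hypothetical names)
    code: str      # Python source that must define process(data)
    data: list     # the chunk of data to be processed


def run_task(task):
    """Generic client step: load the shipped code, then run it on the data."""
    namespace = {}
    exec(task.code, namespace)  # WARNING: no sandboxing, illustration only
    return namespace["process"](task.data)


if __name__ == "__main__":
    # Two unrelated "projects" served to the same generic client.
    tasks = [
        Task("signal-search", "def process(data):\n    return max(data)\n", [3, 9, 4]),
        Task("cell-count",
             "def process(data):\n    return sum(1 for x in data if x > 5)\n",
             [3, 9, 4, 7]),
    ]
    for task in tasks:
        print(task.project, run_task(task))
```

The client itself knows nothing about either problem; switching projects is just a matter of the server sending a different task.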
The BOINC project is an ideal example of such an approach. Produced by the people who brought us SETI@home, the aim is to migrate from the current mass of project-specific packages into a worldwide collection of generic public grid clients.
Of course, we're unlikely to see a single, generic public grid package. The nature of the Net is that although consolidation happens, we rarely end up with just a single "winner" in each architecture field. We will, however, most likely see a small handful of public grid execution systems, which will open the door for more application providers to utilise all that spare processor capacity that sits doing nothing on millions of computers worldwide.