The installation and updating of software remains one of the sysadmin's biggest headaches. Fortunately, one of Google's top administrators, Marc Merlin, was willing to share how he approaches it.
Merlin's platform of choice is Debian GNU/Linux, which he said has the most comprehensive software library available for Linux. Merlin developed "getupdates" - a program to keep client software synchronised with a central server.
"I wrote getupdates to keep Linux machines in sync with a repository," he explained. "I've deployed it on Red Hat at the two companies I've installed it at, even if Debian is my personal platform of choice. Getupdates is, however, distribution and even Unix-agnostic."
Merlin's implementation of getupdates uses Debian's apt-get software package management tool, but it can be used with any other software installation mechanism. "Debian needs to deal with systems that could have more than 5,000 packages with different release cycles, and does a superb job in providing useful dependencies between them," he said. "So you can pick and choose what you upgrade, and get the dependencies right without having to upgrade the rest of your system."
Google operates the world's largest commercial Linux cluster and admits to some 10,000 servers, although cyber pundits have estimated this number could now be as high as 80,000. Merlin declined to comment on how Google manages to keep its army of servers updated but said: "If you maintain more than 10 to 20 servers, it is just inconceivable not to have some kind of update mechanism."
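For readers who want a concrete picture of what keeping a client in sync with a central repository can involve, here is a minimal sketch in Python. It is not taken from getupdates, whose internals the article does not describe. The function name, the package names and the version strings are all hypothetical, and the sketch assumes the central server publishes a simple manifest mapping each package to the exact version clients should run.

```python
def plan_sync(installed, manifest):
    """Work out what a client must change to match the central manifest.

    Both arguments map package name -> version string. This is an
    illustrative assumption, not the format getupdates actually uses.
    """
    plan = {"install": {}, "change": {}, "remove": []}
    for name, wanted in manifest.items():
        have = installed.get(name)
        if have is None:
            # Package is in the manifest but missing on the client.
            plan["install"][name] = wanted
        elif have != wanted:
            # Client has a different version from the one the manifest pins.
            plan["change"][name] = (have, wanted)
    # Anything on the client that the manifest does not list is drift.
    plan["remove"] = sorted(set(installed) - set(manifest))
    return plan


# Hypothetical client state and central manifest, for illustration only.
client = {"openssh-server": "3.4p1-1", "apache": "1.3.26-0", "nethack": "3.4.0-1"}
manifest = {"openssh-server": "3.6.1p2-1", "apache": "1.3.26-0", "rsync": "2.5.6-1"}

plan = plan_sync(client, manifest)
print(plan)
```

The sketch only compares versions for equality against the manifest, so it never needs to decide which of two versions is newer. On a real Debian system that ordering, and the dependency resolution Merlin praises, would be left to the package manager.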
"If you don't have a way to keep all your workstations and servers in sync, you can spend a huge amount of time fixing them and updating them by sneakernet," he said. "Basically, make the client expendable and keep all the important data [such as home directories] on NFS or SMB servers. Then, keep all the clients in sync as well as possible, and if the user has access to break them, and does break them, they can just be reloaded from a custom-made install CD or install image."
Although Merlin doesn't have experience with commercial software provisioning tools, he said flexibility is one advantage that open source has. "If you don't like the update mechanism, you can fix it, or make your own," he said. "In my case, it's mostly a question of how much expertise you have in-house versus how much you're willing to trust another company to do the work for you. If you have very specific needs and in-house expertise, it may be easier to just do the updates yourself. If your systems are closer to the stock distribution, it's easier and it may make more sense to pay another company to handle updates and software distribution for you."
When asked about using open source provisioning tools in a commercial environment, Merlin said acceptance depends on the company in question. "If the company is modern enough to embrace, or at least allow the use of open source, then this is not really a problem," he said. "Some other places do not really understand open source, and can have irrational fears, with results like banning its use entirely."
IBM's global vice president of Tivoli software sales, Wally Casey, said he doesn't know if Google has done a cost analysis of its infrastructure, adding there is probably a good opportunity for consolidation. "How does a company deal with provisioning? It's a little more complex than just ordering provisioning software," Casey said. "For example, Nokia and the IRS in the US use our configuration manager to blast an update down whether you want it or not."
If automated provisioning reduces human error, there may be tens of millions in savings waiting to be taken advantage of, Casey said.