This article is excerpted from the book Linux Patch Management: Keeping Linux Systems Up to Date, published by Prentice Hall Professional, as part of the Bruce Perens' Open Source Series, in January 2006. Copyright 2006 Pearson Education.

This is the second part of a two-part article. The first part was published yesterday.

Amount of data

Hard drives don't always have to be large. If you're only storing updates on a Linux patch management repository, all you need on a proxy server is room for the operating system and the package updates. With today's Linux distributions, that amounts to less than 10GB of data (even for SUSE Linux).

But this may grow quickly depending on your needs. As suggested earlier, if you have more than one type of system architecture, such as Intel 32-bit, 64-bit, PowerPC, and so on, data requirements multiply, because each architecture needs its own set of packages.

In addition, if you have more than one version of Linux that requires updates, you'll need to keep separate repositories for each. In other words, not only do you need separate repositories for any RHEL, SUSE, and Debian computers, you'll need separate repositories for RHEL 3, RHEL 4, SUSE 9.0, SUSE 9.1, SUSE 9.2, SUSE 9.3, SUSE 10.0, Fedora Core 1 through 5, and more.

Source packages

The size of your repositories could easily double if your users need access to the source code.
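The storage arithmetic described so far can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: the 10GB per repository is the upper bound mentioned earlier, the doubling for source packages is the rule of thumb above, and the function name and the lists of releases and architectures are hypothetical.

```python
def estimate_storage_gb(releases, architectures, gb_per_repo=10, include_source=False):
    """Rough upper bound on disk space for a set of local repositories.

    Assumes every release/architecture pair needs its own repository of
    about gb_per_repo gigabytes, and that source packages roughly double
    the total. Both figures are illustrative assumptions.
    """
    repo_count = len(releases) * len(architectures)
    total = repo_count * gb_per_repo
    if include_source:
        total *= 2
    return total


# Hypothetical network: four releases on three architectures.
releases = ["RHEL 3", "RHEL 4", "SUSE 9.3", "SUSE 10.0"]
architectures = ["i386", "x86_64", "ppc"]

binary_only = estimate_storage_gb(releases, architectures)
with_source = estimate_storage_gb(releases, architectures, include_source=True)
print(binary_only, with_source)
```

Real repositories vary widely in size, so treat the result as a planning ceiling rather than a measurement.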

Source code is readily available because of the requirements of the GPL. This allows any developer to take the source code and modify it for their own needs. Developers can modify and redistribute GPL source code, as long as it is still released under the GPL.

Unless you're administering a network with Linux developers, you might think that you don't have to worry about Linux source packages. But you don't have to be a developer to need Linux source code. Anyone who wants to customise the Linux kernel needs the source code. While some distributions make the kernel source code available in a binary package, Red Hat releases its build of the kernel source code as a source RPM.

Linux drivers for many hardware components are still under development. Experimental drivers often work well. But to install many of these drivers, you need the kernel source code.

The need for source code is not limited to the kernel. Regular users can modify the source code of many GPL-licensed programs. All they need is the source code, and they can revise, compile, and rebuild the program of choice.

More than one repository

Even if all the Linux computers on your network run the same version of the same distribution using the same CPU, you may need more than one repository. As noted earlier, Fedora Core 4 includes configuration files for six different repositories. While most users don't need access to the Red Hat development repositories, you never know when one user might be so desperate for the latest features that he is willing to try a development version of his favourite program.

You might need separate repositories on a Linux patch management server for the following reasons:

  • Distribution brands; each distribution brand builds its packages differently and therefore maintains separate repositories.

  • Distribution versions; for example, if you have a reliable server on RHEL 2.1, you might not want to upgrade to RHEL 4, even though a subscription makes this possible. Red Hat has committed to support each of its enterprise distributions for at least five years, which means you might need to maintain local repositories for RHEL 2.1 through at least 2007.

  • Number of architectures; for each architecture where you have Linux installed on your network, you'll need a separate repository.

  • Source requirements; if some of your users are Linux developers, you might need to download source packages into a separate repository.

  • Development packages; if you're monitoring the progress of a program, you'll want access to the development packages so you can monitor improvements as they're made.

  • Testing packages; some groups make development packages available in a separate repository before declaring them stable and ready for a production environment.

  • Independent repositories; some developers keep independent repositories on their Web and FTP servers. Many are available for public use.
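One way to keep these repositories apart on a patch management server is to give each combination its own directory. The sketch below generates such a layout; the root directory, the naming scheme, and the function name are all hypothetical choices for illustration, not a convention required by any distribution.

```python
from itertools import product
from pathlib import PurePosixPath


def repository_paths(root, releases, architectures, channels):
    """Return one directory per (release, architecture, channel) combination."""
    return [
        PurePosixPath(root) / release / arch / channel
        for release, arch, channel in product(releases, architectures, channels)
    ]


paths = repository_paths(
    "/srv/repos",                      # hypothetical root directory
    ["rhel-4", "fedora-core-4"],       # distribution brands and versions
    ["i386", "x86_64"],                # architectures
    ["updates", "source", "testing"],  # package types
)
for path in paths:
    print(path)
```

Two releases, two architectures, and three package types already produce twelve directories, which shows how quickly the list of repositories grows.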

Keeping your repository updated

There are two basic methods you might use to keep a local repository up to date. Some distributions allow you to store downloaded packages and their headers on a local computer. You can then share the directory with those downloaded packages and point the other computers on your network to that local repository. Red Hat and SUSE make that possible in some cases with their subscription systems.

Alternatively, you can synchronise your repositories with those available online. This is where the rsync command can help. (I detail how you can do this in later chapters.)
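As a minimal sketch of the synchronisation approach, the Python below assembles an rsync command line for mirroring a remote repository into a local directory. The mirror URL, the local path, and the function name are hypothetical; the rsync options used (--archive, --verbose, --delete, --bwlimit, and --dry-run) are standard, but check your mirror's usage policy before relying on them.

```python
def build_rsync_command(source, destination, bandwidth_kbps=None, dry_run=False):
    """Build an rsync command line that mirrors source into destination."""
    # --delete removes local files that have disappeared from the mirror.
    command = ["rsync", "--archive", "--verbose", "--delete"]
    if bandwidth_kbps is not None:
        # Cap the transfer rate so the download does not saturate the link.
        command.append(f"--bwlimit={bandwidth_kbps}")
    if dry_run:
        # Report what would be transferred without changing anything.
        command.append("--dry-run")
    command += [source, destination]
    return command


command = build_rsync_command(
    "rsync://mirror.example.com/updates/",  # hypothetical mirror
    "/srv/repos/updates/",                  # hypothetical local repository
    bandwidth_kbps=500,
    dry_run=True,
)
print(" ".join(command))
# To run it for real, pass the list to subprocess.run(command, check=True).
```

Starting with a dry run is a cautious choice, because --delete removes local packages that no longer exist on the mirror.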

Some administrators prefer to wait until updates are available in a more convenient format. For example, RHEL makes quarterly updates of its distributions available, which administrators can configure into a local repository using the techniques described in the last half of this book.

After you've determined how you'll keep your repositories up to date, automate the process so that it happens during off-peak hours. Remember, you might be downloading hundreds of megabytes of packages, so make sure that the download does not interfere with other scheduled work, such as:

  • Computing-intensive programs. For example, intensive database programs require so many computing and network resources that they're often run while most users are not at work.

  • Administrative programs. Backups, log rotations, and more in Linux are often configured in scripts in the /etc/cron.daily directory.

  • Other downloads. Many Linux users download other distributions. It's best if you can restrict large downloads by regular users to specific time periods.

  • Power outages. Some networks may limit available power due to costs, availability, or local conditions.
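An automated update job can guard against running at the wrong time by checking the clock before it starts. The sketch below tests whether a given time falls inside an off-peak window; the default window of 01:00 to 05:00 and the function name are assumptions for illustration, and you would pick hours that avoid the conflicts listed above.

```python
from datetime import time


def in_off_peak_window(now, start=time(1, 0), end=time(5, 0)):
    """Return True if now falls inside the window from start up to end.

    Handles windows that wrap past midnight, such as 22:00 to 04:00.
    """
    if start <= end:
        return start <= now < end
    # The window wraps around midnight.
    return now >= start or now < end


print(in_off_peak_window(time(2, 30)))    # inside the default window
print(in_off_peak_window(time(14, 0)))    # mid-afternoon, outside it
print(in_off_peak_window(time(23, 15), start=time(22, 0), end=time(4, 0)))
```

A script started from cron could call this with the current time and exit early when the result is false.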