Multi-tasking is something we take for granted on our computer systems, as it's the concept that allows us to run a number of applications at the same time, with no apparent loss of performance. The generally accepted "proper" way to do this in a desktop or server operating system is called "pre-emptive multi-tasking". This term is, sadly, bandied about in conversation by many a bluffer ("Version n+1 is much better than version n because it has pre-emptive multi-tasking") without the speaker actually comprehending what it means.
Any task, or "process", running on a computer system requires certain resources in order to be able to run. The simplest processes have merely a requirement for some time in the CPU and a bit of system memory, while more complex processes may need extra resources (a serial port, perhaps, or a certain file on a hard disk, or a tape drive).
The basic concept of multi-tasking, often called "multi-programming", is that of allowing the operating system to split the time that each system resource is available into small chunks, and allowing each of the running processes, in turn, to have access to its desired resources. The main problem with this technique is that different processes have different requirements. So, for example, a CPU-heavy process won't be able to do as much work in a given time as a process that doesn't do much work, but each will be given the same amount of time in the CPU's schedule. The actual algorithms used to decide which process to give resource to next are therefore more complex than a simple round-robin. The decision is additionally based on concepts such as the complexity of the process's code and the priority it has been given relative to its peers.
The real meat of a multi-tasking system, then, is in the cleverness of the algorithm that decides what processes get what resources, at what time. As computers have become more powerful, so it has become possible to implement increasingly powerful and complex resource allocation algorithms without compromising the speed of the computer itself. Early desktop computer systems had very basic scheduling algorithms. This was because the processors weren't particularly powerful. The operating system designer's goal was to provide usable multi-tasking but not such that the machine spent more time figuring out whose turn it was next, rather than letting the various application processes run. As processors have grown in power, so this constraint has been lifted and the scheduling algorithms have become more clever.
The main problem that multi-tasking introduces in a system is "deadlock". Deadlock occurs when two or more processes are unable to continue running because each holds some resources that the other needs. Imagine you have processes A and B and resources X and Y. Each process requires both X and Y in order to run, but process A is holding onto resource X, and B is holding onto Y. Clearly, neither process can continue until it can grab the missing resources and so the system is said to be in deadlock. Fortunately, there are a number of ways of avoiding this.
Option one is to force each process to request all of the resources it's likely to need in order to run, before it actually begins processing. If all resources are not available, it has to wait and try again later. This works but is inefficient (a process may be allocated a resource but not use it for many seconds or minutes, during which time the resource is unavailable).
Alternatively, we could define an ordering for resources such that processes are required to ask for resources in the order stipulated. In our example above, we could say that resources must be requested in alphabetical order. So processes A and B would both have to ask for resource X before asking for Y; we could never reach the situation where process B is holding resource Y, because it would have had to be granted access to resource X first (which it couldn't have because A already has it).
The problems suffered by NASA's Mars Pathfinder vehicle in 1996 were a classic example of how deadlock can occur in a multi-processing system. In short, a high priority process couldn't run because a low priority process was hanging onto a resource that was needed by its more important peer. Although this is a pretty noddy failure, at least the system builders rescued the situation by building sufficient resilience into their code such that although it didn't avoid deadlock, it was at least capable of seeing something was wrong and resetting the system without loss of data. The fix was, fortunately, an easy one; though one would have hoped that in applications such as this, where sending a field engineer out is tricky, such apparently trivial things would have been tested on the ground first.
Option three is to allow processes to grab resources in any order, but to stipulate that if a process already has certain resources and then makes a request for a new resource that can't be satisfied, it has to release (or "pre-empt") all the resources it's holding. The next time it gets a chance to run, it has to re-request all the stuff it used to hold along with the new resources it was asking for.
Pre-emption, then, is merely a way of ensuring that the system doesn't grind to a halt due to deadlock. It just so happens that, in the average case, this is a more efficient way to work than the other two deadlock avoidance schemes mentioned above – hence its apparent attraction for multi-tasking systems. The fact that pre-emption isn't actually the big deal that people who don't understand it make out, goes a long way to answering questions like: "How has Apple managed to get away with not implementing pre-emption until MacOS X?" The simple answer being that, particularly with modern processors, pre-emption will usually be the best way but there are other mechanisms for deadlock avoidance that are only slightly slower in the average case.