In a previous article we looked at multi-tasking (see here): the way a computer operating system "slices" the CPU's time into small chunks and shares those chunks out among the running programs, so that a number of programs appear to run concurrently.
In most modern operating systems, though, there's a second level of concurrency available to the program developer. This is called multi-threading and it's used widely by software on both Unix-style and Windows operating systems.
What is a process?
A process is a self-contained collection of program code and data storage space. Processes execute independently of each other. The operating system handles the allocation of resources such as RAM, but once a chunk of memory has been allocated to a process, it belongs to that process and is inaccessible to any others. (It is possible to declare memory as shareable with one or more other processes, but as a developer you have to go out of your way to do so; by default, your program's memory is protected from other processes.)
It's possible to run more than one instance of a program at once, with each instance running as its own process. This is typical with, say, web server programs – if you have 120 concurrent connections to your web server, these may well be served by 120 separate instances of the web server application, each of which looks after its own specific connection.
So what's a thread?
A simple process has a single "execution pointer" (more formally known as the program counter or instruction pointer), which records which instruction is to be executed next. When you start a process, the execution pointer starts at a given place (the program's entry point, in effect the first line of the main program code) and moves about based on the flow of the program code. In a multi-threaded program, on the other hand, you can have multiple execution pointers within the same process, so you can have a number of bits of the same program all in progress at once. Each thread has a single execution pointer, so the number of things being done at once equals the number of threads being run.
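To make the idea concrete, here is a minimal sketch in Python (chosen purely for illustration; the function names worker and run_threads are made up for this example). It starts several threads that all run the same function, and each one records the ID of the process it is running in. Every thread reports the same process ID, because they are separate flows of execution inside a single process.

```python
import os
import threading

def worker(name, results):
    # Every thread runs this same function, each with its own execution pointer.
    # os.getpid() returns the same value in all of them: one process, many threads.
    results[name] = os.getpid()

def run_threads(count=3):
    results = {}  # shared by all threads; each thread writes only its own key
    threads = [
        threading.Thread(target=worker, args=(f"thread-{i}", results))
        for i in range(count)
    ]
    for t in threads:
        t.start()  # from here on, the thread runs independently
    for t in threads:
        t.join()   # wait for the thread to finish
    return results

if __name__ == "__main__":
    for name, pid in sorted(run_threads().items()):
        print(name, "ran in process", pid)
```

Note that all the threads write into the same results dictionary. That is only safe here because each thread touches its own key and nothing else.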
The main difference between a thread and a process is that a multi-threaded program may have a number of threads (and thus several execution pointers) working on the same set of instruction code and the same data structures. On the face of it, this sounds quite dangerous, as you could have two or more threads modifying the same piece of data and trampling over each other's work. And this is true – the biggest pain when you're writing a multi-threaded program is ensuring that you control the synchronisation between the various threads and making sure that they don't muck up each other's execution.
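As an illustration of the kind of synchronisation involved, the sketch below (again in Python, with made-up names add_many and count_with_lock) has several threads incrementing one shared counter. The increment is a read-modify-write operation, so it is guarded with a lock, which is one common way of stopping threads from trampling over each other; it is by no means the only one.

```python
import threading

def add_many(counter, lock, times):
    for _ in range(times):
        # Only one thread at a time can hold the lock, so the
        # read-modify-write below is never interleaved with another thread's.
        with lock:
            counter["value"] += 1

def count_with_lock(num_threads=4, times=10_000):
    counter = {"value": 0}   # the data shared by every thread
    lock = threading.Lock()  # protects every access to counter
    threads = [
        threading.Thread(target=add_many, args=(counter, lock, times))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]

if __name__ == "__main__":
    print(count_with_lock())
```

With the lock in place, the final count is always the number of threads multiplied by the number of increments each one performs. Without it, two threads could read the same old value and one of the updates could be lost.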
There are two immense benefits with multi-threading, though, of which the first is efficiency. By having a single copy of the program code and data that's shared by all the threads, you can potentially save a great deal of RAM (if you have 100 threads running in a 200k program, it takes roughly 200k, plus a small stack for each thread; if you had 100 wholly separate processes instead, that would be 100 x 200k, or about 20MB). You also save on process setup and tear-down time: it takes time and effort on the operating system's part to create a new process, allocate the memory, load the code and data into it, and so on, and yet more time and effort to dispose of everything afterwards.
The second benefit, and the main one, is that as long as you're careful, you can have multiple lumps of code executing over the same data. As this isn't obviously a benefit, think about Microsoft Word. Whenever you type a word, the program's spell checker matches that word against its internal dictionary, and if it doesn't match, a little wiggly line appears below the offending word to alert you to the fact that it might be misspelt. If Word were executing as just one big thread, your keyboard would seem to freeze each time you completed a word, since the program would have to stop listening to you and go off and execute the spellchecker. Because the spellchecker is running as a separate thread within Word, though, the bit of the program that handles the user's typing simply notifies the spellchecker thread that there's a new word for it to check, and gets on with letting you type more while the spellchecker does its matching in the background (and perhaps then notifies the editing software that it needs to flag the word as suspect).
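The pattern described here, one thread handing work to another and carrying on, can be sketched in a few lines of Python. This is emphatically not how Word itself is implemented; the dictionary contents and the names spellchecker and type_text are invented for the example. The "typing" code puts each finished word on a queue and moves straight on, while a background thread takes words off the queue and notes any that aren't in the dictionary.

```python
import queue
import threading

# Hypothetical stand-in for the program's internal dictionary.
DICTIONARY = {"the", "quick", "brown", "fox"}

def spellchecker(words, suspects):
    # Runs in its own thread: waits for words and checks each one in turn.
    while True:
        word = words.get()   # blocks until a word is available
        if word is None:     # sentinel value: no more words are coming
            break
        if word.lower() not in DICTIONARY:
            suspects.append(word)  # would trigger the wiggly line

def type_text(text):
    words = queue.Queue()    # thread-safe channel between the two threads
    suspects = []
    checker = threading.Thread(target=spellchecker, args=(words, suspects))
    checker.start()
    for word in text.split():
        words.put(word)      # hand the word over and carry on "typing"
    words.put(None)          # tell the spellchecker we've finished
    checker.join()
    return suspects

if __name__ == "__main__":
    print(type_text("the quikc brown fox"))
```

The queue does the synchronisation for us: the typing code never waits for a word to be checked, and the spellchecker never sees a half-delivered word.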
Multi-threading gives the programmer the ability to run a number of different functions over the same data concurrently, and allows each thread to communicate with the others that exist within the process. It can be tricky to get to grips with, because the job of preventing one lump of code from trampling over another is left to the programmer instead of being handled by the operating system. Once the developer has mastered this, though, multi-threading provides a powerful way to split up program code execution, making applications perform better and with less disruption to the interaction between them and the user.