When Intel introduced its first pure 64-bit processor line, the Itanium, grand forecasts were made for it. However, as it turned out, there was still plenty of life left in 32-bit and hybrid 32/64-bit architectures, not least because the first-generation Itanium's performance was disappointing, and software required reworking to gain what performance advantage there was to be had.

Today, other processors such as Intel's Xeon and AMD's Opteron look much more cost-effective at the low end - under $25,000 per system. Intel claims 85 percent of this segment, but where it really wants to be is in high-end servers, where margins and processor counts are significantly higher.

It claims to have 47 percent of the $25,000-plus market today, the rest being various RISC processors. However, most of its presence is in the $25,000-$50,000 region.

Still, Intel has had some successes in large scale multiprocessing. The 24th edition of the TOP500 list was released last month, listing the world's most powerful supercomputers, and showed Intel climbing up the rankings fast. It notes that 320 of the 500 fastest systems are now Intel-based, up from 287 six months ago, and only 189 a year ago. (By comparison, 31 of the 500 are AMD-based.)

A focus on HPC
As a result, while Intel is happy to hype the many system builders who offer dual-processor (DP) Itanium machines, a lot of its effort is now being directed elsewhere - specifically at high-performance computing (HPC) and supercomputers, primarily for scientific and technical applications. It has been working hard to score design wins here, and says that 84 of the systems on the TOP500 list are based on Itanium, with two of the top five being Itanium-2 clusters.

The Itanium-2 line now includes the Madison chip running at 1.6GHz with 9MB of cache for multiprocessor servers (up from 1.5GHz and 6MB), plus 3MB cache versions at 1.6GHz for DP systems or at 1.3GHz for low-voltage blade server use.

The second half of 2005 should see Madison supplanted by chips codenamed Montecito for multiprocessor systems and Millington for dual-processor systems, to be followed by Montvale, which will also have a DP version.

The current Itanium-2 features hyperthreading, but these upcoming chips will also have dual cores and support up to four threads per core. Montecito will also have up to 24MB of on-die L3 cache (12MB per core) and a number of other new technologies, including hardware support for virtualisation (Silvervale, which supports and protects processor partitioning) and several ways of saving power, such as demand-based switching.

Multicore Itanium
Beyond them, Intel is lining up technologies called Tukwila for multiprocessors and Dimona for DP, in a generation which will have more than two cores per processor. It says this generation will have a common platform architecture, plus enhanced I/O and memory, and self-provisioning and self-healing capabilities, among others.

Montecito is a big chip - some 1.7 billion transistors and about 500 square millimetres - and it will consume up to 100W, so you might think it's no wonder that Intel is worried about power saving. However, this is rather less than Madison consumes for a single core, and Intel still claims Montecito will outperform Madison by 50 to 100 percent, even on its lower power budget.

Intel says it installed 100,000 Itanium processors in 2003 and that 2004's shipments are 60 percent up on that. Meanwhile, SGI says it has delivered more than 1000 Altix systems, containing over 31,000 Itaniums (Itania?), which must make it a very significant sales channel for the processor.

A third of those have gone to a single customer, namely the US space agency NASA, which has acquired 20 of SGI's largest Altix 3000-series Linux-based supercomputers, each with 512 processors. These have been put to work on projects such as the main engine upgrade and re-entry analysis for the Space Shuttle, as well as climate modelling, and the total cluster with its 10,160 processors is rated number two in the TOP500.

Developers wanted
The next big challenge for Intel is software support. It says that over 2000 third-party applications have been ported to or recompiled for Itanium. This sounds impressive, but it puts Itanium way behind competing architectures, whether they be Sun Sparc (12,000) or a Palm PDA (20,000).

As Intel pushes into supercomputing, there is the added problem that many developers and potential users do not have the resources to buy this sort of high-end hardware, so Intel must provide them with systems to work on instead. The latest example is a 32-way Altix 3000 at Intel's UK head office in Swindon - together with SGI, Intel will make the system available to developers who need to test their software on a high-end parallel processor, either on-site or remotely.

The deal also solves SGI's problem of how to deal with major European customers and software houses. "We have systems to ship to developers worldwide, but those are smaller," said Christian Tanescu, SGI's director of engineering for simulation. He added that his company chose to use Itanium-2 because it can address more memory than any other processor. "Today we can offer 16TB of memory, it will be 192TB next year," he said.

The Altix 3000 family takes processors in blocks of eight: the 32-way machine at Intel's office has four of these, plus half a TB of disk in the main rack and 2TB more in a second rack. Larger systems have multiple racks up to 512 processors. At present, SGI can run a single Linux image on 256 processors, with 512 planned for next year.