At a pre-briefing about IBM's X3 four-way chipset, we asked Donn Bullock, senior brand manager for the X server division, about his vision for the company's new Hurricane chipset, said to deliver a 38 per cent performance improvement, and how it came into being. The Hurricane processor/memory logic sub-system sits within IBM's new X366 four-way 64-bit Xeon server, which is the only such product using a non-Intel chipset.
Q: What's your involvement with the project?
A: Over the last four years, I've been worldwide product manager for IBM's 440 class servers, following which I've been working on the X3 server announcement.
Q: Tell me what has contributed to the lower pricing of the X3.
A: Let's go back to the processors. Today's new Xeon is codenamed Cranford, another is Potomac. Cranford is unique because it does not have an L3 cache. Most Intel processors have had L3 caches -- Itanium has 9MB currently, the multi-processor Xeon has 4MB and is going up to 8MB.
The beauty of the Hurricane chipset -- the third generation of IBM's enterprise architecture chipsets -- is that we don't need the L3 cache to achieve the highest performance. What we find is that in a four-way system, L3 cache doesn't contribute to the performance of the system -- it just contributes price.
So in the 366 we will use the two Cranford processors. Eliminating the L3 cache means the processors have one less place to look before they get to main memory. So with the X3 chipset, we get through to main memory faster and at very high performance. We can match the performance of large cache processors without using L3.
As for pricing, if you know the price of those large cache processors, you can by extension calculate how much less expensive the 366 is going to be. We also achieve a 38 per cent performance improvement. It's absolutely phenomenal.
Q: More cache is usually a good thing. What's different this time?
A: Our third-generation chipset removes the need for the cache. Customers needed the cache because system platforms have had to rely on the processor for performance, and the processor had to manage its own cache.
We managed the cache to increase performance by introducing what we call XL4 server accelerator cache -- the L4 cache in our 445. Now we've removed the need for it because the chipset helps the processor to run faster. It actually reduces latency -- the L3 cache gets in the way of four-way processing.
At design time, there was a maniacal focus on latency reduction. When you can cut the time it takes to get from one point to the next, you can increase performance, so chipset latency has been cut by about two and a half times -- down from 265 nanoseconds to 108 nanoseconds.
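[Editor's note: the arithmetic behind the latency claim can be checked directly from the two figures quoted above. The short sketch below uses only those two numbers; the variable names are ours, not IBM's.]

```python
# Chipset latency figures quoted in the interview, in nanoseconds.
previous_latency_ns = 265
hurricane_latency_ns = 108

# How many times lower the new latency is, and the same change as a percentage.
ratio = previous_latency_ns / hurricane_latency_ns
reduction_pct = (1 - hurricane_latency_ns / previous_latency_ns) * 100

print(f"latency cut by a factor of {ratio:.2f}")
print(f"latency reduction: {reduction_pct:.1f}%")
```

[The factor works out to roughly 2.45, consistent with the "two and a half times" quoted.]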
The way we do that is through snoop bus filtering. The snoop filter looks across to the other bus -- because the system uses two buses, with two CPUs per bus -- and does intelligent caching. It can see what's in the other cache without having to send traffic across the FSB to find out. Other chipsets cannot do that and need the L3. And if you don't need L3 cache you shouldn't have to pay the premium to buy it.
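[Editor's note: the general idea of a snoop filter can be illustrated with a toy model. The sketch below is not IBM's design; the class name, the per-bus set of cached addresses and the counters are all hypothetical, and cache evictions and writes are ignored. It only shows how a chipset-side record of what each bus has cached lets a request skip the cross-bus snoop when the other side cannot hold the line.]

```python
class SnoopFilter:
    """Toy chipset-side directory: per bus, the set of line addresses cached there."""

    def __init__(self, num_buses=2):
        self.lines_by_bus = [set() for _ in range(num_buses)]
        self.snoops_sent = 0      # cross-bus snoops that were actually needed
        self.snoops_avoided = 0   # snoops skipped because the filter ruled them out

    def read(self, bus, address):
        """Handle a read from a CPU on `bus`; return where the data is found."""
        source = "memory"
        for other, lines in enumerate(self.lines_by_bus):
            if other == bus:
                continue
            if address in lines:
                # The filter says the other bus may hold the line: snoop it.
                self.snoops_sent += 1
                source = f"bus {other} cache"
            else:
                # The filter rules the other bus out: no front-side bus traffic.
                self.snoops_avoided += 1
        self.lines_by_bus[bus].add(address)
        return source


sf = SnoopFilter()
first = sf.read(bus=0, address=0x1000)   # no other bus holds the line
second = sf.read(bus=1, address=0x1000)  # bus 0 cached it a moment ago
third = sf.read(bus=0, address=0x2000)   # a different line, held by nobody
print(first, second, third)
print(sf.snoops_sent, sf.snoops_avoided)
```

[In this toy run only one of the three reads generates cross-bus traffic; the other two go straight to memory.]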
We've done nothing different than others could do; we've just been smarter. It's a technique that's been in other IBM server products for a long time -- it's taking mainframe-inspired technology and bringing it down to an industry-standard server.
We also have the benefit of faster components, faster processors, faster I/O, faster memory -- all these things are additive, which reduces bottlenecks and means we can do more transactions.
Q: What will the customer see in terms of performance?
A: We used our portfolio of benchmarks that simulate multiple types of workloads, and in real-world commercial apps nothing comes close. We use TPC-C, TPC-App, SAP, SPEC, Oracle and others.
Q: What's been the most exciting and challenging part of the project?
A: Creating the value proposition. I've enjoyed seeing the system grow into a high-performance opportunity to run 32- and 64-bit applications. 64-bit is going mainstream. What keeps me awake at night is trying to take advantage of all the opportunities we've created.
Q: How much of this technology can you bring down to other servers?
A: The X architecture grew from a desire to scale higher than four-way capability, and we'll continue to do that using our chipsets and other technologies. But we don't want to penalise the customers of our one- and two-way servers with functions they don't want or aren't willing to pay for, as they balance cost and function.