Free Republic
Browse · Search
News/Activism
Topics · Post Article

[AMD vs. Intel] Opteron and Itanium: Two Roads to 64-bit Computing
Ace's Hardware ^ | July 5, 2002 | Johan De Gelas

Posted on 07/05/2002 10:05:53 PM PDT by JameRetief

Opteron and Itanium: Two Roads to 64-bit Computing
By Johan De Gelas
Friday, July 5, 2002 7:51 AM EDT

A flood of articles has already been written about AMD's Opteron, otherwise known as Sledgehammer and Clawhammer DP. Quite a few editorials predict that it will become a very popular server and workstation CPU, one that will force Intel to follow in AMD's footsteps and introduce 64-bit extensions in its current 32-bit x86 line. At the same time, Intel and many industry analysts claim that 64-bit CPUs for the workstation and desktop are more of a marketing gimmick than anything else, at least for the present.

You may also recall Intel's comments from our CEBIT coverage:

Intel really wants to bring HyperThreading technology to the home desktop, as they believe it will add more value than 64-bit addressing. Intel speculates that the current desktop PC might be a sort of home server in the future, with many thin home appliances connected to it. In other words multi-threading will become more and more important as each of these home appliances (MP3 music player, MPEG4 movie box.... ) will run different tasks and threads on the central PC. Therefore, Prescott, which will probably feature 1 MB of L2-cache (not confirmed), will bring HyperThreading to the home desktop market.

The reasoning behind "64-bit is not necessary for desktops and workstations" is the claim that more than 4 GB of RAM is only a necessity in high-end servers. The majority of desktop PCs today ship with 256 MB and high-end PCs have 512 MB RAM. The amount of RAM in a typical PC doubles at most every year. In 1999, we typically had 64-128 MB of RAM in our PC, in 2000 it was 128 MB, in 2001, 128-256 MB, and today we are moving towards 512 MB. Workstations, on the other hand, are typically shipped with 512 MB to 1024 MB of RAM.
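As a rough check on that growth rate, the sketch below projects when a configuration would reach 4 GB if RAM really did double every year, which the article treats as the upper bound. The starting sizes are taken from the figures above; the function name and the strict yearly doubling are assumptions made for illustration, not a forecast.

```python
def year_reaching(start_mb: int, start_year: int, target_mb: int = 4096) -> int:
    """Return the first year in which RAM, doubling yearly, reaches target_mb."""
    ram, year = start_mb, start_year
    while ram < target_mb:
        ram *= 2      # assumed: one doubling per year (the article's upper bound)
        year += 1
    return year

desktop_year = year_reaching(512, 2002)       # high-end desktop in 2002
workstation_year = year_reaching(1024, 2002)  # well-equipped workstation in 2002
print(desktop_year, workstation_year)
```

Under that assumption a 512 MB desktop would not reach 4 GB before 2005, and a 1 GB workstation would get there around 2004. That only describes typical configurations, not users who already need more than the typical amount.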

Therefore Intel and others claim that support for more than 4 GB of RAM is not necessary for anything but high-end servers for 2002 and 2003. Let's investigate this statement.

64-bit CPUs: Only For Large Servers?

At first glance, this seems accurate. Most x86 workstations do indeed ship with 1 GB of RAM, and the typical workstation chipsets (i860) support no more than 4 GB. In fact, a lot of x86 workstations are limited to 2 GB. But does this mean that workstation users do not need more than 4 GB? John, a high-end workstation user, answered this question with a resounding "NO!" on our General Message Board:

"I work on individual parts which have hundreds of thousands of edges (aircraft radar antennas). My assemblies require 2-3 gigs of RAM just to pull up. Ever since the Geforce cards came out, it has been feasible to work in shaded mode, so we usually do.... since most models are more pleasant to look at that way.... especially curved surfaces."

I started asking around, and it seems that a lot of workstation users share his view and are not happy at all with 2 GB in a workstation. High-end CAD and 3D animation applications typically use between 2 and 3 GB of memory, and several users have indicated that they would be more productive with 4 GB or more. Even with 4 GB of memory, the current 32-bit versions of Windows 2000 cannot satisfy their needs:

"Windows 2000 Pro is our baseline OS, with the high-end users running Advanced Server. Advanced Server has a tweak which allows an individual process to use up to 3 GB of memory, while Windows 2000 Pro only allows 2 GB per process."

You can imagine that these users are looking forward to a 64-bit version of Windows which can use more than 4 GB of memory and offer more space for their most demanding processes.

When we asked John whether he was looking forward to Hammer and what his feelings were about x86-64, he answered:

"Boy howdy! x86-64 addresses two of the biggest problems I have:
a) I don't have to wait for my software vendor to change anything to get great performance immediately (in 32-bit mode)
b) 64 bit memory addressing will allow me to spend less time working around the 3 GB limit of Windows, and improve the collaborative engineering process by letting [me] be more parametric."

So let it be clear that current workstation users are working in strained circumstances, and there is a lot of interest in a 64-bit x86 solution. So, will Hammer DP take the workstation market by storm? Does Intel have anything to counter this?

Intel Solution #1: Xeon and Physical Address Extensions (PAE)

Many of our readers are hardware veterans and will point out that the current 32-bit Xeon CPUs from Intel are not limited to 4 GB of memory. Indeed, Intel's latest E7500 chipset for the Pentium 4 Xeon supports up to 16 GB of RAM, and the slightly older Profusion chipset for the Pentium III XEON supports up to 32 GB of RAM.

Intel's Xeon CPUs feature Physical Address Extensions (PAE), which provide a 36-bit address bus. So, in theory, a Xeon (or any current Intel CPU, for that matter) can access up to 64 GB. So is the problem solved, with no point in migrating to a new 64-bit platform and investing in new software? Not quite. The first problem is that you need proper software support to access more than 4 GB on a 32-bit PAE CPU.
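The 4 GB and 64 GB figures follow directly from the number of address bits. A minimal sketch of that arithmetic is shown here; the helper name is made up for illustration, and the 64-bit result is only the ceiling set by pointer width, not what any shipping system supports.

```python
GIB = 2 ** 30  # bytes in one gigabyte (binary)

def addressable_gib(address_bits: int) -> int:
    """Amount of memory reachable with the given number of address bits, in GB."""
    return 2 ** address_bits // GIB

plain_32bit = addressable_gib(32)  # ordinary 32-bit addressing
pae_36bit = addressable_gib(36)    # 36-bit physical addressing with PAE
ceiling_64bit = addressable_gib(64)  # theoretical ceiling of a 64-bit pointer

print(plain_32bit, pae_36bit, ceiling_64bit)
```

Note that PAE widens only the physical address: each 32-bit process still sees a 4 GB virtual address space, which is why extra software support is needed.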

First of all, you need Windows 2000 Advanced Server, which can address up to 8 GB, or the extremely expensive Windows 2000 Datacenter for up to 64 GB. Then you also need software that makes use of Microsoft's Address Windowing Extensions (AWE) API. Only then can you use more than 4 GB of memory. By the way, the Linux kernel 2.4 also supports PAE and thus more than 4 GB of memory in total, but each process can only use 4 GB of memory.

So, let's investigate Windows AWE a bit more. AWE works with an AWE window that exists within the 4 GB address space. As 32-bit CPUs can only address 4 GB of memory at any given time, every time you need something that the OS has put above the 4 GB RAM limit, the AWE window needs to be remapped to the location where that data is stored.

As this model suggests, AWE comes with a lot of overhead. Because Windows has to keep track of the memory pages, mapping AWE memory onto the AWE window is a memory-intensive and slow operation that takes not a few nanoseconds, but tens of microseconds! So, if you have to access a lot of different locations in the memory above 4 GB, a lot of remapping and bookkeeping must be done. AWE memory might still be quite a bit faster than accessing the hard disk, but it is 10 to 100 times slower than normal memory use. Optimizations in an application to improve data locality can minimize the need to shuffle AWE windows around too much, thereby improving performance with AWE memory to some degree.
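To make the locality argument concrete, here is a toy model of a single remappable window. It does not call the real AWE API; the function, the window size, and both cost constants are assumptions chosen only to match the orders of magnitude quoted above (tens of microseconds per remap).

```python
REMAP_COST_US = 20.0   # assumed cost of one window remap ("tens of microseconds")
ACCESS_COST_US = 0.1   # assumed cost of one ordinary memory access

def total_cost_us(addresses, window_size):
    """Toy model: one window; touching a different region forces a remap."""
    mapped_region = None
    remaps = 0
    for addr in addresses:
        region = addr // window_size
        if region != mapped_region:   # data lies outside the current window
            mapped_region = region
            remaps += 1
    cost = remaps * REMAP_COST_US + len(addresses) * ACCESS_COST_US
    return remaps, cost

WINDOW = 1024  # window size in abstract units (hypothetical)

# Good locality: sweep all of region 0, then all of region 1.
sequential = list(range(2 * WINDOW))
# Poor locality: the same addresses, alternating between the two regions.
interleaved = [a for pair in zip(range(WINDOW), range(WINDOW, 2 * WINDOW))
               for a in pair]

seq_remaps, seq_cost = total_cost_us(sequential, WINDOW)
mix_remaps, mix_cost = total_cost_us(interleaved, WINDOW)
print(seq_remaps, mix_remaps, mix_cost / seq_cost)
```

Both runs touch exactly the same addresses, but the sequential sweep needs only two remaps while the interleaved order needs one per access. In this model the remap cost dominates as soon as locality is poor, which is the behavior the paragraph above describes.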

The result is that, for the most part, AWE memory is only interesting for caching databases, to avoid accessing the disk system. But a workstation user who needs to work fluidly and efficiently with massive datasets has no use for it.

So, for high-end workstation users, the 32-bit Xeon with or without PAE looks much less attractive than AMD's Opteron. Even without 64-bit x86-64 workstation applications, AMD's flagship CPU should do very well as long as the x86-64 version of Windows arrives in a timely fashion. In that case, Hammer users will be able to assign 4 GB to their favorite applications (and use the rest for the OS and other applications), while Xeon users will still be limited to 2-3 GB per application.

Intel Solution #2: Deerfield

Long before marketing decided to name Intel's IA-64 processor "Itanium," the IA-64 project was codenamed "P7." If you consider that the Pentium Pro was codenamed "P6" and Willamette was codenamed "P68," it is clear that years ago Intel thought IA-64 would take over around the time that 64-bit addressing became necessary to compete in the workstation, server and even desktop markets.

Looking at the massive Itanium modules, with their rather mediocre performance, it is pretty hard to imagine that Intel expects IA-64 will take over from 32-bit x86 in the near future.


Dual Itanium - a massive module

Right now it seems that IA-64 is a total failure. Today this is true from a commercial point of view, but from a technical point of view, Itanium still has merits.


The Itanium chip itself is only 25 million transistors; you can see the separate L3-cache chips above

Yes, in total, an Itanium module features 325 million transistors and at 130 Watts, it is a power guzzling beast. Nevertheless, it must be noted that the Itanium core itself (including 32 KB L1 and 96 KB L2 caches) features only 25 million transistors, while the four L3 cache chips are good for 75 million transistors each. In other words, IA-64 has kept at least one promise: it saves transistors in the decoding and scheduling part, and should theoretically offer better IPC by using those transistors for more registers and execution units. But IA-64's first implementation in Itanium seems to be a failure in achieving significantly higher IPC, as Itanium delivers less performance per clock than essentially all of the RISC competitors it seeks to replace, according to SPECint2000. Indeed, in terms of performance per clock, the Itanium is behind the MIPS R1x000 series, HP's PA-RISC 8x000 chips, IBM's POWER3 and POWER4 architectures, Fujitsu's SPARC64 GP, the Alpha 21264, Sun's UltraSPARC II and III, and Intel's own Pentium III. IPC is only one factor in overall performance, however, but while Itanium does pull ahead of the x86 world's best in SPECint/GHz, the Pentium 4 and Athlon, it cannot match the clockrate of either chip.
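The transistor budget quoted above is easy to verify. The short calculation below uses only the figures from the paragraph; the variable names are chosen for illustration.

```python
CORE_TRANSISTORS_M = 25     # Itanium core incl. L1 and L2 caches, in millions
L3_CHIP_TRANSISTORS_M = 75  # each external L3 cache chip, in millions
L3_CHIPS = 4                # number of L3 cache chips on the module

l3_total = L3_CHIPS * L3_CHIP_TRANSISTORS_M
module_total = CORE_TRANSISTORS_M + l3_total
core_share = CORE_TRANSISTORS_M / module_total

print(module_total, f"{core_share:.1%}")
```

The core accounts for less than a tenth of the module's transistors; the rest is L3 cache.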

Itanium/Merced is just one implementation of IA-64, and all indications are that Itanium 2, aka McKinley, will be significantly improved. The last time we spoke with Intel, they pointed towards Deerfield as Intel's upcoming 64-bit workstation CPU. Deerfield is a low-cost version of Madison, the 0.13µ version of Intel's improved IA-64 McKinley processor. Essentially, it is a 0.13µ McKinley, while Madison increases the on-chip cache sizes. Mike Magee of the Inquirer reported that Deerfield would be launched as early as the second quarter of 2003:

"Madison and Deerfield are slated for Q2 of next year, with 3MB and 4MB caches respectively, and again using the E8870 chipset."

In other words, Deerfield could be a significant, though slightly late and more expensive competitor to AMD's Opteron.

Deerfield is based on the McKinley core, which offers much higher performance than Itanium. According to a report at the Inquirer, the 1 GHz Itanium 2 ("McKinley") outperforms the current 800 MHz Itanium by roughly 90% in both SPECint and SPECfp (760 vs 400, 1350 vs 701). Intel itself, meanwhile, indicates that McKinley will deliver 70% better performance in SPECint2000 and 75% better performance in SPECfp2000 (see page 6 of this document for details). The biggest problem for Intel, however, will not be performance, but getting the major ISVs to produce IA-64 versions of their software. At this point in time, we have yet to see one major workstation application for the Itanium, as Intel's IA-64 CPU has been positioned towards the high-end server market for the most part.
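For readers who want to check the reported gains, the sketch below works them out from the scores quoted above and also normalizes the integer result by clock speed. The helper name is hypothetical, and the per-clock figure is a simple score-per-MHz ratio, not an official metric.

```python
def gain_percent(new: float, old: float) -> float:
    """Relative improvement of new over old, in percent."""
    return (new / old - 1.0) * 100.0

int_gain = gain_percent(760, 400)   # reported SPECint scores, McKinley vs Itanium
fp_gain = gain_percent(1350, 701)   # reported SPECfp scores, McKinley vs Itanium

# Integer score per MHz, 1 GHz McKinley relative to 800 MHz Itanium.
per_clock_ratio = (760 / 1000) / (400 / 800)

print(round(int_gain, 1), round(fp_gain, 1), round(per_clock_ratio, 2))
```

The gains come out at about 90% for integer and about 93% for floating point. Since the clock speed rises by only 25%, most of the integer improvement in these reported numbers would have to come from higher performance per clock.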

Intel Solution #3: Prescott

Prescott is Intel's next-generation Pentium 4 processor line, which will have close to 100 million transistors and which will be produced on a 90-nanometer process. The high-end desktop chip (still 478 pin) has been rumored to include "Yamhill" technology, a sort of "Intel x86-64."

Our sources confirmed that Prescott does indeed have a lot more cache on board - a logical evolution after Northwood - most likely a larger 1 MB L2 cache. We also received some confirmation that Prescott has quite a few architectural improvements, but no indication of x86-64 support so far. Apparently, Intel has studied x86-64, but canceled the project. It is still possible that the x86-64 extensions are in the chip, but not activated.

Two independent sources on the Internet seem to confirm this. First of all, the Inquirer reported that Paul Otellini underlined the importance of IA-64:

"A REPORT QUOTED senior Intel executive Paul Otellini as saying the firm would not produce a 64-bit backward processor compatible with 32-bit code.

Paul Otellini, speaking at a meeting in New York earlier today, said Intel's future was firmly in the Itanium camp and he confirmed earlier INQUIRER reports that Madison is slated for next year and will include 3MB and 6MB caches."

Secondly, the online CV of an Intel engineer, Andy Glew, says:

"IA32+ - yet another canceled project, which proposed to extend the IA32/x86 instruction set to 64 bits. I worked on 64 bit page tables, obtaining a patent on variable page size encoding."

This is not really surprising. After all, an Intel version of x86-64 would be devastating for IA-64 software support. Why would any ISV invest large amounts of money in IA-64 software development if there is a much easier alternative to develop for?

As our CEBIT report indicated, Intel wants to bring HyperThreading technology to the home desktop, and Prescott will be the first desktop processor to improve performance with HyperThreading. Also, Prescott will feature a 667 MHz FSB around Q2 2003, which will be fed by dual channel PC2700 DDR SDRAM.

With the evidence we have today, we must conclude Prescott will still face the 4 GB limit (without PAE), and as such is not a solution for those workstation users who need more than 4 GB. Prescott is a very dangerous competitor for AMD's desktop Clawhammer, and Nocona (the "Xeon Prescott") for AMD's Opteron, but it still leaves a window of opportunity open for the Opteron in the high-end workstation market, which will need more than the 3 GB process space that the 32-bit versions of Windows (XP) can offer.

Intel Solution #4: Tejas

Little is known about Tejas, which is the next incarnation of the Pentium 4 line. Tejas seems to be scheduled to launch in the first half of 2004. At that time, the majority of workstation users will probably need more than 4 GB, so Intel needs to do something to break the 4 GB barrier. Relying on Deerfield alone seems very risky. At first I could not believe my eyes when some of our sources indicated IA-64 support within Tejas!

But then I read the CV of Andy Glew, the Intel Engineer, a bit further:

"Tejas - evaluated support for IA64 within a mainstream IA32 processor."

Tejas is too far off to speculate whether or not Intel would integrate IA-64 in a 32-bit x86 CPU, let alone how. But it is clear that Intel is very serious about migrating to IA-64. Furthermore, Tejas is rumored to feature a faster FSB (1200 MHz?). However, considering all the differences between x86 and IA-64, the chances of a true hybrid processor seem remote at best.

Conclusion

Considering that Intel is on the verge of launching the Itanium 2 and has another four IA-64 cores in the pipeline (Madison, Deerfield, Montecito (90 nm), Chivano (90 nm)), it is very unlikely that Intel would kill off ISV investments in IA-64 software by introducing their own version of x86-64.

In the high-end server market, Intel's Itanium 2 should start to pick up, as it performs pretty well, and software support should begin to improve. The original Itanium hasn't exactly paved the way for its successor, though, so Itanium 2 will still be starting from effectively ground zero in terms of software support and userbase. Meanwhile, AMD's Opteron will probably only gain marketshare slowly, as this market is less susceptible to performance but more to software support, reputation, perceived reliability and the force of habit ("nobody gets fired for choosing Intel based servers"). Let us not forget that by volume, 89% of all servers ship with Intel CPUs!

And it remains to be seen how quickly the big database vendors (Oracle, Sybase, Microsoft) will produce fully 64-bit x86-64 versions for the Opteron.

However, in the workstation market, the Opteron can be a very effective weapon. As we have reported before, the Opteron's platform is very scalable, and HyperTransport is a very elegant way of interconnecting the ASICs, making motherboards very flexible and less expensive. Running a 64-bit version of Windows, the Opteron can offer 4 GB to each 32-bit process without any performance hit. Because it is significantly improved over the Athlon MP, we expect the Opteron to perform exceptionally well in workstation applications, and these two advantages might increase AMD's popularity in the workstation market. In the longer term, this might encourage workstation ISVs to develop and launch x86-64 versions of their software.

To counter the threat of Opteron, Intel will most likely opt to push Deerfield in the high-end workstation market, while a Xeon version of Tejas could smooth the migration from 32-bit x86 to 64-bit IA-64. Nevertheless, there is an important window of opportunity for AMD in 2003. This wormhole to the workstation galaxy will probably collapse in 2004, when Tejas enters the market and Intel has gathered enough software support for Deerfield.

For the immediate future, however, we have Itanium 2 to look forward to, and beyond that, in 2003, is Opteron. We'll be taking a look at both in the third part of Chris' Volume Multi-Processor Systems series. If you aren't familiar with it yet, you may want to read over Part 1 and Part 2.

All Content is Copyright (C) 1998-2002 Ace's Hardware. All Rights Reserved.


TOPICS: Business/Economy; Culture/Society; Extended News; News/Current Events; Technical
KEYWORDS: amd; barriers; clawhammer; competition; desktop; intel; processors; servers; techindex

1 posted on 07/05/2002 10:05:53 PM PDT by JameRetief

To: *tech_index; Ernest_at_the_Beach
.
2 posted on 07/05/2002 10:41:57 PM PDT by Libertarianize the GOP

To: JameRetief
bookmark bump
3 posted on 07/05/2002 10:42:36 PM PDT by Cacique

To: Libertarianize the GOP; JameRetief; *tech_index; Mathlete; Apple Pan Dowdy; grundle; beckett; ...
Thanks for the ping.

Have not read all of the article, but it looks like a great article.

To find all articles tagged or indexed using tech_index

Click here: tech_index

4 posted on 07/05/2002 10:50:25 PM PDT by Ernest_at_the_Beach

To: JameRetief
Shades of the way that memory was paged by the early '86 CPUs.
5 posted on 07/06/2002 1:42:55 AM PDT by HiTech RedNeck

To: Ernest_at_the_Beach
Is there a tech ping list? If so, who is running it? Like to be on it. Interesting article.
6 posted on 07/06/2002 3:45:07 AM PDT by KeyWest

To: JameRetief
This article ignores Intel buying the Alpha chip and future itanics will have a large portion of its "soul" incorporated into it. Dec/Compaq/HPaq are a sinking ship of fools.
7 posted on 07/06/2002 5:10:54 AM PDT by ozone1

To: JameRetief
Thanks for the post. That HyperThreading technology sounds more attractive to me than 64-bit. It would be like getting a multi-processor system within a single CPU.
8 posted on 07/06/2002 5:24:07 AM PDT by Scutter

To: Ernest_at_the_Beach
thanks for the ping
9 posted on 07/06/2002 9:01:58 AM PDT by Free the USA

To: JameRetief
Good info - thank you.

Happy Birthday President Bush!

Don't miss this one.

10 posted on 07/06/2002 9:04:16 AM PDT by lodwick

To: JameRetief
So how would this affect gamers and typical users?
11 posted on 07/06/2002 11:42:34 AM PDT by Bogey78O

Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.

FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson