Replies

The console version depends heavily on the processor and very little on the video card.

Console F@H is optimized to take advantage of the very fast interconnects inside a CPU, while GPU F@H depends heavily on the very fast interconnects inside a GPU.

The newest Penryns are blazingly fast for F@H, partly because the chips are fairly efficient at crunching the F@H code, but the real speed comes from the 6 to 12 MB of L2 cache on the chip die.

It also helps that the NVidia GPU has 1.4 billion transistors. Think of GPUs as very specialized CPUs with tons of interconnects and specialized processing units (about 800 stream processors on the latest ATI GPU).

Here is a comment from Dr. Pande about the SMP F@H client that illustrates why communication speed matters for SMP.

"There was a good question in the forum that I thought others would be curious to hear:

From Vijay's blog entries it would seem that the SMP client has some fundamental advantages over running multiple single-core clients, but I can't really think of how that might be. Do you know of some architectural overview of how the MPI stuff is being used in this context?

We could just run multiple independent clients, but this would be throwing away a lot of power. What makes an SMP machine special is that it is more than just the sum of the individual parts (CPU cores), since those cores can talk to each other very fast. In FAH, machines talk to each other when they return WUs to a central server, say once a day. On SMP this happens once a millisecond or so (or faster). That 86,000,000x speed up in communication can be very useful, even if there isn't 100% utilization in the cores themselves.

The easy route would have been to run multiple single-CPU FAH-cores (this is what other projects do), but that would be a big loss for the science, as this throws away a very, very powerful resource (fast interconnects between CPUs). Indeed, it is this sort of fast interconnect which makes a supercomputer "super", since the CPUs in supercomputers (eg BlueGene) are pretty slow, but the communication between cores is very, very fast.

We've done a lot to develop algorithms for FAH-style internet connections between CPUs, but there are some calculations which require fast interconnects, and that's where the FAH/SMP client is particularly important. By allowing us to do calculations that we couldn't do otherwise, the science is pushed forward significantly (and we thus reward SMP donors with a points bonus due to this extra science done and the extra hassle involved in running the SMP client).

I guess it remains to be seen if we can pull off MPI on FAH to the point where it works effortlessly, but so far Lin and OSX look pretty good, so we're close. The A2 core should hopefully seal the deal. Now, the main task is getting Windows/SMP behaving well ..."
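
As a quick check on the figure in the quote: one exchange per day versus one exchange per millisecond works out to about 86.4 million, which the quote rounds to 86,000,000x. Here is a minimal Python sketch of that arithmetic. It assumes exactly one work-unit return per day and exactly one exchange per millisecond (both are the quote's rough figures), and the names are only for illustration.

```python
# Rough check of the "86,000,000x" communication figure from the quote.
# Assumptions (the quote's own rough numbers): a FAH client returns one
# work unit per day, and SMP cores exchange data once per millisecond.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def exchanges_per_day(exchanges_per_second: int) -> int:
    """Number of exchanges in one day at the given per-second rate."""
    return SECONDS_PER_DAY * exchanges_per_second


wu_returns_per_day = 1                      # assumed: one WU return per day
smp_exchanges_per_day = exchanges_per_day(1000)  # once per millisecond

ratio = smp_exchanges_per_day // wu_returns_per_day
print(f"{ratio:,}x")  # 86,400,000x
```

If the cores talk faster than once a millisecond, as the quote allows, the ratio only gets larger.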