Free Republic
Browse · Search
News/Activism
Topics · Post Article


CHINA’S 1.5 EXAFLOPS SUPERCOMPUTER CHASES GORDON BELL PRIZE – AGAIN
The Register ^

Posted on 09/17/2023 8:06:37 PM PDT by FarCenter

...

NASA tossed down a grand challenge nearly a decade ago to do a time-dependent simulation of a complete jet engine, with aerodynamics and heat transfer simulated, and the Wuxi team, with the help of engineering researchers at a number of universities in China, the United States, and the United Kingdom, has picked up the gauntlet. What we found interesting about the paper is that it confirmed many of our speculations about the Oceanlite machine.

The system, write the paper’s authors, had over 100,000 of the custom SW26010-Pro processors designed by China’s National Research Center of Parallel Computer Engineering and Technology (known as NRCPC) for the Oceanlite system. The SW26010-Pro processor is etched using 14 nanometer processes from China’s national foundry, Semiconductor Manufacturing International Corp (SMIC).

The Sunway chip family is “inspired” by the 64-bit DEC Alpha 21164 processor, which is still one of the best CPUs ever made; the 16-core SW-1 chip debuted in China way back in 2006.

There are six blocks of core groups in the processor, with each core group having one fatter management processing element (MPE) for managing Linux threads and an eight by eight grid of cores comprising a compute processing element (CPE) with 256 KB of L2 cache. Each CPE has four logic blocks, which can support FP64 and FP32 math on one pair and FP16 and BF16 on another pair. Each of the core groups in the SW26010-Pro has a DDR4 memory controller and 16 GB of memory with 51.2 GB/sec of memory bandwidth, so the full device has 96 GB of main memory and 307.2 GB/sec of bandwidth. The six CPEs are linked by a ring interconnect and have two network interfaces that link them to the outside world using a proprietary interconnect, which we have always thought was heavily inspired by the InfiniBand technology used in the original TaihuLight system. The SW26010-Pro chip is rated at 14.03 teraflops at either FP64 or FP32 precision and 55.3 teraflops at BF16 or FP16 precision.
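The per-chip totals in the paragraph above follow from simple multiplication across the six core groups. A minimal sketch of that arithmetic, assuming 51.2 GB/sec per memory controller (the figure that multiplies out to the stated 307.2 GB/sec); the constant and function names are illustrative and do not come from the paper:

```python
# Per-chip totals for the SW26010-Pro, derived from the per-core-group figures.
CORE_GROUPS = 6            # core groups per chip
MPE_PER_GROUP = 1          # one management processing element per group
CPE_CORES_PER_GROUP = 8 * 8  # eight by eight grid of compute cores
MEM_GB_PER_GROUP = 16      # DDR4 capacity behind each memory controller
BW_GBS_PER_GROUP = 51.2    # assumed per-controller bandwidth in GB/sec


def chip_totals():
    """Return core count, memory (GB), and bandwidth (GB/sec) for one chip."""
    cores = CORE_GROUPS * (MPE_PER_GROUP + CPE_CORES_PER_GROUP)
    memory_gb = CORE_GROUPS * MEM_GB_PER_GROUP
    # Round to one decimal to hide binary floating point noise.
    bandwidth_gbs = round(CORE_GROUPS * BW_GBS_PER_GROUP, 1)
    return {"cores": cores, "memory_gb": memory_gb, "bandwidth_gbs": bandwidth_gbs}


if __name__ == "__main__":
    print(chip_totals())
```

Under these assumptions each chip comes out at 390 cores, 96 GB of memory, and 307.2 GB/sec of bandwidth.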

The largest configuration of Oceanlite that we have heard of had 107,520 nodes (with one SW26010-Pro comprising a node) for a total of 41.93 million cores across 105 cabinets, and the paper just announced confirmed that the machine had a theoretical peak performance of 1.5 exaflops, which matches the performance we estimated (1.51 exaflops) and almost perfectly matches the clock speed (2.2 GHz) we estimated almost two years ago. As it turns out, the MPE cores run at 2.1 GHz and the CPE cores run at 2.25 GHz.
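The system-level numbers above can be reproduced from the node count. A rough sketch, assuming 390 cores per chip (six core groups of one MPE plus 64 CPE cores) and a per-chip FP64 rating of 14.03 teraflops, which is the unit that yields the stated 1.5 exaflops peak; the names are illustrative:

```python
# System-level totals for the 105-cabinet Oceanlite configuration.
NODES = 107_520            # one SW26010-Pro per node
CABINETS = 105
CORES_PER_NODE = 390       # assumed: 6 * (1 MPE + 64 CPE cores)
CHIP_FP64_TFLOPS = 14.03   # assumed per-chip peak, in teraflops


def system_totals():
    """Return total cores, peak FP64 exaflops, and nodes per cabinet."""
    total_cores = NODES * CORES_PER_NODE
    # 1 exaflops = 1,000,000 teraflops
    peak_exaflops = NODES * CHIP_FP64_TFLOPS / 1_000_000
    nodes_per_cabinet = NODES // CABINETS
    return total_cores, peak_exaflops, nodes_per_cabinet


if __name__ == "__main__":
    cores, peak, per_cabinet = system_totals()
    print(f"{cores:,} cores, {peak:.4f} exaflops peak, {per_cabinet} nodes per cabinet")
```

With these inputs the totals are 41,932,800 cores, roughly 1.5085 exaflops of peak FP64, and 1,024 nodes per cabinet.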

We still think that China may have built a bigger Oceanlite machine than this, or certainly could. At 120 cabinets, the machine would scale to 1.72 exaflops peak at FP64 precision, which is very slightly bigger than the 1.68 exaflops “Frontier” supercomputer at Oak Ridge National Laboratory, and at 160 cabinets, Oceanlite would have just under 2.3 exaflops peak at FP64. As noted in the comments below, the Wuxi team will be presenting the Oceanlite machine during a session at SC23 in November, and that session says the machine has 5 exaflops of mixed precision performance across 40 million cores. That implies a 2.5 exaflops peak performance at FP64 and FP32 precision.

Those latter numbers are important if China wants to be a spoiler and try to put a machine in the field that bests the impending “El Capitan” machine at Lawrence Livermore National Laboratory, which is promised to have in excess of 2 exaflops of FP64 oomph.
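The cabinet scaling discussed above can be checked the same way. This sketch assumes 1,024 nodes per cabinet (107,520 nodes divided by 105 cabinets) and 14.03 teraflops of FP64 per chip; the function names and the 2.0 exaflops target used for El Capitan are illustrative assumptions, not figures from the paper:

```python
import math

NODES_PER_CABINET = 1024   # assumed: 107,520 nodes / 105 cabinets
CHIP_FP64_TFLOPS = 14.03   # assumed per-chip peak, in teraflops


def peak_exaflops(cabinets):
    """Peak FP64 exaflops for a machine with the given number of cabinets."""
    return cabinets * NODES_PER_CABINET * CHIP_FP64_TFLOPS / 1_000_000


def cabinets_to_exceed(target_exaflops):
    """Smallest whole number of cabinets whose peak is above the target."""
    per_cabinet = NODES_PER_CABINET * CHIP_FP64_TFLOPS / 1_000_000
    # floor(...) + 1 guarantees a strictly greater peak even on an exact tie.
    return math.floor(target_exaflops / per_cabinet) + 1


if __name__ == "__main__":
    for n in (105, 120, 160):
        print(f"{n} cabinets: {peak_exaflops(n):.3f} exaflops")
    print("cabinets to pass Frontier (1.68):", cabinets_to_exceed(1.68))
    print("cabinets to pass 2.0 exaflops:", cabinets_to_exceed(2.0))
```

Under these assumptions, 120 cabinets give about 1.724 exaflops and 160 cabinets about 2.299 exaflops, while passing Frontier's 1.68 exaflops would take 117 cabinets and passing 2 exaflops would take 140.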


TOPICS: News/Current Events
KEYWORDS:

1 posted on 09/17/2023 8:06:37 PM PDT by FarCenter

To: FarCenter
"Imagine a Beowulf cluster of them …"

2 posted on 09/17/2023 8:10:49 PM PDT by Governor Dinwiddie

To: FarCenter

.


3 posted on 09/17/2023 8:12:29 PM PDT by sauropod (I will stand for truth even if I stand alone.)

To: Governor Dinwiddie

Wait…wut?


4 posted on 09/17/2023 8:25:48 PM PDT by EEGator

To: FarCenter

1.5 exaflops? How many floating point operations per second is that?


5 posted on 09/17/2023 8:53:52 PM PDT by The people have spoken (Proud member of Hillary's basket of deplorables)

To: The people have spoken

mega = 1,000,000
giga = 1,000,000,000
tera = 1,000,000,000,000
peta = 1,000,000,000,000,000
exa = 1,000,000,000,000,000,000

So 1.5 exaflops is 1,500,000,000,000,000,000 floating point operations per second.

I believe the next threshold is “shitaloadaflops.”


6 posted on 09/17/2023 9:02:34 PM PDT by Flatus I. Maximus (Everything I need to know about Islam, I learned on 9/11.)

To: The people have spoken

1.5 quintillion floating point operations per second (flops)

or

1,500,000,000,000,000,000 flops


7 posted on 09/17/2023 9:07:32 PM PDT by sten (fighting tyranny never goes out of style)

To: Governor Dinwiddie; EEGator
> "Imagine a Beowulf cluster of them …"

Wow, man, been a while since I heard that one get dragged out and dusted off. LOL, good one.

8 posted on 09/17/2023 9:46:14 PM PDT by dayglored (Strange Women Lying In Ponds Distributing Swords! Arthur Pendragon in 2024)

Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.
