Enabling Technologies and Applications


Welcome to the PetaFLOPS Architecture and Systems Structure Home Page!

Here we discuss the latest thinking on what will be involved in building computing hardware that functions at the peta(Fl)ops level. Please feel free to contact the editorial board with any relevant updates or additional information.

The organization of this material is as follows:

Besides updating the above material as time goes on, we will add discussion areas dealing with key system issues such as the most appropriate performance metrics, I/O, backing storage, and packaging.

How far away are we from a PetaFLOPS today?

Sandia National Labs reported the most recent world record of 281 GF on the widely referenced Linpack benchmark, utilizing two parallel processors with a total of 6,768 separate microprocessors. A PetaFLOPS is about 3,500 times larger. Also note that this 281 GF is between 30% and 40% of the PEAK PERFORMANCE of the same hardware.
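The gap quoted above is simple arithmetic, and a short sketch makes it easy to recheck as new records are reported. The function and constant names are illustrative, and the peak range printed is only what the 30% to 40% figure implies, not a reported number.

```python
# Figures taken from the text above; all names are illustrative.
PETAFLOPS = 1e15         # target rate, floating-point operations per second
LINPACK_RECORD = 281e9   # 281 GF sustained on Linpack


def gap_factor(target_flops, achieved_flops):
    """Return how many times larger the target rate is than the achieved rate."""
    return target_flops / achieved_flops


def implied_peak(achieved_flops, efficiency):
    """Return the peak rate implied by a sustained rate and a sustained/peak ratio."""
    return achieved_flops / efficiency


if __name__ == "__main__":
    print(f"gap to a PetaFLOPS: about {gap_factor(PETAFLOPS, LINPACK_RECORD):,.0f}x")
    for eff in (0.30, 0.40):
        peak_gf = implied_peak(LINPACK_RECORD, eff) / 1e9
        print(f"at {eff:.0%} of peak, implied peak is about {peak_gf:.0f} GF")
```

The computed gap is a little over 3,500, consistent with the rounded figure in the text.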

What might a PetaFLOPS machine cost today?

As discussed below, such machines are heavily memory driven. If we assume that a mass-produced 100 MF processor board with 100 MB of memory on it might cost approximately $10,000, then reaching one PetaFLOPS today would take 10 million such boards (1 PF divided by 100 MF per board), for a total of about $100 billion.
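The cost estimate above can be reproduced with a minimal sketch. The board parameters are the assumptions stated in the text; the function names are hypothetical, and the calculation ignores interconnect, packaging, power, and any loss of efficiency at scale.

```python
# Board assumptions from the text; names are illustrative.
TARGET_FLOPS = 10**15               # one PetaFLOPS
BOARD_FLOPS = 100 * 10**6           # 100 MF per board
BOARD_MEMORY_BYTES = 100 * 10**6    # 100 MB per board (decimal megabytes assumed)
BOARD_COST_DOLLARS = 10_000


def boards_needed(target_flops, board_flops):
    """Number of boards needed to reach the target rate, rounded up."""
    return -(-target_flops // board_flops)  # integer ceiling division


def system_estimate(target_flops=TARGET_FLOPS):
    """Return (board count, total cost in dollars, total memory in bytes)."""
    boards = boards_needed(target_flops, BOARD_FLOPS)
    return boards, boards * BOARD_COST_DOLLARS, boards * BOARD_MEMORY_BYTES


if __name__ == "__main__":
    boards, cost, memory = system_estimate()
    print(f"boards: {boards:,}")
    print(f"cost:   ${cost / 1e9:,.0f} billion")
    print(f"memory: {memory / 1e15:g} PB")
```

Note that 10 million boards at 100 MB each also add up to 1 PB of memory in total.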

Besides performance, what other metrics would drive the design of a PetaFLOPS machine?

There are two "rules of thumb" that have developed over the years about very high end processors:

Thus, one PetaFLOPS of performance would require 1 PB of accessible memory and 8 PB per second of bandwidth between that memory and the hardware performing the computations. Both requirements heavily drive the overall parameters of any PetaFLOPS machine, regardless of architecture. At the 1993 PetaFLOPS Workshop, the surfacing of these two metrics (particularly storage) led the applications subgroup to conclude that there is an important class of problems (basically three-dimensional plus time simulations) for which a rule of N GF requiring N**(3/4) GB might be more appropriate. For N = 1,000,000 GF this gives about 32,000 GB, which would translate into a usable PetaFLOPS system with a mere 32 TB of main memory, roughly a 30X reduction in needed memory.
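The two memory-sizing rules can be compared numerically. The sketch below assumes the ratios implied by the figures above (1 GB of memory per GF and 8 bytes per second of bandwidth per FLOPS) and the workshop's alternative of N**(3/4) GB for N GF; the function names are illustrative.

```python
# Sizing rules as implied by the figures in the text; names are illustrative.

def linear_memory_gb(n_gf):
    """Memory in GB under the ratio implied by '1 PF requires 1 PB': 1 GB per GF."""
    return n_gf


def scaled_memory_gb(n_gf):
    """Memory in GB under the workshop's alternative rule: N GF requires N**(3/4) GB."""
    return n_gf ** 0.75


def bandwidth_bytes_per_s(flops):
    """Memory bandwidth under the ratio implied by '1 PF requires 8 PB/s'."""
    return 8 * flops


if __name__ == "__main__":
    n_gf = 1e6  # one PetaFLOPS expressed in GF
    linear = linear_memory_gb(n_gf)
    scaled = scaled_memory_gb(n_gf)
    print(f"linear rule:    {linear / 1e6:g} PB")
    print(f"N**(3/4) rule:  {scaled / 1e3:.1f} TB")
    print(f"reduction:      about {linear / scaled:.0f}x")
    print(f"bandwidth:      {bandwidth_bytes_per_s(1e15) / 1e15:g} PB/s")
```

The N**(3/4) rule gives roughly 31.6 TB at one PetaFLOPS, which the text rounds to 32 TB and a 30X reduction.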

If we continue to use CMOS as a base technology, what might be a reasonable set of technology assumptions for a PetaFLOPS machine twenty years from now?

The 1993 Workshop based its work on the 1992 SIA technology projections through the year 2007, and extrapolated from them to assume that by 2013 CMOS would support chips with a 0.05 micron feature size, a 0.9 V supply (Vdd), and internal clocks running at 2000 MHz. This translates into logic chips with up to 80 million gates, DRAMs of 256 Gbits (32 GB), and SRAMs of 64 Gbits (8 GB). The most recent 1994 SIA projections (see for example COMPUTER DESIGN Magazine, May 1995, p. 50) tend to agree with these projections, with the exception that the on-chip clock rate for high-performance logic may level out at only somewhat above 1100 MHz.
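To put the projected chip capacities next to the memory sizes discussed earlier, the following sketch converts bits to bytes and counts how many projected DRAM chips each memory size would take. It assumes decimal prefixes throughout and ignores error correction, spares, and packaging, so the counts are rough lower bounds rather than workshop results; all names are illustrative.

```python
# Projected capacities from the text; decimal prefixes assumed; names are illustrative.
BITS_PER_BYTE = 8
DRAM_GBIT = 256   # projected DRAM chip capacity
SRAM_GBIT = 64    # projected SRAM chip capacity


def gbit_to_gb(gbit):
    """Convert a capacity in gigabits to gigabytes."""
    return gbit / BITS_PER_BYTE


def chips_for(memory_bytes, chip_gbit):
    """Number of chips needed to hold memory_bytes, rounded up."""
    chip_bytes = chip_gbit * 10**9 // BITS_PER_BYTE
    return -(-memory_bytes // chip_bytes)  # integer ceiling division


if __name__ == "__main__":
    print(f"DRAM chip: {gbit_to_gb(DRAM_GBIT):g} GB, SRAM chip: {gbit_to_gb(SRAM_GBIT):g} GB")
    print(f"1 PB of DRAM:  {chips_for(10**15, DRAM_GBIT):,} chips")
    print(f"32 TB of DRAM: {chips_for(32 * 10**12, DRAM_GBIT):,} chips")
```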

What are the major architectural approaches that appear to be capable of scaling to PetaFLOPS levels?

In the 1993 Workshop, three separate machine configurations (pictured below) were studied. Click on each one for more detail. Although all three seemed capable of reaching a PetaFLOPS, it was suggested that building one of each, with individual performance in the 400 TF range, might be more cost effective for a broader class of problems.

(Special Note: we are in the process of converting these graphics for posting to the WWW)


Authorizing NASA Official: Bill Feiereisen, Program Manager, NASA HPCC Program
Senior Editor: Thomas Sterling

Curators: Michele O'Connell (Michele.OConnell@hq.nasa.gov), Lawrence Picha (Larry.Picha@hq.nasa.gov)

Revised: 21 AUGUST 96 (lpicha)