Think you know what a supercomputer is? Think again: The real thing will blow your mind.
Supercomputing visionary Seymour Cray died in a 1996 car crash, and it's often assumed that his company - and the mammoth processing powerhouses synonymous with his name - went the way of the dinosaurs, replaced by super-speedy desktop PCs that made big iron obsolete.
In fact, Cray Inc. lives on, and supercomputers as a class are more relevant than ever. These days, modern versions of the monster machines - which once earned their keep almost exclusively in university labs and Cold War defense departments - are being deployed in the service of business and industry as well as science and government R&D. The newest supercomputers are massively parallel devices of astonishing power, linking hundreds or thousands of processors with a lightning-fast network and taking a variety of forms. Some are developed by IBM under multimillion-dollar contracts, but a growing number are so-called Beowulf clusters, low-cost systems that challenge the oppressive Grendel of traditional hardware. Put together by do-it-yourself IT teams, Beowulf clusters consist of hundreds of PCs wired into muscular arrays controlled by free Linux software.
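Under the hood, a Beowulf cluster is just ordinary Linux boxes plus a message-passing library that lets a single program run across all of them at once. Here is a minimal sketch of that programming model, not any particular cluster's code; the mpi4py bindings and the pi-estimation workload are illustrative assumptions.

```python
# Minimal sketch of the Beowulf programming model: the same program runs
# on every node, and the MPI library stitches the nodes together.
# Assumes the mpi4py bindings; the workload (estimating pi) is illustrative.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this node's ID (0 through size-1)
size = comm.Get_size()   # total number of nodes in the cluster

# Each node numerically integrates its own slice of 4/(1+x^2) over [0, 1].
n = 10_000_000
local_sum = 0.0
for i in range(rank, n, size):
    x = (i + 0.5) / n
    local_sum += 4.0 / (1.0 + x * x)

# Node 0 collects the partial sums and combines them into the answer.
pi = comm.reduce(local_sum / n, op=MPI.SUM, root=0)
if rank == 0:
    print(f"pi is approximately {pi:.6f}")
```

Launched with something like mpirun -np 100, the same script farms the loop out to 100 processors with no change to the code.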
Apple's sexy TV spots give the impression that supercomputing power now comes in an 8-inch cube for $1,799. Think different, says University of Tennessee professor Erich Strohmaier. One of three computer scientists who maintain the Top 500 Supercomputer Sites list (www.top500.org), Strohmaier says Apple's claim is hype, based on government-export standards for supercomputers that are years out of date. Apple's 1-gigaflop performance (a gigaflop equals 1 billion floating-point operations per second) is still just one-fiftieth the speed of the last-place entry on Strohmaier's current list, where top speeds are measured in teraflops (trillions of floating-point operations per second).
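The arithmetic behind Strohmaier's put-down is easy to restate (the figures below simply re-express numbers from the paragraph above):

```python
# Back-of-the-envelope restatement of the flops comparison above.
gigaflops = 1e9     # 1 billion floating-point operations per second
teraflops = 1e12    # 1 trillion floating-point operations per second

apple_g4   = 1 * gigaflops     # Apple's advertised performance
last_place = 50 * apple_g4     # fifty times the G4, per Strohmaier

print(last_place / gigaflops)  # -> 50.0: even the list's tail end runs ~50 gigaflops
print(last_place / teraflops)  # -> 0.05: still well short of the teraflop leaders
```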
The machines on the pages that follow represent a cross-section of the state of the art in supercomputing, in both processing oomph and breadth of applications. Thousands of times faster than Cray's most legendary models, the new supercomputers boast a bang-to-buck ratio that's moving them out of the lab and into everyday life. More than half of the latest Top 500 sites now run enterprise or industrial apps. And IBM, which dominates the list in number of systems and total installed power, now sells 70 percent of its supercomputers to corporations instead of to governments or universities.
The new breed of supermachines can design new cars, hunt for oil, or process the world's credit card transactions in real time. Back at the lab, they sift through genetic data and simulate the behavior of aging nuclear weapons. Even the company Seymour Cray founded is rebounding: Recently freed after four years of troubled SGI ownership, Cray Inc. (now merged with Tera) is rolling out new models for customers like Ford and Phillips Petroleum.
In many ways, the box on your desk really does outperform the biggest mainframes of a decade ago. But to stay ahead in the number-crunching game, you still gotta bring in the heavy metal.
Ford Engineering Computer Center
APPLICATION: Car crash and vehicle dynamics simulations HARDWARE: 2 Cray T90s, 3 C90s, and 5 SV1s POWER: 260 gigaflops PRICE: $100 million - $150 million
The largest customer for Cray's top-of-the-line machines, Ford Motor Company also runs a wide range of other brands that crash constantly - in simulations, that is. To save the lives of thousands of crash-test dummies, not to mention humans, the car and truck maker puts new designs through their paces at engineering computer centers in Dearborn, Michigan; Dunton, England; and Merkenich, Germany. The results of these digital demolition derbies, plus noise tests and vehicle dynamics analyses, are stored on 17 terabytes of disk space.
Ford's goal: To shorten development cycles through the increased use of computer techniques. Says Nick Smither, director of product development systems: "All of our analysis and design work is done based on mathematical models, although we still crash cars to get final verification."
Celera Data Center
APPLICATION: DNA assembly, gene hunting, and protein sequencing HARDWARE: 6 Paracel GeneMatchers, 900 Compaq AlphaServers POWER: 1.3 teraflops
For sifting through reams of genetic data to identify the genes in strands of human DNA, Celera Genomics has built the world's most hardcore nongovernmental supercomputing site. Its 6,000-square-foot data center in Rockville, Maryland, is jammed with more than 10,000 processors and still growing.
Mapping the human genome is among the most ambitious computing projects in history. But the next step is even bigger: Finding new genetic weapons to fight disease through proteomics - the study of protein chemical properties and expression, which involves complex data mining on the genome map. To power up, Celera recently acquired Paracel, whose computers analyze documents for government clients. Paracel's text-searching technology is well-suited for figuring out the sequences of protein-forming amino acids, says Marshall Peterson, Celera's VP of infrastructure technology.
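The connection Peterson draws between text searching and biology comes down to one core operation: scoring how well two strings of symbols line up. As a toy illustration only (the article doesn't detail Paracel's hardware-accelerated algorithms, and the scoring values here are arbitrary), a basic dynamic-programming alignment in the Smith-Waterman style:

```python
# Toy local-alignment scorer in the Smith-Waterman style, illustrating the
# kind of string comparison behind protein sequence matching. The match,
# mismatch, and gap scores are arbitrary illustrative choices.
def local_align_score(a: str, b: str, match=2, mismatch=-1, gap=-2) -> int:
    score = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = score[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            score[i][j] = max(0, diag, score[i-1][j] + gap, score[i][j-1] + gap)
            best = max(best, score[i][j])
    return best

# Two short amino acid fragments in one-letter code (made-up examples).
print(local_align_score("MKTAYIAKQR", "MKTAHIAKQR"))  # near-identical: high score
print(local_align_score("MKTAYIAKQR", "GGGGGGGGGG"))  # unrelated: score of 0
```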
But to tackle proteomics over the next few years, Celera needs 1,000 times more than its current 1.3 teraflops of power, and around 1 petabyte (1 million Gbytes) of storage space. The data center currently deciphers up to 2,500 protein sequences daily, but Peterson is shooting for a million.
ASCI White
APPLICATION: Simulation of nuclear warhead explosions HARDWARE: IBM RS/6000 with 8,192 microprocessors POWER: 12.3 teraflops PRICE: $110 million
Next summer's Top 500 list will be headed by an unchallenged pillar of computing might. The 106-ton ASCI White, now in its final stages of installation for the Department of Energy's Accelerated Strategic Computing Initiative, takes up a room bigger than two basketball courts at California's Lawrence Livermore National Laboratory. Consuming as much energy as 10,350 average homes, ASCI White is theoretically capable of 12.3 trillion calculations per second, making it more than five times faster than the current leaders, Sandia's ASCI Red and the lab's own ASCI Blue Pacific, and fast enough to model nuclear explosions in 3-D.
That's the whole point of the ASCI project. In 1994, President Clinton, faced with signing the Comprehensive Test Ban Treaty, called in top weapons designers to talk about the nation's post-ban computing needs. David Cooper, LLNL's associate director for computation and a number-crunching veteran of NASA's moon missions, says the issue was straightforward: How much simulation power would eliminate the need for real-life testing of aging nuclear stockpiles? The answer: an unheard-of 100 teraflops. How long would it take? Ten years. With $1 billion in funding, the project is on track to deliver that much power in 2004. Cooper says weapons designers are already "lined up, ready to fight for access to this machine."
VisaNet
APPLICATION: Financial transaction processing HARDWARE: 21 IBM and Amdahl mainframe computers
Visa card users generate more than 3,000 transactions per second, and they're all handled at four VisaNet installations - supercenters in McLean, Virginia, and Basingstoke, England, plus smaller outfits in Yokohama, Japan, and San Mateo, California.
When you make a purchase at one of Visa's 21 million acceptance locations, the transaction request is transmitted to a VisaNet data center via global phone networks. Upon arrival, the numbers are routed to the bank that issued your credit card. An approval - you hope - is fired back through VisaNet to the merchant in less than five seconds. At the end of the financial day, a clearing and settlement process begins that balances accounts to the penny: Authorized transactions are sorted and sent to the issuer of your credit card, who pays the merchant and mails you a monthly bill.
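In outline, authorization is a routing problem: identify the issuing bank from the card number, forward the request, and relay the verdict. A much-simplified sketch of that flow follows; the bank names, the prefix-routing rule, and the approval logic are all illustrative stand-ins, not VisaNet internals.

```python
# Much-simplified sketch of the authorization routing described above.
# Bank names, the prefix-based routing rule, and the approval logic are
# illustrative stand-ins, not VisaNet internals.
ISSUERS = {
    "411111": "First Example Bank",    # a card's leading digits identify its issuer
    "455555": "Second Example Bank",
}

def route_authorization(card_number: str, amount: float) -> str:
    issuer = ISSUERS.get(card_number[:6])
    if issuer is None:
        return "DECLINED: unknown issuer"
    # In the real network the issuing bank checks the account and replies;
    # here a stand-in rule approves anything under a toy limit.
    if amount <= 5000.00:
        return f"APPROVED by {issuer}"
    return f"DECLINED by {issuer}"

# A merchant's request enters the network; an answer comes back in seconds.
print(route_authorization("4111111234567890", 42.50))
```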
The 50 million lines of code that run VisaNet's transaction engine are based on a recycled and continually modified 1970s-era airline-reservation operating system still used by many air carriers and hotels. Rick Knight, senior vice president at Inovant, Visa's IT subsidiary, says, "The trick is keeping the system running smoothly when you're constantly changing the engine while in flight" - an average of 20,000 routine software changes each year. Racks of batteries and multiple diesel generators stand by, ready to provide backup power, and each center's workload can be instantly shifted to another. During the holiday shopping season, extra mainframes are sometimes added to the network. Unlike research scientists, retailers and customers refuse to accept a moment's downtime.
"It's very simple to run a system at 99.5 percent availability," Knight says. "But 100 percent is a whole new ball game."
Pixar Renderfarm
APPLICATION: Computer animation HARDWARE: 250 Sun Enterprise Servers (2,000 750-MHz UltraSPARC III processors) POWER: 2 teraflops
Things are getting hairy over at Pixar's barely finished Renderfarm complex in Emeryville, California. Two hundred and fifty flashy new Enterprise Servers, each packing eight of Sun's most powerful processors, are rendering the furry beasts that live under a child's bed in the studio's current film project, Monsters Inc. A massive 25 terabytes (25,000 Gbytes) of storage space serves as supercomputer scratch paper in this 4,000-square-foot data center, where the computers are mounted on industrial I-beams to match the building's converted-warehouse decor. "Not only are I-beam racks cool, they're cheap," notes Greg Brandeau, Pixar's vice president of computer operations.
This hardware aims for the middle ground between a cluster of off-the-shelf computer processors and a monolithic supercomputer. "Two thousand boxes containing single CPUs would be a nightmare to manage," Brandeau says. "On the other hand, if we had one supercomputer and it broke with two weeks to go on a film, we'd be hosed."
Rendering is a two-step process. First, an eight-CPU computer is dedicated to modeling: animators use it to determine, for example, how a patch of hair will move when a monster brushes up against an object. Once the model has been created, each frame is handed off to a single processor for top-speed rendering, which can take up to 10 hours per frame. But Pixar's animators are quickly learning how complex they can make a frame of film while retaining the ability to render it in 60 minutes, apparently the time limit on creative patience.
"Creativity expands to fill all available CPU cycles," Brandeau says, "but only if you can get your frames back in an hour."
Genetic Programming Beowulf Cluster
APPLICATION: Evolution of computer programs via natural selection HARDWARE: 1,000 Pentium II 350-MHz processors POWER: 1 teraflop PRICE: $1 million
Day and night, evolutionary processes play out on a home-brewed supercomputer jammed into a converted lunchroom in Mountain View, California. Obsessively organized on Home Depot-style shelves are 1,000 Pentium II 350-MHz processors in minitower boxes, free software running on the Linux operating system, and a handful of hubs and switches, all tied together into a Beowulf cluster to crank out a teraflop of processing power. (See "Try This at Home.") A monthly "offering" of $3,000 to Pacific Gas & Electric nourishes the software breeding ground of Genetic Programming, a research firm where only the fittest computer programs survive.
John Koza, a pioneer in the field of genetic programming, plays Darwin here, presiding over a process that begins with what Koza calls a "primordial ooze of thousands of randomly created computer programs" living in RAM. A high-level definition of a problem is entered - for example, the mathematical equivalent of "build a better automobile cruise control circuit" - and the system begins breeding algorithms that meet the described set of results. As bits of code reproduce into variations of their ancestors, each program in the population is rated on how well it tackles the designated problem. After generations of mutation and natural selection, Koza and his team hope one program will rise from the ooze to solve the original problem. The team has evolved circuits sophisticated enough to infringe on old corporate patents. But, Koza explains, "We're trying to produce human-competitive results. The more computing power we can bring to bear, the better chance we have of that."
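Stripped to essentials, the cycle Koza describes is generate, rate, select, repeat. Here is a minimal sketch of that loop under illustrative assumptions: the target problem (fitting a simple curve), the primitive set, and the mutation-only breeding are stand-ins, and real genetic programming systems also use crossover and far larger populations.

```python
# Minimal sketch of the generate-rate-select loop described above. The
# target function, primitives, and parameters are illustrative stand-ins;
# Koza's real system also uses crossover and much richer primitives.
import random

def random_program(depth=3):
    """Grow a random arithmetic expression tree over x and constants."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(["x", random.uniform(-2.0, 2.0)])
    op = random.choice(["+", "*"])
    return (op, random_program(depth - 1), random_program(depth - 1))

def run(prog, x):
    """Evaluate a program tree at a given input value."""
    if prog == "x":
        return x
    if isinstance(prog, float):
        return prog
    op, a, b = prog
    return run(a, x) + run(b, x) if op == "+" else run(a, x) * run(b, x)

def fitness(prog):
    """Sum of squared errors against the target curve x**2 + x (lower is better)."""
    return sum((run(prog, x) - (x * x + x)) ** 2 for x in range(-5, 6))

def mutate(prog):
    """Replace the whole program, or one branch of it, with a fresh subtree."""
    if not isinstance(prog, tuple) or random.random() < 0.3:
        return random_program(depth=2)
    op, a, b = prog
    return (op, mutate(a), b) if random.random() < 0.5 else (op, a, mutate(b))

# The "primordial ooze": a population of random programs living in memory.
population = [random_program() for _ in range(200)]
for generation in range(50):
    population.sort(key=fitness)          # rate every program on the problem
    survivors = population[:50]           # natural selection
    offspring = [mutate(random.choice(survivors)) for _ in range(150)]
    population = survivors + offspring    # the next generation
best = min(population, key=fitness)
print("best program:", best, "error:", fitness(best))
```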