Intel and AMD Follow in Footsteps of Mysterious Google Switch

Over the past year, Intel and AMD have spent more than half a billion dollars acquiring a grab bag of companies that are reinventing the way computers connect to each other, making them faster and more power-efficient. In other words, they're trying to Googlize the rest of the world's networks.
Why did Intel spend $140 million for Cray's networking tech? You can find the answer in the mysterious Google "Pluto Switch." Image: networking-forum.com

The mysterious Pluto Switch has shone a light on Google's famously secretive data centers, but there's another way to look at this networking device from another planet. Take it as a sign of how the complex computer systems that power everything from Gmail to Apple's iCloud are being remade to handle the massive amounts of data caroming around today's internet.

When you're adding servers by the truckload and those machines are spending most of their time talking to each other, you get a pretty clear idea of what you need a networking switch to do, and the general-purpose gear sold by the Ciscos and Junipers of the world simply wasn't cutting it. Long ago, Google realized it had to build its own.

The problem? Essentially, the data center is becoming a kind of supercomputer, and the networks that tie its machines together have to be reinvented to make this supercomputer run as efficiently as it should. And that applies not only to Google but to all sorts of other online operations.

Don't believe us? Ask Intel and AMD. Over the past year, the two chipmakers have spent more than half a billion dollars snapping up companies that are reinventing the way computers connect to each other, making them faster and more power-efficient. Rather than building new networking gear, they're building that networking tech directly into chips and servers.

The chipmakers are playing catch-up with their customers. The companies running the big data centers have been building their own switches and their own slimmed-down versions of Ethernet networking fabric for years, according to Andrew Feldman, general manager of AMD's Data Center Server Solutions group. He once sold networking gear to Google, back when he worked at a company called Force10 Networks, and he says that over the past few years all of the big internet companies have been using specialized networking technology, or fabric.

"They pared back what Ethernet was, and the result was a specialized thing that no longer looked like the general-purpose Ethernet," he says. "What you'd arrived at was a new type of fabric."

Google was the pioneer in this area, and as its networking staffers have jumped ship, this kind of networking know-how has circulated to the Microsofts, Facebooks, and Zyngas of the world.

Five years ago, Feldman co-founded SeaMicro, a company that crammed hundreds of low-power servers into a big iron box and then connected them all using its own super-fast networking fabric. AMD paid $334 million for SeaMicro in February. Less than two months later, Intel bought the network fabric technology from Cray for $140 million. Over the past year or so, Intel has also snatched up Ethernet switch chipmaker Fulcrum Microsystems, as well as fabric technology from another company called QLogic.

Once again, it's an echo of the work going on at Google. The web giant's research staff includes networking gurus who once worked for the likes of Cray and Juniper.

Why would chipmakers spend hundreds of millions of dollars acquiring networking fabric? Maybe it's because they can do something with these technologies that even Google and Microsoft cannot: they can work them right into their microprocessors.

That's Intel's plan, and it's something that could save customers a lot of money, says Raj Hazra, vice president and general manager of Intel's high performance computing group. In some data centers, networking gear can account for as much as 40 percent of the energy costs. But it doesn't have to be that way, Hazra says.

If a computer wants to send a message to another computer on the network, it has to have a back-and-forth conversation with its Ethernet card before it can pack up the data and ferry it along. This conversation takes just microseconds -- not enough time to be noticeable on a home network -- but in a massive supercomputer or a big data center, this overhead starts to add up.
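To get a rough sense of why those microseconds matter, here is a back-of-envelope sketch in Python. The per-message overhead, message rate, and cluster size below are illustrative assumptions, not figures from Intel, Google, or anyone quoted here.

```python
# Back-of-envelope sketch: how per-message networking overhead adds up at
# data-center scale. Every number below is an illustrative assumption.

PER_MESSAGE_OVERHEAD_US = 5            # assumed CPU <-> Ethernet-card handshake, in microseconds
MESSAGES_PER_SERVER_PER_SEC = 50_000   # assumed rate of server-to-server messages
SERVERS = 10_000                       # assumed number of servers in the data center

# CPU time burned on networking overhead across the whole cluster,
# per second of wall-clock time.
overhead_core_seconds = (
    PER_MESSAGE_OVERHEAD_US * 1e-6 * MESSAGES_PER_SERVER_PER_SEC * SERVERS
)

print(f"{overhead_core_seconds:,.0f} core-seconds of overhead every second")
# With these assumptions: 2,500 core-seconds per second -- roughly the
# equivalent of 2,500 processor cores doing nothing but talking to
# their Ethernet cards.
```

Pulling the network fabric into the processor itself is aimed at shaving down exactly this kind of per-message cost, by cutting out the round trip to a separate Ethernet card.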

Intel thinks it can solve the problem by building the computer's networking chip right into the microprocessor. "We believe that there is no other way to solve some of these challenges without starting to integrate the fabric in the CPU platform," Hazra says.

This is something that the supercomputer crowd has been begging chipmakers to deliver for years, says John Shalf, the department head for computer science at Lawrence Berkeley National Lab. But companies like Intel don't sell enough processors into the supercomputing market to make the kind of changes Hazra is talking about worthwhile. Add the likes of Google, Apple, and Microsoft to the mix -- internet giants that collectively snatch up an estimated 10 percent of the server market just to power their data centers -- and things look a little different.

And that's what seems to be happening. The kind of network traffic you see inside of Google's data centers is starting to look more and more like the kind of traffic you'd see inside one of the world's largest supercomputers.

For Shalf, this change hit home as he sat in the audience this May at a technical conference in Santa Fe, New Mexico, during a presentation given by Google networking whiz Bikash Koley.

Shalf had always thought that the Googles of the world needed to connect their servers to a lot of different computers all over the internet. But as Koley described it, Google's traffic patterns were a lot more like Lawrence Berkeley's. In fact, the vast majority of traffic on Google's networks doesn't go to the outside world. It's routed between computers inside the data center.

"When he said that 80 percent of their traffic was internal-facing, that just was like a lighting bolt through my brain that, wow, they're starting to look like us."