Hortonworks Teams With Teradata on Hadoop


Hortonworks -- the Yahoo spinoff dedicated to Hadoop -- has joined forces with analytics outfit Teradata to help big businesses make use of the increasingly popular open source data-crunching platform.

On Tuesday, the two companies announced that they will offer a reference architecture for building Hadoop clusters, while also helping customers build additional tools atop the platform. Based on Google's back-end infrastructure -- and named for a yellow stuffed elephant -- Hadoop is a means of crunching large amounts of data across a collection of dirt-cheap commodity servers.

"One of the broader trends we're seeing is really the specialization of data analytics," Shaun Connolly, vice president of corporate strategy at Hortonworks, tells Wired. "Hadoop is bringing that capability to the enterprise, particularly with unstructured data and large scale volumes." Connolly describes Hadoop as a "data refinery" for the fields of "data oil" facing today's businesses -- something suited to organizing the unstructured information streaming off the internet.

Though built from research papers released by Google in 2004, Hadoop was actually bootstrapped by Yahoo. Today, the platform underpins not only Yahoo but also Facebook, eBay, Twitter, and many other big-name internet services, and it has long been touted as a platform for businesses beyond the big web players. EMC, Oracle, and IBM are all offering tools based on the platform, and Hortonworks is battling another Silicon Valley outfit -- Cloudera -- to be the king of the Hadoop startups.

Teradata has long offered tools that let businesses analyze data, and with Hadoop, it's making a new play in the world of unstructured data -- data that's not easily stored in the neat rows and columns of a relational database. The idea, the company says, is to help businesses gain insight from things like e-mail and Twitter data.

Like Cloudera, Hortonworks is dedicated to improving and expanding the open source Apache version of Hadoop, and it will make its money by offering services and perhaps additional software around that open source core.