With several high-profile cracking efforts in the last year, government agencies are slowly but surely battening down the hatches. For its part, NASA has recently jacked up its computer crime countermeasures by building a new low-cost supercomputer that can cut through data at lightning-fast speed.
"It's a brand-new program -- no one's done this before in law enforcement," said Thomas J. Talleur, an executive with NASA's Office of Inspector General advanced technology programs. The National Aeronautics & Space Administration's Computer Crimes Division (CCD) built the new system, which is a DIY supercomputer used to scan seized hard drives, network server logs, and communications transmission data for evidence.
In the old days, investigations were carried out using DOS-based machines, and they worked fine for the job -- a seized machine might have only a 300 MB hard drive that one agent could scan through himself. But today, the volume of seized data might run into the hundreds of gigabytes, and using the old methods -- even with dual Pentium processors and multiple drives -- could mean weeks of work to find answers.
Enter Beowulf, a system that uses a parallel-processing architecture and off-the-shelf machines running the freely available Linux operating system. One machine is the server node, and distributes a processing job to all of the other machines, which are client nodes. All of these CPUs work on part of a task at the same time. Distributed processing allows the cluster to complete a job much faster than a single, even much more powerful, processor could. To increase the power of a Beowulf cluster, an engineer can simply add more nodes; for jobs that divide cleanly, throughput grows roughly in proportion to the node count.
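The server-and-client division of labor can be sketched in a few lines of Python. This is a minimal illustration, not CCD's software: threads on a single machine stand in for client nodes (a real cluster would pass messages between separate machines), and the function names and the overlap trick for matches that straddle a slice boundary are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def count_matches(chunk: bytes, pattern: bytes, owned: int) -> int:
    """Client-node work: count matches that START in the first `owned` bytes."""
    count = 0
    pos = chunk.find(pattern)
    while pos != -1 and pos < owned:
        count += 1
        pos = chunk.find(pattern, pos + 1)  # allow overlapping matches
    return count


def cluster_count(data: bytes, pattern: bytes, nodes: int = 4) -> int:
    """Server-node work: scatter slices to workers, then sum their counts."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    size = max(1, -(-len(data) // nodes))  # ceiling division: bytes per slice
    overlap = len(pattern) - 1             # extra bytes so boundary matches survive
    chunks = [data[i:i + size + overlap] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = pool.map(lambda c: count_matches(c, pattern, size), chunks)
        return sum(partials)
```

Calling cluster_count(data, b"login root", nodes=24) would split the data 24 ways. Each worker counts only the matches that begin inside its own slice, so the summed total equals what a single sequential scan would report.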
The total hardware cost for CCD's 24-node Beowulf cluster was US$57,000 -- as compared to most commercial supercomputers today, which cost between $10 million and $30 million. The cluster gives 2.4 gigabytes per second throughput, which means that a 200 GB hard drive can be scanned in under a minute and a half. While it took five to seven weeks to analyze the evidence of several intruders in the recent Israeli hacker case, Talleur said it would have taken only a few hours with Beowulf.
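Throughput translates into scan time by simple division. The sketch below assumes the quoted 2.4 GB per second is aggregate throughput shared evenly across the 24 nodes, that a scan is limited only by that throughput, and that 1 GB means 1,000 MB; the function names are illustrative.

```python
def scan_seconds(data_gb: float, throughput_gb_per_s: float) -> float:
    """Seconds needed to stream data_gb through a throughput-bound scan."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_gb / throughput_gb_per_s


def per_node_mb_per_s(total_gb_per_s: float, nodes: int) -> float:
    """Even share of the aggregate throughput per node, in MB/s."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total_gb_per_s * 1000 / nodes


if __name__ == "__main__":
    print(f"per node: {per_node_mb_per_s(2.4, 24):.0f} MB/s")
    print(f"23 GB drive: {scan_seconds(23, 2.4):.1f} s")
```

Under these assumptions each node would need to sustain about 100 MB per second, and a 23 GB drive would take just under 10 seconds to scan.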
Given the speed of the new supercomputer, the CCD has changed the methodology behind its computer-crime investigations.
"The goal of this cluster is not to provide automated data analysis 10 times faster -- it's to do it interactively," said Dan Ridge, CCD's Beowulf guru.
Three or four years ago, Ridge said, you could cull the most interesting 1 percent of text from a seized 100 MB hard drive, and one agent could eyeball it himself. But now, culling that 1 percent of even a 23 GB drive is too much for one person to visually analyze -- instead, an agent uses the robust collection of GNU utilities to make interactive queries about the data.
"You're interested in a different set of things if you can have the answer in two seconds than if you have to go to lunch before you get the answer," Ridge said. "It lets you make ad hoc queries and follow nonintuitive discovery paths to find out what's going on."
What agents look for are text strings, such as the keystrokes a cracker might use to perform a break-in, or telltale entries in a system log that show that a cracker had visited a particular site. These kinds of searches are integer-intensive, disk bandwidth-intensive operations. With Beowulf, an agent can try off-the-wall queries that only rarely turn up anything -- queries that were not attempted under the old methodology, where it would take hours to get results.
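A string search of this kind can be illustrated with a small Python sketch. It is not the CCD's tooling, which the article describes only as a collection of GNU utilities; the function name, the case-insensitive matching, and the sample log lines are all invented for the example.

```python
import re
from typing import Iterable, Iterator


def find_strings(lines: Iterable[str], patterns: list[str]) -> Iterator[tuple[int, str]]:
    """Yield (line_number, line) for each line containing any of the patterns."""
    if not patterns:
        raise ValueError("at least one pattern is required")
    # Escape each pattern so it is matched literally, then OR them together.
    matcher = re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)
    for number, line in enumerate(lines, start=1):
        if matcher.search(line):
            yield number, line.rstrip("\n")


# Invented sample log lines, for illustration only.
SAMPLE_LOG = [
    "Mar 02 01:14:07 host1 login: session opened for user guest",
    "Mar 02 01:14:31 host1 ftpd: RETR /etc/passwd",
    "Mar 02 01:15:02 host1 cron: job finished",
    "Mar 02 01:15:40 host1 su: FAILED su for root by guest",
]

if __name__ == "__main__":
    for number, line in find_strings(SAMPLE_LOG, ["/etc/passwd", "failed su"]):
        print(number, line)
```

Because the function accepts any iterable of lines, the same query could be pointed at an open file, and changing the pattern list is all it takes to ask a different question of the same data.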
"But if you can ask it in two seconds," Ridge said, "you're gonna ask it."
The Beowulf project was developed at NASA by Thomas Sterling and Donald Becker in the summer of 1994; today, anyone can buy a Beowulf CD-ROM -- Red Hat Software's Extreme Linux package -- for $29.
"I think it reflects good expenditure of public taxpayer funds," he said. "We're not going out and investing millions of dollars in some vendor's proprietary system -- we have the power to control our own destiny, in-house. As a manager and as an executive, this is what I like about it."
Talleur said that he hopes the project will set an example to other organizations, because this system has the flexibility to be used for other purposes as well.
"NASA's usage of Extreme Linux to speed the search process of data involved in computer crime cases demonstrates the technology in an area of interest to business -- high-speed searching of large data sets," said Robert Hart, Red Hat's director of support services.
"The financial sector is generally fairly technologically conservative, but their increasingly sophisticated models are driving them to ever-higher-cost monolithic hardware," Hart said. "Extreme Linux clusters offer them a potential way out of this problem."