The release of corrupt zone files by InterNIC Wednesday night, which caused headaches for confused netizens, was the result of human error. But once the mistake was caught, the systems engineered into the Internet allowed the problem to be corrected quickly, considering its scope.
With complex engineered systems controlling everything from Net access to securities trading, the fallout from system failures and outages is becoming increasingly costly. And the question that observers are asking is: Should our imperfect species relinquish control over computer networks to the computers themselves?
"It's OK to use a human as a fail-safe, but not as a decision-maker," says Perry Metzger, whose consulting firm, Piermont Information Systems, builds and operates network systems, similar to the one that failed Wednesday night, for Fortune 500 companies. "Remember, computers are reasonably predictable, but humans can get tired, and they can get distracted."
If this debate is sounding a little like a chess rematch with Deep Blue, consider what happened at 2:30 a.m. EDT Thursday at the Virginia-based Network Solutions, the company that maintains most of the domain name system: A computer issued a warning that zone files that update the system nightly were corrupted. A human ignored the warning, and the computer sent the corrupted .com and .net domain files out, causing one of the most widespread service interruptions in the Net's short history.
It's pretty safe to say that the human did not do his or her job. But Metzger, whose systems are run by a checks-and-balances computer system devoid of human interference, says that any mere human would eventually slip up - it's just a matter of when.
"Systems have to be properly paranoid by design," he says. "People now depend on the Net for day-to-day business. What happens to Amazon.com if the Net goes down for a day? They lose big money. On Wall Street, if you don't construct the system right and people lose money, you will never work again, and that's the way all systems should be."
But the idea of trusting computer networks enough to eliminate mankind from the process makes some uncomfortable.
"I like the idea of having human eyes on these things," says Don Heath, president of the Internet Society. "Maybe we just need more than one pair of eyes on the networks - like turning two keys" in detonating nuclear weapons.
Heath says that although he thinks more could be done to prevent network snafus, he was impressed at Network Solutions' rapid response in getting the situation under control.
It seems that, for its part, Network Solutions has learned its lesson. Spokeswoman Aggie Nteta said the company was taking immediate steps to change how the domain updates are administered. Echoing Heath's suggestion, Network Solutions will make the updates a two-person operation and have "more senior personnel" perform them.
Amid the many frustrations of the past week, including two power failures at the MAE-West interchange point and three cuts of major fiber-optic pipelines in the Los Angeles and Washington, DC, areas, at least one person saw the sunny side of the street.
On the mailing list for the North American Network Operators' Group, Randy Bush, director of network engineering at Internet access carrier Verio Inc., wrote: "Considering the level of problems we have had today and earlier in the week, I am impressed by how resilient the Net actually has been. Not that I like the heavy hits we've been taking. But the old beastie seems pretty tough."