Poet Allen Ginsberg once called the prodigious, sprawling life's work of poet Walt Whitman "a mountain too vast to be seen." For the approximately 30,000 people who post to Usenet newsgroups each day - and the roughly 1 million who read them - Usenet is the same: a decentralized, chaotic global conversation stretching off beyond the digital horizon, distributed across hundreds of thousands of news servers.
UCLA sociology graduate student Marc Smith sees the newsgroups, where topics range from Celtic music to programmers' tools to the possible impeachment of President Clinton, as the purest example of the messy self-governed anarchy he calls "the core of the utopian dream of the Internet."
Smith is not content, however, to hike around in the foothills of Usenet with his sociologist's notebook, generating theories from a random sampling of spam and flamage. He wants to see the whole mountain.
And he's written a program that delivers that vision: Netscan.
Now Smith is giving the code away, for free, to other academics. His dream is to "see Netscan servers pop up all over the planet," so that by triangulating the images of the mountain from various points on the globe, "we can get a more complete picture of the whole ... to make visible the invisible crowds in cyberspace."
Smith brings to the online world a classic sociological approach to exploring the "social dilemma": Given a choice, will people act selfishly to maximize short-term gain, or will they serve the greater good?
In a sense, Netscan is just another way of putting a manageable front-end on the newsgroups. Like Deja News, Netscan eats its way through a wide swath of newsgroup activity every day - about 6 gigabytes' worth - and spits out useful information. Where Deja News is primarily concerned with who said what in which newsgroup, however, Netscan looks for patterns of interaction that can help researchers begin to describe the myriad forms of community flourishing on the Net.
Netscan can tell you, for instance, how many posts were made to a newsgroup during a given time period, how many different people made those posts, and how many of those posts were crossposted to other newsgroups. By clocking those kinds of variables, Smith is able to generate informed hypotheses about the social dynamics in the newsgroups and what kinds of experiences each group offers its participants.
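To make those three measures concrete, here is a minimal sketch in Python of how they might be tallied from raw article text. It is not Netscan's actual code; the function name `newsgroup_stats` and the idea of passing in a list of raw articles already limited to the time period of interest are assumptions made for illustration. It relies only on the standard `From` and `Newsgroups` headers that Usenet articles carry.

```python
from email import message_from_string


def newsgroup_stats(raw_articles, group):
    """Tally posts, distinct posters, and crossposts for one newsgroup.

    raw_articles: iterable of article texts (headers, blank line, body),
                  assumed to be pre-filtered to the time period of interest.
    group:        newsgroup name, e.g. "comp.lang.perl".
    """
    posts = 0
    posters = set()
    crossposted = 0
    for raw in raw_articles:
        msg = message_from_string(raw)
        # The Newsgroups header is a comma-separated list of group names.
        groups = [g.strip() for g in msg.get("Newsgroups", "").split(",") if g.strip()]
        if group not in groups:
            continue
        posts += 1
        # Treat the From header, lowercased, as the poster's identity (a simplification).
        posters.add(msg.get("From", "").strip().lower())
        # An article listed in more than one group counts as crossposted.
        if len(groups) > 1:
            crossposted += 1
    return {"posts": posts, "posters": len(posters), "crossposted": crossposted}
```

Identifying a poster by the `From` header alone is a rough choice, since one person can post under several addresses; a real analysis would need something sturdier.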
The simple bar graphs and numbers that Netscan churns out can clue in sociologists to subtle levels of interaction, Smith says. In the groups in the comp.lang (computer language) hierarchy, discussion threads tend to be short - four or five posts - with a core group of 20 to 25 experts weighing in daily to answer questions. Compare that to alt.flame, where threads can run into hundreds of posts. Both serve the needs of their respective communities, Smith observes.
One astonishing statistic Smith says he has uncovered: About half the messages posted to Usenet are "cancel" messages - typically, messages sent out to remove crossposted unsolicited advertising.
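A statistic like that can be checked in a few lines. The sketch below, in Python, is an illustration rather than Smith's method: it assumes cancel messages are identified by a `Control` header beginning with the word "cancel", which is the standard Usenet convention, and the function names `is_cancel` and `cancel_fraction` are hypothetical.

```python
from email import message_from_string


def is_cancel(raw_article):
    """Return True if the article carries a 'Control: cancel <message-id>' header."""
    msg = message_from_string(raw_article)
    control = msg.get("Control", "")
    return control.strip().lower().startswith("cancel")


def cancel_fraction(raw_articles):
    """Fraction of the given articles that are cancel messages (0.0 if none given)."""
    articles = list(raw_articles)
    if not articles:
        return 0.0
    cancels = sum(1 for raw in articles if is_cancel(raw))
    return cancels / len(articles)
```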
"Imminent death of Usenet predicted!" Smith jokes, referring to the doomsday pronouncements made about Net news every couple of years.
The pollution of the commons is no joke, however. As more options for discourse become available, some of the "indigenous flora and fauna," Smith predicts, will flee a spam-ridden Net for "more attractive ecological niches" - private mailing lists and password-protected news servers.
By examining crossposting patterns, sociologists can learn how various communities on the Net defend their boundaries - a significant index of the health of a community, says Smith.
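One simple way to look at crossposting patterns is to count how often each pair of newsgroups appears together on the same article. The Python sketch below does that. It is an assumption about how such an analysis could be approached, not a description of Netscan itself, and the name `crosspost_pairs` is invented for the example.

```python
from collections import Counter
from email import message_from_string
from itertools import combinations


def crosspost_pairs(raw_articles):
    """Count how many articles were posted to each pair of newsgroups."""
    pairs = Counter()
    for raw in raw_articles:
        msg = message_from_string(raw)
        # De-duplicate and sort so each pair has one canonical ordering.
        groups = sorted({g.strip() for g in msg.get("Newsgroups", "").split(",") if g.strip()})
        for a, b in combinations(groups, 2):
            pairs[(a, b)] += 1
    return pairs
```

Calling `crosspost_pairs(articles).most_common(10)` would list the ten pairs of groups that share the most articles. A group that rarely shows up in any pair is, by this rough measure, one whose boundary is holding.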
One vivid example of boundary protection on the Net comes from two newsgroups concerned with computer hacking: alt.2600 and alt.hacking. Alt.2600 is an open, unmoderated group. Alt.hacking, however, is a curious phenomenon: a moderated group without a moderator. To post to alt.hacking, you have to have the skills to forge a Usenet post and pose as the moderator for the group. Such a boundary, Smith observes, helps determine the quality of the interactions on either side of it.
The most common question Smith gets asked is, "Where are the good groups?"
One particularly healthy group, he points out, is misc.kids.pregnancy.
Despite occasional thrashes about such hot-button issues as circumcision, Smith says, the group continues to offer a positive experience to its users because "you're getting everything that Howard Rheingold says are the good things about virtual communities - assistance, help with real problems, and advice from third parties, like when people say, 'My mother-in-law says...'"
Some of the most successful groups, Smith contends, are those - like gaming groups - where people come together to exchange things of real value, like technical advice and tips. "Groups where the questions can actually be answered," he notes.
Smith is a rare bird, but one that is becoming less rare: a social scientist who can cut code, a Perl adept who is able to see both trees and forest.
"Marc is one of the real bright lights," says Don Mitchell of Microsoft's computer graphics research division.
After some initial resistance from upper-tier managers who feared that a social scientist would bring "fuzzy-headed" thinking with him into the non-fuzzy precincts of the Microsoft labs, the software giant tapped Smith for its Virtual Worlds project. The company loaned him the hardware for Netscan, Mitchell says, because it believes that Smith's understanding of the dynamics of "social cyberspaces" will give Microsoft the edge in creating online environments in which people actually want to talk to one another.
"The sociological aspects of the virtual world problem is the thing that people aren't thinking about," Mitchell explains. "The VRML people think that if you ship polygons to people, you've solved the problem. No. Marc is valuable because he's got a handle on how people interact with people through computers, rather than just how they interact with computers."
Will Hill, a researcher at AT&T Labs who created PHOAKS, his own Net news-analysis program focused on hunting for useful Web resources, thinks Smith's research into what makes newsgroups thrive or perish is especially needed now that the social fabric of Usenet has been tested by stresses like spam.
"For years, Net newsgroups had a kind of immune system that would defend them against things like spam. In 1997, that immune system broke down," says Hill. "The kind of work that Marc does creates a way to look at the sociological order in newsgroups that will help people have a more fruitful experience."
One of the trends that Smith notes is that efforts to hype enormous, profit-driven online communities are losing ground to small-scale, bottom-up efforts.
In an age when every NT workstation offers the tools to create your own private newsgroup, Smith wonders, "What's to stop you from being AOL for your community? From saying, 'I want a discussion group for my fly-fishing club. I'll launch it from my desktop.'"
Examined with tools like Netscan, even spoof newsgroups can yield information about the deep human needs that are driving the expansion of the online world, Smith observes.
"Alt.fan.bigfoot is a stupid newsgroup," he says, "but to the extent that there are 20 16-year-olds around the planet who post there, they've got a project, they've got a clubhouse. They're building something."