W3C Adopts Semantic Standard for Web Data


The web's governing body wants to make it easier for researchers to find the data they're seeking using web-based tools.

The World Wide Web Consortium (W3C) has a whole department, the Semantic Web group, dedicated to integrating data from different sources under a set of common formats. On Tuesday, the group adopted a set of standardized organizational tags that anyone publishing data on the web should start using.

The model, called the Simple Knowledge Organization System, or SKOS, is a set of schemas for categorizing data by topic in a way that's human-readable. It's also machine-readable, which makes it much easier to research the same topic across different data stores using search and other common tools.

Here's what SKOS is, from the W3C's Overview:

The Simple Knowledge Organization System is a common data model for knowledge organization systems such as thesauri, classification schemes, subject heading systems and taxonomies. Using SKOS, a knowledge organization system can be expressed as machine-readable data. It can then be exchanged between computer applications and published in a machine-readable format in the Web.

A practical example, via the W3C Semantic Web group's statement, released Tuesday:

A useful starting point for understanding the role of SKOS is the set of subject headings published by the US Library of Congress (LOC) for categorizing books, videos, and other library resources. These headings can be used to broaden or narrow queries for discovering resources. For instance, one can narrow a query about books on "Chinese literature" to "Chinese drama," or further still to "Chinese children's plays."

Library of Congress subject headings have evolved within a community of practice over a period of decades. By now publishing these subject headings in SKOS, the Library of Congress has made them available to the linked data community, which benefits from a time-tested set of concepts to re-use in their own data. This re-use adds value ("the network effect") to the collection. When people all over the Web re-use the same LOC concept for "Chinese drama," or a concept from some other vocabulary linked to it, this creates many new routes to the discovery of information, and increases the chances that relevant items will be found.
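
To make the narrowing idea concrete, here is a minimal sketch in Python of how a tool might walk from a broad heading down to narrower ones. It assumes a tiny hand-written vocabulary: the "ex:" identifiers and the helper names are hypothetical, and the hierarchy simply mirrors the "Chinese literature" example above rather than the Library of Congress's actual data. Only the property names skos:prefLabel and skos:broader come from the SKOS vocabulary itself.

```python
from collections import defaultdict

# Hypothetical (subject, property, value) statements modeled on SKOS.
# The "ex:" identifiers are invented for illustration only.
TRIPLES = [
    ("ex:chinese-literature", "skos:prefLabel", "Chinese literature"),
    ("ex:chinese-drama", "skos:prefLabel", "Chinese drama"),
    ("ex:chinese-childrens-plays", "skos:prefLabel", "Chinese children's plays"),
    ("ex:chinese-drama", "skos:broader", "ex:chinese-literature"),
    ("ex:chinese-childrens-plays", "skos:broader", "ex:chinese-drama"),
]


def build_index(triples):
    """Return (labels, narrower): a label per concept, and child concepts per parent."""
    labels = {}
    narrower = defaultdict(list)
    for subject, prop, value in triples:
        if prop == "skos:prefLabel":
            labels[subject] = value
        elif prop == "skos:broader":
            # "subject has broader concept value" means subject is narrower than value.
            narrower[value].append(subject)
    return labels, narrower


def narrower_labels(concept, labels, narrower):
    """Collect the labels of every concept reachable by following narrower links."""
    found = []
    seen = set()
    stack = list(narrower.get(concept, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in a malformed vocabulary
        seen.add(current)
        found.append(labels.get(current, current))
        stack.extend(narrower.get(current, []))
    return found


LABELS, NARROWER = build_index(TRIPLES)
print(narrower_labels("ex:chinese-literature", LABELS, NARROWER))
```

Because every publisher that re-uses the same concept identifiers shares the same broader and narrower links, a search tool built this way could narrow a query across several data stores at once, which is the network effect the statement describes.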
