Final Exam

"The central problem of test theory," the Psychometric Society noted in 1961, is "the relation between the ability of the individual and his score on the test." Like alchemists in search of the philosophers' stone, researchers have tried to close the gap between the report-card grade and "true knowledge." History has broken test theorists into […]

"The central problem of test theory," the Psychometric Society noted in 1961, is "the relation between the ability of the individual and his score on the test." Like alchemists in search of the philosophers' stone, researchers have tried to close the gap between the report-card grade and "true knowledge." History has broken test theorists into two camps: left brainers, who try to quantify students' analytical skills, and right brainers, who hope to assess creativity and unique real-world abilities. In the near future, though, right-brain/left-brain dichotomies may be transcended by exams that measure the cognitive processes and give rise to all kinds of thinking.

1905 Alfred Binet, the father of assessment, begins creating intelligence quotient (IQ) tests to determine "mental age." Math, memory, and vocabulary problems are tailored for specific age groups.

1917 The first standardized mental tests, which emphasize math and language skills, are developed for the military by academics Ben Wood and Carl Brigham.

1926 Eight thousand high school students take the first edition of the multiple-choice Scholastic Aptitude Test, which is hand-scored by clerks.

1930s Ben Wood and IBM founder Thomas Watson try to develop a machine that can score thousands of standardized tests. Wood envisions an education system in which students advance solely on the basis of demonstrable skills.

1933 Reynold Johnson builds a machine to score standardized tests with an electrical conductor that detects pencil markings. He soon sells the technology to IBM.

1948 The Educational Testing Service is founded in Princeton, New Jersey. Company president Henry Chauncey believes ETS's tests will be used to develop "a census of the population" that can come in handy to fulfill workforce staffing needs.

1968 Frederic Lord and Melvin Novick publish Statistical Theories of Mental Test Scores, steering a generation of test developers toward "probability-based inferences." Based on students' driving patterns, for example, such a system can calculate how likely they are to cause accidents.

1983 Harvard cognitive psychologist Howard Gardner introduces the theory of "multiple intelligences," adding five categories of intelligence - musical, bodily-kinesthetic, spatial, interpersonal, and intrapersonal - to the linguistic and mathematical skills measured by traditional exams. Gardner's method assesses, for instance, a child's ability to interpret music or recognize a familiar face.

1984 Computerized adaptive-testing technology arrives with the commercial release of MicroCAT - test-development software created for the US Marine Corps. Tailoring itself to a student's abilities, a CAT test evaluates reading, grammar, and math skills with fewer questions than traditional exams.

1991 Grant Wiggins, cofounder of the Center on Learning Assessment and School Structure, develops "authentic testing," which asks subjects to tackle relatively long-term projects. Instead of taking a standardized test, for example, a student might be asked to write a newspaper article.

1992 Yale psychologist Robert Sternberg introduces the Sternberg Triarchic Abilities Test. Using methods that put creative and street-smart intelligence on a par with analytical skills, teachers measure a student's reasoning in real-world situations, such as following a train schedule.

1995 ETS and the US Air Force report on their collaboration to create HyDrive, a probability-based multimedia testing system that monitors students as they troubleshoot an aircraft hydraulic system. Tests allow students to deduce the effectiveness of their work, decisions, and reasoning.

1998 Compass, an adaptive-testing program, is linked with Addison Wesley Longman textbooks to create individual study plans for students. Test results are used to recommend specific problems to solve or chapters to read.

1999 ETS expects testing groups to begin using e-rater, an automated essay-scoring system. Basing scores on a training set of manually graded essays, e-rater focuses on syntax, vocabulary, and organization.

2000 Self-assessment may come to the classroom if states employ task-oriented curricula like ThinkerTools, under review by the US Department of Education. Students deconstruct their own thought processes - considering, for example, whether their models of Newtonian force were developed through careful reasoning or through creative insight.

2005 UCLA immunologist Ron Stevens believes that schools nationwide may adopt neural-net-based exams like his Immex system, which can be used to compare students' problem-solving skills against a model of expert research techniques.

2010 Lockheed Martin suggests that schools could adapt military technology, like the Combined Arms Tactical Trainer, for classroom use. Interactive software will simulate work environments, geographical locations, or artistic settings like the symphony for real-time evaluation.

2070 The ultimate cheat sheet: Testing is abolished altogether as visionaries on both sides of the fence turn to the "brain chip," uploading information on the fly. Fused with tissue in the brain at birth, the chip eliminates the need to measure what you know. Instead, the fundamental assessment becomes, What can you do with the knowledge you own?