Senegal: The Computer's Choice

Will Argentina win the World Cup? Brazil? Argue all you want: It turns out that a computer may predict the odds more accurately than humans. It has already picked one big upset. By Kendra Mayfield.

Almost no one could have foreseen defending World Cup champion France's stunning 1-0 defeat at the hands (well, actually the feet) of its former colony, Senegal, on the opening day of the tournament last week.

A computer, though, got it right.

Henry Stott, a mathematician at the University of Warwick, developed the Glover Automated Results Indicator (GARI), a statistical model designed to predict the odds of every individual match in the 2002 World Cup.

BBC's Radio Five commissioned Stott and other University of Warwick risk researchers to produce the model.

So far, Stott's model has continued to outperform both bookies' and television pundits' predictions.

"On the bookie front, we are doing very well," Stott said. "At present, for every £1 we would have bet we have got £1.2 back. Furthermore, at no point of the competition have we been down.

"We are beating the bookmakers, both as measured by a probabilistic likelihood function and by just plain betting outcome."

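The two yardsticks Stott mentions, betting return and likelihood, can be sketched in a few lines. The match records below are invented for illustration and are not GARI's actual bets; the stakes are assumed to be a flat £1 per match, and the winning odds were picked so that the return comes out at the £1.20 per pound quoted above.

```python
import math

# Invented records, one per match (not real GARI data):
# (model's probability for the result that actually happened,
#  bookmaker's implied probability for that same result,
#  decimal odds on the outcome the model backed,
#  whether that bet won)
RECORDS = [
    (0.35, 0.26, 3.6, True),
    (0.40, 0.45, 2.0, False),
    (0.45, 0.40, 2.4, True),
    (0.30, 0.35, 2.5, False),
    (0.30, 0.33, 3.0, False),
]


def return_per_pound(records):
    """Total payout divided by total stake, assuming £1 on every match."""
    stake = len(records)
    payout = sum(odds for _, _, odds, won in records if won)
    return payout / stake


def log_likelihood(probabilities):
    """Sum of log-probabilities assigned to what actually happened.

    The forecaster with the higher (less negative) total was, on the
    whole, less surprised by the results.
    """
    return sum(math.log(p) for p in probabilities)


model_ll = log_likelihood(model_p for model_p, _, _, _ in RECORDS)
bookie_ll = log_likelihood(bookie_p for _, bookie_p, _, _ in RECORDS)

print(f"Return per £1 staked: £{return_per_pound(RECORDS):.2f}")
print(f"Log-likelihood, model:  {model_ll:.3f}")
print(f"Log-likelihood, bookie: {bookie_ll:.3f}")
```
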
Stott's model rates teams on two dimensions: "strength," which quantifies how teams measure up against each other, and "patchiness," which charts a team's unpredictability.

While France initially ranked above Brazil on strength, it placed below Brazil in the probability of winning the Cup because of its higher patchiness.
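
A toy calculation shows how a stronger but patchier side can end up with worse odds. The sketch below is not GARI's actual formula, which is not described here. It assumes that each team's performance on the day is an independent normal draw centered on its strength, with a spread equal to its patchiness, and that the better performance wins (draws are ignored). All the numbers are invented.

```python
import math


def win_probability(strength_a, patch_a, strength_b, patch_b):
    """Chance that team A outperforms team B under the assumptions above.

    The difference of two independent normal draws is itself normal,
    with mean (strength_a - strength_b) and variance
    patch_a**2 + patch_b**2.
    """
    diff_mean = strength_a - strength_b
    diff_sd = math.sqrt(patch_a ** 2 + patch_b ** 2)
    # Standard normal CDF, written with the error function.
    return 0.5 * (1.0 + math.erf(diff_mean / (diff_sd * math.sqrt(2.0))))


# Hypothetical underdog: strength 0.0, patchiness 0.5.
steady_favorite = win_probability(1.0, 0.5, 0.0, 0.5)
patchy_favorite = win_probability(1.1, 1.5, 0.0, 0.5)

print(f"Steady favorite beats the underdog: {steady_favorite:.0%}")
print(f"Stronger but patchier favorite:     {patchy_favorite:.0%}")
```

In this toy setup the second favorite is rated slightly stronger, yet it beats the underdog less often because its results swing more widely. Over a run of knockout matches that extra risk compounds.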

"Accounting for unpredictability is important as some teams are noticeably more volatile in their performance," Stott said. "'Patchiness' is a real phenomena in the tournament run-up data."

Stott's model tends to highlight the strengths of the underdogs, rather than making calls for stronger teams.

The tendency for bookies to underestimate the threat posed by underdogs could explain why GARI bet on Senegal, when most bookies bet on the reigning champion.

"But flaws in bookies' odds shouldn't surprise us, since these odds represent where money is being placed for bets, rather than what the bookies themselves believe," Stott said. "As such, they are an aggregate of human beliefs and consequently they are subject to all those systemic human inference fallacies."

The base linear model was built using data from all 900 qualifying games and friendlies that led up to the 2002 tournament.

The model uses a method similar to techniques used by bankers to assess financial risks. This includes a "Monte Carlo" simulation of hundreds of thousands of virtual matches to forecast each team's chances of winning the entire tournament.
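
As a rough illustration of how match odds scale up to tournament odds, here is a minimal Monte Carlo sketch. It is not Stott's code: the four teams, the bracket and every probability in it are hypothetical placeholders, and a real run would cover all 32 teams, the group stage and draws.

```python
import random
from collections import Counter

# Hypothetical chance that the first-named team beats the second.
WIN_PROB = {
    ("A", "B"): 0.60, ("A", "C"): 0.55, ("A", "D"): 0.70,
    ("B", "C"): 0.45, ("B", "D"): 0.60, ("C", "D"): 0.65,
}


def play(team1, team2, rng):
    """Simulate one virtual match and return the winner."""
    if (team1, team2) in WIN_PROB:
        p_team1 = WIN_PROB[(team1, team2)]
    else:
        p_team1 = 1.0 - WIN_PROB[(team2, team1)]
    return team1 if rng.random() < p_team1 else team2


def simulate_tournament(rng):
    """Two semifinals (A v D, B v C) followed by a final."""
    finalist1 = play("A", "D", rng)
    finalist2 = play("B", "C", rng)
    return play(finalist1, finalist2, rng)


def title_odds(n_runs=100_000, seed=2002):
    """Share of simulated tournaments won by each team."""
    rng = random.Random(seed)
    wins = Counter(simulate_tournament(rng) for _ in range(n_runs))
    return {team: wins[team] / n_runs for team in "ABCD"}


if __name__ == "__main__":
    for team, share in title_odds().items():
        print(f"Team {team}: {share:.1%}")
```

For a bracket this small the answer can be worked out by hand: team A should win 0.70 x (0.45 x 0.60 + 0.55 x 0.55), or about 40 percent of the time. Simulation earns its keep when the bracket is too large for that.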

"The use of Monte Carlo simulation to analyze how the odds for individual matches scale up to make predictions about the overall tournament is identical to the way a financial institution would manage a risk portfolio," Stott said.

So who will win the coveted Cup? Argentina is the current favorite, with a 16 percent chance of winning, according to GARI.

Another statistical model, created by Peter O'Donoghue, sports studies lecturer at the University of Ulster, predicts that Brazil will win the tournament.

O'Donoghue used a computer to simulate the World Cup matches based on FIFA world ranking, distance traveled to compete (to measure home advantage), recovery time between matches and the effects of a team switching between Korea and Japan for games.

The computer simulated the tournament 2,000 times with Brazil winning 24.8 percent of the simulated tournaments.
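
A quick back-of-the-envelope check shows how much of that figure is down to the luck of the simulation itself. The sketch assumes the 2,000 runs were independent and uses the standard normal approximation for a proportion. It says nothing about whether the model's inputs were right.

```python
import math

RUNS = 2000           # simulated tournaments, as reported above
BRAZIL_SHARE = 0.248  # share of runs won by Brazil, as reported above

brazil_wins = round(RUNS * BRAZIL_SHARE)

# Standard error of a proportion estimated from independent runs.
std_error = math.sqrt(BRAZIL_SHARE * (1 - BRAZIL_SHARE) / RUNS)
low = BRAZIL_SHARE - 1.96 * std_error
high = BRAZIL_SHARE + 1.96 * std_error

print(f"Brazil won {brazil_wins} of {RUNS} simulated tournaments")
print(f"Approximate 95% range from sampling noise: {low:.1%} to {high:.1%}")
```

With 2,000 runs, the sampling noise amounts to about two percentage points either way, so repeating the exercise would be expected to put Brazil's share somewhere between roughly 23 and 27 percent.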

"So it is possible, but not probable (that Brazil will win)," O'Donoghue said.

Unlike Stott's model, O'Donoghue's random number generator didn't distort the random numbers to allow for how erratic a team may be, based on fluctuations in FIFA's world ranking.

O'Donoghue will compare the computer's calculations with a group of football enthusiasts' forecasts at the end of the tournament to find out whether computers are better than humans at making predictions.

Since the tournament began, Stott's model has adjusted its forecasts as results have come in.

Already, the probability of a World Cup win for some of the top teams has changed. England joined Argentina and Brazil as one of the three countries most likely to win the competition following its 1-0 victory over Argentina.

O'Donoghue's model also accounted for France stumbling early on, since it was placed in a tougher first-round group. The simulator gave France a 60-percent chance to get to the quarterfinals, while it gave Brazil an 85-percent chance of advancing.

While teams like France and Portugal have fallen, Argentina has strengthened its position as the potential winner, according to Stott's model.

Computers can perform tasks of a depth and complexity that might seem daunting to humans, such as impartially simulating thousands of permutations to work out how the odds cascade as the tournament develops.

However, the computer predictions are far from completely accurate.

"We are depressingly far from perfect," Stott said. "Whilst we beat the bookies, we still only score in the 35-38-percent zone. So there is a long, long, long way to go before we can really say that we are good at predicting football matches."

A computer can't understand the impact of the Irish team captain being sent home or a key player suffering a groin injury. Yet these factors are quickly understood and integrated into a sports commentator's match analysis. It could be decades before a computer can match that ability.

"In some sense, these forecasting models are destructively reductive," Stott said. "In reality, the point of football and its commentary is for us to marvel at the underlying chaos and unpredictability of the game."