Mar 19, 2013 6:30 AM

Ex-Googlers Train Machine Army to Sift Out Crooks

Here's a tidbit for the online retailers out there: If a shopper on your website is using Firefox with Windows XP, the odds of him being a fraudster go up sixfold.

That's a trend mined by the machine learning geeks at Sift Science, a San Francisco startup that's taking some of the same techniques that Google uses to cut down on abuse on its ad network and making them available to smaller websites, such as Airbnb, Uber, and Listia. All three of these are early customers.

"The point of this is really to make online commerce safer and more efficient," says Brandon Ballinger, Sift's founder and chief technical officer. "Machine learning lets you adapt to the different fraud patterns you see on different websites."

Fraud is a big problem for internet merchants, who often bear the financial cost of fraudulent credit card charges. The problem is that many fraud detection services rely on a small number of tried-and-true rules to spot scammers. The criminals quickly figure them out, and it's often tricky to stay on top of new techniques.

"Anybody who runs a website online is constantly being attacked by fraudsters," says Ballinger. "And you and me, as consumers, pay more because of fraudsters."

But Sift Science uses Amazon's cloud to spin up giant compute farms that, er, sift through mountains of data and pull out the emerging fraud trends that other people might miss. Here's another example: if a web surfer happens to have spent $4 or less online in the past week, the odds of them being a fraudster go up 78-fold.
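To get a feel for what a sixfold or 78-fold jump in odds means, here is a minimal Python sketch. It assumes the figures are multipliers on odds (fraud versus not fraud) rather than on raw probability, and it uses a made-up 1 percent baseline fraud rate purely for illustration. Neither the baseline nor the function name comes from Sift Science.

```python
def apply_odds_multiplier(base_probability: float, multiplier: float) -> float:
    """Return the probability that results from scaling the odds by `multiplier`.

    Odds are p / (1 - p); a "sixfold" increase multiplies the odds, not p itself.
    """
    if not 0.0 <= base_probability < 1.0:
        raise ValueError("base_probability must be in [0, 1)")
    if multiplier < 0.0:
        raise ValueError("multiplier must be non-negative")
    odds = base_probability / (1.0 - base_probability)
    new_odds = odds * multiplier
    return new_odds / (1.0 + new_odds)


# Hypothetical baseline: 1 in 100 shoppers is a fraudster (an assumption).
BASELINE = 0.01

firefox_on_xp = apply_odds_multiplier(BASELINE, 6)    # "sixfold"
low_spender = apply_odds_multiplier(BASELINE, 78)     # "78-fold"

print(f"Firefox on Windows XP: {firefox_on_xp:.1%}")
print(f"Spent $4 or less:      {low_spender:.1%}")
```

Under that assumed baseline, the Firefox-on-XP shopper lands at roughly a 6 percent chance of fraud and the low spender at roughly 44 percent, which is why a single strong signal can matter so much more than a weak one.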

Any website can sign up for Sift's service in a few minutes, and start getting a small number of fraud scores -- 5,000 per month -- for free. The site installs a snippet of JavaScript code that collects the same type of information that gets processed by Google Analytics, and then generates a fraud score. After the 5,000 free scores are used up, it's 10 cents per user score.
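For a site trying to budget, the pricing above works out to simple arithmetic. The following is a rough Python sketch, assuming the 5,000 free scores reset every month and that every score beyond them costs a flat 10 cents with no volume discount. The function and constant names are hypothetical, not part of any Sift Science API.

```python
FREE_SCORES_PER_MONTH = 5_000   # free allowance quoted in the article
CENTS_PER_EXTRA_SCORE = 10      # price per score beyond the allowance


def monthly_cost_cents(scores_used: int) -> int:
    """Estimate one month's bill in cents (assumes flat per-score pricing)."""
    if scores_used < 0:
        raise ValueError("scores_used must be non-negative")
    billable = max(0, scores_used - FREE_SCORES_PER_MONTH)
    return billable * CENTS_PER_EXTRA_SCORE


# A site that scores 20,000 users in a month pays for 15,000 of them.
cost = monthly_cost_cents(20_000)
print(f"${cost / 100:,.2f}")
```

Working in whole cents keeps the math exact; a site that stays at or under 5,000 scores in a month pays nothing.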

There's a bit of a renaissance in machine learning going on right now. IBM is trying to turn its Jeopardy-winning Watson technology into a line of business computers that can do everything from helping doctors diagnose patients to inspiring chefs with new machine-generated recipe ideas. And Google said recently that it used a modeling technique known as neural networks to boost its voice recognition software's accuracy by 25 percent.

In fact, more than half of Sift's nine-person crew comes from Google. That's where founder Brandon Ballinger and his Google mentor, Sean Gerrish (also a Sift employee), got their machine learning chops, detecting scammers on Google's advertising network. Gerrish, in turn, was mentored by another Sift Science employee, Doug Beeferman.

Here's a final fraud tip. Ballinger says that right now, if anybody tries to buy something using a six-character email address -- something like robert@wired.com -- the chances of fraud drop by 40 percent. But that's not likely to last after we post this story.

"I guarantee that within a week after publication, this particular pattern will be highly correlated with fraud," Ballinger says.

Graphic: Ross Patton