IBM Jeopardy Challenge, Night 2: Watson Runs Wild

If you happened to miss out on the first half of Round 1 last night, shame on you. You can catch up here.

When we last left our heroes, Watson and the undefeated Jeopardy champion Brad Rutter were tied for the lead with $US5000 apiece. Ken Jennings, meanwhile, trailed with $US2000. Whatever kinks or “jitters” Watson had last night were overcome for the second half of Round 1, as the IBM supercomputer demolished its opponents so thoroughly that Jennings looked like he wanted to cry by the end. Let’s rejoin the action…

Early on in the match, Watson went on a spree, correctly answering questions about diseases and classical music left and right. After running its score up to $US14,600 (and denying the other two even a chance to chime in), Watson hit the Daily Double. Unlike last night, when it could only wager a max of $US1000, it had some freedom to play around with its money total. How much did it risk? $US6435. That drew a solid laugh from the crowd and prompted Trebek to proclaim, “I won’t ask. I won’t ask.” Predictably, Watson produced the correct response (a question about architecture).

And then, Watson hit a hiccup. It picked the category dealing with fine art, and was asked the following:

Latching on to the keywords “3 others,” Watson answered “Picasso.” The correct response, which required the contestants to complete the museum name, was “modern art.” But the other two were as puzzled by the clue as Watson was, thinking they had to name a more specific era. So no harm, no foul.

A couple of questions later, Watson hit the Daily Double again, this time in the fine art category, and wagered $US1246. While it understood what the clue was asking for this time (the city from which some art was stolen), its confidence percentages were shockingly low across the board, with its most confident response (in this case Baghdad) coming in at 32 per cent. It even went as far as to mention it was guessing before answering. But Watson got it right.

From there, it was mostly a trivia bloodbath. Jennings and Rutter could hardly get a word in, as Watson wiped out the board full of clues regarding hedgehogs, Cambridge University, and terms including the words “church” or “state”. By the end of Double Jeopardy, Jennings and Rutter could only give the camera looks which sat somewhere between vexed and nonplussed. That’s because Jennings had $US2400, Rutter had $US5400 and Watson had $US36,681.

For Final Jeopardy, the three challengers were given the category US Cities and asked to name the city which has one airport named after a WWII hero, and another named for a WWII battle. Jennings and Rutter both answered correctly with Chicago. What was Watson’s answer? Take a look at the top image. It was totally confused to the point where it didn’t consider the restrictions set by the category itself.

But how much did it wager? In typical Watson form, it bet only $US947, likely realizing it could wager $US0 and still win. So while there’s still a whole other round to be played tomorrow night, the truth is that with $US4800 and $US10,400, respectively, Jennings and Rutter will be hard-pressed to catch up to Watson’s $US35,734. Barring a catastrophic meltdown, tomorrow night’s show will be a victory lap, celebrating the triumph of machine over man.

But now that we’ve been able to watch Watson in action for an entire round, we have a better idea of his strengths and weaknesses when it comes to answering open questions.

What Watson Does Well

Memory: Watson doesn’t have to worry about forgetting anything. Whatever is loaded into his system is retained perfectly (the advantage of running on all 0s and 1s).

Reaction times: As an emotionless machine, Watson is better suited to react to the signal telling contestants to buzz in. He also excels when clues are phrased as directly as possible. When given simple sentence structures clearly asking for who, what, when or where, Watson is unstoppable. We, as humans, can’t compete with that.

Wagering: When Watson hits Daily Doubles and the Final Jeopardy stage, he is able to analyze his confidence in the given category (or other similar ones) and his overall probability of winning, then uses those two factors to determine an optimal dollar amount. If he’s winning big or is not as confident in the category, he’ll tend to wager more conservatively. If he is down, or very confident in the category, he will wager more. Because he can compute the factors with a numerical precision humans cannot, his wagers take on strange dollar amounts. Watson has also been programmed with a historical knowledge of where Daily Doubles are most commonly found, so he can work out the squares most likely to be hiding them.
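
To get a feel for how those two factors could produce such odd-looking bets, here is a toy sketch in Python. It is emphatically not IBM's algorithm: the function name suggest_wager, the equal weighting of the two factors and the $5 minimum bet are all assumptions made up for illustration. It only shows that blending a category confidence with a win probability, without rounding to a tidy number, naturally spits out amounts no human would pick.

```python
def suggest_wager(max_wager, category_confidence, win_probability, min_wager=5):
    """Toy wagering heuristic (hypothetical, not IBM's actual algorithm).

    max_wager: the most the player is allowed to risk on this clue.
    category_confidence: 0.0 to 1.0, how sure the player is about the category.
    win_probability: 0.0 to 1.0, estimated chance of winning as things stand.
    min_wager: smallest allowed bet (the default of 5 is an assumption).
    """
    for name, value in (("category_confidence", category_confidence),
                        ("win_probability", win_probability)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if max_wager < min_wager:
        raise ValueError("max_wager must be at least min_wager")

    # More confidence pushes the bet up; a comfortable lead pulls it down.
    # The 50/50 weighting is an arbitrary assumption for illustration.
    aggressiveness = 0.5 * category_confidence + 0.5 * (1.0 - win_probability)

    # Scale between the smallest and largest allowed bet. Nothing snaps the
    # result to a "nice" number, hence the strange dollar amounts.
    return round(min_wager + aggressiveness * (max_wager - min_wager))


# A player far ahead and unsure of the category bets small;
# a player who is behind but confident bets big.
leading_and_unsure = suggest_wager(14600, category_confidence=0.3, win_probability=0.95)
trailing_and_sure = suggest_wager(14600, category_confidence=0.9, win_probability=0.2)
print(leading_and_unsure, trailing_and_sure)
```

A real system would presumably weigh far more than this (opponents' scores, clues left on the board and so on), but even this crude version behaves the way the paragraph above describes.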

What Watson Doesn’t Do Well

Complex Syntax: When sentence structures become complex, or the question is asking contestants to consider two indirectly related factors or ideas, Watson tends to get confused. His confidence drops and his reaction times slow.

Art: For whatever reason, Watson doesn’t know a damn thing about art. It struggled with nearly every clue in the category tonight, incorrectly responding to one clue, getting beaten to the punch on another and failing to buzz in on a third. And the one it got right? It had a confidence of 32%.

Eliminating previous wrong responses: IBM programmers didn’t think Watson would ever have an issue with using the same incorrect response or wrong response structure as a contestant answering before him. Well, he ran into the problem twice last night: first when he repeated one of Jennings’s incorrect responses, then when he failed to realise he had to include the word “missing” when replying to a clue about a gymnast with a missing leg.

While the suspense seems to be (mostly) taken out of tomorrow night’s final throwdown (which airs on a local network at 7 or 7:30 EST/PST, depending on where you live), it’s still pretty amazing to see a digital machine interpret and respond to human language.

