IBM systems have defeated chess grandmasters and Jeopardy champions, but when one of its AI projects was entered into a debate, the company met its match.

The man to conquer the machine was Harish Natarajan, a grand finalist at the 2016 World Debating Championships and winner of the 2012 European Debating Championship.

© IBM

The 31-year-old graduate of the universities of Oxford and Cambridge has defeated some of the world's top human debaters but this was his first contest against a computer.

His opponent at the IBM Think conference in San Francisco was a two-metre-tall black box called Project Debater, which spoke in an American female voice through a blue, animated mouth. IBM claims that it's the first AI system that can debate humans on complex topics.

Fifteen minutes before the contest started, each side was given the debate topic titled 'we should subsidise pre-school'. Project Debater was chosen to support the motion and Natarajan to oppose it. They quickly prepared their arguments and then took turns to make four-minute opening statements, four-minute rebuttals, and two-minute summaries. The audience voted for one side or the other before the debate started and then again once it ended.

"I have heard you hold the world record in debate competition wins against humans, but I suspect you have never debated a machine," said Project Debater in its opening salvo. "Welcome to the future."

But the futuristic system was no match for the present. 

Before the debate, 79 percent of the audience agreed that preschool should be subsidised while 13 percent did not. At its end, 62 percent of the crowd agreed while 30 percent disagreed, a swing of 17 percentage points that gave Natarajan the win.

The decisive factor in Natarajan's triumph was the quality of his responses. If Project Debater is confident that it understands its opponent's point, it will respond to that point specifically, but if it doesn't fully grasp the point, it will make a more general argument, as it did at times in this contest.

"That first four-minute speech, which was it constructing the reasons to support preschool education, that was pretty strong," a jetlagged Natarajan tells Techworld shortly after flying back to London.

"The second speech where it had to comprehend everything I said, work out what was relevant and come up with responses was still impressive given that it picked up on some of the arguments made, but it wasn't picking up on the most subtle arguments I made. That might be a factor related to the kinds of sentence structures that I was using at that point as I think in natural language you'll end up using slightly complicated sentence structures and its ability to rapidly formulate responses would probably be the second area it could improve.

"In terms of the other aspects of persuasiveness, potentially the improvement there will come over time as it becomes better and better at mimicking human emotion. Obviously it will never quite have that, but lots of people when they are giving public speeches will mimic human emotion rather than actually have it, but it's another area where it could become better over time. But I think it's really that second bit which is going to be the interesting challenge."

Building Project Debater

Automated debating was a new challenge for IBM's researchers. While chess and Jeopardy contests are won by making objectively effective decisions, the nuances of language and the rhetorical techniques required to win over an audience are more subjective, emotional and open-ended skills that lack clearly defined rules that determine the victor. They instead require emotional persuasion through oratory skills that are tricky for machines to master.

Project Debater aims to overcome this by comprehending spoken language, modelling human dilemmas and delivering data-driven speeches. The system analyses the debate's topic and the argument of its opponent, then draws on a library of 10 billion sentences from newspaper articles and academic journals, stitching relevant passages together to form a response.

In 2018, the system made its debut in an exhibition match against two Israeli debaters who had already worked with it, but the showdown in San Francisco was its first competitive bout against a champion debater.

"It was very impressive," says Natarajan. "The way I would think about it is debating involves at least three separate skills. The first one of these is collating the relevant information which you have and working out what is relevant. For a human that isn't overly difficult because in 15 minutes you don't have a great deal of knowledge, and relevance isn't necessarily a massive challenge, but for a machine that can be quite difficult.

"The second part is to explain that information you have in a way which is clear and provides context, but at the same time oversimplify the issue to a huge audience. And the third aspect of debating is the human aspect of persuasiveness; everything from your rhetoric to your use of language, to your ability to use emotion and use intonations when it comes to your voice.

"What the machine was very good at doing was the first one; it excelled in that. This is somewhat unsurprising. It had 10 billion pieces of information about what was relevant. More impressive was its ability to explain that in a very straightforward and clear way.

"What it fell down on, and I guess why it doesn't yet quite compare with the best human debaters, is its ability to do the purely human aspects of persuasion. How exactly you connect with a human audience was still a little bit limited."

Debating the future

The machine's voice was more monotone than a human debater's, but its oratory skills surprised Natarajan, particularly when it appealed to the audience's emotions, as it did by asking the following series of rhetorical questions:

"For starters, I sometimes listen to opponents and wonder: what do they want? Would they want poor people on their doorsteps begging for money? Would they live well with poor people without heating or running water?"

Natarajan countered by arguing that the subsidies would largely help middle-income and upper-income families as their children would be more likely to attend preschool.

"At the end of this debate, I don't think Project Debater has helped those individuals she identifies as the most important, but in reality, has hurt them," he concluded.

After the debate, Natarajan gave credit to his opponent but added that it still had much to learn.

"After a couple of minutes, you stop thinking that you're debating against a machine. It's similar enough to a human being," he recalls. "But I think once you look back on it, certainly in terms of how it presents itself, it still very much does have those machine attributes: it has a very smooth voice - a very smooth version of Alexa might be a way of putting it. It has a very consistent tone that's very clear but doesn't really vary its tone or pace that much."

He was struck by the potential for pairing the system's capabilities with the work of humans: it could quickly analyse vast troves of data and supply people with the most relevant contextual information. This could assist people working in medical research, for instance, or economic risk analysis executives such as Natarajan.

"The ability that the technology has to facilitate rapid high-quality credible research is something well beyond the scope of any reasonable human or even large number of humans," he says.

His views echo those of Garry Kasparov, the former world chess champion who, decades after his 1997 series loss to IBM's Deep Blue, said that humans and machines should work together to develop augmented intelligence. But Natarajan believes his discipline presents a very different challenge for an AI from Kasparov's.

"What you want to try and do is convince an audience, but what is convincing isn't the same as capturing a king in chess or getting more area than your opponent in Go, which makes it a huge challenge for IBM," he says.

"I think there are then two things which over the next few years could potentially change to make this more digestible. It's already impressive how far they've gone given the difficulties inherent in debating but I think as you get more and more data, more and more ability to synthesise it, that will improve the machine. And second, one area where it was good but not yet excellent was its ability to respond."