In August, researchers from the Allen Institute for Artificial Intelligence, a lab based in Seattle, unveiled an English test for computers. It examined whether machines could complete sentences like this one:
<em>On stage, a woman takes a seat at the piano. She</em>
<em>a) sits on a bench as her sister plays with the doll.</em>
<em>b) smiles with someone as the music plays.</em>
<em>c) is in the crowd, watching the dancers.</em>
<em>d) nervously sets her fingers on the keys.</em>
For you, that would be an easy question. But for a computer, it was pretty hard. While humans answered more than 88 per cent of the test questions correctly, the lab’s AI systems hovered around 60 per cent. Among experts — those who know just how difficult it is to build systems that understand natural language — that was an impressive number.
Then, two months later, a team of Google researchers unveiled a system called BERT. Its improved technology answered those questions just as well as humans did — and it was not even designed to take the test.
BERT’s arrival punctuated a significant development in artificial intelligence. Over the past several months, researchers have shown that computer systems can learn the vagaries of language in general ways and then apply what they have learned to a variety of specific tasks.
Built in quick succession by several independent research organizations, including Google and the Allen Institute, these systems could improve technology as diverse as digital assistants like Alexa and Google Home and software that automatically analyzes documents inside law firms, hospitals, banks and other businesses.
“Each time we build new ways of doing something close to human level, it allows us to automate or augment human labour,” said Jeremy Howard, founder of Fast.ai, an independent lab based in San Francisco that is among those at the forefront of this research. “This can make life easier for a lawyer or a paralegal. But it can also help with medicine.”
It may even lead to technology that can — finally — carry on a decent conversation.
But there is a downside: On social media services like Twitter, this new research could also lead to more convincing bots designed to fool us into thinking they are human, Howard said.
Researchers have shown that rapidly improving AI techniques can facilitate the creation of fake images that look real. As these kinds of technologies move into the language field as well, Howard said, we may need to be more skeptical than ever about what we encounter online.
These new language systems learn by analyzing millions of sentences written by humans. A system built by OpenAI, a lab based in San Francisco, analyzed thousands of self-published books, including romance novels, science fiction and more. Google’s BERT analyzed these same books plus the length and breadth of Wikipedia.
Each system learned a particular skill by analyzing all that text. OpenAI’s technology learned to guess the next word in a sentence. BERT learned to guess missing words anywhere in a sentence. But in mastering these specific tasks, they also learned about how language is pieced together.
If BERT can guess the missing words in millions of sentences (such as “the man walked into a store and bought a ____ of milk”), it can also understand many of the fundamental relationships between words in the English language, said Jacob Devlin, the Google researcher who oversaw the creation of BERT. (BERT is short for Bidirectional Encoder Representations from Transformers.)
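To make the two guessing games concrete, here is a minimal sketch of how practice examples for each might be built from a single sentence. It is a toy illustration, not code from OpenAI or Google: the function names are hypothetical, the sentence is borrowed from the sample test question above, and real systems work on vastly more text and pick which words to hide at random rather than hiding each word in turn.

```python
def next_word_examples(sentence):
    """Build (words so far, next word) pairs: the left-to-right guessing game."""
    words = sentence.split()
    return [(words[:i], words[i]) for i in range(1, len(words))]


def masked_word_examples(sentence, mask="____"):
    """Build (sentence with one word hidden, hidden word) pairs.

    Simplification: every word is hidden once, in turn.
    """
    words = sentence.split()
    examples = []
    for i, word in enumerate(words):
        masked = words[:i] + [mask] + words[i + 1:]
        examples.append((" ".join(masked), word))
    return examples


# Sentence taken from the sample test question in this article.
sentence = "a woman takes a seat at the piano"

for context, target in next_word_examples(sentence)[:3]:
    print("next word: ", " ".join(context), "->", target)

for masked, target in masked_word_examples(sentence)[:3]:
    print("blank word:", masked, "->", target)
```

In the first game the system sees only the words to the left of the one it must guess. In the second it sees the words on both sides of the blank, which is the "bidirectional" part of BERT's name.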
The system can apply this knowledge to other tasks. If researchers provide BERT with a bunch of questions and their answers, it learns to answer other questions on its own. Then, if they feed it news headlines that describe the same event, it learns to recognize when two sentences are similar. Usually, machines can recognize only an exact match.
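The gap between an exact match and a judgment of similarity can be shown with a rough sketch. The two headlines below are invented for illustration, and the word-overlap score is only a crude stand-in: it is not how BERT compares sentences, and unlike a learned system it would miss two headlines that describe the same event in entirely different words.

```python
def exact_match(a, b):
    """True only if the two sentences are identical, ignoring case and outer spaces."""
    return a.strip().lower() == b.strip().lower()


def word_overlap(a, b):
    """Share of distinct words the two sentences have in common (0.0 to 1.0).

    A crude stand-in for similarity, used here only for illustration.
    """
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


# Hypothetical headlines describing the same event.
first = "Google unveils new language system"
second = "Google unveils a new system for language"

print(exact_match(first, second))   # the strict test fails
print(word_overlap(first, second))  # the looser score is well above zero
```

The strict comparison treats the two headlines as unrelated, while even this simple score registers that they share most of their words.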
BERT can handle the “common sense” test from the Allen Institute. It can also handle a reading comprehension test where it answers questions about encyclopedia articles. What is oxygen? What is precipitation? In another test, it can judge the sentiment of a movie review. Is the review positive or negative?
This kind of technology is “a step toward a lot of still-faraway goals in AI, like technologies that can summarize and synthesize big, messy collections of information to help people make important decisions,” said Sam Bowman, a professor at New York University who specializes in natural language research.
In the weeks after the release of OpenAI’s system, outside researchers applied it to conversation. An independent group of researchers used OpenAI’s technology to create a system that leads a competition, organized by several top labs including the Facebook AI Lab, to build the best chatbot. And this month, Google “open sourced” its BERT system, so others can apply it to additional tasks. Devlin and his colleagues have already trained it in 102 languages.
Sebastian Ruder, a researcher based in Ireland who collaborates with Fast.ai, sees the arrival of systems like BERT as a “wake-up call” for him and other AI researchers because they had assumed language technology had hit a ceiling. “There is so much untapped potential,” he said.
The complex mathematical systems behind this technology are called neural networks. In recent years, this type of machine learning has accelerated progress in subjects as varied as face recognition technology and driverless cars. Researchers call this “deep learning.”
BERT succeeded in part because it leaned on enormous amounts of computer processing power that was not available to neural networks in years past. It analyzed all those Wikipedia articles over the course of several days using dozens of computer processors built by Google specifically for training neural networks.
The ideas that drive BERT have been around for years, but they started to work because modern hardware could juggle much larger amounts of data, Devlin said.