Summary: A group of researchers from Stanford has been working on deep learning models that can make sense of whole sentences at a time, and has recently trained its models on a large collection of online movie reviews.
Stanford Ph.D. student Richard Socher appreciates the work Google and others are doing to build neural networks that can understand human language. He just thinks his work is more useful — and he’s going to share his code with anyone who wants to see it.
Along with a team of Stanford researchers that includes machine learning expert and Coursera co-founder Andrew Ng, Socher has developed a computer model that can accurately classify the sentiment of a sentence 85 percent of the time. The previous state of the art for this task (essentially, discerning whether the overall tone of a sentence is positive or negative) peaked at about 80 percent accuracy. In a field where improvements usually come fractions of a percent at a time, that jump of 5 percentage points is a big deal.
It’s also a big deal to businesses, which are trying harder than ever to automate the task of figuring out what people are saying about them online. Almost every tweet, review, blog post or other piece of content expresses an opinion, but employing a human being to scan each one and instigate some sort of response or enter it into a database isn’t exactly efficient. Early approaches to sentiment analysis or social media monitoring have been kind of crude, often focusing on individual words without accounting for context at all.
Socher’s team pulled off its accomplishment by focusing not just on single words, but on entire sentences. It took nearly 11,000 sentences from online movie reviews (from a research database culled from Rotten Tomatoes, specifically) and created what the team has dubbed the Sentiment Treebank. What makes the Sentiment Treebank so novel is that the team split those nearly 11,000 sentences into more than 215,000 individual phrases and then used human workers, via Amazon Mechanical Turk, to classify each phrase on a scale from “very negative” to “very positive.”
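To get a feel for how a sentence breaks down into many more phrases than it has words, here is a minimal Python sketch. It assumes the phrases correspond to the nodes of a binary parse tree, which this article does not spell out, and the bracketing of the sample fragment is made up for illustration.

```python
# A binary parse tree written as nested tuples; leaves are single words.
# This bracketing is hypothetical, chosen only to illustrate the idea.
tree = (("just", "enough"), ("spice", ("to", ("keep", ("it", "interesting")))))


def phrases(node):
    """Return the text of every node in the tree, with the whole span first."""
    if isinstance(node, str):
        return [node]
    left, right = node
    left_phrases = phrases(left)
    right_phrases = phrases(right)
    # The first entry of each child list is the full text of that subtree.
    whole = left_phrases[0] + " " + right_phrases[0]
    return [whole] + left_phrases + right_phrases


all_phrases = phrases(tree)
for phrase in all_phrases:
    print(phrase)
```

A binary tree over n words has 2n - 1 nodes, so this seven-word fragment yields 13 phrases, each of which a human worker could label separately. For comparison, 215,000 phrases across roughly 11,000 sentences works out to about 20 phrases per sentence.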
The team then built a new model it calls a Recursive Neural Tensor Network (it’s an evolution of existing models called Recursive Neural Networks), which is what actually processes all the words and phrases to create numeric representations for them and calculate how they interact with one another. When you’re dealing with text like movie reviews that contain linguistic intricacies, Socher explained, you need a model that can really understand how words play off each other to alter the meaning of sentences. The order in which they come, and what connects them, matters a lot.
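As a rough illustration of what "calculating how they interact" can look like, the sketch below combines two word vectors into one phrase vector. It follows the general form given in the team's research paper as best understood here, a tensor (bilinear) term added to the linear term of an ordinary recursive network, but the vector size, the random untrained weights and the two word vectors are all hypothetical.

```python
import math
import random

D = 3  # toy vector size (assumption); trained models use larger vectors


def compose(a, b, V, W):
    """Combine two child vectors into one parent vector of the same size."""
    x = a + b  # concatenate the two children into one list of length 2 * D
    parent = []
    for k in range(D):
        # Tensor term: x^T V[k] x lets the two children interact multiplicatively.
        bilinear = sum(x[i] * V[k][i][j] * x[j]
                       for i in range(2 * D) for j in range(2 * D))
        # Linear term: the ordinary recursive-network map of the concatenation.
        linear = sum(W[k][j] * x[j] for j in range(2 * D))
        parent.append(math.tanh(bilinear + linear))
    return parent


# Random, untrained weights, used only so the example runs end to end.
rng = random.Random(0)
V = [[[rng.uniform(-0.1, 0.1) for _ in range(2 * D)]
      for _ in range(2 * D)] for _ in range(D)]
W = [[rng.uniform(-0.1, 0.1) for _ in range(2 * D)] for _ in range(D)]

# Hypothetical vectors for two words.
word_a = [0.2, -0.5, 0.1]
word_b = [-0.3, 0.4, 0.7]

ab = compose(word_a, word_b, V, W)  # vector for the phrase "a b"
ba = compose(word_b, word_a, V, W)  # swapping the order gives a different vector
```

Because the parent has the same size as each child, the same function can be applied again further up the tree until a single vector represents the whole sentence, and the result depends on the order of the children.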
A simple example of what Socher means would be a sentence like “There are slow and repetitive parts, but it has just enough spice to keep it interesting.” “Usually,” he said, “what comes after the ‘but’ dominates what comes before the ‘but,’” and that’s something a model focusing on single words or even single phrases might not be able to pick up.
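The short Python sketch below shows why counting words alone misreads that sentence. The word scores and the weighting rule are hypothetical and purely illustrative; the weighting rule is a hand-written heuristic, not the learned behavior of the Stanford model.

```python
# Hypothetical word scores; a real lexicon would be far larger.
LEXICON = {"slow": -1.0, "repetitive": -1.0, "interesting": 1.0}


def tokenize(sentence):
    """Lowercase the sentence and strip surrounding punctuation from each word."""
    return [w.strip(".,!?\"'").lower() for w in sentence.split()]


def word_score(words):
    """Add up per-word scores, ignoring order and context entirely."""
    return sum(LEXICON.get(w, 0.0) for w in words)


def but_aware_score(words, weight=3.0):
    """Weight the clause after the last 'but' more heavily (illustrative heuristic)."""
    if "but" not in words:
        return word_score(words)
    split = len(words) - 1 - words[::-1].index("but")
    return word_score(words[:split]) + weight * word_score(words[split + 1:])


sentence = ("There are slow and repetitive parts, but it has just enough "
            "spice to keep it interesting.")
words = tokenize(sentence)

naive = word_score(words)       # two negative words outvote one positive word
aware = but_aware_score(words)  # the clause after "but" now dominates

print(naive, aware)  # -1.0 1.0
```

The word-counting score comes out negative even though the sentence is mildly positive overall. A hand-coded rule like this one fixes only a single pattern; the appeal of a model that reads whole sentences is that it can learn such effects from labeled phrases.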
That sample sentence and the visual representation actually come from a website Socher’s team built to show off and help train its model. The site includes a link to the research paper, as well as a live demonstration of the model on whatever sentences people enter, and a tool for exploring the Sentiment Treebank to see how it has classified sentences containing specific words. The code for the model will be available for download on the site in late October.
Over time and with more sample sentences, Socher thinks his model could reach upward of 95 percent accuracy, but it will never be completely perfect. This is because there are always certain word combinations, sentence structures and jargon that don’t appear enough to let the model effectively determine patterns in how they’re used. The movie review training set, for example, didn’t include many emoticons, so Socher’s team is working on adding them to its system.
It also had to develop algorithms to analyze the morphology of words. For example, Socher noted, the word “absurdly” is used infrequently, but an algorithm is able to figure out that adding “ly” to a word doesn’t create a wholly new word with different sentiment.
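The article does not describe the team's morphology algorithm, so the following Python sketch is only a simple stand-in for the idea: when a rare word ending in "ly" is missing from the vocabulary, fall back to the sentiment of its stem. The score table and the `lookup` function are hypothetical.

```python
# Hypothetical sentiment scores for words seen often enough to be learned.
SCORES = {"absurd": -1.0, "happy": 1.0}


def lookup(word):
    """Return a word's score, backing off to its stem for unseen '-ly' forms."""
    word = word.lower()
    if word in SCORES:
        return SCORES[word]
    if word.endswith("ly"):
        stem = word[:-2]
        candidates = [stem]
        # Handle spelling changes such as "happily" -> "happi" -> "happy".
        if stem.endswith("i"):
            candidates.append(stem[:-1] + "y")
        for candidate in candidates:
            if candidate in SCORES:
                return SCORES[candidate]
    return 0.0  # unknown words are treated as neutral


print(lookup("absurdly"))  # -1.0, inherited from "absurd"
print(lookup("happily"))   # 1.0, inherited from "happy"
```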
The new model and Sentiment Treebank by Socher and his team come as deep learning is catching on more broadly, thanks in part to research that companies such as Google, Facebook and Microsoft (Socher is actually a Microsoft Research Ph.D. fellow) have been publicizing in fields such as image recognition (or computer vision), speech recognition and even language understanding. Earlier this week, IBM announced a research partnership with four high-profile universities that focuses in part on deep learning.
Socher acknowledged the impressive work done elsewhere, but he’s not convinced there’s much commercial utility in focusing too much on image recognition (at least right now) or on single words. (Google and others would probably disagree, maybe quite strongly, and could probably raise some very good points.) So he and his Stanford colleagues have been focusing on phrases and sentences, and aside from sentiment analysis, he says their models are pushing the state of the art in areas such as machine translation, grammatical analysis and logical reasoning.
“You’ll never care about translating a single word to another single word,” he said. “We’re actually able to put whole sentences and longer phrases into vector spaces without ignoring the order of the words.”