Software that turns data into written text could help us make sense of a coming tsunami of data.
The writing software, called Quill, was developed by Narrative Science, a Chicago company set up in 2010 to commercialize technology developed at Northwestern University that turns numerical data into a written story. It wasn’t long before Quill was being used to report on baseball games for TV and online sports outlets, and company earnings statements for clients such as Forbes.
Quill’s early career success generated headlines of its own, and the software was seen by some as evidence that intelligent software might displace human workers. Narrative Science CEO Stuart Frankel says that the publicity, even if some of it was negative, was a blessing. “A lot of people felt threatened by what we were doing, and we got a lot of coverage,” he says. “It led to a lot of inquiries from all different industries and to the evolution to a different business.”
Narrative Science is now renting out Quill’s writing skills to financial customers such as T. Rowe Price, Credit Suisse, and USAA to write up more in-depth, lengthy reports on the performance of mutual funds that are then distributed to investors or regulators.
“It goes from the job of a small army of people over weeks to just a few seconds,” says Frankel. “We do 10- to 15-page documents for some financial clients.”
An investment from In-Q-Tel, the CIA’s investment division, brought the company work from multiple U.S. intelligence agencies. Asked about that work, Frankel says only that “The communication challenges of the U.S. intelligence community are very similar to those of our other customers.” Altogether, Quill now churns out millions of words per day.
Quill’s output can be impressive for a machine, but it can’t write without some numerical data for inspiration. It performs statistical analysis on that data, looking for significant events or trends, and it draws on knowledge about key concepts such as bankruptcy, profit, and revenue, and how such concepts are related.
The following paragraph, from an investment report, shows that Quill can write passable text for such a document, but it can still feel as if it were written by a computer.
Quill is programmed with rules of writing that it uses to structure sentences, paragraphs, and pages, says Kristian Hammond, a computer science professor at Northwestern University and chief scientist at Narrative Science. “We know how to introduce an idea, how not to repeat ourselves, how to get shorter,” he says.
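Narrative Science has not published how Quill works, so the following is only a toy sketch of the general idea described here: scan numerical data for a notable change, then turn it into a sentence using a simple writing rule. The function names, the 10 percent threshold, and the sentence templates are all hypothetical and are not drawn from Quill.

```python
from statistics import mean


def find_notable_change(values, threshold=0.10):
    """Compare the latest value with the average of earlier ones.

    Returns the relative change if its size meets the (hypothetical)
    threshold, otherwise None.
    """
    if len(values) < 2:
        return None
    baseline = mean(values[:-1])
    if baseline == 0:
        return None
    change = (values[-1] - baseline) / baseline
    return change if abs(change) >= threshold else None


def describe(metric, values, threshold=0.10):
    """Render one sentence about a metric from a made-up template."""
    change = find_notable_change(values, threshold)
    if change is None:
        return f"{metric.capitalize()} was roughly in line with recent periods."
    direction = "rose" if change > 0 else "fell"
    return f"{metric.capitalize()} {direction} {abs(change):.0%} against the recent average."


# Illustrative figures only, not real company data.
print(describe("revenue", [100, 110, 90, 130]))
```

A real system would presumably combine many such checks with domain knowledge about how concepts like profit and revenue relate; this sketch shows only the single step from a number to a sentence.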
Companies can also tune Quill’s style and use of language based on what they need it to write. It can accentuate the positive in marketing copy, or go for exhaustive detail in a regulatory filing, for example.
Quill can also take an “angle” for a piece of writing. When writing about sports for an audience likely to favor a particular team, for instance, Quill can write a story that softens the blow of a loss.
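One way to picture an “angle” is as a switch that changes wording without changing the facts. The sketch below is an assumption-laden illustration, not Quill’s method: the function `game_recap`, the `fan_angle` flag, the two-point cutoff for a “narrow” loss, and the team names are all invented for the example.

```python
def game_recap(team, opponent, team_score, opp_score, fan_angle=False):
    """Write a one-sentence recap; soften a loss when fan_angle is set.

    The score reported is the same either way; only the phrasing changes.
    """
    margin = team_score - opp_score
    if margin > 0:
        return f"{team} beat {opponent} {team_score}-{opp_score}."
    if margin == 0:
        return f"{team} and {opponent} tied {team_score}-{opp_score}."
    if fan_angle:
        # Hypothetical rule: a loss by two or fewer counts as "narrow".
        if -margin <= 2:
            return f"{team} narrowly fell to {opponent}, {team_score}-{opp_score}."
        return f"{team} came up short against {opponent}, {team_score}-{opp_score}."
    return f"{team} lost to {opponent} {team_score}-{opp_score}."


# Made-up teams and scores.
print(game_recap("Tigers", "Bears", 3, 4))
print(game_recap("Tigers", "Bears", 3, 4, fan_angle=True))
```

The same pattern could cover the tuning described earlier, such as accentuating the positive in marketing copy, by swapping in a different set of templates.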
Narrative Science doesn’t publish technical details of how Quill works. But Michael White, an associate professor at Ohio State University, says that its ability to finesse the angle and arc of a piece sets it apart from previous examples of such software.
What is known as “natural language generation” software has been a research topic for years, but it has recently begun to show more commercial promise, says White. “There’s growing awareness that masses of data and visualizations are not really helpful if they can’t be explained and made relevant,” says White. “I’d say the time has finally become ripe for natural language generation to have commercial success.”
Other companies working on the technology include Arria, a U.K.-based company that was spun out of research from the University of Aberdeen, in Scotland. A Pittsburgh startup called OnlyBoth, founded last year, plans to launch its first writing software products later in 2015.
All those companies are so far focused on serving businesses. But Hammond says that as cars, health gadgets, and home appliances become connected to the Internet, the simple charts and symbols they use to communicate with humans may not be enough. “Most households are not going to be able to do the data science to make their thermostats and cars and other data intelligible,” says Hammond. “This technology is going to be a descriptive voice of everything that has data.”