
IBM to set Watson loose on cancer genome data

Project will focus on treatment, rather than discovery.

A cancer mutation is shown on a cell protein pathway from genome sequencing.

NEW YORK—Earlier today, IBM announced that it would be using Watson, the system that famously wiped the floor with human Jeopardy champions, to tackle a somewhat more significant problem: choosing treatments for cancer. In the process, the company hopes to help usher in the promised era of personalized medicine.

The announcement was made at the headquarters of IBM's partner in this effort, the New York Genome Center; its CEO, Robert Darnell, called the program "not purely clinical and not purely research." Rather than seeking to gather new data about the mutations that drive cancer, the effort will attempt to determine whether Watson can parse genome data and use it to recommend treatments.

Darnell said that the project would start with 20 to 25 patients who are suffering from glioblastoma, a type of brain cancer with a poor prognosis. Currently, the median survival time after diagnosis is only 14 months; "Time, frankly, is not your friend when you have glioblastoma," as Darnell put it. Samples from those patients (including both healthy and cancerous tissue) would be subjected to extensive DNA sequencing, including both the genome and the RNA transcribed from it. "What comes out is an absolute gusher of information," he said.

It should theoretically be possible to analyze that data and use it to customize a treatment that targets the specific mutations present in tumor cells. But right now, doing so requires a squad of highly trained geneticists, genomics experts, and clinicians. It's a situation that Darnell said simply can't scale to handle the patients with glioblastoma, much less other cancers.

Instead, that gusher of information is going to be pointed at Watson. John Kelly of IBM Research stepped up to describe Watson as a "cognitive system," one that "mimics the capabilities of the human mind: some, but not all [capabilities]." The capabilities it does have include ingesting large volumes of information, identifying the information that's relevant, and then learning from the results of its use. Kelly was extremely optimistic that Watson could bring new insights to cancer care. "We will have an impact on cancer and these other horrific diseases," he told the audience. "It's not a matter of if, it's a matter of when, and the when is going to be very soon."

IBM's Ajay Royyuru points to a drawing of the chemical formula for DNA at IBM Research headquarters in Yorktown Heights, New York.

Teaching a computer biology

Kelly noted that IBM shifted Watson to a cloud-based system (it was originally run on a dedicated cluster), which means that any success from the initial effort would be easier to replicate elsewhere. Perhaps more significantly, however, Kelly said, "We had to teach Watson the language of healthcare and medicine." How does one go about teaching a computer cancer biology?

We talked a bit to IBM's Ajay Royyuru about the process. To get the machine started, Royyuru said they took advantage of the fact that the National Institutes of Health has compiled lists of biochemical pathways—signaling networks and protein interactions—and placed them in machine-readable formats. Once those were imported, Watson's text analysis abilities were set loose on the NIH's PubMed database, which contains abstracts of nearly every paper published in peer-reviewed biomedical journals.

Over time, Watson will develop its own sense of which of the sources it consults are consistently reliable. Royyuru told Ars that, if the team decides to, it can start adding the full text of articles and branch out to other information sources. Between the known pathways and the scientific literature, however, IBM seems to think that Watson has a good grip on what typically goes on inside cells.

How is it going to use that knowledge? So far, most of the cancer genome sequencing efforts have focused simply on identifying which mutations are likely to be the key to driving uncontrolled cell growth. This effort has reinforced the notion that there are a handful of core genes that trigger a variety of cancers, along with a larger array that are either involved in specific cell types or in promoting specific aspects of cancerous growth like invasiveness. (Although Darnell was quick to point out that discoveries in this area are ongoing.)

In goes data, out comes a treatment

Ultimately, Watson's not going to be used for that. Instead, it's going to be used to help design treatments. Given the results of the DNA and RNA sequencing (the gusher Darnell mentioned earlier), Watson will figure out which mutations are distinct to the tumor, what protein networks they affect, and which drugs target proteins that are part of those networks. The net result will be a picture of the biochemical landscape inside the tumor cells, along with some suggestions on how clinicians might consider intervening to change the landscape.
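To make those three steps concrete, here is a minimal Python sketch of the general idea. It is not IBM's code or Watson's actual method: the function names, the gene, pathway, and drug labels, and the toy data are all hypothetical placeholders, and a real pipeline would have to deal with sequencing error, variant annotation, and RNA expression levels that this ignores.

```python
# Illustrative sketch only. All names and data are hypothetical placeholders,
# not anything published by IBM or the New York Genome Center.

def tumor_specific(tumor_variants, normal_variants):
    """Keep variants seen in the tumor sample but not in healthy tissue."""
    return set(tumor_variants) - set(normal_variants)


def suggest_drugs(tumor_variants, normal_variants, pathways, drug_targets):
    """Map tumor-specific mutations to pathways, then to drugs hitting them.

    pathways:     pathway name -> set of member genes/proteins
    drug_targets: drug name -> set of genes/proteins the drug targets
    Returns a dict of drug name -> sorted list of affected pathways it touches.
    """
    somatic = tumor_specific(tumor_variants, normal_variants)
    mutated_genes = {gene for gene, _change in somatic}

    # A pathway counts as affected if any of its members carries a mutation.
    affected = {
        name: members
        for name, members in pathways.items()
        if mutated_genes & members
    }

    suggestions = {}
    for drug, targets in drug_targets.items():
        hit = sorted(name for name, members in affected.items() if targets & members)
        if hit:
            suggestions[drug] = hit
    return suggestions


if __name__ == "__main__":
    tumor = {("GENE_A", "c.100A>T"), ("GENE_B", "c.5G>C"), ("GENE_C", "c.7del")}
    normal = {("GENE_B", "c.5G>C")}  # inherited variant, present in both samples
    pathways = {
        "pathway_1": {"GENE_A", "GENE_D"},
        "pathway_2": {"GENE_B", "GENE_E"},
        "pathway_3": {"GENE_C", "GENE_F"},
    }
    drug_targets = {
        "drug_x": {"GENE_D"},
        "drug_y": {"GENE_E"},
        "drug_z": {"GENE_F", "GENE_A"},
    }
    print(suggest_drugs(tumor, normal, pathways, drug_targets))
```

Note that a drug does not need to hit the mutated protein itself in this sketch; targeting any member of an affected network is enough to make the list, which matches the article's description of drugs that "target proteins that are part of those networks."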

Royyuru told Ars that Watson will be aware of two key concepts. One is that these signaling networks often have redundancies—if you block one protein with a drug, then others in the network will make up for its absence. To avoid this, Watson can suggest combinations of drugs that target multiple arms of the network. The other thing we've found is that tumors are a population of different cells, not all of which have the same combination of mutations. This heterogeneity can be identified in the sequencing results, and Watson should be able to pick drugs that can target distinct populations within the tumor.
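Both concepts amount to a coverage problem: pick a small set of drugs so that no arm of the network, and no subpopulation of cells, is left untouched. The sketch below shows one simple way to frame that, as a brute-force search for the smallest drug combination that hits every subpopulation. This is an assumption made for illustration, not a description of how Watson works; the function name, the `max_drugs` limit, and the data are all hypothetical.

```python
from itertools import combinations

# Illustrative sketch only. Names and data are hypothetical placeholders.

def smallest_covering_combo(subclones, drug_targets, max_drugs=3):
    """Find the smallest drug combination that hits every tumor subpopulation.

    subclones:    subclone name -> set of targetable proteins in its altered networks
    drug_targets: drug name -> set of proteins the drug targets
    Returns a tuple of drug names, or None if no combination of up to
    max_drugs drugs reaches every subclone.
    """
    drugs = sorted(drug_targets)  # sorted so the result is deterministic
    for size in range(1, max_drugs + 1):
        for combo in combinations(drugs, size):
            # Union of everything this combination of drugs can hit.
            covered = set().union(*(drug_targets[d] for d in combo))
            if all(covered & proteins for proteins in subclones.values()):
                return combo
    return None


if __name__ == "__main__":
    subclones = {
        "clone_1": {"GENE_A", "GENE_B"},
        "clone_2": {"GENE_C"},
    }
    drug_targets = {
        "drug_x": {"GENE_A"},
        "drug_y": {"GENE_C"},
        "drug_z": {"GENE_B"},
    }
    print(smallest_covering_combo(subclones, drug_targets))
```

Exhaustive search is only practical for a handful of candidate drugs. It is used here because it is easy to read, and it says nothing about drug interactions or toxicity, which a clinician would weigh before accepting any suggested combination.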

Ultimately, Watson won't do more than make a set of suggestions and provide a convenient interface for clinicians to explore the data that led to those suggestions. (As Royyuru's colleague Raminderpal Singh told me, "You don't want to be making the decisions, because then you're in the medical business.") But, as Darnell noted in his introduction, right now those recommendations take a team of specialized researchers about a week to make. Watson can generate them almost as soon as the sequencing data is ready.

And that, everyone hopes, will be the start of our ability to tailor treatments to the genetic changes that make each case of cancer a distinctive medical challenge. A panel of 25 patients won't be enough to really know how effective this is (though Watson will ask the clinicians why they chose specific treatments and add that information to the data it considers in the future). But the program, which has already gotten approval to enroll human patients, involves a partnership with all the major New York City hospitals, as well as one on Long Island and one in Buffalo. If the first results are promising, the program could be expanded rapidly.
