What are my chances?
Using in-game win probability techniques to predict the course of disease
It was late in the evening in 2012, and oncologist Ash Alizadeh, MD, PhD, was emotionally exhausted. He had just lost a young patient to lymphoma, and the death had come as a shock.
“Despite our best efforts, and a very strong, but ultimately mistaken, impression of successful treatment, the patient had relapsed unexpectedly,” said Alizadeh, an associate professor of medicine (oncology). “It was a time of very raw emotion.”
Although the majority of adults with the most common blood cancer, diffuse large B-cell lymphoma, are cured with the standard treatment protocol of six cycles of chemotherapy, about one-third will ultimately die from their disease.
But it’s difficult to know which patients will do well and which will do poorly. In fact, even the most experienced physicians making their best guess about whether any one patient will be cured of the disease or die of it are correct only about 60% of the time — just slightly more accurate than a coin toss.
That evening, Alizadeh and then-oncology fellow David Kurtz, MD, PhD, lingered to talk over the day’s events.
“One of the most challenging things for patients is dealing with the unknown,” said Kurtz, now an instructor in oncology.
“They want to know, ‘What are my chances? How long am I going to live?’ It’s one of the most difficult questions to answer, but one that we face at every patient visit. And for most patients the most honest answer is ‘I don’t know.’ It’s very frustrating.”
During the past several years Alizadeh, Kurtz, associate professor of radiation oncology Maximilian Diehn, MD, PhD, and postdoctoral scholars Mohammad Esfahani, PhD, and Florian Scherer, MD, have developed a computer algorithm that integrates many different types of predictive data to generate a single, dynamic risk assessment at any point during a patient’s course of treatment.
The data would include a tumor’s response to the treatment and the amount of cancer DNA circulating in a patient’s blood during therapy. Such an assessment could be deeply meaningful for patients and their doctors. Alizadeh, Kurtz and Diehn treat patients at the Stanford Cancer Center.
“When we care for our patients, we are walking on eggshells for a profound period of time while we try to determine whether the cancer is truly gone, or if it is likely to return,” said Alizadeh. “And patients are wondering, ‘Should I be planning to attend my child’s wedding next summer, or should I prioritize making my will?’ We are trying to come up with a better way to predict at any point during a patient’s course of treatment what their outcome is likely to be.”
The researchers have also found that the approach, which they’ve termed CIRI for Continuous Individualized Risk Index, may help doctors pinpoint those people who might benefit from early, more aggressive treatments as well as those who are likely to be cured by standard methods. They published their results on July 4 in Cell.
They are planning a clinical trial to test CIRI’s utility in driving treatment decisions for lymphoma and are collaborating with investigators around the world to extend the CIRI concept to other common tumor types, including breast cancers and chronic lymphocytic leukemia.
But that night in 2012, CIRI was just the pipe dream of two exhausted physicians who wanted better answers for their patients and a way to prevent the wrenching situation Alizadeh had been part of earlier that day.
Statisticians, bookies, pundits and late nights: The birth of an algorithm
Although Alizadeh had the terrible experience of patients dying unexpectedly, he had also occasionally treated people who experienced seemingly miraculous recoveries from severe and sometimes undertreated disease.
He and Kurtz had begun to wonder whether every patient really needed six cycles of chemotherapy and their associated unpleasant side effects. “Treating every patient with the same protocol didn’t seem to make sense,” Kurtz said. “After all, each patient is different, and they come to us with varying disease statuses.”
At the time, Alizadeh and Diehn were developing sensitive methods to identify and quantify minute amounts of tumor DNA released into a patient’s blood when cancer cells die. Tracking changes in the levels of circulating tumor DNA, or ctDNA, might help clinicians monitor the progress of the disease and predict whether it was likely to recur.
They published their technique, termed CAPP-Seq, in 2014. But they weren’t done yet.
“Once we had CAPP-Seq in hand, we could start to think about the next step — how to incorporate all these various types of data, including ctDNA levels, into a tool that will actually help clinicians make better predictions for patients,” said Kurtz.
To do so, the researchers used a technique that might be more at home in Las Vegas than in the clinic. For decades, bookies and pundits attempting to predict the outcome of the next hotly contested sports match or election have scoured every evolving scrap of relevant information and sifted through mountains of associated data. Their predictions can change on a dime, however, based on a player’s poor pass or a candidate’s stellar debate performance.
Statisticians refer to this technique of incorporating a variety of continuously generated information — who is on the bench, who was injured in the first half of the match, who polled well in Iowa yesterday — as calculating in-game win probability, and it’s been used for decades.
They wondered whether this approach would work for patients in the clinic. It seemed that it could be better than the snapshot-in-time approach on which clinicians currently rely.
When a diffuse large B-cell lymphoma patient is diagnosed, for example, clinicians assess the initial symptoms, the cell type from which the cancer originated, and the size and location of the tumor after the first imaging scan to generate an initial prognosis.
More recently, some use CAPP-Seq to measure the levels of ctDNA after the first one or two rounds of therapy to determine how the tumor is responding and estimate a patient’s overall risk of dying from the disease.
But each of these assessments yields a risk estimate based on a snapshot in time, rather than aggregating all available data into a single, dynamic risk assessment that can be updated throughout a patient’s treatment.
“What we’re doing now is somewhat like trying to predict the outcome of a football game after watching the kickoff or by checking the score at halftime,” Diehn said, “when in reality we know that there are any number of things that could have happened during the first half that we aren’t taking into account. We wanted to learn if it’s best to look at the latest information available about a patient, the earliest information we gathered, or whether it’s best to aggregate all of this data over many time points.”
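To make the contrast between a snapshot and a running estimate concrete, here is a minimal sketch of one generic way to fold in new evidence as it arrives: a sequential Bayesian update in log-odds form. It is an illustration only and is not presented as how CIRI works; the function name `update_risk`, the starting probability and the likelihood ratios are all hypothetical, made-up values rather than figures from the study.

```python
import math


def update_risk(prior_prob, likelihood_ratios):
    """Return the running probability of a good outcome after each observation.

    prior_prob: starting probability of a good outcome (hypothetical value).
    likelihood_ratios: one ratio per observation, in the order observed;
        values above 1 favor a good outcome, values below 1 a poor one.
    Assumes the observations are independent, which is a simplification.
    """
    # Work in log-odds so each new piece of evidence is a simple addition.
    log_odds = math.log(prior_prob / (1.0 - prior_prob))
    trajectory = []
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
        # Convert the running log-odds back into a probability.
        trajectory.append(1.0 / (1.0 + math.exp(-log_odds)))
    return trajectory


# Made-up numbers for illustration only, not taken from the study:
# one discouraging early finding followed by two encouraging ones.
evidence = [0.8, 3.0, 2.5]
trajectory = update_risk(0.65, evidence)
for step, prob in enumerate(trajectory, start=1):
    print(f"after observation {step}: {prob:.2f}")
```

In this toy version the estimate dips after the first observation and rises after the next two, so a reading taken only at the first step would look worse than the accumulated picture, much like checking the score once instead of following the whole game.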
So, the team gathered data on more than 2,500 diffuse large B-cell lymphoma patients from 11 previously published studies for whom the three most common predictors of prognosis were available. Esfahani used the data to train a computer algorithm to recognize patterns and combinations likely to affect whether a patient lived for at least 24 months after seemingly successful treatment without experiencing a disease recurrence.
They also included information from 132 patients for whom data about circulating tumor DNA levels were available before and after the first and second rounds of treatment.
The researchers next tested CIRI’s performance on previously published data from people with a common leukemia and from breast cancer patients.
Although the prognostic indicators varied for each disease, they found that, by serially integrating the predictive information over time, CIRI outperformed standard methods. If an experienced physician predicts a patient’s outcome correctly about 60% of the time, CIRI got it right about 80% of the time. Not perfect, but a significant improvement.
Furthermore, the study suggested that CIRI could identify patients who might need more aggressive intervention within one or two rounds of treatment rather than waiting to see if the disease recurs.
A clinical trial is necessary to learn whether CIRI can be used to guide the course of treatment in real time. But Alizadeh and Kurtz have seen tantalizing results when they’ve applied CIRI to some of their past patients with diffuse large B-cell lymphoma.
“What I didn’t initially expect was that aggregating all this information through time may also be predictive,” Alizadeh said. “It might tell us, ‘You’re going down the wrong path with this therapy, and this other therapy might be better.’”
One patient with advanced disease chose not to complete the recommended six rounds of chemotherapy and checked himself out of the hospital after one round of treatment.
“At that point, I was sure I’d soon be arranging for end-of-life care for him,” said Alizadeh. “He had the ugliest of the ugliest types of lymphoma — a huge burden of disease and the presence of all five of the standard risk factors we use to determine prognosis.” More than four years later, however, the man is alive and has no evidence of cancer.
“It was completely shocking,” said Alizadeh. “Conventional wisdom tells us there’s no way we could achieve a curative outcome for this patient with just one cycle of chemotherapy.”
Conversely, a young woman with few risk factors who appeared to be cured of her disease relapsed less than two years later and is now undergoing a much more aggressive form of treatment to try to save her life.
“All my training as an oncologist indicated that she should have had a full recovery,” Alizadeh said. “If only we could have known 30 days into treatment that she would relapse, rather than two years later, we would have had more options for her.”
When Alizadeh and Kurtz fed the patients’ information into CIRI after the fact, the algorithm saw patterns that had escaped the seasoned physicians. It correctly predicted that the young woman would do poorly and the man might do well.
“For both of these patients, my professional judgment and CIRI’s judgment were soberingly different,” said Alizadeh. “Now we have to learn how to use this tool in a way that will be meaningful for patients and their doctors. What we know, however, from our retrospective studies is that if you compare CIRI’s accuracy versus using any one risk factor alone, CIRI is head and shoulders better.”
In short, the whole is better than any one component alone. And that’s likely to help doctors help patients.