By Kristin Sainani
Illustration by Jason Holley
When a baby girl, just days old, went into cardiac arrest in late January, her mother rushed her to the local fire station, where the crew defibrillated her and saved her life. She was sent to Oakland Children’s Hospital and then to Lucile Packard Children’s Hospital, where her heart stopped and had to be restarted multiple times. Her doctors put her on a heart-lung machine for a week, then implanted an internal defibrillator and removed nerve cells that make the heart jumpy. In all this turmoil, it’s unlikely that anyone gave a thought to the role of biostatisticians in the girl’s care. Yet biostatistics played a part, helping to establish the effectiveness of the treatments that saved the girl’s life. And now biostatistics is playing a more overt role in her care.
Puzzled by the baby’s condition, her doctors turned to Euan Ashley, MD, who is involved in several research projects to sequence the genomes of young patients with unexplained heart attacks. Sequencing a patient’s genome now takes just three to four weeks — even faster than it takes to get the results of clinical tests for known genetic disorders, says Ashley, an assistant professor of cardiovascular medicine at Stanford. His team is combing through the 3 billion base pairs in the infant’s genome, as well as those of her parents, to pinpoint what is likely a single genetic mutation responsible for her disease.
Sorting through the data isn’t just a volume problem; rather, the data are inherently tricky. They contain tens of thousands of errors introduced by the sequencing technology. How to separate these decoys from the causative genetic change is an open challenge. “You’re trying to sort out the needle from a stack of apparent needles,” says Frederick Dewey, MD, a postdoctoral fellow in Ashley’s lab.
This is just the kind of thorny data problem that is made to order for a biostatistician. Many types of expertise help crunch large data sets — computer science, mathematics and informatics, for example. But statistics brings something unique: Statisticians are not only trained in finding patterns in data, but also in separating real patterns from spurious ones. “Statisticians are very good at thinking about how bad their conclusions are,” says Bradley Efron, PhD, professor of health research and policy and of statistics at Stanford and recipient of the 2005 National Medal of Science.
From personal genomes to gene expression arrays to electronic medical records, biomedicine is awash in tricky data. As a result, biostatisticians are increasingly in demand and in the limelight. For example, biostatistics was at the center of recent cover stories in both The New York Times and The Wall Street Journal. The phenomenon isn’t unique to biostatistics; all of statistics is booming. As Google’s chief economist Hal Varian, PhD, famously told a crowd at the Almaden Institute in 2008: “I’ve been telling people that the really sexy job in the 2010s is to be a statistician. Because they’re the people who can make the data tell its story. And everybody has data.”
Statisticians have their pick of jobs — Google, Facebook, pharmaceutical companies and tenure-track academic positions straight out of graduate school. “We’re just not finding unemployed statisticians,” says Ronald Wasserstein, PhD, executive director of the American Statistical Association.
“It’s just a very exciting field,” says Rob Tibshirani, PhD, professor of health research and policy and of statistics at Stanford and the second most cited mathematical scientist in history. “It’s so much fun because people realize the need for statistics and they come with these very interesting problems.”
The idea that statistics might be sexy, popular and fun may sound radical, but it’s been gaining steam in biomedical circles for some time. Biostatistics has burst into attention recently, but only in the sense that a whale breaks the surface of the water, Wasserstein says. “The whale’s been there all the time, and it’s been having a huge impact.”
Biostatisticians have changed the way doctors and biologists think, shaped the way they do research and built the tools for analyzing all types of data. Stanford statisticians have long been leaders in these endeavors; the university’s statistics department is world-renowned and repeatedly ranks No. 1 on surveys of U.S. graduate schools.
Statisticians are also biomedicine’s skeptics, Efron says. They scrutinize the evidence, and when it disagrees with conventional wisdom, they challenge the status quo.
Biostatisticians may have enjoyed a life of relative obscurity until recently, but their influence has rippled throughout medicine for nearly a century.
Statistical thinking revolutionized medicine by helping doctors focus on evidence rather than on intuition and feeling. “The general attitude that you ought to be quantitative and comparative in your thinking about medicine is a powerful idea that isn’t natural to doctors. Or at least it wasn’t from the Greeks until about 1930,” Efron says.
Interpreting data is an art, because some of the associations that turn up are just flukes. Statisticians devised ways to separate these flukes from the truth. They also pioneered the concept of randomization — which makes it possible to compare two (or more) treatments fairly. “Before this, people blundered around for 2,000 years trying to decide whether A was better than B,” Efron says.
In the 1970s, the randomized clinical trial became standard practice for evaluating therapies. Biostatisticians developed the methods for both designing and analyzing these studies. Efron was involved in several early trials at Stanford, including seminal studies by Saul Rosenberg, MD, and Henry Kaplan, MD, that established radiation treatment as a curative therapy for Hodgkin’s lymphoma. “They changed it from an incurable disease to a curable disease,” Efron says.
Biostatisticians continue to play a critical role in designing and analyzing clinical studies. “I keep telling people that researchers should not be analyzing their own data by themselves,” says Helena Kraemer, PhD, emerita professor of psychiatry and the go-to statistician in that department for over 50 years. “People who know what they want the data to say develop a certain functional blindness when it comes to what the data actually say.” Statisticians can look at data honestly, since they don’t have any stake in the outcome, except as patients, she says.
Biostatistics entered a new phase in the 1990s and 2000s with the advent of high-throughput technologies — automated experiments that generate huge amounts of data, such as genome sequencing and microarrays. That’s when bench scientists, who previously had little interest in statistics, began to recognize biostatistics as indispensable.
It wasn’t immediately clear, even to statisticians, how to deal with so much data. For example, a microarray experiment might involve comparing the expression of 30,000 genes between cancer patients and controls. Traditional statistical tests allow one false positive to sneak in roughly every 20 comparisons; thus, when applied to microarray data, they generated a slew of false positives. Researchers excitedly proclaimed discoveries of gene signatures — patterns of gene activity predicting disease progression or treatment response — but the majority turned out to be nothing more than noise. Of hundreds of reported gene signatures, “almost none of them have panned out,” Tibshirani says.
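The arithmetic behind that slew of false positives can be sketched in a few lines of Python. This is only an illustration, assuming the conventional 0.05 threshold (the "one in 20" above) and a hypothetical experiment in which none of the 30,000 genes truly differs between patients and controls. The function names are made up for this sketch and do not come from any statistics package.

```python
import random

def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of chance hits when no gene truly differs."""
    return n_tests * alpha

def simulate_false_positives(n_tests, alpha=0.05, seed=0):
    """Count chance hits in one simulated experiment of pure noise.

    When there is no real effect, a p-value is uniformly distributed
    between 0 and 1, so random.random() stands in for each gene's test.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_tests))

if __name__ == "__main__":
    # Roughly 1,500 genes look "significant" even though none is real.
    print(expected_false_positives(30_000))
    print(simulate_false_positives(30_000))
```

With 30,000 comparisons, about 1,500 genes would pass the traditional test by luck alone, which is why an apparent signature can be mostly noise.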
Statisticians soon became involved in debunking some of these baseless claims and in helping researchers sort the garbage from the real biological signal. Tibshirani developed some of the primary tools for controlling the false positive rate. His 2001 paper introducing SAM (significance analysis of microarrays) has been cited more than 7,000 times. This software program estimates the percentage of genes in a gene signature likely to be false positives and lets researchers cut the false positive rate by using more stringent criteria for gene selection.
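To give a feel for how stricter selection criteria trim false positives, here is a minimal Python sketch. It is not the SAM algorithm. It uses the Benjamini-Hochberg procedure instead, a different and widely used method for limiting the expected share of false positives among the genes selected. The p-values are invented for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return indices of the tests kept at the requested false discovery rate.

    Sort the p-values, find the largest rank k whose p-value is at most
    k * fdr / m (m = number of tests), and keep the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

if __name__ == "__main__":
    # Hypothetical p-values for ten genes (made up for this example).
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    naive = [i for i, value in enumerate(p) if value < 0.05]
    print(naive)                  # genes passing the traditional test
    print(benjamini_hochberg(p))  # genes surviving the stricter criterion
```

In this made-up example, five genes pass the traditional one-in-20 test but only two survive the stricter rule.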
He and other biostatisticians have also played important roles as “forensic statisticians.” For example, in 2005, at the request of a colleague, Tibshirani scrutinized a high-profile New England Journal of Medicine article that claimed to have found a gene signature that predicts survival in follicular lymphoma. His conclusion after two weeks of work to reconstruct the analysis: The gene signature just didn’t hold up. He wrote a letter to the editor laying out his criticisms and re-analysis, which the journal published. But the original paper continues to be cited — which illustrates a problem with the current publication process: Papers are often cited as fact even years after they’ve been discredited, Tibshirani says.
Recently, two biostatisticians — Keith Baggerly, PhD, and Kevin Coombes, PhD, of the MD Anderson Cancer Center in Texas — unearthed a scandal at Duke University that has led to the retraction of at least 10 papers in major medical journals. Their sleuthing revealed multiple bookkeeping errors and ultimately fraud in the data of the researcher at the center of the scandal. If it weren’t for the persistence of Baggerly and Coombes, these problems, which were hidden within a complex data set, could easily have gone unnoticed — and clinical trial patients would still be receiving cancer drugs based on the fraudulent approach.
“As the data get more and more complex, it’s easy to sort of massage it until you get a good answer,” Tibshirani says. “As a result, biostatisticians are thinking about how to ensure that stuff that’s published is more credible, trustworthy and reproducible.”
As biology continues to evolve, so does biostatistics. Wing Wong, PhD, professor of health research and policy and of statistics at Stanford, is developing statistical tools for determining how genes work together in complex pathways in cells. “You cannot really understand the behavior of the cell by studying one gene at a time,” he says. “There are some pretty deep statistical issues that we are struggling with.”
Tibshirani is developing methods for analyzing data from phospho-flow cytometry. This technology (pioneered at Stanford) measures protein levels in individual cells, giving a level of resolution that microarrays don’t offer. But the data dimensions are flipped: Whereas microarrays compare the expression levels of tens of thousands of genes across maybe 10 to 100 tumors or people, phospho-flow experiments compare the activities of just 50 to 100 proteins across tens of thousands of cells. Thus, new statistical tools are necessary.
“Statistics is a unique field because it’s continually reinventing itself based on what types of data are out there,” says Nancy Zhang, PhD, assistant professor of statistics at Stanford.
Medicine is now entering a new phase in which “the role of biostatistics will become even more important,” Wong says. With the advent of cheap, fast sequencing technology, personal genomes will soon become routine, he predicts. This will open the door for so-called personalized, or individualized, medicine, in which doctors tailor therapies to patients based on their unique genetic makeup.
“The potential of this technology for transforming the way we do medicine is incredible,” Euan Ashley says. “And it’s really happening.” Ashley’s team is analyzing the genomes of several heart patients and their families. Identifying their specific genetic defects may help optimize these patients’ treatments (though these are still considered research projects). Cancer doctors are also beginning to sequence their patients’ tumors, hoping to match patients to the correct therapies.
Sequencing and interpreting genomes brings a host of new statistical challenges. “The technology has moved so fast that we’re trying to learn how to leverage it as it arrives,” Ashley says. The sequencing machines generate short fragments of DNA, called reads, containing just 75 to 150 base pairs each, in random order; these have to be assembled into the entire 3 billion base pair genome, which is no easy task. Plus, the technology is imperfect, making an error about once every 100,000 base pairs. “This is a low error rate, but when amplified over the huge amount of data we’re dealing with, there are a large number of absolute errors,” Dewey says. In the case of the infant girl, Dewey and Ashley singled out a candidate for the causative genetic mutation, but on further investigation saw it was just a sequencing error. The baby continues to have frequent cardiac arrests, triggering the internal defibrillator to restart her heart.
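A back-of-the-envelope calculation in Python shows why a low error rate still leaves a stack of apparent needles. The sketch uses only the figures quoted above and makes a simplifying assumption that is not from the article: each position in the genome is read exactly once, with no overlap between reads. Real sequencing reads each position many times, so actual counts are higher.

```python
GENOME_BASE_PAIRS = 3_000_000_000  # genome size quoted above
BASES_PER_ERROR = 100_000          # about one error per 100,000 base pairs

def expected_errors(base_pairs, bases_per_error=BASES_PER_ERROR):
    """Expected sequencing errors for a single pass over the genome."""
    return base_pairs // bases_per_error

def minimum_reads(base_pairs, read_length):
    """Fewest reads needed to cover the genome once (ceiling division)."""
    return (base_pairs + read_length - 1) // read_length

if __name__ == "__main__":
    print(expected_errors(GENOME_BASE_PAIRS))      # one genome
    print(expected_errors(3 * GENOME_BASE_PAIRS))  # infant plus both parents
    print(minimum_reads(GENOME_BASE_PAIRS, 150))   # longest reads quoted
    print(minimum_reads(GENOME_BASE_PAIRS, 75))    # shortest reads quoted
```

Under these assumptions a single genome carries about 30,000 expected errors, and it takes at least 20 million to 40 million reads to cover it once.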
Even if it were possible to reliably make sense of the data, it’s still not clear exactly how to use the data to prove that a specific treatment is the best treatment for a particular person. After all, how do you have a clinical trial with just one person? “If I’m going to look at the genome of your tumor and prescribe something uniquely for you, how are we going to assess whether that strategy is going to help or harm you compared with the traditional treatment? There are a whole lot of thorny areas,” says Terry Speed, PhD, professor of statistics at the University of California-Berkeley.
Personalized medicine will require innovations in study design. Wong and others are thinking about ways to mine the enormous amount of data stored in electronic medical records (which may soon encompass personal genomes), including the data stored in free text fields. “Physicians are typing away madly. All that information is actually very rich,” Wong says.
Biostatisticians are also inventing new ways to do randomized clinical trials. For example, Phil Lavori, PhD, wants to embed randomization into routine care. In situations where the best treatment option is unknown, doctors could enroll their patients (with consent) in an automated clinical trial. “The idea is that a physician could choose option A, choose option B or choose randomize from a drop-down menu,” says Lavori, professor and chair of health research and policy at Stanford and a pioneer of point-of-care clinical trials. The institution’s electronic system would monitor the trial and, if a clear winner emerged, would immediately make this alternative the standard of care. A pilot trial of this approach is already under way in Boston comparing two methods of administering insulin to diabetic patients.
Kraemer is proposing new ways of measuring outcomes in clinical trials that better reflect an individual patient’s experiences of harms and benefits. For example, rather than comparing the average benefit of a schizophrenia drug with the average weight gain it causes, this balance would be evaluated for each patient individually. Such an outcome measure is more sensitive to individual differences and can help identify which types of patients the treatment is most appropriate for.
Methods for analyzing trial data may also get an overhaul, for example by allowing knowledge gained in one trial to be incorporated into the analysis of a subsequent trial. “When you take a statistics class in the future, it might not be all about p-values anymore,” Lavori says. P-values, which give the probability that a pattern at least as strong as the one observed would arise by chance alone, have been the cornerstone of much of statistics in the past century.
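One way to make the idea of a p-value concrete is a permutation test, sketched here in Python. If the treatment did nothing, the group labels would be arbitrary, so the code tries every possible way of relabeling the patients and counts how often chance alone produces a gap at least as large as the one observed. The outcome scores and the function name are hypothetical, chosen only for illustration.

```python
from itertools import combinations

def permutation_p_value(treated, control):
    """One-sided exact permutation p-value for a difference in means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(control)
    total = sum(pooled)
    observed = sum(treated) / n_treated - sum(control) / n_control

    hits = 0
    splits = 0
    # Enumerate every way to pick which patients count as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = group_sum / n_treated - (total - group_sum) / n_control
        splits += 1
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
    return hits / splits

if __name__ == "__main__":
    # Hypothetical outcome scores for eight patients.
    treated = [12, 14, 15, 17]
    control = [8, 9, 10, 11]
    print(permutation_p_value(treated, control))
```

Here only one of the 70 possible relabelings matches the observed gap, so the p-value is 1/70, or about 0.014. Enumerating every split is practical only for small groups; larger studies would sample relabelings at random instead.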
Whatever statistics looks like in the future, one thing is clear: It’s a skill set that’s only going to get more valuable as biomedicine and other domains churn out more and more data. Statisticians can play in any field that interests them, Lavori says. “It’s beginning to dawn on people that there are some skills that stay valuable no matter what happens, and I would say that statistics is one of them.”
Biostatisticians are in short supply. Fortunately, researchers at the Stanford School of Medicine can access one-on-one statistical consulting through Spectrum (the Stanford Center for Clinical and Translational Education and Research). Many universities provide similar services, and other researchers can access limited free statistical consulting on the web, such as at http://www.stat-help.com/.
“It used to be that people would come to statisticians at the end of the study, and we would do a post-mortem — tell them what went wrong,” says Spectrum biostatistician Raymond Balise, PhD. But this has changed since the introduction of Spectrum’s online study-design tool, which directs researchers to Balise and his colleagues early.
“There is a lot of basic advice that we can offer that gives studies more power, so researchers need fewer patients,” Balise says. He also guides researchers in designing better questionnaires, planning for dropouts, measuring the right variables, and planning statistical analyses.
Spectrum’s client load has nearly tripled since 2006, and the demand is “insatiable,” Balise says. It’s an incredibly rewarding job, he adds. “It’s very hard not to fall in love with the research projects around here because they’re all about saving lives and improving the quality of life,” he says. “And biostatistics makes these projects possible.”