Where Math Meets Biology:
An Accidental Career
By Jason Socrates Bardi
"The city is placed upon the confluence of two large rivers, the Avon and the Willy, neither of them considerable rivers, but very large when joined together"
—Daniel Defoe (1661-1731), a description of Salisbury that appears in From London to Land's End.
As a biostatistician, Professor James Koziol approaches biomedical research at The Scripps Research Institute from an unusual perspective—he is more of a mathematician than a biologist. But more unusual still is the story of how Koziol came to be a biostatistician: he was drafted into biology during the war in Vietnam.
Today the words "draft lottery" don't carry the same power to evoke strong emotions as they once did. Probably most American teenagers, twenty-somethings, and thirty-somethings today think of the draft lottery as little more than a contest among NBA teams that will determine which of them will get the pick of dream team prospects.
But in the early 1970s, the draft lottery carried a wholly different meaning. A year after Koziol graduated with his bachelor's degree in mathematics from the University of Chicago, he was working on his master's degree in statistics when the first nationwide lottery for the Vietnam draft was held. A ping pong ball with Koziol's birthday on it was picked fourth out of the hopper.
The drawing was statistically non-random, says Koziol, because the ping pong balls on which the birthdays were written were not mixed before the drawing was held. In fact, one of Koziol's professors at the University of Chicago filed a lawsuit challenging the validity of the drawing. The lawsuit failed to set aside the lottery results, although it did ensure that the balls were thoroughly mixed in the later draft lotteries.
Such a low lottery number virtually guaranteed Koziol would be drafted. "No way was I not going to go," recalls Koziol.
However, he found another way to serve. One of Koziol's professors, Paul Meier, helped him get a commission with the U.S. Public Health Service at the National Institutes of Health (NIH), which recruited him to look at the statistics of cancer growth curves.
"In retrospect, that changed my life," he says. "That serendipitous event was my entry into biostatistics and biomedical research."
Biostatistician in the House
After his two-year "hitch" with the NIH in Bethesda, Koziol returned to Chicago and graduate school, where he concentrated on biostatistics. After he finished his Ph.D., he returned to the NIH for a few years and from there joined the mathematics department at the University of British Columbia. In 1978, he joined the faculty at the University of California, San Diego. In 1983, he joined the Radiation Effects Research Foundation in Japan as a statistician. Finally, in 1984, Koziol arrived at Scripps Research as a member of the Department of Molecular and Experimental Medicine.
At Scripps Research, Koziol spends about half his time as the biostatistician for the General Clinical Research Center (GCRC), a facility located in Scripps Green Hospital that brings together basic scientists with physicians and nurses who care for patients.
The GCRC is open to any Scripps Research-affiliated investigator or postdoctoral fellow who is interested in clinical studies involving humans, and it enables these investigators to determine the bearing of their discoveries on human biology. Scripps Research investigators who are interested in using the facilities establish a collaboration with a clinician who has admitting privileges to the hospital, and the studies are carried out by this licensed physician.
The majority of the GCRC's funding comes from an NIH grant, which means that the center is able to provide substantial financial assistance for many investigator-initiated studies by picking up the tab for the care, monitoring, and testing of patients. The grant also ensures that all studies follow the strictest federal guidelines, which were recently adjusted by the U.S. Food and Drug Administration and are designed to protect the rights and safety of patients in any human trial. Koziol is part of the GCRC Scientific Advisory Committee, which meets every month or so to evaluate proposed studies for scientific merit.
"A large portion of my time is spent with the clinical research operation [at the GCRC]—planning clinical trials and implementing them," he says.
This makes Koziol a valuable resource for those conducting trials at the center because clinical trials are expensive and time-consuming, sometimes extraordinarily so. Planning a clinical trial contributes about 90 percent towards its success, says Koziol, and his input as a biostatistician can save time and money.
Koziol can help to determine what questions to ask and what data to collect in the course of the trial. He can help to determine how factors such as age, sex, weight, and health of the subjects should be controlled for in the experiment so that the data will support or falsify the study's hypothesis, and he can address issues such as how many people need to be involved in the trial so that the answer is robust.
"If you don't have a properly implemented trial and take into account factors like randomization and blinding, you are not going to answer the question you set out to answer," he says. "That's where a biostatistician like myself can be useful."
Analysis, Meta-Analysis, and Pure Math
Of course, planning is not everything. Koziol is also involved in the analysis of the data—something he calls "the fun part" as long as the experiment has been well planned and executed.
In his 20 years at Scripps Research, Koziol has seen more changes to his field than he could have imagined. Back in the day, says Koziol, he would have to write his own statistical analysis algorithms and program them into computers by hand, using a stack of punch cards that he would manually feed into the ancient IBM 360 machines that were around back then. Today, there is sophisticated statistical analysis software on the desktop and laptop computers of every researcher he knows—and these relatively tiny and cheap computers are much faster and more powerful than those behemoth machines of decades past.
"You see [this progress] in the types of questions people ask me," says Koziol. Nobody ever asks simple questions any more, he says, and this makes things vastly more interesting.
Koziol can also advise on data mining, the process of searching through data for interesting trends—perhaps ones that were not predicted before the data was collected. For instance, in clinical trials a drug may not work as hoped on a large group, but may appear to work in a subset of individuals who share a common set of genetic markers. Koziol can help determine if this is actually a significant effect—or at least enough of a measurable effect to warrant further studies.
"At that stage," says Koziol, "if [the researchers] really believe that what they found is meaningful and if there is an underlying biological rationale for it, then they might want to go back and do another trial focusing on that particular subgroup."
Koziol does some independent mathematical research on pure statistics, which is what he originally envisioned himself doing years ago before the draft led him to the NIH. He also has his own independent grants from the NIH to conduct applied statistical research on problems ranging from statistical techniques for analyzing genomic data to the growth of cancer tumors.
What can statistics tell us about cancer tumors? It can address some of the discrepancies of experimental data.
"Statistics is the study of uncertainty," says Koziol. "Our role is to help people understand that uncertainty."
In a classic tumor growth model, immunocompromised mice are injected with tumor cells and then exposed to anticancer agents. The effect of these agents is determined by looking at the number, size, and growth of the tumors that develop over time. Untreated mice will grow tumors of greater size, in greater number, and in less time than mice treated with a potent anticancer agent.
However, in a typical experiment not all the mice will respond identically to the same treatment, even if they are genetically identical. This makes it difficult to model the effect of the antitumor agents because the data cannot be fit with an ordinary "parametric" growth curve, a continuous geometric curve that can approximate the size of the tumors over time. So Koziol has taken a "non-parametric" approach to addressing this, dispensing with the normal parametric model and looking at these statistics over time.
Some of Koziol's most interesting projects come out of consultations with other investigators at Scripps Research. One recent example is a project with Immunology Professor Bruce Beutler looking at mutagenic experiments Beutler conducted to identify new genes involved in the innate immune response.
The experiments rely on making random mutations and looking for those that induce a particular effect on the innate immune system—the inability to properly sense a particular component of certain bacteria known as lipopolysaccharide (LPS). If a mutation is detected that causes a measurable change in LPS sensing, then the gene in which that mutation occurs can be identified and studied biochemically.
These studies can eliminate the false positives—mutations that cause a measurable effect but are not involved in LPS sensing. But what about the false negatives? What about the genes that are actually involved in LPS sensing but have not been detected because in the experiments they escaped random mutations? Put another way, if you have identified through mutagenesis a certain number of genes that are involved in LPS sensing, what does that tell you about how many of these genes you have not yet identified?
A few years ago, Koziol and Beutler began discussing how they could estimate how many total genes in the genome are involved in LPS sensing. While he was hesitant to reveal the results, which are not yet complete, Koziol was eager to explain this problem in terms of probability and roulette. (Probability and statistics owe much of their early conceptual development to attempts by mathematicians to explain and predict roulette outcomes, says Koziol, and the game of roulette is still a favorite tool of statisticians today for explaining their science.)
Suppose you have a roulette wheel, says Koziol, and you spin it 55 times. Say 47 throws return a unique number and eight throws land in the same four slots twice. "How many slots does the roulette wheel have?" asks Koziol.
Assuming that with each spin of the wheel the ball has an equal probability of landing in each slot, this is a straightforward question (for mathematicians, that is). A combinatorial calculation gives 150 slots as a reasonable estimate.
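For readers who want to try the equal-slot version of the puzzle themselves, here is a minimal Python sketch. It is not Koziol's calculation, which the article does not spell out. It assumes one simple approach: find the number of slots for which a fair wheel would be expected to show as many different numbers as were actually seen. The function names are made up for illustration. The description of the repeats can be read more than one way, so the sketch tries two counts of distinct slots (51 and 47), and its answers depend on that reading and need not match the figure quoted above.

```python
def expected_distinct(slots: int, spins: int) -> float:
    """Expected number of different slots hit in `spins` spins of a fair wheel."""
    return slots * (1.0 - (1.0 - 1.0 / slots) ** spins)


def estimate_slots(distinct: int, spins: int, upper: int = 10**7) -> int:
    """Smallest slot count whose expected distinct hits reach `distinct`.

    Assumes every slot is equally likely. `upper` is an arbitrary search cap.
    """
    if not 0 < distinct < spins:
        raise ValueError("need at least one hit and at least one repeat")
    lo, hi = distinct, upper
    # expected_distinct grows with the slot count, so binary search applies.
    while lo < hi:
        mid = (lo + hi) // 2
        if expected_distinct(mid, spins) < distinct:
            lo = mid + 1
        else:
            hi = mid
    return lo


# Reading 1: 47 slots hit once and 4 slots hit twice, so 51 distinct slots.
print(estimate_slots(distinct=51, spins=55))
# Reading 2 (assumed alternative): only 47 distinct slots among the 55 spins.
print(estimate_slots(distinct=47, spins=55))
```

With so few repeats the answer is very sensitive to the counts: a handful more or fewer repeated numbers moves the estimate by a large factor, which is part of why such estimates carry wide uncertainty.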
But what if some of the slots on the roulette wheel are larger than others? Since genes are not all the same length, the probability of random mutations is not equal in all genes. "That's a tough problem," says Koziol. "You have to do a little modeling in terms of the underlying distribution of widths."
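A small simulation can illustrate why unequal slots make the problem harder. The Python sketch below is only an illustration under assumed conditions, not the model Koziol is building: the wheel size (350), the number of spins (200), and the exponentially distributed slot widths are all arbitrary choices, and the function names are hypothetical. It applies the equal-slot estimate to a wheel whose slots are in fact unequal and compares the average answer with the true slot count.

```python
import random


def estimate_slots(distinct: int, spins: int, upper: int = 10**7) -> int:
    """Equal-slot estimate: smallest slot count expected to show `distinct` hits."""
    if not 0 < distinct < spins:
        raise ValueError("need at least one hit and at least one repeat")
    lo, hi = distinct, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if mid * (1.0 - (1.0 - 1.0 / mid) ** spins) < distinct:
            lo = mid + 1
        else:
            hi = mid
    return lo


def mean_estimate(widths, spins, trials, rng):
    """Average equal-slot estimate over simulated runs on a wheel with `widths`."""
    total = 0
    for _ in range(trials):
        # Each spin lands in a slot with probability proportional to its width.
        hits = rng.choices(range(len(widths)), weights=widths, k=spins)
        total += estimate_slots(len(set(hits)), spins)
    return total / trials


rng = random.Random(0)          # fixed seed so runs are repeatable
TRUE_SLOTS, SPINS, TRIALS = 350, 200, 200   # assumed values for illustration

equal_widths = [1.0] * TRUE_SLOTS
unequal_widths = [rng.expovariate(1.0) for _ in range(TRUE_SLOTS)]  # assumed shape

mean_equal = mean_estimate(equal_widths, SPINS, TRIALS, rng)
mean_unequal = mean_estimate(unequal_widths, SPINS, TRIALS, rng)
print(f"true slots: {TRUE_SLOTS}")
print(f"mean estimate, equal widths:   {mean_equal:.0f}")
print(f"mean estimate, unequal widths: {mean_unequal:.0f}")
```

Wide slots are hit repeatedly while narrow ones are rarely hit at all, so the equal-slot estimate tends to come out too low on the unequal wheel. In the gene setting, that would correspond to undercounting the genes that are hard to hit by random mutation.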
Koziol is working on the problem.