An interview with Dr. Olga Troyanskaya,
Assistant Professor of Computer Science,
Lewis-Sigler Institute for Integrative Genomics
Most people who use common bread yeast use it to bake crusty loaves of bread. Olga Troyanskaya is using bread yeast to help find a cure for cancer. That’s far from the only unusual thing about Olga Troyanskaya. In college she double majored in biology and computer science, which led to her current work in bioinformatics, a subject about which she says, “I didn’t even know what it was until sophomore year.” Her current quick definition of the subject is “any sort of application of innovative computer science, statistics, and mathematical techniques used to solve problems in biology – usually also based on lots of data.”
In a brightly lit office in the Carl Icahn Lab building, with her canine collaborator Jessy stationed protectively near her desk, Olga uses her expertise in math and computer science to try to understand the inner workings of cells. “Traditionally in biology someone went into a lab, did some experiments, wrote some things down in a notebook, took some pictures and wrote a paper.” That’s still being done now, and Olga thinks that for much biological research that is fine, but it isn’t what she does.
About 90% of her work is done on computers. She claims to use a great deal of data, and by biology standards she does, but on closer examination her data storage needs pale in comparison to those of people working with images. She builds many matrices of about 500,000 numbers and feeds them into her elaborate mathematical models, which make predictions about yeasts or cancer, though she says data on almost anything would work as well. Then the predictions are validated in her wet lab. “Most mathematicians,” she points out, “wouldn’t have a wet lab.” They’d have to deal with biologists. Knowing the math, computer science, and biology speeds Olga’s research and directs it in more relevant ways.
A major difficulty in processing the input data is that it contains a great deal of noise. Noise is actually a technical term for extraneous and erroneous data that unfortunately creeps into almost all experiments. The noise is misleading and distracting. It is very difficult to be sure what is noise and what is some unexplained behavior. A classic example is the attempt by Penzias and Wilson at Bell Labs in the mid-1960s to rid a sensitive radio antenna of noise. The unwanted noise they could not eliminate turned out to be the cosmic microwave background radiation left over from the Big Bang.
The wet lab experiments indicate whether her model is working. In either case the results of the experiments are manually fed back to the model to improve it. Her future plans include automating that loop, so that the model learns from its successes and failures by taking in the experimental results and adjusting itself accordingly.
Today Olga gets the processing power she needs from Computer Science and Genomics computers. She knows she needs many more sequential CPU cycles. Unfortunately, her model does not lend itself to parallel processing. Without faster processing, effort that should go into making the model more predictive will be spent on finding compromises that might speed up the model at some cost in accuracy. Another challenge is displaying the sea of numbers the model produces in a graphical format that makes the results easier to grasp. Even graphical representations crowd a conventional computer screen, but a large display wall will soon be available in the lobby of the Carl Icahn Lab.
When you next take a bite of some tasty bread, remember that the yeast that makes it so tasty is being used at Princeton to help find a cure for cancer. Olga declines to agree that she might find a cure for cancer, but she says that her work will help explain how the regulation of cells occurs and what goes wrong in that regulation to cause disease. That sounds like a big step in the direction of a cure.