Learning to Read the Genetic Book of Life

Researchers at the Weizmann Institute have developed a method to expedite the analysis of genetic syntax, in an attempt to advance our understanding of the genome.

Geneticists like referring to the genome as the Book of Life. And if the genome is like a book, then sequencing it is like listing all the letters throughout the book, in all their various combinations.

But having that list doesn’t mean we know how to read the book yet. The next step is to learn the language of how these genetic letters arrange themselves into a whole – the human body. Now scientists at the Weizmann Institute in Israel have devised a technique to make that process a little easier by allowing researchers to study thousands of sequences at a time.

The genome was first sequenced just over ten years ago. Since then, scientists have been trying to better understand it, typically piecing together the puzzle by trial and error.

The genome is our DNA, which is made up of nucleotides. Scientists have been trying to understand what specific genomic sequences do by taking a sequence and making small changes (substituting one nucleotide for another, for instance, or removing a nucleotide) to see what happens to the cell containing the “mutated” sequence.
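To make those two kinds of change concrete, here is a minimal sketch in Python that treats a DNA sequence as a plain string of the letters A, C, G and T. The function names and the six-letter example sequence are made up for illustration; they do not come from the Weizmann study.

```python
VALID_BASES = "ACGT"

def substitute(seq, index, new_base):
    """Return a copy of seq with the nucleotide at index replaced by new_base."""
    if new_base not in VALID_BASES:
        raise ValueError("new_base must be one of A, C, G, T")
    if not 0 <= index < len(seq):
        raise IndexError("index is outside the sequence")
    return seq[:index] + new_base + seq[index + 1:]

def delete(seq, index):
    """Return a copy of seq with the nucleotide at index removed."""
    if not 0 <= index < len(seq):
        raise IndexError("index is outside the sequence")
    return seq[:index] + seq[index + 1:]

# Illustrative sequence only; real sequences are far longer.
original = "ATGCGT"
substituted = substitute(original, 2, "C")  # "ATCCGT"
deleted = delete(original, 2)               # "ATCGT"
```

Each call returns a new string and leaves the original untouched, so one starting sequence can serve as the basis for many different mutated versions.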

Given the countless possible interactions between genes, deciphering the function of each genetic sequence in our DNA is a daunting task. Currently the process is applied in a rather artisanal way, analyzing just a few sequences at a time. When dealing with tens of thousands of genes, all interacting with each other in numerous ways, it’s not the most effective method.

Taking a cue from Henry Ford’s playbook of mass production, Weizmann researchers have developed a method that allows them to study the effects of sequence changes in tens of thousands of cells simultaneously, in the time it would otherwise take to study a few dozen.

By combining existing advanced technologies for genetic research, the scientists, led by Prof. Eran Segal of the Department of Computer Science and Applied Mathematics and the Department of Molecular Biology at Weizmann, created a unique large-scale method for analyzing genetic syntax. The results were recently published in the journals Nature Biotechnology and Nature Genetics.

The work begins on the computer where a program automatically designs tens of thousands of short DNA sequences (strings of nucleotides), all of them 150 letters long. The sequences are mostly similar, with slight changes systematically inserted to differentiate them. For instance, researchers might switch up the nucleotides’ placements to learn how their location affects the characteristics and behavior of a genetic sequence.
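The design step can be pictured with a short Python sketch. It is not the researchers’ program: the motif, the random background sequence, the step size and all function names are assumptions chosen for illustration. Only the 150-letter length comes from the description above. The sketch builds a series of sequences that are identical except for where one short motif sits.

```python
import random

SEQ_LENGTH = 150  # sequence length stated in the article

def random_background(length, seed=0):
    """Build a reproducible random DNA string to serve as the shared backbone."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))

def place_motif(background, motif, position):
    """Overwrite part of the background with the motif, keeping the length fixed."""
    if position < 0 or position + len(motif) > len(background):
        raise ValueError("motif does not fit at this position")
    return background[:position] + motif + background[position + len(motif):]

def design_position_series(background, motif, step):
    """Return {position: sequence}, shifting the motif by `step` letters each time."""
    last_start = len(background) - len(motif)
    return {
        pos: place_motif(background, motif, pos)
        for pos in range(0, last_start + 1, step)
    }

background = random_background(SEQ_LENGTH)
motif = "TATAAA"  # hypothetical motif, chosen only for illustration
library = design_position_series(background, motif, step=10)
```

Because every sequence in the series shares the same background, any difference in behavior between two of them can be traced to the motif’s position, which is the kind of systematic comparison the article describes.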

“Our experiments take genetic parameters and change them very systematically from one sequence to another so we can learn the regularity of the genetic language,” says Segal.

After the computer has generated those sequences, they’re sent to a lab that brings them to life as actual genetic strips which the scientists at Weizmann then physically insert into individual identical cells.

In a test case for their technique, the researchers examined how changes in a genetic sequence would affect a protein that makes cells luminescent. After creating thousands of DNA sequences, they inserted them into the cells and observed where luminescence was most affected. Then, by sequencing the genome of the most affected cells, they could identify which particular sequences of nucleotides were responsible for the change.
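The final step, picking out the sequences linked to the largest change, amounts to a ranking. The Python sketch below shows one simple way to do that, assuming each variant has a single luminescence reading and that an unmodified reference reading is available. The variant labels and numbers are invented toy values, not measurements from the study, and the function name is hypothetical.

```python
def most_affected(readings, reference, top_n=3):
    """Rank variant IDs by absolute change in signal relative to the reference."""
    changes = {vid: value - reference for vid, value in readings.items()}
    ranked = sorted(changes, key=lambda vid: abs(changes[vid]), reverse=True)
    return [(vid, changes[vid]) for vid in ranked[:top_n]]

# Toy values for illustration only; they are not experimental data.
toy_readings = {"v1": 98.0, "v2": 40.0, "v3": 105.0, "v4": 180.0}
top = most_affected(toy_readings, reference=100.0, top_n=2)
# top is [("v4", 80.0), ("v2", -60.0)]
```

Ranking by absolute change keeps both variants that boost the signal and variants that suppress it, since either kind may point to a meaningful rule.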

In future experiments, researchers hope to apply the process to examine the effects of different genetic sequences on cancerous cells.

Because the researchers are currently restricted to single cells and short DNA strips, even their newly expanded and expedited method of analysis is limited in what it can teach about the rules of genomic syntax.

Prof. Segal says that he and his team are also interested in manipulating stem cells – a unique type of cell that can develop into any tissue – thus allowing them to observe the effects on a broader basis, such as an entire tissue or even a complete limb, with the potential to further increase our understanding of genetic language.

Part of the current research included a test where the researchers guessed how certain genetic changes might affect a cell, based on prior experiments. “We succeeded in identifying many regularities and predicting them,” says Segal. “But there are many regularities we are far from pinning down.

“In the long run,” he continues, “I hope our systematic effort will allow us to read the entire genome book – not a specific book of an organism, but any genetic book at all.”