Modern genetic research can be divided into several chapters. It began with the discovery of the double-helix structure of DNA, reported in 1953 by James Watson and Francis Crick, who based their findings on the research of Rosalind Franklin without giving her proper credit. This was followed by gradual progress in the identification and study of specific genes, until the second revolution: the sequencing of the full human genome, which was completed by the early 2000s. Since then, breakthroughs in DNA-sequencing techniques and in bioinformatics have made it possible to study millions of genomes simultaneously. This heralds the advent of the third genetic age, in which researchers are trying to understand how thousands of genes function together. The best way to do that is to examine the genomes of as many people as possible.
These scientific revolutions have led to significant medical achievements: Treatments that bordered on science fiction just a few decades ago are now available for some diseases. However, many of these accomplishments deal only with understanding the “instruction manual” of the cells: the three billion letters of DNA that recur in almost identical order in every cell in our body.
At the same time, every cell uses (or “expresses,” in scientific terminology) different genes for its functioning – whether in a normal state or a diseased one – in accordance with the regulatory codes that determine where, when and how each gene is expressed. For example, a mutation (a change in the DNA sequence) that leads to the development of cystic fibrosis or to a higher probability of developing Alzheimer’s disease will exist in the genome of every cell in the body, but will tend to be expressed only in certain types of cells. Accordingly, even if a disease affects multiple organs, understanding its cause requires the identification of the specific cells in which the mutation has begun to have an effect. For this reason, in order to understand how changes in genes cause diseases to develop, we need to know what cell types the body contains and what their attributes are.
The problem is that today science does not possess full knowledge of all the cells in the body, or of their function and role. Presently, researchers from around the world, including Israelis both at American universities and working here in Israel, are attempting to fill this lacuna. The Human Cell Atlas project aims to construct a full map of all the cells of the healthy human body and to examine how the different cells communicate with each other, in an attempt to create a point of comparison for all future research into human diseases. Knowledge of all the types of cells – the fundamental units of life – will make possible an additional revolution in understanding human health and in identifying, monitoring and curing disease.
Israeli-born Dana Pe’er, a computational biologist from the Sloan Kettering Institute in New York (the largest and oldest private cancer research institute in the world), has been involved in the project from its first stages and is responsible for some of the computational instruments that make it possible for scientists to set themselves the ambitious goal of mapping all the existing cell types in the body. Speaking by phone from New York, the professor explained that all the body’s cells contain an almost identical genome. But what’s important for understanding the biology is what happens in a particular cell – how it uses the same instruction manual for different purposes.
“Less than 1 percent of the genome codes genes,” noted Pe’er. “The overwhelming majority of the DNA codes the regulation of the genes – determines which gene will operate where.”
As such, each cell has a regulatory program that determines its response to different situations, and each cell is organized in relation to other cells.
“The cells are not just neighbors, they exist in a particular spatial arrangement,” she explained. “The different types of cells cooperate in order to create tissues.” It follows, she added, “that to understand diseases, it is necessary to understand where the malfunction took place, what went wrong and how the damaged cell is connected to the other cells, to the tissues and to the functioning of all the organs.”
However, until the past few years, science’s ability to differentiate between types of cells was limited to morphological analysis: to drawing conclusions about their functioning on the basis of their form only. According to Aviv Regev of the Massachusetts Institute of Technology, who is one of the founders of the Human Cell Atlas project and oversees the research being conducted within its framework, a set of technologies known as single-cell genomics has since emerged from the work of several labs, including hers.
“Personally,” the biology professor added, “it became clear to me around 2014 that an atlas was possible – there were examples of these technologies being used to find cell types, states and dynamic transitions; you could trace their development; you could even map their positions in tissue.”
The Israeli-born scholar relates that she and her colleagues started working on advances that would make such mapping possible on a larger scale. She also started floating the idea within the scientific community. At the same time, other researchers became aware of the potential latent in accelerating a project involving the large-scale mapping of cell types.
In 2016, Regev, together with Dr. Sarah Teichmann from the Wellcome Sanger Institute in Britain, organized a first conference of about 90 scientists in London, with the aim of carving out a path for mapping the cells of the human body. An organizing committee of 24 members was set up – including Regev, Pe’er, Teichmann and Prof. Sten Linnarsson from the Karolinska Institute in Sweden – and spearheaded the planning process for an entire year. The full project was then launched with the presentation of a white paper at the Weizmann Institute of Science in Rehovot. At present, almost 1,500 scientists from 969 institutions in 64 countries are involved in the endeavor.
Esti Yeger-Lotem, from Ben-Gurion University in Be’er Sheva and from the National Institute for Biotechnology in the Negev – whose lab is taking part in the undertaking – explained that in order to differentiate between types of cells based on their functions, many researchers are focusing on RNA molecules, which are chemically similar to DNA. One of these molecules’ tasks is to copy the “operating instructions” in the genes and carry them to ribosomes, which build proteins based on them. RNA molecules also play roles related to regulating the cell’s activity, thus dictating the cell’s development and functioning.
Until a few years ago, research of this kind would not have been possible. With the old methods, scientists could study RNA from whole tissues and come up with an “average” of the molecules that determine expression of the genes in a large number of cells; accordingly, they could glean only a general understanding of the cells’ function in a particular tissue.
“If you think of the cells as pieces of fruit, this is similar to making a fruit smoothie: What you see in the end is an average of all of them, but not any one fruit in particular. So, if for example some kind of cell is very rare, you may not be able to realize it at all,” Regev, Teichmann and Linnarsson wrote to Haaretz.
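The smoothie effect can be illustrated with a few lines of arithmetic. The sketch below is only an illustration under made-up assumptions: the cell counts, the expression values and the cutoff are all hypothetical and are not taken from the project's data.

```python
from statistics import mean

# Hypothetical expression of one gene (arbitrary units) in a tissue sample:
# 990 common cells barely express it, 10 rare cells express it strongly.
common_cells = [1.0] * 990
rare_cells = [200.0] * 10
tissue = common_cells + rare_cells

# Bulk measurement: a single number for the whole tissue (the "smoothie").
bulk_average = mean(tissue)

# Single-cell measurement: one number per cell, so the rare cells stay visible.
threshold = 50.0  # illustrative cutoff, not a real assay parameter
high_expressers = [value for value in tissue if value > threshold]

print(f"bulk average: {bulk_average:.2f}")
print(f"cells above threshold: {len(high_expressers)} of {len(tissue)}")
```

In this toy sample the bulk average sits close to the level of the common cells, while the per-cell view still shows the ten strong expressers.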
The first important scientific development, noted Prof. Yeger-Lotem, was the ability to mark the RNA of individual cells and, in that way, to differentiate between the RNA of the various cells when all the genetic material is sequenced. The improvement of this method in recent years now makes it possible for researchers to measure gene activity in hundreds of thousands of cells simultaneously. However, because the importance of any given cell derives in part from its location in relation to others, scientists are now working on ways to improve the measurement of gene functioning in the tissue itself (rather than in cells removed from it).
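The idea of marking each cell's RNA can be sketched as a simple bookkeeping exercise. The example below assumes, purely for illustration, that every sequenced read arrives as a pair of a cell marker (here a short made-up "barcode" string) and a gene name; the barcodes, the read list and the placeholder gene names GENE_X and GENE_Y are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sequenced reads: (cell_marker, gene) pairs from pooled material.
reads = [
    ("AAAC", "CFTR"), ("AAAC", "CFTR"), ("AAAC", "GENE_X"),
    ("GGTT", "GENE_Y"), ("GGTT", "GENE_Y"),
    ("AAAC", "CFTR"), ("GGTT", "CFTR"),
]

def count_per_cell(reads):
    """Group reads by their cell marker and count reads per gene in each cell."""
    counts = defaultdict(Counter)
    for marker, gene in reads:
        counts[marker][gene] += 1
    return counts

counts = count_per_cell(reads)

# Without the markers, only the pooled totals would be available.
pooled = Counter(gene for _, gene in reads)

for marker, genes in sorted(counts.items()):
    print(marker, dict(genes))
print("pooled:", dict(pooled))
```

Because the marker travels with each read, everything can be sequenced together and sorted back into individual cells afterward.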
“Importantly,” Regev, Teichmann and Linnarsson wrote, “these new methods can measure many genes simultaneously, so that we can generate integrated maps of all cell types at once.”
The new genetic research methods are generating vast quantities of complex, tangled data, which include flaws and errors. Hence the prime importance of the computational biologists in the project. As the three scientists explained, “Computational methods have been developed to preprocess and minimize artifacts [i.e., mistakes], and to integrate and reveal the underlying structure of human cell types. Computational algorithms are needed in order to group cells into their types, in order to group genes into the ‘programs’ that they use, and in order to trace back the developmental pathways of cells from high-throughput single cell genomics data.”
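To give a sense of what "grouping cells into their types" means computationally, here is a minimal sketch that uses k-means clustering on made-up data. K-means is chosen only because it is short and well known; it is a stand-in, not a description of the algorithms the project's researchers use, and the two-gene expression profiles below are hypothetical.

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Tiny k-means: returns a cluster label per point and the cluster centers."""
    # Deterministic start (an assumption of this sketch): first k points.
    centers = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each cell joins its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its member cells.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

# Hypothetical cells, each described by the activity of two genes.
cells = [(9.0, 1.0), (1.0, 8.0), (8.5, 0.5), (0.5, 9.0), (9.5, 1.5), (1.5, 8.5)]
labels, centers = kmeans(cells, k=2)
print(labels)
print(centers)
```

Real single-cell data involve thousands of genes per cell and considerable measurement noise, which is why the preprocessing the scientists describe comes before any grouping step.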
The first draft of the Human Cell Atlas, they added, “will profile 100 million cells from major tissues and systems, from healthy research participants of both genders. This first draft and the lessons learned in building it will then serve as the basis for a comprehensive atlas of at least 10 billion cells, covering all tissues, organs and systems – the necessary reference for future comparison and biological insight across disease areas, genetic diversity, environments and ages.”
The project is currently supported by the Wellcome Trust research charity in the U.K., the National Institutes of Health in the United States, the European Union, the Manton Foundation, the Helmsley Charitable Trust and the Chan Zuckerberg Initiative. “We aim to complete a first draft in the first five years, and anticipate that it would take about a decade for a comprehensive atlas,” the researchers noted.
The project’s leaders are convinced that even before its completion, the research being conducted will lead to scientific and medical breakthroughs. In fact, the researchers involved in compiling the atlas have already discovered several new cell types of considerable medical importance. One cell, very rare in the human body, that was discovered in the lab of Prof. Regev and Dr. Jayaraj Rajagopal from Massachusetts General Hospital, has been dubbed the “pulmonary ionocyte.”
“Strikingly,” the researchers wrote, “these ionocytes expressed the gene CFTR at levels higher than any other cell type. CFTR is the gene which, when mutated, causes cystic fibrosis in humans. CFTR is critical for airway function, and for decades researchers and clinicians assumed that it is frequently expressed, but at low levels, in ciliated cells, a common cell type spread throughout the entire airway. But according to the new data, the majority of CFTR expression occurs in only a few cells that we didn’t even know existed until now.”
Cystic fibrosis – a hereditary disease that affects breathing and other systems – has been at the center of scientific studies for decades, and is exceptional in that it has been linked to a mutation in a single gene.
“We’re still discovering completely new biology that could alter the way we approach it,” Regev, Teichmann and Linnarsson added. “The results of that work may have implications for developing targeted cystic fibrosis therapies. For example, a gene therapy that corrects for a mutation in CFTR would need to be delivered to the right cells, and a cell atlas of the tissue could provide a reference map to guide that process.”
Another breakthrough deriving from the use of the new cell-type identification techniques was made in the lab of Ido Amit at the Weizmann Institute. Prof. Amit, a member of the project’s organizing committee, has been involved from the outset. He and his colleagues discovered that there are cells in the immune system, called microglia, that are activated in the brain in response to the presence of protein clusters (called plaques) and are responsible for removing the debris in the brain, whose accumulation is associated with the development of Alzheimer’s. The discovery creates new possibilities for preventing and treating Alzheimer’s, such as by controlling the activity of the microglia in order to slow down the spread of the disease.
A third example in this regard consists of breakthroughs in identifying types of cells involved in the development of cancerous tumors. The malignant tumors are composed of different types of cells – cancerous cells that are differentiated by their mutant genomes and by the genes they express. Along with noncancerous cells, such as those of the immune system, connective tissues, blood vessels and so on, they create the tumor’s microenvironment. Recent studies have shown that all these cells, both cancerous and noncancerous, play a key role in the tumor’s growth, and new immunotherapy treatments have shown that attacking them can benefit cancer patients.
Many patients do not respond to immunotherapy, however. One reason for this was discovered in research in which Aviv Regev participated. The study detected a distinct subtype of cancerous cell that keeps cells of the immune system away from the tumor. Regev’s lab, with the participation of researchers from the Dana-Farber Cancer Institute at Harvard, drew on computational methods to identify a drug treatment that would overcome these cells. Regev relates that the new method worked in cancerous tumors in a culture and in mice, and that scientists are now designing a clinical trial of the treatment in humans.
Similarly, the three researchers reported, studies conducted by partners in the Human Cell Atlas project (among them, Profs. Pe’er and Amit) discovered previously unknown attributes in T cells of the immune system in cancer patients.
The ambition, scope and importance of the Human Cell Atlas project invite comparison with the previous big undertaking of biological research: the mapping of the human genome. Discussion about the mapping of all three billion letters of human DNA began in the mid-1980s, the federally funded project got underway officially in 1990, and the first draft of the complete genome was presented in 2001. The entire project cost $3 billion; the participants came from 20 universities and research institutes in the United States, Britain, Japan, France, Germany and China. The mapping of the genome also encountered competition between public research and private business, with scientist and entrepreneur J. Craig Venter announcing a parallel project at one-tenth of the cost. Ultimately, the results of the public and the private initiatives were presented together.
The goal of both the genome and the atlas projects is to gain an understanding of vital aspects of human biology: our genes and our cells. There are differences between the two projects, however, beginning with their organizational structures. The earlier endeavor was initiated at the institutional level, centrally managed and carried out in a small number of labs, relative to the scale of the undertaking. In contrast, the Human Cell Atlas project originated from below, from the researchers, and is dispersed among a large number of labs and scientists around the world.
This democratic approach to scientific knowledge, reflected as well in the open-source nature of all the information accumulated within the project’s framework, is intended to make possible a large range of studies.
“From the beginning, we have designed this project as a public good to enable science around the world,” Regev, Teichmann and Linnarsson noted. “The project will help propel translational discoveries and applications, ultimately laying a foundation for a new era of precision and regenerative medicine.”
Dana Pe’er emphasizes the importance of the participation in the project of labs in African and South American countries – allowing it both to encompass the full range of human biology (unlike the genome project, which focused on the DNA of white Europeans), and to encourage scientific research in those countries. In addition, she observed, one of the project’s basic goals is to integrate clinical physicians into the research. “When we create a map of cells in the lung, it needs to be applicable, and therefore a lung expert also needs to be involved in constructing it,” she explained.
However, Prof. Pe’er is apprehensive that the initiative from below and its decentralized nature also constitute the greatest challenges to the project’s completion. “Many front-rank scientists are involved in it, and there is agreement on its necessity and importance,” she says. “But the NIH and the EU don’t yet know how to swallow a project like this whose initiative came from below.”
According to Aviv Regev, an international project of this breadth must necessarily rely on a range of donors. The next three to five years of the project, until the completion of its first stage, will largely be funded with the aid of most of the above-mentioned institutions and foundations.
In Regev’s view, the scientific aspect remains the study’s greatest challenge. “The adult human body has 37.2 trillion cells,” she wrote. “However, cells of the same ‘type’ will largely look very similar to each other.” In order to succeed, systematic research must be conducted on the large and highly complex organs of the human body. “We need to sample enough to assume we have seen enough representative examples,” she added.
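A rough sense of what "sampling enough" means can be had from a simple probability calculation. The sketch below assumes that cells are sampled independently and at random, which real tissue sampling does not guarantee, and the frequency and confidence level are hypothetical values chosen for illustration. Under that assumption, the chance of seeing at least one cell of a type with frequency f among n sampled cells is 1 - (1 - f)^n.

```python
from math import ceil, log

def chance_of_seeing(frequency, n):
    """Probability that n random cells include at least one of the rare type."""
    return 1 - (1 - frequency) ** n

def cells_needed(frequency, confidence):
    """Smallest n for which chance_of_seeing(frequency, n) >= confidence."""
    return ceil(log(1 - confidence) / log(1 - frequency))

# Hypothetical rare cell type: 1 in 1,000 cells; target: 95 percent chance
# of catching at least one such cell.
n = cells_needed(frequency=0.001, confidence=0.95)
print(n, round(chance_of_seeing(0.001, n), 4))
```

Seeing a single cell is a low bar; recognizing a new type reliably requires many examples of it, so the numbers needed in practice are larger than this estimate suggests.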
How many types of cells do they expect to discover? “No one knows today how many types of cells there are,” Regev explained. She recalls that during the mapping of the human genome, some scientists estimated that DNA included hundreds of thousands of different genes, but after the mapping was completed it turned out that there are only a few tens of thousands. Scientists are still arguing about the issue, and therefore, she said, “people are afraid to specify a number.”