The genome’s ‘dark matter’
Repetitive DNA sequences once dismissed as junk may regulate how cells grow
More than two decades ago, scientists celebrated the completion of the Human Genome Project — an international endeavor to identify and sequence all human genes — as a triumph of modern biology. But there was a caveat: Roughly 7% of our genetic code was missing from the published sequence, as vast stretches of repetitive DNA called satellite sequences were too complex to decode with the technology of the time. Many researchers were unconcerned, as they suspected these sequences had little biological significance.
Now, a team led by Nicolas Altemose, PhD, an assistant professor of genetics, is revealing that these mysterious regions are far from junk DNA. They may be enormous control centers that help regulate one of the most fundamental processes in biology: how cells decide when to grow.
“Finally, technologies have come along that let us peer into these regions of the genome,” Altemose said. “It’s like charting the unknown and, eventually, it may have implications for human health.”

Photography by Greg Kahn/HHMI
The sequences are called satellite DNA because, in experiments that sort DNA fragments by density, they separate from the rest of the genome into bands of their own, like satellites orbiting a planet. The sequences are highly repetitive, with short DNA patterns repeated hundreds of thousands or even millions of times.
Early genetic sequencing technologies could read only about a thousand DNA letters at once, and each of these decoded thousand-letter snippets then had to be pieced back together in the right order. When the genome has unique sequences, scientists can figure out where each piece belongs by matching overlapping edges.
But satellite DNA is like a jigsaw puzzle made entirely of blue sky: millions of nearly identical pieces with no landmarks to distinguish them. For years, it was impossible to tell which repeat belonged where, or even how many repeats there were.
The breakthrough came with long-read DNA sequencing technology, which can process hundreds of thousands of DNA letters at once instead of merely a thousand. Altemose was part of the group of scientists who finally assembled these missing regions in 2022, completing the human genome in full.
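The jigsaw problem can be illustrated with a toy sketch in Python. Everything in it is a made-up assumption for illustration: the 35-letter "genome", the single-letter variant placed in one repeat copy, and the read lengths, which are scaled down enormously. It is not the assembly method the researchers used. It only shows why a short piece of a repeat fits in many places, while a longer piece that happens to span a rare difference fits in just one.

```python
def count_placements(read, genome):
    """Count the positions where `read` matches `genome` exactly."""
    k = len(read)
    return sum(1 for i in range(len(genome) - k + 1) if genome[i:i + k] == read)


# Hypothetical toy satellite: seven copies of a 5-letter repeat,
# with one copy (the fourth) carrying a single-letter variant (T -> G).
genome = "CATTC" * 3 + "CATGC" + "CATTC" * 3

# A "short read" taken from inside the repeat: it fits in several places.
short_read = "CATTCCAT"

# A "long read" that spans the variant copy: it fits in only one place.
long_read = genome[8:28]

print("short read placements:", count_placements(short_read, genome))
print("long read placements:", count_placements(long_read, genome))
```

In this toy case the short read has five possible placements and the long read has one, because only the long read contains the lone distinguishing letter.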
“Seeing the complete genome assembly for the first time and seeing how these things were organized after decades of not really knowing what was in there was just so exciting,” Altemose said.
But it was also the beginning of trying to figure out why the satellite sequences existed.
Altemose, who launched his Stanford Medicine lab in 2023, and his team decided to search for proteins that might bind to HSat3 — a type of satellite DNA that is found in large blocks on many chromosomes, including the largest single satellite region in the human genome, which spans a staggering 28 million DNA letters (the length of about a thousand average genes).
Using computational analysis, Altemose’s lab group scanned HSat3 for matches to known protein binding sites. Their hunt turned up thousands of spots that matched the precise sequences used as docking stations for transcription factors, proteins that normally attach to the genome to switch nearby genes on or off. But HSat3, which consists of enormously long runs of repeats, doesn’t have any nearby genes.
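A scan of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lab's actual pipeline: the function names are hypothetical, the test sequence is invented, and the six-letter motif `GGAATG` is used only as an example of a short binding-site pattern. Real analyses typically score approximate matches rather than requiring exact ones.

```python
def reverse_complement(seq):
    """Return the sequence of the opposite DNA strand, read 5' to 3'."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_motif_hits(sequence, motif):
    """List (position, strand) for each exact motif match on either strand."""
    hits = []
    k = len(motif)
    motif_rc = reverse_complement(motif)
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window == motif:
            hits.append((i, "+"))
        elif window == motif_rc:
            # A match to the reverse complement is a site on the other strand.
            hits.append((i, "-"))
    return hits


# Invented example sequence and illustrative motif (both are assumptions).
example_sequence = "ACGTGGAATGTTTCATTCCA"
example_motif = "GGAATG"

print(find_motif_hits(example_sequence, example_motif))
```

Checking both strands matters because a protein can dock on either one, so a pattern read backward on the complementary strand counts as a site too.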
“It was pretty surprising and almost, I would say, bizarre,” Altemose said. “Why would these transcription factors be localizing in such high numbers to these areas of the genome?”
A hidden switch for growth
His team focused on one of the transcription factors, TEAD, which is known to help control genes related to cell growth and proliferation. Using microscopy, the researchers watched fluorescently tagged TEAD accumulate on HSat3 regions inside living cells. Another protein, YAP, accumulated alongside TEAD. More surprisingly, the clusters of protein and HSat3 DNA were found inside the cells’ nucleoli, compartments where ribosomes are made.
The location gave Altemose’s team a clue about what the HSat3-associated TEAD might be doing. Ribosomes are the molecular machines that manufacture all proteins in the cell. Making ribosomes is a rate-limiting step for cell growth; cells cannot grow faster than they can build these molecular factories. Meanwhile, TEAD is part of a group of molecules famous for sensing environmental cues — nutrient availability, cell crowding, mechanical forces — to decide whether cells should proliferate.
Altemose and his colleagues suspected TEAD could be directly controlling ribosome factories in the nucleoli as part of its role in mediating cell growth and division. To test this, they repressed HSat3. TEAD was no longer found in the nucleoli, and the production of new ribosomes plummeted. By concentrating TEAD and YAP right where ribosomes are being made, the scientists hypothesized, HSat3 helps cells make new ribosomes more quickly when they sense the right conditions.
“We think that HSat3 is acting almost like a sponge, soaking up TEAD and concentrating it inside the nucleolus where it can directly regulate ribosome production,” Altemose said. “It’s an incredibly novel discovery.”
The discovery may eventually have medical implications. Cancer cells are notoriously dependent on cranking out ribosomes to fuel their rapid division, so targeting the interaction between TEAD and HSat3 could be one way to slow their growth.
Altemose thinks the new findings are one small step toward fully understanding satellite DNA. His team plans to follow up on other transcription factors beyond TEAD that are predicted to bind to HSat3. Each may represent a separate regulatory function of the repetitive DNA sequences.
Altemose’s lab is also focusing on the repetitive regions of DNA found in centromeres — structures in each chromosome that ensure chromosomes are properly inherited when cells divide. The group has already shown how patterns of methyl chemical groups added to these repetitive regions can impact cell growth and division.
“This is all completely open-ended, brand-new biology,” Altemose said. “We don’t really know where it will lead, but we finally have the tools to do the hard work and study these regions of the genome that have been neglected for so long.”
Spotlight on Nicolas Altemose
Assistant professor of genetics
The Altemose Lab applies new tools and technologies to explore the biology of repetitive areas of the human genome.
- Grew up in Hawaii and Southern California.
- Enjoyed graduate school so much that he completed two back-to-back PhD programs, first in statistics at Oxford and then in bioengineering at the University of California, Berkeley and UC San Francisco.
- Has applied for three patents for technologies used to map the interactions between proteins and DNA.
- Loves RuPaul’s Drag Race, Björk and ice cream.
In his words: “Some of the most important future scientific discoveries are currently hidden in unexpected places. That’s why we’re following our curiosity to explore the dark corners of the genome. Who knows what else we might find?”