This week's finds in genomics and beyond
Five items of interest as we get back to work in the new year
1. An expanded registry of regulatory DNA elements
One of the most profound findings of genomics is that a majority of the functional portion of our genome is devoted to regulatory sequence rather than genes that code for proteins. (This result was anticipated before the Human Genome Project.) Thus one of the major efforts in the field has been to catalog all of these regulatory DNA elements, which is difficult because they a) don’t follow a clear code like the genetic code for protein-coding genes and b) are often very cell type-specific and are hard to discover unless you make your measurements in the correct cell type. And the easiest, most scalable measurements we can make to detect regulatory DNA elements are only indirect measures of function, such as chromatin accessibility or epigenetic state. Thus making an inventory of all regulatory elements in the human genome is a tall order.
A new preprint reports an updated registry of candidate regulatory DNA elements for the human and mouse genomes. Based on new analyses of ENCODE data, the registry now lists 2.35 million human sequences that qualify as candidate regulatory elements. This is one of the most thorough and systematic analyses of regulatory elements across tissues.
There are a few notable features of the updated registry: More than half of these elements are not close to the transcription start sites of genes, meaning that they act at a distance to control gene expression. 90% of the candidate elements (identified by indirect measures of function) have been tested by some sort of direct functional assay (which helps one decide whether to move an element out of the “candidate” category). This to me is a surprisingly high number and a nice sign of progress. Also, this update includes a big set of silencers, repressive regulatory elements that have been under-studied and are not well understood.
The updated registry is a useful resource, but it’s important to keep some limitations in mind: regulatory elements from many cell types are still not represented because the measurements haven’t been done, and the assays that directly measure regulatory function are still heavily weighted towards manipulable cell lines, rather than primary cells. In other words, there is still substantial work to do before we achieve a complete inventory of the most abundant type of functional element in the human genome.
2. On the meaning of “post-genomic”
One of my goals for 2025 is to work out a better science media diet, now that science Twitter has fragmented into too many sub-communities to keep track of. My old ‘must-read’ bookmark folder has links to a bunch of dead blogs and publications that no longer publish anything that captures my interest. Trying to keep track of posts from all of the people I follow on Xitter, Bluesky, Substack notes, and LinkedIn usually leaves me disoriented.
Undark, an independent publication supported by MIT’s Knight Science Journalism Fellowship Program, is an outlet that has flown under my radar, but which looks fantastic. It’s going on my daily must-read list for 2025. A recent piece by Yale biologist C. Brandon Ogbunu asks “What does it mean to be in the ‘post-genomic’ age?” Ogbunu discusses the kinds of problems biologists should be thinking about in this era in which sequence data no longer seems to be rate-limiting.
In today’s world, post-genomic is built on two important ideas: that, as previously mentioned, doing genomics is easier than ever, and that genetic information is not enough. Post-genomic embodies a world where we can and should focus on the next big (or small) revolutionary ideas in the study of the biological world.
“Genetic information is not enough” — while it may seem obvious, it’s also a good prompt for creative ideas. If genetic information is not enough to understand your disease/system of interest, and if “doing genomics is easier than ever”, how should we frame our questions? Will the solutions require more large consortia, or are ‘omic technologies sufficiently democratized so that important advances will be most likely made in individual labs?
3. How would you explain heritability to a fifth grader?
Harvard statistical geneticist Sasha Gusev explains how to explain heritability, and much else, in an interesting interview with Awais Aftab, who writes the Substack Psychiatry at the Margins. Those of us working in genomics these days come from many different disciplinary backgrounds (I trained in biochemistry), and the grasp of key statistical genetic concepts in the broader community is uneven, to put it charitably. And yet those statistical concepts are critical to the work that the genomics community does. What does a GWAS association really mean and what mechanisms could produce one? What does it mean to say that there is ‘missing heritability’? I don’t think our current biomedical PhD programs do a great job teaching non-statistical geneticists these concepts.
Gusev also discusses some fascinating ongoing challenges in the field, so even if you are a statistical geneticist, this interview is worth checking out. And if you want more, Gusev writes the excellent Substack The Infinitesimal.
4. Great reading ideas for 2025
Aside from getting in shape, a reading plan seems to be one of the most common New Year’s resolutions. I enjoyed this list of 9 Reading Ideas for 2025 from Jared Henderson, who writes Commonplace Philosophy. Many of these ideas can be applied to scientific papers as well. For example: join a book (or journal) club, pick an author (or lab) and read the complete works, pick a topic and really master it. I like to read in stacks, such as a bunch of books on the scientific revolution or general relativity. I’m trying to do the same thing with papers now, rather than only haphazardly reading whatever just came out in Cell or bioRxiv.
How do you master a topic? There is more to it than just reading (take notes, organize the information in your mind, write about it, even if just for yourself). But to narrow the question to just reading, how many papers should you read to master a topic? The advice from my director of graduate studies to those of us prepping for our qualifying exams was good: Identify 20 critical papers and know them in-depth. Along with that, read another ~100 papers in the field more quickly. The process of identifying those 120 papers will already be a great start towards deeply understanding the field.
The final piece of advice from Henderson is to actually read the books you buy. (I admit I have a problem.) The same could be said for dozens or hundreds of the browser tabs many of us have open: just read the paper — or close the tab if you decide it’s not worth the time.
5. August Weismann is underrated in the history of biology
On that last piece of reading advice, sitting on my shelf is a door-stop biography of the pioneering 19th-century biologist August Weismann. I’m finally going to read it this year. When people think about 19th-century biology, Darwin gets an overwhelmingly disproportionate share of the attention, followed by Mendel as a distant second. Weismann deserves much more, especially given the major influence of genetics and developmental biology on today’s biological sciences. Weismann developed important ideas about heredity and development well before the 20th-century rediscovery of Mendelian genetics. He was also a superb experimentalist.
Here is the book summary from Harvard University Press:
The evolutionist Ernst Mayr considered August Weismann “one of the great biologists of all time.” Yet the man who formulated the germ plasm theory—that inheritance is transmitted solely through the nuclei of the egg and sperm cells—has not received an in-depth historical examination. August Weismann reintroduces readers to a towering figure in the life sciences. In this first full-length biography, Frederick Churchill situates Weismann in the swirling intellectual currents of his era and demonstrates how his work paved the way for the modern synthesis of genetics and evolution in the twentieth century.
In 1859 Darwin’s tantalizing new idea stirred up a great deal of activity and turmoil in the scientific world, to a large extent because the underlying biological mechanisms of evolution through natural selection had not yet been worked out. Weismann’s achievement was to unite natural history, embryology, and cell biology under the capacious dome of evolutionary theory. In his major work on the germ plasm (1892), which established the material basis of heredity in the “germ cells,” Weismann delivered a crushing blow to Lamarck’s concept of the inheritance of acquired traits.
In this deeply researched biography, Churchill explains the development of Weismann’s pioneering work based on cytology and embryology and opens up an expanded history of biology from 1859 to 1914. August Weismann is sure to become the definitive account of an extraordinary life and career.