Apr 2025
Volume 41, Issue 4, p261-358, e1-e2
Large language models trained on DNA sequences, also known as genomic
language models (gLMs), hold significant potential to advance our understanding of
genomes and the interactions between DNA elements that drive complex functions. In
this issue, Benegas et al. review key opportunities and challenges for gLMs, outlining
important considerations for their development and evaluation to benefit the genomics
community. In this image, the two binary strings correspond to reverse-complementary
DNA sequences (00 = A, 01 = C, 10 = G, and 11 = T). The connecting rectangles
represent “embeddings” learned by gLMs. Illustration by Yun S. Song.
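
The caption's 2-bit encoding has a convenient property: complementing a base (A↔T, C↔G) is just a bitwise flip (00↔11, 01↔10). The following minimal Python sketch, not part of the article, illustrates the encoding and the reverse-complement relationship between the two binary strings; the example sequence is hypothetical.

```python
# Illustrative sketch of the 2-bit DNA encoding from the cover caption:
# 00 = A, 01 = C, 10 = G, 11 = T.
ENCODE = {"A": "00", "C": "01", "G": "10", "T": "11"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def to_bits(seq: str) -> str:
    """Encode a DNA sequence as a binary string, two bits per base."""
    return "".join(ENCODE[base] for base in seq)

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

if __name__ == "__main__":
    seq = "GATTACA"                 # hypothetical example sequence
    rc = reverse_complement(seq)    # -> "TGTAATC"
    print(to_bits(seq))             # 10001111000100
    print(to_bits(rc))              # 11101100001101
    # Because complementation is a bitwise NOT under this encoding, the two
    # binary strings are base-reversed, bit-flipped versions of each other.
```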