DYSPEPSIA GENERATION

We have seen the future, and it sucks.

The Poetry Fan Who Taught an LLM to Read and Write DNA

15th February 2025

Read it.

DNA is often compared to a written language. The metaphor leaps out: Like letters of the alphabet, molecules (the nucleotide bases A, T, C and G, for adenine, thymine, cytosine and guanine) are arranged into sequences — words, paragraphs, chapters, perhaps — in every organism, from bacteria to humans. Like a language, they encode information. But humans can’t easily read or interpret these instructions for life. We cannot, at a glance, tell the difference between a DNA sequence that functions in an organism and a random string of A’s, T’s, C’s and G’s.

“It’s really hard for humans to understand biological sequence,” said the computer scientist Brian Hie, who heads the Laboratory of Evolutionary Design at Stanford University, based at the nonprofit Arc Institute. This was the impetus behind his new invention, named Evo: a genomic large language model (LLM), which he describes as ChatGPT for DNA.

 
