Researchers develop program to read any genome sequence and decipher its genetic code – shows underlying evolutionary forces

Yekaterina “Kate” Shulgina was a first year student in the Graduate School of Arts and Sciences, looking for a short computational biology project so she could check the requirement off her program in systems biology. She wondered how genetic code, once thought to be universal, could evolve and change.

That was 2016 and today Shulgina has come out the other end of that short-term project with a way to decipher this genetic mystery. She describes it in a new paper in the journal eLife with Harvard biologist Sean Eddy.

The report details a new computer program that can read the of any organism and then determine its genetic code. The program, called Codetta, has the potential to help scientists expand their understanding of how the genetic code evolves and correctly interpret the genetic code of newly sequenced .

“This in it of itself is a very fundamental biology question,” said Shulgina, who does her graduate research in Eddy’s Lab.

The genetic code is the set of rules that tells the cells how to interpret the three-letter combinations of nucleotides into proteins, often referred to as the building blocks of life. Almost every organism, from E. coli to humans, uses the same genetic code. It’s why the code was once thought to be set in stone. But scientists have discovered a handful of outliers—organisms that use alternative genetic codes—exist where the set of instructions are different.

This is where Codetta can shine. The program can help to identify more organisms that use these alternative genetic codes, helping shed new light on how genetic codes can even change in the first place.

“Understanding how this happened would help us reconcile why we originally thought this was impossible… and how these really fundamental processes actually work,” Shulgina said.

Already, Codetta has analyzed the genome sequences of over 250,000 bacteria and other called archaea for alternative genetic codes, and has identified five that have never been seen. In all five cases, the code for the amino acid arginine was reassigned to a different amino acid. It’s believed to mark the first-time scientists have seen this swap in bacteria and could hint at evolutionary forces that go into altering the genetic code.


Source: Researchers develop program to read any genome sequence and decipher its genetic code

Organisational Structures | Technology and Science | Military, IT and Lifestyle consultancy | Social, Broadcast & Cross Media | Flying aircraft