Scientists can now assemble entire genomes on their personal computers in minutes

Scientists at the Massachusetts Institute of Technology (MIT) and the Institut Pasteur in France have developed a technique for reconstructing whole genomes, including the human genome, on a personal computer. This technique is about a hundred times faster than current state-of-the-art approaches and uses one-fifth the resources. Described in a study published September 14 in the journal Cell Systems, it represents genome data more compactly, an idea inspired by the way in which words, rather than letters, offer condensed building blocks for language models.

“We can quickly assemble entire genomes and metagenomes, including microbial genomes, on a modest laptop computer,” says Bonnie Berger, the Simons Professor of Mathematics at the Computer Science and AI Lab at MIT and an author of the study. “This ability is essential in assessing changes in the gut microbiome linked to disease and bacterial infections, such as sepsis, so that we can more rapidly treat them and save lives.”

[…]

To approach genome assembly more efficiently than current techniques, which involve comparing all possible pairs of reads, Berger and colleagues turned to language models. Building from the concept of a de Bruijn graph, a simple, efficient data structure used for genome assembly, the researchers developed a minimizer-space de Bruijn graph (mdBG), which uses short sequences of nucleotides called minimizers instead of single nucleotides.
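To make the idea concrete, here is a minimal Python sketch of a de Bruijn graph built over minimizers rather than single nucleotides. It is an illustration under stated assumptions, not the researchers' implementation: the function names, the hash-threshold rule for choosing minimizers, and the parameters `l` (minimizer length), `k` (minimizers per node) and `density` (fraction of l-mers kept) are all hypothetical choices made for this example, and the sketch ignores reverse complements and sequencing errors.

```python
import hashlib
from collections import defaultdict


def is_minimizer(lmer: str, density: float) -> bool:
    """Assumed selection rule: keep an l-mer when its hash falls in the
    lowest `density` fraction of the 64-bit hash range."""
    digest = hashlib.sha256(lmer.encode()).digest()
    value = int.from_bytes(digest[:8], "big")
    return value < density * 2**64


def to_minimizer_space(read: str, l: int, density: float) -> list[str]:
    """Rewrite a read as the ordered list of its selected l-mers."""
    lmers = (read[i:i + l] for i in range(len(read) - l + 1))
    return [lmer for lmer in lmers if is_minimizer(lmer, density)]


def build_mdbg(reads: list[str], l: int, k: int, density: float) -> dict:
    """Build a de Bruijn graph whose nodes are tuples of k consecutive
    minimizers; an edge links two nodes that are adjacent in some read."""
    graph = defaultdict(set)
    for read in reads:
        minimizers = to_minimizer_space(read, l, density)
        nodes = [tuple(minimizers[i:i + k])
                 for i in range(len(minimizers) - k + 1)]
        for left, right in zip(nodes, nodes[1:]):
            graph[left].add(right)
    return graph


# Hypothetical toy input; real reads are thousands of bases long.
toy_reads = ["ACGTACGGTCAGGCTAACGTTAGC", "GGTCAGGCTAACGTTAGCCATG"]
toy_graph = build_mdbg(toy_reads, l=4, k=2, density=0.5)
```

Because only the selected l-mers are stored, lowering `density` shrinks the graph, which is the intuition behind keeping "only a small fraction of the total nucleotides" while still preserving the order in which the minimizers occur.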

“Our minimizer-space de Bruijn graphs store only a small fraction of the total nucleotides, while preserving the overall genome structure, enabling them to be orders of magnitude more efficient than classical de Bruijn graphs,” says Berger.

[…]

Berger and colleagues used their method to construct an index for a collection of 661,406 bacterial genomes, the largest collection of its kind to date. They found that the novel technique could search the entire collection for antimicrobial resistance genes in 13 minutes—a process that took 7 hours using standard sequence alignment.

[…]

“We can also handle sequencing data with up to 4% error rates,” adds Berger. “With long-read sequencers with differing error rates rapidly dropping in price, this ability opens the door to the democratization of sequencing data analysis.”

Berger notes that while the method currently performs best when processing PacBio HiFi reads, which fall well below a 1% error rate, it may soon be compatible with ultra-long reads from Oxford Nanopore, which currently have error rates of 5-12% but may soon be available at 4%.

[…]

Source: Scientists can now assemble entire genomes on their personal computers in minutes

Simple Mathematical Law Predicts Movement in Cities around the World

The people who happen to be in a city center at any given moment may seem like a random collection of individuals. But new research featuring a simple mathematical law shows that urban travel patterns worldwide are, in fact, remarkably predictable regardless of location—an insight that could enhance models of disease spread and help to optimize city planning.

Studying anonymized cell-phone data, researchers discovered what is known as an inverse square relation between the number of people in a given urban location and the distance they traveled to get there, as well as how frequently they made the trip. It may seem intuitive that people visit nearby locations frequently and distant ones less so, but the newly discovered relation puts the concept into specific numerical terms. It accurately predicts, for instance, that the number of people coming from two kilometers away five times per week will be the same as the number coming from five kilometers twice a week. The researchers’ new visitation law, and a versatile model of individuals’ movements within cities based on it, were reported in Nature.
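The arithmetic behind that example can be checked in a few lines of Python. This sketch assumes the law takes the form N ∝ 1/(r × f)², where r is travel distance and f is visit frequency; the function name is hypothetical, and the location-specific proportionality constant, which the article does not give, is simply set to 1, so the outputs are relative rather than actual visitor counts.

```python
def relative_visitors(distance_km: float, visits_per_week: float) -> float:
    """Relative visitor count under the assumed form N ∝ 1 / (r * f)^2.

    The proportionality constant is set to 1 for illustration only.
    """
    return 1.0 / (distance_km * visits_per_week) ** 2


# Both cases have the same product r * f = 10, so the law predicts
# the same number of visitors for each.
near_frequent = relative_visitors(distance_km=2, visits_per_week=5)
far_occasional = relative_visitors(distance_km=5, visits_per_week=2)
print(near_frequent == far_occasional)
```

What matters under this form is the product of distance and frequency: doubling either one, with the other held fixed, cuts the predicted number of visitors to a quarter.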

[…]

The researchers analyzed data from about eight million people between 2006 and 2013 in six urban locations: Boston, Singapore, Lisbon and Porto in Portugal, Dakar in Senegal, and Abidjan in Ivory Coast. Previous analyses have used cell-phone data to study individuals’ travel paths; this study focused instead on locations and examined how many people were visiting, from how far and how frequently. The researchers found that all the unique choices people make—from dropping kids at school to shopping or commuting—obey this inverse square law when considered in aggregate. “The result is very simple but quite startling,” says Geoffrey West, an urban scaling theorist at the Santa Fe Institute and one of the paper’s senior authors.

[…]

“Those organizational patterns have really profound implications on how COVID will spread,” Scarpino says. In a smaller rural location, where many people regularly go to the same church or grocery store, the entire town will experience sharp peaks of infections as the virus sweeps through the community. But in a bigger city, the propagation takes longer, he explains, because mini epidemics can occur in each neighborhood somewhat separately.

Stewart adds: “The authors demonstrate that their visitation law—that takes into account both travel distance and frequency of visits in a way that other models do not—outperforms gravity models when it comes to predicting flows between locations.”

Source: Simple Mathematical Law Predicts Movement in Cities around the World – Scientific American