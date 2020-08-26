Genetic data show how a single superspreading event sent coronavirus across Massachusetts — and the nation

None of the biotech executives at the meeting noticed the uninvited guest. They had flown to Boston from across the globe for the annual leadership meeting of the drug company Biogen, and they were busy catching up with colleagues and hobnobbing with upper management. For two days they shook hands, kissed cheeks, passed each other the salad tongs at the hotel buffet, never realizing that one among their number carried the coronavirus in their lungs.

By the meeting's end on Feb. 27, the infection had infiltrated many more people: a research director, a photographer, the general manager for the company's east division. They took the virus home with them to the Boston suburbs, Indiana and North Carolina, to Slovakia, Australia and Singapore.

Over the following two weeks, the virus that circulated among conference attendees was implicated in at least 35 new cases. In April, the same distinctive viral sub-strain swirled through two Boston homeless shelters, where it infected 122 residents.

Scientists know all this thanks to a mistake made during the coronavirus's replication process — a simple switch of two letters in the virus's 30,000-character genetic code. This mutation appeared in two elderly patients in France at almost exactly the same time that genetically matching viruses were sickening dozens of people at the Biogen meeting. After the conference, each time the infection spread, the mutation spread with it.

Now, a sweeping study of nearly 800 coronavirus genomes, conducted by no fewer than 54 researchers at the Broad Institute, Massachusetts General Hospital, the Massachusetts Department of Public Health and several other institutions in the state, has found that viruses carrying the conference's characteristic mutation infected hundreds of people in the Boston area, as well as victims from Alaska to Senegal to Luxembourg. As of mid-July, the variant had been found in about one-third of the cases sequenced in Massachusetts and 3 percent of all genomes studied thus far in the United States.

The study, which was added Tuesday to the preprint website medRxiv, is probably the largest genomic analysis of any U.S. outbreak so far and is among the most detailed looks at how coronavirus cases exploded in the pandemic's first wave.

It documents the cost of the world's naivete this spring, when people traveling for events like the Biogen conference unwittingly imported the virus into Massachusetts dozens of times. It reveals the connections between seemingly disparate communities, showing how an outbreak at a gathering of wealthy executives was only a few infections removed from sickening some of Boston's most vulnerable residents. It highlights the outsize role of indoor "superspreading events" in accelerating and sustaining transmission. With genetic data, said co-author Bronwyn MacInnis, "a record of our poor decisions is being captured in a whole new way."

Although the study must undergo the rigors of peer review before it is published in a scientific journal, both outside experts and the scientists involved say it shows the power and promise of an emerging field of research known as genomic epidemiology. The small mutations that accumulate in a virus's genome are like genetic bar codes; by tracking them, researchers can trace infections to their sources and develop more effective interventions to stop the disease.

"This is the kind of study that … defines why genomics can be so useful in outbreak reconstruction," said Vaughn Cooper, a microbiologist at the University of Pittsburgh who was not involved in the Boston research. "It reflects a great deal of coordinating work, and that's what in part makes this so powerful."

But if the new research shows the powerful potential of genomic surveillance to unveil the path of the virus through communities, it is also an exception for the sheer volume of data it contains. In the United States, such sophisticated genetic tracking has been "patchy, typically passive, reactive, uncoordinated, and underfunded," experts at the National Academies of Sciences, Engineering and Medicine wrote in a lengthy report last month. Advocates for the cutting-edge technique say more coordinated and comprehensive sequencing efforts could dramatically improve contact tracing and infection control.

As the nation flounders ahead of a possible second wave of infections, the study serves as both a portent and an opportunity, MacInnis said. The virus's genome may continue to record the consequences of the nation's failures — the too-large gatherings and too-fast reopenings, the testing shortages and lack of protective equipment, and the silent spread.