New Algorithms Developed to Attribute Origins of Genetically Engineered DNA
altLabs and the Johns Hopkins Center for Health Security Announce Results of Genetic Engineering Attribution Challenge 2020
January 26, 2021 – Every day, genetic engineering techniques are used to solve critical challenges in agriculture, manufacturing, and medicine. However, as the power of genetic engineering increases, so too does the potential for serious negative consequences if the technology is misused. The origins of a genetically engineered product are often hard to trace, which makes it difficult to ensure its creators receive due credit or are held accountable. Better tools are needed to advance our collective ability to connect the products of genetic engineering to their designers, a process known as genetic engineering attribution, in order to support the responsible development of biotechnology.
To advance genetic engineering attribution, altLabs, in partnership with the Johns Hopkins Center for Health Security, the Johns Hopkins University Applied Physics Laboratory, and the iGEM Safety and Security Program, sponsored the Genetic Engineering Attribution Challenge on the DrivenData competition platform, offering monetary prizes for algorithms that could accurately predict the origin of genetically engineered DNA sequences. More than 300 teams from around the world participated in the competition, and prizes were awarded to 6 winning teams. Given 10 guesses for each sequence, the best teams were able to predict the source lab of an unfamiliar plasmid DNA sequence almost 95% of the time, a marked improvement over the top published score of 85%. These results demonstrate the potential for new machine learning approaches to further improve existing tools and solve the challenges associated with genetic engineering attribution.
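For readers unfamiliar with the "10 guesses" framing, the score can be read as a top-10 accuracy: a prediction counts as correct if the true source lab is among the 10 labs the algorithm ranks highest. The sketch below is an illustration under that assumption, not the competition's official scoring code; the function name `top_k_accuracy` and the toy scores are hypothetical.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    true_labs: Sequence[int],
    k: int = 10,
) -> float:
    """Fraction of sequences whose true lab is among the k top-scored labs.

    scores[i][j] is the model's score for lab j on sequence i (hypothetical layout).
    true_labs[i] is the index of the lab that actually produced sequence i.
    """
    if len(scores) != len(true_labs):
        raise ValueError("scores and true_labs must have the same length")
    if not true_labs:
        raise ValueError("at least one sequence is required")

    hits = 0
    for row, truth in zip(scores, true_labs):
        # Rank lab indices from highest to lowest score and keep the first k.
        ranked = sorted(range(len(row)), key=lambda lab: row[lab], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labs)


# Toy data: 3 sequences scored against 4 candidate labs.
example_scores = [
    [0.10, 0.60, 0.20, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.15, 0.30, 0.50],
]
example_truth = [2, 0, 0]

# With 2 guesses per sequence, the true lab is found for the first two sequences only.
print(top_k_accuracy(example_scores, example_truth, k=2))
```

Allowing more guesses can only raise this score, which is why a top-10 figure is not directly comparable to a single-guess accuracy.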
“Synthetic biology offers fantastic benefits for society, but its anonymity opens the door for reckless or malicious actors to cause serious harm,” said Will Bradshaw, Competition Director at altLabs. “By removing this anonymity, genetic engineering attribution promises to make everyone safer—while still promoting and rewarding innovation. Thanks to modern machine learning techniques, reliable attribution of real-world engineered sequences is now within reach—as demonstrated by the results of this competition. Given further investment, attribution technology like this could play a key role in the future of synthetic biology.”
The Challenge consisted of 2 sequential tracks: Prediction and Innovation. In the Prediction Track, teams competed to attribute each DNA sequence to its origin lab with the highest possible accuracy. High-performing teams from the Prediction Track were then invited to participate in the Innovation Track, where they showcased their approaches to a multidisciplinary panel of expert judges. Winning teams adopted a variety of different technical approaches, demonstrating the diversity of methods available and the value of competitions as a means of attracting novel and wide-ranging expertise. To further promote the development and use of attribution technology, all code from winning submissions will be made open source.
While the top algorithms from the competition were highly effective at identifying where genetically engineered sequences were designed, they could not identify whether a given sequence, or the microbe that contains it, was engineered in the first place.
“Genetic engineering could improve the lives of all humans, but without the ability to determine where a genetically engineered microbe comes from, it is near impossible to govern or regulate genetically engineered products,” explained Lane Warmbrod, a Senior Analyst at the Johns Hopkins Center for Health Security. “As the ability to attribute a genetically engineered organism improves, so will the accessibility and equity of benefits as well as the accountability for any unintended harms.”
The competition ran from August 18 to October 19, 2020. More information on the winning teams and scores of the Challenge is available online.
About altLabs
altLabs’ mission is to promote forward-thinking development and the responsible application of emerging technologies, with a special emphasis on synthetic biology. With backgrounds in genetics, bioinformatics, bioengineering, and computer science, altLabs researchers collaborate with partner organizations around the world to create new tools and novel pathways toward technology that benefits everyone.
About Johns Hopkins University Applied Physics Laboratory
The Johns Hopkins Applied Physics Laboratory (APL) has made critical contributions to national security and scientific challenges through systems engineering and integration, technology research and development, and analysis. The Laboratory’s scientists, engineers, and analysts serve as trusted advisors and technical experts to the government, ensuring the reliability of complex technologies that safeguard our nation’s security and advance the frontiers of space. A research division of Johns Hopkins University, APL also maintains independent research and development programs that pioneer and explore emerging technologies and concepts to address future national priorities.
About the iGEM Foundation
The International Genetically Engineered Machine (iGEM) Foundation is an independent, nonprofit organization dedicated to the advancement of synthetic biology, education and competition, and the development of an open community and collaboration. The iGEM Competition gives students the opportunity to push the boundaries of synthetic biology by tackling everyday issues facing the world. Multidisciplinary teams work together to design, build, test, and measure a system of their own design using interchangeable biological parts and standard molecular biology techniques.
About DrivenData
DrivenData is a social enterprise dedicated to bringing the data tools and methods that are transforming industry to the world’s biggest challenges. As part of that work, the DrivenData competition platform channels the skills and passion of data scientists, researchers, and other quantitative experts to build solutions for social good. These online machine learning challenges are designed to engage a large expert community, connect participants with real-world data problems, and highlight their best solutions.