Three-dimensional Molecular Distance Map (MoDMap3D) of (a) 3273 viral sequences from Test-1 representing 11 viral families and realm Riboviria, (b) 2779 viral sequences from Test-2 classifying 12 viral families of realm Riboviria, (c) 208 Coronaviridae sequences from Test-3a classified into genera.

News • Coronavirus origins

Researchers crack COVID-19 genome signature

Using machine learning, a team of Western computer scientists and biologists has identified an underlying genomic signature for 29 different COVID-19 DNA sequences.

This new data discovery tool will allow researchers to classify a deadly virus like COVID-19 in just minutes – a pace of high importance for strategic planning and for mobilizing to meet medical needs during a pandemic. The study also supports the scientific hypothesis that the COVID-19 virus (SARS-CoV-2) has its origin in bats, and classifies it as Sarbecovirus, a subgroup of Betacoronavirus.

The findings, “Machine learning using intrinsic genomic signatures for rapid classification of novel pathogens: COVID-19 case study,” were published in PLOS ONE.

The “ultra-fast, scalable, and highly accurate” classification system uses new graphic-based, specialized software and a decision-tree approach to illustrate the classification and arrive at a best choice out of all tested possible outcomes. Biology professor Kathleen Hill co-led the study with Western collaborators in Computer Science and Statistical and Actuarial Sciences, along with others in the University of Waterloo’s Department of Computer Science.

The machine-learning method achieves 100 per cent accurate classification of the COVID-19 sequences and, more importantly, discovers the most relevant relationships among more than 5,000 viral genomes, again within minutes. “All we needed was the COVID-19 DNA sequence to discover its own intrinsic sequence pattern. We used that signature pattern and a logical approach to match that pattern as close as possible to other viruses and achieved a fine level of classification in minutes – not days, not hours but minutes,” Hill said.

This classification tool has already been used to analyze more than 5,000 unique viral genomic sequences, including the 29 COVID-19 sequences available on Jan. 27. Hill believes the tool, which is able to classify any newly discovered virus sequence, COVID-19 or otherwise, will be an essential component in the toolkit for vaccine and drug developers, front-line health-care workers, researchers and scientists during this global pandemic and beyond.
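
The idea Hill describes, deriving a signature from the sequence alone and matching it against known groups, can be illustrated with a small sketch. The article does not say how the signature is computed, so the code below makes its own assumptions: it treats normalized k-mer frequencies as the "signature" and uses a nearest-neighbour match, which is a stand-in and not the published method. The function names, the toy sequences and the labels `group_A` and `group_B` are all hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_signature(seq, k=3):
    """Normalized k-mer frequency vector over A, C, G, T (assumed 'signature')."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        # Sequence shorter than k, or no valid k-mers: return an all-zero vector.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query, references, k=3):
    """Return the label of the reference sequence whose signature is closest."""
    q = kmer_signature(query, k)
    best_label, best_dist = None, float("inf")
    for label, seqs in references.items():
        for s in seqs:
            d = euclidean(q, kmer_signature(s, k))
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label


# Toy data only: made-up strings, not real viral genomes.
references = {
    "group_A": ["ATATATATAT"],
    "group_B": ["GCGCGCGCGC"],
}

print(classify("TATATATA", references))  # prints "group_A"
```

No alignment against a reference genome is needed here, since each sequence is reduced to a fixed-length vector before comparison. That is in line with the quoted claim that the sequence itself was all that was required.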

Source: Western University

