Detecting influenza A virus antigenicity with density-based algorithms

Sofia McDonough, a student from Florida State University, worked with Dr. Pej Rohani and Dr. Alpha Forna on density-based algorithms for detecting antigenic variants of influenza A virus.

Abstract: Influenza A H3N2 viruses mutate over time, giving rise to different antigenic variants. Viruses that elicit a similar immune response are considered part of the same antigenic cluster. Predicting transitions between antigenic clusters is important for deciding which viruses to include in seasonal flu vaccines. Typically, these transitions are identified using a hemagglutination inhibition assay, which can be time-consuming. To identify viruses that may form new antigenic clusters from sequence data alone, we have developed NIAViD (Novel Influenza A Virus Detector), an unsupervised machine learning model. It uses an outlier detection algorithm to detect transitions based on physicochemical properties calculated from hemagglutinin (HA) sequences. NIAViD has previously used the Isolation Forest and One-Class SVM outlier detection algorithms. This project tested the performance of two new methods, one based on empirical cumulative distributions (ECOD) and the other based on hierarchical density-based clustering (HDBSCAN). We found that Isolation Forest, One-Class SVM, and ECOD performed similarly as quantified by ROC AUC, but ECOD had much lower sensitivity. HDBSCAN had a much lower AUC and lower precision than any of the other algorithms. Overall, NIAViD’s performance is higher with Isolation Forest and One-Class SVM than with ECOD and HDBSCAN. In the future, additional algorithms could be included in the pipeline to improve performance so that NIAViD can be used to aid vaccine development.
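To illustrate the kind of evaluation described in the abstract, the following is a minimal sketch, not NIAViD's actual code. It fits scikit-learn's Isolation Forest to a synthetic feature matrix and reports ROC AUC, precision, and sensitivity (recall). The feature matrix, the number of features, the labels, and the `contamination` value are all assumptions made for illustration; in the real pipeline the features would be physicochemical properties computed from HA sequences, and the labels would come from known antigenic cluster transitions. ECOD and HDBSCAN are not shown here.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import precision_score, recall_score, roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in for per-virus physicochemical features (assumption):
# 200 "typical" viruses near the origin in a 5-dimensional feature space.
typical = rng.normal(0.0, 1.0, size=(200, 5))

# 10 hypothetical "novel" viruses placed far from the bulk, each in a
# random direction at distance 10 from the origin.
directions = rng.normal(size=(10, 5))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
novel = 10.0 * directions

X = np.vstack([typical, novel])
y_true = np.concatenate([np.zeros(200, dtype=int), np.ones(10, dtype=int)])

# Unsupervised fit: the labels are used only for evaluation afterwards.
# contamination=0.05 is an illustrative guess at the outlier fraction.
detector = IsolationForest(contamination=0.05, random_state=0).fit(X)

# score_samples is higher for more normal points, so negate it to get
# an anomaly score where larger means more outlying.
anomaly_score = -detector.score_samples(X)

# predict returns -1 for outliers and +1 for inliers; map to 1/0.
y_pred = (detector.predict(X) == -1).astype(int)

auc = roc_auc_score(y_true, anomaly_score)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)  # recall is the sensitivity

print(f"ROC AUC: {auc:.3f}  precision: {precision:.3f}  sensitivity: {recall:.3f}")
```

ROC AUC is computed from the continuous anomaly scores, while precision and sensitivity depend on the threshold implied by `contamination`. This is one way two detectors can have similar AUC yet differ in sensitivity, as the abstract reports for ECOD.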