Model Accuracy in Forecasting Pathogen Spread Using Climatic Data

Annakate Schatz, a student from Mount Holyoke College, worked with Dr. Andrew Kramer to test the accuracy of models used to predict pathogen spread.

Abstract: Species distribution models (SDMs) are commonly used to predict the total possible area a species can occupy. These models, however, rely on an assumption that the species is in equilibrium with its environment. When we discover a new disease or pathogen, it has not necessarily reached equilibrium, yet one of the most urgent questions is where else it might spread. This project addressed that question with a model selection case study focused on time-sensitive predictive ability. Previous research on how well SDMs predict disease spread has worked either with a single model-fitting method or with a simulated disease (Vaclavik and Meentemeyer 2012, Patel unpublished). We extend such investigations by modeling a real pathogen, Batrachochytrium dendrobatidis, using multiple methods trained on time subsets of the available occurrence data. We split our data into training and testing sets, and evaluated models on each set by calculating the AUC, kappa, True Skill Statistic, and false negative rate. We found that Maximum Entropy performed best on the subsetted data, followed by boosted regression trees and randomForest methods. Predictions tended to improve with more training data, although all AUC scores dropped when the last subset was added. From this project, we gain a better sense of how quickly we can determine the potential extent of an emerging disease.
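For readers unfamiliar with the four evaluation measures named in the abstract, the following is a minimal Python sketch of how each is commonly computed for presence/absence predictions. It is not the project's own code: the function names are hypothetical, and the use of a fixed 0.5 threshold to turn model scores into presence/absence calls is an assumption made only for illustration. AUC is computed here with the rank-based (Mann-Whitney) formulation, which needs no threshold.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for presence (1) / absence (0) labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def false_negative_rate(tp, fn):
    """Share of true presences that the model predicted as absences."""
    return fn / (tp + fn)


def true_skill_statistic(tp, fp, fn, tn):
    """TSS = sensitivity + specificity - 1, ranging from -1 to 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def cohen_kappa(tp, fp, fn, tn):
    """Agreement between predictions and observations beyond chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return (observed - expected) / (1 - expected)


def auc(y_true, scores):
    """Probability that a random presence scores above a random absence.

    Ties count as half a win (Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def threshold_predictions(scores, threshold=0.5):
    """Convert continuous suitability scores to 0/1 calls.

    The 0.5 default is an illustrative assumption, not the study's choice.
    """
    return [1 if s >= threshold else 0 for s in scores]
```

In a time-subset design like the one described, these functions would be applied once per training subset to the held-out testing records, so that the scores can be compared as more occurrence data becomes available. The false negative rate deserves particular attention for an emerging pathogen, since a false negative is a location where the pathogen occurs but the model predicted it would not.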

Download (PDF, 1.2MB)