Detecting influenza outbreaks by analyzing Twitter messages

Preprint

27 July 2010

Published on arXiv

https://doi.org/10.48550/arXiv.1007.4748

Abstract

We analyze over 500 million Twitter messages from an eight-month period and find that tracking a small number of flu-related keywords allows us to forecast future influenza rates with high accuracy, obtaining a 95% correlation with national health statistics. We then analyze the robustness of this approach to spurious keyword matches and propose a document classification component to filter out these misleading messages. We find that this document classifier can reduce error rates by more than half in simulated false alarm experiments, though more research is needed to develop methods that remain robust under extremely high noise.
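
To make the keyword-tracking idea concrete, the following is a minimal sketch, not the authors' implementation. It computes the weekly fraction of messages that mention a flu-related keyword and measures the Pearson correlation of that series with a reported influenza rate. The keyword list, the function names, and the toy messages and rates are all hypothetical, chosen only for illustration. The abstract does not specify the keywords or the model used.

```python
import math

# Hypothetical keyword list; the abstract does not name the keywords used.
FLU_KEYWORDS = ("flu", "cough", "sore throat", "headache")


def keyword_fraction(messages, keywords=FLU_KEYWORDS):
    """Return the fraction of messages containing at least one keyword.

    Matching is a case-insensitive substring test, so it is deliberately
    naive: "flu" also matches "influence", which is the kind of spurious
    match the abstract says a document classifier should filter out.
    """
    if not messages:
        return 0.0
    hits = sum(any(k in m.lower() for k in keywords) for m in messages)
    return hits / len(messages)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy data, invented for illustration: one list of messages per week,
# and a made-up influenza rate for the same weeks.
weekly_messages = [
    ["I have the flu", "nice day", "lunch time", "coffee"],
    ["flu season is here", "bad cough today", "hello", "go team"],
    ["flu", "cough and fever", "sore throat", "movie night"],
]
reported_rates = [1.0, 2.0, 3.0]

weekly_fractions = [keyword_fraction(msgs) for msgs in weekly_messages]
r = pearson(weekly_fractions, reported_rates)
```

In this toy data the keyword fraction rises in step with the made-up rate, so the correlation is at its maximum. Real message streams are far noisier, and the substring matching used here would count many unrelated messages.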

Keywords

ROBUSTNESS OF THIS APPROACH
DOCUMENT CLASSIFICATION
TWITTER MESSAGES
FALSE ALARM
INFLUENZA
KEYWORDS

All Related Versions

Version 1, 2010-07-27, ArXiv