BioCaster: detecting public health rumors with a Web-based text mining system
Metadata
Document Title
BioCaster: detecting public health rumors with a Web-based text mining system
Author
Collier N, Doan S, Kawazoe A, Goodwin RM, Conway M, Tateno Y, Ngo QH, Dinh D, Kawtrakul A, Takeuchi K, Shigematsu M, Taniguchi K
Affiliations
Research Organization of Information & Systems (ROIS); National Institute of Informatics (NII) - Japan; Japan Science & Technology Agency (JST); City University of New York (CUNY) System; Lehman College (CUNY); Research Organization of Information & Systems (ROIS); National Institute of Genetics (NIG) - Japan; Kasetsart University; National Science & Technology Development Agency - Thailand; National Electronics & Computer Technology Center (NECTEC); Okayama University; National Institute of Infectious Diseases (NIID)
Type
Article
Source Title
BIOINFORMATICS
Year
2008
Volume
24
Issue
24
Page
2940-2941
Open Access
hybrid, Green Published
Publisher
OXFORD UNIV PRESS
DOI
10.1093/bioinformatics/btn534
Abstract
BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents reported from over 1700 RSS feeds, classifies them for topical relevance and plots them onto a Google map using geocoded information. The background knowledge for bridging the gap between layman's terms and formal coding systems is contained in the freely available BioCaster ontology, which includes information in eight languages focused on the epidemiological role of pathogens as well as geographical locations with their latitudes/longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection and event recognition. Higher order event analysis is used to detect more precisely specified warning signals, which can then be sent to registered users as email alerts. Evaluation of the system for topic recognition and entity identification is conducted on a gold standard corpus of annotated news articles.
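The four stages named in the abstract can be pictured as a simple chain of functions. The Python sketch below is only an illustration under loud assumptions: every name in it (`DISEASE_TERMS`, `LOCATIONS`, `SIGNAL_WORDS`, `Event`, `process` and the stage functions) is hypothetical, the dictionaries are a toy stand-in for the BioCaster ontology, and keyword matching stands in for whatever trained classifiers and recognizers the real system uses. It is not the authors' implementation.

```python
from dataclasses import dataclass

# Hypothetical miniature "ontology": layman's terms mapped to canonical names.
# The real BioCaster ontology is multilingual and far larger.
DISEASE_TERMS = {
    "bird flu": "avian influenza",
    "avian influenza": "avian influenza",
    "dengue": "dengue fever",
}
# Hypothetical gazetteer: place name -> (latitude, longitude).
LOCATIONS = {"hanoi": (21.03, 105.85), "bangkok": (13.75, 100.50)}
# Hypothetical trigger words that suggest an outbreak event.
SIGNAL_WORDS = {"outbreak", "cases", "infected", "died"}


@dataclass
class Event:
    disease: str
    location: str
    lat: float
    lon: float


def is_relevant(text):
    """Stage 1: topic classification (keyword stand-in for a classifier)."""
    lowered = text.lower()
    return any(term in lowered for term in DISEASE_TERMS)


def recognize_entities(text):
    """Stage 2: NER by dictionary lookup; returns (label, term) pairs."""
    lowered = text.lower()
    entities = [("DISEASE", t) for t in DISEASE_TERMS if t in lowered]
    entities += [("LOCATION", t) for t in LOCATIONS if t in lowered]
    return entities


def detect_disease_location(entities):
    """Stage 3: take the first disease and place; normalize the disease name."""
    diseases = [DISEASE_TERMS[t] for label, t in entities if label == "DISEASE"]
    places = [t for label, t in entities if label == "LOCATION"]
    if not diseases or not places:
        return None
    return diseases[0], places[0]


def recognize_event(text, pair):
    """Stage 4: emit a geocoded event only if a signal word is present."""
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    if pair is None or not (words & SIGNAL_WORDS):
        return None
    disease, place = pair
    lat, lon = LOCATIONS[place]
    return Event(disease, place, lat, lon)


def process(text):
    """Run one document through all four stages; None means no alert."""
    if not is_relevant(text):
        return None
    pair = detect_disease_location(recognize_entities(text))
    return recognize_event(text, pair)
```

In this sketch, `process("Officials report a bird flu outbreak in Hanoi.")` yields an `Event` whose coordinates could be plotted on a map, while a document that mentions a disease without any signal word yields `None`.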
Funding Sponsor
Japan Science and Technology Agency (JST); Japan Society for the Promotion of Science [18049071]; Research Organization of Information Systems (ROIS)
Publication Source
WOS