Abstract
Public health surveillance is crucial for early disease detection, outbreak prediction, and epidemic response. However, traditional surveillance systems primarily rely on structured clinical data, limiting their capacity to capture emerging health threats from diverse and unstructured sources. This study explores the integration of Natural Language Processing (NLP) and Artificial Intelligence (AI) to automate disease surveillance by analyzing unstructured data, including electronic health records (EHRs), social media posts, news reports, and online health forums. It proposes an AI-driven surveillance framework that applies state-of-the-art NLP techniques, such as transformer-based language models, named entity recognition (NER), sentiment analysis, and topic modeling, to process, classify, and extract epidemiological insights from vast unstructured text streams in real time. The framework integrates multilingual data processing, anomaly detection, and geospatial trend analysis to enhance early warning capabilities for healthcare authorities. Its effectiveness is evaluated using benchmark datasets, such as the BioCaster Global Health Monitor, and real-world case studies on infectious disease outbreaks, demonstrating significant improvements in detection speed and accuracy. The findings highlight the transformative role of NLP and AI in advancing public health intelligence, improving disease surveillance scalability, and enabling proactive intervention strategies.
Keywords
natural language processing
artificial intelligence
public health surveillance
disease monitoring
unstructured data
social media analysis
electronic health records
epidemiological intelligence
Data Availability Statement
Data will be made available on request.
Funding
This work received no funding.
Conflicts of Interest
Vijayalaxmi Methuku is an employee of CYNOSOFT SOLUTIONS INC, Austin, TX 78750, United States.
Ethical Approval and Consent to Participate
Not applicable.
Cite This Article
APA Style
Methuku, V. (2025). NLP and AI for Public Health Intelligence: Automating Disease Surveillance from Unstructured Data. IECE Transactions on Emerging Topics in Artificial Intelligence, 2(1), 43–56. https://doi.org/10.62762/TETAI.2025.222799
Publisher's Note
IECE stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions

Copyright © 2025 by the Author(s). Published by Institute of Emerging and Computer Engineers. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.