Applied hypothesis tests and classification models to 18,000+ entries to pinpoint West Nile Virus (WNV) hotspots and high-risk mosquito species, with the goal of guiding community health interventions.
The dataset for this project comprises information from various mosquito traps set up in the city of Chicago. It contains attributes such as the date, species of mosquito, and whether West Nile Virus was detected in the trap.
Data cleaning involved handling missing values, identifying outliers, and transforming categorical variables for machine learning models.
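The three cleaning steps can be sketched in pandas as follows. This is a minimal illustration, not the project's actual pipeline: the column names (`date`, `species`, `mosquito_count`, `wnv_present`), the sample rows, and the IQR rule for flagging outliers are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample rows; the column names are assumptions, not the real schema.
raw = pd.DataFrame({
    "date": ["2019-07-01", "2019-07-08", "2019-08-05", "2019-08-12", None],
    "species": ["CULEX PIPIENS", "CULEX RESTUANS", None, "CULEX PIPIENS", "CULEX RESTUANS"],
    "mosquito_count": [12, 7, 400, 15, 9],
    "wnv_present": [1, 0, 1, 0, 0],
})


def clean(df):
    """Handle missing values, flag outliers, and encode categorical columns."""
    out = df.copy()

    # Missing values: drop rows without a date, label unknown species explicitly.
    out["date"] = pd.to_datetime(out["date"])
    out = out.dropna(subset=["date"])
    out["species"] = out["species"].fillna("UNKNOWN")

    # Outliers: flag counts above Q3 + 1.5 * IQR (one common rule, assumed here).
    q1, q3 = out["mosquito_count"].quantile([0.25, 0.75])
    upper_fence = q3 + 1.5 * (q3 - q1)
    out["count_outlier"] = out["mosquito_count"] > upper_fence

    # Categorical variables: one-hot encode species for the models.
    out = pd.get_dummies(out, columns=["species"], prefix="species")
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Flagging outliers instead of dropping them keeps the decision about how to treat them visible at modeling time.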
Features were engineered to provide insights into the frequency of West Nile Virus occurrences within specific regions and times. Seasonal data was also incorporated to determine peak West Nile Virus periods.
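As a rough sketch of this kind of feature engineering, the snippet below derives a month feature and per-region and per-month WNV positivity rates. The column names (`date`, `zipcode`, `wnv_present`) and the sample rows are hypothetical, and the exact features used in the project may differ.

```python
import pandas as pd

# Hypothetical trap records; column names and values are illustrative only.
traps = pd.DataFrame({
    "date": pd.to_datetime([
        "2019-06-10", "2019-07-15", "2019-08-05",
        "2019-08-19", "2019-08-26", "2019-09-02",
    ]),
    "zipcode": ["60601", "60629", "60601", "60629", "60629", "60601"],
    "wnv_present": [0, 0, 1, 1, 1, 0],
})


def add_features(df):
    """Add seasonal and regional WNV frequency features."""
    out = df.copy()
    # Seasonal signal: calendar month of the trap collection.
    out["month"] = out["date"].dt.month
    # Share of WNV-positive records per ZIP code and per month.
    out["zip_wnv_rate"] = out.groupby("zipcode")["wnv_present"].transform("mean")
    out["month_wnv_rate"] = out.groupby("month")["wnv_present"].transform("mean")
    return out


features = add_features(traps)

# Month with the highest share of WNV-positive records in this toy sample.
peak_month = features.groupby("month")["wnv_present"].mean().idxmax()
print(features)
print("Peak month:", peak_month)
```

Because these rate features are computed from the target column, they should be fitted on training data only when used in a classification model, otherwise they leak the label.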
Visualizations were built with Matplotlib and Seaborn to identify patterns and trends. The main focus was on the spread of the virus over time, the ZIP codes with the highest prevalence, and the mosquito species most associated with the virus. The visual below maps the areas with the highest WNV prevalence by ZIP code.
Statistical hypothesis testing identified specific regions and mosquito species in Chicago as high-risk for West Nile Virus. These findings can help target community health interventions where they are most needed.
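The specific tests used are not listed here. As one plausible example, a Pearson chi-square test of independence can check whether WNV detection is associated with mosquito species. The sketch below implements the 2x2 case with the standard library only; the function name `chi_square_2x2` and the counts are hypothetical, not results from this project.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    No continuity correction is applied. Returns (statistic, p_value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts. Rows: species A, species B. Columns: WNV positive, WNV negative.
table = [[30, 70], [10, 90]]
stat, p_value = chi_square_2x2(table)
print(f"chi2 = {stat:.2f}, p = {p_value:.5f}")
```

A small p-value suggests that positivity rates differ between the two species in the sample. For larger tables, `scipy.stats.chi2_contingency` covers the general case.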
Thank you for your interest in West Nile Watch. For further inquiries or insights, please feel free to reach out through this GitHub repository or at scelarek@gmail.com.