West Nile Watch

By Sam Celarek

"How might we use hypothesis testing to pinpoint high-risk areas or mosquito species for West Nile Virus to inform community health interventions in Chicago?"

🎯 Project Overview

This project applies hypothesis tests and classification models to 18,000+ mosquito trap records to pinpoint West Nile Virus hotspots and high-risk species, with the goal of guiding community health interventions.

📊 Dataset

The dataset comprises records from mosquito traps set up across the city of Chicago. Each record includes attributes such as the collection date, the mosquito species, and whether West Nile Virus was detected in the trap.

🧹 Data Wrangling

Data cleaning involved handling missing values, identifying outliers, and transforming categorical variables for machine learning models.
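
As a rough illustration of these cleaning steps, here is a minimal standard-library sketch. The field names (`species`, `mosquito_count`, `wnv_present`), the sample values, and the choice of median imputation are all assumptions made for the example. They are not the project's actual schema or method, and outlier handling is left out.

```python
import statistics

# Hypothetical trap records: field names and values are illustrative
# assumptions, not the project's actual schema.
records = [
    {"species": "CULEX PIPIENS", "mosquito_count": 12, "wnv_present": True},
    {"species": "CULEX RESTUANS", "mosquito_count": None, "wnv_present": False},
    {"species": None, "mosquito_count": 3, "wnv_present": False},
    {"species": "CULEX PIPIENS", "mosquito_count": 50, "wnv_present": True},
]


def clean(rows):
    """Drop rows with no species, then fill missing counts with the median count."""
    kept = [dict(r) for r in rows if r["species"] is not None]
    observed = [r["mosquito_count"] for r in kept if r["mosquito_count"] is not None]
    fill_value = statistics.median(observed)
    for r in kept:
        if r["mosquito_count"] is None:
            r["mosquito_count"] = fill_value
    return kept


def one_hot_species(rows):
    """Replace the species string with one 0/1 indicator column per species."""
    categories = sorted({r["species"] for r in rows})
    encoded = []
    for r in rows:
        out = {k: v for k, v in r.items() if k != "species"}
        for cat in categories:
            out[f"species_{cat}"] = int(r["species"] == cat)
        encoded.append(out)
    return encoded


cleaned = clean(records)
encoded = one_hot_species(cleaned)
```

In a pandas-based workflow the same steps would typically be expressed with `dropna`, `fillna`, and `get_dummies`; the plain-Python version is used here only to keep the sketch dependency-free.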

🛠️ Feature Engineering

Features were engineered to provide insights into the frequency of West Nile Virus occurrences within specific regions and times. Seasonal data was also incorporated to determine peak West Nile Virus periods.
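
One way such features could be derived is sketched below: calendar fields are pulled from the collection date, and a positive-trap rate is computed per region and month. The field names (`date`, `zipcode`, `wnv_present`) and the sample rows are hypothetical, and the helper functions are illustrative rather than the project's own code.

```python
from collections import defaultdict
from datetime import date

# Hypothetical trap results: field names and values are assumptions.
rows = [
    {"date": date(2019, 7, 15), "zipcode": "60601", "wnv_present": False},
    {"date": date(2019, 8, 5), "zipcode": "60601", "wnv_present": True},
    {"date": date(2019, 8, 12), "zipcode": "60601", "wnv_present": True},
    {"date": date(2019, 8, 12), "zipcode": "60614", "wnv_present": False},
]


def add_time_features(row):
    """Return a copy of the row with month and ISO week number added."""
    out = dict(row)
    out["month"] = row["date"].month
    out["iso_week"] = row["date"].isocalendar()[1]
    return out


def positive_rate_by(rows, keys):
    """Share of WNV-positive rows within each group defined by `keys`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for r in rows:
        group = tuple(r[k] for k in keys)
        totals[group] += 1
        positives[group] += int(r["wnv_present"])
    return {group: positives[group] / totals[group] for group in totals}


featured = [add_time_features(r) for r in rows]
rates = positive_rate_by(featured, ("zipcode", "month"))
```

Grouping on `("zipcode", "month")` gives one rate per region and month; swapping in `("month",)` alone would give a citywide seasonal profile instead.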

📶 Exploratory Data Analysis (EDA)

Visualizations were built with Matplotlib and Seaborn to identify patterns and trends. The main focus was on the spread of the virus over time, the zip codes with the highest prevalence, and the mosquito species most associated with the virus. The map below shows the areas with the highest WNV prevalence by zip code.

West Nile Watch Image

📈 Analysis

Statistical hypothesis testing identified specific regions and mosquito species in Chicago as high-risk for West Nile Virus. These findings can help direct community health interventions toward the areas and species of greatest concern.
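
The README does not state which tests were used. As one plausible example, the sketch below runs a Pearson chi-square test of independence on a 2x2 table comparing WNV-positive and WNV-negative trap results for two species. The function name and the counts are hypothetical and chosen only for illustration; they are not results from this project.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = species, columns = WNV positive / negative.
    Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: species A has 30 positive and 170 negative results,
# species B has 10 positive and 190 negative results.
stat, p_value = chi_square_2x2(30, 170, 10, 190)
```

A small p-value here would suggest that the positive rate differs between the two species. With more than two species or regions, a general chi-square test on a larger contingency table (for example `scipy.stats.chi2_contingency`) would be the natural extension, along with a correction for multiple comparisons.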

Thank you for your interest in West Nile Watch. For further inquiries or insights, please feel free to reach out through this GitHub repository or at scelarek@gmail.com.

Best Wishes,
Sam Celarek