ClearSky AI — Real-Time Air Quality Monitoring and Next-Day AQI Prediction for Indian Cities

A Python final year project that fetches live pollution data from the WAQI API, processes it through a trained XGBoost model, and delivers real-time AQI readings plus a next-day forecast for any Indian city.

Technology Used

Python | Flask | XGBoost | scikit-learn | SQLite | pandas | NumPy | Vanilla JS | HTML5 | CSS3 | WAQI API | joblib | Jupyter Notebook

Price: ₹1,999 (regular price ₹2,999)

Get complete project source code + Installation guide + chat support

About ClearSky AI

ClearSky AI is a full-stack web application built as a final year project for students in BCA, MCA, BTech CSE, and BSc IT programs. It pulls live pollution data from the World Air Quality Index (WAQI) API for any Indian city and runs it through a trained XGBoost gradient-boosted regression model to predict the next day's AQI. The output is a clean, functional system that covers real-time monitoring, machine learning inference, weather data, and persistent search history — all in one place.
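
As a rough sketch of the live-data step, the snippet below flattens a WAQI-style city feed response into a simple reading. The `https://api.waqi.info/feed/<city>/` endpoint and the `data.aqi` / `data.iaqi.<key>.v` layout follow the public WAQI JSON API; the function names, the pollutant key map, and the sample payload are illustrative assumptions, not the project's actual code.

```python
import json
import urllib.parse
import urllib.request

WAQI_FEED_URL = "https://api.waqi.info/feed/{city}/?token={token}"

# WAQI keys mapped to display labels (assumed subset; stations often omit some).
POLLUTANT_KEYS = {"pm25": "PM2.5", "pm10": "PM10", "no2": "NO2",
                  "so2": "SO2", "co": "CO", "o3": "O3"}


def parse_feed(payload):
    """Flatten a WAQI feed response into {'aqi': ..., 'pollutants': {...}}."""
    if payload.get("status") != "ok":
        raise ValueError(f"WAQI error: {payload.get('data')}")
    data = payload["data"]
    iaqi = data.get("iaqi", {})
    # Keep only the pollutants the station actually reported.
    pollutants = {label: iaqi[key]["v"]
                  for key, label in POLLUTANT_KEYS.items() if key in iaqi}
    return {"aqi": data.get("aqi"), "pollutants": pollutants}


def fetch_city(city, token):
    """Fetch and parse the live feed for one city (needs network and a token)."""
    url = WAQI_FEED_URL.format(city=urllib.parse.quote(city), token=token)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_feed(json.load(resp))


# Hypothetical sample payload, trimmed to the fields used above.
sample = {"status": "ok",
          "data": {"aqi": 152, "iaqi": {"pm25": {"v": 152}, "pm10": {"v": 88}}}}
reading = parse_feed(sample)
```

Keeping the parsing separate from the network call makes the logic easy to check offline with a saved response.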

The model was trained on six years of daily pollution records from 29 major Indian cities, sourced from the Central Pollution Control Board (CPCB) dataset. That historical depth gives it a solid read on India's seasonal pollution patterns, which makes the predictions more grounded than a generic regression model would be.

If you want an AI and machine learning final year project that combines a live data source, a trained ML model, and a working web interface, ClearSky AI covers all three. It ships with source code and is available with a pre-built college-format project report and optional mentorship from the CodeAj team.

What the Project Does

The main dashboard lets users search any Indian city by name — autocomplete covers 60-plus cities — and immediately see the current AQI, a breakdown of nine individual pollutants (PM2.5, PM10, NO2, SO2, CO, O3, NO, NOx, NH3), live weather conditions, and a machine learning forecast for the next day. A separate Weather tab goes deeper: UV index, dew point, Beaufort-scale wind classification, humidity with a progress bar, and contextual outdoor activity recommendations based on actual readings.
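
The Beaufort-scale wind classification mentioned above can be sketched as a simple lookup. The km/h thresholds are the standard Beaufort bands; the function name and the assumption that wind speed arrives in km/h are illustrative, since the page does not say which unit the project uses.

```python
import bisect

# Lower bounds in km/h for Beaufort forces 1 to 12; below 1 km/h is force 0.
BEAUFORT_LOWER_KMH = [1, 6, 12, 20, 29, 39, 50, 62, 75, 89, 103, 118]

BEAUFORT_NAMES = ["Calm", "Light air", "Light breeze", "Gentle breeze",
                  "Moderate breeze", "Fresh breeze", "Strong breeze",
                  "Near gale", "Gale", "Strong gale", "Storm",
                  "Violent storm", "Hurricane"]


def beaufort(wind_kmh):
    """Return (force, description) for a wind speed given in km/h."""
    if wind_kmh < 0:
        raise ValueError("wind speed cannot be negative")
    # Count how many lower bounds the speed has reached.
    force = bisect.bisect_right(BEAUFORT_LOWER_KMH, wind_kmh)
    return force, BEAUFORT_NAMES[force]


force, label = beaufort(15)
```

Here `beaufort(15)` yields force 3, a gentle breeze.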

A manual prediction page lets users enter their own pollutant values and get a predicted AQI from the same model. This is useful for offline testing, for classrooms where instructors want to show how individual pollutants affect the index, or for students who want to explore the model's behavior before their viva.

All searches are saved to a local SQLite database and shown in a Recent Searches sidebar, so frequently monitored cities are one click away on every session.
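
A minimal sketch of how such a search history could be stored with Python's built-in `sqlite3` module is shown below. The table name, columns, and helper functions are assumptions made for illustration; the project's real schema may differ.

```python
import sqlite3


def init_db(conn):
    """Create the (hypothetical) searches table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS searches (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               city TEXT NOT NULL,
               aqi INTEGER,
               searched_at TEXT DEFAULT CURRENT_TIMESTAMP)"""
    )


def save_search(conn, city, aqi):
    """Record one search; parameters are bound to avoid SQL injection."""
    conn.execute("INSERT INTO searches (city, aqi) VALUES (?, ?)", (city, aqi))
    conn.commit()


def recent_searches(conn, limit=5):
    """Return the latest entry per city, newest first."""
    return conn.execute(
        """SELECT city, aqi FROM searches
           WHERE id IN (SELECT MAX(id) FROM searches GROUP BY city)
           ORDER BY id DESC LIMIT ?""",
        (limit,),
    ).fetchall()


conn = sqlite3.connect(":memory:")  # use a file path to persist across runs
init_db(conn)
save_search(conn, "Delhi", 180)
save_search(conn, "Mumbai", 90)
save_search(conn, "Delhi", 200)
history = recent_searches(conn)
```

Deduplicating by city keeps the sidebar short even when the same city is searched repeatedly.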

Machine Learning Pipeline

The XGBoost model was trained on the city_day.csv file from the CPCB dataset — 29,531 daily records across 29 Indian cities from January 2015 to July 2020. After dropping null AQI rows and applying city-level median imputation for missing pollutant values, 24,824 records went into training. The target variable is AQI_Tomorrow, created by shifting the AQI column backward by one day within each city group to prevent data leakage across cities.
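
The cleaning and target-creation steps described above might look roughly like this in pandas. The column names `City`, `Date`, and `AQI` match the public city_day.csv file; the `prepare` function and the tiny demo frame are illustrative assumptions, not the notebook's actual code.

```python
import pandas as pd


def prepare(df, pollutant_cols):
    """Drop null-AQI rows, impute by city median, and add AQI_Tomorrow."""
    df = df.dropna(subset=["AQI"]).sort_values(["City", "Date"]).copy()
    for col in pollutant_cols:
        # Fill gaps with the median of the same city, not the global median.
        df[col] = df[col].fillna(df.groupby("City")[col].transform("median"))
    # Shift within each city so one city's AQI never becomes another's target.
    # Note: after dropping rows, this is the next available record per city.
    df["AQI_Tomorrow"] = df.groupby("City")["AQI"].shift(-1)
    # The last record of each city has no following value, so it is dropped.
    return df.dropna(subset=["AQI_Tomorrow"]).reset_index(drop=True)


nan = float("nan")
demo = pd.DataFrame({
    "City": ["A", "A", "A", "A", "B", "B"],
    "Date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03",
                            "2020-01-04", "2020-01-01", "2020-01-02"]),
    "PM2.5": [10.0, nan, 30.0, nan, 100.0, 200.0],
    "AQI": [50.0, 60.0, 70.0, nan, 150.0, 250.0],
})
prepared = prepare(demo, ["PM2.5"])
```

In the demo, city A's missing PM2.5 value is filled with that city's median, and each city loses its final row because it has no next-day target.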

The feature vector has 14 inputs: PM2.5, PM10, NO, NO2, NOx, NH3, CO, SO2, O3, Benzene, Toluene, Xylene, Month, and DayOfWeek. The model runs 500 trees at a learning rate of 0.05, max depth 6, with L1 and L2 regularization. A StandardScaler is applied before inference. Both the scaler and the trained model are saved as .pkl files via joblib and loaded at Flask startup.
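
To illustrate the inference input, here is a sketch that assembles the 14 features in the order listed above. The helper name, the Monday=0 convention for DayOfWeek (as in pandas `dt.dayofweek`), and the artifact file names in the trailing comments are assumptions; the order must match whatever the notebook used during training.

```python
from datetime import date

FEATURE_ORDER = ["PM2.5", "PM10", "NO", "NO2", "NOx", "NH3", "CO", "SO2",
                 "O3", "Benzene", "Toluene", "Xylene", "Month", "DayOfWeek"]


def build_feature_row(pollutants, day):
    """Return the 14 model inputs as floats, in training order."""
    values = dict(pollutants)
    values["Month"] = day.month
    values["DayOfWeek"] = day.weekday()  # Monday=0 (assumed convention)
    missing = [name for name in FEATURE_ORDER if name not in values]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return [float(values[name]) for name in FEATURE_ORDER]


# Hypothetical pollutant readings for a single day.
readings = {"PM2.5": 80, "PM10": 140, "NO": 12, "NO2": 30, "NOx": 40,
            "NH3": 20, "CO": 1.1, "SO2": 9, "O3": 35,
            "Benzene": 2.5, "Toluene": 6.0, "Xylene": 1.0}
row = build_feature_row(readings, date(2020, 7, 1))

# With the saved artifacts loaded (file names are hypothetical):
#   scaler = joblib.load("scaler.pkl"); model = joblib.load("model.pkl")
#   prediction = model.predict(scaler.transform([row]))[0]
```

Fixing the order in one constant avoids silently feeding the model columns in the wrong position.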

The full training pipeline lives in a Jupyter Notebook with ten clearly labelled cells covering data loading, cleaning, EDA, feature engineering, model training, evaluation, and artifact export. The notebook can be run end-to-end to reproduce the model from scratch — which is exactly what viva panels want to see. You can browse more final year projects with source code on the CodeAj marketplace to compare tech stacks.

Key Features

  • Live AQI fetch for any Indian city via the WAQI API, with animated count-up numbers and a colour-coded gradient scale bar showing category position
  • Nine-pollutant breakdown grid — PM2.5, PM10, NO2, SO2, CO, O3, NO, NOx, NH3 — each with an individual animated progress bar
  • XGBoost next-day AQI forecast displayed alongside the live reading, with category badge and health advice
  • Detailed Weather tab covering UV index, Beaufort-scale wind description, humidity progress bar, dew point, atmospheric pressure, comfort level classification, and dynamically generated outdoor tips
  • Manual prediction page with synced sliders and number inputs for all 14 model features, giving instant results without an API call
  • Recent searches sidebar backed by SQLite — persists across browser sessions with no external database required
  • Light and dark mode toggle with no flash on reload, preference saved via localStorage
  • GPS city detection using the browser Geolocation API
  • Autocomplete search with keyboard navigation — arrow keys, Enter, Escape
  • Four REST API endpoints (/api/aqi, /api/predict, /api/history, /api/cities) that work independently from the UI
  • Graceful degradation — if model files are missing at startup, the API returns a rule-based estimate instead of crashing
  • Fully responsive layout with collapsing side panel on mobile screens
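
The graceful-degradation and category-badge features in the list above could be sketched as follows. The page does not specify the rule-based estimate, so this example uses a simple stand-in (tomorrow's AQI equals today's); the function names are hypothetical, and the category bands assume the CPCB National AQI scale.

```python
import bisect
import os

# Upper bounds of the CPCB National AQI categories (assumed scale).
CATEGORY_UPPER = [50, 100, 200, 300, 400]
CATEGORY_NAMES = ["Good", "Satisfactory", "Moderate",
                  "Poor", "Very Poor", "Severe"]


def category(aqi):
    """Map an AQI value to its category name."""
    return CATEGORY_NAMES[bisect.bisect_left(CATEGORY_UPPER, aqi)]


def load_model(path):
    """Return the saved model, or None if the file or joblib is unavailable."""
    if not os.path.exists(path):
        return None
    try:
        import joblib  # imported lazily so a missing package is not fatal
        return joblib.load(path)
    except Exception:
        return None


def predict_next_day(model, features, current_aqi):
    """Use the model when present; otherwise fall back to a simple rule."""
    if model is None:
        # Stand-in rule (assumption): expect tomorrow to match today.
        aqi, source = round(current_aqi), "fallback"
    else:
        aqi, source = round(float(model.predict([features])[0])), "model"
    return {"aqi": aqi, "category": category(aqi), "source": source}


model = load_model("does_not_exist.pkl")  # hypothetical missing artifact
result = predict_next_day(model, [], current_aqi=180.6)
```

Returning a `source` field lets the UI tell users whether the number came from the trained model or from the fallback.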

Real-World Applications

ClearSky AI was designed as an academic project, but the architecture maps directly to real scenarios. Environmental monitoring agencies use similar Flask-based stacks to display live pollution data publicly. Smart city dashboards integrate prediction endpoints to trigger alerts when forecast AQI crosses a health threshold. Health apps use pollutant breakdowns to give users location-aware outdoor activity advice.

For final year students, this project covers topics that come up repeatedly in viva sessions: missing value handling with domain-specific imputation, supervised regression with a gradient-boosted ensemble, REST API design in Flask, SQLite integration without an ORM, and a frontend communicating with the backend through fetch calls. It covers enough ground to hold up under detailed technical questioning. You can also look at the related air quality route planner project or the Indian climate monitor project for alternative approaches in this domain.

Why This Project Works for Final Year Submission

Most AQI projects available online stop at training a model in a notebook. ClearSky AI goes further — the trained model is connected to a live data source, wrapped in a production-ready Flask application, and delivered through a UI that a non-technical examiner can actually interact with. The codebase is structured clearly, the Jupyter Notebook reproduces everything in one run, and the dataset is publicly verifiable through CPCB and Kaggle. These three things together make it much easier to defend in a viva than a project with opaque data or a model that cannot be re-trained from scratch.

CodeAj provides this project with source code, a pre-built college-format project report, and optional add-on services. You can explore the full range of Python and AI final year projects on the marketplace, or check our project services page for report writing, setup sessions, and research paper support.

Add-On Services Available

  • Idea Implementation: Need a custom version of this project tailored to your college's requirements? The CodeAj team builds it from scratch based on your brief.
  • Project Setup and Source Code Explanation: A one-on-one session where a developer sets up the project on your machine, walks through the codebase, and explains the ML pipeline in plain terms — exactly what you need before a viva.
  • Custom Project Report, Research Paper, and PPT: College-format documentation written to your university's submission guidelines, a presentation deck, and optional research paper drafting for IEEE or Springer conference submission. Visit the research paper publishing page for details.

Extra Add-Ons Available – Elevate Your Project

Add any of these professional upgrades to save time and impress your evaluators.

Project Setup

We'll install and configure the project on your PC via remote session (Google Meet, Zoom, or AnyDesk).

Source Code Explanation

1-hour live session to explain logic, flow, database design, and key features.

Want to know exactly how the setup works? Review our detailed step-by-step process before scheduling your session.

₹999

Custom Documents (College-Tailored)

  • Custom Project Report: ₹1,200
  • Custom Research Paper: ₹1,000
  • Custom PPT: ₹500

Fully customized to match your college format, guidelines, and submission standards.

Project Modification

Need feature changes, UI updates, or new features added?

Charges vary based on complexity.

We'll review your request and provide a clear quote before starting work.

⭐ 98% SUCCESS RATE
  • Full Development
  • Documentation
  • Presentation Prep
  • 24/7 Support