Forum at IndoML 2026

Datathon@IndoML 2026

Solve Real-World Problems. Compete. Win.

Noise event detection and removal in real-world Indic speech, built for robust and inclusive speech AI.

About the Datathon

Welcome to Datathon@IndoML 2026 - a research-oriented data science competition held in conjunction with IndoML 2026. Building on the success of previous editions, this year's datathon challenges participants to tackle noise event detection and removal in real-world Indic speech - a critical problem for inclusive, robust Automatic Speech Recognition across Indian languages.

The competition is organised into two coupled tracks. Track 1 (Detection) asks participants to detect noise events with precise timestamps; Track 2 (Removal) asks them to suppress the detected events while preserving the underlying speech. The dataset is a curated subset of the Vaani corpus, spanning ~167 hours of labelled real-world Indic audio across seven noise categories.

Top-performing teams will be invited to attend IndoML 2026 and present their solutions to leading researchers and professionals from academia and industry. These teams will also receive exciting cash prizes.

Registration details and important dates will be announced soon. Task details and dataset are described below!

Datathon Chairs

Dr. Mahesh Mohan
IIT Kharagpur

Dr. Debopriyo Banerjee
Inception - G42

Dr. Subhajit Datta
Heritage Institute of Technology

Technical Volunteers

Shivay Vadhera
IIT Bombay

Shubhadip Nag
Walmart

Announcements

April 2026

Website Launched!

The official website for Datathon@IndoML 2026 is now live. Registration details, task description, dataset, and timeline will be announced soon.

Coming Soon

Registration Opens

Stay tuned for updates on registration, task details, and important dates.

Task Description

Noise Event Detection & Removal in Indic Speech

Robust, Inclusive Speech Processing for Real-World Indic Speech

Speech recordings collected in real Indian environments are dominated by non-stationary background events - vehicle horns, dogs barking, children crying, doorbells, ringtones, kitchen appliances, and devotional music. These events degrade downstream Automatic Speech Recognition (ASR).

This challenge invites participants to build a two-stage system on the Vaani dataset that (i) detects noise events with precise timestamps, and (ii) removes them while preserving the underlying speech. The challenge is framed under the Responsible AI theme, with explicit emphasis on robustness, linguistic inclusivity across multiple Indian languages, and methodological transparency.

Evaluation uses per-track metrics - F1 & Dice for detection, SI-SDR & Delta WER for removal - with PESQ evaluated for top-5 removal entries. A novelty score adjusts the final standings.

Challenge Tracks

Participants may enter either track independently or both.

01

Detection

Detect Events

Detect noise events in Indic speech recordings with precise onset/offset timestamps.

Pipeline: Raw Audio -> Your Model -> Event JSON
Example events: {onset: 1.24, offset: 3.81}, {onset: 4.31, offset: 4.71}, {onset: 5.04, offset: 5.41}
Evaluation Metrics: F1, Dice

The event timeline (onset/offset pairs) from Track 1 is passed as conditioning input to Track 2, guiding the model on where to suppress noise.
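
A minimal sketch of how Track 1 predictions could be serialised to JSON. The official schema is not given here, so the top-level layout (a mapping from clip filename to a list of events), the clip name `clip_0001.wav`, the 10 ms rounding, and the helper `events_to_json` are all assumptions made for illustration. Only the `onset`/`offset` fields, in seconds, come from the example events above.

```python
import json

def events_to_json(predictions):
    """Serialise {clip_name: [(onset_s, offset_s), ...]} to a JSON string.

    Events are sorted by onset and rounded to 10 ms (an assumed precision);
    inverted or zero-length events are rejected.
    """
    payload = {}
    for clip, events in predictions.items():
        cleaned = []
        for onset, offset in sorted(events):
            if offset <= onset:
                raise ValueError(f"{clip}: offset {offset} <= onset {onset}")
            cleaned.append({"onset": round(onset, 2), "offset": round(offset, 2)})
        payload[clip] = cleaned
    return json.dumps(payload, indent=2)

# Hypothetical clip name; event times are the ones shown in the Track 1 example.
submission = events_to_json(
    {"clip_0001.wav": [(4.31, 4.71), (1.24, 3.81), (5.04, 5.41)]}
)
print(submission)
```

Rejecting malformed events before writing the file is a cheap way to catch bugs locally; the actual file layout should follow whatever the organisers publish.
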
02

Removal

Suppress & Preserve

Suppress the detected noise events while preserving the underlying speech signal - output clean, intelligible audio.

Pipeline: Audio + Events -> Your Model -> Clean WAV
Output format: 16 kHz mono WAV, one per test clip, original filename retained
Evaluation Metrics: SI-SDR, Delta WER, PESQ (top-5)

Competition Process

Submission

Track 1: Submit a JSON file with onset/offset events per clip.
Track 2: Submit cleaned 16 kHz mono WAV files.
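
For Track 2, a pre-submission format check might look like the sketch below, which uses only Python's standard `wave` module. The 16 kHz sample rate and mono channel count come from the submission requirement; the 16-bit PCM sample width and the helper names `write_wav` and `check_wav` are assumptions, since the bit depth is not specified here.

```python
import io
import math
import struct
import wave

TARGET_RATE = 16_000  # Hz, from the submission requirement
TARGET_CHANNELS = 1   # mono, from the submission requirement

def write_wav(dst, samples, rate=TARGET_RATE):
    """Write float samples in [-1, 1] as 16-bit PCM mono (bit depth is assumed)."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(dst, "wb") as wf:
        wf.setnchannels(TARGET_CHANNELS)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(frames)

def check_wav(src):
    """Return a list of format problems; an empty list means the file conforms."""
    problems = []
    with wave.open(src, "rb") as wf:
        if wf.getframerate() != TARGET_RATE:
            problems.append(f"sample rate {wf.getframerate()} != {TARGET_RATE}")
        if wf.getnchannels() != TARGET_CHANNELS:
            problems.append(f"{wf.getnchannels()} channels, expected mono")
    return problems

# One second of a 440 Hz tone standing in for an enhanced clip.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / TARGET_RATE) for n in range(TARGET_RATE)]
buf = io.BytesIO()
write_wav(buf, tone)
buf.seek(0)
print(check_wav(buf))
```

Both helpers accept a file path as well as a file-like object, so the same check can be run over a directory of output clips.
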

Evaluation

Automated scoring on held-out test clips. Track 1: F1 + Dice. Track 2: SI-SDR + ΔWER, then PESQ for top-5.

Leaderboard

Live rankings published after each submission window. Final standings adjusted by expert Novelty Score for top-5 entries.

Awards

Top teams invited to present at IndoML 2026 and receive cash prizes. Code release required for prize-eligible entries.

Evaluation Metrics

Track 1 — Detection
Primary
Event-based F1

A prediction is correct when its temporal extent overlaps with ground truth within +/-20% of event duration.
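
The matching rule can be read in more than one way, so the sketch below fixes one interpretation as an assumption: a predicted event matches a reference event when both its onset and its offset fall within 20% of the reference event's duration of the reference boundaries, with greedy one-to-one matching. The function name `event_f1` and the toy events are illustrative; the official scorer may differ.

```python
def event_f1(predicted, reference, tolerance=0.20):
    """Event-based F1 with greedy one-to-one matching.

    A predicted (onset, offset) pair matches a reference event when both
    boundaries lie within tolerance * reference duration of the reference
    boundaries (one assumed reading of the +/-20% rule).
    """
    if not predicted and not reference:
        return 1.0
    if not predicted or not reference:
        return 0.0
    unmatched = sorted(reference)
    true_pos = 0
    for p_on, p_off in sorted(predicted):
        for i, (r_on, r_off) in enumerate(unmatched):
            slack = tolerance * (r_off - r_on)
            if abs(p_on - r_on) <= slack and abs(p_off - r_off) <= slack:
                true_pos += 1
                del unmatched[i]  # each reference event is matched at most once
                break
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)

reference = [(1.24, 3.81), (4.31, 4.71), (5.04, 5.41)]
predicted = [(1.30, 3.70), (4.30, 4.90), (6.00, 6.50)]
print(round(event_f1(predicted, reference), 3))
```

In this toy case only the first prediction matches: the second overshoots the 0.40 s reference event's offset by 0.19 s, well beyond the 0.08 s tolerance. Under this reading short events are judged far more strictly than long ones.
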

Primary
Segment Dice

Temporal overlap between predicted and reference event segments: Dice = 2 |P ∩ G| / (|P| + |G|), where P and G are the predicted and ground-truth event regions.
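
The Dice formula can be evaluated directly on interval lists, as in the sketch below. It assumes |P| and |G| are total event durations in seconds after overlapping events are merged, and that the score is computed per clip; neither detail is stated here, and the helper names are illustrative.

```python
def merge(intervals):
    """Merge overlapping (onset, offset) intervals into a sorted disjoint list."""
    merged = []
    for on, off in sorted(intervals):
        if merged and on <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], off))
        else:
            merged.append((on, off))
    return merged

def total_length(intervals):
    """Sum of interval durations in seconds."""
    return sum(off - on for on, off in intervals)

def segment_dice(predicted, reference):
    """Dice = 2 * |P ∩ G| / (|P| + |G|) over total event time in seconds."""
    p, g = merge(predicted), merge(reference)
    overlap, i, j = 0.0, 0, 0
    while i < len(p) and j < len(g):
        overlap += max(0.0, min(p[i][1], g[j][1]) - max(p[i][0], g[j][0]))
        # Advance whichever interval ends first.
        if p[i][1] < g[j][1]:
            i += 1
        else:
            j += 1
    denom = total_length(p) + total_length(g)
    return 1.0 if denom == 0 else 2 * overlap / denom

reference = [(1.0, 3.0), (4.0, 5.0)]
predicted = [(2.0, 4.5)]
print(round(segment_dice(predicted, reference), 3))
```

Here the prediction covers 1.5 s of the 3.0 s of reference events and is itself 2.5 s long, giving 2 * 1.5 / 5.5. Unlike the event-based F1, this score gives partial credit for imprecise boundaries.
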

Overall ranking: equal-weight average of F1 and Dice scores.

Track 2 — Removal
Primary
SI-SDR

Scale-Invariant Signal-to-Distortion Ratio between enhanced signal and synthetic clean reference.
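
For reference, the commonly used definition of SI-SDR projects the enhanced signal onto the clean reference and compares the energy of the projection with the energy of the residual. The sketch below implements that definition in plain Python on toy sinusoids. The mean removal, the `eps` guard, and the test signals are assumptions, and the organisers' scorer may differ in such details.

```python
import math

def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / len(estimate)
    ref_mean = sum(reference) / len(reference)
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the scaled target.
    alpha = sum(e * r for e, r in zip(est, ref)) / (sum(r * r for r in ref) + eps)
    target_energy = sum((alpha * r) ** 2 for r in ref)
    noise_energy = sum((e - alpha * r) ** 2 for e, r in zip(est, ref))
    return 10 * math.log10((target_energy + eps) / (noise_energy + eps))

# Toy signals: a 200 Hz tone as the reference, plus a weaker 1.2 kHz interferer.
rate = 16_000
clean = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
noisy = [c + 0.1 * math.sin(2 * math.pi * 1200 * n / rate) for n, c in enumerate(clean)]
print(round(si_sdr(noisy, clean), 1))
# Rescaling the estimate leaves the score unchanged.
print(round(si_sdr([0.5 * x for x in noisy], clean), 1))
```

The interferer has one tenth of the reference amplitude, so the score comes out at about 20 dB, and halving the estimate's gain does not change it. That scale invariance is what distinguishes SI-SDR from plain SDR.
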

Primary
Delta WER

A frozen multilingual Indic ASR is run on both noisy and enhanced clips. Delta WER = WER_noisy - WER_enhanced. Higher is better.
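
The Delta WER arithmetic can be illustrated with a word-level edit distance, as sketched below. The transcripts are made up and stand in for the frozen ASR's output, the function names are illustrative, and whether the organisers compute WER per clip or over the whole corpus is not stated; this sketch scores a single utterance.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] is the edit distance between the processed ref prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def delta_wer(reference, noisy_hyp, enhanced_hyp):
    """Delta WER = WER_noisy - WER_enhanced; positive means enhancement helped."""
    return wer(reference, noisy_hyp) - wer(reference, enhanced_hyp)

# Made-up transcripts standing in for ASR output on a noisy and an enhanced clip.
truth = "please open the front door now"
noisy_asr = "please open front floor"
enhanced_asr = "please open the front door"
print(round(delta_wer(truth, noisy_asr, enhanced_asr), 3))
```

The noisy hypothesis needs three edits against six reference words (WER 0.5) and the enhanced one needs a single edit (WER of about 0.167), so Delta WER is about 0.333. A negative value would mean the enhancement made recognition worse, which can happen when suppression distorts the speech.
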

Top-5 Only
PESQ

Perceptual Evaluation of Speech Quality - intelligibility & naturalness vs. clean references. Evaluated only for the top-5 initial entries.

Initial ranking: equal-weight average of SI-SDR and Delta WER. The top-5 entries are then evaluated for PESQ.

Novelty Score: following the metric-based ranking, an expert panel applies a novelty factor to the top-5 entries in each track to determine the final standings. The score rewards original contributions such as new architectures, novel training regimes, principled event-conditioning, or semi-supervised approaches exploiting unlabelled Vaani audio.

Responsible AI Alignment

This challenge is positioned under the Responsible AI track on three explicit axes:

Robustness

Real-world Indic recordings - not curated studio mixtures - are the evaluation distribution. Systems are scored on actual ASR improvement (Delta WER), not signal-level metrics alone.

Inclusivity

Vaani spans multiple Indian languages and a wide range of speakers. The eval set is monitored for language-wise and class-wise balance so no sub-population is under-represented.

Transparency

Top-5 submissions must release code and document pre-trained dependencies. The frozen ASR used for Delta WER is publicly identified for independent reproducibility.

Dataset

Vaani Corpus

A large-scale, openly released Indic speech dataset spanning multiple Indian languages, collected across districts of India. Learn more at vaani.iisc.ac.in or read the paper. The challenge subset provides ~167 hours of training audio across seven noise-event categories, with a ~16.7-hour evaluation set (proportional to training distribution, ~10% of each category).

Annotations include onset/offset timestamps and category labels. The training set is imbalanced and reflects the natural frequency of events in real Vaani recordings: Human Non-Speech, Animal, and Vehicle / Traffic (~40 hrs each) dominate, while Appliance / Machine is comparatively rare (~1.89 hrs). Handling this long-tailed distribution is part of the task.

Sample Preview on HuggingFace
| # | Category | Example Events | Train (hrs) | Eval (hrs) |
|---|----------|----------------|-------------|------------|
| 1 | Animal | Barking, mooing, bird chirps, insect noise, cat, hen, goat | ~40 | ~4 |
| 2 | Vehicle / Traffic | Horns, engines, motorbikes, sirens, train, general traffic | ~40 | ~4 |
| 3 | Baby / Child | Crying, babbling, yelling, playing, child laughter | ~24.2 | ~2.42 |
| 4 | Singing / Music | Background music, singing, instruments, prayer, devotional | ~13.4 | ~1.34 |
| 5 | Phone / Signal / Alarm | Ringtones, beeps, alarms, sirens, bells, doorbells | ~7.98 | ~0.798 |
| 6 | Appliance / Machine | Fans, mixers, TVs, mics, typing, clocks, machinery | ~1.89 | ~0.189 |
| 7 | Human Non-Speech | Breathing, lip smacks, coughs, sneezes, snoring, throat clearing | ~40 | ~4 |
|   | Total | | ~167.47 | ~16.7 |
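
One common way to handle a long-tailed training set is to weight the loss (or the sampler) by inverse class frequency. The sketch below derives such weights from the training hours in the table above. The tempering exponent `power=0.5` and the normalisation to mean 1 are assumptions chosen for illustration, not anything prescribed by the organisers.

```python
# Training hours per category, copied from the dataset table.
TRAIN_HOURS = {
    "Animal": 40.0,
    "Vehicle / Traffic": 40.0,
    "Baby / Child": 24.2,
    "Singing / Music": 13.4,
    "Phone / Signal / Alarm": 7.98,
    "Appliance / Machine": 1.89,
    "Human Non-Speech": 40.0,
}

def class_weights(hours, power=0.5):
    """Inverse-frequency weights, tempered by `power` and normalised to mean 1.

    power=1.0 is plain inverse frequency; power=0.5 (assumed here) limits
    how strongly the rarest class is up-weighted.
    """
    raw = {name: (1.0 / h) ** power for name, h in hours.items()}
    mean = sum(raw.values()) / len(raw)
    return {name: w / mean for name, w in raw.items()}

weights = class_weights(TRAIN_HOURS)
for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name:24s} {w:.2f}")
```

With plain inverse frequency the rarest class would receive roughly 21 times the weight of the 40-hour classes (40 / 1.89); the square-root tempering reduces that to about 4.6 times, which tends to be less prone to overfitting the ~1.89 hours of Appliance / Machine audio.
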

Noise Event Taxonomy & Training Distribution

Seven noise-event categories are annotated with onset/offset timestamps. The training set follows a long-tailed distribution:

Animal: ~40 hrs
Vehicle / Traffic: ~40 hrs
Baby / Child: ~24.2 hrs
Singing / Music: ~13.4 hrs
Phone / Signal / Alarm: ~7.98 hrs
Appliance / Machine: ~1.89 hrs
Human Non-Speech: ~40 hrs

The training set is intentionally imbalanced — Appliance/Machine (~1.89 hrs) is the rarest class. Handling this long tail is part of the challenge.

Animal

Barking, mooing, bird calls & chirps, cricket/insect noise, cat meows, hen, rooster, goats

bird_sound, barking, insect_noise, crow_sound, meow, cricket_chirping
~40 hrs

Vehicle / Traffic

Horns, engines, motorcycles, cars, trains, ambulance sirens, general traffic noise

horn, vehicle_noise, motorcycle, engine_running, traffic_noise, train_sound
~40 hrs

Baby / Child

Crying, babbling, yelling, screaming, laughing, playing

baby_crying, child_talking, child_yelling, children_talking, baby_noise
~24.2 hrs

Singing / Music

Background music, singing, instruments, whistling melodies, drums, flute, prayer/devotional music

music, singing, song, background_music, flute, whistling
~13.4 hrs

Phone / Signal / Alarm

Phone ringing, ringtones, beeps, vibrations, alarms, sirens, buzzers, bells, doorbells, notifications

phone_ringing, ringtone, beep, vibration, siren, notification, bell
~7.98 hrs

Appliance / Machine

Fans, mixers, TVs, mics/loudspeakers, typing, clocks, washing machines, generic machinery

fan_noise, fan_whirring, machine_noise, television, mic_noise, typing, clock_ticking
~1.89 hrs

Human Non-Speech

Breathing, lip smacks, coughs, sneezes, yawns, throat-clearing, snoring, hiccups

cough, sneezing, snoring, ...
~40 hrs

Important Dates

Registration Opens

TBA

Development Phase

TBA

Test Phase

TBA

Results Announcement

TBA

Presentation at IndoML 2026

TBA

All deadlines will be at 12:00 Noon IST (Indian Standard Time).

Prizes

Cash Prizes

Exciting cash prizes await top-performing teams. Detailed prize distribution will be announced along with the registration opening.

Present at IndoML 2026

Top teams will be invited to present their solutions at IndoML 2026, in front of leading researchers from academia and industry.

Novelty Award

An expert panel award for the most original methodological contribution: new architectures, novel training regimes, or unsupervised approaches.

Top-5 entries per track must release training and inference code under a permissive open-source licence and submit a short (<=4 pages) system description.

FAQ

Who can participate?

Students and early-career professionals are welcome. Each team must include at least one member affiliated with an Indian university or research institution.

Is there a team size limit?

There is no restriction on team size. However, each participant may only join one team.

What models and data can we use?

We follow an open model and open data policy. Teams may use any publicly available, closed-source, or proprietary models, along with additional data or augmentation strategies.

How will submissions be evaluated?

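Submissions are scored automatically on held-out test clips. Track 1 is ranked by the equal-weight average of event-based F1 and segment Dice. Track 2 is ranked by the equal-weight average of SI-SDR and Delta WER, with PESQ evaluated for the top-5 entries. An expert novelty score then adjusts the final top-5 standings in each track. See the Evaluation Metrics section for details.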

Will top teams get travel support?

Top-performing teams will be invited to present at IndoML 2026. Details regarding travel support will be communicated later.
