Datasets for Speech Recognition Analysis

In this post we share top datasets for speech recognition. Speech emotion analysis is an important task that enables several application use cases. With the widespread use of smartphones, it has become viable to analyze speech commands captured by microphones for emotion understanding using on-device machine learning models.
- Google Audio Dataset
- Urbansound Dataset
- Spoken Digit Dataset
- Bird Audio Detection
- TensorFlow Speech Recognition
- Emotion Based Speech Recognition in the Wild
1. Google Audio Dataset
Google Audio Dataset is a large-scale dataset of manually annotated audio events. It contains 2.1 million annotated videos and 5.8 thousand hours of audio, covering 527 classes of annotated sounds.
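As a quick sanity check on these figures, the short calculation below estimates the average clip length. It assumes each annotated video contributes exactly one clip, which is an assumption and not something stated in this post. Under that assumption the result comes out at roughly 10 seconds per clip.

```python
# Figures quoted in the text above.
num_clips = 2_100_000   # annotated videos (assumed: one clip per video)
total_hours = 5_800     # total hours of audio

# Average clip length in seconds.
avg_clip_seconds = total_hours * 3600 / num_clips
print(f"Average clip length: {avg_clip_seconds:.1f} s")
```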
2. Urbansound Dataset
The Urbansound Dataset contains 1302 labeled sound recordings. Each recording is labeled with the start and end times of sound events from 10 classes: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, and street_music. Each recording may contain multiple sound events, but for each file only events from a single class are labeled.
The dataset is available in both CSV and JSON files. You can download it and grow your machine learning skills.
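To illustrate how start and end time labels can be used, here is a minimal sketch that sums the labeled event duration per class. The column names (`start`, `end`, `class`) and the sample rows are hypothetical and made up for this example; the real annotation files may be laid out differently, so check them before reusing this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical annotation content: start time (s), end time (s), class label.
# The real column layout may differ; check the files shipped with the dataset.
SAMPLE_ANNOTATIONS = """start,end,class
0.5,2.0,dog_bark
3.25,4.0,dog_bark
6.0,9.5,dog_bark
"""

def total_duration_per_class(csv_text):
    """Sum the duration (in seconds) of labeled events for each class."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["class"]] += float(row["end"]) - float(row["start"])
    return dict(totals)

durations = total_duration_per_class(SAMPLE_ANNOTATIONS)
print(durations)
```

For a real file you would pass the file's text instead of `SAMPLE_ANNOTATIONS`. Because only one class is labeled per file, each file should yield a single key.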
3. Spoken Digit Dataset
The Free Spoken Digit Dataset (FSDD) is a simple audio/speech dataset consisting of recordings of spoken digits in WAV files at 8 kHz. The recordings are trimmed so that they have near-minimal silence at the beginning and end. FSDD is an open dataset for everyone. It contains 2000 recordings of English pronunciations from 4 speakers.
Number of recordings: 2000
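The sketch below shows one way to read the label and sample rate from a recording using only the Python standard library. It assumes files are named `{digit}_{speaker}_{index}.wav` (for example `7_jackson_32.wav`); verify this against the files you download. The helper names are hypothetical, and the script writes a silent placeholder file so it runs without the dataset.

```python
import os
import tempfile
import wave

def parse_fsdd_name(filename):
    """Split a name like '7_jackson_32.wav' into (digit, speaker, index)."""
    stem, _ = os.path.splitext(os.path.basename(filename))
    digit, speaker, index = stem.split("_")
    return int(digit), speaker, int(index)

def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnframes() / rate

# Write a half-second silent placeholder so the sketch runs without the dataset.
tmp_dir = tempfile.mkdtemp()
example_path = os.path.join(tmp_dir, "7_jackson_32.wav")
with wave.open(example_path, "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(8000)   # 8 kHz, as stated for the dataset
    wav.writeframes(b"\x00\x00" * 4000)

print(parse_fsdd_name(example_path))
print(wav_info(example_path))
```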
4. Bird Audio Detection
Detecting bird sounds in audio is an important task for automatic wildlife monitoring, as well as in citizen science and audio library management. You can download the dataset from the following sources:
- freefield1010: data labels, audio files (5.8 GB zip, or via BitTorrent)
- Warblr: data labels, audio files (4.3 GB zip, or via BitTorrent)
5. TensorFlow Speech Recognition
Can you build an algorithm that understands simple speech commands? If yes, then this dataset is for you. Note: there are only 12 possible labels for the test set: yes, no, up, down, left, right, on, off, stop, go, silence, unknown. The unknown label should be used for a command that is neither one of the first 10 labels nor silence.
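The labeling rule in the note can be sketched as a small mapping function. The function name `to_test_label` and the example words are hypothetical and chosen only for illustration.

```python
# The 10 command labels listed in the note above.
COMMANDS = ("yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go")

def to_test_label(word):
    """Map a spoken word to one of the 12 test-set labels."""
    word = word.strip().lower()
    if word in COMMANDS or word == "silence":
        return word
    # Anything that is neither a listed command nor silence is 'unknown'.
    return "unknown"

examples = ["yes", "Stop", "silence", "bed", "seven"]
mapped = [to_test_label(w) for w in examples]
print(mapped)
```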
6. Emotion Based Speech Recognition in the Wild
The EmoSpeech Dataset is described as India's first keyword-emotion dataset. Aimed at surveillance use cases, it combines 3 datasets in one: 8K environment samples, 8K keyword samples, and 8K emotion samples.