Welcome to MLEnd Datasets!

The MLEnd Datasets are a collection of datasets acquired during the modules Principles of Machine Learning and Introduction to Data Science Programming, led by the Data Science and AI Teaching Group at Queen Mary University of London.

Following collaborative crowdsourcing approaches, students participate in the processes of data collection and curation. Then, they formulate and solve problems using the new dataset that they have co-created. This unique learning experience builds upon constructivism and a deployment-first perspective of machine learning, and allows our students to gain a deeper understanding of the significance of data collection and curation.

Explore the MLEnd Datasets

MLEnd Spoken Numerals

Spoken numerals with different intonations
  • 32K audio files from 154 speakers
  • Four different intonations
  • Demographic data

MLEnd Hums and Whistles

Humming and whistling to songs
  • 6K audio files, 8 songs
  • 2 interpretations: Hums and Whistles
  • 235 interpreters

MLEnd London Sounds

London acoustic scenes
  • 2.5K audio files
  • 6 areas of London, 6 spots per area
  • Indoor and outdoor scenes

MLEnd Happiness

Happiness with demographics
  • Tabular data
  • 310 items
  • 9 attributes

MLEnd Yummy

Enriched food images
  • 3K images
  • 200 participants
  • Participants' preferences and assessments