MLEnd Yummy

A dataset of enriched images of food from diverse cuisines


About Dataset


Fuel or pleasure, food is present in our everyday lives and some would say, defines who we are: Tell me what you eat, and I will tell you who you are.

Background, environment, means; circumstances, habits, beliefs; health, interests, goals, and a pinch of luck, they all contribute to our daily food choices. Such choices might indeed give us a glimpse of who we are, but our relationship with food goes beyond our individual, daily choices. If we are shown one single dish, what would we call it? How would we describe it? How would we categorise it?

Food pictures continuously flood the internet and the machine learning community has long been scraping the web to build models that identify food on images. The resulting datasets have been invaluable for the purpose of food classification but unfortunately, do not allow us to explore the humam dimension. The MLEnd Yummy Dataset is a collection of more than 3,000 enriched images from more than 200 participants that opens a window to our relationship as humans with food. This dataset was created by students at the School of Electronic Engineering and Computer Science, Queen Mary University of London. Students took pictures of dishes that they were about to eat, ate them and then annotated each picture, adding attributes such as the dish name, ingredients, whether they liked their food or not, or whether they had them at home or at a particular restaurant.

Enjoy!


Sample of data


Here are some examples of data:




Download Data: Small set

Install mlend

To download the Yummy dataset, first step is to install mlend library. Use pip to install library.

pip install mlend



Small Yummy dataset: To get started

To download small subset of the data, that includes 99 images of Rice and Chips, use following piece of code:

import mlend
from mlend import download_yummy_small, yummy_small_load

baseDir = download_yummy_small(save_to = '../MLEnd')

This code will download data in given path (‘../MLEnd’) and returns the path of data as datadir (='../MLEnd/yummy')

To read dataset with trainig and testing split using pre-defined ‘Bechmark’ use following code:


TrainSet, TestSet, Map = yummy_small_load(datadir_main=baseDir,train_test_split='Benchmark_A')

A starter-kit

A starter-kit is prepared hosted with Google-Colab and Binder. Use following links to open in colab or binder.


Open In Colab\


Open In Binder\



Download Data: Full

To download full yummy dataset make sure to updgrade mlend library to version>1.0.0.2


pip install mlend --upgrade

To download full dataset, use following piece of code

import mlend
from mlend import download_yummy, yummy_load

subset = {}

datadir = download_yummy(save_to = '../MLEnd', subset = subset,verbose=1,overwrite=False)

It will download all the images (3K+) in folder ../MLEnd/yummy/MLEndYD_images directory


Download partial data

Alternately, to download subset of data use following piece of code


import mlend
from mlend import download_yummy, yummy_load

subset = {'Diet':['vegan'], 'Home_or_restaurant':['home']}

datadir = download_yummy(save_to = '../MLEnd', 
                                   subset = subset,verbose=1,overwrite=False)

It will download images of vegan dish cooked at home



datadir, FILE_ERROR = download_yummy(save_to = '../MLEnd', 
                                   subset = subset,verbose=1,overwrite=False,debug_mode=True)

This will return the list of image files (FILE_ERROR) that are not found on cloud.




Load the Data with benchmark sets

After downloading partial or full dataset, mlend allows you to load the dataset with specified method of training and testing split. Note, mlend doesn’t load the image files in memory, instead it reads the path of files, for further reading and cleaning data as per requirement of the model. For more details, check help(yummy_load).


import mlend
from mlend import download_yummy, yummy_load

subset = {'Diet':['vegan'], 'Home_or_restaurant':['home']}

datadir = download_yummy(save_to = '../MLEnd', 
                                   subset = subset,verbose=1,overwrite=False)

TrainSet, TestSet, MAPs = yummy_load(datadir_main = datadir,encode_labels=True,)






Yummy
A Starter-kit
Yummy Dataset
Open In Colab Open In Binder\



MLEnd Documentation

For mlend documentation use help(fun) in python terminal or Jupyter-notebook. Alternately, check out

MLEnd Documentation