MLEnd Yummy
A dataset of enriched images of food from diverse cuisines
About Dataset
Fuel or pleasure, food is present in our everyday lives and some would say, defines who we are: Tell me what you eat, and I will tell you who you are.
Background, environment, means; circumstances, habits, beliefs; health, interests, goals, and a pinch of luck, they all contribute to our daily food choices. Such choices might indeed give us a glimpse of who we are, but our relationship with food goes beyond our individual, daily choices. If we are shown one single dish, what would we call it? How would we describe it? How would we categorise it?
Food pictures continuously flood the internet and the machine learning community has long been scraping the web to build models that identify food on images. The resulting datasets have been invaluable for the purpose of food classification but unfortunately, do not allow us to explore the humam dimension. The MLEnd Yummy Dataset is a collection of more than 3,000 enriched images from more than 200 participants that opens a window to our relationship as humans with food. This dataset was created by students at the School of Electronic Engineering and Computer Science, Queen Mary University of London. Students took pictures of dishes that they were about to eat, ate them and then annotated each picture, adding attributes such as the dish name, ingredients, whether they liked their food or not, or whether they had them at home or at a particular restaurant.
Enjoy!
Sample of data
Here are some examples of data:
Download Data: Small set
Install mlend
To download the Yummy dataset, first step is to install mlend
library. Use pip to install library.
pip install mlend
Small Yummy dataset: To get started
To download small subset of the data, that includes 99 images of Rice and Chips, use following piece of code:
import mlend
from mlend import download_yummy_small, yummy_small_load
baseDir = download_yummy_small(save_to = '../MLEnd')
This code will download data in given path (‘../MLEnd’) and returns the path of data as datadir (='../MLEnd/yummy')
To read dataset with trainig and testing split using pre-defined ‘Bechmark’ use following code:
TrainSet, TestSet, Map = yummy_small_load(datadir_main=baseDir,train_test_split='Benchmark_A')
A starter-kit
A starter-kit is prepared hosted with Google-Colab and Binder. Use following links to open in colab or binder.
Download Data: Full
To download full yummy dataset make sure to updgrade mlend
library to version>1.0.0.2
pip install mlend --upgrade
To download full dataset, use following piece of code
import mlend
from mlend import download_yummy, yummy_load
subset = {}
datadir = download_yummy(save_to = '../MLEnd', subset = subset,verbose=1,overwrite=False)
It will download all the images (3K+) in folder ../MLEnd/yummy/MLEndYD_images
directory
Download partial data
Alternately, to download subset of data use following piece of code
import mlend
from mlend import download_yummy, yummy_load
subset = {'Diet':['vegan'], 'Home_or_restaurant':['home']}
datadir = download_yummy(save_to = '../MLEnd',
subset = subset,verbose=1,overwrite=False)
It will download images of vegan dish cooked at home
NOTE: Above code might raise HTTPError
, if any of the image file, part of subset selection is not found to download. All the files are still being uploaded on cloud. To avoid this error, use following code:
datadir, FILE_ERROR = download_yummy(save_to = '../MLEnd',
subset = subset,verbose=1,overwrite=False,debug_mode=True)
This will return the list of image files (FILE_ERROR
) that are not found on cloud.
Load the Data with benchmark sets
After downloading partial or full dataset, mlend allows you to load the dataset with specified method of training and testing split. Note, mlend doesn’t load the image files in memory, instead it reads the path of files, for further reading and cleaning data as per requirement of the model. For more details, check help(yummy_load)
.
import mlend
from mlend import download_yummy, yummy_load
subset = {'Diet':['vegan'], 'Home_or_restaurant':['home']}
datadir = download_yummy(save_to = '../MLEnd',
subset = subset,verbose=1,overwrite=False)
TrainSet, TestSet, MAPs = yummy_load(datadir_main = datadir,encode_labels=True,)
MLEnd Documentation
For mlend documentation use help(fun)
in python terminal or Jupyter-notebook. Alternately, check out