Researchers at MIT have developed a deep-learning algorithm that can compile a list of ingredients and even recommend recipes after looking at photos of food. The artificially intelligent system still needs some fine-tuning, but this tool could eventually help us learn to cook, count kilojoules, and track our eating habits.
Image: Social Media Dinner/Flickr
Imagine being able to snap a picture of a meal you’re devouring at your favourite restaurant, and having a smartphone app provide you with the list of ingredients, and even a recipe to help you make it at home. Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have taken us one step closer to this goal by beginning to train a system to do exactly that.
Using machine learning, the program analyses still images of food, and by referencing a massive database, it then predicts a list of ingredients and recommends a recipe. In tests, the system was able to retrieve the correct recipe around 65 per cent of the time. That’s a fairly decent success rate, especially considering how complicated and varied some meals can get.
To make it work, a research team led by CSAIL graduate student Nicholas Hynes collected data from websites such as All Recipes and Food.com to create a database, called Recipe1M, containing over a million recipes. All the recipes were annotated with information about the ingredients found in a wide variety of meals. A neural network was set loose on this trove of data, seeking out patterns and connections between images of food and the matching ingredients and recipes.
So when given a photo of a muffin, for instance, the system, dubbed Pic2Recipe, could correctly identify ingredients such as flour, eggs and butter. It then suggests several recipes from within the Recipe1M database that it figures are the closest match.
As Hynes told Gizmodo, the system is more than just a food recognition program.
“That the program recognises food is really just a side effect of how we used the available data to learn deep representations of recipes and images,” he said. “What we’re really exploring are the latent concepts captured by the model. For instance, has the model discovered the meaning of ‘fried’ and how it relates to ‘steamed’? We believe that it has and we’re now trying to extract the knowledge from the model to enable downstream applications that include improving people’s health.”
Hynes says the system, in its current form, isn’t designed to predict ingredients directly.
“What you can do, however, is encode an image and return the ingredients associated with the most similar recipe,” Hynes told Gizmodo. “What differentiates this from reverse image search is that we go directly from image to recipe instead of simply returning the recipe associated with the most similar image; in other words, we return the recipe that, according to the model, was most likely to have produced the query image.”
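The retrieval step Hynes describes can be pictured with a few lines of Python. This is a minimal sketch, not the CSAIL code: it assumes the photo and every recipe have already been encoded as vectors in a shared space, and the tiny three-number embeddings, the recipe names and the `most_similar_recipe` helper are all invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical recipe database: each entry pairs a made-up embedding
# with the ingredient list stored for that recipe.
RECIPES = {
    "blueberry muffins": {
        "embedding": [0.9, 0.1, 0.2],
        "ingredients": ["flour", "eggs", "butter", "blueberries"],
    },
    "vegetable lasagna": {
        "embedding": [0.1, 0.9, 0.3],
        "ingredients": ["pasta sheets", "tomato", "zucchini", "cheese"],
    },
    "berry smoothie": {
        "embedding": [0.2, 0.2, 0.9],
        "ingredients": ["banana", "milk", "berries"],
    },
}


def most_similar_recipe(image_embedding, recipes):
    """Pick the recipe whose embedding is closest to the image embedding
    and return its name together with its stored ingredients."""
    best_name = max(
        recipes,
        key=lambda name: cosine_similarity(image_embedding, recipes[name]["embedding"]),
    )
    return best_name, recipes[best_name]["ingredients"]


# Stand-in for the output of an image encoder given a photo of a muffin.
query_embedding = [0.8, 0.2, 0.1]
name, ingredients = most_similar_recipe(query_embedding, RECIPES)
print(name, ingredients)
```

Note that the ingredients are looked up from the matched recipe rather than predicted from the pixels, which is the distinction Hynes draws.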
The system performed well with relatively simple foods, such as cookies or muffins. But when confronted with more complex and ambiguous foods, such as sushi rolls or smoothies, the system struggled. It also had a hard time parsing foods for which there is a near-endless supply of recipes. Lasagna is a great example. There are, like, a gajillion ways to make a lasagna, so the CSAIL researchers had to make sure the AI wouldn’t “penalise” or exclude recipes that were similar when trying to parse one lasagna from another. A workaround was to get the AI to check whether the ingredients in each recipe were generally similar before comparing the recipes themselves.
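One simple way to picture an ingredient check like that is a set-overlap score. The sketch below uses Jaccard similarity, which is an assumption made for illustration; the researchers' actual measure isn't described here, and the recipes, the `generally_similar` helper and the 0.5 threshold are all made up.

```python
def ingredient_overlap(ingredients_a, ingredients_b):
    """Jaccard similarity: shared ingredients divided by all distinct ingredients."""
    set_a, set_b = set(ingredients_a), set(ingredients_b)
    if not set_a and not set_b:
        return 0.0  # two empty lists tell us nothing, so treat as no overlap
    return len(set_a & set_b) / len(set_a | set_b)


# Arbitrary cut-off chosen for this example only.
SAME_DISH_THRESHOLD = 0.5


def generally_similar(ingredients_a, ingredients_b, threshold=SAME_DISH_THRESHOLD):
    """Return True when two recipes share enough ingredients to count as
    variations on the same dish."""
    return ingredient_overlap(ingredients_a, ingredients_b) >= threshold


classic_lasagna = ["pasta sheets", "beef mince", "tomato", "onion", "bechamel", "cheese"]
veggie_lasagna = ["pasta sheets", "zucchini", "tomato", "onion", "bechamel", "cheese"]
berry_smoothie = ["banana", "milk", "berries", "honey"]

print(generally_similar(classic_lasagna, veggie_lasagna))
print(generally_similar(classic_lasagna, berry_smoothie))
```

Two lasagnas that differ by one ingredient score well above the threshold, so a system using a check like this would treat them as near-equivalent answers rather than marking one of them wrong.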
Looking ahead, the researchers are hoping to train the system so that it can better understand how food is prepared (for example, boiling, frying, slicing, dicing), and tell the difference between food types (for example, mushrooms and onions). They’re also hoping to turn the system into a “dinner aide”, where a person can key in their dietary preferences and a list of food items available in the home, and the AI devises a meal based around those constraints.
“This could potentially help people figure out what’s in their food when they don’t have explicit nutritional information,” said Hynes. “For example, if you know what ingredients went into a dish but not the amount, you can take a photo, enter the ingredients, and run the model to find a similar recipe with known quantities, and then use that information to approximate your own meal.”
Conceptually, the system should also be able to perform a kilojoule count, and indeed, Hynes is currently looking into this.
It will be a while before you see an app like this on your smartphone, and even when it does appear, a system like this will only ever serve as a rough guide. Just because you know the ingredients of a meal and how it might have been put together doesn’t suddenly mean you’re a master chef.
The CSAIL team plans to present its findings later this month at the Computer Vision and Pattern Recognition conference in Honolulu.