Creative Cooking: Network Analysis of Ingredient Swaps
Thanksgiving, the American holiday of gluttony (a sin of which nearly all of us are guilty at some time in our lives), is a good demonstration of that most human of inventions: cooking.
As anyone who has looked at shelves full of cookbooks can readily attest, there are a vast number of variations on any basic recipe. Well, it turns out that some academics decided to use tools from the computer science toolbox to look at recipes and see what they could discover:
Researchers mined more than 40,000 recipes and nearly two million reviews from the website allrecipes.com, investigating various aspects of cooking and ingredient preferences. ‘We wondered if the analysis would let us see how flexible recipes are,’ says coauthor Lada Adamic of the University of Michigan in Ann Arbor.
Her team discovered that there’s a lot of wiggle room. The analysis, reported online November 16 at arXiv.org, identified several clusters of ingredients that can be swapped for one another.
Adamic, Michigan’s Chun-Yuen Teng and Yu-Ru Lin of Harvard and Northeastern University in Boston generated a list of the top 1,000 ingredients, which accounted for 94.9 percent of the ingredients in the database. The team noted heating methods, such as broiling and simmering, and various food processing techniques, both mechanical (such as grinding) and chemical (such as marinating). Recipe ratings and regional preferences within the United States were also taken into account.
Then the team created a huge diagram of ingredients that are frequently substituted for one another, yielding a network of clustered communities of swappable ingredients. A sweet potato community, for example, includes yams, pumpkin, potatoes, parsnips and butternut squash. Milk, butter, chicken broth and sugar all have recipe doppelgangers. And in some instances, dropping ingredients altogether won’t hurt a dish.
A second ingredient diagram connected complementary ingredients, pairings that are found together more often than expected by chance. This network split cleanly into two communities, one sweet and one savory. The only other community detected was a small satellite cluster comprising alcoholic drink ingredients.
Other foodie preferences also emerged. Recipes that called for processing foods in some way, rather than just tossing ingredients together, were rated more favorably. This link could relate to a longstanding hypothesis regarding the development of bigger brains in the evolution of humans and our hominid relatives, the team speculates. Processing food mechanically and chemically makes extracting nutrients easier, reducing the cost of digestion. Such techniques may have allowed more nutritional resources to be allocated toward growing bigger brains [….]
Which cooking method is preferred, however, appears to depend on regional tastes. While baking is popular everywhere, marinating and grilling are favored in the West and Mountain regions, and in the West this often entails seafood. Frying is especially popular in the South and Northeast, a trend that prompted Teng to look more closely at the data. The recipes suggest that while the frying signature of the South emerges from the soul food tradition, Northeasterners use a lot of bacon (especially in chowdah) and have a lot of recipes for buffalo wings.
And the research project helped the scientists feel slightly more at ease in their own kitchens. ‘I’ve felt more comfortable leaving out nutmeg,’ Adamic says.
An example of a clustering/substitution diagram is that for cinnamon:
The abstract from the paper:
The recording and sharing of cooking recipes, a human activity dating back thousands of years, naturally became an early and prominent social use of the web. The resulting online recipe collections are repositories of ingredient combinations and cooking methods whose large-scale and variety yield interesting insights about both the fundamentals of cooking and user preferences. These insights include preferences for cooking methods depending on the nutritional value extracted from food, and the geographic region from which the recipe originates. At the level of an individual ingredient we measure whether it tends to be essential or can be dropped or added, and whether its quantity can be modified. We also construct two types of networks to capture the relationships between ingredients. The complement network captures which ingredients tend to co-occur frequently, and is composed of two large communities: one savory, the other sweet. The substitute network, derived from user generated suggestions for modifications, can be decomposed into many communities of functionally equivalent ingredients, and captures users’ preference for healthier variants of a recipe. Our experiments reveal that recipe ratings can be well predicted with features derived from combinations of ingredient networks and nutrition information.
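For the curious, here is a rough idea of what "found together more often than expected by chance" can mean in practice. The sketch below scores ingredient pairs with pointwise mutual information (PMI), a common co-occurrence measure. Treat the choice of PMI as my assumption rather than a restatement of the paper's exact procedure, and note that the recipes are a toy set invented for illustration, not data from allrecipes.com.

```python
import math
from collections import Counter
from itertools import combinations

# Toy recipes, invented for illustration (not from the allrecipes.com data).
recipes = [
    {"flour", "sugar", "butter", "cinnamon"},
    {"flour", "sugar", "butter", "egg"},
    {"sugar", "butter", "cinnamon", "egg"},
    {"onion", "garlic", "chicken broth", "butter"},
    {"onion", "garlic", "chicken broth", "potato"},
    {"onion", "garlic", "potato", "flour"},
]

def pmi_scores(recipes):
    """Return {frozenset({a, b}): PMI} for every pair that co-occurs at least once."""
    n = len(recipes)
    # How many recipes contain each ingredient.
    single = Counter(ing for recipe in recipes for ing in recipe)
    # How many recipes contain each unordered pair of ingredients.
    pair = Counter(
        frozenset(p) for recipe in recipes for p in combinations(sorted(recipe), 2)
    )
    scores = {}
    for p, count in pair.items():
        a, b = tuple(p)
        observed = count / n                          # P(a and b)
        expected = (single[a] / n) * (single[b] / n)  # P(a) * P(b) if independent
        scores[p] = math.log(observed / expected)
    return scores

scores = pmi_scores(recipes)
# Positive PMI: the pair shows up together more often than chance would predict.
for p, score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(sorted(p), round(score, 2))
```

Pairs with a positive score would become edges of a complement network. In this toy set, onion and garlic score well above zero while butter and onion score below it, which is the sweet/savory split in miniature.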
The paper itself is fairly readable, and the PDF has several neat diagrams to represent analyses of aspects of ingredient substitutions.
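If you want to play with the swap idea yourself, a minimal sketch follows. It builds an undirected graph from (ingredient, replacement) pairs and groups ingredients by connected components. That is a much cruder stand-in for the community detection the researchers describe, and I am using it only because it fits in a few lines. The sweet potato pairs echo the community named in the article above; the butter pairs are hypothetical, made up for the example.

```python
from collections import defaultdict

# (ingredient, suggested replacement) pairs. The sweet potato group mirrors the
# community named in the article; the butter pairs are invented for illustration.
swaps = [
    ("sweet potato", "yam"),
    ("sweet potato", "pumpkin"),
    ("pumpkin", "butternut squash"),
    ("potato", "parsnip"),
    ("sweet potato", "potato"),
    ("butter", "margarine"),
    ("butter", "applesauce"),
]

def swap_groups(swaps):
    """Group ingredients into connected components of the swap graph."""
    graph = defaultdict(set)
    for a, b in swaps:
        # Treat a swap as symmetric: if a can replace b, link both ways.
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups

groups = swap_groups(swaps)
for group in groups:
    print(sorted(group))
```

On a real substitution network, connected components would likely lump far too much together, since one stray suggestion is enough to join two groups. That is the reason proper community detection looks at how densely ingredients are linked, not just whether a path exists.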
My only question is: who let the computer scientists into the kitchen?