The Food.com 2023 Dataset

The dataset is a compilation of recipes spanning a wide range of cuisines and styles. It offers a perspective on what makes a recipe more than just a list of ingredients and steps. With over 500,000 recipes, it gives data enthusiasts, chefs, and food bloggers an opportunity to analyze and understand cooking trends on a macro scale.

Bootstrapping Estimates for Comment Likelihood, Hacker News: EDA II

In my previous Hacker News EDA we looked at how words could be embedded in two dimensions. This time we implement a bootstrapping simulator to see how posting time affects the number of comments a post receives.

Examining the dataset

To get an idea of what keywords are popular at different times of the day, we…
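To make the bootstrapping idea concrete, here is a minimal sketch of a percentile bootstrap for the mean number of comments at a given posting hour. It is not taken from the post: the function name `bootstrap_mean_ci`, the `comments_by_hour` dictionary, and every number in it are made up for illustration, and the post's simulator may be organised quite differently.

```python
import random
from statistics import mean

def bootstrap_mean_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical comment counts keyed by posting hour (made-up numbers).
comments_by_hour = {
    9: [0, 2, 5, 1, 0, 12, 3, 0, 7, 1],
    21: [0, 0, 1, 0, 4, 0, 2, 0, 0, 1],
}

for hour, counts in comments_by_hour.items():
    lo, hi = bootstrap_mean_ci(counts)
    print(f"hour {hour:02d}: mean={mean(counts):.2f}, CI=({lo:.2f}, {hi:.2f})")
```

If the intervals for two hours barely overlap, that suggests posting time matters; with samples this small the intervals will usually be wide.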

Solve a Substitution Cipher with a Markov chain

Photo by Mauro Sbicego on Unsplash

There are k! substitution ciphers for an alphabet with k letters, too many for an exhaustive search. With a frequency-based approach adapted to the graph of alphabetic ciphers, we redefine the act of deciphering as a sampling problem suitable for a Metropolis-Hastings random walk. A substitution cipher is thus solvable with a Markov chain. Let’s begin…
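The random walk can be sketched in a few lines. What follows is an illustration under stated assumptions, not the post's implementation: it assumes the frequency-based score is a smoothed bigram log-likelihood estimated from a reference text, and that neighbouring keys differ by swapping the images of two letters. The names `bigram_log_probs`, `score`, and `metropolis_decipher` are hypothetical, and the tiny reference string is a stand-in for a real English corpus.

```python
import math
import random
import string
from collections import Counter

ALPHABET = string.ascii_lowercase

def bigram_log_probs(reference, smoothing=1.0):
    """Smoothed log-probabilities of letter bigrams in a reference text."""
    letters = [c for c in reference.lower() if c in ALPHABET]
    counts = Counter(zip(letters, letters[1:]))
    total = sum(counts.values()) + smoothing * len(ALPHABET) ** 2
    return {
        (a, b): math.log((counts[(a, b)] + smoothing) / total)
        for a in ALPHABET for b in ALPHABET
    }

def score(text, key, log_probs):
    """Log-likelihood of `text` decoded with `key` (cipher letter -> plain letter)."""
    decoded = [key[c] for c in text if c in key]
    return sum(log_probs[pair] for pair in zip(decoded, decoded[1:]))

def metropolis_decipher(ciphertext, log_probs, n_steps=5000, seed=0):
    """Random walk over keys; returns the best-scoring key seen and its score."""
    rng = random.Random(seed)
    text = ciphertext.lower()
    perm = list(ALPHABET)
    rng.shuffle(perm)
    key = dict(zip(ALPHABET, perm))
    current = score(text, key, log_probs)
    best_key, best = dict(key), current
    for _ in range(n_steps):
        # Propose a neighbouring key by swapping the images of two letters.
        a, b = rng.sample(ALPHABET, 2)
        key[a], key[b] = key[b], key[a]
        proposed = score(text, key, log_probs)
        # Symmetric proposal: accept with probability min(1, exp(proposed - current)).
        if proposed >= current or rng.random() < math.exp(proposed - current):
            current = proposed
            if current > best:
                best_key, best = dict(key), current
        else:
            key[a], key[b] = key[b], key[a]  # reject: undo the swap
    return best_key, best

# Stand-in reference text; a real run would use a large English corpus.
reference = "it was the best of times it was the worst of times " * 20
log_probs = bigram_log_probs(reference)
ciphertext = "lw zdv wkh ehvw ri wlphv"
key, log_likelihood = metropolis_decipher(ciphertext, log_probs, n_steps=2000)
print("".join(key.get(c, c) for c in ciphertext))
```

Because the swap proposal is symmetric, the acceptance rule reduces to a ratio of likelihoods. With a reference text and ciphertext this short, the walk may well settle on a wrong key; longer texts and a larger corpus give the score more to work with.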

Plant Pairs on the Tufts Campus

Photo by Markus Spiske on Unsplash

In spring 2019 my Environmental Fieldwork class surveyed the herbaceous plants growing on and around the Tufts campus, recording their identities and locations in a GIS database. For a final project I created a simple Cartesian quadrature algorithm in Python to identify the distinct plant pairs most likely to share the same soil. Whether…
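The excerpt does not spell out the algorithm, so the following is only one possible reading of it: divide the survey area into square Cartesian grid cells and count which pairs of distinct species turn up in the same cell. The function `count_plant_pairs`, the `(species, x, y)` record layout, the cell size, and the sample records are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def count_plant_pairs(observations, cell_size):
    """Count how often two distinct species fall in the same square grid cell."""
    cells = {}
    for species, x, y in observations:
        # Map each coordinate to the integer index of its grid cell.
        cell = (int(x // cell_size), int(y // cell_size))
        cells.setdefault(cell, set()).add(species)
    pairs = Counter()
    for species_in_cell in cells.values():
        # Sorting gives each pair one canonical order, so (a, b) == (b, a).
        for pair in combinations(sorted(species_in_cell), 2):
            pairs[pair] += 1
    return pairs

# Hypothetical survey records: (species, x, y) in metres on a local grid.
observations = [
    ("clover", 1.0, 1.5), ("dandelion", 2.0, 0.5), ("plantain", 8.0, 9.0),
    ("clover", 11.0, 2.0), ("dandelion", 12.5, 3.0), ("plantain", 14.0, 1.0),
]
print(count_plant_pairs(observations, cell_size=5.0).most_common(3))
```

Pairs with the highest counts are the candidates for sharing the same soil. The result depends on the chosen cell size, so it is worth trying a few values.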