MIDS Capstone Project Fall 2024

HealthBite

Team members

Problem & Motivation

Three of the top chronic diseases in the US today – type 2 diabetes, hypertension (high blood pressure), and cardiovascular disease – affect over 100 million Americans, according to the American Heart Association. These conditions require strict adherence to dietary guidelines provided by healthcare professionals, as proper nutrition directly impacts disease management and quality of life. Research shows that many patients struggle to follow these guidelines consistently, leading to significant preventable healthcare costs.

The problem we aimed to solve was developing an effective application to provide personalized meal plans based on a user's chronic condition(s), nutritional needs, and dietary preferences. Our solution needed to address three key challenges: simplifying complex medical guidelines into actionable meal choices, accounting for various health conditions, and creating plans that patients would actually follow. Most importantly, this solution needed to require minimal effort from users, as studies indicate that complex meal planning systems often face declining adherence rates over time.

This represented a compelling opportunity because recent advances in artificial intelligence and recommendation systems have enabled the processing of medical guidelines into personalized meal planning at scale. The growing adoption of digital health solutions indicates strong market readiness. Our solution sought to bridge the gap between medical guidelines and practical implementation, making healthy eating both accessible and sustainable for chronic disease patients.

Our Solution

We developed HealthBite, a meal recommendation application designed specifically for chronic disease patients. Our solution draws on two key data sources: medically backed dietary guidelines from leading research organizations such as the National Library of Medicine and Harvard Medical School, and a comprehensive recipe database from Edamam.

Our app streamlines meal planning by automatically generating personalized meals that align with medical guidelines while respecting individual preferences. For each meal, HealthBite provides detailed ingredient lists and direct links to cooking instructions, eliminating the need for users to search multiple sources. This end-to-end solution transforms complex dietary requirements into practical, actionable meal choices.

Data Source

HealthBite's recommendations are powered by two comprehensive datasets. The first consists of detailed dietary guidelines defining recommended and restricted foods across multiple categories including proteins, fruits, vegetables, and grains. We compiled this dataset by synthesizing research from leading medical organizations including the National Library of Medicine, Office of Disease Prevention and Health Promotion, Harvard Medical School, and the American Heart Association.

Our second dataset comes from Edamam, a leading provider of recipes and nutrition data and analytics. Through this partnership, we gained access to 65,000 recipes complete with detailed attributes including ingredients, nutritional content, calorie counts, health labels, cuisine types, and preparation instructions. The combination of evidence-based dietary guidelines with this extensive recipe database enables HealthBite to generate meal plans that are both medically appropriate and practically executable.

Data Science Approach

Database Vectorization

Our approach began with embedding the unstructured dietary guidelines and the structured recipe data into a vector database to support a Retrieval-Augmented Generation (RAG) system. We integrated a GPT-3-powered agent with a Pinecone vector database, using OpenAI's embedding model to convert textual data into high-dimensional vectors. This vectorization process enabled efficient semantic search and retrieval of relevant dietary information and recipes based on user requirements.
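The retrieval step described above can be illustrated with a minimal sketch. It is not the HealthBite implementation: the bag-of-words `embed` function stands in for OpenAI's embedding model, the `ToyVectorIndex` class stands in for Pinecone, and the three example texts are invented for illustration. Only the overall pattern (embed, store, query by similarity) follows the description.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumed stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorIndex:
    """In-memory stand-in for a hosted vector database (hypothetical class)."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def upsert(self, item_id: str, text: str) -> None:
        self._items.append((item_id, embed(text)))

    def query(self, text: str, top_k: int = 2) -> list[tuple[str, float]]:
        q = embed(text)
        scored = [(item_id, cosine(q, vec)) for item_id, vec in self._items]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Invented example records: one guideline snippet and two recipes.
index = ToyVectorIndex()
index.upsert("guideline-1", "Hypertension: limit sodium and choose low sodium vegetables")
index.upsert("recipe-1", "Grilled salmon with low sodium vegetables")
index.upsert("recipe-2", "Chocolate cake with buttercream frosting")

results = index.query("low sodium dinner for hypertension", top_k=2)
print(results)
```

A real embedding model produces dense vectors that capture meaning beyond exact word overlap, which is what makes the search semantic; the toy version only matches shared words.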

Model Architecture

Our development process evolved through several iterations of agent-based frameworks. We initially implemented Microsoft's AutoGen for task orchestration but encountered scalability limitations. This led us to explore LangChain, which offered improved simplicity in connecting to Large Language Models (LLMs) and managing task sequences. Ultimately, we adopted LangGraph, an extension of LangChain, for its superior handling of complex workflows. LangGraph's graph-based architecture, with predefined nodes and edges, provided the structured task management necessary for handling meal plan modifications, dietary restrictions, and cuisine preferences effectively.
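To make the idea of predefined nodes and edges concrete, the sketch below builds a tiny graph runner in plain Python. It deliberately does not use LangGraph's API: `ToyGraph`, the node functions, the keyword-based intent check, and the example meals are all hypothetical, chosen only to show how a conditional edge can route a request either to plan generation or to plan modification.

```python
from typing import Any, Callable

State = dict[str, Any]
END = "__end__"  # sentinel marking the end of the workflow


class ToyGraph:
    """Minimal node/edge runner; an illustration, not LangGraph itself."""

    def __init__(self) -> None:
        self.nodes: dict[str, Callable[[State], State]] = {}
        self.routers: dict[str, Callable[[State], str]] = {}

    def add_node(self, name: str, fn: Callable[[State], State]) -> None:
        self.nodes[name] = fn

    def add_edge(self, source: str, target: str) -> None:
        self.routers[source] = lambda _state: target

    def add_conditional_edge(self, source: str, router: Callable[[State], str]) -> None:
        self.routers[source] = router

    def run(self, start: str, state: State) -> State:
        current = start
        while current != END:
            state = self.nodes[current](state)
            current = self.routers[current](state)
        return state


def parse_request(state: State) -> State:
    # Assumed keyword check; a real agent would ask an LLM to classify intent.
    text = state["message"].lower()
    intent = "modify" if "swap" in text or "replace" in text else "generate"
    return {**state, "intent": intent}


def generate_plan(state: State) -> State:
    return {**state, "plan": ["oatmeal", "lentil soup", "baked salmon"]}


def modify_plan(state: State) -> State:
    plan = list(state.get("plan", []))
    if plan:
        plan[-1] = "grilled tofu"  # replace the last meal as a placeholder edit
    return {**state, "plan": plan}


graph = ToyGraph()
graph.add_node("parse_request", parse_request)
graph.add_node("generate_plan", generate_plan)
graph.add_node("modify_plan", modify_plan)
graph.add_conditional_edge(
    "parse_request",
    lambda s: "modify_plan" if s["intent"] == "modify" else "generate_plan",
)
graph.add_edge("generate_plan", END)
graph.add_edge("modify_plan", END)

new_plan = graph.run("parse_request", {"message": "Plan my meals for today"})
updated = graph.run(
    "parse_request",
    {"message": "Please swap my dinner", "plan": new_plan["plan"]},
)
print(new_plan["plan"], updated["plan"])
```

Because every transition is declared up front, each request follows one of a small number of known paths, which is the kind of structured task management the paragraph attributes to a graph-based design.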

For production deployment, we implemented a cloud-based architecture using AWS services. The front-end is hosted on AWS Amplify, while our LLM agents run on Lambda functions. We utilize two databases: DynamoDB for managing user state and Pinecone for vector-based recipe and dietary guideline storage.
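As a rough illustration of how a Lambda function might load and save per-user state, the sketch below uses the standard Python handler signature `handler(event, context)`. Everything else is an assumption: the event is taken to be an HTTP-style request with a JSON `body`, the field names `user_id`, `message`, and `turns` are invented, and `InMemoryStateStore` is a stand-in so the example runs without AWS. In a deployment, that class would be replaced by calls to a DynamoDB table.

```python
import json
from typing import Any


class InMemoryStateStore:
    """Stand-in for a DynamoDB table keyed by user_id (hypothetical schema)."""

    def __init__(self) -> None:
        self._rows: dict[str, dict[str, Any]] = {}

    def get(self, user_id: str) -> dict[str, Any]:
        # Return a fresh default record for users seen for the first time.
        return dict(self._rows.get(user_id, {"user_id": user_id, "turns": 0}))

    def put(self, item: dict[str, Any]) -> None:
        self._rows[item["user_id"]] = dict(item)


STORE = InMemoryStateStore()


def handler(event: dict[str, Any], context: Any, store: InMemoryStateStore = STORE) -> dict[str, Any]:
    """Load user state, record the new message, and persist the result."""
    body = json.loads(event.get("body") or "{}")
    user_id = body.get("user_id")
    if not user_id:
        return {"statusCode": 400, "body": json.dumps({"error": "user_id is required"})}

    state = store.get(user_id)
    state["turns"] += 1
    state["last_message"] = body.get("message", "")
    store.put(state)

    # A real handler would invoke the LLM agent here before responding.
    return {"statusCode": 200, "body": json.dumps({"user_id": user_id, "turns": state["turns"]})}


demo = handler({"body": json.dumps({"user_id": "demo-user", "message": "hello"})}, None)
print(demo)
```

Keeping conversation state in a database rather than in the function fits the stateless nature of Lambda, where consecutive requests from the same user may be served by different function instances.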

Evaluation

SME Evaluations

To validate HealthBite's effectiveness, we conducted rigorous testing with Subject Matter Experts (SMEs) who regularly work with chronic disease patients. Our evaluation panel consisted of licensed dietitians and nutritionists who assessed the application's recommendations across five diverse patient scenarios. Each scenario included specific combinations of chronic conditions, dietary restrictions, and meal preferences.

The evaluation focused on two key metrics:

Dietary Guidelines Accuracy: Measures how well the recommended meals align with medical dietary guidelines
Meal Plan Suitability Accuracy: Assesses the practicality and appropriateness of the meal recommendations

Our results showed a Dietary Guidelines Accuracy score of 67% and a Meal Plan Suitability Accuracy score of 77%. While these scores demonstrate the viability of our approach, they fell below our target threshold of 85%. Analysis of the feedback revealed opportunities for improvement in handling complex dietary interactions and accommodating diverse cultural preferences. Our roadmap prioritizes these areas for enhancement, with an eventual target score of 95% for both metrics.
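The write-up does not state how the SME ratings were aggregated into a percentage. One plausible reading, sketched below, is that each reviewed meal receives a pass/fail judgment and the score is the fraction of passes. The ratings in the example are invented purely to show the arithmetic and do not reproduce the reported results; only the 85% threshold comes from the text.

```python
TARGET = 0.85  # target threshold stated in the evaluation summary


def accuracy(ratings: list[bool]) -> float:
    """Fraction of meals the reviewers judged acceptable."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(ratings) / len(ratings)


# Hypothetical pass/fail ratings for two made-up patient scenarios.
scenario_ratings = {
    "scenario_a": [True, True, False, True],
    "scenario_b": [True, False, False, True],
}

all_ratings = [r for ratings in scenario_ratings.values() for r in ratings]
score = accuracy(all_ratings)
meets_target = score >= TARGET

print(f"score={score:.1%}, meets target: {meets_target}")
```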

LLM as a Judge

In addition to human evaluation, we explored using Large Language Models (LLMs) as automated judges to evaluate our system's outputs. This approach involved creating detailed evaluation criteria based on medical dietary guidelines and using GPT-4 to assess each meal plan against these standards. For each recommendation, the LLM judge analyzed nutritional compliance, meal variety, and adherence to condition-specific restrictions. We structured our prompts so that the LLM generated numerical scores, similar to our SME evaluation metrics, along with detailed feedback across multiple dimensions. While this automated evaluation approach provided consistent and scalable feedback, we used it to complement our SME evaluations rather than replace them.
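A judge prompt of the kind described might be assembled and its reply validated as in the sketch below. The three criteria come from the paragraph above; the prompt wording, the 1 to 5 scale, the JSON reply format, and the function names are assumptions. The model call itself is omitted, and `canned_reply` is a hand-written example, not real model output.

```python
import json

# Criteria named in the text; the identifiers themselves are assumed.
CRITERIA = ("nutritional_compliance", "meal_variety", "restriction_adherence")


def build_judge_prompt(condition: str, meal_plan: list[str]) -> str:
    """Assemble a prompt asking the judge for scores and feedback as JSON."""
    meals = "\n".join(f"- {meal}" for meal in meal_plan)
    keys = ", ".join(CRITERIA)
    return (
        f"You are reviewing a meal plan for a patient with {condition}.\n"
        f"Meal plan:\n{meals}\n"
        f"Score each of these criteria from 1 to 5: {keys}.\n"
        'Reply with JSON: {"scores": {...}, "feedback": "..."}'
    )


def parse_judge_response(raw: str) -> tuple[dict[str, float], str]:
    """Validate the judge's JSON reply and return (scores, feedback)."""
    data = json.loads(raw)
    scores: dict[str, float] = {}
    for name in CRITERIA:
        value = data["scores"][name]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 1 <= value <= 5:
            raise ValueError(f"invalid score for {name}: {value!r}")
        scores[name] = float(value)
    return scores, str(data.get("feedback", ""))


prompt = build_judge_prompt("hypertension", ["oatmeal with berries", "lentil soup"])

# Hand-written example reply used only to exercise the parser.
canned_reply = json.dumps({
    "scores": {"nutritional_compliance": 4, "meal_variety": 3, "restriction_adherence": 5},
    "feedback": "Low in sodium; consider adding more variety at lunch.",
})
scores, feedback = parse_judge_response(canned_reply)
print(scores, feedback)
```

Requesting a fixed JSON structure and rejecting out-of-range values makes the judge's output easier to compare with human ratings and guards against malformed replies.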

Key Learnings & Impact

Our project yielded several significant technical insights. Working with vector databases, we gained practical experience in embedding both unstructured and structured data using Pinecone. Through our exploration of different frameworks, we discovered the relative strengths and limitations of AutoGen, LangChain, and LangGraph for agent-based systems. This progression enhanced our understanding of how to effectively architect complex AI workflows. Additionally, we learned valuable lessons about user experience design, particularly in collecting health data efficiently, and about crafting effective prompts for LLM interactions.

The potential impact of HealthBite extends beyond technical achievements. Chronic disease management through proper nutrition represents a significant market opportunity, with millions of patients struggling to maintain appropriate diets. More importantly, our application addresses a critical healthcare need by making dietary compliance more accessible and sustainable. This aligns with our core mission:

HealthBite Mission Statement: We believe in the power of food as medicine and strive to make healthy eating both accessible and convenient by providing personalized meal plans tailored to each user's medical condition and dietary needs. By streamlining the entire meal planning process, from concept to preparation, we enable users to manage their health with greater confidence and ease.

Acknowledgements

We extend our sincere gratitude to our Capstone instructors, Joyce Shen and Korin Reid, for their invaluable guidance and mentorship throughout this project. Special thanks to our subject matter experts - Jenn Tao, Laura Tousaw, and Hyemyung Kim - whose clinical expertise and detailed feedback were crucial in validating and improving HealthBite. We also thank our key data partner, Edamam, for providing us with a comprehensive dataset of recipes and nutrition data. Lastly, we acknowledge the contributions of the teaching assistants and our classmates, whose constructive feedback during project reviews helped shape our final solution.

Course

Data Science 210. Capstone, Fall 2024

More Information

HealthBite Website

HealthBite Final Presentation

Video

Last updated: December 13, 2024