Home agAIn
Problem & Motivation
Homelessness remains a pressing issue in the U.S., with over 600,000 individuals affected in 2023². Despite ongoing efforts and funding, promised improvements in permanent housing rates have fallen short, prompting scrutiny of how resources are allocated. Our project uses historical data from homelessness support programs to create individualized service plans focused on improving permanent housing outcomes. Instead of leaving case managers to guess, Home agAIn uses predictive modeling to recommend the most effective services for each individual and quantifies each service's estimated impact on housing success.
Data Source & Data Science Approach
We used private participant data from a Southern California homeless support organization. Our approach began with a random forest model, for which we tested 243 hyperparameter combinations, to assess feature importance in predicting housing outcomes. To move toward causal insights, we then fit a logistic regression model, whose interpretable coefficients identify relationships between services and housing outcomes. We applied simple heuristics to determine eligibility for a subset of services; for example, being a Veteran is a prerequisite for receiving Veteran Services. We used AWS services, including S3, RDS, SageMaker, EC2, and Cognito, for data preprocessing, modeling, and integrating data into the web product.
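The modeling steps above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual pipeline: the hyperparameter names and values, the feature names (`case_management`, `veteran_services`), and the simulated effect sizes are all assumptions. It shows one way a grid could reach 243 combinations (five hyperparameters with three values each), how an eligibility rule might be enforced, and how logistic regression coefficients convert to odds ratios.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ParameterGrid

# Hypothetical random forest grid: 5 hyperparameters x 3 values = 3**5 = 243 combinations.
rf_grid = ParameterGrid({
    "n_estimators": [100, 300, 500],
    "max_depth": [5, 10, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": ["sqrt", "log2", None],
})
n_combinations = len(rf_grid)

# Synthetic participants (illustrative only, not the project's data).
rng = np.random.default_rng(0)
n = 5000
is_veteran = rng.integers(0, 2, size=n)
case_management = rng.integers(0, 2, size=n)
# Eligibility heuristic: Veteran Services can only be assigned to Veterans.
veteran_services = is_veteran * rng.integers(0, 2, size=n)

# Assumed effects used only to simulate a housing outcome.
log_odds = -1.5 + 1.0 * case_management + 0.8 * veteran_services
housed = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

feature_names = ["case_management", "veteran_services"]
X = np.column_stack([case_management, veteran_services])
model = LogisticRegression().fit(X, housed)

# exp(coefficient) is the multiplicative change in the odds of housing
# associated with receiving a service, holding the other feature fixed.
odds_ratios = dict(zip(feature_names, np.exp(model.coef_[0])))
print(n_combinations, odds_ratios)
```

An odds ratio above 1 indicates that a service is associated with higher odds of permanent housing in the fitted model. Whether that association is causal depends on what the model omits.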
System Architecture
Evaluation
The final logistic regression model achieved strong recall (0.84) but tended toward optimism, producing many false positives (precision = 0.30). However, because our project focuses on how services change the likelihood of permanent housing rather than on absolute probabilities, false positives were less problematic. We also identified the need for further exploration, in particular analyzing feature interactions and addressing omitted variables to strengthen causal claims.
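To make the recall and precision figures concrete, the sketch below builds a hypothetical set of predictions whose confusion counts reproduce recall of 0.84 and precision of 0.30. The counts themselves (84 true positives, 16 false negatives, 196 false positives, 704 true negatives) are invented for illustration and are not the project's evaluation data.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical confusion counts chosen to match the reported metrics.
tp, fn, fp, tn = 84, 16, 196, 704

# Build label arrays: actual positives first, then actual negatives.
y_true = np.array([1] * (tp + fn) + [0] * (fp + tn))
y_pred = np.array([1] * tp + [0] * fn + [1] * fp + [0] * tn)

recall = recall_score(y_true, y_pred)        # tp / (tp + fn)
precision = precision_score(y_true, y_pred)  # tp / (tp + fp)
print(f"recall={recall:.2f}, precision={precision:.2f}")
```

With these counts, the model finds most participants who reach permanent housing, but only about three in ten of its positive predictions are correct, which is the optimism described above.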
Key Learnings & Impacts
While the model has room for improvement, it effectively demonstrated how specific services increase a participant's odds of achieving permanent housing. Permanent housing outcomes showed a clear relationship with individuals' characteristics and service combinations, highlighting our model's promising potential. The most compelling evidence for our project's potential impact came from a case manager's comment: "This is the first time I’ve felt guided in supporting participants without guessing what will help them most."
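As a worked example of what an increase in odds means for an individual, the sketch below applies an odds ratio to a baseline probability. The helper name `apply_odds_ratio` and the numbers (a 20% baseline and an odds ratio of 2.0) are hypothetical and do not come from the project's fitted model.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability after multiplying the baseline odds by odds_ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical participant: 20% baseline chance, service with odds ratio 2.0.
# Odds go from 0.25 to 0.50, so the probability rises to 0.50 / 1.50.
new_prob = apply_odds_ratio(0.20, 2.0)
print(round(new_prob, 3))
```

Doubling the odds does not double the probability. The same odds ratio yields different probability gains depending on where a participant starts.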
Acknowledgments
Our team owes a huge thank you to a California Homelessness Organization for their data and for the permission that allowed this project to happen. We would also like to thank the case manager who shared their insights on our MVP.
Additionally, we would like to thank our MIDS Capstone professors, Todd Holloway and Zona Kostic, for their invaluable guidance and feedback throughout the development of this project.
Additional Credits
- HMIS for data, data dictionaries, and data queries
- AWS for system components and deployment
- ChatGPT for coding support, help interpreting results, writing DB queries, and copy
- GitHub Copilot for coding support, website development, and copy
- scikit-learn (sklearn) package and documentation for modeling and evaluation
- Prior MIDS courses for coding, modeling, statistics, web hosting, etc.
- Prior MIDS Capstone projects and website pages for inspiration and guidance
Footnotes
- ¹Source of banner image at top of page: iStock illustration from CalMatters article.
- ²The U.S. Department of Housing and Urban Development (HUD): The 2023 Annual Homelessness Assessment Report (AHAR) to Congress.