MIDS Capstone Project Summer 2024

ShelterAID Public Health Resources Map

Team members

Mission Statement

ShelterAID Public Health Resources Map uses computer vision technology to help improve safety and public health outcomes for people living in unsheltered homelessness.

Problem & Motivation

There are growing numbers of people experiencing unsheltered homelessness in US cities, and they face many health hazards including a lack of sanitation, higher risk for chronic and communicable diseases, and lack of access to medical, behavioral, and mental health care. While local public health agencies are trying to provide support to this community—and are doing some really great work—it’s difficult to help people if you can’t find them.

We have created a product that helps these agencies find, count, and better support people living in these conditions by using computer vision to detect temporary shelters in aerial imagery of cities. Public health agencies will be able to use our product to provide services to the people who need them most, when and where they need them.

Data Source

No such dataset existed when we began this project, so we created one from scratch. This process involved four key steps:

First, we searched Google Earth to find aerial images of shelters and encampments. To guide this process, we used data from the US Department of Housing and Urban Development's (HUD) annual Point-in-Time Count, which helped us target cities with large unsheltered homeless populations. We also used information from select cities to target known encampment locations.
We then created 204 images from Google Earth screenshots, ultimately representing five US states, 14 cities, and heights ranging from 200 m to 550 m above ground.
Next, we uploaded our data to Roboflow, which we used to streamline the annotation process. Our final dataset includes over 1,700 unique annotations, with as many as 73 shelters labeled in a single image.
Finally, we exported the dataset for modeling, using a 70/20/10 train/validation/test split.
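To illustrate what a 70/20/10 split means for a dataset of 204 images, here is a minimal sketch in Python. It is not the export process we used; the function name, the seed, the rounding rule, and the placeholder file names are all assumptions made for illustration.

```python
import random


def split_dataset(items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle items and split them into train/validation/test lists.

    The rounding rule here is an illustrative assumption: train and
    validation sizes are rounded, and the test set takes the remainder.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(fractions[0] * len(shuffled))
    n_valid = round(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# Hypothetical file names standing in for the 204 images.
image_names = [f"image_{i:03d}.jpg" for i in range(204)]
train, valid, test = split_dataset(image_names)
print(len(train), len(valid), len(test))
```

Under these assumptions, every image lands in exactly one of the three subsets, and the test images are never seen during training.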

Data Science Approach

We built and evaluated four models, all of which used transfer learning and were built upon models already pretrained for object detection. Crucially, this allowed us to proceed with a dataset containing just a few hundred images rather than the thousands, or even tens of thousands, that training from scratch would require.

We started with a Faster R-CNN model with a MobileNet backbone pretrained on the COCO dataset, which established our baseline. Our second model sought to improve upon the baseline using hyperparameter tuning, preprocessing, and augmentation. For model three, we used YOLOv7, which is also pretrained on the COCO dataset, and for our final model (at this project stage) we used a newer version, YOLOv8.

Evaluation

To determine our evaluation strategy, we first needed to decide how to define a true positive. We used Intersection over Union (IoU), which compares the ground truth bounding box to the predicted bounding box, asking the question, “how close is this prediction to being 100% correct?” For our true positives, we used an IoU threshold of 25%, which aligns with our use case. Our target users just need to know the rough locations of shelters, and a low-IoU true positive is just as valuable to them as one with a much higher IoU.
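As a rough illustration of the IoU calculation and the 25% threshold, the sketch below computes IoU for two axis-aligned boxes. The (x1, y1, x2, y2) box format, the function names, and the example coordinates are assumptions for illustration, not taken from our pipeline.

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping region (zero if boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.25):
    """A prediction counts as a true positive if IoU meets the threshold."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical boxes: the first prediction overlaps half of the ground truth.
ground_truth = (0, 0, 10, 10)
close_pred = (5, 0, 15, 10)   # IoU = 50 / 150, about 0.33
far_pred = (8, 0, 18, 10)     # IoU = 20 / 180, about 0.11
print(is_true_positive(close_pred, ground_truth))
print(is_true_positive(far_pred, ground_truth))
```

With a 25% threshold, a prediction shifted by half a box width still counts as a detection, which matches the goal of locating shelters roughly rather than outlining them precisely.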

It follows that our primary metric was recall, given our goal of minimizing false negatives, and our secondary metric was precision.
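The sketch below shows one common way to turn an IoU threshold into recall and precision for a single image: each prediction, taken in order of confidence, is greedily matched to the best unmatched ground truth box. The greedy matching rule, the function names, and the example boxes are assumptions for illustration; our evaluation code may differ in its details.

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def count_matches(predictions, ground_truths, iou_threshold=0.25):
    """Return (tp, fp, fn) using greedy one-to-one matching.

    predictions: list of (box, confidence); ground_truths: list of boxes.
    """
    matched = set()
    tp = 0
    # Highest-confidence predictions get first pick of ground truth boxes.
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp += 1
    fp = len(predictions) - tp      # predictions with no matching shelter
    fn = len(ground_truths) - tp    # shelters the model missed
    return tp, fp, fn


# Hypothetical image with three labeled shelters and three predictions.
ground_truths = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
predictions = [((1, 1, 11, 11), 0.9), ((22, 20, 32, 30), 0.8), ((80, 80, 90, 90), 0.7)]
tp, fp, fn = count_matches(predictions, ground_truths)
recall = tp / (tp + fn) if (tp + fn) else 0.0
precision = tp / (tp + fp) if (tp + fp) else 0.0
print(tp, fp, fn, round(recall, 3), round(precision, 3))
```

Recall penalizes the missed shelter (the false negative), which is why it serves as the primary metric when the cost of overlooking people is higher than the cost of a false alarm.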

Key Learnings & Impact

After evaluation, we selected our two highest-performing models and deployed them within our MVP. The app allows credentialed users to upload an aerial image, select a model to run, and receive an output image with all detected shelters marked. In order to meet the goals in our mission statement, we opted not to share our app publicly at this stage.

Among many other valuable lessons throughout this project, we learned that object detection for this use case is extremely difficult. While we were able to achieve moderately successful results, there's plenty more work to be done. In our future roadmap, we plan to take additional time to improve our models further, strategically expand our dataset to address gaps and increase data integrity, and expand our MVP to more directly meet the needs of public health professionals. We hope that in the future, our product can help local public health agencies optimize limited resources, bring aid to people who need it quickly during emergencies, and generally improve the safety, health, and wellbeing of our unsheltered neighbors.

We also recognize that our product involves an incredibly vulnerable group, and it does have the potential to cause harm if used for purposes outside of our original mission. We plan to vigorously defend against any future uses of our product that could harm the unsheltered population.

Acknowledgements

Special thanks to the Denver Department of Public Health and Environment for their partnership on this project, including providing guidance, subject matter expertise, and data on known shelter locations.

Course

Data Science 210. Capstone, Summer 2024

Sample Image Outputs

The images below show test set outputs from our strongest model, built using YOLOv8.

An aerial image of a city block showing four labeled shelters.

An aerial image of a highway and surrounding areas showing two labeled shelters.

An aerial image of a city street near a river showing many labeled shelters.

Sample image outputs from ShelterAID

Last updated: August 7, 2024