MIDS Capstone Project Summer 2024

TerraAI: Detecting Illegal Cannabis Cultivations in Northern California National Forests

Problem & Motivation

Illegal cannabis cultivation in national forests, particularly in Northern California, poses a significant environmental threat to wildlife habitat and surrounding communities. These operations often involve the unregulated use of harmful chemicals, which contaminate water sources and soil and cause widespread loss of biodiversity. Current monitoring by the USDA Forest Service and California Water Board is limited and time-consuming because it relies heavily on manual techniques. The resulting gap in detection and regulation allows these harmful practices to continue unchecked. By leveraging high-resolution satellite imagery and machine learning, we can develop a more efficient and accurate system to detect and monitor these illegal activities. This initiative aims to protect our national forests, the sustainability of our water resources, and the health of our ecosystems.

Data Source & Data Science Approach

The project will leverage high-resolution satellite images from the Global Forest Change dataset provided by the University of Maryland, as well as data from ArcGIS and the Copernicus Data Center. These sources offer comprehensive and up-to-date information on forest cover and changes, which are crucial for identifying illegal cannabis cultivation sites. Our data science approach will focus on employing a UNET Transformer model, known for its image segmentation and classification capabilities. The model will be trained on these datasets to detect specific patterns associated with illegal cultivation activities. By utilizing the UNET Transformer’s ability to capture both fine and coarse features, we aim to achieve high accuracy in identifying and monitoring illegal cannabis growth, providing a robust and scalable solution for environmental protection and regulatory enforcement.

Evaluation

The evaluation of this project will be multifaceted, focusing on the accuracy, reliability, and scalability of the developed machine learning models. Initially, baseline models will be tested on a reduced dataset to validate the formulation of the problem and identify any preliminary issues. Subsequent iterations will involve more sophisticated models tested on larger data subsets, with continuous performance monitoring and debugging. Key evaluation metrics will include precision, recall, and F1 score to ensure the models accurately detect illegal cultivation sites. Additionally, phased implementation starting on the West Coast will assess the model’s scalability and adaptability to different regions.
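As a rough illustration of the metrics named above, the sketch below computes pixel-wise precision, recall, and F1 score for a predicted binary segmentation mask against a ground-truth mask. It assumes masks are represented as nested lists of 0/1 values (1 meaning a suspected cultivation pixel); the function names and the toy masks are hypothetical and are not taken from the project's code.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # assumed format: rows of 0/1 pixel labels


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives, and false negatives per pixel."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # predicted site, actually a site
            elif p and not t:
                fp += 1  # predicted site, actually background
            elif t and not p:
                fn += 1  # missed site
    return tp, fp, fn


def precision_recall_f1(pred: Mask, truth: Mask) -> Tuple[float, float, float]:
    """Return (precision, recall, F1); each is 0.0 when its denominator is zero."""
    tp, fp, fn = confusion_counts(pred, truth)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy 2x3 masks (hypothetical data, for illustration only).
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]

precision, recall, f1 = precision_recall_f1(predicted, ground_truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because cultivation sites are likely to cover only a small fraction of the pixels in a forest scene, precision and recall on the positive class are more informative here than overall pixel accuracy, which a model could inflate by predicting background everywhere.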

Key Learnings & Impact

This project offers significant learning opportunities in applying advanced data science techniques to real-world environmental challenges. Key learnings will include the practical application of machine learning models to high-resolution satellite imagery, feature extraction, and anomaly detection in complex ecological contexts. Additionally, understanding the intricacies of environmental data and the legal and operational frameworks of agencies like the USDA Forest Service and California Water Board will be invaluable. The project's impact extends beyond technological innovation, aiming to protect vital natural resources by providing accurate and timely detection of illegal cannabis cultivations. By mitigating the environmental damage caused by these activities, the project supports biodiversity preservation, water conservation, and overall ecosystem health. Furthermore, the insights gained can drive policy changes and improve regulatory frameworks, demonstrating how data science can contribute to sustainable environmental management and enforcement.

More Information

Last updated: August 6, 2024