RxReduce: A Medication Deprescribing Agent Framework
Problem & Motivation
Polypharmacy (defined as taking five or more medications) in patients 65 years of age and older has tripled over the past few years. Proper medication review with the possibility of deprescribing (eliminating medications from a patient's profile) lets providers prevent medication interactions, reduce adverse effects, and simplify a patient's medication regimen. Because medication review is complex and effective supporting tools are lacking, the process is extremely time-consuming for providers. As a result, many medications that could potentially be removed are continued past discharge.
Data Sources & Data Science Approach
Our solution uses three large language model (LLM) powered agents that evaluate the patient's diagnosis, inpatient stay information, and clinical notes to make a final recommendation to the provider during the discharge process. For our project, we focused on proton pump inhibitors (PPIs), a commonly prescribed medication class that is frequently continued without a supporting indication. In addition, a robust deprescribing algorithm exists for PPIs, which we were able to translate into our code. Our deidentified test data were acquired through an agreement with the Information Commons team at UCSF Medical Center. For our demonstrations and model evaluation, we also created synthetic patient examples based on actual scenarios we encountered in the deidentified data.
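The agent flow described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes one agent per input (diagnosis, inpatient stay, clinical notes), which the text does not specify, and it uses keyword stubs where a real agent would call an LLM. All names (`Finding`, `keyword_agent`, `recommend`) are hypothetical, and the search terms are placeholders rather than the clinical deprescribing algorithm.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Finding:
    source: str         # which input the agent reviewed
    supports_ppi: bool  # True if a reason to continue the PPI was found
    evidence: str       # matched text to flag for the provider


# Placeholder terms for illustration only; NOT the clinical algorithm.
PLACEHOLDER_TERMS = ("gi bleed", "barrett")


def keyword_agent(source: str) -> Callable[[str], Finding]:
    """Build a stub agent; a real agent would prompt an LLM here."""
    def run(text: str) -> Finding:
        lowered = text.lower()
        for term in PLACEHOLDER_TERMS:
            if term in lowered:
                return Finding(source, True, term)
        return Finding(source, False, "")
    return run


# Assumption: one agent per data source named in the text.
AGENTS: Dict[str, Callable[[str], Finding]] = {
    name: keyword_agent(name)
    for name in ("diagnosis", "inpatient_stay", "clinical_notes")
}


def recommend(patient: Dict[str, str]) -> Dict[str, object]:
    """Run every agent, then combine findings into one recommendation."""
    findings: List[Finding] = [
        agent(patient.get(name, "")) for name, agent in AGENTS.items()
    ]
    supporting = [f for f in findings if f.supports_ppi]
    decision = "continue" if supporting else "consider deprescribing"
    evidence: List[Tuple[str, str]] = [(f.source, f.evidence) for f in supporting]
    return {"recommendation": decision, "evidence": evidence}
```

Returning the matched evidence alongside the decision mirrors the goal of flagging the relevant note and diagnosis sections for the provider, who makes the final call.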
Evaluation
Our evaluation combined classification metrics for the final recommendation (Precision, Recall, and F1) with response similarity metrics (BERT, BLEU, ROUGE, and METEOR scores). Classification results were compared between iterations using confusion matrix visualizations. For the final recommendation, our final version achieved Precision (0.91), Recall (0.80), and F1 (0.84). For response similarity, it achieved BERT Precision (0.834), BERT Recall (0.839), and BERT F1 (0.836), along with ROUGE F1 (0.202), METEOR (0.157), and BLEU (0.005).
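For readers unfamiliar with the classification metrics, the sketch below shows how precision, recall, and F1 follow from true-positive, false-positive, and false-negative counts for one class. The function name and the label strings are hypothetical, and the example data is made up for illustration; it is not the project's evaluation code or data.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for a single positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Made-up labels for illustration only.
y_true = ["deprescribe", "deprescribe", "continue", "continue", "deprescribe"]
y_pred = ["deprescribe", "continue", "continue", "deprescribe", "deprescribe"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, "deprescribe")
```

If scores are averaged across classes (an assumption; the text does not state the averaging method), the averaged F1 need not equal the harmonic mean of the averaged precision and recall.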
Key Learnings & Impact
Our team learned firsthand how complex and time-consuming it is to evaluate a patient's discharge medication list, and how much effort is required to build a labeled data set. Many of the patients had high note volumes (100+), which we found to be correlated with inpatient stay length. The project demonstrated the viability and power of an LLM-powered solution in assisting providers with appropriate medication deprescribing recommendations, as well as in locating and flagging the relevant note and diagnosis sections. We spoke to clinical providers who confirmed the difficulty and time commitment of the discharge medication process and the need for better technology to assist them in this crucial step of patient care.
Acknowledgements
We extend our gratitude to our instructors, Korin Reid and Ramesh Sarukkai, along with our appreciation for the feedback and guidance that Mark Butler provided as we worked to optimize our LLM responses. Additionally, we thank the UCSF Information Commons team for their curation of the deidentified health data. Finally, we thank Matthew Growdon, MD, MPH; Brian L Michaels, PharmD BCPS; Jessica Pourain, MD, FAAP; and Cynthia Fenton, MD for their feedback on the difficulties providers face in the discharge medication process.