A group of UC Berkeley School of Information Master of Information and Data Science graduates have turned their Hal R. Varian Award-winning project into a published paper in Nature Scientific Reports.
Romain Hardy, Joe Klepich, Ryan Mitchell, Steve Hall, and Jericho Villareal (all MIDS ’23) are the team members behind curie.AI, a toolkit that helps medical practitioners support their patients’ liver health. The toolkit provides a suite of applications for liver ultrasound navigation, organ delineation, steatosis severity grading, liver mass detection, and the generation of realistic, synthetic liver ultrasound images from scratch for dataset enrichment.
With the guidance and support of Assistant Professor of Practice Cornelia Ilin, the team transformed their collaborative efforts into a comprehensive academic contribution that highlights the transformative potential of curie.AI in the realm of medical diagnostics.
“Approximately 25% of the US population has nonalcoholic fatty liver disease, which could lead to a progressional pathology of fatty liver, cirrhosis, and liver cancer, which is the second leading cause of cancer-related deaths. This research was done while taking care of my dad in home hospice care; his battle with this common and incurable disease inspired me to contribute to finding AI solutions that can make a difference in the lives of millions of us who live with this disease,” said alum Joe Klepich, an author of the study and a physician by trade. This research will allow healthcare professionals to make quicker, more cost-effective diagnoses, which is crucial for managing and potentially reversing this incurable disease, added Dr. Klepich.
Latent Diffusion Models and NAFLD Classification Performance in Low Data Settings
Currently, the best methods for detecting nonalcoholic fatty liver disease (NAFLD) are biopsies, which can be both invasive and expensive, and ultrasounds, which are less costly but reliant on technician skill. A newer approach, which uses deep learning to improve the performance of liver ultrasounds, quickly ran into a data collection problem: collecting and annotating medical data is prohibitively expensive, requires professional oversight, and is often restricted by privacy protocols.
The team began looking into latent diffusion models (LDMs), a class of cutting-edge generative AI models, in hopes of creating synthetic liver ultrasounds to compensate for the scarcity of large datasets. They found that the trained LDMs excelled at generating realistic NAFLD liver ultrasound images. Many synthetic images reproduced key visual characteristics of NAFLD as well as finer details like hepatic veins, portal veins, and organ linings. The team also found that these synthetic ultrasounds improve performance on an NAFLD classification task when mixed in with real ultrasounds. The improvement exceeded that achieved by more traditional augmentation techniques, demonstrating the benefit of a generative AI approach in low-data settings.
“Our research focuses only on a single data modality (liver ultrasounds), but its results broadly apply in the biomedical space. In the future, generative AI will become an essential tool for enriching datasets and distilling foundational insights about human biology and disease development,” said Romain Hardy, the study’s first author.
“An essential question revolves around how the AI-reconstructed images accurately depict healthy or unhealthy patient characteristics,” said Cornelia Ilin, the senior author of the study. To test the functionality of their models, the team recruited five medical professionals to classify ultrasound images as real or fake, and homed in on which features matter for NAFLD prediction. “The results revealed that medical participants in our study struggled to differentiate between real and synthetic images consistently; and that the model relies on the stylistic features of NAFLD for disease classification, so this was all good news for us,” Ilin continued.
“Radiologists will be able to use these real-time navigation and classification systems in medical imaging to detect radiological signs that might be overseen due to work overload, time constraints, or early clinical onset of the disease. Embedding these systems into the technician workflow benefits not only the patient but also the healthcare professional,” said Dr. Klepich.
“This research experience motivated me to seek a Ph.D. in biomedical data science. As a graduate student, I plan to continue contributing toward research in generative AI and the development of equitable biomedical tools,” added Hardy.
“Collaborating with the curie.AI team has been an incredibly positive experience for me,” shared Professor Ilin.