AudioAid: Audio Fall Detection Device
AudioAid is a machine learning-based monitoring system designed to help seniors safely maintain their independence in their own homes. With the population aged 65 and over becoming the fastest-growing age group globally, AudioAid addresses the central challenge of ensuring the safety and well-being of seniors, particularly those aging in place.
Problem & Motivation
The world is undergoing a significant demographic transformation, and seniors' desire to age in place amplifies the importance of addressing the pervasive issue of falls. In the U.S., an older adult falls every second, making falls the leading cause of injury and injury-related death in this age group. AudioAid responds to this critical public health issue with a monitoring system that uses machine learning-based sound classification to offer both security and practical assistance.
Real World Impact
AudioAid directly addresses the alarming statistics surrounding falls among older adults: approximately 36 million falls are reported annually in the U.S., resulting in over 32,000 deaths. Beyond the statistics, AudioAid enhances the well-being of the aging population by providing reassurance to seniors and their families, making a tangible difference on the urgent issue of falls.
Data Source & Data Science Approach
Data Source
We propose an audio detection algorithm to identify falls. Because no public dataset of fall sounds exists, our groundwork began with data generation. Our team, together with volunteers, simulated falls based on research into fall mechanics. To diversify the dataset, we introduced a life-sized, 150-pound manikin named "Rugged Red," generously provided by the UCPD. The manikin was used specifically to simulate falls caused by fainting, since conscious humans instinctively try to catch themselves, which changes the sound of the impact. These simulations covered both the fall and non-fall tasks defined in the established KFall dataset. To make the dataset more robust, we applied augmentation techniques such as time stretching, noise addition, and time and frequency masking, as sketched below.
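A minimal sketch of the waveform-level augmentations, assuming librosa and 16 kHz mono recordings; the function name, stretch rate, and noise level are illustrative, not our exact parameters:

```python
import numpy as np
import librosa

def augment_waveform(path, rate=0.9, noise_level=0.005, sr=16000):
    """Load a recording and return time-stretched and noise-added variants.

    rate and noise_level are illustrative values, not the project's settings.
    """
    y, _ = librosa.load(path, sr=sr, mono=True)

    # Time stretch: rate < 1.0 slows the clip down.
    stretched = librosa.effects.time_stretch(y, rate=rate)

    # Additive Gaussian noise simulates household background sounds.
    noisy = y + noise_level * np.random.randn(len(y))

    return stretched, noisy
```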
Data Science Approach
In crafting AudioAid, our data science approach involved rigorous dataset augmentation, incorporating time- and frequency-domain masking and background noise. During data recording, microphones were placed in various rooms so that a single device could integrate seamlessly into a home environment. We used time stretching to approximate the fall biomechanics of older adults, in line with research indicating slower falls and reaction speeds in this demographic. The masking step is sketched below.
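A minimal sketch of the time and frequency masking, in the spirit of SpecAugment, assuming log-Mel spectrograms stored as (n_mels, n_frames) NumPy arrays; the mask widths are illustrative parameters:

```python
import numpy as np

def mask_spectrogram(spec, max_freq_bins=8, max_time_frames=16, rng=None):
    """Zero out a random band of Mel bins and a random run of frames."""
    rng = rng or np.random.default_rng()
    spec = spec.copy()
    n_mels, n_frames = spec.shape

    # Frequency masking: hide a contiguous band of Mel bins.
    f = rng.integers(0, max_freq_bins + 1)
    f0 = rng.integers(0, n_mels - f + 1)
    spec[f0:f0 + f, :] = 0.0

    # Time masking: hide a contiguous run of time frames.
    t = rng.integers(0, max_time_frames + 1)
    t0 = rng.integers(0, n_frames - t + 1)
    spec[:, t0:t0 + t] = 0.0
    return spec
```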
Our model, a convolutional neural network (CNN), was trained on Mel spectrograms and Mel-frequency cepstral coefficients (MFCCs). A spectrogram is a visual representation of how the frequency content of an audio signal changes over time; it is produced by applying Fourier transforms to the waveform and then mapping the frequencies onto the Mel scale, which closely mirrors human pitch perception. MFCCs are compact coefficients derived from the Mel spectrogram.
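The feature extraction can be sketched with librosa as follows; the sample rate and feature dimensions shown are assumptions for illustration, not the project's exact settings:

```python
import numpy as np
import librosa

def extract_features(path, sr=16000, n_mels=64, n_mfcc=40):
    """Convert one audio clip into a log-Mel spectrogram and MFCCs."""
    y, _ = librosa.load(path, sr=sr, mono=True)

    # Short-time Fourier transform -> Mel-scaled spectrogram, in decibels.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel, ref=np.max)

    # MFCCs: a compact summary derived from the log-Mel representation.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

    return log_mel, mfcc
```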
Moving forward, our focus is on building a tangible device. By deploying a TensorFlow Lite model on a Raspberry Pi, the device can efficiently process streaming audio: extracting features, running inference, and triggering alerts to emergency services or caregivers as needed. This streamlined pipeline translates AudioAid's data science into a user-friendly solution for the safety and well-being of older adults at home.
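A hedged sketch of the on-device inference step, assuming the tflite_runtime package available on Raspberry Pi; the model path and input shape are illustrative, not the project's actual artifacts:

```python
import numpy as np
import tflite_runtime.interpreter as tflite

# Load the converted model (the file name is illustrative).
interpreter = tflite.Interpreter(model_path="audioaid_cnn.tflite")
interpreter.allocate_tensors()
input_idx = interpreter.get_input_details()[0]["index"]
output_idx = interpreter.get_output_details()[0]["index"]

def classify_window(mfcc_window):
    """Run one window of MFCC features through the model; returns P(fall)."""
    # Add batch and channel dimensions expected by a 2-D CNN input.
    x = mfcc_window.astype(np.float32)[np.newaxis, ..., np.newaxis]
    interpreter.set_tensor(input_idx, x)
    interpreter.invoke()
    return float(interpreter.get_tensor(output_idx)[0][0])

# In the streaming loop, audio capture and feature extraction (as sketched
# earlier) would feed classify_window, and a high score triggers an alert.
```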
Evaluation
Model Evaluation
AudioAid's efficacy is anchored in its machine learning core. The CNN processes an array of Mel-frequency cepstral coefficients (MFCCs) and outputs a prediction on a scale from 0 to 1; a fall is flagged when the prediction exceeds 0.5. The model achieves a validation accuracy of 98.99% and a test accuracy of 96%. Our team is currently integrating the model into a real-time Raspberry Pi setup to ensure AudioAid's seamless deployment and our continued commitment to enhancing safety for older adults.
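The decision rule itself is simple; a minimal illustration of the 0.5 threshold applied to the network's sigmoid outputs (the scores shown are made-up examples):

```python
import numpy as np

def detect_falls(scores, threshold=0.5):
    """Map sigmoid outputs in [0, 1] to binary fall / no-fall decisions."""
    scores = np.asarray(scores).reshape(-1)
    return scores > threshold

# Example: three clips scored by the CNN.
print(detect_falls([0.91, 0.12, 0.55]))  # [ True False  True]
```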
Product Evaluation
Emphasizing the ethical dimension of the project, AudioAid integrates the principle of informed consent and value frameworks such as the Belmont Report. Our deliberate approach to data collection ensures that users understand and consent to the process, reflecting a commitment to ethical considerations and user privacy.
Key Learnings & Impact
Developing AudioAid revolved around the nuanced application of audio deep learning. We designed an 8-layer convolutional neural network (CNN) from scratch and also explored transfer learning with the VGG16 model. Because transfer learning did not yield optimal results, we prioritized efficiency and reliability by opting for our tailored CNN architecture, sketched below. Other noteworthy technical considerations include our careful data augmentation, incorporating time- and frequency-domain masking, and our emphasis on model robustness through diverse data. Ethically, our focus on informed consent reflects an appreciation for user rights and privacy. In summary, our journey underscores the intersection of advanced data science, model architecture, and ethical considerations in shaping a practical solution for fall detection among older adults.
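For illustration, a compact Keras CNN in this spirit might look like the following; the input shape, layer sizes, and exact layer count here are assumptions, not a reproduction of our production architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(input_shape=(40, 128, 1)):
    """Small binary-classification CNN over MFCC 'images' (illustrative)."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(1, activation="sigmoid"),  # P(fall)
    ])

model = build_cnn()
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

A model of roughly this size keeps inference fast enough for a Raspberry Pi after conversion to TensorFlow Lite, which is one reason a tailored small CNN can be preferable to a large transferred backbone like VGG16.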