SeizurEShield Mission Statement
MIDS Capstone Project Summer 2024

SeizurEShield

Problem and Motivation

Approximately 1 in 10 people will experience a seizure in their lifetime. For some, seizures become chronic, a condition known as epilepsy. Affecting an estimated 50 million people worldwide, epilepsy is one of the most common neurological disorders, alongside Alzheimer's and Parkinson's. The primary challenge of living with epilepsy is its unpredictability: about 65% of people with generalized epilepsy report experiencing "auras," or warning sensations, before a seizure, but many do not, and in the four other types of epilepsy such warnings are even less common.

This unpredictability raises significant quality-of-life and safety concerns. Seizures can occur without warning and range from brief periods of lost time to severe convulsions and unconsciousness, the latter of which necessitates immediate hospitalization. Medications for epilepsy do not guarantee complete seizure control, and only four seizure detection products are currently commercially available. Our mission is therefore to leverage machine learning and deep learning techniques to detect seizure activity from EEG recordings of the human brain. Faster, more accurate, and more reliable seizure detection would improve quality of life and safety for individuals with epilepsy, offering greater independence and peace of mind.

Data Source and Data Science Approach

The data used in this project comes from the Temple University Hospital (TUH) EEG Corpus (https://isip.piconepress.com/projects/nedc/html/tuh_eeg/), which spans 17 years of data collection and contains 16,986 sessions from 10,874 unique subjects. The recordings were taken with 18- to 31-lead electroencephalography (EEG) setups from roughly equal proportions of male and female patients (49% and 51%, respectively), ranging in age from <1 year to >90 years (median of 51.6 years). All data was thoroughly de-identified and randomized by the TUH Department of Neurology before being added to the corpus, in compliance with the HIPAA Privacy Rule. Approximately 87% of the recordings were taken due to epileptic activity, another 12% due to strokes, and the remaining 1% due to concussions.

Because the full dataset requires ~600 GB of storage, our team elected to use Amazon SageMaker as the main platform for staging the training and testing of our machine learning solutions. After comparing several algorithms, the team chose to implement a recurrent neural network (RNN) model because of its aptitude for working with time-series data.
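
As a rough illustration of why a recurrent model suits time-series data such as EEG, the sketch below slices a multi-channel recording into overlapping windows and passes each window through a single plain (Elman-style) recurrent layer, so that the hidden state carries information from earlier samples forward in time. This is not the team's actual model or pipeline: the function names, window length, stride, hidden size, and the untrained random weights are all assumptions chosen for illustration, and the synthetic input stands in for real EEG data.

```python
import math
import random


def make_windows(recording, window_len, stride):
    """Split a recording (list of per-sample channel lists) into overlapping windows."""
    if window_len <= 0 or stride <= 0:
        raise ValueError("window_len and stride must be positive")
    last_start = len(recording) - window_len
    return [recording[s:s + window_len] for s in range(0, last_start + 1, stride)]


def rnn_forward(window, w_in, w_rec, bias):
    """Run one recurrent layer over a window and return the final hidden state.

    Each step computes h_t = tanh(W_in x_t + W_rec h_{t-1} + b).
    """
    hidden = [0.0] * len(bias)
    for sample in window:
        new_hidden = []
        for j in range(len(bias)):
            total = bias[j]
            total += sum(w_in[j][c] * sample[c] for c in range(len(sample)))
            total += sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
    return hidden


def seizure_probability(hidden, w_out, b_out):
    """Map the final hidden state to a value in (0, 1) with a logistic output unit."""
    logit = b_out + sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    rng = random.Random(0)
    n_channels, n_hidden = 18, 4  # hypothetical sizes, for illustration only

    # Synthetic stand-in for an EEG recording: 200 samples x 18 channels.
    recording = [[rng.gauss(0.0, 1.0) for _ in range(n_channels)] for _ in range(200)]

    # Untrained random weights; a real model would learn these from labeled data.
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(n_channels)] for _ in range(n_hidden)]
    w_rec = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_hidden)]
    bias = [0.0] * n_hidden
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]

    for window in make_windows(recording, window_len=50, stride=25):
        score = seizure_probability(rnn_forward(window, w_in, w_rec, bias), w_out, 0.0)
        print(f"window score: {score:.3f}")
```

With untrained weights the scores carry no meaning; the point is only the data flow, in which each window is consumed sample by sample and summarized by the final hidden state. In practice a framework-provided recurrent layer would replace the hand-written loop.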

Last updated: July 24, 2024