eXpressMe! AI-Driven Intent-Predicting Speech Assistant
Our project combines computer vision models trained on the Kinetics dataset with GPT-2 natural language sentence generation. We are re-imagining speech assistants for teenagers with expressive communication difficulties. The app takes an 8-second video clip, runs the Kinetics-trained model on it, and infers the likely actions. Given those actions, we provide a subject (I, we, they) and let the speech API generate simple expressive sentences, for example "Can I eat a hot dog?" or "Shall we eat donuts?".
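
The step from inferred actions to candidate sentences might look roughly like the sketch below. It is only an illustration under stated assumptions: the function names, the label-to-phrase lookup, the score threshold, and the question openers are all hypothetical, and fixed templates stand in for the GPT-2 generation the app actually uses. The input is assumed to be a mapping from Kinetics-style action labels to confidence scores.

```python
from typing import Dict, List

# Hypothetical lookup from Kinetics-style gerund labels to base verb phrases.
# A real system would need a fuller mapping or a language model for this step.
VERB_PHRASES: Dict[str, str] = {
    "eating hotdog": "eat a hot dog",
    "eating doughnuts": "eat donuts",
}

# Hypothetical question openers for each supported subject.
OPENERS: Dict[str, str] = {"I": "Can I", "we": "Shall we", "they": "Can they"}


def top_actions(scores: Dict[str, float], k: int = 3, min_score: float = 0.1) -> List[str]:
    """Return up to k action labels, highest score first, dropping weak guesses."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [label for label, score in ranked[:k] if score >= min_score]


def seed_sentences(scores: Dict[str, float], subject: str, k: int = 3) -> List[str]:
    """Build simple candidate sentences for one subject from the top actions.

    Templates are used here as a stand-in for the GPT-2 generation step.
    """
    opener = OPENERS[subject]
    sentences = []
    for label in top_actions(scores, k=k):
        phrase = VERB_PHRASES.get(label)
        if phrase is None:
            continue  # skip labels that have no known verb phrase
        sentences.append(f"{opener} {phrase}?")
    return sentences


if __name__ == "__main__":
    # Made-up scores for illustration only, not real model output.
    example_scores = {"eating hotdog": 0.62, "eating doughnuts": 0.25, "juggling balls": 0.03}
    for sentence in seed_sentences(example_scores, "we"):
        print(sentence)
```

Filtering by a minimum score before building sentences keeps unlikely actions from turning into suggestions the user never intended; the threshold of 0.1 is an arbitrary placeholder.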