Data Exploration Made Easy
Our solution for data scientists and other data professionals aims to simplify the initial data exploration phase, which is vital for setting the direction of an analysis. Exploration is often tedious because it requires generating and inspecting potentially hundreds of graphs to find interesting relationships between variables. We aim to remove that tedium by generating smart recommendations on the connected dataset.
The existing version of Lux is an open-source Python library that accelerates and simplifies data exploration. Once users import their data into a Lux-specific object in a Jupyter Notebook, the Lux visualization recommendation system automatically generates collections of graphs to browse and recommends interesting visualizations that guide users toward potential next steps in their analysis.
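To give a feel for what "recommending interesting visualizations" can mean, here is a minimal sketch that enumerates every pair of numeric columns and ranks the candidate scatterplots by absolute Pearson correlation. This is an illustration only: the scoring rule, the function names `pearson` and `rank_scatterplots`, and the toy data are all assumptions made for this example, not a description of how Lux actually scores visualizations.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_scatterplots(columns):
    """Score every column pair and return (score, x, y) tuples, strongest first.

    Assumption: "interesting" is approximated here by |correlation|.
    """
    scored = [
        (abs(pearson(columns[a], columns[b])), a, b)
        for a, b in combinations(columns, 2)
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy dataset, used only to exercise the sketch.
toy = {
    "weight": [1, 2, 3, 4, 5],
    "horsepower": [2, 4, 6, 8, 10],
    "year": [3, 1, 4, 1, 5],
}
ranked = rank_scatterplots(toy)
for score, x, y in ranked:
    print(f"{x} vs {y}: {score:.2f}")
```

Even on this small example the point is visible: the user browses a ranked list of candidate charts instead of drawing each pair by hand.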
To make Lux more accessible to database users and let them leverage all of their data during exploration, our capstone project extended the Lux feature set with a new SQL execution backend that works directly on a PostgreSQL database. This execution engine lets users connect Lux to a relational database and use its visualization recommendation system on top of their database systems; it automatically creates queries and pushes them to users’ databases to gather visualization data. Furthermore, to ensure that the new Lux SQL capabilities meet database users’ needs, we conducted interviews to validate and inform our engineering work. Talking with these users helped us understand their workflows and what Lux would need in order to fit into them better. The interviews also highlighted other API features that would greatly improve users’ experience.
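The idea of creating queries and pushing them to the database can be sketched as follows. The example builds a `GROUP BY` query for a bar chart and runs it inside the database, so only the aggregated rows travel back to the client. It is a rough sketch under stated assumptions, not the project's actual engine: Python's built-in `sqlite3` stands in for PostgreSQL so the code runs anywhere, and the helper names `build_bar_chart_query` and `fetch_chart_data`, the `cars` table, and its rows are all hypothetical.

```python
import sqlite3

ALLOWED_AGGREGATES = {"AVG", "SUM", "COUNT", "MIN", "MAX"}


def build_bar_chart_query(table, dimension, measure, agg="AVG"):
    """Return SQL that computes one aggregated value per category (hypothetical helper)."""
    agg = agg.upper()
    if agg not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {agg}")
    # Identifiers are double-quoted; a real engine would also validate them
    # against the database schema before building the query.
    return (
        f'SELECT "{dimension}", {agg}("{measure}") AS value '
        f'FROM "{table}" GROUP BY "{dimension}" ORDER BY "{dimension}"'
    )


def fetch_chart_data(conn, query):
    """Execute the query in the database and return only the aggregated rows."""
    return conn.execute(query).fetchall()


# Stand-in database with toy data (assumption: sqlite3 replaces PostgreSQL here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (origin TEXT, horsepower REAL)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)",
    [("USA", 150.0), ("USA", 130.0), ("Japan", 90.0),
     ("Europe", 110.0), ("Japan", 70.0)],
)

query = build_bar_chart_query("cars", "origin", "horsepower")
chart_data = fetch_chart_data(conn, query)
print(query)
print(chart_data)  # one (category, average) pair per origin
```

Running the aggregation in the database is the design choice that matters: the table can be far larger than client memory, yet the visualization only ever needs one row per bar.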