MIDS Capstone Project Summer 2023

FRED2Vis: A ChatGPT Enabled Tool for Macro Economic Analysis

Problem & Motivation

ChatGPT has had a profound impact on how people work, especially in the financial markets, and productivity and efficiency gains are predicted to be the first commercial AI application it enables. Current uses of ChatGPT in economic research focus only on small, isolated tasks. We aim to create a tool that performs a series of economic analysis tasks, saves users significant time, and can be put into commercial use right away.

FRED2Vis is an end-to-end, LLM-enabled tool that lets financial market participants and economists use natural language to query, analyze, and visualize data from the FRED database seamlessly.

Data Source & Data Science Approach

Our underlying data source is the FRED database (https://fred.stlouisfed.org/).

  • FRED is short for Federal Reserve Economic Data and contains more than 800k time series from over 100 sources. It is essentially the go-to database for anyone doing macro analysis.
  • The data is well structured by category, geographic location, time, source, etc. We can call the FRED API directly with a series ID to retrieve the data.
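
As a rough illustration of the series-ID lookup described above, the sketch below builds a request URL for the FRED API's series/observations endpoint and parses a response of the shape that endpoint returns. The helper names, the placeholder API key, the example series ID (UNRATE), and the sample payload are assumptions made for illustration, not FRED2Vis code, and no network call is made.

```python
import json
from urllib.parse import urlencode

FRED_OBSERVATIONS_URL = "https://api.stlouisfed.org/fred/series/observations"


def build_observations_url(series_id, api_key):
    """Return the request URL for one FRED series (hypothetical helper)."""
    params = {"series_id": series_id, "api_key": api_key, "file_type": "json"}
    return f"{FRED_OBSERVATIONS_URL}?{urlencode(params)}"


def parse_observations(payload):
    """Convert a JSON response body into (date, value) pairs.

    FRED reports missing values as ".", which are mapped to None here.
    """
    rows = []
    for obs in json.loads(payload)["observations"]:
        value = None if obs["value"] == "." else float(obs["value"])
        rows.append((obs["date"], value))
    return rows


# Made-up sample payload mimicking the response shape; not real FRED data.
SAMPLE_PAYLOAD = json.dumps({
    "observations": [
        {"date": "2023-01-01", "value": "1.5"},
        {"date": "2023-02-01", "value": "."},
    ]
})

url = build_observations_url("UNRATE", "YOUR_API_KEY")
rows = parse_observations(SAMPLE_PAYLOAD)
```

In practice the URL would be fetched with an HTTP client and a real API key; the parsing step stays the same.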

FRED2Vis is an end-to-end tool. In addition to the front end and back end, we created three modules that let users perform data query, data transformation, and data visualization tasks in natural language, which together form a typical workflow for economic analysis:

  • The Query module queries data and condenses it into a digestible format
  • The Transformation module applies data manipulations according to the user's instructions
  • The Visualization module brings the data to life by creating charts and plots
  • All three modules primarily use the OpenAI GPT-3.5 line of models for text and code generation. We also use LangChain to structure prompts in a conversational style.
  • Lastly, for the Query module, we also assembled a local search engine that uses Pinecone as a specialized semantic vector database to address the hallucination issue.
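
To make the semantic search idea above more concrete, here is a minimal sketch of similarity-based lookup over a small catalog of series. It is not the project's implementation: it swaps Pinecone for an in-memory dictionary and swaps a learned embedding model for a toy bag-of-words vector, and the catalog entries and function names are hypothetical. One way such a lookup can help with hallucination is that it only ever returns series IDs that exist in the catalog.

```python
import math
from collections import Counter

# Hypothetical catalog mapping series IDs to shortened titles; a real index
# would cover far more series and store model-generated embeddings.
CATALOG = {
    "UNRATE": "unemployment rate",
    "CPIAUCSL": "consumer price index for all urban consumers",
    "GDP": "gross domestic product",
}


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, top_k=1):
    """Return up to top_k catalog series IDs with nonzero similarity to the query."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(title)), sid) for sid, title in CATALOG.items()]
    scored.sort(reverse=True)
    return [sid for score, sid in scored[:top_k] if score > 0]


best = search("what is the unemployment rate")
```

A query with no overlap with any title returns an empty list, which a caller could treat as "ask the user to rephrase" instead of passing a guessed series ID downstream.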

Evaluations

To evaluate our final product, we created a survey that let real-life users test the product and provide feedback. We collected responses from 17 users, and the feedback was highly positive:

  • 12-14 out of 17 users found it extremely helpful or very helpful
  • The success rate for completing each task ranges from 82% to 88%
  • 53%-65% of the time, users completed the tasks in 1-2 tries
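
For readers who want to relate the counts and percentages above, the short sketch below does the arithmetic. It assumes each percentage is a share of all 17 respondents, which the survey summary does not state explicitly, so the implied counts are an inference.

```python
RESPONDENTS = 17  # number of survey respondents reported above


def share(count, total=RESPONDENTS):
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)


def implied_count(percent, total=RESPONDENTS):
    """Return the respondent count closest to the given percentage."""
    return round(percent / 100 * total)


helpful_range = (share(12), share(14))
success_counts = (implied_count(82), implied_count(88))
few_tries_counts = (implied_count(53), implied_count(65))
```

Under that assumption, 12-14 of 17 users corresponds to roughly 71%-82%, the 82%-88% success rates correspond to 14-15 users, and the 53%-65% figure corresponds to 9-11 users.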

Key Learnings & Impact

FRED2Vis is a cutting-edge tool that enables financial market participants and economists to complete data query, transformation, and visualization in one integrated flow. It can significantly improve efficiency and productivity (by 60-70% in our example).

We learned that for an LLM to integrate several workflows into one seamless process, it needs to perform each task accurately and overcome the following key challenges:

  • Solving the hallucination issue is essential for querying the right data.
  • Prompt management is important for interpreting users' intentions. Providing users with sample questions and further fine-tuning the prompts are both helpful.
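
As one possible reading of the prompt management point above, the sketch below fills a fixed template with a task type, a few sample questions, and the user's request. It uses plain Python string formatting instead of LangChain, and the template wording, sample questions, and function name are all hypothetical; the project's real prompts are not shown in this summary.

```python
# Hypothetical sample questions; not the project's actual list.
SAMPLE_QUESTIONS = [
    "Show the US unemployment rate since 2010.",
    "Plot year-over-year CPI inflation as a line chart.",
]

# Hypothetical template; the project's real prompts are not shown in this summary.
PROMPT_TEMPLATE = (
    "You are an assistant for macroeconomic data analysis.\n"
    "Task type: {task}\n"
    "Example requests:\n{examples}\n"
    "User request: {request}\n"
    "Respond with Python code only."
)


def build_prompt(task, request):
    """Fill the template with the task type, sample questions, and user request."""
    examples = "\n".join(f"- {q}" for q in SAMPLE_QUESTIONS)
    return PROMPT_TEMPLATE.format(task=task, examples=examples, request=request.strip())


prompt = build_prompt("visualization", "  Chart GDP growth as bars  ")
```

Keeping the template in one place makes it easier to revise the wording as user feedback shows which requests get misinterpreted.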

In addition, our team has diverse skills, including coding, data analytics, software engineering, and project management. This let each of us focus on our strengths to complete the project while also learning from one another.

Acknowledgements

We would like to thank our Capstone instructors, Fred Nugen and Ramesh Sarukkai, for their instruction and feedback throughout the semester.

We also want to thank our colleagues and friends who were willing to take the time to test our product and provide feedback.

Last updated: August 8, 2023