MIDS Capstone Project Fall 2024

TranscriptIQ: Financial Analyst Tool for Deeper Earnings Insight

Team members

Unlock rapid insights from earnings calls with TranscriptIQ! Dive into AI-powered summaries, sentiment trends, and actionable data from the S&P 500.

TranscriptIQ is an AI tool that revolutionizes how financial analysts extract insights from company earnings calls. By combining machine learning and sentiment analysis, it delivers fast, concise summaries of key financial discussions, saving hours of manual work. With a database of 40,000 calls spanning 10 years, TranscriptIQ uncovers unique insights on topics like "Supply Chain" in industries such as "Automobiles". Using Retrieval-Augmented Generation (RAG), it surfaces relevant information and provides a nuanced view of management tone. TranscriptIQ streamlines earnings call analysis, empowering analysts to make smarter, faster decisions and stay ahead in today’s dynamic market.

Key Impacts

Time is the most valuable asset in the fast-paced financial industry, and TranscriptIQ is built to preserve and enhance it. Our tool allows analysts to bypass the exhaustive process of combing through full earnings call transcripts, instead delivering AI-generated summaries and metric-based insights in a fraction of the traditional time. By reducing the time required for deep analysis, TranscriptIQ enables faster, better-informed decision-making. With our platform, analysts can obtain timely, actionable insights that save resources and open new opportunities for firms to respond quickly to market shifts.

Model and Evaluations

To support our solution, we developed a language model based on Meta’s LLaMA architecture, quantized to run efficiently in 8-bit precision. The model takes as input quarterly earnings call transcripts from S&P 500 companies, preprocessed to ensure quality. Our evaluation framework focuses on three essential metrics: Faithfulness, Sentiment Analysis, and Relevancy.

Faithfulness: Ensures that model-generated summaries accurately reflect the source material, avoiding overgeneralization. Each response is crafted to provide reliable insights in a clear, concise format.
Sentiment Analysis: Using VADER for sentiment evaluation, we analyze the polarity of summaries, providing nuanced insights into management's tone across earnings discussions.
Relevancy: Assesses how well model responses address key topics, like capital expenditure, aligning generated content with analyst queries.
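
As a minimal sketch of the sentiment step, assume each summary is scored sentence by sentence and the scores are averaged (the aggregation is not specified above, so this is an assumption). The function accepts any scorer that returns a VADER-style dictionary with a `compound` value in [-1, 1]. The names `summary_sentiment` and `toy_scorer` are hypothetical; with the `vaderSentiment` package installed, the scorer would be `SentimentIntensityAnalyzer().polarity_scores`.

```python
import re
from statistics import mean
from typing import Callable, Dict, List

Scorer = Callable[[str], Dict[str, float]]

def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def summary_sentiment(summary: str, scorer: Scorer) -> float:
    """Average the per-sentence compound score; 0.0 for empty text."""
    sentences = split_sentences(summary)
    if not sentences:
        return 0.0
    return mean(scorer(s)["compound"] for s in sentences)

# Stand-in scorer (hypothetical) so the sketch runs without third-party packages.
# It is NOT VADER; it only returns a dictionary of the same shape.
def toy_scorer(sentence: str) -> Dict[str, float]:
    positive = {"strong", "growth", "improved"}
    negative = {"weak", "shortage", "decline"}
    words = re.findall(r"[a-z]+", sentence.lower())
    hits = sum(w in positive for w in words) - sum(w in negative for w in words)
    # Clamp to the [-1, 1] range that VADER's compound score uses.
    return {"compound": max(-1.0, min(1.0, hits / 2))}

score = summary_sentiment(
    "Demand was strong and margins improved. Chip shortage remains a risk.",
    toy_scorer,
)
```

Passing the scorer as an argument keeps the aggregation logic independent of the sentiment library, so the same function can be exercised with a stub in tests.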

The LLaMA model processes financial documents, retrieves contextually relevant sections based on target queries, and generates summaries that reflect key trends each quarter. This approach allows analysts to quickly gauge company sentiment and strategic direction, supported by a model rigorously tested for accuracy, alignment, and insight relevance.
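
The retrieval step described above can be sketched as a cosine-similarity ranking over passage embeddings, followed by prompt assembly. This is an illustration under stated assumptions, not the project's implementation: the embeddings are made-up three-dimensional vectors, `top_k` and the prompt wording are hypothetical, and in the deployed system the nearest-neighbor search is handled by the Qdrant vector store.

```python
import math
from typing import List, Sequence, Tuple

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: Sequence[float],
          passages: List[Tuple[str, Sequence[float]]],
          k: int = 2) -> List[str]:
    """Return the k passage texts most similar to the query vector."""
    ranked = sorted(passages, key=lambda p: cosine(query_vec, p[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy embeddings (made up for illustration).
passages = [
    ("We expect capex to rise as we expand battery plants.", [0.9, 0.1, 0.0]),
    ("Supply chain constraints eased during the quarter.", [0.1, 0.9, 0.1]),
    ("Our dividend policy is unchanged.", [0.0, 0.1, 0.9]),
]
query = [0.8, 0.2, 0.0]  # stands in for the embedding of "capital expenditure"

context = top_k(query, passages, k=2)
# The retrieved passages become the context handed to the summarization model.
prompt = ("Summarize management's comments on capital expenditure:\n"
          + "\n".join(context))
```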

Front/Back-end Architecture

To deliver a seamless, low-latency experience, the architecture of TranscriptIQ is divided into two key subsystems:

Offline RAG Component:
- Stores S&P 500 segment information in an RDBMS.
- Utilizes the Qdrant vector store to manage embeddings of management answers from S&P 500 quarterly transcripts spanning the last 10 years.
- Employs an 8-bit quantized LLaMA model for summarizing responses and the VADER sentiment analysis model to compute sentiment scores.
- Writes its outputs to a NoSQL database so that requests can be served quickly from precomputed results.
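
To illustrate what 8-bit quantization means for model weights, here is a minimal sketch of symmetric absmax quantization. The scheme actually applied to the LLaMA model is not stated above, so this particular method and the names `quantize_int8` and `dequantize` are assumptions chosen for clarity.

```python
from typing import List, Tuple

def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric absmax quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # The scale maps the largest magnitude to 127; use 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.5, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored in one byte plus a shared scale, which is where the memory savings over 16-bit or 32-bit floats come from.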
Real-time Application:
- Precomputed text summarizations and sentiment scores are stored in DynamoDB for efficient retrieval.
- Responses are served via AWS Lambda functions, accessible through AWS API Gateway, with endpoints secured using AWS Cognito.
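
The real-time path can be sketched as a Lambda-style handler that looks up a precomputed item and returns it in the response shape used by API Gateway's Lambda proxy integration. Everything specific here is an assumption for illustration: the key schema (`ticker`, `quarter_topic`), the query parameter names, and the `FakeTable` class, which stands in for a boto3 DynamoDB `Table` so the example runs without AWS access. Authorization via Cognito is assumed to happen in API Gateway before the handler is invoked.

```python
import json
from typing import Any, Dict, List

class FakeTable:
    """In-memory stand-in that mimics the get_item call of a boto3 DynamoDB Table."""
    def __init__(self, items: List[Dict[str, Any]]):
        self._items = {(i["ticker"], i["quarter_topic"]): i for i in items}

    def get_item(self, Key: Dict[str, str]) -> Dict[str, Any]:
        item = self._items.get((Key["ticker"], Key["quarter_topic"]))
        # Like DynamoDB, omit "Item" from the result when nothing matches.
        return {"Item": item} if item else {}

# Hypothetical precomputed record written by the offline component.
table = FakeTable([
    {"ticker": "XYZ", "quarter_topic": "2024Q2#supply-chain",
     "summary": "Management said constraints eased.", "sentiment": 0.42},
])

def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Return the stored summary and sentiment for one ticker, quarter and topic."""
    params = event.get("queryStringParameters") or {}
    try:
        key = {"ticker": params["ticker"],
               "quarter_topic": f'{params["quarter"]}#{params["topic"]}'}
    except KeyError as missing:
        return {"statusCode": 400,
                "body": json.dumps({"error": f"missing parameter {missing}"})}
    item = table.get_item(Key=key).get("Item")
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    # Note: a real DynamoDB table returns numbers as Decimal, which would need
    # converting before json.dumps.
    return {"statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(item)}

response = handler(
    {"queryStringParameters":
     {"ticker": "XYZ", "quarter": "2024Q2", "topic": "supply-chain"}},
    None,
)
```

Because the handler only reads precomputed items, no model inference happens at request time, which is consistent with the low-latency goal described above.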

The front-end interface, deployed on Netlify, is designed for simplicity and accuracy, empowering analysts to uncover meaningful insights without technical barriers.

Image 1. Backend/Frontend Architecture: the application framework, from the RAG model to the vector database to the web application interface.

TranscriptIQ offers two main visualization features:

Sentiment Analysis Dashboard: Provides a sentiment trend analysis of key topics in financial earnings transcripts over the past five years. Each row represents a peer company, and each column denotes a fiscal quarter, with color-coding to indicate sentiment scores. These scores are a proxy for management's tone on industry trends, market changes, and competitive positioning. The sentiment trend-line aids analysts in hypothesis validation and trend prediction.
Dispersion Analysis Page: Illustrates the total range of sentiment scores by sector, highlighting sentiment disparities across topics for different industries. By pinpointing topics with strong positive or negative sentiment, analysts can benchmark sector-wide sentiment on market dynamics and focus on priority areas.
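
The data behind both views can be sketched in a few lines: a company-by-quarter grid for the dashboard, and a per-sector range (maximum minus minimum score) for the dispersion page. The records, company names, and function names below are made up for illustration, and treating "total range" as max minus min is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Record = Tuple[str, str, str, str, float]  # company, sector, quarter, topic, score

# Made-up sentiment records.
records: List[Record] = [
    ("AutoCo", "Automobiles", "2024Q1", "Supply Chain", -0.30),
    ("AutoCo", "Automobiles", "2024Q2", "Supply Chain", 0.10),
    ("CarCorp", "Automobiles", "2024Q1", "Supply Chain", 0.45),
    ("BankCo", "Banks", "2024Q1", "Supply Chain", 0.05),
    ("BankCo", "Banks", "2024Q2", "Supply Chain", 0.15),
]

def sentiment_grid(rows: List[Record], topic: str
                   ) -> Tuple[List[str], Dict[str, List[Optional[float]]]]:
    """Rows are companies, columns are quarters; None marks a missing cell."""
    selected = [r for r in rows if r[3] == topic]
    quarters = sorted({r[2] for r in selected})
    cells = {(r[0], r[2]): r[4] for r in selected}
    companies = sorted({r[0] for r in selected})
    grid = {c: [cells.get((c, q)) for q in quarters] for c in companies}
    return quarters, grid

def sector_dispersion(rows: List[Record], topic: str) -> Dict[str, float]:
    """Range (max - min) of sentiment scores per sector for one topic."""
    by_sector: Dict[str, List[float]] = defaultdict(list)
    for _, sector, _, t, score in rows:
        if t == topic:
            by_sector[sector].append(score)
    return {s: max(v) - min(v) for s, v in by_sector.items()}

quarters, grid = sentiment_grid(records, "Supply Chain")
dispersion = sector_dispersion(records, "Supply Chain")
```

In this toy data the Automobiles sector shows a much wider spread than Banks on the same topic, which is the kind of disparity the dispersion view is meant to surface.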

Conclusion

TranscriptIQ marks a significant leap forward for financial analysts, delivering precise, rapid insights from extensive earnings call data. By leveraging advanced language models, sentiment analysis, and a streamlined interface, TranscriptIQ transforms the analyst’s workflow, turning hours of reading into seconds of summarized insights. With features designed to spotlight critical trends and sentiments, our tool allows users to stay ahead of the curve in a competitive financial landscape, making it an invaluable asset for investment teams and market researchers alike. As TranscriptIQ evolves, we anticipate it will continue to drive innovation in financial analysis, setting a new standard for efficiency and depth in earnings call interpretation.

Acknowledgements

We are grateful for the guidance and encouragement of our Capstone Advisors, Dr. Kira Wetzel and Dr. Puya H. Vahabi, of the Master of Information and Data Science (MIDS) program at the School of Information, University of California, Berkeley.

In addition, the following subject matter experts contributed heavily during the research and/or prototype testing phases of our project:

- Masahiro Kikuchi, Credit Research Director at MetLife Investment Management