MIDS Capstone Project Fall 2024

GreenCompute

Team members

Problem & Motivation

Data centers, the backbone of the digital economy, consume vast amounts of energy and contribute significantly to global carbon emissions. With increasing pressure from regulatory bodies, stakeholders, and consumers to adopt sustainable practices, data center operators face challenges in monitoring and reducing their environmental impact.

GreenCompute addresses this gap by offering a user-friendly, data-driven platform to track carbon emissions and provide actionable recommendations to optimize energy efficiency and minimize emissions.

Data Source & Data Science Approach

GreenCompute offers two key features: data center carbon footprint estimations and a chatbot to answer questions on best practices to improve efficiency. Each of these two features employs a different approach: the former, a classical ML approach; the latter, a GenAI approach.

Classical ML: Data Center Carbon Footprint Estimations

In the classical ML approach, the goal is to train various regression models that will yield predictions which we will use in calculating a data center’s carbon footprint.

We utilize publicly available datasets, including:

SPEC Power Database: Provides server power metrics for energy consumption analysis.
LBNL Data Center Tools: Offers insights into energy efficiency and carbon estimates.
SCI Guidance Project: Contains embodied carbon emissions data for various data center machines.

Our exploratory data analysis includes:

Servers and data centers account for a significant portion of global IT electricity usage. Power @ 50% load demonstrates how power scales with operational demand. Efficient hardware minimizes electricity usage at varying loads. Active Idle Power is the power consumption of IT equipment (e.g., servers, storage) when powered on but not actively processing workloads. It represents the baseline energy usage of systems in standby, ready to handle incoming tasks.

Alt Text — Figure 1: This chart shows the average IT electricity consumption at different load levels

2nd chart 2 — Figure 2: This chart reflects the distribution of average active idle power

PUE is heavily climate dependent. Cooler, less humid climates allow for more economization (free cooling). Some cooling systems are more efficient than others.

3rd chart us — Figure 3: This chart displays the average power usage effectiveness across different states in U.S

Larger the memory and the number of CPUs generally exhibit larger embodied carbon emission. Similar values of memory or CPU count often resulted in a wide range of emission levels, suggesting that the relationships were not linear.

3rd chart — Figure 4: This chart reflects the relationship between memory capacity and the number of CPUs, and how their associated embodied carbon emission

To streamline analysis of our datasets, we reduced dimensionality using correlation metrics and random forest importance scores. With additional critical domain insights, we identified key independent variables:

From the SPEC dataset: Memory capacity, number of cores, and CPU counts.
From the PUE dataset: Location and cooling system type.
From the Cloud Carbon dataset: Memory capacity and CPU counts.

After data preprocessing and feature engineering, we tested multiple models and selected the most accurate:

Gradient Boost Regressor to predict IT equipment electricity consumption.
Random Forest to predict active idle power.
XGBoost to predict embodied carbon
XGBoost to predict PUE (Power Unit Efficiency)

Finally, we calculated the average annual power draw per server by combining active idle power and IT equipment electricity consumption. By factoring in PUE, server count, and constants, we determined the data center's carbon footprint. Adding the embodied carbon estimate yielded the total annual energy consumption—a key result of our analysis.

GenAI: Chatbot Recommendations

The GenAI approach employs a traditional RAG (Retrieval Augmented Generation) chatbot, which is composed of the following steps:

Document content extracted, chunked, and embedded into vector db
User sends a query
Query is vectorized and similarity search is conducted against our vector db
Top-n most similar documents are returned and passed to the LLM with a custom prompt
Answer is returned to the user

The document content that we use comes from the Center of Energy Expertise library. With our SME, we indexed a select group of documents which were focused on increasing data center efficiency and had a focus on small to medium sized data centers.

Evaluation

Our solution is evaluated through:

Model Accuracy: Achieving a high R-square and low Mean Squared Error (MSE).
Chatbot Metrics: Ensuring a high correctness score in our chatbot responses

Classical ML: Data Center Carbon Footprint Estimations

We tested various models and selected the most accurate for each prediction task. We used a Gradient Boost Regressor to estimate IT equipment electricity consumption, a Random Forest model for active idle power, and XGBoost for predicting embodied carbon and PUE.

To ensure reliable model performance, we focused on Mean Squared Error (MSE) and R-square ($R^2²$) as our evaluation metrics:

R-square: Measures how well the independent variables explain the variance in the dependent variable. A higher $R^2²$ indicates the model effectively captures patterns in the data, helping us understand the relationships between features and outputs.
Mean Squared Error (MSE): Quantifies the average squared difference between predicted and actual values, penalizing larger errors. This metric ensures the model minimizes prediction inaccuracies, critical for precise estimations of carbon footprints and energy consumption.

By combining these metrics, we can assess both the model's predictive power ($R^2²$) and its accuracy (MSE), ensuring robust and actionable results for data center carbon footprint estimation.

Target Variable	Model	MSE	R²
Embodied Carbon	Linear Regression	0.23	0.53
Embodied Carbon	Decision Tree	0.10	0.80
Embodied Carbon	Random Forest	0.08	0.83
Embodied Carbon	XGBoost	0.06	0.87
IT Electricity	Random Forest	14,409	0.96
IT Electricity	XGBoost	8,339	0.97
IT Electricity	Neural Network	18,455	0.94
Active Idle Power	Random Forest	3,794	0.92
Active Idle Power	XGBoost	5,573	0.89
Active Idle Power	KNN	21,496	0.57
PUE	Linear Regression	0.026	0.633
PUE	Decision Tree	0.025	0.644
PUE	Random Forest	0.025	0.644
PUE	XGBoost	0.025	0.644

GenAI: Chatbot Recommendations

To evaluate our RAG chatbot, we first generated a synthetic dataset of question and answer pairs and had our SME remove questions which were deemed irrelevant for our application (~5%). We then used correctness as our evaluation metric, which is based on the G-Eval framework. In this framework, an LLM is used as a judge to determine if the provided answer is correctly answering the given question.

We employed three different prompting strategies which we evaluated: a base prompt, a more advanced prompt, and an optimized prompt. The base prompt used a simple “Using the context provided, give an answer.” The advanced prompt added a role component and more context on a data center. Finally, the optimized prompt (which was optimized with DSPy framework), provided extensive role and context descriptions along with a Chain-of-Thought approach with multiple question/answer pair examples. The results of each strategy can be seen below:

Prompt Type	Correctness Score (%)
Base	66.67
Advanced	88.40 (+21.73)
Optimized	92.14 (+3.74)

Architecture and Deployment

The GreenCompute platform is underpinned by a microservices architecture designed for scalability and maintainability. An overall view of the deployed components is featured below:

The deployment sequence is the following:

API/Frontend code image built and stored in ECR via Github Actions
AWS ECS pulls images and runs on ECS task
AWS ECS API task connects to AWS RDS, S3, and Bedrock to load data and models
Load balancer attaches to AWS ECS and serves as entrypoint for users to the frontend via NGINX proxy
Entire infrastructure deployed with the help of Terraform (IaC)

Key Learnings

Our key learnings were derived from conversations with potential users who are experienced researchers or operators in the data center world. From our interviews with this group of individuals, we found that:
Simplicity is key. Facilitating the input of information gives GreenCompute a great advantage over current tools which employ tedious ways of collecting user input.
References to our source material are highly valued. By linking to the sources our RAG chatbot utilizes, users can follow up and conduct deeper research on the recommendations provided.
Users want to further personalize their experience. We heard several comments about how each data center is different and that capturing these differences would greatly improve the user experience.
There are a lot of questions about AI. With the hype around AI, data center researchers and operators are more curious than ever about how the use of AI will impact their field and how they can leverage the technology to keep up.

Impact

Developed and unified energy and emission regression models to facilitate visibility on a data center’s footprint
Empowers data centers to reduce carbon emissions, supporting global climate goals.
Assists organizations in meeting ESG reporting requirements.
Provides decision-makers with tools to balance sustainability with operational efficiency.

Future Work

There are three main areas to explore in the future:

Streamlined user experience - to collect granular information on the user’s data center for improved data features
More personalized data center recommendations - by using the user’s input along with a wider range of expert corpora as context for our RAG chatbot, we can provide tailored recommendations for the user’s data center
Time series forecasting - by providing a means of storing historical compute data, we can create time series forecasting models that could help users get a better understanding of their long term data center energy usage

Acknowledgements

We extend our gratitude to our project advisors, UC Berkeley W210 course instructors, and the organizations providing access to datasets. Special thanks to the teams behind LBNL and SPEC for their invaluable resources.

Course

Data Science 210. Capstone , Fall 2024

Class Project Gallery

More Information

Website

GitHub Repo

Video

Last updated: December 10, 2024