MIDS Capstone Project Fall 2024

GreenCompute

Problem & Motivation

Data centers, the backbone of the digital economy, consume vast amounts of energy and contribute significantly to global carbon emissions. With increasing pressure from regulatory bodies, stakeholders, and consumers to adopt sustainable practices, data center operators face challenges in monitoring and reducing their environmental impact. 

GreenCompute addresses this gap by offering a user-friendly, data-driven platform to track carbon emissions and provide actionable recommendations to optimize energy efficiency and minimize emissions.

Data Source & Data Science Approach

GreenCompute offers two key features: data center carbon footprint estimations and a chatbot to answer questions on best practices to improve efficiency. Each of these two features employs a different approach: the former, a classical ML approach; the latter, a GenAI approach. 

Classical ML: Data Center Carbon Footprint Estimations

In the classical ML approach, the goal is to train several regression models whose predictions we use to calculate a data center's carbon footprint.

We utilize publicly available datasets, including:

  • SPEC Power Database: Provides server power metrics for energy consumption analysis.
  • LBNL Data Center Tools: Offers insights into energy efficiency and carbon estimates.
  • SCI Guidance Project: Contains embodied carbon emissions data for various data center machines.

Our exploratory data analysis includes:

  1. Servers and data centers account for a significant portion of global IT electricity usage. Power at 50% load demonstrates how power scales with operational demand, and efficient hardware minimizes electricity usage at varying loads. Active idle power is the power consumption of IT equipment (e.g., servers, storage) when powered on but not actively processing workloads; it represents the baseline energy usage of systems in standby, ready to handle incoming tasks.
Figure 1: Average IT electricity consumption at different load levels
Figure 2: Distribution of average active idle power
  2. PUE is heavily climate dependent. Cooler, less humid climates allow for more economization (free cooling), and some cooling systems are more efficient than others.
Figure 3: Average power usage effectiveness (PUE) across different U.S. states
  3. Machines with larger memory capacity and higher CPU counts generally exhibit larger embodied carbon emissions. However, similar memory or CPU values often produced a wide range of emission levels, suggesting that the relationships are not linear.
Figure 4: Relationship between memory capacity, CPU count, and the associated embodied carbon emissions

To streamline analysis of our datasets, we reduced dimensionality using correlation metrics and random forest importance scores (a brief sketch of this step follows the list below). With additional critical domain insights, we identified the key independent variables:

  • From the SPEC dataset: Memory capacity, number of cores, and CPU counts.
  • From the PUE dataset: Location and cooling system type.
  • From the Cloud Carbon dataset: Memory capacity and CPU counts.
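
As an illustration of this selection step, the sketch below uses hypothetical column names and values (not our actual pipeline) to flag correlated features and rank the rest by random forest importance:

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical SPEC-style records; column names and values are illustrative only.
df = pd.DataFrame({
    "memory_gb":     [64, 128, 256, 512, 384, 96, 192, 768],
    "cores":         [16, 32, 64, 128, 96, 24, 48, 192],
    "cpu_count":     [1, 2, 2, 4, 4, 1, 2, 8],
    "power_50pct_w": [180, 290, 470, 820, 640, 210, 360, 1150],  # power at 50% load
})
X, y = df.drop(columns="power_50pct_w"), df["power_50pct_w"]

# Flag strongly correlated feature pairs as candidates for removal.
corr = X.corr().abs()
redundant = [(a, b) for a in corr.columns for b in corr.columns
             if a < b and corr.loc[a, b] > 0.9]

# Rank features by random forest importance to keep the most informative ones.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
importance = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)

print("Highly correlated pairs:", redundant)
print(importance)
```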

After data preprocessing and feature engineering, we tested multiple models and selected the most accurate for each prediction task (a training sketch follows the list):

  1. Gradient Boosting Regressor to predict IT equipment electricity consumption.
  2. Random Forest to predict active idle power.
  3. XGBoost to predict embodied carbon.
  4. XGBoost to predict PUE (Power Usage Effectiveness).
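
A minimal sketch of how one of these regressors might be fit is shown below; the features, sample values, and hyperparameters are illustrative assumptions rather than our exact configuration, and the same pattern applies to the Gradient Boosting and Random Forest models:

```python
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Hypothetical embodied-carbon training data; features and values are placeholders.
df = pd.DataFrame({
    "memory_gb":       [64, 128, 256, 512, 384, 96, 192, 768],
    "cpu_count":       [1, 2, 2, 4, 4, 1, 2, 8],
    "embodied_kgco2e": [900, 1250, 1600, 2400, 2100, 1000, 1400, 3100],
})
X, y = df[["memory_gb", "cpu_count"]], df["embodied_kgco2e"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Fit an XGBoost regressor for embodied carbon; hyperparameters are illustrative.
model = xgb.XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)

print(model.predict(X_test))
```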

Finally, we calculated the average annual power draw per server by combining active idle power and IT equipment electricity consumption. Factoring in PUE, server count, and conversion constants gave us the data center's operational carbon footprint; adding the embodied carbon estimate yielded the total annual carbon footprint, the key result of our analysis.
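
The sketch below illustrates the shape of this calculation. The constants, the grid carbon intensity, the simple summing of idle and load power, and the example inputs are all assumptions for illustration, not the exact factors GreenCompute uses:

```python
HOURS_PER_YEAR = 8760
GRID_INTENSITY_KG_PER_KWH = 0.4  # assumed average grid carbon intensity (kg CO2e/kWh)

def annual_carbon_footprint(active_idle_w, it_load_w, pue, server_count,
                            embodied_kgco2e_per_server_year):
    """Estimate a data center's total annual carbon footprint in kg CO2e."""
    # Average power draw per server (W): idle baseline plus IT load (simple sum assumed).
    avg_power_w = active_idle_w + it_load_w
    # Facility-level annual energy (kWh), scaled up by PUE and the server count.
    annual_kwh = avg_power_w / 1000 * HOURS_PER_YEAR * pue * server_count
    operational = annual_kwh * GRID_INTENSITY_KG_PER_KWH
    # Embodied carbon, assumed already amortized to a per-server, per-year figure.
    embodied = embodied_kgco2e_per_server_year * server_count
    return operational + embodied

# Example: 200 servers, 120 W idle, 350 W IT load, PUE of 1.5, 300 kg CO2e embodied each.
print(f"{annual_carbon_footprint(120, 350, 1.5, 200, 300):,.0f} kg CO2e per year")
```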

GenAI: Chatbot Recommendations

The GenAI approach employs a traditional RAG (Retrieval-Augmented Generation) chatbot, which is composed of the following steps (a minimal pipeline sketch follows the list):

  1. Document content extracted, chunked, and embedded into vector db
  2. User sends a query
  3. Query is vectorized and similarity search is conducted against our vector db
  4. Top-n most similar documents are returned and passed to the LLM with a custom prompt
  5. Answer is returned to the user
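
A minimal end-to-end sketch of these steps is shown below. To keep it self-contained, TF-IDF vectors and in-memory cosine similarity stand in for the production embedding model and vector database, the final LLM call is stubbed out, and the documents and query are made up:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Step 1: extract and chunk document content (chunk size is an arbitrary choice here).
documents = [
    "Hot/cold aisle containment reduces mixing of supply and return air...",
    "Raising supply air temperature setpoints can lower cooling energy and PUE...",
    "Consolidating underutilized servers cuts active idle power draw...",
]
def chunk(text, size=300):
    return [text[i:i + size] for i in range(0, len(text), size)]
chunks = [c for doc in documents for c in chunk(doc)]

# Stand-in for the embedding model and vector database: TF-IDF vectors kept in memory.
vectorizer = TfidfVectorizer().fit(chunks)
chunk_vectors = vectorizer.transform(chunks)

# Steps 2-4: vectorize the user query, run a similarity search, keep the top-n chunks.
query = "How can I improve my data center's cooling efficiency?"
scores = cosine_similarity(vectorizer.transform([query]), chunk_vectors)[0]
top_n = [chunks[i] for i in np.argsort(scores)[::-1][:2]]

# Step 5: pass the retrieved context and a custom prompt to the LLM.
# The LLM call is stubbed out here; the deployed system reaches a model via AWS Bedrock.
context = "\n".join(top_n)
prompt = f"Using the context provided, give an answer.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)
```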

The document content that we use comes from the Center of Energy Expertise library. With our SME, we indexed a select group of documents focused on increasing data center efficiency, with an emphasis on small to medium-sized data centers.

Evaluation

Our solution is evaluated through:

  • Model Accuracy: Achieving a high R-square and low Mean Squared Error (MSE).
  • Chatbot Metrics: Ensuring a high correctness score in our chatbot responses.

Classical ML: Data Center Carbon Footprint Estimations

We tested various models and selected the most accurate for each prediction task. We used a Gradient Boosting Regressor to estimate IT equipment electricity consumption, a Random Forest model for active idle power, and XGBoost for predicting embodied carbon and PUE.

To ensure reliable model performance, we focused on Mean Squared Error (MSE) and R-square (R²) as our evaluation metrics:

  • R-square (R²): Measures how well the independent variables explain the variance in the dependent variable. A higher R² indicates the model effectively captures patterns in the data, helping us understand the relationships between features and outputs.
  • Mean Squared Error (MSE): Quantifies the average squared difference between predicted and actual values, penalizing larger errors. This metric ensures the model minimizes prediction inaccuracies, critical for precise estimations of carbon footprints and energy consumption.

By combining these metrics, we can assess both the model's predictive power (R²) and its accuracy (MSE), ensuring robust and actionable results for data center carbon footprint estimation.
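
As a concrete illustration (with made-up values, not our model outputs), both metrics can be computed with scikit-learn:

```python
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical actual vs. predicted PUE values, for illustration only.
y_true = [1.45, 1.62, 1.30, 1.80, 1.55]
y_pred = [1.48, 1.58, 1.35, 1.74, 1.57]

mse = mean_squared_error(y_true, y_pred)  # penalizes larger errors quadratically
r2 = r2_score(y_true, y_pred)             # share of variance explained by the model
print(f"MSE: {mse:.4f}, R²: {r2:.3f}")
```

The table below summarizes the scores for each candidate model: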

Target Variable | Model | MSE | R²
Embodied Carbon | Linear Regression | 0.23 | 0.53
Embodied Carbon | Decision Tree | 0.10 | 0.80
Embodied Carbon | Random Forest | 0.08 | 0.83
Embodied Carbon | XGBoost | 0.06 | 0.87
IT Electricity | Random Forest | 14,409 | 0.96
IT Electricity | XGBoost | 8,339 | 0.97
IT Electricity | Neural Network | 18,455 | 0.94
Active Idle Power | Random Forest | 3,794 | 0.92
Active Idle Power | XGBoost | 5,573 | 0.89
Active Idle Power | KNN | 21,496 | 0.57
PUE | Linear Regression | 0.026 | 0.633
PUE | Decision Tree | 0.025 | 0.644
PUE | Random Forest | 0.025 | 0.644
PUE | XGBoost | 0.025 | 0.644

GenAI: Chatbot Recommendations

To evaluate our RAG chatbot, we first generated a synthetic dataset of question and answer pairs and had our SME remove questions which were deemed irrelevant for our application (~5%). We then used correctness as our evaluation metric, which is based on the G-Eval framework. In this framework, an LLM is used as a judge to determine if the provided answer is correctly answering the given question.
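
For reference, an LLM-as-judge correctness check in the G-Eval style can be set up roughly as below. The deepeval library, the criteria wording, and the sample Q/A pair are assumptions for illustration; the write-up does not specify which implementation was used.

```python
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCase, LLMTestCaseParams

# Correctness metric in the G-Eval style: an LLM judge scores the answer against a
# reference. deepeval is used here only as an illustration; it defaults to an OpenAI
# judge model, so an API key is required to actually run it.
correctness = GEval(
    name="Correctness",
    criteria="Determine whether the actual output correctly answers the input question, "
             "using the expected output as the reference.",
    evaluation_params=[
        LLMTestCaseParams.INPUT,
        LLMTestCaseParams.ACTUAL_OUTPUT,
        LLMTestCaseParams.EXPECTED_OUTPUT,
    ],
)

# One synthetic Q/A pair (hypothetical content) plus the chatbot's answer.
case = LLMTestCase(
    input="What airflow changes reduce cooling energy in a small data center?",
    actual_output="Contain hot and cold aisles and seal cable cutouts to prevent mixing.",
    expected_output="Hot/cold aisle containment and sealing bypass airflow paths.",
)
correctness.measure(case)
print(correctness.score, correctness.reason)
```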

We evaluated three different prompting strategies: a base prompt, a more advanced prompt, and an optimized prompt. The base prompt used a simple instruction: “Using the context provided, give an answer.” The advanced prompt added a role component and more context about data centers. Finally, the optimized prompt (tuned with the DSPy framework) provided extensive role and context descriptions along with a Chain-of-Thought approach and multiple question/answer pair examples. The results of each strategy are shown below, followed by a sketch of the optimization step:

Prompt Type | Correctness Score (%)
Base | 66.67
Advanced | 88.40 (+21.73)
Optimized | 92.14 (+3.74)
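
For context, prompt optimization with the DSPy framework typically follows the pattern sketched below. The signature, the training example, the metric, and the choice of the BootstrapFewShot optimizer are illustrative assumptions, not the exact setup behind the optimized prompt above.

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Configure a language model for DSPy (model choice is illustrative).
dspy.settings.configure(lm=dspy.LM("openai/gpt-4o-mini"))

class AnswerQuestion(dspy.Signature):
    """Answer a data center efficiency question using the retrieved context."""
    context = dspy.InputField(desc="retrieved document chunks")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="grounded recommendation")

# Chain-of-Thought module that DSPy will compile into an optimized prompt.
rag_answerer = dspy.ChainOfThought(AnswerQuestion)

# A couple of SME-reviewed Q/A examples (hypothetical content).
trainset = [
    dspy.Example(
        context="Hot/cold aisle containment reduces mixing of supply and return air.",
        question="How can airflow management improve cooling efficiency?",
        answer="Contain hot and cold aisles to prevent air mixing and raise setpoints.",
    ).with_inputs("context", "question"),
]

# Simple correctness-style metric used to guide few-shot bootstrapping.
def overlap_metric(example, prediction, trace=None):
    return example.answer.lower()[:20] in prediction.answer.lower()

optimizer = BootstrapFewShot(metric=overlap_metric)
optimized_answerer = optimizer.compile(rag_answerer, trainset=trainset)
```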

Architecture and Deployment

The GreenCompute platform is underpinned by a microservices architecture designed for scalability and maintainability. An overall view of the deployed components is featured below:

The deployment sequence is as follows (a sketch of the AWS service calls in step 3 appears after the list):

  1. API and frontend code images are built and stored in ECR via GitHub Actions.
  2. AWS ECS pulls the images and runs them as ECS tasks.
  3. The ECS API task connects to AWS RDS, S3, and Bedrock to load data and models.
  4. A load balancer attaches to AWS ECS and serves as the entry point for users to the frontend via an NGINX proxy.
  5. The entire infrastructure is deployed with the help of Terraform (IaC).
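
To make step 3 concrete, the sketch below shows what the API task's calls to S3 and Bedrock could look like. The bucket, object key, and model ID are hypothetical placeholders, the RDS connection is omitted, and running it requires valid AWS credentials:

```python
import boto3

# Hypothetical resource names; the real bucket, key, and model ID live in the Terraform config.
S3_BUCKET, MODEL_KEY = "greencompute-artifacts", "models/embodied_carbon_xgb.json"
BEDROCK_MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"

# Pull a trained model artifact from S3, as the API task does at startup.
boto3.client("s3").download_file(S3_BUCKET, MODEL_KEY, "/tmp/embodied_carbon_xgb.json")

# Send a RAG prompt to a Bedrock-hosted model via the Converse API.
bedrock = boto3.client("bedrock-runtime")
response = bedrock.converse(
    modelId=BEDROCK_MODEL_ID,
    messages=[{"role": "user", "content": [{"text": "Using the context provided, give an answer."}]}],
    inferenceConfig={"maxTokens": 512},
)
print(response["output"]["message"]["content"][0]["text"])
```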

Key Learnings

Our key learnings were derived from conversations with potential users who are experienced researchers or operators in the data center world. From our interviews with this group, we found that:

  • Simplicity is key. Facilitating the input of information gives GreenCompute a great advantage over current tools, which employ tedious ways of collecting user input.
  • References to our source material are highly valued. By linking to the sources our RAG chatbot utilizes, users can follow up and conduct deeper research on the recommendations provided.
  • Users want to further personalize their experience. We heard several comments about how each data center is different and that capturing these differences would greatly improve the user experience.
  • There are a lot of questions about AI. With the hype around AI, data center researchers and operators are more curious than ever about how it will impact their field and how they can leverage the technology to keep up.

Impact

  • Delivers unified energy and emission regression models that provide visibility into a data center's carbon footprint.
  • Empowers data centers to reduce carbon emissions, supporting global climate goals.
  • Assists organizations in meeting ESG reporting requirements.
  • Provides decision-makers with tools to balance sustainability with operational efficiency.

Future Work

There are three main areas to explore in the future: 

  1. Streamlined user experience - to collect granular information on the user’s data center for improved data features
  2. More personalized data center recommendations - by using the user’s input along with a wider range of expert corpora as context for our RAG chatbot, we can provide tailored recommendations for the user’s data center
  3. Time series forecasting - by providing a means of storing historical compute data, we can create time series forecasting models that could help users get a better understanding of their long term data center energy usage

Acknowledgements

We extend our gratitude to our project advisors, UC Berkeley W210 course instructors, and the organizations providing access to datasets. Special thanks to the teams behind LBNL and SPEC for their invaluable resources.

More Information

Last updated: December 10, 2024