CPU Microarchitecture for Data Centres

As Moore’s Law slows down and the world’s demand for computing power continues to grow, CPU microarchitecture researchers are challenged more than ever to rethink established mechanisms and develop innovative ideas for a sustainable future. Our research group takes on this challenge by addressing key bottlenecks in today’s CPUs, with a particular focus on data centers.

Currently, we focus on addressing the following challenges:

# Branch Prediction

Branch prediction is a critical microarchitectural mechanism for high-performance CPUs. By anticipating branch instructions and their outcomes, branch predictors keep processor pipelines continuously fed with instructions. Each misprediction triggers a pipeline flush, rendering the work of tens of cycles and hundreds of instructions useless. More accurate branch prediction not only enhances performance on today’s workloads but crucially enables more aggressive future CPU designs with wider pipelines and larger instruction windows.
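To make the mechanism concrete, here is a minimal sketch of a classic bimodal predictor built from 2-bit saturating counters. It is a textbook scheme chosen for illustration, not one of the designs studied by the group; the class name, the table size, and the branch address `0x400` are all assumptions.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters indexed by the branch address."""

    def __init__(self, table_bits=10):
        self.mask = (1 << table_bits) - 1
        # Counters range from 0 to 3; start at 1 (weakly not-taken).
        self.table = [1] * (1 << table_bits)

    def predict(self, pc):
        # Counter values 2 and 3 mean "predict taken".
        return self.table[pc & self.mask] >= 2

    def update(self, pc, taken):
        # Move the counter towards the actual outcome, saturating at 0 and 3.
        i = pc & self.mask
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def count_mispredictions(predictor, trace):
    """Replay (pc, taken) pairs and count wrong predictions."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# Hypothetical loop branch: taken 9 times, then not taken once, repeated 100 times.
loop_trace = [(0x400, i % 10 != 9) for i in range(1000)]
misses = count_mispredictions(BimodalPredictor(), loop_trace)
print(misses)
```

On this synthetic trace the predictor mispredicts once while warming up and once at every loop exit. Each of those mispredictions would correspond to a pipeline flush in a real core, which is why recurring patterns like this motivate predictors that use more history.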

Our recent research took a first step by introducing the Last-Level Branch Predictor (LLBP), a novel hierarchical branch predictor design that opens up new avenues for rethinking branch prediction strategies. By decoupling metadata storage from prediction logic, LLBP creates opportunities for novel algorithmic approaches and demonstrates a path to more sophisticated prediction mechanisms. We build on this work to boost branch prediction accuracy and unlock future CPU designs.

Keywords: Branch Prediction

# Frequent Context Switching

Today’s data center workloads are moving away from monolithic services towards event-based software systems. This architectural shift enables high concurrency of fine-grained tasks, with thousands of user requests executing simultaneously. The granular nature of these tasks allows for efficient co-scheduling on the same physical host, enhancing multi-tenancy and improving overall cloud resource utilization.

However, modern CPUs are designed for long-running workloads, where microarchitectural structures like branch predictors, caches, and prefetchers have time to learn program behavior and optimize for it. The short execution times and frequent context switches between hundreds of fine-grained tasks prevent the CPU from learning execution patterns, causing significant performance degradation. Our research addresses these challenges by developing novel mechanisms to support frequent context switches while preserving critical information about program behavior.
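The effect can be illustrated with a toy model. The sketch below assumes that every context switch wipes the branch predictor completely, which is a deliberate simplification: on real hardware the state is usually overwritten gradually by the other tasks. The function name, the table size, and the synthetic trace are hypothetical.

```python
def run_with_switches(trace, slice_len, table_bits=10):
    """Count mispredictions when predictor state is lost every slice_len branches."""
    mask = (1 << table_bits) - 1
    table = [1] * (1 << table_bits)  # 2-bit counters, weakly not-taken
    misses = 0
    for n, (pc, taken) in enumerate(trace):
        if n > 0 and n % slice_len == 0:
            # Assumption: a context switch discards all learned state.
            table = [1] * (1 << table_bits)
        i = pc & mask
        if (table[i] >= 2) != taken:
            misses += 1
        # Train the counter towards the actual outcome.
        table[i] = min(3, table[i] + 1) if taken else max(0, table[i] - 1)
    return misses


# Hypothetical task: 64 distinct always-taken branches, executed round-robin.
trace = [(0x1000 + 4 * (n % 64), True) for n in range(6400)]

long_slices = run_with_switches(trace, slice_len=6400)  # never interrupted
short_slices = run_with_switches(trace, slice_len=64)   # switched every 64 branches
print(long_slices, short_slices)
```

In this model the uninterrupted run only pays the warm-up cost once per branch, while the run with short slices has to relearn every branch after each switch and so never benefits from the predictor at all. Preserving some of that state across switches is the kind of opportunity the paragraph above describes.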

Keywords: Context switches, Microarchitectural state

If you are interested in working on these challenges with us, please do not hesitate to contact us. We’re looking for BSc, MSc, and PhD students.

An incomplete list of currently available projects:

Implementing and evaluating a modern branch target buffer (BTB) hierarchy in gem5.
Implementing a state-of-the-art BTB prefetcher in gem5.
Characterizing and optimizing modern data center applications in gem5.
Evaluating the performance of a multi-block ahead branch predictor in gem5.
…

Group Members

Dr. David Schall

Research Group Leader
david.schall@tum.de

Systems Research Group

Department of Computer Science // TUM School of Computation, Information and Technology
