- Language: English
- Type: Practical training
- Module: IN0012, IN2106, IN4312
- SWS: 6
- ECTS Credits: 10
- Prerequisites: There are no compulsory prerequisites, but we prefer students to be proficient in the basic concepts of operating systems, distributed systems, and systems programming (C/C++/Rust), or to have an equivalent background.
- Preferred prerequisite courses at TUM:
- IN0009: Basic Principles: Operating Systems and System Software
- IN0010: Introduction to Computer Networking and Distributed Systems
- IN2259: Distributed Systems
- Praktikum: Systems Programming in C++
- Registration: To register, you must be identified as a student in TUMonline.
- Note: Enrollment becomes compulsory two weeks after the matching outcome; students who do not de-register within this period will be registered for the exam.
The swiss-knife course covers some of the most important tools and workflows for building, deploying, and evaluating large-scale modern computer systems, such as those running in the cloud. The primary goal is to equip you with a set of generic skills for building and evaluating high-performance and scalable systems.
In particular, we will cover a range of topics over the semester through a set of lectures that provide the necessary background, each with an associated programming assignment. Note that the programming assignments are designed to promote “creativity and craftsmanship”: each assignment has a “basic” task to bootstrap it, coupled with open-ended challenges to push the boundaries!
More specifically, we will cover the following topics:
- Containers and microservices: How to build and deploy applications using containers for cloud environments. We will cover topics on containerization with Docker, deployment with Kubernetes, and system monitoring with Prometheus.
- Compilers and Dynamic Binary Translation (DBT): We will investigate program instrumentation in the following two flavors:
- (a) static analysis with the LLVM compiler,
- (b) dynamic analysis with DBTs, e.g., Intel Pin, DynamoRIO, or QEMU.
- Filesystems: We will investigate state-of-the-art filesystems, such as ext4 and Btrfs, and study their design and performance trade-offs using the fio framework.
- Concurrency: How to build systems for modern multicore systems, such as SMP and NUMA architectures. We will investigate the state-of-the-art Phoenix/PARSEC benchmark suites. We will also investigate lock-free data structures for parallel programs, user-space scheduling, and aspects of NUMA-aware memory management.
- Key-value (KV) systems: How to build and deploy systems leveraging modern KV stores (NoSQL), such as RocksDB, Memcached, and Redis. We will use the state-of-the-art YCSB benchmark for data access and transaction processing to evaluate modern KV stores.
- I/O stack: We will investigate high-performance I/O subsystems in the following flavors: (a) kernel-based I/O with io_uring and, if time permits, (b) kernel-bypass I/O using direct I/O libraries (SPDK and DPDK).
- Operating systems (unikernels and library OSes): We will investigate building applications using high-performance operating systems, such as unikernels (MirageOS, Unikraft) and library OSes (LKL).
- Introduction to a variety of system building tools (the swiss-knife!).
- Building and deploying state-of-the-art systems at scale.
- Skills for performance analysis, understanding system design, and choosing the best tool or workflow to solve a given problem.
Teaching and Learning Methods
This course consists of a set of programming modules related to different aspects of building computer systems. For each module, we will first present the necessary background in a lecture. A dedicated assignment then helps the students dig deeper into these concepts and become familiar with them through practical, hands-on tasks. The students will do these assignments in teams of 2-3. In addition, we will hold office hours where students can ask questions and clarify aspects of the programming tasks. The students will be required to complete each assignment within a fixed time frame (1 or 2 weeks, depending on its difficulty and workload) and submit their work in the system. The submitted work will then be evaluated through a peer-review system and by the instructors, and a grade will be calculated for each assignment on that basis.