Course Information
- Language: English
- Type: 3V + 2Ü
- Module: IN0006
- ECTS Credits: 6
- Prerequisites: Foundations of Programming (IN0002)
- Only students who have passed IN0002 or a comparable course may participate in this course
- You must have experience with object-oriented programming in Java
- TUMonline: You must register for this course in TUMonline before the course starts
- Contact:
- Post your questions to the corresponding channels on TUM Zulip
- For formal organizational matters, contact us at eist25@dos.cit.tum.de
- Please do not contact staff members using their email addresses
- Time:
- See TUMonline for the lecture and tutorial schedule
Content
Software engineering is the application of engineering principles to the development of software. It is a systematic approach that covers the entire lifecycle of software creation, from defining requirements and designing the architecture to coding, testing, deployment, monitoring, and ongoing maintenance.
Almost all modern software systems are designed for and deployed in the cloud. The new curriculum focuses on “software engineering for the cloud.” The course covers ten major topics in the design, development, and deployment of cloud software systems.
- Part I: Administrative
- Introduction to the course staff and tutors
- Organization of the lectures, tutorials, homework exercises, and exams
- Introduction to the course tools and textbooks
- Part II: Introduction to Software Engineering
- Explain the basic terminology: software engineering and activities
- Abstraction
- System modeling with UML
- Part III: Software engineering models
- Defined (e.g., Waterfall model) and agile methods (e.g., Scrum)
- Applying agile methods in software engineering projects
- A note on tactical vs. strategic programming
- Part IV: Course overview
- Course focus: Cloud software engineering
- An overview of software engineering activities in the cloud
- Part V: Introduction to Cloud Computing
- What is cloud computing?
- Why is it important for software engineering?
- Cloud hardware architecture
- Data center and cloud infrastructure
- Virtualization: Compute, storage, & network
- Types of cloud services
- IaaS, PaaS, and SaaS
- Types of cloud infrastructure
- Public, private, and hybrid cloud
- Part I: System design requirements
- Requirements engineering
- Stages in requirements engineering
- Functional vs non-functional requirements
- Non-functional requirements
- Scalability
- Reliability
- Availability
- Performance
- Security
- Maintainability
- Deployability
- Monitoring
- Part II: Software architectures
- What is a software architecture?
- An overview of the cloud software architectures
- Part III: Client-server architecture
- Architecture overview
- Communication layer
- REST
- Remote procedure calls (RPC) via gRPC
- Serialization and deserialization of structured data using Protobuf
- Part IV: Layered architecture
- Architecture overview
- Subsystem decomposition: layered architectures
- Ubiquitous adoption of layered architectures in systems
- Open vs closed layered architectures
- Different layers, different abstraction
- Pull complexity downward
- Three-tier architectures
- Definition of tiers
- Part V: Monolithic architectures
- Architecture overview
- Deployment models
- Advantages
- Limitations
- Part VI: Microservice architectures
- Architecture overview
- Advantages/drawbacks of microservice architectures
- How to design and build a simple microservice application
- Part I: System design challenges
- Software complexity and the quest for simplicity
- Design goals
- System design trade-offs
- Hints for computer system design
- Part II: Modularity
- System decomposition into sub-systems/components
- Subsystem decomposition: Modules
- Create an initial subsystem decomposition
- Differentiate between coupling and cohesion
- Pattern implementation:
- Facade pattern
- Interface design
- Shallow vs deep modules: The trade-offs between interface and functionality
- General purpose modules are deeper
- Information hiding (and leakage) principle
- Part III: Data management
- What are data management systems?
- Key-value stores (KVS) (In-memory/Persistent)
- Filesystems
- Shared log
- Databases
- Part IV: Pattern implementation
- MVC pattern
- Part I: Performance
- Performance metrics: Latency, throughput, utilization, SLAs
- A systems approach to designing for performance
- Measurement-driven approach to building high-performance systems
- Identifying bottlenecks (time-based profiling)
- Automated performance profiling tools (Linux perf and Flamegraphs)
- Design hints for performance:
- Resource splitting
- Caching
- Compute in the background
- Batch processing
- Parallelism
- Part II: Concurrency (or Scale Up!)
- Why Concurrency?
- Single-node parallelism: multicores, accelerators, smart devices (SSDs/NICs)
- Process vs threads
- Thread scheduling
- Cooperative vs. preemptive schedulers
- Scheduling policies: round robin, fairness-based, priority-based, earliest-deadline first
- Parallel programming
- Managing threads: thread spawning, thread pools
- Communication mechanisms and critical sections
- Synchronization primitives: Semaphores, mutexes, barriers, readers-writer locks
- Lock-free data structures
- Problems with threads
- Race Conditions
- Deadlocks, livelocks
- Starvation
- Parallel programming patterns
- Fork-join
- Data-parallel programming pattern w/ barriers
- Synchronous vs asynchronous
- Part III: Scalability (or Scale Out!)
- Why scalability? (Limitations of single-node scaling up!)
- Challenges of scalability
- Scaling stateless applications
- Load balancers
- Scaling stateful applications
- Sharding
- Consistent hashing
- Secondary indexes
- Scalable system architectures
- A generic controller-workers architecture for building scalable systems
- Scalable computation case study: Data-parallel programming model (MapReduce)
- Scalable data management case study: Distributed key-value stores (KVS)
- Part I: Security
- Security engineering
- Security policies: Threat model and security (CIA) properties
- Security design principles
- Least privilege
- Compartmentalization
- Isolation via privilege mediation
- A general recipe for secure systems design
- Access control
- Access control lists (ACLs)
- Capabilities
- Software security in the cloud
- Security challenges in the cloud
- Secure systems stack: Compute, network, and storage
- Authentication, key management, and attestation
- Part II: Reliability and availability
- Terminology: System failures, fault types and sources, and properties and metrics
- Single-node fault-tolerant systems
- Write-ahead logging for system reliability
- Issues with a logging-only approach
- Replication as the general recipe
- Replication for stateless services
- Issues with replication for stateful services
- Replication for stateful services
- Primary-backup replication
- State machine replication
- Part III: Pattern implementation
- Adapter pattern
- Observer pattern
- Strategy pattern
- Part I: Faults and failures in software
- Terminology and impact
- Part II: Software testing
- Testing overview
- Black box, grey box, white box testing
- Unit testing
- Apply unit testing with JUnit
- Integration testing
- Stubs and drivers
- Bottom-up, top-down, sandwich integration
- Part III: Automated large-scale system testing
- Fuzzing (AFL)
- Symbolic execution (KLEE, S2E, Angr)
- System crashing for resiliency (Chaos Monkey)
- Part IV: Mock testing
- Test doubles
- Part I: Program analysis
- Motivation and terminology
- Trade-offs: Soundness, completeness, static vs. dynamic analysis
- Part II: Static analysis tools
- Compiler warnings/errors
- Infer
- Clang Analyzer and Clang Tidy
- SpotBugs
- Part III: Brief introduction to C
- Part IV: Dynamic analysis tools
- Undefined behavior
- Memory safety issues
- Dynamic binary instrumentation
- Compiler-assisted instrumentation (LLVM)
- Valgrind
- Sanitizers (AddressSanitizer, ThreadSanitizer, MemorySanitizer, UBSan)
- Combining Fuzzing with Sanitizers
- Part I: Source code management
- Source code management
- Version control
- Branch management
- Centralized (Piper) vs decentralized (Git) source code management
- Part II: Build systems
- Why a build system?
- Task-based build systems
- Artifact-based build systems
- Distributed builds
- Dependency management
- Hermeticity
- Part III: Release management
- Release planning
- Software versioning
- Software upgrades
- Part IV: Continuous *
- Continuous integration
- Continuous delivery
- Continuous deployment
- Continuous testing/fuzzing w/ OSS-Fuzz
- Part I: Software deployment models in the cloud
- Workflows, advantages, challenges, use-cases, examples, and best practices
- Bare metal
- Virtual machines
- Containers
- Serverless
- Part II: “Hello World in the Cloud” w/ container
- Understanding the design, implementation, and deployment of a simple cloud application
- Introducing aspects of container-based (Docker) application development and deployment
- Part III: Cloud orchestration
- Cluster management
- Kubernetes: A container orchestration system
- How to deploy a simple microservice application with Kubernetes
- Part IV: Cloud systems monitoring
- Why system monitoring?
- Prometheus architecture
- Metrics types
- Visualizing metrics with Grafana
- Alerting mechanisms
- An example: Instrumenting a simple HTTP server
- Part I: Software quality
- Software quality management
- Reviewing
- // Comments
- Code refactoring
- Trustworthy software systems
- Formal verification
- Code compliance
- Part II: Project management
- Project management
- Work breakdown structure
- Team organization
- Communication mechanisms
- Part III: Exam preparation
- Exam format
- Q&A
Teaching and Learning Methods
The course uses three formats for teaching and learning:
- Lectures: Lectures cover theoretical concepts related to the course topics listed above. We provide lecture slides with additional references (e.g., book chapters, papers, and blog posts). In-person lectures are accompanied by a live video stream and video recordings for asynchronous learning available at: https://live.rbg.tum.de/.
- Tutorials: Tutorials focus on the practical application of the course topics outlined above. We employ tutorial slides and in-tutorial programming exercises to reinforce the concepts introduced in lectures. These exercises will also prepare you for the homework assignments.
For detailed tutorial information, please visit the Course Wiki: https://collab.dvb.bayern/display/TUMeist/Tutorial+Information
- Homework: We offer homework assignments in two formats:
- Graded programming exercises for bonus points
- Ungraded programming exercises for self-study
Course Material
We distribute the course material primarily through two platforms:
- Artemis: Slides and programming exercises
- Course Wiki page: Tutorial information and course forms
Communication Medium
We use the official Zulip chat server hosted by TUM for all communication about general queries, tutorials, and homework exercises. Please see the Course Wiki page to locate the respective Zulip channels.
Lecturers
Prof. Pramod Bhatotia
Course Lead
Dr. Marco Elver 🇩🇪
Guest Lecturer (L07 and L08)
Dr. Jörg Thalheim 🇩🇪
Guest Lecturer (L09)
Teaching Assistants (TAs)
Dimitrios Stavrakakis 🇬🇷
Head Tutorial Instructor
Manos Giortamis
Head Exercise Instructor
Exam Instructors
Martin Fink 🇮🇹
Head Exam Instructor