students | Marcin Copik

Open Thesis Topics

For open thesis topics, see our SPCL website (available only in the ETH network).

Active Theses

2025

XaaS Containers

Eiman Alnuaimi

Master Thesis, 2025

Conducting a base crypto analysis of the SHArP protocol

Simone Kalbermatter

Semester Project, 2025

Co-supervised with Marcin Chrapek.

Practical Workload Co-location with rFaaS

Ritvik Ranjan

Semester Project, 2025

Finished Theses

2024

Serverless and Cloud Runtimes for Graph‐of‐Thoughts

Andrea Jiang

Master Thesis, 2024

Co-supervised with Maciej Besta.

Serverless Co‐location with ML

Entiol Liko

Semester Project, 2024

Co-supervised with Lukas Trümper.

Long‐Term Serverless Performance Variability

Oana Rosca

Semester Project, 2024

Adoption and Evolution of C++

Constantin Dragancea

Master Thesis, 2024

2023

Adoption and evolution of C++ in HPC Applications

Boyan Zhou

Master Thesis, 2023

Transparent Serverless Programming

Raul-Ronald Galea

Master Thesis, 2023

In recent years, serverless computing has emerged as a popular cloud computing paradigm, focusing strongly on ease of use and automatic scalability. Serverless platforms, such as Amazon Web Services (AWS) Lambda, offer several advantages over traditional cloud computing models: managed infrastructure, auto-scaling, and millisecond-level billing, thereby significantly reducing development time and improving cost-efficiency. However, applications still need to be built explicitly for use with the cloud serverless APIs, and existing applications cannot run directly on, e.g., AWS Lambda. In this thesis, we explore the feasibility of transparently running arbitrary unmodified programs on serverless computing platforms by intercepting the creation of OS processes and offloading them to the cloud. OS processes are a common denominator among many programs, which enables us to treat them as opaque. In contrast, existing approaches either sacrifice transparency and require modifications to user programs, or are specifically tailored or optimized to a particular programming language or application. Operating at a lower level of abstraction, the process level, enables both transparency and programming language agnosticism. We demonstrate our hypothesis by offloading a Python image processing program and compilation build systems. We extend the computational resources of a 16-vCPU VM via offloading to speed up an image resizing task by a factor of more than 5 and to improve the compilation time of the Inkscape project by a factor of more than 2.

2022

Process-as-a-Service computing on modern serverless platforms.

Gyorgy Rethy

Master Thesis, 2022

Current serverless and FaaS offerings provide developers with a platform that frees them from the burden of infrastructure management while scaling automatically to demand. This has made them increasingly popular, and as such there have been multiple initiatives to build cost-effective large-scale computational systems on top of them. Unfortunately, current platforms have severe limitations, especially around communication, that prevent these systems from achieving the expected results. Over the years, many solutions have been proposed and implemented. In this thesis, we look at one such new model, Process-as-a-Service (PraaS), and evaluate how it compares to the state of the art. PraaS combines ephemeral functions with transient state and a data plane that allows for MPI-style, point-to-point messaging. We explore how PraaS could be implemented on top of an existing container orchestration system and then combine it with a new serverless workflow executor. In the end, we show that workflows developed for this implementation can outperform even the most expensive current alternatives.

Serverless workflows benchmarking.

Laurin Brandner

Master Thesis, 2022

In recent years, serverless computing has gained increasing attention in research and industry. Its potential in scalability and efficiency has led major cloud vendors to introduce workflow services that orchestrate serverless functions efficiently. However, these frameworks are based on entirely different architectures, whose characteristics have been poorly studied. Moreover, the rapid development of these commercial systems makes it hard to keep track of their pros and cons. To fill the knowledge gap, we introduce a framework to compare and evaluate serverless workflow systems. It consists of three components: a model, a platform-agnostic workflow definition, and a benchmark suite. The model is a high-level abstraction of workflows and acts as the basis for a rigorous analysis. We introduce a new workflow definition that transcribes into multiple proprietary paradigms. We use it to implement the benchmark suite, composed of four micro-benchmarks and five application benchmarks. Together, they serve as a tool to analyze in depth the services offered by AWS, Azure, and Google Cloud. We evaluate them in terms of scalability, runtime, overhead, and more, yielding an overview of the current state of the art.

Serverless C++ Executor.

Lukas Möller

Bachelor Thesis, 2022

Serverless functions have lately been gaining traction in the world of high-performance applications, where the dynamic scheduling features that serverless cloud environments exhibit can be used to offload CPU-intensive work to the cloud. This is especially advantageous for workloads where dynamic parallelism is required. However, using serverless platforms for this purpose remains difficult in languages like C++, which are traditionally used for high-performance applications. To solve this problem, we introduce cppless, a single-source programming model for high-performance serverless applications. Cppless allows users to write serverless functions together with the code that uses them to allow transparent offloading. This allows us to provide a common abstraction layer for serverless platforms and enables composable, modular architectures that make use of serverless functions. We evaluate Cppless using several high-performance problems. The results show that Cppless provides a low-overhead interface for serverless applications.

Profiling and optimizations of serverless functions.

Malte Wächter

Bachelor Thesis, 2022

Serverless computing, or Function-as-a-Service, is an emerging and promising cloud execution model. The critical difference between Function-as-a-Service and a traditional cloud execution method is that applications are broken down into smaller stateless functions that interact with each other driven by results, and the executing infrastructure is abstracted away for ease of use. In this process, the ability to collect performance metrics is hampered. We present the profiling and tracing framework FaaS-Profiler. The framework generalizes executions in a serverless environment. It is designed with the goal of instrumenting serverless functions without much effort, taking measurements to collect metrics, and tracing functions end to end. We evaluate our design with a concrete implementation in Python on Amazon Web Services and Google Cloud Platform.
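As a rough illustration of low-effort instrumentation, a decorator can wrap a function handler and record metrics around each invocation. This is a minimal sketch, not the FaaS-Profiler interface: the names `profiled`, `RECORDS`, and `handler` are hypothetical, and only wall-clock time and peak Python heap usage are measured.

```python
import functools
import time
import tracemalloc

# Hypothetical in-memory sink; a real profiler would export records elsewhere.
RECORDS = []


def profiled(func):
    """Wrap a handler and record duration and peak traced memory per call."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tracemalloc.start()
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on success and on exceptions, so failed calls are recorded too.
            duration = time.perf_counter() - start
            _, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            RECORDS.append(
                {"function": func.__name__, "seconds": duration, "peak_bytes": peak}
            )

    return wrapper


@profiled
def handler(event, context=None):
    # Hypothetical handler: sums the numbers in the event payload.
    return {"sum": sum(event["values"])}
```

Calling `handler({"values": [1, 2, 3]})` returns the handler's result unchanged and appends one record to `RECORDS`. The sketch does not support nested profiled calls, because the inner call stops memory tracing for the outer one.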

2021

Serverless memory deduplication.

Qiu Wei

Master Thesis, 2021

Serverless computing is an emerging cloud computing paradigm well-known for its high scalability and cost-efficiency. Serverless applications are defined as a collection of short-lived, stateless serverless functions that can be executed massively in parallel in lightweight containers. We notice that redundancy exists in the memory footprint of functions that use the same libraries or frameworks, or pull the same Docker images. However, this redundancy is not fully exploited by the cloud providers. Even with the page cache sharing in OverlayFS used by Docker containers, identical memory that is anonymous or comes from different disk files still cannot be shared. We propose USM, a built-in module in the Linux kernel which enables memory sharing among different serverless functions based on the content-based page sharing concept. USM allows the user to advise the kernel of a memory area that can be shared with others through the madvise system call, regardless of whether the memory is anonymous or file-backed. We demonstrate that USM reduces memory consumption by up to 20% on 16 concurrent containers running a typical image recognition function, which allows three more containers running the same function to be added to the system.

Serverless GPU functions.

Lukas Tobler

Master Thesis, 2021

For many years, serverless has been an emerging computing paradigm, with Function-as-a-Service (FaaS) being especially popular. When it comes to GPU-enabled machine learning applications, commercial options for FaaS are limited. GPU execution nodes are not typically available because of their high cost and the difficulty of efficiently sharing them between tenants in isolated environments. Multi-Instance GPU (MIG) is a new feature of the NVIDIA A100 device of the Ampere architecture that provides performance and security isolation by partitioning one physical GPU into multiple GPU instances of configurable size. MIG opens up the possibility to build serverless platforms with stronger isolation than what was possible previously. We present the GPUless system, a prototype client-server CUDA execution service based on MIG isolation. The client intercepts the CUDA API and is compatible with current-generation PyTorch machine learning applications. The server provides dynamic resource management for MIG devices of requested size for clients and provides an execution environment. We present a novel way of transporting CUDA API calls over the network: aggregating call traces and only synchronizing with the remote executor when necessary, reducing network overheads. We show that in cold-start scenarios, our system can be very effective. We apply some optimizations to overcome implementation inefficiencies in machine learning frameworks that lead to a cold-start performance of our system that is even faster than native execution in some cases. We can also show that in a hot execution setting (model is pre-initialized), we can achieve performance close to native execution while still being orders of magnitude faster than execution in the AWS (Amazon Web Services) Lambda environment. An analysis of bandwidth requirements shows that our system will perform well if at least 1 Gbps of network bandwidth is available.

Verification of representativeness of benchmarking suite.

Arnet Colin

Bachelor Thesis, 2021

The representativeness of a benchmark is achieved with programs that cover different characteristics. A benchmark is more reliable when it shows the system performance of as many different applications as possible. In this thesis, we develop a new benchmark and compare it with the SPEC benchmark to check its representativeness. The idea for the new benchmark is to select popular C++ open-source projects. These programs should represent tasks that computers encounter every day. It is a challenge to generate inputs for programs that provide good code coverage. We tried using the KLEE symbolic execution engine to automatically generate inputs. KLEE is designed to generate test inputs, and it turned out to be very hard to turn test inputs into suitable benchmark inputs, so we selected the inputs manually. With the completed new benchmark, we started the comparison with SPEC. By measuring hardware events with perf, we can compare the characteristics of both benchmarks. The comparison revealed that Linux kernel compilation has widely different program properties than every other benchmark, showing that there can always be program behaviors that are not covered by a benchmark.

Serverless collectives.

Roman Böhringer

Master Thesis, 2021

Serverless platforms provide massive parallelism with very high elasticity and fine-grained billing. Because of these properties, they are increasingly used for stateful, distributed jobs at large scales. However, a major limitation of the commonly used platforms is communication: individual functions cannot communicate directly, and using external storage or databases for ephemeral data can be slow and expensive. We present FMI, the FaaS Message Interface, to overcome this limitation. FMI is an easy-to-use, high-performance framework for general-purpose communication in Function as a Service platforms. It supports different communication channels (including direct communication with our TCP NAT hole punching system), a model-driven channel selection according to performance or cost, and provides optimized collective implementations that exploit characteristics of the different channels. In our experiments, FMI can speed up communication for a distributed machine learning job by up to 1,200x, while reducing cost at the same time by factors of up to 365. It provides a simple interface and can be integrated into existing codebases with a few minor changes.

FaaStest collectives: reliable communication in serverless world

Emir İşman

Bachelor Thesis, 2021

Since their inception, serverless functions have been deployed for many different scenarios, yet there has been little research into performing effective collective operations with serverless functions. The ephemeral nature of functions and the absence of native inter-function communication present challenges for communication-heavy workloads. We investigate the current state of the field on inter-function communication and propose a design for tree-based collective operations. We evaluate our design with different options for communication between functions, including both shared-storage-based and network-based solutions, on a serverless reduce prototype we have deployed on AWS Lambda.

Offloading serverless with sPIN.

Konrad Handrick

Bachelor Thesis, 2021

Co-supervised with Salvatore di Girolamo.

TaintImpact: Taint-Based Change Impact Analysis.

Tobias Lüscher

Bachelor Thesis, 2021

Code reviews are an essential part of software development. In order to successfully perform code reviews, developers need to understand the impact of code changes thoroughly. However, understanding this impact is a notoriously difficult task. Therefore, we introduce TaintImpact. TaintImpact supports developers in the task of code reviews by showing the impact of a change. More specifically, TaintImpact starts with a git commit and then computes an impact set of the change using dynamic taint analysis. Further, we introduce dynamic blame, a tool that extends git blame to not only show the commit that last modified a given line, but also to show the commit that last impacted a given line. We have evaluated TaintImpact on three artificially produced examples and on a real-world bug. The results show that TaintImpact helps developers to focus their attention on parts of the code that are not obviously impacted by a change. Further, comparing the impact sets of different program configurations supports developers in finding bugs in software changes.

Code-driven Language Development: Framework for Analysis of C/C++ Open-Source Projects

Siegfried Hartogs

Bachelor Thesis, 2021

C++ has grown substantially during the last ten years, and features such as move semantics, parameter packs, and keywords such as constexpr have been added. Along with that come guidelines to write correct and maintainable C++ code. While there has been work describing the adoption of features in C++ code, such studies typically employ a specialized tool to analyze source code, making them inadequate to analyze new language features in the future. This thesis overcomes that limitation by building a framework on top of which analyses can be written, be it to study new keywords or the adoption of programming guidelines. We achieve this by leveraging Clang to provide us with the abstract syntax tree (AST) of the input code. Our tool has the advantage that it is fit to analyze upcoming features, since future versions of Clang will parse the source code for us. This simplifies research about the adoption of features by avoiding the technicalities of parsing source code and should be accurate thanks to the rich representation of C/C++ in the AST. We demonstrate the results of the tool by showing insights gained about, among other features, the adoption of range-based loops, parameter packs, and C++ Standard Library containers and algorithms.

CppBuild: Large-Scale, Automatic Build System for Open Source C++ Repositories

Lukas Gygi

Bachelor Thesis, 2021

A large collection of codebases can provide valuable insights into the requirements and features of real-world applications of programming language toolchains. In order to analyze C/C++ source code and its build process as well as the resulting binaries, the code repository has to be downloaded, dependencies have to be installed, and the build process has to be executed. Today, there is no standardized way to execute these steps, as they often differ from project to project. Our goal is to automate these processes by leveraging existing open-source repositories and continuous integration systems. To do this, we extend the current version of FBACode (Fetch Build Analyze Code), which uses GitHub repositories and Debian packages as source repositories. The tool downloads the code base, identifies the build system used, attempts to install dependencies, and builds the packages using the Clang/LLVM toolchain. As an output, the tool generates a wide range of statistics, for example errors, compile time, and installed dependencies, as well as Clang AST files and LLVM IR for further analysis. The tool has a success rate of 82.8% for Debian packages, and 41.9% of GitHub repositories with recognized build systems compiled successfully.
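The build-system identification step can be pictured as a lookup of well-known marker files in the repository root. The following is a minimal sketch under that assumption and is not taken from FBACode: the function `detect_build_system`, the marker list, and its priority order are all illustrative choices.

```python
from pathlib import Path

# Marker files checked in priority order; this list is an illustrative
# assumption, not the set of build systems recognized by FBACode.
BUILD_SYSTEM_MARKERS = [
    ("cmake", "CMakeLists.txt"),
    ("meson", "meson.build"),
    ("autotools", "configure.ac"),
    ("make", "Makefile"),
]


def detect_build_system(repo_root):
    """Return the first build system whose marker file exists, else None."""
    root = Path(repo_root)
    for name, marker in BUILD_SYSTEM_MARKERS:
        if (root / marker).is_file():
            return name
    return None
```

A repository containing both `CMakeLists.txt` and a generated `Makefile` is classified as `cmake` here because of the ordering; a repository with none of the markers yields `None`, which corresponds to an unrecognized build system.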

2020

Control Flow Taint Analysis for Performance Modeling in LLVM

Nicolas Wicki

Bachelor Thesis, 2020

2019

Towards Extreme-Scale Cache Coherence Protocols and Simulations

Philipp J. Bomatter

Bachelor Thesis, 2019

Research in the field of high-performance computing strives to optimize large-scale computations, especially for scientific applications, to the highest possible degree. The first step in the pursuit of this goal is usually to acquire a profound understanding of the relevant underlying mechanisms of the hardware. In this thesis, we focus on memory accesses and cache architectures along with scalable cache coherence protocols. Caches are often critical for performance, but performance estimations become increasingly complex at large scales. To address this challenge, we focus on the development and the design of a simulation framework for large-scale cache-coherent systems along with the theoretical work. Like simulations in many other settings, cache coherence simulations can provide new insights and open up a way to understand the complex workings of cache coherence. Concretely, we extend the current version of LogGOPSim, an established lightweight simulation infrastructure, with the functionality to simulate memory accesses and distributed directory-based cache coherence schemes. Our work can facilitate the development of future cache coherence protocols and architectures.

Resources

LLVM resources: general, introductions, compiler passes.