Efficiently Compiling Efficient Query Plans for Modern Hardware

This paper is about HyPer’s compiler
They argue the volcano style comes from a time when I/O really dominated runtimes so it wasn’t really worth optimising this part. The benefit of being able to develop faster was better
A lot of other databases remedy this by using the vector model, but that’s really just the volcano model with larger tuples to use cache locality a bit better. This however, also bring back the issue of adding complexity and removes the big advantage of pipelining
They tried using C++ only, but found that the C++ compiler would spend too much time compiling and not do particularly good things in its caches
Instead they opted to use LLVM for most code except for complicated functions like sorting
This dropped their compile times down significantly because the C++ components would be precompiled then they can use JIT on the LLVM instead
They talk about how writing the LLVM optimally had quite a big impact, especially around branch predication, avoiding inlining everything into one function,

Mutable

Mutable was initially created to enable their research on JIT in SQL to wasm. Their EDBT 2023 paper talks about it
As seen above, their benchmarks brag about significantly better compile times and execution times than HyPer

At this point I started recording my thoughts straight onto the slides instead.