The big things remaining to prepare my presentation are

  1. Get a solid understanding of HyPer and Mutable
  2. Finish up benchmarking
  3. … finish making the slides themselves

So let’s talk about slides, and T1W8 Benchmarking talks about benchmarks

Efficiently Compiling Efficient Query Plans for Modern Hardware

  1. This paper is about HyPer’s compiler
  2. They argue the volcano style comes from a time when I/O really dominated runtimes so it wasn’t really worth optimising this part. The benefit of being able to develop faster was better
  3. A lot of other databases remedy this by using the vector model, but that’s really just the volcano model with larger tuples to use cache locality a bit better. This however, also bring back the issue of adding complexity and removes the big advantage of pipelining
  4. They tried using C++ only, but found that the C++ compiler would spend too much time compiling and not do particularly good things in its caches
  5. Instead they opted to use LLVM for most code except for complicated functions like sorting
  6. This dropped their compile times down significantly because the C++ components would be precompiled then they can use JIT on the LLVM instead
  7. They talk about how writing the LLVM optimally had quite a big impact, especially around branch predication, avoiding inlining everything into one function,

Mutable

  1. Mutable was initially created to enable their research on JIT in SQL to wasm. Their EDBT 2023 paper talks about it

  2. As seen above, their benchmarks brag about significantly better compile times and execution times than HyPer

At this point I started recording my thoughts straight onto the slides instead.