Advanced Topics

  • Hardware Heterogeneity is increasing: systems are composed of many different types of processors and accelerators, such as CPUs, GPUs, FPGAs, and specialised AI chips.
  • Data Model Heterogeneity is also increasing, with systems needing to handle diverse data formats and structures.
  • Workload Heterogeneity is on the rise, as applications require different types of processing power and resources. Mixed workloads are also becoming more common, where different types of tasks are executed simultaneously on the same system.

1. Query Compilation

Query compilers translate logical plans into executable programs. One option is to traverse the plan tree with a visitor, generating code for each operator. Because the compiled program avoids the per-tuple interpretation overhead of a classical kernel, this can give a large performance improvement.
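
A minimal sketch of that visitor approach, assuming a toy plan with hypothetical Scan and Filter operators and a made-up emit call (not any particular system's API): the visitor walks the tree and splices together one fused C-like loop instead of interpreting each operator.

    #include <iostream>
    #include <memory>
    #include <string>

    struct CodeGen;

    struct Operator {
        virtual ~Operator() = default;
        virtual std::string accept(CodeGen& gen) const = 0;
    };

    struct Scan : Operator {
        std::string table;
        explicit Scan(std::string t) : table(std::move(t)) {}
        std::string accept(CodeGen& gen) const override;
    };

    struct Filter : Operator {
        std::string predicate;
        std::unique_ptr<Operator> child;
        Filter(std::string p, std::unique_ptr<Operator> c)
            : predicate(std::move(p)), child(std::move(c)) {}
        std::string accept(CodeGen& gen) const override;
    };

    // The visitor: one code-emitting method per operator type.
    struct CodeGen {
        std::string visit(const Scan& s) {
            return "for (auto& row : " + s.table + ") {\n";
        }
        std::string visit(const Filter& f) {
            // Emit the child's loop, then nest the predicate check inside it.
            return f.child->accept(*this) +
                   "  if (!(" + f.predicate + ")) continue;\n";
        }
    };

    std::string Scan::accept(CodeGen& gen) const { return gen.visit(*this); }
    std::string Filter::accept(CodeGen& gen) const { return gen.visit(*this); }

    int main() {
        Filter plan("row.price > 100", std::make_unique<Scan>("lineitem"));
        CodeGen gen;
        // One fused loop comes out, ready to hand to a compiler backend.
        std::cout << plan.accept(gen) << "  emit(row);\n}\n";
    }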

1.1 Intermediate Algebra

Compilation essentially replaces the interpreting kernel, introducing a boundary between the logical plan and the executable. Optimisation is now done by the compiler, which is more difficult to control and tune.

The DBMS kernel knew a lot about the data but little about the hardware; the compiler knows a lot about the hardware but little about the data.

We could introduce an intermediate algebra between the logical plan and the executable code. This algebra would be designed to be easy to compile, and would allow for optimisations to be applied at this level.

1.2 VOODOO

VOODOO is a query compiler that uses an intermediate algebra called Voodoo Algebra. This algebra uses vector operations to represent data processing tasks.

VOODOO can analyse the data, find parallelisable tasks, and generate vectorised code. It also abstracts away hardware details, allowing optimisations to be applied at a higher level.
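
To illustrate the style, here is a rough C++ sketch; the greater and gather primitives below are invented for this example and are not Voodoo's actual operators. A selection becomes a pipeline of vector operations, each of which is easy to parallelise or vectorise.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Illustrative vector primitives (not Voodoo's actual operator names).
    // Each loop is data-parallel, so it maps naturally onto SIMD lanes
    // or GPU threads.
    std::vector<uint8_t> greater(const std::vector<int>& v, int c) {
        std::vector<uint8_t> mask(v.size());
        for (std::size_t i = 0; i < v.size(); ++i)
            mask[i] = v[i] > c;              // branch-free comparison vector
        return mask;
    }

    std::vector<int> gather(const std::vector<int>& v,
                            const std::vector<uint8_t>& mask) {
        std::vector<int> out;
        for (std::size_t i = 0; i < v.size(); ++i)
            if (mask[i]) out.push_back(v[i]); // keep the marked positions
        return out;
    }

    int main() {
        std::vector<int> price = {120, 80, 200, 50};
        // SELECT price WHERE price > 100, as a pipeline of vector operations.
        std::vector<int> hits = gather(price, greater(price, 100));
    }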

2. Adaptive Indexing

Cracking builds indexes incrementally as queries are executed. Instead of fully sorting the data, we partition it into pieces based on the query predicates. This gives fast index creation as well as faster query execution, and as queries keep coming in, the index becomes more refined, leading to better performance over time.
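
A minimal sketch of cracking a single column, assuming crack-in-two on each range predicate (the structure and names are illustrative, not a particular system's): only the piece the predicate falls into gets partitioned, and the boundary is remembered for later queries.

    #include <algorithm>
    #include <cstddef>
    #include <map>
    #include <vector>

    // Sketch of a cracked column. 'cracks' maps a bound b to the first index
    // holding values >= b, so each entry records one past partitioning step.
    struct CrackedColumn {
        std::vector<int> data;
        std::map<int, std::size_t> cracks;

        // Answer "value < bound" by cracking only the piece the bound falls in.
        // Returns pos such that data[0 .. pos) are exactly the qualifying values.
        std::size_t crack(int bound) {
            auto hit = cracks.find(bound);
            if (hit != cracks.end()) return hit->second;  // piece already exists
            // Locate the enclosing piece via the neighbouring crack points.
            std::size_t lo = 0, hi = data.size();
            auto next = cracks.lower_bound(bound);
            if (next != cracks.end()) hi = next->second;
            if (next != cracks.begin()) lo = std::prev(next)->second;
            // Partition just that piece; the rest of the column is untouched.
            auto mid = std::partition(data.begin() + lo, data.begin() + hi,
                                      [&](int v) { return v < bound; });
            std::size_t pos = static_cast<std::size_t>(mid - data.begin());
            cracks[bound] = pos;   // remember the boundary for future queries
            return pos;
        }
    };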

2.1 Hoare Partitioning

Hoare Partitioning works by selecting a pivot element and partitioning the data into two halves: elements less than the pivot and elements greater than or equal to it. This is the heart of cracking.

However, Hoare Partitioning is very bad from a branch prediction point of view: the comparison against the pivot succeeds roughly 50% of the time, so the branch is essentially unpredictable. To fix this, we use predication, performing pointer arithmetic to turn control dependencies into data dependencies, as in the sketch below.
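
For example, a branch-free selection (an illustrative fragment, not from any particular system) turns the comparison into an index increment:

    #include <cstddef>
    #include <vector>

    // Branch-free selection: copy every value below 'pivot' into 'out'.
    // The comparison feeds an index increment (a data dependency) instead
    // of a conditional jump (a control dependency), so there is nothing
    // for the branch predictor to get wrong.
    std::size_t select_lt(const std::vector<int>& in, int pivot,
                          std::vector<int>& out) {
        out.resize(in.size());
        std::size_t n = 0;
        for (int v : in) {
            out[n] = v;          // write unconditionally
            n += (v < pivot);    // cursor advances only if the value qualifies
        }
        out.resize(n);
        return n;
    }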

2.2 Predicated Cracking

Predicated cracking works with an active and backup register:

  1. Select a pivot.
  2. Read the head of the data into active and tail into backup.
  3. Compare active with the pivot while simultaneously writing the active register to both the head and the tail of the data.
  4. Advance the cursor on the side where the value belongs (predicated), so that write is the one that survives. Also set the active register to the value at the moved cursor.
  5. Swap the active and backup registers, then repeat from step (3).

This has no branching and works in place (fewer cache misses). The index also tunes itself as the DBMS runs more queries.
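
Putting the steps together, a hedged C++ sketch (the std::clamp on the refill read is an artefact of this sketch to keep it memory-safe on the last iteration, where the value read is never used; real implementations handle the cursor boundary explicitly):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Sketch of predicated cracking following the steps above. Every iteration
    // writes 'active' to BOTH cursor positions; only the write on the side
    // where the value belongs survives, because that cursor steps past it,
    // while the other position is rewritten on a later iteration.
    std::size_t predicated_crack(std::vector<int>& data, int pivot) {
        int begin = 0, end = static_cast<int>(data.size()) - 1;
        if (end < 0) return 0;
        int active = data[begin];            // step 2: head into active,
        int backup = data[end];              //         tail into backup
        while (begin <= end) {
            data[begin] = active;            // step 3: write active to head
            data[end]   = active;            //         and tail at once
            int lo = active < pivot;         //         compare with the pivot
            begin += lo;                     // step 4: advance exactly one
            end   -= 1 - lo;                 //         cursor (predicated)
            active = backup;                 // steps 4+5: old backup becomes
            backup = data[std::clamp(       //  active; refill backup from the
                lo ? begin : end,            //  value at the moved cursor
                0, static_cast<int>(data.size()) - 1)];
        }
        return static_cast<std::size_t>(begin);  // first index >= pivot
    }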

3. Stream Processing

The two stacks algorithm can be turned into a dataflow graph with no control flow, which can then be used as an intermediate representation for stream processing. The two stacks correspond to a prefix and a suffix scan of the data.
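
A sketch of the two stacks algorithm itself, specialised here to a sliding-window max (class and method names are illustrative): the back stack accumulates a prefix scan as elements are inserted, and the flip on eviction performs the suffix scan.

    #include <algorithm>
    #include <limits>
    #include <stack>

    // Two stacks sliding-window aggregation for max. Each entry carries the
    // running aggregate of everything beneath it, so a window query is a
    // single combine of the two stack tops.
    class SlidingMax {
        struct Entry { int value; int agg; };
        std::stack<Entry> front;   // serves evictions (oldest elements)
        std::stack<Entry> back;    // serves insertions (newest elements)

        static int top_agg(const std::stack<Entry>& s) {
            return s.empty() ? std::numeric_limits<int>::min() : s.top().agg;
        }

    public:
        void insert(int v) {       // extends the prefix scan on the back stack
            back.push({v, std::max(v, top_agg(back))});
        }

        void evict() {             // drop the oldest element; window non-empty
            if (front.empty()) {
                // Flip back onto front, recomputing running aggregates as we
                // go. This pass is the suffix scan mentioned above.
                while (!back.empty()) {
                    int v = back.top().value;
                    back.pop();
                    front.push({v, std::max(v, top_agg(front))});
                }
            }
            front.pop();
        }

        int query() const {        // max over the whole current window
            return std::max(top_agg(front), top_agg(back));
        }
    };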

The dataflow graph will have dead nodes that can be eliminated, optimising the execution.
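
As an illustrative sketch of that clean-up (the names are invented for this example), dead nodes are exactly those not reachable backwards from the graph's outputs:

    #include <functional>
    #include <set>
    #include <vector>

    // Dead node elimination, sketched: a node is live only if some graph
    // output depends on it (directly or transitively); everything else can
    // be removed before execution.
    struct DataflowGraph {
        // inputs[n] lists the nodes whose results node n consumes.
        std::vector<std::vector<int>> inputs;
    };

    std::set<int> live_nodes(const DataflowGraph& g,
                             const std::vector<int>& outputs) {
        std::set<int> live;
        std::function<void(int)> mark = [&](int n) {
            if (!live.insert(n).second) return;   // already marked live
            for (int in : g.inputs[n]) mark(in);  // its inputs are live too
        };
        for (int out : outputs) mark(out);
        return live;   // nodes outside this set are dead and can be dropped
    }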

4. Composable DPS

Composable Data Processing Systems allow us to replace components of a DBMS with specialised systems. For example, we could replace the storage engine with a cloud object store, or the query processor with a stream processing engine.
