Yet another vectorized engine

From: Hubert Zhang <hzhang(at)pivotal(dot)io>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Gang Xiong <gxiong(at)pivotal(dot)io>, Asim R P <apraveen(at)pivotal(dot)io>, Ning Yu <nyu(at)pivotal(dot)io>
Subject: Yet another vectorized engine
Date: 2019-11-28 09:23:59
Message-ID: CAB0yrem3PYu2qQD4=JOg8y_5QAK+Q+K5yEptxs0t71j5cRyoOQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

We just want to introduce another POC for vectorized execution engine
https://github.com/zhangh43/vectorize_engine and want to get some feedback
on the idea.

The basic idea is to extend the TupleTableSlot and introduce
VectorTupleTableSlot, which is an array of datums organized by projected
columns. The array of datum per column is continuous in memory. This makes
the expression evaluation cache friendly and SIMD could be utilized. We
have refactored the SeqScanNode and AggNode to support VectorTupleTableSlot
currently.

Below are features in our design.
1. Pure extension. We don't hack any code into postgres kernel.

2. CustomScan node. We use CustomScan framework to replace original
executor node such as SeqScan, Agg etc. Based on CustomScan, we could
extend the CustomScanState, BeginCustomScan(), ExecCustomScan(),
EndCustomScan() interface to implement vectorize executor logic.

3. Post planner hook. After plan is generated, we use plan_tree_walker to
traverse the plan tree and check whether it could be vectorized. If yes,
the non-vectorized nodes (SeqScan, Agg etc.) are replaced with vectorized
nodes (in form of CustomScan node) and use vectorized executor. If no, we
will revert to the original plan and use non-vectorized executor. In future
this part could be enhanced, for example, instead of revert to original
plan when some nodes cannot be vectorized, we could add Batch/UnBatch node
to generate a plan with both vectorized as well as non-vectorized node.

4. Support implement new vectorized executor node gradually. We currently
only vectorized SeqScan and Agg but other queries which including Join
could also be run when vectorize extension is enabled.

5. Inherit original executor code. Instead of rewriting the whole executor,
we choose a more smooth method to modify current Postgres executor node and
make it vectorized. We copy the current executor node's c file into our
extension, and add vectorize logic based on it. When Postgres enhance its
executor, we could relatively easily merge them back. We want to know
whether this is a good way to write vectorized executor extension?

6. Pluggable storage. Postgres has supported pluggable storage now.
TupleTableSlot is refactored as abstract struct TupleTableSlotOps.
VectorTupleTableSlot could be implemented under this framework when we
upgrade the extension to latest PG.

We run the TPCH(10G) benchmark and result of Q1 is 50sec(PG) V.S.
28sec(Vectorized PG). Performance gain can be improved by:
1. heap tuple deform occupy many CPUs. We will try zedstore in future,
since vectorized executor is more compatible with column store.

2. vectorized agg is not fully vectorized and we have many optimization
need to do. For example, batch compute the hash value, optimize hash table
for vectorized HashAgg.

3. Conversion cost from Datum to actual type and vice versa is also high,
for example DatumGetFloat4 & Float4GetDatum. One optimization maybe that we
store the actual type in VectorTupleTableSlot directly, instead of an array
of datums.

Related works:
1. VOPS is a vectorized execution extension. Link:
https://github.com/postgrespro/vops.
It doesn't use custom scan framework and use UDF to do the vectorized
operation e.g. it changes the SQL syntax to do aggregation.

2. Citus vectorized executor is another POC. Link:
https://github.com/citusdata/postgres_vectorization_test.
It uses ExecutorRun_hook to run the vectorized executor and uses cstore fdw
to support column storage.

Note that the vectorized executor engine is based on PG9.6 now, but it
could be ported to master / zedstore with some effort. We would appreciate
some feedback before moving further in that direction.

Thanks,
Hubert Zhang, Gang Xiong, Ning Yu, Asim Praveen

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2019-11-28 09:25:14 Re: Why JIT speed improvement is so modest?
Previous Message Amit Langote 2019-11-28 09:01:18 Re: [Incident report]Backend process crashed when executing 2pc transaction