Diagonal storage model

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Diagonal storage model
Date: 2018-04-01 12:48:07
Message-ID: 5AC0D507.5070105@postgrespro.ru
Lists: pgsql-hackers

Hi hackers,

The vertical (columnar) storage model is the best fit for analytics, which is why it is widely used in OLAP-oriented databases such as Vertica, HyPer, KDB, ...
In Postgres we have the cstore extension, which cannot provide all the benefits of the vertical model because the executor lacks support for vector operations.
The situation could change if we had a pluggable storage API with support for vectorized execution.

But the vertical model is not so good for updates and data loading (because data is mostly imported in horizontal format).
This is why in most existing systems the data is present in both formats (at least for some time).
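
To make the two formats concrete, here is a minimal sketch of how the same records would be laid out horizontally (row by row) and vertically (column by column). The table contents and the function names to_horizontal and to_vertical are made up for illustration; this is not how Postgres or cstore actually store pages.

```python
# Hypothetical three-column table; names and values are illustrative only.
rows = [
    (1, "alice", 10.0),
    (2, "bob", 20.0),
    (3, "carol", 30.0),
]


def to_horizontal(rows):
    """Row store: all columns of the first record, then all of the second, ..."""
    return [value for row in rows for value in row]


def to_vertical(rows):
    """Column store: the first column of every record, then the second column, ..."""
    return [value for column in zip(*rows) for value in column]


horizontal = to_horizontal(rows)
vertical = to_vertical(rows)
```

Appending a record to the horizontal layout is a single write at the end, while in the vertical layout it touches every column run, which is the update and load cost mentioned above.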

I want to announce a new model, "diagonal storage", which combines the benefits of both approaches.
The idea is very simple: we first store column 1 of the first record, then column 2 of the second record, and so on until we reach the last column.
After that we store the second column of the first record, the third column of the second record, ...
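
The ordering described above can be sketched as follows. This is only an illustration and not the code from the attached patch. It assumes a square tile (as many records as columns) and assumes that each diagonal wraps around to the first column after the last one; the message does not spell out either detail. The names to_diagonal and from_diagonal are hypothetical.

```python
def to_diagonal(tile):
    """Flatten a square tile (n records x n columns) in diagonal order.

    Pass k stores column (i + k) mod n of record i, so pass 0 is the main
    diagonal. The wrap-around is an assumption of this sketch.
    """
    n = len(tile)
    if any(len(record) != n for record in tile):
        raise ValueError("this sketch assumes as many records as columns")
    flat = []
    for k in range(n):          # which diagonal
        for i in range(n):      # which record
            flat.append(tile[i][(i + k) % n])
    return flat


def from_diagonal(flat, n):
    """Inverse of to_diagonal: rebuild the n x n tile from the flat layout."""
    tile = [[None] * n for _ in range(n)]
    pos = 0
    for k in range(n):
        for i in range(n):
            tile[i][(i + k) % n] = flat[pos]
            pos += 1
    return tile


tile = [
    ["r1c1", "r1c2", "r1c3"],
    ["r2c1", "r2c2", "r2c3"],
    ["r3c1", "r3c2", "r3c3"],
]
diagonal = to_diagonal(tile)
```

With this 3x3 tile the stored order is r1c1, r2c2, r3c3, then r1c2, r2c3, r3c1, then r1c3, r2c1, r3c2, and from_diagonal(diagonal, 3) gives back the original tile.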

Profiling of TPC-H queries shows that most of the query execution time (about 17%) is spent in heap_deform_tuple.
The new format will significantly reduce the time spent on heap deforming, because each tile contains just one column of any particular record.
Moreover, we can deform many tuples in parallel, which is especially efficient on quantum computers.

Please find attached a patch with a first prototype implementation. It improves performance by about 3.14 times on most TPC-H queries.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment         Content-Type        Size
diagonal.patch.gz  application/x-gzip  505 bytes
