Re: Columnar storage support

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Columnar storage support
Date: 2017-10-10 08:05:18
Message-ID: a5d4fc3b-8508-4e20-9c1d-eab887eb5870@postgrespro.ru
Lists: pgsql-hackers

Unfortunately, C-Store does not let you take full advantage of a columnar
store: you still cannot perform vector operations.
C-Store reduces the amount of data read from disk by
1. fetching only the columns that are used in the query,
2. compressing the data.
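
To illustrate the two effects listed above, here is a minimal Python
sketch. It is not C-Store code: the table layout, the column names and
the use of zlib as a stand-in codec are all assumptions made only for
this example.

```python
import struct
import zlib

# Hypothetical table: (id int32, price float64, status 1-byte char), 10,000 rows.
rows = [(i, float(i % 100), b"A" if i % 2 == 0 else b"B") for i in range(10_000)]

# Row (horizontal) layout: every tuple is stored whole,
# so a sequential scan reads all columns of every row.
row_format = struct.Struct("<idc")  # 4 + 8 + 1 = 13 bytes per tuple, no padding
row_store = b"".join(row_format.pack(*r) for r in rows)

# Column (vertical) layout: each column is stored contiguously on its own.
status_column = b"".join(r[2] for r in rows)

# 1. A query that touches only "status" reads just that column.
bytes_row_scan = len(row_store)
bytes_column_scan = len(status_column)

# 2. A single column holds similar values and tends to compress well
#    (zlib is only a stand-in for whatever codec a real store uses).
bytes_compressed = len(zlib.compress(status_column))

print(bytes_row_scan, bytes_column_scan, bytes_compressed)
```

Both savings apply to I/O only; the values still reach the executor one
tuple at a time.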

This gives some benefit in query execution speed for cold data
(data that is not yet loaded into cache).
For warm data there is almost no difference (except for very large tables
that cannot fit in memory).

But the main advantage of the vertical data format, vector data
processing, is possible only with a specialized executor.
There is a prototype of a vector executor for C-Store:
https://github.com/citusdata/postgres_vectorization_test
It provides a 3-4x speedup on some queries, but it is really a prototype
and research project, far from practical use.
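
The difference between a tuple-at-a-time executor and a vectorized one
can be sketched as follows. This is a conceptual illustration in Python,
not code from the prototype above; the function names and the tile size
of 64 are hypothetical. The point is only that the vectorized variant
pays the per-call executor overhead once per tile instead of once per
row.

```python
from array import array

TILE_SIZE = 64  # hypothetical vector length


def scan_tuples(column):
    """Tuple-at-a-time scan: hands one value to the caller per step."""
    for value in column:
        yield value


def scan_tiles(column, tile_size=TILE_SIZE):
    """Vectorized scan: hands a tile of up to tile_size values per step."""
    for start in range(0, len(column), tile_size):
        yield column[start:start + tile_size]


def sum_tuple_at_a_time(column):
    total, calls = 0.0, 0
    for value in scan_tuples(column):
        total += value
        calls += 1  # one executor round trip per row
    return total, calls


def sum_vectorized(column):
    total, calls = 0.0, 0
    for tile in scan_tiles(column):
        total += sum(tile)  # tight inner loop over a contiguous tile
        calls += 1  # one executor round trip per tile
    return total, calls


prices = array("d", (float(i % 10) for i in range(1000)))
scalar_total, scalar_calls = sum_tuple_at_a_time(prices)
vector_total, vector_calls = sum_vectorized(prices)
print(scalar_total, scalar_calls, vector_total, vector_calls)
```

In a real executor written in C, the inner loop over a tile is also
where the compiler and CPU can apply SIMD instructions and keep data in
cache, which this Python sketch does not model.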

I have also developed two columnar store extensions for Postgres:
IMCS (In-Memory-Columnar-Store): https://github.com/knizhnik/imcs.git
VOPS (Vectorized Operations): https://github.com/postgrespro/vops.git

The first one is oriented more toward in-memory databases (although it
supports spilling data to disk) and requires special functions to
manipulate columnar data.
In this case the columnar store is a copy of the main (horizontal) store
(normal Postgres tables).

VOPS is more recent work that allows more or less normal SQL (using a
foreign data wrapper and user-defined types/operators).
In VOPS, data is stored inside normal Postgres tables, but using vectors
(tiles) instead of scalars.
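
The idea of storing tiles instead of scalars can be sketched as follows.
This is a conceptual Python illustration and does not use the VOPS API:
the helper names, the (quantity, price) table and the tile size of 4 are
hypothetical. Each tile row holds one small vector per column, and a
filter is evaluated as a mask over the whole tile.

```python
TILE_SIZE = 4  # deliberately tiny for readability; hypothetical value


def pack_into_tiles(rows, tile_size=TILE_SIZE):
    """Turn scalar rows into tile rows: one vector per column per tile."""
    tiles = []
    for start in range(0, len(rows), tile_size):
        chunk = rows[start:start + tile_size]
        # zip(*chunk) transposes the chunk into one sequence per column.
        tiles.append(tuple(list(col) for col in zip(*chunk)))
    return tiles


def filtered_sum(tiles, threshold):
    """SUM(price) WHERE quantity > threshold, evaluated tile by tile."""
    total = 0.0
    for quantity, price in tiles:
        mask = [q > threshold for q in quantity]  # vector comparison
        total += sum(p for p, keep in zip(price, mask) if keep)
    return total


# Hypothetical scalar table with columns (quantity, price).
rows = [(1, 10.0), (5, 20.0), (3, 30.0), (7, 40.0), (2, 50.0), (9, 60.0)]
tiles = pack_into_tiles(rows)
print(len(tiles), filtered_sum(tiles, 4))
```

Six scalar rows become two tile rows here, so the table itself stays an
ordinary table while each operator call works on several values at once.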

Both IMCS and VOPS provide a 10-100x speedup on queries like Q1 in
TPC-H (sequential scan with filtering and aggregation).
For queries involving joins there is almost no benefit compared with
normal Postgres.

There is also a columnar storage extension developed by Fujitsu:
https://www.postgresql.org/message-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN+hCy8xOLAh2doYhVg5d6HsAA@mail.gmail.com
But the published patch is only a first step in this direction, and it
is not yet possible either to use it in practice or to run experiments
measuring the possible performance improvement.

On 09.10.2017 23:06, Joshua D. Drake wrote:
> On 10/09/2017 01:03 PM, legrand legrand wrote:
>> Is there a chance that pluggable storage permits creation of a
>> columnar rdbms
>> as monetDB in PostgreSQL ?
>> Thanks in advance for the answer
>
> The extension C-Store from Citus is probably what you are looking for.
>
> jD
>
>>
>>
>>
>> --
>> Sent from:
>> http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
>>
>>
>
>

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
