Re: Zedstore - compressed in-core columnar storage

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-04-09 15:45:26
Message-ID: 3978b57e-fe25-ca6b-f56c-48084417e115@postgrespro.ru
Lists: pgsql-hackers

On 09.04.2019 18:08, Heikki Linnakangas wrote:
> On 09/04/2019 18:00, Konstantin Knizhnik wrote:
>> On 09.04.2019 17:09, Konstantin Knizhnik wrote:
>>> standard Postgres heap and my VOPS extension.
>>> As test data I used the TPC-H benchmark (actually only the lineitem
>>> table, generated with the tpch-dbgen utility at scale factor 10, ~8GB
>>> database). I attached the script which I used to populate the data
>>> (you have to download, build and run tpch-dbgen yourself; you can
>>> also comment out the code related to VOPS).
>
> Cool, thanks!
>
>>> Unfortunately I failed to load data in zedstore:
>>>
>>> postgres=# insert into zedstore_lineitem_projection (select
>>> l_shipdate,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag::"char",l_linestatus::"char"
>>>
>>> from lineitem);
>>> psql: ERROR:  compression failed. what now?
>>> Time: 237804.775 ms (03:57.805)
>
> Yeah, it's still early days, it will crash and burn in a lot of cases.
> We wanted to publish this early, to gather ideas and comments on the
> high level design, and to validate that the table AM API that's in v12
> is usable.
>
>> Looks like the original problem was caused by the internal postgres
>> compressor: I had not configured Postgres to use lz4.
>> When I configured Postgres --with-lz4, the data was correctly inserted
>> into the zedstore table, but it looks like it is not compressed at all:
>>
>> postgres=# select pg_relation_size('zedstore_lineitem_projection');
>>    pg_relation_size
>> ------------------
>>          9363010640
>
> The single-insert codepath isn't very optimized yet. If you populate
> the table with large "INSERT ... SELECT ...", you end up with a huge
> undo log. Try loading it with COPY.
>
> You can also see how many pages of each type there is with:
>
> select count(*), pg_zs_page_type('zedstore_lineitem_projection', g)
>   from generate_series(0,
> pg_table_size('zedstore_lineitem_projection') / 8192 - 1) g group by 2;
>
> - Heikki

postgres=# copy zedstore_lineitem from '/mnt/data/lineitem.tbl'
delimiter '|' csv;
COPY 59986052
Time: 232802.257 ms (03:52.802)
postgres=# select pg_relation_size('zedstore_lineitem');
 pg_relation_size
------------------
      10346504192
(1 row)
postgres=# select count(*), pg_zs_page_type('zedstore_lineitem', g)
  from generate_series(0, pg_table_size('zedstore_lineitem') / 8192 -
1) g group by 2;
  count  | pg_zs_page_type
---------+-----------------
       1 | META
 1262308 | BTREE
     692 | UNDO
(3 rows)
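As a quick cross-check (my own arithmetic, not part of the thread;
assuming the default 8KB block size), the page-type counts above account
exactly for the relation size reported by pg_relation_size:

```python
# Sanity check: page counts from pg_zs_page_type, times the default
# 8KB block size, should reproduce pg_relation_size for the table.
page_counts = {"META": 1, "BTREE": 1262308, "UNDO": 692}
block_size = 8192  # default BLCKSZ

total_bytes = sum(page_counts.values()) * block_size
print(total_bytes)  # 10346504192, matching pg_relation_size above
```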

And now performance is much worse:
Time: 99819.476 ms (01:39.819)

It is strange, because the main advantage of a columnar store is that it
has to fetch only the accessed columns.
What I see is that in non-parallel mode (max_parallel_workers_per_gather
= 0) the backend consumes about 11GB of memory. That fits in my desktop
RAM (16GB), and the query takes ~58 seconds.
But once I start 4 parallel workers, they cause heavy swapping:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
28195 knizhnik  20   0 11.823g 6.553g 5.072g D   7.6 42.2   0:17.19 postgres
28074 knizhnik  20   0 11.848g 6.726g 5.223g D   7.3 43.3   4:14.96 postgres
28192 knizhnik  20   0 11.854g 6.586g 5.075g D   7.3 42.4   0:17.18 postgres
28193 knizhnik  20   0 11.870g 6.594g 5.064g D   7.3 42.4   0:17.19 postgres
28194 knizhnik  20   0 11.854g 6.589g 5.078g D   7.3 42.4   0:17.09 postgres

which is also strange because data should be present in shared buffers.
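A rough back-of-the-envelope estimate from the top output above (my own
arithmetic; approximating each backend's private memory as RES minus SHR
and counting the shared region once) suggests why 16GB gets tight:

```python
# Rough estimate from the top output: private memory per backend is
# approximately RES - SHR; the shared region should be counted only once.
res = [6.553, 6.726, 6.586, 6.594, 6.589]  # GB, RES column from top
shr = [5.072, 5.223, 5.075, 5.064, 5.078]  # GB, SHR column from top

private_total = sum(r - s for r, s in zip(res, shr))  # ~7.5 GB private
shared_once = max(shr)                                # shared region
print(round(private_total + shared_once, 1))  # ~12.8 GB on a 16GB box
```

That still leaves the kernel, page cache, and everything else to fit in
the remaining ~3GB, so swapping under load is at least plausible.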

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
