Re: Zedstore - compressed in-core columnar storage

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-04-09 15:00:39
Message-ID: cd2a0ed0-698f-85c8-a775-e25caa3ceb5a@postgrespro.ru
Lists: pgsql-hackers

On 09.04.2019 17:09, Konstantin Knizhnik wrote:
> Hi,
>
> On 09.04.2019 3:27, Ashwin Agrawal wrote:
>> Heikki and I have been hacking recently for few weeks to implement
>> in-core columnar storage for PostgreSQL. Here's the design and initial
>> implementation of Zedstore, compressed in-core columnar storage (table
>> access method). Attaching the patch and link to github branch [1] to
>> follow along.
>
> Thank you for publishing this patch. IMHO Postgres is really missing
> proper support for columnar storage, and the table access method
> API is the best way of integrating it.
>
> I wanted to compare memory footprint and performance of zedstore with
> standard Postgres heap and my VOPS extension.
> As test data I used the TPC-H benchmark (actually only the lineitem table,
> generated with the tpch-dbgen utility with scale factor 10, ~8GB database).
> I attached the script which I used to populate the data (you have to
> download, build and run tpch-dbgen yourself; you can also comment out
> the code related to VOPS).
> Unfortunately I failed to load the data into zedstore:
>
> postgres=# insert into zedstore_lineitem_projection (select
> l_shipdate,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag::"char",l_linestatus::"char"
> from lineitem);
> psql: ERROR:  compression failed. what now?
> Time: 237804.775 ms (03:57.805)
>
>
> Then I tried to check whether there is anything in the
> zedstore_lineitem_projection table:
>
> postgres=# select count(*) from zedstore_lineitem_projection;
> psql: WARNING:  terminating connection because of crash of another
> server process
> DETAIL:  The postmaster has commanded this server process to roll back
> the current transaction and exit, because another server process
> exited abnormally and possibly corrupted shared memory.
> HINT:  In a moment you should be able to reconnect to the database and
> repeat your command.
> psql: server closed the connection unexpectedly
>     This probably means the server terminated abnormally
>     before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> Time: 145710.828 ms (02:25.711)
>
>
> The backend consumed 16GB of RAM and 16GB of swap and was killed by the
> OOM killer (undo log?).
> A subsequent attempt to run the same command failed with the
> following error:
>
> postgres=# select count(*) from zedstore_lineitem_projection;
> psql: ERROR:  unexpected level encountered when descending tree
>
>
> So the only thing I can do at this moment is report the size of the
> tables on disk:
>
> postgres=# select pg_relation_size('lineitem');
>  pg_relation_size
> ------------------
>       10455441408
> (1 row)
>
>
> postgres=# select pg_relation_size('lineitem_projection');
>  pg_relation_size
> ------------------
>        3129974784
> (1 row)
>
> postgres=# select pg_relation_size('vops_lineitem_projection');
>  pg_relation_size
> ------------------
>        1535647744
> (1 row)
>
> postgres=# select pg_relation_size('zedstore_lineitem_projection');
>  pg_relation_size
> ------------------
>        2303688704
> (1 row)
>
>
> But I do not know how much data was actually loaded into the zedstore table...
> Actually the main question is why this table is not empty if the INSERT
> statement failed?
>
> Please let me know if I can somehow help you to reproduce and
> investigate the problem.
>

It looks like the original problem was caused by the internal Postgres
compressor: I had not configured Postgres to use lz4.
After I configured Postgres --with-lz4, the data was inserted correctly
into the zedstore table, but it looks like it is not compressed at all:

postgres=# select pg_relation_size('zedstore_lineitem_projection');
 pg_relation_size
------------------
       9363010640
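
For a side-by-side view of all four tables, a query along the following
lines could be used. This is only a sketch: it assumes the four tables
named earlier in this thread exist in the current database, and the
ratio_to_heap column is an illustrative addition, not part of
vstore_bench.sql. Note that pg_relation_size() reports only the main
fork of each relation.

```sql
-- Report the on-disk size of each table and its ratio to the plain
-- heap table. Table names are the ones used earlier in this thread;
-- ratio_to_heap is a hypothetical helper column for comparison.
select t.name,
       pg_size_pretty(pg_relation_size(t.name::regclass)) as size,
       round(pg_relation_size(t.name::regclass)::numeric
             / pg_relation_size('lineitem'::regclass), 3) as ratio_to_heap
from (values ('lineitem'),
             ('lineitem_projection'),
             ('vops_lineitem_projection'),
             ('zedstore_lineitem_projection')) as t(name)
order by pg_relation_size(t.name::regclass);
```

A ratio close to 1 for zedstore_lineitem_projection would be consistent
with the data not being compressed.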

No wonder that zedstore shows the worst results:

lineitem                        6240.261 ms
lineitem_projection             5390.446 ms
zedstore_lineitem_projection   23310.341 ms
vops_lineitem_projection         439.731 ms

An updated version of vstore_bench.sql is attached (sorry, there were some
errors in the previous version of this script).

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
vstore_bench.sql application/sql 4.3 KB
