Re: Zedstore - compressed in-core columnar storage

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-04-09 14:09:21
Message-ID: bf4b8f5f-90c9-025d-013c-87cf998204ad@postgrespro.ru
Lists: pgsql-hackers

Hi,

On 09.04.2019 3:27, Ashwin Agrawal wrote:
> Heikki and I have been hacking recently for few weeks to implement
> in-core columnar storage for PostgreSQL. Here's the design and initial
> implementation of Zedstore, compressed in-core columnar storage (table
> access method). Attaching the patch and link to github branch [1] to
> follow along.

Thank you for publishing this patch. IMHO Postgres really lacks proper
support for columnar storage, and the table access method API is the
best way to integrate it.

I wanted to compare the memory footprint and performance of zedstore with
the standard Postgres heap and my VOPS extension.
As test data I used the TPC-H benchmark (actually only the lineitem table,
generated with the tpch-dbgen utility at scale factor 10, a ~8GB database).
I attached the script which I used to populate the data (you have to
download, build and run tpch-dbgen yourself; you can also comment out the
code related to VOPS).
Unfortunately I failed to load the data into zedstore:

postgres=# insert into zedstore_lineitem_projection (select
l_shipdate,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag::"char",l_linestatus::"char"
from lineitem);
psql: ERROR:  compression failed. what now?
Time: 237804.775 ms (03:57.805)

Then I tried to check whether there is anything in the
zedstore_lineitem_projection table:

postgres=# select count(*) from zedstore_lineitem_projection;
psql: WARNING:  terminating connection because of crash of another
server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
psql: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 145710.828 ms (02:25.711)

The backend consumed 16GB of RAM and 16GB of swap and was killed by the
OOM killer (undo log?).
A subsequent attempt to run the same command failed with the following
error:

postgres=# select count(*) from zedstore_lineitem_projection;
psql: ERROR:  unexpected level encountered when descending tree

So the only thing I can do at the moment is report the sizes of the
tables on disk:

postgres=# select pg_relation_size('lineitem');
 pg_relation_size
------------------
      10455441408
(1 row)

postgres=# select pg_relation_size('lineitem_projection');
 pg_relation_size
------------------
       3129974784
(1 row)

postgres=# select pg_relation_size('vops_lineitem_projection');
 pg_relation_size
------------------
       1535647744
(1 row)

postgres=# select pg_relation_size('zedstore_lineitem_projection');
 pg_relation_size
------------------
       2303688704
(1 row)
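
The four sizes above could also be collected with a single query. This
is only a minimal sketch, assuming the same four table names exist in
the current database; the column aliases (tbl, bytes, pretty) are made
up for illustration:

```sql
-- Report the on-disk size of each test table in one result set.
-- pg_relation_size() with one argument measures only the main fork,
-- the same number as in the individual queries above.
-- pg_size_pretty() only formats the byte count for easier reading.
select t.tbl,
       pg_relation_size(t.tbl::regclass) as bytes,
       pg_size_pretty(pg_relation_size(t.tbl::regclass)) as pretty
from (values ('lineitem'),
             ('lineitem_projection'),
             ('vops_lineitem_projection'),
             ('zedstore_lineitem_projection')) as t(tbl)
order by bytes desc;
```

Note that the zedstore number would still reflect whatever the failed
INSERT left behind, so it is not a valid basis for comparison.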

But I do not know how much data was actually loaded into the zedstore
table... Actually, the main question is why this table is not empty if
the INSERT statement failed.

Please let me know if I can somehow help you to reproduce and
investigate the problem.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment: vstore_bench.sql (application/sql, 1.9 KB)
