Re: Zedstore - compressed in-core columnar storage

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-04-09 15:08:40
Message-ID: 29097cfc-4e60-57c1-12e8-074d82bd6f33@iki.fi
Lists: pgsql-hackers

On 09/04/2019 18:00, Konstantin Knizhnik wrote:
> On 09.04.2019 17:09, Konstantin Knizhnik wrote:
>> standard Postgres heap and my VOPS extension.
>> As test data I used the TPC-H benchmark (actually only the lineitem table,
>> generated with the tpch-dbgen utility with scale factor 10, a ~8GB database).
>> I attached the script which I used to populate the data (you have to
>> download, build and run tpch-dbgen yourself; you can also comment out the
>> code related to VOPS).

Cool, thanks!

>> Unfortunately I failed to load data in zedstore:
>>
>> postgres=# insert into zedstore_lineitem_projection (select
>> l_shipdate,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag::"char",l_linestatus::"char"
>> from lineitem);
>> psql: ERROR:  compression failed. what now?
>> Time: 237804.775 ms (03:57.805)

Yeah, it's still early days, it will crash and burn in a lot of cases.
We wanted to publish this early, to gather ideas and comments on the
high level design, and to validate that the table AM API that's in v12
is usable.

> Looks like the original problem was caused by the internal Postgres
> compressor: I had not configured Postgres to use lz4.
> When I configured Postgres --with-lz4, the data was correctly inserted into
> the zedstore table, but it looks like it is not compressed at all:
>
> postgres=# select pg_relation_size('zedstore_lineitem_projection');
>  pg_relation_size
> ------------------
>        9363010640

The single-insert codepath isn't very optimized yet. If you populate the
table with a large "INSERT ... SELECT ...", you end up with a huge undo
log. Try loading it with COPY.
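
As a rough sketch of the COPY route, assuming the column order of
zedstore_lineitem_projection matches the select list from the failed
INSERT, and using a hypothetical server-side file path (server-side COPY
to or from a file needs superuser or the pg_write_server_files and
pg_read_server_files roles; psql's \copy is the client-side alternative):

```sql
-- Export the projection from the existing heap table.
-- '/tmp/lineitem_projection.csv' is a hypothetical path, pick your own.
COPY (SELECT l_shipdate, l_quantity, l_extendedprice, l_discount, l_tax,
             l_returnflag::"char", l_linestatus::"char"
      FROM lineitem)
TO '/tmp/lineitem_projection.csv' WITH (FORMAT csv);

-- Load the same file into the zedstore table through the COPY path
-- instead of the single-insert path.
COPY zedstore_lineitem_projection
FROM '/tmp/lineitem_projection.csv' WITH (FORMAT csv);
```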

You can also see how many pages of each type there are with:

select count(*), pg_zs_page_type('zedstore_lineitem_projection', g)
from generate_series(0,
       pg_table_size('zedstore_lineitem_projection') / 8192 - 1) g
group by 2;
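
The 8192 in that query is the default block size. A variant that reads the
block size from the server instead, as a sketch only: it assumes
pg_zs_page_type takes the same arguments as above and changes nothing
else about the query.

```sql
-- Same page-type breakdown, but without hard-coding the block size.
-- current_setting('block_size') reports the server's compiled-in value.
select count(*), pg_zs_page_type('zedstore_lineitem_projection', g)
from generate_series(0,
       pg_table_size('zedstore_lineitem_projection')
         / current_setting('block_size')::int - 1) g
group by 2;
```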

- Heikki
