Re: Zedstore - compressed in-core columnar storage

From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-04-11 04:01:45
Message-ID: fc2d4f99-1ca6-854f-7025-ec3db94f6e04@catalyst.net.nz
Lists: pgsql-hackers

On 9/04/19 12:27 PM, Ashwin Agrawal wrote:

> Heikki and I have been hacking recently for few weeks to implement
> in-core columnar storage for PostgreSQL. Here's the design and initial
> implementation of Zedstore, compressed in-core columnar storage (table
> access method). Attaching the patch and link to github branch [1] to
> follow along.
>
>

Very nice. I realize that it is very early days, but after applying this
patch I've managed to stumble over some compression bugs while doing some
COPYs:

benchz=# COPY dim1 FROM '/data0/dump/dim1.dat'
USING DELIMITERS ',';
psql: ERROR:  compression failed. what now?
CONTEXT:  COPY dim1, line 458

The log has:

2019-04-11 15:48:43.976 NZST [2006] ERROR:  XX000: compression failed. what now?
2019-04-11 15:48:43.976 NZST [2006] CONTEXT:  COPY dim1, line 458
2019-04-11 15:48:43.976 NZST [2006] LOCATION: zs_compress_finish, zedstore_compression.c:287
2019-04-11 15:48:43.976 NZST [2006] STATEMENT:  COPY dim1 FROM '/data0/dump/dim1.dat'
    USING DELIMITERS ',';

The dataset is generated from an old DW benchmark I wrote
(https://sourceforge.net/projects/benchw/). The rows concerned look like:

457,457th interesting measure,1th measure type,aqwycdevcmybxcnpwqgrdsmfelaxfpbhfxghamfezdiwfvneltvqlivstwralshsppcpchvdkdbraoxnkvexdbpyzgamajfp
458,458th interesting measure,2th measure type,bjgdsciehjvkxvxjqbhtdwtcftpfewxfhfkzjsdrdabbvymlctghsblxucezydghjrgsjjjnmmqhncvpwbwodhnzmtakxhsg
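
For anyone without the benchmark to hand, rows of roughly the same shape
could be produced with something like the sketch below. This is not the
benchw generator itself: the function names, the modulus used for the
"measure type" column, and the random filler are assumptions chosen only
to resemble the two sample rows above (whose last field is 96 lowercase
letters), and it is not known whether such data reproduces the error.

```python
import random
import string


def make_row(row_id, n_types=4, text_len=96, rng=random):
    """Build one comma-separated row resembling the dim1 sample rows.

    n_types is a guess (457 % 4 == 1 and 458 % 4 == 2 happen to match the
    sample); text_len matches the length of the sample filler strings.
    """
    filler = "".join(rng.choice(string.ascii_lowercase) for _ in range(text_len))
    return (
        f"{row_id},{row_id}th interesting measure,"
        f"{row_id % n_types}th measure type,{filler}"
    )


def write_rows(path, n_rows, seed=0):
    """Write n_rows rows (ids 0..n_rows-1) to path, one per line."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    with open(path, "w") as f:
        for row_id in range(n_rows):
            f.write(make_row(row_id, rng=rng) + "\n")
```

Calling write_rows("/tmp/dim1_like.dat", 1000) (a hypothetical path) gives
a file that can be loaded with the same COPY ... USING DELIMITERS ','
command as above.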

I'll see if changing to LZ4 makes any difference.

best wishes

Mark
