Re: Zedstore - compressed in-core columnar storage

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Andreas Karlsson <andreas(at)proxel(dot)se>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-04-14 17:12:33
Message-ID: 20190414171233.vhukbvlfwnacc37u@development
Lists: pgsql-hackers

On Sun, Apr 14, 2019 at 09:45:10AM -0700, Andres Freund wrote:
>Hi,
>
>On 2019-04-14 18:36:18 +0200, Tomas Vondra wrote:
>> I think those comparisons are cute and we did a fair amount of them when
>> considering a drop-in replacement for pglz, but ultimately it might be a
>> bit pointless because:
>>
>> (a) it very much depends on the dataset (one algorithm may work great on
>> one type of data, suck on another)
>>
>> (b) different systems may require different trade-offs (high ingestion
>> rate vs. best compression ratio)
>>
>> (c) decompression speed may be much more important
>>
>> What I'm trying to say is that we shouldn't obsess about picking one
>> particular algorithm too much, because it's entirely pointless. Instead,
>> we should probably design the system to support different compression
>> algorithms, ideally at column level.
>
>I think we still need to pick a default algorithm, and realistically
>that's going to be used by like 95% of the users.
>

True. Do you expect the default to be specific to the column store, or
should it be a per-instance default (applying to the regular heap too)?

FWIW I think the conclusion from past dev meetings was that we're unlikely
to find anything better than lz4. I doubt that has changed much.
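
To make the "default plus per-column override" idea concrete, here is a
rough sketch in Python. It is not PostgreSQL code: all names (CODECS,
INSTANCE_DEFAULT, resolve_codec, compress_column, decompress_column) are
hypothetical, and zlib/lzma from the Python standard library merely stand in
for pglz/lz4, which are not available there.

```python
import lzma
import zlib
from typing import Callable, Dict, NamedTuple, Optional, Tuple


class Codec(NamedTuple):
    compress: Callable[[bytes], bytes]
    decompress: Callable[[bytes], bytes]


# Hypothetical registry of available algorithms (stand-ins for pglz/lz4).
CODECS: Dict[str, Codec] = {
    "zlib": Codec(zlib.compress, zlib.decompress),
    "lzma": Codec(lzma.compress, lzma.decompress),
}

# Hypothetical instance-wide default, used when a column sets nothing.
INSTANCE_DEFAULT = "zlib"


def resolve_codec(column_setting: Optional[str],
                  instance_default: str = INSTANCE_DEFAULT) -> str:
    """Pick the per-column algorithm if set, else the instance default."""
    name = column_setting or instance_default
    if name not in CODECS:
        raise ValueError(f"unknown compression algorithm: {name}")
    return name


def compress_column(data: bytes,
                    column_setting: Optional[str] = None) -> Tuple[str, bytes]:
    """Compress one column's data and return (algorithm name, payload).

    The name is kept next to the payload so that changing the default
    later does not break reading of already-stored data.
    """
    name = resolve_codec(column_setting)
    return name, CODECS[name].compress(data)


def decompress_column(name: str, payload: bytes) -> bytes:
    """Decompress using the algorithm recorded at compression time."""
    return CODECS[name].decompress(payload)
```

With this shape, most users would only ever hit the default, and choosing a
different algorithm for one column amounts to passing column_setting="lzma"
for that column.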

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
