From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Ashwin Agrawal <aagrawal(at)pivotal(dot)io> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Zedstore - compressed in-core columnar storage |
Date: | 2019-04-15 20:17:09 |
Message-ID: | 20190415201709.iuekkfen4df54pbg@development |
Lists: | pgsql-hackers |
On Mon, Apr 15, 2019 at 11:57:49AM -0700, Ashwin Agrawal wrote:
> On Mon, Apr 15, 2019 at 11:18 AM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> Maybe. I'm not going to pretend I fully understand the internals. Does
> that mean the container contains ZSUncompressedBtreeItem as elements? Or
> just the plain Datum values?
>
> First, your reading of code and all the comments/questions so far have
> been highly encouraging. Thanks a lot for the same.
;-)
> The container contains ZSUncompressedBtreeItem as elements, because each
> item has to store metadata like size, undo and similar info. We don't wish
> to restrict compression to items from the same insertion session. Hence,
> yes, it doesn't just store Datum values. We wish to treat these as tuple
> level operations, keep metadata per tuple, and be able to work at tuple
> level granularity rather than block level.
OK, thanks for the clarification, that somewhat explains my confusion.
So if I understand it correctly, ZSCompressedBtreeItem is essentially a
sequence of ZSUncompressedBtreeItem(s) stored one after another, along
with some additional top-level metadata.
> Definitely many more tricks can be, and need to be, applied to optimize
> the storage format; for fixed width columns, for example, there is no need
> to store the size in every item. "Keep it simple" is the theme we have been
> trying to maintain. Compression ideally should compress duplicate data
> pretty easily and efficiently as well, but we will try to optimize as much
> as we can without relying on it.
I think there's plenty of room for improvement. The main problem I see
is that it mixes different types of data, which is bad for compression
and vectorized execution. I think we'll end up with a very different
representation of the container, essentially decomposing the items into
arrays of values of the same type - array of TIDs, array of undo
pointers, buffer of serialized values, etc.
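
To make the idea a bit more concrete, here is a rough sketch of what such a
decomposed container might look like. None of this is taken from the
Zedstore patch: SketchItem, SketchContainer and sketch_decompose are
hypothetical names, the item is a simplified stand-in rather than the real
ZSUncompressedBtreeItem, and the field choices (64-bit TID and undo pointer,
32-bit size) are assumptions made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified per-tuple item (NOT the real Zedstore struct). */
typedef struct SketchItem
{
    uint64_t    tid;        /* logical tuple identifier */
    uint64_t    undo_ptr;   /* pointer to the undo record */
    uint32_t    size;       /* payload length in bytes */
    const char *payload;    /* serialized datum; must be non-NULL */
} SketchItem;

/*
 * Hypothetical decomposed container: one array per kind of data, so that
 * values of the same type sit next to each other. All arrays are supplied
 * by the caller; offsets needs room for nitems + 1 entries.
 */
typedef struct SketchContainer
{
    size_t      nitems;
    uint64_t   *tids;       /* array of TIDs */
    uint64_t   *undo_ptrs;  /* array of undo pointers */
    uint32_t   *offsets;    /* value i is values[offsets[i] .. offsets[i+1]) */
    char       *values;     /* buffer of serialized values */
} SketchContainer;

/*
 * Split an array of items into the per-type arrays of the container.
 * Returns 0 on success, or -1 if the values buffer (values_cap bytes)
 * is too small. On failure the container contents are unspecified.
 */
static int
sketch_decompose(const SketchItem *items, size_t nitems,
                 SketchContainer *out, uint32_t values_cap)
{
    uint32_t    off = 0;

    for (size_t i = 0; i < nitems; i++)
    {
        /* off never exceeds values_cap, so the subtraction cannot wrap */
        if (items[i].size > values_cap - off)
            return -1;

        out->tids[i] = items[i].tid;
        out->undo_ptrs[i] = items[i].undo_ptr;
        out->offsets[i] = off;
        memcpy(out->values + off, items[i].payload, items[i].size);
        off += items[i].size;
    }

    out->offsets[nitems] = off; /* sentinel: end of the last value */
    out->nitems = nitems;
    return 0;
}
```

With a layout along these lines each array could be compressed separately
(the TID array would probably do well with delta encoding, for example),
and for fixed width columns the offsets array could be dropped altogether.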
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services