Re: Compressed TOAST Slicing

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Compressed TOAST Slicing
Date: 2019-02-20 18:50:25
Message-ID: CA+TgmoauKwZxwnDx=7erkBE3S3F5EaFz2m3J6TroG0Z=jcscmw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 20, 2019 at 1:45 PM Paul Ramsey <pramsey(at)cleverelephant(dot)ca> wrote:
> What this does not support: any function that probably wants less-than-everything, but doesn’t know how big a slice to look for. Stephen thinks I should put an iterator on decompression, which would be an interesting piece of work. Having looked at the json code a little, doing partial searches would require a lot of re-work that is above my pay grade, but if there was an iterator in place, at least that next stop would then be open.
>
> Note that adding an iterator isn’t adding two ways to do the same thing, since the iterator would slot nicely underneath the existing slicing API, and just iterate to the requested slice size. So this is easily just “another step” along the train line to providing streaming access to compressed and TOASTed data.

Yeah. Plus, I'm not sure the iterator thing is even the right design
for the JSONB case. It might be better to think, for that case, about
whether there's some way to operate directly on the compressed data.
If you could somehow jigger the format and the chunking so that you
could jump directly to the right chunk and decompress from there,
rather than having to walk over all of the earlier chunks to figure
out where the data you want is, you could probably obtain a large
performance benefit. But figuring out how to design such a scheme
seems pretty far afield from the topic at hand.
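
To make the "jump directly to the right chunk" idea concrete, here is a
minimal sketch, not a proposal for the actual on-disk format. It assumes
each fixed-size chunk is compressed independently, and it uses Python's
zlib as a stand-in for pglz. The names CHUNK, compress_chunks and
read_slice are hypothetical.

```python
import zlib

CHUNK = 2048  # hypothetical uncompressed chunk size, chosen for illustration


def compress_chunks(data: bytes, chunk_size: int = CHUNK) -> list:
    """Compress each fixed-size chunk on its own so it can be decoded alone."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def read_slice(chunks: list, offset: int, length: int,
               chunk_size: int = CHUNK) -> bytes:
    """Return data[offset:offset+length], decompressing only the chunks it spans."""
    if length <= 0:
        return b""
    first = offset // chunk_size                 # chunk holding the first byte
    last = (offset + length - 1) // chunk_size   # chunk holding the last byte
    raw = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size          # position inside the first chunk
    return raw[start:start + length]
```

The trade-off is that independently compressed chunks cannot share
history with one another, so the overall compression ratio will generally
be worse than compressing the whole value as one stream.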

I'd actually be inclined not to add an iterator until we have a real
user for it, for exactly the reason that we don't know that it is the
right thing.  But there is certainly value in decompressing partially,
to a known byte position, as your patch does, no matter what we decide
to do about that stuff.
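
As a rough illustration of decompressing only up to a known byte
position, here is a minimal sketch. It is not the patch's code: it uses
Python's zlib streaming API as a stand-in for pglz, and the helper name
decompress_prefix is hypothetical.

```python
import zlib


def decompress_prefix(compressed: bytes, nbytes: int) -> bytes:
    """Return at most the first nbytes of the decompressed stream.

    Decompression stops once enough output has been produced, so the
    tail of the compressed input is never decoded.
    """
    if nbytes <= 0:
        return b""
    d = zlib.decompressobj()
    out = bytearray()
    data = compressed
    while len(out) < nbytes and data:
        # max_length caps the output of this call; any input that was
        # not consumed is left in d.unconsumed_tail.
        out += d.decompress(data, nbytes - len(out))
        if d.eof:
            break
        data = d.unconsumed_tail
    return bytes(out)
```

A caller that only needs a leading slice, for example the first few
bytes of a large value, pays only for the prefix it asks for:

```python
payload = b"header:" + b"x" * 100000
compressed = zlib.compress(payload)
prefix = decompress_prefix(compressed, 7)
```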

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
