Re: Optimize partial TOAST decompression

From: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
To: Binguo Bao <djydewang(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize partial TOAST decompression
Date: 2019-07-02 14:46:13
Message-ID: CACowWR1U3guqBzqPL1rHWxYBpU-yUQzr+s07MPT9qBg8LGb+uA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 1, 2019 at 6:46 AM Binguo Bao <djydewang(at)gmail(dot)com> wrote:
> Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote on Sat, Jun 29, 2019 at 9:48 PM:
>> I've taken a look at the code.
>> I think we should extract a function for computing max_compressed_size and put it alongside the pglz code, so that anyone who later changes pglz does not forget about this assumption on the compression algorithm.
>>
>> Also I suggest just using 64-bit computation to avoid overflows. And I think it is worth checking whether max_compressed_size covers the whole data, and using the min of (max_compressed_size, uncompressed_data_size).
>>
>> Also, you declared needsize and max_compressed_size too far from their use. But this will be solved by the function extraction anyway.
>>
> Thanks for the suggestion.
> I've extracted the computation of max_compressed_size into a function and put it in pg_lzcompress.c.

This looks good to me. A little commentary around why
pglz_maximum_compressed_size() returns a universally correct answer
(there's no way the compressed size can ever be larger than this
because...) would be nice for peasants like myself.
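
To make the kind of commentary I mean concrete, here is a rough sketch
of the argument as I understand it. It is my own illustration, not the
function from the patch: the name sketch_max_compressed_size and the
clamp against the total compressed size are assumptions. The idea is
that pglz spends one control bit per item, so the worst case for the
first rawsize bytes is "all literals", which costs 9 bits per raw byte.
A match tag (2 or 3 bytes) that straddles the end of that prefix can
need a couple of extra bytes, hence the small slack.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the patch's function): upper bound on how many
 * compressed bytes must be fetched to decompress the first rawsize bytes.
 */
int32_t
sketch_max_compressed_size(int32_t rawsize, int32_t total_compressed_size)
{
    int64_t bound;

    assert(rawsize >= 0 && total_compressed_size >= 0);

    /*
     * All-literal worst case: 8 data bits plus 1 control bit per raw byte,
     * rounded up to whole bytes.  64-bit math avoids overflow for large
     * rawsize.
     */
    bound = ((int64_t) rawsize * 9 + 7) / 8;

    /*
     * Slack for a 2- or 3-byte match tag that begins inside the bound
     * computed above but ends after it.
     */
    bound += 2;

    /* Never ask for more than the datum actually holds (assumed clamp). */
    if (bound > total_compressed_size)
        bound = total_compressed_size;

    return (int32_t) bound;
}
```

With a comment along these lines next to the real function, a reader
can check the bound by hand: 8 raw bytes need at most 9 bytes as
literals, plus the slack.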

If you're looking to continue down this code line in your next patch,
the next TODO item is a little more involved: a user-land (à la
PG_DETOAST_DATUM) iterator API for accessing TOAST datums would allow
optimized searching of large objects, such as JSONB values, where the
thing you are looking for is not at a known location in the object:
for example, looking for a particular substring in a string, or for a
particular key in a JSONB. "Iterate until you find the thing" would
allow optimization of some code paths that currently require full
decompression of the objects.
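
A toy sketch of the "iterate until you find the thing" shape, purely
for illustration. Nothing here is an existing PostgreSQL API:
SketchIter, sketch_iter_next and sketch_find are hypothetical names,
and a plain memcpy stands in for the incremental detoast/decompress
step. The point is only that the search stops pulling chunks as soon
as the needle is found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical iterator; memcpy stands in for incremental decompression. */
typedef struct SketchIter
{
    const char *src;            /* stand-in for the stored datum */
    size_t      srclen;         /* total length of the stand-in datum */
    char       *buf;            /* caller-provided buffer, >= srclen bytes */
    size_t      filled;         /* bytes materialized so far */
    size_t      chunk;          /* bytes produced per step, must be > 0 */
} SketchIter;

/* Materialize the next chunk; returns 0 once the datum is exhausted. */
int
sketch_iter_next(SketchIter *it)
{
    size_t      n = it->srclen - it->filled;

    assert(it->chunk > 0);
    if (n == 0)
        return 0;
    if (n > it->chunk)
        n = it->chunk;
    memcpy(it->buf + it->filled, it->src + it->filled, n);
    it->filled += n;
    return 1;
}

/*
 * Return the offset of the first occurrence of needle, or -1 if absent.
 * Chunks are pulled only until the first hit.
 */
long
sketch_find(SketchIter *it, const char *needle, size_t nlen)
{
    size_t      scanned = 0;    /* start offsets already ruled out */

    if (nlen == 0)
        return 0;

    while (sketch_iter_next(it))
    {
        /* Test every start offset whose full window is now materialized. */
        while (scanned + nlen <= it->filled)
        {
            if (memcmp(it->buf + scanned, needle, nlen) == 0)
                return (long) scanned;
            scanned++;
        }
    }
    return -1;
}
```

Because scanned persists across chunks, a needle that spans a chunk
boundary is still found once the second chunk arrives, and it->filled
shows how much of the object had to be materialized.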

P.
