Re: Optimize partial TOAST decompression

From: Binguo Bao <djydewang(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Optimize partial TOAST decompression
Date: 2019-06-24 02:53:49
Message-ID: CAL-OGktyTE=Svgv8YTXou4UxjK-zFLtxU6Wy3RndgCesGexwkA@mail.gmail.com
Lists: pgsql-hackers

> This is not correct: L bytes of compressed data cannot always be
> decoded into at least L bytes of data. At worst we have one control byte
> per 8 literal bytes. This means we need at most (L*9 + 8) / 8
> bytes with the current pglz format.

Good catch! I've corrected the related code in the patch.
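
For reference, the bound quoted above can be sketched as below. This is only
an illustration of the arithmetic, not the code in the attached patch; the
function name and the clamp against the stored size are my own assumptions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the patch): upper bound on the number
 * of compressed bytes that must be fetched to decode the first rawsize bytes,
 * assuming the worst case of one control byte per 8 literal bytes.
 */
static int32_t
max_compressed_prefix_size(int32_t rawsize, int32_t total_compressed_size)
{
    int64_t bound;

    assert(rawsize >= 0 && total_compressed_size >= 0);

    /* 64-bit arithmetic so that rawsize * 9 cannot overflow */
    bound = ((int64_t) rawsize * 9 + 8) / 8;

    /* never ask for more bytes than are actually stored */
    if (bound > total_compressed_size)
        bound = total_compressed_size;

    return (int32_t) bound;
}
```

For example, a request for the first 8 bytes needs at most 10 compressed
bytes, and the result never exceeds the stored compressed size.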

> Also, I'm not sure you use SET_VARSIZE_COMPRESSED correctly...
I followed the code in the toast_fetch_datum function [1], and I didn't see
anything wrong with it.

Best regards, Binguo Bao

[1]
https://github.com/postgres/postgres/blob/master/src/backend/access/heap/tuptoaster.c#L1898

On Sun, Jun 23, 2019 at 5:23 PM, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:

> Hi, Binguo!
>
> > On 2 June 2019, at 19:48, Binguo Bao <djydewang(at)gmail(dot)com> wrote:
> >
> > Hi, hackers!
> ....
> > This seems to be a 10x improvement. With more TOAST data chunks per
> > value, I believe the patch would help even more; in this case there are
> > about 200 related TOAST data chunks for each entry.
>
> It's really cool that you could produce a meaningful patch long before the
> end of GSoC!
>
> I'll describe a little of what is going on:
> 1. We have a compressed value, which resides in a TOAST table.
> 2. We want only some fraction of this value: a prefix of length L.
> 3. Previously, Paul Ramsey submitted a patch that omits decompression of
> the value beyond the desired L bytes.
> 4. Binguo's patch tries to avoid fetching compressed data that the
> decompressor will not need. In fact, it fetches L bytes from the TOAST table.
>
> This is not correct: L bytes of compressed data cannot always be
> decoded into at least L bytes of data. At worst we have one control byte
> per 8 literal bytes. This means we need at most (L*9 + 8) / 8
> bytes with the current pglz format.
>
> Also, I'm not sure you use SET_VARSIZE_COMPRESSED correctly...
>
> Best regards, Andrey Borodin.

Attachment Content-Type Size
0001-Optimize-partial-TOAST-decompression-2.patch text/x-patch 3.0 KB
