Re: pglz performance

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Vladimir Leskov <vladimirlesk(at)yandex-team(dot)ru>
Subject: Re: pglz performance
Date: 2019-09-25 21:24:18
Message-ID: 704b2f0a-fef3-cee0-bd55-b5704074f4c3@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 05/08/2019 09:26, Andres Freund wrote:
> Hi,
>
> On 2019-08-05 00:19:28 +0200, Petr Jelinek wrote:
>> It carries that information inside the compressed value, like I said in the
>> other reply, that's why the extra byte.
>
> I'm not convinced that that is a good plan - imo the reference to the
> compressed data should carry that information.
>
> I.e. in the case of toast, at least toast pointers should hold enough
> information to determine the compression algorithm. And in the case of
> WAL, the WAL record should contain that.
>
Point taken.

>
> For external datums I suggest encoding the compression method as a
> distinct VARTAG_ONDISK_COMPRESSED, and then have that include the
> compression method as a field.

So the new reads/writes will use this and reads of old format won't
change? Sounds fine.

>
> For in-line compressed values (so VARATT_IS_4B_C), doing something
> roughly like you did, indicating the type of metadata following using
> the high bit sounds reasonable. But I think I'd make it so that if the
> highbit is set, the struct is instead entirely different, keeping a full
> 4 byte byte length, and including the compression type header inside the
> struct. Perhaps I'd combine the compression type with the high-bit-set
> part? So when the high bit is set, it'd be something like
>
> {
> int32 vl_len_; /* varlena header (do not touch directly!) */
>
> /*
> * Actually only 7 bit, the high bit determines whether this
> * is the old compression header (if unset), or this type of header
> * (if set).
> */
> uint8 type;
>
> /*
> * Stored as uint8[4], to avoid unnecessary alignment padding.
> */
> uint8[4] length;
>
> char va_data[FLEXIBLE_ARRAY_MEMBER];
> }
>

Won't this break BW compatibility on big-endian (if I understand
corretly what you are trying to achieve here)?

> I think it's worth spending some effort trying to get this right - we'll
> be bound by design choices for a while.
>

Sure, however I am not in business of redesigning TOAST from scratch
right now, even if I do agree that the current header is far from ideal.

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Yuli Khodorkovskiy 2019-09-25 21:40:35 Re: add a MAC check for TRUNCATE
Previous Message Petr Jelinek 2019-09-25 21:14:18 Re: DROP SUBSCRIPTION with no slot