Re: PostgreSQL in-transit compression for a client connection

From: Dominique Devienne <ddevienne(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Tushar Takate <tushar11(dot)takate(at)gmail(dot)com>, daniel(at)yesql(dot)se, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: PostgreSQL in-transit compression for a client connection
Date: 2023-04-28 11:33:55
Message-ID: CAFCRh-9ihsf1WQNJGc0UMQcBW7SnjCeCZN+mkm7SScZOVV2Tpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Apr 28, 2023 at 9:03 AM Magnus Hagander <magnus(at)hagander(dot)net> wrote:

> On Thu, Apr 27, 2023 at 11:55 AM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
> wrote:
> >
> > On Thu, 2023-04-27 at 11:44 +0200, Dominique Devienne wrote:
> > > as someone who must store ZLIB (from ZIP files)
> > > and sometimes LZ4 compressed `bytea` values, I often find it's a shame
> that I have
> > > to decompress them, send them over the wire uncompressed, to have the
> PostgreSQL
> > > backend recompress them when TOAST'ed. That's a waste of CPU and IO
> bandwidth...
> >
> > That's not what you were looking for, but why not store the compressed
> data
> > in the database (after SET STORAGE EXTERNAL on the column) and uncompress
> > them after you have received them on the client side?
>

Laurenz is right of course. But then like Magnus is saying, I lose
transparent decompression,
on read. But also for server-side processing. Some of those compressed
values are actually
XML, and sometimes it can be useful to process (usualyl extract subset) of
those server-side.

Unless there are ways to uncompress values explicitly in SQL or PG/PLSQL?

> That assumes you only have one client. You may want to use the
> transparent compression/decompression from some clients and not for
> others.
>

Which brings up something I forgot to mention earlier, where I
concrentrated from the write-side,
which is that clients would also ideally need a way to fetch the values
still-compressed, when
explicitly requesting it, while others implicitly get the transparent
decompression.

BTW, such a mechanism would open the door for libpq doing that itself
transparently too, I guess.
That would allow network-transport of the compressed values, and
client-side decompression.
Of course, when getting the whole value only, not when getting a subset.
And possibly opt-in only.

> I think it'd be a useful feature to have, but it's not something that
> we have today or that I'm aware of being on anybodys radar. So most
> likely, for now you're stuck with either what you're doing today, or
> as Laurenz suggests handle it completely in the application. You can't
> do the mix.
>

It's a tricky feature, because the client and server have to cooperate and
agree on the exact
format used for compressed values, including meta-data (uncompressed size,
and checksum,
and compression algo+setting and checksum type; e.g. CRC32 vs XXHASH).
That's why I
suspect this won't happen anytime soon or ever. It's just a pie-in-the-sky
brainstorming exercise.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Amit Kapila 2023-04-28 11:41:03 Re: Support logical replication of DDLs
Previous Message Magnus Hagander 2023-04-28 07:02:57 Re: PostgreSQL in-transit compression for a client connection