Re: extensible external toast tuple support & snappy prototype

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: extensible external toast tuple support & snappy prototype
Date: 2013-06-07 14:04:15
Message-ID: CA+TgmoYNtMOtfefH_5fEhA94GzrHD_3ZyhQEpVLX+0RHRgNd4Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 5, 2013 at 11:01 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-05-31 23:42:51 -0400, Robert Haas wrote:
>> > This should allow for fairly easy development of a new compression
>> > scheme for out-of-line toast tuples. It will *not* work for compressed
>> > inline tuples (i.e. VARATT_4B_C). I am not convinced that that is a
>> > problem or that if it is, that it cannot be solved separately.
>
>> Seems pretty sensible to me. The patch is obviously WIP but the
>> direction seems fine to me.
>
> So, I played a bit more with this, with an eye towards getting this into
> a non-WIP state, but: While I still think the method for providing
> indirect external Datum support is fine, I don't think my sketch for
> providing extensible compression is.

I didn't really care about doing (and don't really want to do) both
things in the same patch. I just didn't want the patch to shut the
door to extensible compression in the future.

> Important questions are:
> 1) Which algorithms do we want? I think snappy is a good candidate but I
> mostly chose it because it's BSD licensed, widely used, and has a C
> implementation with a usable API. LZ4 might be another interesting
> choice. Another slower algorithm with higher compression ratio
> would also be a good idea for many applications.

I have no opinion on this.

> 2) Do we want to build infrastructure for more than 3 compression
> algorithms? We could delay that decision till we add the 3rd.

I think we should leave the door open, but I don't have a compelling
desire to get too baroque for v1. Still, maybe if the first byte has
a 1 in the high bit, the next 7 bits should be defined as specifying a
compression algorithm. 3 compression algorithms would probably last
us a while; but 127 should last us, in effect, forever.
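
To make that bit layout concrete, here is a minimal sketch in C. All of
the names (TOAST_COMPRESS_EXT_FLAG, toast_header_algo, and so on) are
hypothetical and exist nowhere in the tree; the sketch only assumes that
the high bit of the first byte flags the new format and that the low 7
bits carry the algorithm ID. How the existing format is told apart, and
whether any ID value is reserved, is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants, for illustration only. */
#define TOAST_COMPRESS_EXT_FLAG   0x80  /* high bit: algorithm ID follows */
#define TOAST_COMPRESS_ALGO_MASK  0x7F  /* low 7 bits: algorithm ID */

/* True if the first byte says "the next 7 bits name an algorithm". */
static inline bool
toast_header_is_extended(uint8_t first_byte)
{
    return (first_byte & TOAST_COMPRESS_EXT_FLAG) != 0;
}

/* Extract the 7-bit algorithm ID; only meaningful if the flag is set. */
static inline uint8_t
toast_header_algo(uint8_t first_byte)
{
    return first_byte & TOAST_COMPRESS_ALGO_MASK;
}

/* Build a first byte for the given algorithm ID (must fit in 7 bits). */
static inline uint8_t
toast_header_make(uint8_t algo)
{
    assert(algo <= TOAST_COMPRESS_ALGO_MASK);
    return (uint8_t) (TOAST_COMPRESS_EXT_FLAG | algo);
}
```

A reader would check toast_header_is_extended() first and fall back to
the existing decompression path when it returns false.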

> 3) Surely choosing the compression algorithm via GUC a la SET
> toast_compression_algo = ... isn't the way to go. I'd say a storage
> attribute is more appropriate?

The way we do caching right now supposes that attoptions will be
needed only occasionally. It might need to be revised if we're going
to need it all the time. Or else we might need to use a dedicated
pg_class column.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
