Re: WIP patch: reducing overhead for repeat de-TOASTing

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch: reducing overhead for repeat de-TOASTing
Date: 2008-08-21 20:39:16
Message-ID: 200808212039.m7LKdGo12488@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Added to TODO:

Eliminate de-TOASTing of values when not needed

---------------------------------------------------------------------------

Tom Lane wrote:
> Attached is a worked-out patch for the approach proposed here:
> http://archives.postgresql.org/pgsql-hackers/2008-06/msg00777.php
> namely, that cache management for de-TOASTed datums is handled
> by TupleTableSlots.
>
> To avoid premature detoasting of values that we might never need, the
> patch introduces a concept of an "indirect TOAST pointer", which has
> the same 0x80 or 0x01 header as an external TOAST pointer, but can
> be told apart by having a different length byte. Within that we have
> * pointer to original toasted field within the Slot's tuple
> * pointer to the owning Slot
> * pointer to decompressed copy, or NULL if not decompressed yet
> Some fairly straightforward extensions to the TupleTableSlot code,
> heaptuple.c, and tuptoaster.c make it all go.
>
> My original thoughts had included turning FREE_IF_COPY() into a no-op,
> but on investigation that seems impractical. One case that still
> depends on that pfree is where we have palloc'd a 4-byte-header copy
> of a short-header datum to support code that needs properly aligned
> datum content. The solution adopted in the patch is to arrange for
> pfree() applied to a cacheable detoasted object to be a no-op, whereas
> it still works normally for non-cached detoasted objects. We do this
> by inserting a dummy chunk header that points to a dummy memory context
> whose pfree support method does nothing. I think this part of the patch
> would be required for any toast caching method, not just this one.
>
> What I like about this patch is that it's a fairly small-footprint
> change, it doesn't add much overhead, and it covers caching of
> decompression for in-line-compressed datums as well as the out-of-line
> case.
>
> One thing I really *don't* like about it is that it requires everyplace
> that copies Datums to know about indirect pointers: in general, the copy
> must be a copy of the original toasted Datum, not of the indirect
> pointer, else we have indirect pointers that can outlive their owning
> TupleTableSlot (or at least outlive its current tuple cycle). There
> only seem to be about half a dozen such places in the current code,
> but still it seems a rather fragile coding rule.
>
> After playing with it for a little bit, I'm not convinced that it buys
> enough performance win to be worth applying --- the restriction of cache
> lifespan to one tuple cycle of a TupleTableSlot is awfully restrictive.
> (For example, sorts that involve toasted sort keys continue to suck,
> because the tuples being sorted aren't in Slots.) It would probably
> fix the specific case that the PostGIS hackers were complaining of,
> but I think we need something more.
>
> Still, I wanted to get it into the archives because the idea of indirect
> toast pointers might be useful for something else.
>
> regards, tom lane
>

Content-Description: toast-indirect-1.patch.gz

[ Type application/octet-stream treated as attachment, skipping... ]

>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2008-08-21 21:19:17 Re: [HACKERS] [FINALLY] the TODO list has migrated to Wiki
Previous Message D'Arcy J.M. Cain 2008-08-21 20:35:54 Re: Proposal: new border setting in psql