Re: GIST and TOAST

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: andrew(at)supernews(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: GIST and TOAST
Date: 2007-03-06 15:06:54
Message-ID: 45ED838E.5000204@sigaev.ru
Lists: pgsql-hackers


> The problem is that this is the only place in the code where we make wholesale
> assumptions that a datum that comes from a tuple (heap tuple or index tuple)
> isn't toasted. There are other places but they're always flagged with big
> comments explaining *why* the datum can't be toasted and they're minor
> localized instances, not a whole subsystem.
>
> This was one of the assumptions that the packed varlena code depended on: that
> anyone looking at a datum from a tuple would always detoast it even if they
> had formed the tuple themselves and never passed it through the toaster. The
> *only* place this has come up as a problem is in GIST.

I'm afraid we have a misunderstanding here. The flow of operations on an
index tuple in GiST is:
- read the tuple
- get the n-th attribute with the help of index_getattr
- call the user-defined decompress method, which should, at least, detoast the value
- pass the resulting value to the other user-defined methods

Any new value produced by a user-defined GiST method must be compressed by the
user-defined compress method before it is packed into a tuple. The compress
method should not toast the value; that is not its task.

New values are always passed through the compress method before insertion. See
gistinsert:gist.c and gistFormTuple:gistutil.c.

So index_form_tuple should toast the value, but by then the value is already
compressed and lives in memory. Detoasting should be done by the decompress
method, whose result also lives in memory, and only after that can the value be
passed to the other user-defined methods.
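
To make the read and write paths described above concrete, here is a toy model
in plain C. It is only a sketch: every type and function name (ToyDatum,
toy_compress, toy_form_tuple, toy_decompress, toy_consistent) is hypothetical
and none of it is PostgreSQL code. The XOR "toasting" and the length threshold
are assumptions picked for the demo. The point it illustrates is the division
of labour: compress never toasts, the tuple-forming step may toast, decompress
detoasts, and the remaining methods only ever see detoasted in-memory values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model: these types and functions are hypothetical, not PostgreSQL's. */
#define TOY_MAX_LEN 64
#define TOY_TOAST_THRESHOLD 8   /* assumed cutoff, chosen for the demo */

typedef struct {
    bool   toasted;             /* set only by the tuple-forming step */
    size_t len;                 /* payload length in bytes */
    char   data[TOY_MAX_LEN];   /* payload, scrambled while toasted */
} ToyDatum;

/* Stand-in for toasting/detoasting: XOR is its own inverse. */
static void toy_scramble(ToyDatum *d)
{
    for (size_t i = 0; i < d->len; i++)
        d->data[i] ^= 0x5A;
}

/* Opclass compress method: builds the storage form, never toasts. */
static ToyDatum toy_compress(const char *key)
{
    ToyDatum d = { .toasted = false, .len = strlen(key) };
    assert(d.len < TOY_MAX_LEN);
    memcpy(d.data, key, d.len);
    return d;
}

/* Tuple-forming step: the only place that may toast a value. */
static ToyDatum toy_form_tuple(ToyDatum d)
{
    if (d.len > TOY_TOAST_THRESHOLD) {
        toy_scramble(&d);
        d.toasted = true;
    }
    return d;
}

/* Opclass decompress method: at the very least, detoasts the value. */
static ToyDatum toy_decompress(ToyDatum stored)
{
    if (stored.toasted) {
        toy_scramble(&stored);
        stored.toasted = false;
    }
    return stored;
}

/* Any other opclass method: only ever sees detoasted, in-memory keys. */
static bool toy_consistent(const ToyDatum *key, const char *query)
{
    assert(!key->toasted);
    return key->len == strlen(query) &&
           memcmp(key->data, query, key->len) == 0;
}
```

In this model a key written as toy_form_tuple(toy_compress(k)) and read back
through toy_decompress always reaches toy_consistent detoasted, whether or not
the tuple-forming step chose to toast it.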

As I understand it, packing and unpacking of the varlena header is done during
toasting/detoasting. So I don't understand what the problem is here.

What is more, the GiST API doesn't limit the type of the keys passed between
user-defined GiST methods. It only says that a new value should be of the type
on which the opclass was defined, and that the output of the compress method
should be of the type given by the STORAGE option in CREATE OPERATOR CLASS.

> There may be places that assume they won't leak detoasted copies of datums. If
> you could help point those places out they should just need PG_FREE_IF_COPY()

The GiST code works in a separate memory context to prevent memory leaks. See
gistinsert/gistbuildCallback/gistfindnext.
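
As a rough illustration of why a separate context makes leaked detoasted copies
harmless, here is a toy arena in plain C. It is a sketch under stated
assumptions: ToyContext, toy_alloc, toy_reset, toy_leaky_method and
toy_process_tuple are hypothetical names, and the fixed-size arena merely stands
in for PostgreSQL's real MemoryContext machinery. The method allocates a copy
and never frees it; the caller resets the whole context after each tuple.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy arena standing in for a temporary memory context (hypothetical). */
#define TOY_ARENA_SIZE 1024

typedef struct {
    unsigned char buf[TOY_ARENA_SIZE];
    size_t        used;             /* bytes handed out since last reset */
} ToyContext;

/* Bump allocation; returns NULL when the arena is exhausted. */
static void *toy_alloc(ToyContext *cx, size_t n)
{
    n = (n + 7) & ~(size_t)7;       /* round up to a multiple of 8 bytes */
    if (n > TOY_ARENA_SIZE - cx->used)
        return NULL;
    void *p = cx->buf + cx->used;
    cx->used += n;
    return p;
}

/* Drop everything allocated in the context at once. */
static void toy_reset(ToyContext *cx)
{
    cx->used = 0;
}

/* A method that "leaks": it makes a detoasted copy and never frees it. */
static size_t toy_leaky_method(ToyContext *cx, const char *key)
{
    size_t len = strlen(key);
    char *copy = toy_alloc(cx, len + 1);
    assert(copy != NULL);
    memcpy(copy, key, len + 1);
    return len;
}

/* Caller: run the method in the temporary context, then reset it. */
static size_t toy_process_tuple(ToyContext *tmp, const char *key)
{
    size_t result = toy_leaky_method(tmp, key);
    toy_reset(tmp);
    return result;
}
```

Calling toy_process_tuple in a loop never exhausts the arena, although each
call leaves its copy unfreed; calling toy_leaky_method directly in the same
loop would run out of space after a few dozen iterations.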

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
