Re: Rules for accessing tuple data in backend code

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Rules for accessing tuple data in backend code
Date: 2002-01-29 01:24:20
Message-ID: 850.1012267460@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Tuples obtained from heap scans (heap_getnext, etc.) can always be
> dissected with heap_getattr().

Check. Index scans the same.

> Tuples obtained from syscache lookups (SearchSysCache) can always be
> dissected with SysCacheGetAttr().

Check.

> What happens when I try heap_getattr() on a syscache tuple?

Works fine; in fact, SysCacheGetAttr is just a convenience routine that
invokes heap_getattr. The reason it's convenient is that you don't
necessarily have a tuple descriptor handy for the catalog that underlies
a particular syscache. SysCacheGetAttr knows where to find a matching
descriptor.
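
A minimal sketch of that relationship, written as a self-contained toy model rather than real backend code. Every toy_* name, the ToyTuple layout, and the idea of treating an out-of-range attribute as null are assumptions made for illustration; only the shape (a general accessor that needs a descriptor, plus a wrapper that looks the descriptor up by cache id) comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not PostgreSQL's real types. */
typedef struct ToyTupleDesc { int natts; } ToyTupleDesc;
typedef struct ToyTuple { const int *values; const bool *nulls; } ToyTuple;

enum { TOY_CACHE_PROC = 0, TOY_CACHE_TYPE = 1, TOY_NUM_CACHES = 2 };

/* Each toy cache remembers the descriptor of the catalog underneath it. */
static const ToyTupleDesc toy_cache_desc[TOY_NUM_CACHES] = { {3}, {2} };

/* General accessor: the caller must supply a descriptor. attnum is 1-based.
 * Assumption for this sketch: out-of-range attributes read as null. */
static int
toy_getattr(const ToyTuple *tup, int attnum,
            const ToyTupleDesc *desc, bool *isnull)
{
    if (attnum < 1 || attnum > desc->natts || tup->nulls[attnum - 1])
    {
        *isnull = true;
        return 0;
    }
    *isnull = false;
    return tup->values[attnum - 1];
}

/* Convenience wrapper: finds the matching descriptor from the cache id,
 * then defers to the general accessor. */
static int
toy_cache_getattr(int cache_id, const ToyTuple *tup, int attnum, bool *isnull)
{
    return toy_getattr(tup, attnum, &toy_cache_desc[cache_id], isnull);
}
```

Both entry points return the same value for the same tuple; the wrapper only saves the caller from having to locate a descriptor.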

> Tuples obtained from heap scans or syscache lookups may be dissected via
> GETSTRUCT if and only if the attribute and all attributes prior to it are
> fixed-length and non-nullable.

Right. GETSTRUCT per se isn't very interesting; a more helpful way to
phrase the above is that "a C struct definition can be overlaid onto
the contents of a tuple, but it's only useful out to the last
fixed-length, non-null field. We try to arrange the contents of system
catalogs so that that usefulness extends as far as possible."
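
To make the overlay idea concrete, here is a small self-contained sketch. ToyCatalogRow, TOY_GETSTRUCT, and the length-prefixed name field are hypothetical and do not match any real catalog layout; the point is only that the struct declares the fixed-length, non-null prefix, and anything after it has to be located by other means.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical catalog row: only the fixed-length, never-null prefix is
 * declared. The variable-length field that follows is deliberately not a
 * struct member, because its position cannot be described by a C struct
 * in general. */
typedef struct ToyCatalogRow
{
    int32_t id;     /* fixed-length, non-null */
    int16_t nargs;  /* fixed-length, non-null */
} ToyCatalogRow;

/* Stand-in for GETSTRUCT: overlay the struct on the tuple's data area. */
#define TOY_GETSTRUCT(data) ((ToyCatalogRow *) (data))

/* Build tuple data: fixed prefix, then a 4-byte length, then name bytes. */
static void *
toy_form_tuple(int32_t id, int16_t nargs, const char *name)
{
    size_t         namelen = strlen(name);
    uint32_t       len = (uint32_t) namelen;
    ToyCatalogRow  fixed;
    unsigned char *data;

    data = malloc(sizeof(fixed) + sizeof(len) + namelen);
    if (data == NULL)
        return NULL;
    fixed.id = id;
    fixed.nargs = nargs;
    memcpy(data, &fixed, sizeof(fixed));
    memcpy(data + sizeof(fixed), &len, sizeof(len));
    memcpy(data + sizeof(fixed) + sizeof(len), name, namelen);
    return data;
}

/* The variable-length part needs an explicit accessor, not the overlay. */
static uint32_t
toy_name_length(const void *data)
{
    uint32_t len;

    memcpy(&len, (const unsigned char *) data + sizeof(ToyCatalogRow),
           sizeof(len));
    return len;
}
```

Here TOY_GETSTRUCT(t)->id and TOY_GETSTRUCT(t)->nargs are safe to read directly, while the name has to go through an accessor.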

> (Probably there should be cases about explicit index scans here, but I
> haven't done those and they should be rare.)

For these purposes index and heap scans are the same; either one
ultimately gives back a pointer to a tuple sitting in a disk buffer.

> The question I'm particularly struggling with is, when does TOASTing and
> de-TOASTing happen?

It doesn't, at the level of heap_getattr(). For a pass-by-reference
datatype (which includes all toastable types, a fortiori), heap_getattr
simply gives you back a Datum which is a pointer to the relevant place
in the tuple. In general, you are not supposed to do anything with a
Datum except pass it around, unless you know the specific datatype of
the value and know how to operate on it. For toastable datatypes, part
of "knowing how to operate on it" is to know to call pg_detoast_datum()
anytime you are handed a Datum that might possibly point at a toasted
value.

For the most part, datatype-specific operations are localized in
fmgr-callable functions, so it's possible to hide most of the knowledge
about detoasting in PG_GET_FOO macros for the affected datatypes.
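
A toy model of "detoast when you fetch the argument" might look like the following. Nothing here is the real TOAST format or API: ToyVarlena, toy_detoast_datum, TOY_GETARG_VARLENA, and the one-byte run-length "toasted" form are all invented for illustration, and plain malloc/free stands in for the backend's allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-length value. In the toy "toasted" form, data
 * holds a single byte that stands for len repetitions of itself. */
typedef struct ToyVarlena
{
    bool   toasted;
    size_t len;     /* logical (expanded) length */
    char  *data;
} ToyVarlena;

typedef void *ToyDatum;     /* a datum is just a pointer that gets passed around */

/* Returns the input itself when it is not toasted (no copy is made),
 * otherwise a freshly allocated expanded value. */
static ToyVarlena *
toy_detoast_datum(ToyDatum d)
{
    ToyVarlena *in = (ToyVarlena *) d;
    ToyVarlena *out;

    if (!in->toasted)
        return in;
    out = malloc(sizeof(ToyVarlena));
    if (out == NULL)
        return NULL;
    out->data = malloc(in->len ? in->len : 1);
    if (out->data == NULL)
    {
        free(out);
        return NULL;
    }
    memset(out->data, in->data[0], in->len);
    out->toasted = false;
    out->len = in->len;
    return out;
}

/* Fetch macro that hides the detoasting from the function body. */
#define TOY_GETARG_VARLENA(d) toy_detoast_datum(d)

/* A datatype-specific function: it never inspects the raw datum. */
static size_t
toy_count_byte(ToyDatum d, char c)
{
    ToyVarlena *v = TOY_GETARG_VARLENA(d);
    size_t      i, n = 0;

    if (v == NULL)
        return 0;
    for (i = 0; i < v->len; i++)
        if (v->data[i] == c)
            n++;
    if ((void *) v != d)    /* free only what detoasting allocated */
    {
        free(v->data);
        free(v);
    }
    return n;
}
```

The function body sees only expanded values, so the knowledge about the toasted form stays in one place.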

> I've found PG_DETOAST_DATUM and PG_DETOAST_DATUM_COPY. Why would I want a
> copy? (How can detoasting happen without copying?)

PG_DETOAST_DATUM_COPY guarantees to give you a copy, even if the
original wasn't toasted. This allows you to scribble on the input,
in case that happens to be a useful way of forming your result.
Without a forced copy, a routine for a pass-by-ref datatype must
NEVER, EVER scribble on its input ... because very possibly it'd
be scribbling on a valid tuple in a disk buffer, or a valid entry
in the syscache.
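
The copy-then-scribble pattern can be sketched in a few lines of standalone C. ToyText and both toy_* functions are hypothetical, and malloc stands in for the backend allocator; the sketch only shows why a forced copy makes in-place modification safe.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pass-by-reference value with a small fixed buffer. */
typedef struct ToyText
{
    size_t len;
    char   data[16];
} ToyText;

/* Always returns a private copy, whether or not the input needed any
 * expansion. The caller owns the result. */
static ToyText *
toy_detoast_datum_copy(const ToyText *in)
{
    ToyText *copy = malloc(sizeof(ToyText));

    if (copy != NULL)
        memcpy(copy, in, sizeof(ToyText));
    return copy;
}

/* Forms its result by modifying the copy in place. The input, which might
 * live in shared storage, is never written to. */
static ToyText *
toy_upper(const ToyText *in)
{
    ToyText *result = toy_detoast_datum_copy(in);
    size_t   i;

    if (result == NULL)
        return NULL;
    for (i = 0; i < result->len; i++)
        result->data[i] = (char) toupper((unsigned char) result->data[i]);
    return result;
}
```

If toy_upper wrote through its input pointer instead, it would alter the shared original, which is exactly the mistake described above.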

> And if I want a copy, in what memory context does it live?

It's just palloc'd, so it's whatever is CurrentMemoryContext.

> And can I just pfree() the copy if I don't want it any longer?

Yes. In many scenarios you don't have to because CurrentMemoryContext
is short-lived, though. There are a lot of pfree's in the system that
are really just wasted cycles.

regards, tom lane
