Re: Datum values consistency within one query

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Datum values consistency within one query
Date: 2020-04-02 23:30:20
Message-ID: 8196.1585870220@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Paul Ramsey <pramsey(at)cleverelephant(dot)ca> writes:
> Getting the datum value is really fast, so I can have a cache that
> keeps the latest detoasted object around, and update it when the datum
> changes, and store the cache information in the parent context. Like
> so:

Jeez, no, not like that. You're just testing a pointer. Most likely,
if this is happening in a table scan, the pointer is pointing into
some shared buffer. If that buffer gets re-used to hold some other
page, you could receive the identical pointer value but it's pointing
to completely different data. The risk of false pointer match would
be even higher at plan levels above a scan, I think, because it'd
possibly just be pointing into a plan node's output tuple slot.

The case where this would actually be worth doing, probably, is where
you are receiving a toasted-out-of-line datum. In that case you could
legitimately use the toast pointer ID values (va_valueid + va_toastrelid)
as a lookup key for a cache, as long as it had a lifespan of a statement
or less. You'd have to get a bit in bed with the details of toast
pointers, but it's not like those are going anywhere.

It would be interesting to tie that into the "expanded object"
infrastructure, perhaps, especially if the contents of the objects
you're interested in aren't just flat blobs of data.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-04-02 23:44:34 Re: Berserk Autovacuum (let's save next Mandrill)
Previous Message Tom Lane 2020-04-02 23:18:29 Re: [PATCH] Fix for slow GIN index queries when "gin_fuzzy_search_limit" setting is relatively small for large tables