Re: VACUUM FULL versus system catalog cache invalidation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: VACUUM FULL versus system catalog cache invalidation
Date: 2011-08-13 16:18:02
Message-ID: 20970.1313252282@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Yeah. Also, to my mind this is only a fix that will be used in 9.0 and
> 9.1 --- now that it's occurred to me that we could use tuple xmin/xmax
> to invalidate catcaches instead of recording individual TIDs, I'm
> excited about doing that instead for 9.2 and beyond. I believe that
> that could result in a significant reduction in sinval traffic, which
> would be a considerable performance win.

On closer inspection this idea doesn't seem workable. I was imagining
that after a transaction commits, we could find obsoleted catcache
entries by looking for tuples with xmax equal to the transaction's XID.
But a catcache entry made before the transaction had done the update
wouldn't contain the correct xmax, so we'd fail to remove it. The only
apparent way to fix that would be to go out to disk and consult the
current on-disk xmax, which would hardly be any cheaper than just
dropping the cache entry and then reloading it when/if needed.
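
The failure mode above can be sketched with a toy model in C. This is only an illustration, not PostgreSQL code: every name here (ToyDiskTuple, ToyCacheEntry, toy_invalidate_by_cached_xmax, and so on) is hypothetical, and a single xmax field stands in for the whole tuple header. The point is that the cached copy keeps the xmax it was loaded with, so a search keyed on the committing XID misses it unless the search goes back to the on-disk tuple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real PostgreSQL types. */
typedef uint32_t ToyXid;
#define TOY_INVALID_XID ((ToyXid) 0)

typedef struct { ToyXid xmax; } ToyDiskTuple;   /* the "on-disk" tuple */

typedef struct {
    const ToyDiskTuple *source;  /* where the entry was loaded from */
    ToyXid cached_xmax;          /* xmax as of load time */
    bool valid;
} ToyCacheEntry;

/* Copy the tuple's current xmax into a new cache entry. */
ToyCacheEntry toy_cache_load(const ToyDiskTuple *tuple)
{
    assert(tuple != NULL);
    ToyCacheEntry entry = { tuple, tuple->xmax, true };
    return entry;
}

/* An update by transaction xid stamps xmax on the on-disk tuple only. */
void toy_disk_update(ToyDiskTuple *tuple, ToyXid xid)
{
    tuple->xmax = xid;
}

/* The scheme as first imagined: match on the xmax stored in the cache. */
size_t toy_invalidate_by_cached_xmax(ToyCacheEntry *entries, size_t n, ToyXid xid)
{
    size_t removed = 0;
    for (size_t i = 0; i < n; i++) {
        if (entries[i].valid && entries[i].cached_xmax == xid) {
            entries[i].valid = false;
            removed++;
        }
    }
    return removed;
}

/* The only apparent fix: consult the current on-disk xmax for each entry. */
size_t toy_invalidate_by_disk_xmax(ToyCacheEntry *entries, size_t n, ToyXid xid)
{
    size_t removed = 0;
    for (size_t i = 0; i < n; i++) {
        if (entries[i].valid && entries[i].source->xmax == xid) {
            entries[i].valid = false;
            removed++;
        }
    }
    return removed;
}
```

In this model, an entry loaded while xmax is still TOY_INVALID_XID survives toy_invalidate_by_cached_xmax after a later toy_disk_update, and only the variant that rereads the source tuple removes it. That reread is the disk visit that makes the scheme no cheaper than dropping and reloading the entry.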

All is not entirely lost, however: there's still some possible
performance benefit to be gained here, if we go to the scheme of
identifying victim catcache entries by hashvalue only. Currently,
each heap_update in a cached catalog has to issue two sinval messages
(per cache!): one against the old TID and one against the new TID.
We'd be able to reduce that to one message in the common case where the
hashvalue remains the same because the cache key columns didn't change.
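
To make the message counting concrete, here is a minimal sketch in C under stated assumptions. It is not the actual sinval code: ToyTid, ToyTuple, toy_hash (an FNV-1a stand-in for the real catcache hash function) and both counting functions are hypothetical names, and the model covers one cache with a single 32-bit key column.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real PostgreSQL types. */
typedef struct { uint32_t block; uint16_t offset; } ToyTid;
typedef struct { ToyTid tid; uint32_t key; } ToyTuple;

/* Toy hash of the cache key column: FNV-1a over the key's four bytes. */
uint32_t toy_hash(uint32_t key)
{
    uint32_t h = 2166136261u;
    for (int i = 0; i < 4; i++) {
        h ^= (key >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

int toy_tid_equal(ToyTid a, ToyTid b)
{
    return a.block == b.block && a.offset == b.offset;
}

/* TID scheme: one message against the old TID, one against the new TID. */
size_t toy_messages_tid_scheme(const ToyTuple *old_version,
                               const ToyTuple *new_version)
{
    /* An update writes the new version at a different TID. */
    assert(!toy_tid_equal(old_version->tid, new_version->tid));
    return 2;
}

/* Hashvalue scheme: one message per distinct hash value affected. */
size_t toy_messages_hash_scheme(const ToyTuple *old_version,
                                const ToyTuple *new_version)
{
    return toy_hash(old_version->key) == toy_hash(new_version->key) ? 1 : 2;
}
```

With these definitions, an update that leaves the key column alone costs two messages under the TID scheme but one under the hashvalue scheme. An update that changes the key still costs two in either scheme, unless the old and new keys happen to hash to the same value.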

Another thing we could consider doing, if a one-in-2^32 chance of a hash
collision seems too high, is widening the hash values to 64 bits. I'm not
terribly excited about that, because a lot of the caches are on OID
columns for which there'd be zero benefit.

regards, tom lane
