Re: VACUUM FULL versus system catalog cache invalidation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: VACUUM FULL versus system catalog cache invalidation
Date: 2011-08-13 19:27:50
Message-ID: CA+TgmobxW=qfa7=nKXJByGMAZYagztjOdA-ijqaqXN8SwxpE4w@mail.gmail.com
Lists: pgsql-hackers

On Sat, Aug 13, 2011 at 12:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
>> Yeah.  Also, to my mind this is only a fix that will be used in 9.0 and
>> 9.1 --- now that it's occurred to me that we could use tuple xmin/xmax
>> to invalidate catcaches instead of recording individual TIDs, I'm
>> excited about doing that instead for 9.2 and beyond.  I believe that
>> that could result in a significant reduction in sinval traffic, which
>> would be a considerable performance win.
>
> On closer inspection this idea doesn't seem workable.  I was imagining
> that after a transaction commits, we could find obsoleted catcache
> entries by looking for tuples with xmax equal to the transaction's XID.
> But a catcache entry made before the transaction had done the update
> wouldn't contain the correct xmax, so we'd fail to remove it.  The only
> apparent way to fix that would be to go out to disk and consult the
> current on-disk xmax, which would hardly be any cheaper than just
> dropping the cache entry and then reloading it when/if needed.
>
> All is not entirely lost, however: there's still some possible
> performance benefit to be gained here, if we go to the scheme of
> identifying victim catcache entries by hashvalue only.  Currently,
> each heap_update in a cached catalog has to issue two sinval messages
> (per cache!): one against the old TID and one against the new TID.
> We'd be able to reduce that to one message in the common case where the
> hashvalue remains the same because the cache key columns didn't change.

Cool.
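
To make the message-count argument above concrete, here is a minimal sketch in Python. It is not PostgreSQL code: the function names (messages_by_tid, messages_by_hash, key_hash) and the tuple-shaped "messages" are hypothetical, and zlib.crc32 is only a stand-in for the real catcache hash function. It assumes, as described above, that the current scheme sends one message per TID and that a hashvalue-only scheme sends one message per distinct hash value.

```python
import zlib


def key_hash(key):
    """Stand-in 32-bit hash of the cache key columns (hypothetical)."""
    return zlib.crc32(repr(key).encode()) & 0xFFFFFFFF


def messages_by_tid(old_tid, new_tid):
    """TID scheme: one message against the old TID, one against the new."""
    return [("tid", old_tid), ("tid", new_tid)]


def messages_by_hash(old_hash, new_hash):
    """Hashvalue-only scheme: one message per distinct hash value."""
    msgs = [("hash", old_hash)]
    if new_hash != old_hash:
        # Only needed when the update changed a cache key column.
        msgs.append(("hash", new_hash))
    return msgs


# Update that leaves the key columns alone: the tuple moves, the key does not.
same_key = messages_by_hash(key_hash((16384,)), key_hash((16384,)))

# Update that changes a key column: both hash values must be invalidated.
changed_key = messages_by_hash(key_hash(("old_name",)), key_hash(("new_name",)))

by_tid = messages_by_tid((0, 1), (0, 2))
```

In this sketch the unchanged-key update produces a single message where the TID scheme produces two, and the saving applies per cache that indexes the catalog.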

> Another thing we could consider doing, if one-in-2^32 hash collisions
> seems too high, is widening the hash values to 64 bits.  I'm not
> terribly excited about that, because a lot of the caches are on OID
> columns for which there'd be zero benefit.

Yeah, and I can't get excited about the increased memory usage, either.
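
For a rough sense of the trade-off, the following Python sketch estimates both sides. It rests on assumptions not stated in the thread: hash values are treated as uniformly distributed, the entry count is purely illustrative, and the memory figure counts only the wider hash field, ignoring alignment padding. The function names are hypothetical.

```python
def expected_spurious_flushes(entries, hash_bits):
    """Expected number of unrelated entries matched by one invalidation
    message, assuming uniformly distributed hash values."""
    return entries / 2 ** hash_bits


def extra_bytes_for_wider_hash(entries, old_bits=32, new_bits=64):
    """Additional memory for the hash field alone when it is widened."""
    return entries * (new_bits - old_bits) // 8


entries = 10_000  # illustrative cache population, not a measured figure

spurious_32 = expected_spurious_flushes(entries, 32)
spurious_64 = expected_spurious_flushes(entries, 64)
extra_memory = extra_bytes_for_wider_hash(entries)
```

Under these assumptions a collision is already rare with 32 bits, while the wider field costs four more bytes for every cached entry whether or not a collision ever happens.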

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
