Re: Avoiding redundant fetches of btree index metapages

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Avoiding redundant fetches of btree index metapages
Date: 2006-04-26 14:48:04
Message-ID: 1146062884.8202.220.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2006-04-25 at 13:43 -0400, Tom Lane wrote:

> What I'm considering doing to fix that is require any change to a
> btree's metapage data to send out a relcache inval message for the
> index. That will force all backends to flush the stale cache item
> not later than the start of their next transaction, and thereby
> guarantee that they aren't using pointers that are too old to be safe
> against vacuuming. (There are other ways we could handle this, but
> that one seems like the simplest and least invasive.)
>
> Comments? Anyone see any flaws in the reasoning?

Hmmm.... I'm slightly worried that frequently-churning small tables will
be made even slower by this. What do you think?

> * Re-using rd_targblock was OK for a quick hack because we don't use it
> for indexes, but it's too klugy for real use, and it's not quite enough
> info anyway (we really ought to cache the root level as well as root
> block number). I'm planning to add a "void *" pointer to the Relation
> struct that the index AM is allowed to use as it sees fit, with the
> understanding that any pointed-to data lives in rd_indexcxt and the
> pointer will be reset to NULL on any relcache clear. btree would store
> a copy of the BTMetaPageData struct. The other AMs might have some
> other use for this.

So we would be able to cache other items also? I'd also considered
caching the rightmost page, for when the table grows via a monotonically
increasing key. Similar benefits, same problems, so same-ish solution
sounds the right way.

For that case we'd save N block accesses to locate the rightmost leaf
page. If the cache is wrong, we could seek again from the root at a cost
of 1 additional access (or move right until we find it depending upon
how stale we think the cache is, if we can find a heuristic to gauge
that). We would only need to send an invalidation message when VACUUM
removes a page, as well as for any insertion other than the rightmost
page. Maybe do this as an index option, e.g. APPEND (MONOTONIC seems
like a poor choice, even if it would be more correct)?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeremy Drake 2006-04-26 15:37:32 Re: GIN - Generalized Inverted iNdex. Try 2.
Previous Message Teodor Sigaev 2006-04-26 14:23:20 GIN - Generalized Inverted iNdex. Try 2.