Re: Cache Hash Index meta page.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cache Hash Index meta page.
Date: 2016-08-05 13:33:48
Message-ID: CAA4eK1+_XJpq4aTn_-qVEegUMuFfeK2=PxBr5fshiAHaiA33kg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 4, 2016 at 3:36 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>> On Fri, Jul 22, 2016 at 3:02 AM, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> wrote:
>>> I have created a patch to cache the meta page of Hash index in
>>> backend-private memory. This is to save reading the meta page buffer every
>>> time when we want to find the bucket page. In “_hash_first” call, we try to
>>> read meta page buffer twice just to make sure bucket is not split after we
>>> found bucket page. With this patch meta page buffer read is not done, if the
>>> bucket is not split after caching the meta page.
>
> Is this really safe? The metapage caching in btree is all right because
> the algorithm is guaranteed to work even if it starts with a stale idea of
> where the root page is. I do not think the hash code is equally robust
> about stale data in its metapage.
>

I think stale data in metapage could only cause problem if it leads to
a wrong calculation of bucket based on hashkey. I think that
shouldn't happen. It seems to me that the safety comes from the fact
that required fields (lowmask/highmask) to calculate the bucket won't
be changed more than once without splitting the current bucket (which
we are going to scan). Do you see a problem in hashkey to bucket
mapping (_hash_hashkey2bucket), if the lowmask/highmask are changed by
one additional table half or do you have something else in mind?

>
>> What happens on a system which has gone through pg_upgrade?
>
> That being one reason why. It might be okay if we add another hasho_flag
> bit saying that hasho_prevblkno really contains a maxbucket number, and
> then add tests for that bit everyplace that hasho_prevblkno is referenced.
>

Good idea.

- if (retry)
+ if (opaque->hasho_prevblkno <= metap->hashm_maxbucket)

This code seems to be problematic with respect to upgrades, because
hasho_prevblkno will be initialized to 0xFFFFFFFF without the patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Anastasia Lubennikova 2016-08-05 13:38:29 Re: Refactoring of heapam code.
Previous Message Kevin Grittner 2016-08-05 13:30:49 Re: Refactoring of heapam code.