Re: Cache Hash Index meta page.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Cache Hash Index meta page.
Date: 2016-12-05 20:50:02
Message-ID: CA+Tgmoa4b_34D3vp7nCdyHGR+QCKCBUFGZenNsHDfDFwLEb60Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 5, 2016 at 2:58 PM, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
wrote:

> On Thu, Dec 1, 2016 at 8:10 PM, Jesper Pedersen <
> jesper(dot)pedersen(at)redhat(dot)com> wrote:
> >As the concurrent hash index patch was committed in 6d46f4 this patch
> needs a rebase.
>
> Thanks Jesper,
>
> Adding the rebased patch.
>

- bucket = _hash_hashkey2bucket(hashkey,
- metap->hashm_maxbucket,
+ bucket = _hash_hashkey2bucket(hashkey, metap->hashm_maxbucket,
metap->hashm_highmask,
metap->hashm_lowmask);

This hunk appears useless.

+ metap = (HashMetaPage)rel->rd_amcache;

Whitespace.

+ /* Cache the metapage data for next time*/

Whitespace.

+ /* Check if this bucket is split after we have cached the metapage.

Whitespace.

Shouldn't _hash_doinsert() be using the cache, too?

I think it's probably not a good idea to cache the entire metapage. The
only things that you are "really" trying to cache, I think, are
hashm_maxbucket, hashm_lowmask, and hashm_highmask. The entire
HashPageMetaData structure is 696 bytes on my machine, and it doesn't
really make sense to copy the whole thing into memory if you only need 16
bytes of it. It could even be dangerous -- if somebody tries to rely on
the cache for some other bit of data and we're not really guaranteeing that
it's fresh enough for that.

I'd suggest defining a new structure HashMetaDataCache with members
hmc_maxbucket, hmc_lowmask, and hmc_highmask. The structure can have a
comment explaining that we only care about having the data be fresh enough
to test whether the bucket mapping we computed for a tuple is still
correct, and that for that to be the case we only need to know whether a
bucket has suffered a new split since we last refreshed the cache.

The comments in this patch need some work, e.g.:

-
+ oopaque->hasho_prevblkno = maxbucket;

No comment?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2016-12-05 20:51:02 pgsql: Add support for restrictive RLS policies
Previous Message David Fetter 2016-12-05 20:14:02 Re: Separate connection handling from backends