Re: Cache Hash Index meta page.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Cache Hash Index meta page.
Date: 2017-01-24 09:40:24
Message-ID: CAA4eK1K5+uipx0B_s-9CdU-W5AybNP9yoqwR43osuyMaiN4GRQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 18, 2017 at 11:51 AM, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> wrote:
> On Tue, Jan 17, 2017 at 10:07 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
>> (b) Another somewhat bigger problem is that with this new change it
>> won't retain the pin on meta page till the end which means we might
>> need to perform an I/O again during operation to fetch the meta page.
>> AFAICS, you have just changed it so that you can call new API
>> _hash_getcachedmetap, if that's true, then I think you have to find
>> some other way of doing it. BTW, why can't you design your new API
>> such that it always take pinned metapage? You can always release the
>> pin in the caller if required. I understand that you don't always
>> need a metapage in that API, but I think the current usage of that API
>> is also not that good.
>
>
> -- Yes, what you say is right. I wanted to make the
> _hash_getcachedmetap API self-sufficient. But any two consecutive
> reads of the metapage are connected, since we keep the page pinned in
> a buffer to avoid I/O. I have now redesigned the API so that the
> caller passes a pinned metapage buffer when we want to set the
> metapage cache; this gives us the flexibility to use the cached
> metapage data in different places.
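To make the shape of the redesigned API concrete, here is a minimal, self-contained sketch of the caller-passes-a-pinned-buffer design described above. All types (Relation, Buffer, HashMetaPageData) and the buffer_to_metapage() helper are hypothetical stand-ins for the real PostgreSQL structures, not the actual buffer manager:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the PostgreSQL types */
typedef int Buffer;
#define InvalidBuffer 0
#define BufferIsInvalid(buf) ((buf) == InvalidBuffer)

typedef struct HashMetaPageData { unsigned maxbucket; } HashMetaPageData;
typedef HashMetaPageData *HashMetaPage;
typedef struct Relation { void *rd_amcache; } Relation;

/* Pretend a single on-disk metapage exists, reachable via any buffer */
static HashMetaPageData fake_page = { 7 };

static HashMetaPage buffer_to_metapage(Buffer buf)
{
    (void) buf;                 /* stand-in for BufferGetPage() etc. */
    return &fake_page;
}

/*
 * Sketch of the _v12-style API under discussion: passing InvalidBuffer
 * just returns whatever is already cached (possibly NULL); passing a
 * pinned metapage buffer (re)fills the relation-level cache from it.
 */
HashMetaPage
_hash_getcachedmetap(Relation *rel, Buffer metabuf)
{
    if (BufferIsInvalid(metabuf))
        return (HashMetaPage) rel->rd_amcache;

    if (rel->rd_amcache == NULL)
        rel->rd_amcache = malloc(sizeof(HashMetaPageData));
    memcpy(rel->rd_amcache, buffer_to_metapage(metabuf),
           sizeof(HashMetaPageData));
    return (HashMetaPage) rel->rd_amcache;
}
```

The point of this shape is that the cache-fill path and the cache-read path share one entry point, at the cost of making every caller decide whether it must first obtain a pinned buffer.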
>

1.
@@ -505,26 +505,22 @@ hashbulkdelete(IndexVacuumInfo *info,
IndexBulkDeleteResult *stats,
..
..

+ metabuf = _hash_getbuf(rel, HASH_METAPAGE, HASH_NOLOCK, LH_META_PAGE);
+ cachedmetap = _hash_getcachedmetap(rel, metabuf);

In the above flow, do we really need an updated metapage; can't we use
the cached one? We already take care of bucket splits further down in
that function.

2.
+HashMetaPage
+_hash_getcachedmetap(Relation rel, Buffer metabuf)
+{
..
..
+ if (BufferIsInvalid(metabuf))
+ return (HashMetaPage) rel->rd_amcache;
..

+_hash_getbucketbuf_from_hashkey(Relation rel, uint32 hashkey, int access,
+ HashMetaPage *cachedmetap)
{
..

+ if (!(metap = _hash_getcachedmetap(rel, InvalidBuffer)))
+ {
+ metabuf = _hash_getbuf(rel, HASH_METAPAGE, HASH_NOLOCK, LH_META_PAGE);
+ metap = _hash_getcachedmetap(rel, metabuf);
+ Assert(metap != NULL);
+ }
..
}

The above two chunks of code look worse compared to your previous
patch. I think what we can do is keep the patch ready with both
versions of the _hash_getcachedmetap API (as you have in _v11 and
_v12) and let the committer take the final call.
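For comparison, a minimal sketch of the _v11-style alternative, where the API is self-sufficient and performs the metapage read itself when the cache is cold, so callers like _hash_getbucketbuf_from_hashkey need no buffer bookkeeping. The types and the read_metapage() helper are hypothetical stand-ins for the real PostgreSQL structures and _hash_getbuf():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the PostgreSQL types */
typedef struct HashMetaPageData { unsigned maxbucket; } HashMetaPageData;
typedef HashMetaPageData *HashMetaPage;
typedef struct Relation { void *rd_amcache; } Relation;

static HashMetaPageData on_disk_metapage = { 7 };
static int metapage_reads = 0;      /* counts simulated metapage I/O */

/* Stand-in for reading the metapage via _hash_getbuf() */
static HashMetaPage
read_metapage(void)
{
    metapage_reads++;
    return &on_disk_metapage;
}

/*
 * Self-sufficient variant: fills the relation-level cache on first use
 * and returns the cached copy thereafter; callers never see a buffer.
 */
HashMetaPage
_hash_getcachedmetap(Relation *rel)
{
    if (rel->rd_amcache == NULL)
    {
        rel->rd_amcache = malloc(sizeof(HashMetaPageData));
        memcpy(rel->rd_amcache, read_metapage(),
               sizeof(HashMetaPageData));
    }
    return (HashMetaPage) rel->rd_amcache;
}
```

With this shape, repeated calls cost no extra I/O and the caller code shown in chunk 2 collapses to a single call, which is the readability advantage being weighed above.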

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
