Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2021-03-30 11:42:14
Message-ID: CAApHDvrz4f+i1wu-8hyqJ=pxYDroGA5Okgo5rWPOj47RZ6QTmQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 29 Mar 2021 at 15:56, Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
> For show_resultcache_info()
>
> + if (rcstate->shared_info != NULL)
> + {
>
> The negated condition can be used with a return. This way, the loop can be unindented.

OK. I change that.

> + * ResultCache nodes are intended to sit above a parameterized node in the
> + * plan tree in order to cache results from them.
>
> Since the parameterized node is singular, it would be nice if 'them' can be expanded to refer to the source of result cache.

I've done a bit of rewording in that paragraph.

> + rcstate->mem_used -= freed_mem;
>
> Should there be assertion that after the subtraction, mem_used stays non-negative ?

I'm not sure. I ended up adding one and also adjusting the #ifdef in
remove_cache_entry() which had some code to validate the memory
accounting so that it compiles when USE_ASSERT_CHECKING is defined.
I'm unsure if that's a bit too expensive to enable during debugs but I
didn't really want to leave the code in there unless it's going to get
some exercise on the buildfarm.

> + if (found && entry->complete)
> + {
> + node->stats.cache_hits += 1; /* stats update */
>
> Once inside the if block, we would return.

OK change.

> + else
> + {
> The else block can be unindented (dropping else keyword).

changed.

> + * return 1 row. XXX is this worth the check?
> + */
> + if (unlikely(entry->complete))
>
> Since the check is on a flag (with minimal overhead), it seems the check can be kept, with the question removed.

I changed the comment, but I did leave a mention that I'm still not
sure if it should be an Assert() or an elog.

The attached patch is an updated version of the Result Cache patch
containing the changes for the things you highlighted plus a few other
things.

I pushed the change to simplehash.h and the estimate_num_groups()
change earlier, so only 1 patch remaining.

Also, I noticed the CFBof found another unstable parallel regression
test. This was due to some code in show_resultcache_info() which
skipped parallel workers that appeared to not help out. It looks like
on my machine the worker never got a chance to do anything, but on one
of the CFbot's machines, it did. I ended up changing the EXPLAIN
output so that it shows the cache statistics regardless of if the
worker helped or not.

David

Attachment Content-Type Size
v19-0001-Add-Result-Cache-executor-node.patch text/plain 154.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2021-03-30 12:30:15 Outdated comment for CreateStmt.inhRelations
Previous Message Etsuro Fujita 2021-03-30 11:40:35 Re: Asynchronous Append on postgres_fdw nodes.