Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: dgrowleyml(at)gmail(dot)com
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2020-12-05 03:51:48
Message-ID: CALNJ-vTi+MNGjPJn8WDcY2DODEBbnxGYA4_CcG==-2JCA3BQFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

There are two blocks with almost identical code (second occurrence in
cache_store_tuple):

+ if (rcstate->mem_used > rcstate->mem_upperlimit)
+ {

It would be nice if the code can be extracted to a method and shared.

node->rc_status = RC_END_OF_SCAN;
return NULL;
}
else

There are several places where the else keyword for else block can be
omitted because the if block ends with return.
This would allow the code in else block to move leftward (for easier
reading).

if (!get_op_hash_functions(hashop, &left_hashfn, &right_hashfn))

I noticed that right_hashfn isn't used. Would this cause some warning from
the compiler (for some compiler the warning would be treated as error) ?
Maybe NULL can be passed as the last parameter. The return value
of get_op_hash_functions would keep the current meaning (find both hash
fn's).

rcstate->mem_lowerlimit = rcstate->mem_upperlimit * 0.98;

Maybe (in subsequent patch) GUC variable can be introduced for tuning the
constant 0.98.

For +paraminfo_get_equal_hashops :

+ else
+ Assert(false);

Add elog would be good for debugging.

Cheers

On Fri, Dec 4, 2020 at 5:09 PM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:

> Hi, David:
> For nodeResultCache.c :
>
> +#define SH_EQUAL(tb, a, b) ResultCacheHash_equal(tb, a, b) == 0
>
> I think it would be safer if the comparison is enclosed in parentheses (in
> case the macro appears in composite condition).
>
> +ResultCacheHash_equal(struct resultcache_hash *tb, const ResultCacheKey
> *key1,
> + const ResultCacheKey *key2)
>
> Since key2 is not used, maybe name it unused_key ?
>
> + /* Make a guess at a good size when we're not given a valid size. */
> + if (size == 0)
> + size = 1024;
>
> Should the default size be logged ?
>
> + /* Update the memory accounting */
> + rcstate->mem_used -= freed_mem;
>
> Maybe add an assertion that mem_used is >= 0 after the decrement (there is
> an assertion in remove_cache_entry however, that assertion is after another
> decrement).
>
> + * 'specialkey', if not NULL, causes the function to return false if the
> entry
> + * entry which the key belongs to is removed from the cache.
>
> duplicate entry (one at the end of first line and one at the beginning of
> second line).
>
> For cache_lookup(), new key is allocated before checking
> whether rcstate->mem_used > rcstate->mem_upperlimit. It seems new entries
> should probably have the same size.
> Can we check whether upper limit is crossed (assuming the addition of new
> entry) before allocating new entry ?
>
> + if (unlikely(!cache_reduce_memory(rcstate, key)))
> + return NULL;
>
> Does the new entry need to be released in the above case?
>
> Cheers
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2020-12-05 03:52:29 Re: Proposed patch for key managment
Previous Message Masahiko Sawada 2020-12-05 03:38:12 Re: Add Information during standby recovery conflicts