Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2020-11-22 15:23:42
Message-ID: CAKU4AWpW=zS-O90ykGUyGYUCjoTeYJVQSe3pOAo8uHGm71e4Hg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 22, 2020 at 9:21 PM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:

> Hi David:
>
> I did a review on the v8, it looks great to me. Here are some tiny
> things noted,
> just FYI.
>
> 1. modified src/include/utils/selfuncs.h
> @@ -70,9 +70,9 @@
> * callers to provide further details about some assumptions which were
> made
> * during the estimation.
> */
> -#define SELFLAG_USED_DEFAULT (1 << 0) /* Estimation fell back on one of
> - * the DEFAULTs as defined above.
> - */
> +#define SELFLAG_USED_DEFAULT (1 << 0) /* Estimation fell back on one
> + * of the DEFAULTs as defined
> + * above. */
>
> Looks nothing has changed.
>
>
> 2. leading spaces is not necessary in comments.
>
> /*
> * ResultCacheTuple Stores an individually cached tuple
> */
> typedef struct ResultCacheTuple
> {
> MinimalTuple mintuple; /* Cached tuple */
> struct ResultCacheTuple *next; /* The next tuple with the same parameter
> * values or NULL if it's the last one */
> } ResultCacheTuple;
>
>
> 3. We define ResultCacheKey as below.
>
> /*
> * ResultCacheKey
> * The hash table key for cached entries plus the LRU list link
> */
> typedef struct ResultCacheKey
> {
> MinimalTuple params;
> dlist_node lru_node; /* Pointer to next/prev key in LRU list */
> } ResultCacheKey;
>
> Since we store it as a MinimalTuple, we need some FETCH_INNER_VAR step for
> each element during the ResultCacheHash_equal call. I am thinking if we
> can
> store a "Datum *" directly to save these steps.
> exec_aggvalues/exec_aggnulls looks
> a good candidate for me, except that the name looks not good. IMO, we can
> rename exec_aggvalues/exec_aggnulls and try to merge
> EEOP_AGGREF/EEOP_WINDOW_FUNC into a more generic step which can be
> reused in this case.
>
> 4. I think the ExecClearTuple in prepare_probe_slot is not a must, since
> the
> data tts_values/tts_flags/tts_nvalid are all reset later, and tts_tid is
> not
> real used in our case. Since both prepare_probe_slot
> and ResultCacheHash_equal are in pretty hot path, we may need to consider
> it.
>
> static inline void
> prepare_probe_slot(ResultCacheState *rcstate, ResultCacheKey *key)
> {
> ...
> ExecClearTuple(pslot);
> ...
> }
>
>
> static void
> tts_virtual_clear(TupleTableSlot *slot)
> {
> if (unlikely(TTS_SHOULDFREE(slot)))
> {
> VirtualTupleTableSlot *vslot = (VirtualTupleTableSlot *) slot;
>
> pfree(vslot->data);
> vslot->data = NULL;
>
> slot->tts_flags &= ~TTS_FLAG_SHOULDFREE;
> }
>
> slot->tts_nvalid = 0;
> slot->tts_flags |= TTS_FLAG_EMPTY;
> ItemPointerSetInvalid(&slot->tts_tid);
> }
>
> --
> Best Regards
> Andy Fan
>

add 2 more comments.

1. I'd suggest adding Assert(false); in RC_END_OF_SCAN case to make the
error clearer.

case RC_END_OF_SCAN:
/*
* We've already returned NULL for this scan, but just in case
* something call us again by mistake.
*/
return NULL;

2. Currently we handle the (!cache_store_tuple(node, outerslot))) case by
set it
to RC_CACHE_BYPASS_MODE. The only reason for the cache_store_tuple
failure is
we can't cache_reduce_memory. I guess if cache_reduce_memory
failed once, it would not succeed later(no more tuples can be stored,
nothing is changed). So I think we can record this state and avoid any
new
cache_reduce_memory call.

/*
* If we failed to create the entry or failed to store the
* tuple in the entry, then go into bypass mode.
*/
if (unlikely(entry == NULL ||
!cache_store_tuple(node, outerslot)))

to

if (unlikely(entry == NULL || node->memory_cant_be_reduced ||
!cache_store_tuple(node, outerslot)))

--
Best Regards
Andy Fan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2020-11-22 15:59:59 Re: More time spending with "delete pending"
Previous Message Andy Fan 2020-11-22 13:21:06 Re: Hybrid Hash/Nested Loop joins and caching results from subplans