Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2020-11-22 13:21:06
Message-ID: CAKU4AWoVL9k=EGUtuxVJ-c6owQXFzyey+sdv_j-2QF24FczQuA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi David:

I did a review on the v8, it looks great to me. Here are some tiny things
noted,
just FYI.

1. modified src/include/utils/selfuncs.h
@@ -70,9 +70,9 @@
* callers to provide further details about some assumptions which were
made
* during the estimation.
*/
-#define SELFLAG_USED_DEFAULT (1 << 0) /* Estimation fell back on one of
- * the DEFAULTs as defined above.
- */
+#define SELFLAG_USED_DEFAULT (1 << 0) /* Estimation fell back on one
+ * of the DEFAULTs as defined
+ * above. */

Looks nothing has changed.

2. leading spaces is not necessary in comments.

/*
* ResultCacheTuple Stores an individually cached tuple
*/
typedef struct ResultCacheTuple
{
MinimalTuple mintuple; /* Cached tuple */
struct ResultCacheTuple *next; /* The next tuple with the same parameter
* values or NULL if it's the last one */
} ResultCacheTuple;

3. We define ResultCacheKey as below.

/*
* ResultCacheKey
* The hash table key for cached entries plus the LRU list link
*/
typedef struct ResultCacheKey
{
MinimalTuple params;
dlist_node lru_node; /* Pointer to next/prev key in LRU list */
} ResultCacheKey;

Since we store it as a MinimalTuple, we need some FETCH_INNER_VAR step for
each element during the ResultCacheHash_equal call. I am thinking if we can
store a "Datum *" directly to save these steps.
exec_aggvalues/exec_aggnulls looks
a good candidate for me, except that the name looks not good. IMO, we can
rename exec_aggvalues/exec_aggnulls and try to merge
EEOP_AGGREF/EEOP_WINDOW_FUNC into a more generic step which can be
reused in this case.

4. I think the ExecClearTuple in prepare_probe_slot is not a must, since
the
data tts_values/tts_flags/tts_nvalid are all reset later, and tts_tid is not
real used in our case. Since both prepare_probe_slot
and ResultCacheHash_equal are in pretty hot path, we may need to consider
it.

static inline void
prepare_probe_slot(ResultCacheState *rcstate, ResultCacheKey *key)
{
...
ExecClearTuple(pslot);
...
}

static void
tts_virtual_clear(TupleTableSlot *slot)
{
if (unlikely(TTS_SHOULDFREE(slot)))
{
VirtualTupleTableSlot *vslot = (VirtualTupleTableSlot *) slot;

pfree(vslot->data);
vslot->data = NULL;

slot->tts_flags &= ~TTS_FLAG_SHOULDFREE;
}

slot->tts_nvalid = 0;
slot->tts_flags |= TTS_FLAG_EMPTY;
ItemPointerSetInvalid(&slot->tts_tid);
}

--
Best Regards
Andy Fan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2020-11-22 15:23:42 Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Previous Message Phil Florent 2020-11-22 12:50:55 Parallel plans and "union all" subquery