Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2020-08-19 22:58:11
Message-ID: 20200819225811.GA24985@alvherre.pgsql
Lists: pgsql-hackers

On 2020-Aug-19, David Rowley wrote:

> Andres' suggestion:
>
> regression=# explain (analyze, costs off, timing off, summary off)
> select count(*) from tenk1 t1 inner join tenk1 t2 on
> t1.twenty=t2.unique1;
>                                       QUERY PLAN
> ---------------------------------------------------------------------------------------
>  Aggregate (actual rows=1 loops=1)
>    ->  Nested Loop (actual rows=10000 loops=1)
>          Cache Key: t1.twenty  Hits: 9980  Misses: 20  Evictions: 0  Overflows: 0
>          ->  Seq Scan on tenk1 t1 (actual rows=10000 loops=1)
>          ->  Index Scan using tenk1_unique1 on tenk1 t2 (actual rows=1 loops=20)
>                Index Cond: (unique1 = t1.twenty)
> (6 rows)
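
For readers following the counters in the quoted plan, here is a rough sketch of where Hits: 9980 and Misses: 20 come from. It is not the patch's code. It assumes the outer scan yields 10000 rows whose "twenty" column takes 20 distinct values (consistent with loops=20 on the inner index scan), and that the cache is unbounded, so nothing is evicted. The names simulate_cache and index_scan are made up for illustration.

```python
def simulate_cache(outer_keys, lookup):
    """Run `lookup` only on the first occurrence of each key; count hits and misses."""
    cache = {}
    hits = 0
    misses = 0
    for key in outer_keys:
        if key in cache:
            hits += 1          # inner side is not rescanned
        else:
            misses += 1        # inner side is scanned once and the result is kept
            cache[key] = lookup(key)
    return hits, misses


# Assumption: 10000 outer rows, join key cycling through 20 distinct values.
outer_keys = [i % 20 for i in range(10000)]

inner_scans = []  # records each time the hypothetical inner index scan runs


def index_scan(key):
    # Stand-in for "Index Scan using tenk1_unique1": one matching row per key.
    inner_scans.append(key)
    return [key]


hits, misses = simulate_cache(outer_keys, index_scan)
print(hits, misses, len(inner_scans))  # 9980 20 20
```

Each distinct key misses once and triggers one inner scan; every later occurrence of that key is a hit.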

I think it doesn't look terrible in the SubPlan case -- it kinda makes
sense there -- but for nested loop it appears really strange.

On the performance aspect, I wonder what the overhead is, particularly
considering Tom's point of making these nodes more expensive for cases
with no caching. And also, as the JIT saga continues, aren't we going
to get plan trees recompiled too, at which point it won't matter much?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
