Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2020-09-02 04:02:54
Message-ID: CAApHDvpDdQDFSM+u19ROinT0qw41OX=MW4-B2mO003v6-X0AjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 29 Aug 2020 at 02:54, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Wed, 26 Aug 2020 at 03:52, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > There'll be a significant reduction in increase in performance.
>
> So I did a very rough-cut change to the patch to have the caching be
> part of Nested Loop. It can be applied on top of the other 3 v7
> patches.
>
> For the performance, the test I did results in the performance
> actually being reduced from having the Result Cache as a separate
> node. The reason for this is mostly because Nested Loop projects.

I spoke to Andres off-list this morning in regards to what can be done
to remove this performance regression over the separate Result Cache
node version of the patch. He mentioned that I could create another
ProjectionInfo for when reading from the cache's slot and use that to
project with.

I've hacked this up in the attached. It looks like another version of
the joinqual would also need to be created to that the MinimalTuple
from the cache is properly deformed. I've not done this yet.

The performance does improve this time. Using the same two test
queries from [1], I get:

v7 (Separate Result Cache node)

Query 1:
postgres=# explain (analyze, timing off) select count(l.a) from
hundredk hk inner join lookup100 l on hk.one = l.a;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=152861.19..152861.20 rows=1 width=8) (actual rows=1 loops=1)
-> Nested Loop (cost=0.45..127891.79 rows=9987763 width=4)
(actual rows=10000000 loops=1)
-> Seq Scan on hundredk hk (cost=0.00..1637.00 rows=100000
width=4) (actual rows=100000 loops=1)
-> Result Cache (cost=0.45..2.53 rows=100 width=4) (actual
rows=100 loops=100000)
Cache Key: hk.one
Hits: 99999 Misses: 1 Evictions: 0 Overflows: 0
-> Index Only Scan using lookup100_a_idx on lookup100
l (cost=0.43..2.52 rows=100 width=4) (actual rows=100 loops=1)
Index Cond: (a = hk.one)
Heap Fetches: 0
Planning Time: 0.045 ms
Execution Time: 894.003 ms
(11 rows)

Query 2:
postgres=# explain (analyze, timing off) select * from hundredk hk
inner join lookup100 l on hk.one = l.a;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.45..127891.79 rows=9987763 width=28) (actual
rows=10000000 loops=1)
-> Seq Scan on hundredk hk (cost=0.00..1637.00 rows=100000
width=24) (actual rows=100000 loops=1)
-> Result Cache (cost=0.45..2.53 rows=100 width=4) (actual
rows=100 loops=100000)
Cache Key: hk.one
Hits: 99999 Misses: 1 Evictions: 0 Overflows: 0
-> Index Only Scan using lookup100_a_idx on lookup100 l
(cost=0.43..2.52 rows=100 width=4) (actual rows=100 loops=1)
Index Cond: (a = hk.one)
Heap Fetches: 0
Planning Time: 0.077 ms
Execution Time: 854.950 ms
(10 rows)

v7 + hacks_V3 (caching done in Nested Loop)

Query 1:
explain (analyze, timing off) select count(l.a) from hundredk hk inner
join lookup100 l on hk.one = l.a;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=378570.41..378570.42 rows=1 width=8) (actual rows=1 loops=1)
-> Nested Loop Cached (cost=0.43..353601.00 rows=9987763 width=4)
(actual rows=10000000 loops=1)
Cache Key: $0
Hits: 99999 Misses: 1 Evictions: 0 Overflows: 0
-> Seq Scan on hundredk hk (cost=0.00..1637.00 rows=100000
width=4) (actual rows=100000 loops=1)
-> Index Only Scan using lookup100_a_idx on lookup100 l
(cost=0.43..2.52 rows=100 width=4) (actual rows=100 loops=1)
Index Cond: (a = hk.one)
Heap Fetches: 0
Planning Time: 0.103 ms
Execution Time: 770.470 ms
(10 rows)

Query 2
explain (analyze, timing off) select * from hundredk hk inner join
lookup100 l on hk.one = l.a;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Nested Loop Cached (cost=0.43..353601.00 rows=9987763 width=28)
(actual rows=10000000 loops=1)
Cache Key: $0
Hits: 99999 Misses: 1 Evictions: 0 Overflows: 0
-> Seq Scan on hundredk hk (cost=0.00..1637.00 rows=100000
width=24) (actual rows=100000 loops=1)
-> Index Only Scan using lookup100_a_idx on lookup100 l
(cost=0.43..2.52 rows=100 width=4) (actual rows=100 loops=1)
Index Cond: (a = hk.one)
Heap Fetches: 0
Planning Time: 0.090 ms
Execution Time: 779.181 ms
(9 rows)

Also, I'd just like to reiterate that the attached is a very rough cut
implementation that I've put together just to use for performance
comparison in order to help move this conversation along. (I do know
that I'm breaking the const qualifier on PlanState's innerops.)

David

[1] https://www.postgresql.org/message-id/CAApHDvqt5U6VcKSm2G9Q1n4rsHejL-VX7QG9KToAQ0HyZymSzQ@mail.gmail.com

Attachment Content-Type Size
resultcache_in_nestloop_hacks_v3.patch.txt text/plain 57.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hao Wu 2020-09-02 04:25:16 Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY
Previous Message Rahila Syed 2020-09-02 03:57:33 Re: More tests with USING INDEX replident and dropped indexes