From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Hybrid Hash/Nested Loop joins and caching results from subplans |
Date: | 2020-11-09 21:38:41 |
Message-ID: | CAApHDvoccsfKtZ1JDnwDvZTFkGFWJcVW2eCbwY6vN+5oTL2xjA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, 9 Nov 2020 at 16:29, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> I think either version is OK for me and I like this patch overall.
That's good to know. Thanks.
> However I believe v9
> should be no worse than v8 all the time, Is there any theory to explain
> your result?
Nothing jumps out at me from looking at profiles. The only thing I
noticed was the tuple deforming is more costly with v9. I'm not sure
why.
The other part of v9 that I don't have a good solution for yet is the
code around the swapping of the projection info for the Nested Loop.
The cache always uses a MinimalTupleSlot, but we may have a
VirtualSlot when we get a cache miss. If we do then we need to
initialise 2 different projection infos so when we project from the
cache that we have the step to deform the minimal tuple. That step is
not required when the inner slot is a virtual slot.
I did some further testing on performance. Basically, I increased the
size of the tests by 2 orders of magnitude. Instead of 100k rows, I
used 10million rows. (See attached
resultcache_master_vs_v8_vs_v9_big.csv)
Loading that in with:
# create table resultcache_bench2 (tbl text, target text, col text,
latency_master numeric(10,3), latency_v8 numeric(10,3), latency_v9
numeric(10,3));
# copy resultcache_bench2 from
'/path/to/resultcache_master_vs_v8_vs_v9_big.csv' with(format csv);
I see that v8 still wins.
postgres=# select round(avg(latency_v8/latency_master)*100,1) as
v8_vs_master, round(avg(latency_v9/latency_master)*100,1) as
v9_vs_master, round(avg(latency_v8/latency_v9)*100,1) as v8_vs_v9 from
resultcache_bench2;
v8_vs_master | v9_vs_master | v8_vs_v9
--------------+--------------+----------
56.7 | 58.8 | 97.3
Execution for all tests for v8 runs in 56.7% of master, but v9 runs in
58.8% of master's time. Full results in
resultcache_master_v8_vs_v9_big.txt. v9 wins in 7 of 24 tests this
time. The best example test for v8 shows that v8 takes 90.6% of the
time of v9, but in the tests where v9 is faster, it only has a 4.3%
lead on v8 (95.7%). You can see that overall v8 is 2.7% faster than v9
for these tests.
David
Attachment | Content-Type | Size |
---|---|---|
resultcache_master_v8_vs_v9_big.txt | text/plain | 3.2 KB |
resultcache_master_vs_v8_vs_v9_big.csv | application/vnd.ms-excel | 1.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2020-11-09 22:43:26 | Re: -Wformat-signedness |
Previous Message | Dave Cramer | 2020-11-09 21:26:59 | Re: Error on failed COMMIT |