Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2020-11-09 03:28:59
Message-ID: CAKU4AWqA7weLsrvBzenfm_2TCEMw1MwRQ=eG4p0y6xGFg7ms9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 9, 2020 at 10:07 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Mon, 9 Nov 2020 at 03:52, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> > then I did a perf on the 2 version, Is it possible that you called
> tts_minimal_clear twice in
> > the v9 version? Both ExecClearTuple and ExecStoreMinimalTuple called
> tts_minimal_clear
> > on the same slot.
> >
> > With the following changes:
>
> Thanks for finding that. After applying that fix I did a fresh set of
> benchmarks on the latest master, latest master + v8 and latest master
> + v9 using the attached script. (resultcachebench2.sh.txt)
>
> I ran this on my zen2 AMD64 machine and formatted the results into the
> attached resultcache_master_vs_v8_vs_v9.csv file
>
> If I load this into PostgreSQL:
>
> # create table resultcache_bench (tbl text, target text, col text,
> latency_master numeric(10,3), latency_v8 numeric(10,3), latency_v9
> numeric(10,3));
> # copy resultcache_bench from
> '/path/to/resultcache_master_vs_v8_vs_v9.csv' with(format csv);
>
> and run:
>
> # select col,tbl,target, sum(latency_v8) v8, sum(latency_v9) v9,
> round(avg(latency_v8/latency_v9)*100,1) as v8_vs_v9 from
> resultcache_bench group by 1,2,3 order by 2,1,3;
>
> I've attached the results of the above query. (resultcache_v8_vs_v9.txt)
>
> Out of the 24 tests done on each branch, only 6 of 24 are better on v9
> compared to v8. So v8 wins on 75% of the tests.

I think either version is OK for me and I like this patch overall. However
I believe v9
should be no worse than v8 all the time, Is there any theory to explain
your result?

v9 never wins using
> the lookup1 table (1 row per lookup). It only runs on 50% of the
> lookup100 queries (100 inner rows per outer row). However, despite the
> draw in won tests for the lookup100 test, v8 takes less time overall,
> as indicated by the following query:
>
> postgres=# select round(avg(latency_v8/latency_v9)*100,1) as v8_vs_v9
> from resultcache_bench where tbl='lookup100';
> v8_vs_v9
> ----------
> 99.3
> (1 row)
>
> Ditching the WHERE clause and simply doing:
>
> postgres=# select round(avg(latency_v8/latency_v9)*100,1) as v8_vs_v9
> from resultcache_bench;
> v8_vs_v9
> ----------
> 96.2
> (1 row)
>
> indicates that v8 is 3.8% faster than v9. Altering that query
> accordingly indicates v8 is 11.5% faster than master and v9 is only 7%
> faster than master.
>
> Of course, scaling up the test will yield both versions being even
> more favourable then master, but the point here is comparing v8 to v9.
>
> David
>

--
Best Regards
Andy Fan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2020-11-09 04:53:19 Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding
Previous Message osumi.takamichi@fujitsu.com 2020-11-09 03:27:55 RE: extension patch of CREATE OR REPLACE TRIGGER