Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2021-02-21 10:11:39
Message-ID: CAKU4AWpFJqwdupZEZZZT6XUGoR3WOj=AFiGaeYT8yf3ys1Pm1g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 16, 2021 at 6:16 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Wed, 3 Feb 2021 at 19:51, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > I've attached a spreadsheet with the results of each of the tests.
> >
> > The attached file v13_costing_hacks.patch.txt is the quick and dirty
> > patch I put together to run test 5.
>
> I've attached an updated set of patches. I'd forgotten to run make
> check-world with the 0005 patch and that was making the CF bot
> complain. I'm not intending 0005 for commit in the state that it's
> in, so I've just dropped it.
>
> I've also done some further performance testing with the attached set
> of patches. This time I focused solely on planner performance using
> the Join Order Benchmark. Some of the queries in this benchmark do
> give the planner quite a bit of exercise. Queries such as 29b take my
> 1-year-old, fairly powerful AMD hardware about 78 ms to make a plan
> for.
>
> The attached spreadsheet shows the details of the results of these
> tests. Skip to the "Test6 no parallel 100 stats EXPLAIN only" sheet.
>
> To get these results I just ran pgbench for 10 seconds on each query
> prefixed with "EXPLAIN ".
>
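(As an aside, that style of run can be reproduced with something like the
snippet below; the query text, file names, and database name are all
hypothetical stand-ins, not the actual benchmark files.)

```shell
# Hypothetical stand-in for one of the JOB queries.
echo "SELECT count(*) FROM title t JOIN movie_info_idx mi ON t.id = mi.movie_id;" > 29b.sql

# Prefix the statement with EXPLAIN so pgbench exercises planning only.
sed 's/^/EXPLAIN /' 29b.sql > 29b_explain.sql
cat 29b_explain.sql

# Then, for each query, run it for 10 seconds (-n skips vacuuming the
# standard pgbench tables, which don't exist here; "imdb" is an assumed
# database name):
#   pgbench -n -T 10 -f 29b_explain.sql imdb
```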
> To summarise here, the planner performance gets a fair bit worse with
> the patched code. With master, summing the average planning time over
> each of the queries resulted in a total planning time of 765.7 ms.
> After patching, that went up to 1097.5 ms. I was pretty disappointed
> about that.
>
> On looking into why the performance gets worse, there are a few factors.
> One factor is that I'm adding a new path to consider and if that path
> sticks around then subsequent joins may consider that path. Changing
> things around so I only ever add the best path, the time went down to
> 1067.4 ms. add_path() does tend to ditch inferior paths anyway, so
> this may not really be a good thing to do. Another thing that I picked
> up on was the code that checks if a Result Cache Path is legal to use:
> it must check whether the inner side of the join contains any volatile
> functions. If I just comment out those checks, then the total planning
> time goes down to 985.6 ms. The estimate_num_groups() call that the
> costing for the ResultCache path must do to estimate the cache hit
> ratio is another factor. When replacing that call with a constant
> value the total planning time goes down to 905.7 ms.
>
> I can see perhaps ways that the volatile function checks could be
> optimised a bit further, but the other stuff really is needed, so it
> appears if we want this, then it seems like the planner is going to
> become slightly slower. That does not exactly fill me with joy. We
> currently have enable_partitionwise_aggregate and
> enable_partitionwise_join which are both disabled by default because
> of the possibility of slowing down the planner. One option could be
> to make enable_resultcache off by default too. I don't much like that
> idea, though, since anyone who leaves the setting off will never get
> any gains from caching the inner side of parameterised nested loop
> results.
>
> The idea I had to speed up the volatile function call checks was along
> similar lines to what parallel query does when it looks for parallel
> unsafe functions in the parse. Right now those checks are only done
> under a few conditions where we think that parallel query might
> actually be used. (See standard_planner()). However, with Result
> Cache, those could be used in many other cases too, so we don't really
> have any means to short circuit those checks. There might be gains to
> be had by checking the parse once rather than having to call
> contains_volatile_functions in the various places we do call it. I
> think both the parallel safety and volatile checks could then be done
> in the same tree traverse. Anyway. I've not done any hacking on this.
> It's just an idea so far.
>
>

> Does anyone have any particular thoughts on the planner slowdown?
>

I used the same JOB test case, testing with 19c.sql, and I got a similar
result to yours (there is a huge difference between master and v14). I
think the reason is that we are trying the result cache path on a very hot
code path (the nested loop inner path), so the cost adds up. I see that
get_resultcache_path has a fast path to avoid create_resultcache_path, but
the check still lets too many cheap paths through. Below is a small
addition to it; on my hardware it reduces the planning time for 19c.sql
from 79 ms to 52 ms.

+ /*
+  * If the inner path is cheap enough, don't bother trying the result
+  * cache path. 20 is just an arbitrary value. This may reduce some
+  * planning time.
+  */
+ if (inner_path->total_cost < 20)
+     return NULL;

> I used imdbpy2sql.py to parse the imdb database files and load the
> data into PostgreSQL. This seemed to work okay apart from the
> movie_info_idx table appeared to be missing. Many of the 113 join
> order benchmark queries need this table.

I followed the steps in [1] and changed a few things with the attached
patch. In the end I got 2367725 rows. But you are probably running into a
different problem, since none of my changes touch the movie_info_idx table.

[1] https://github.com/gregrahn/join-order-benchmark

--
Best Regards
Andy Fan (https://www.aliyun.com/)

Attachment Content-Type Size
0001-fix.patch application/octet-stream 3.6 KB
