Re: Planning time grows exponentially with levels of nested views

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joel Jacobson" <joel(at)compiler(dot)org>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Planning time grows exponentially with levels of nested views
Date: 2021-04-18 20:14:07
Message-ID: 140747.1618776847@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

[ redirecting to -hackers so the cfbot can see it ]

"Joel Jacobson" <joel(at)compiler(dot)org> writes:
> I assumed the cost for each nested VIEW layer would grow linear,
> but my testing shows it appears to grow exponentially:

I think it's impossible to avoid less-than-O(N^2) growth on this sort
of case. For example, the v2 subquery initially has RTEs for v2 itself
plus v1. When we flatten v1 into v2, v2 acquires the RTEs from v1,
namely v1 itself plus foo. Similarly, once vK-1 is pulled up into vK,
there are going to be order-of-K entries in vK's rtable, and that stacking
makes for O(N^2) work overall just in manipulating the rtable.

We can't get rid of these rtable entries altogether, since all of them
represent table privilege checks that the executor will need to do.
It occurs to me though that we don't need the rte->subquery trees anymore
once those are flattened, so maybe we could do something like the
attached. For me, this reduces the slowdown in your example from
O(N^3) to O(N^2).

I'm slightly worried though by this comment earlier in
pull_up_simple_subquery:

/*
* Need a modifiable copy of the subquery to hack on. Even if we didn't
* sometimes choose not to pull up below, we must do this to avoid
* problems if the same subquery is referenced from multiple jointree
* items (which can't happen normally, but might after rule rewriting).
*/

If multiple references are actually possible then this'd break it. There
seem to be no such cases in the regression tests though, and I'm having a
hard time wrapping my brain around what would cause it. "git blame"
traces this text to my own commit f44639e1b, which has the log entry

Don't crash if subquery appears multiple times in jointree. This should
not happen anyway, but let's try not to get completely confused if it does
(due to rewriter bugs or whatever).

so I'm thinking that this was only hypothetical.

It's possible that we could do something similar in the sibling functions
pull_up_simple_union_all etc, but I'm not sure it's worth troubling over.
TBH, for the size of effect you're showing here, I wouldn't be worried
at all; it's only because it seems to be a one-liner to improve it that
I'm interested in doing something.

regards, tom lane

Attachment Content-Type Size
save-cycles-in-repeated-subquery-pullup-1.patch text/x-diff 785 bytes

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2021-04-18 20:41:53 Re: Planning time grows exponentially with levels of nested views
Previous Message Laurenz Albe 2021-04-18 20:10:50 Re: Vulnerability PostgreSQL 11.2

Browse pgsql-hackers by date

  From Date Subject
Next Message Yura Sokolov 2021-04-18 20:29:03 Re: Old Postgresql version on i7-1165g7
Previous Message Robert Haas 2021-04-18 19:18:33 Re: More info on pg_stat_activity Wait Event Name when is DataFileRead