Re: Inheritance planner CPU and memory usage change since 9.3.2

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inheritance planner CPU and memory usage change since 9.3.2
Date: 2015-06-21 04:27:25
Message-ID: CA+TgmoZKO0e+LADPJKbHcoWX6qOfb+3Gvaxuv7n1XMnWzvgBFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jun 20, 2015 at 6:48 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Meanwhile, here is an updated patch.
>
> I don't care for that patch too much: it seems a bit brute-force, and I'm
> quite worried by the assumption that it's okay to destroy each child's
> append_rel_list after processing the child. That would fail if any of
> the Vars/subexpressions in the lists get incorporated into the resulting
> child Plan, which does not seem impossible. (I think in many cases we'd
> do a copyObject() when extracting an append_rel_list expression, but this
> hardly seems guaranteed.)

Well, if such a thing is possible, the regression tests don't catch it
- can we add one that would?

> I propose instead the attached patch, which operates by identifying which
> of the append_rel_list entries actually contain subquery references, and
> copying only those; the other ones are just linked into the child's
> append_rel_list by reference, which is okay because they won't get
> modified.

I thought about that approach, but wasn't sure if I could make it
simple enough to pass muster. Note that I generally erred on the side
of deferring all work as long as possible and to the greatest extent
possible in a way that your patch does not. We don't need to compute
modifiableARIindexes if subqueryRTindexes ends up empty, and we
definitely don't need to generate O(n^2) list cells in that case. I
think that latter point, at least, is quite likely to be worth
optimizing. Granted, spewing out extra ListCells is far less harmful
than doing the same thing with AppendRelInfos and their entire list of
translated_vars, but it's still not good. Can't we move the loop that
copies root.append_rel_list inside if (final_rtable != NIL &&
!bms_is_empty(subqueryRTindexes))? If we don't take that branch,
root.append_rel_list isn't getting modified at all, so a shallow copy
is good enough.

> On my laptop, I get the following timings for your test queries from
> unmodified HEAD (--cassert build):
>
> # Q1: 41260.239 ms
> # Q2: 45225.768 ms
> # Q3: 43066.958 ms
> # Q4: 193360.726 ms
> # Q5: 40746.503 ms
>
> and these with my patch:
>
> # Q1: 1767.753 ms
> # Q2: 3662.131 ms
> # Q3: 814.293 ms
> # Q4: 64468.914 ms
> # Q5: 881.295 ms
>
> which seems to be generally a better result.

Better than unpatched, definitely! Not sure how it compares to my patch.

>> The extraordinarily planning time for query 4 is caused by a
>> completely different problem: SearchCatCache eats up huge amounts of
>> CPU; its callers are get_attavgwidth and get_typlen. It's not clear
>> to me why doubling the number of relations causes such an enormous
>> explosion in calls to those functions; I would expect the number of
>> calls to double, but presumably the actual increase is much more.
>
> Actually, Q4 necessarily involves O(N^2) planning time, because for
> each of N target relations you're considering a join to an N-member
> inheritance tree. A lot of those ultimately get thrown away by
> constraint exclusion, but not before we've expended significant
> cycles on them. I do not think we are going to get much traction
> on that --- even if we do something to knock off whatever the leading
> term is, there'll still be more O(N^2) behavior right behind it.

Hmm, maybe so. On the other hand, if there's a way to significantly
shrink the constant factor on that O(N^2) stuff, it could bring a lot
of people some much-needed relief.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2015-06-21 08:12:07 Re: pgbench - allow backslash-continuations in custom scripts
Previous Message Robert Haas 2015-06-21 03:55:53 Re: pg_stat_*_columns?