Re: Performance regression with PostgreSQL 11 and partitioning

From: Thomas Reiss <thomas(dot)reiss(at)dalibo(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance regression with PostgreSQL 11 and partitioning
Date: 2018-06-18 10:02:44
Message-ID: 47f9b2b8-6b6e-1c5e-c2e6-3f50b1ca735d@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le 18/06/2018 à 10:46, David Rowley a écrit :
> On 12 June 2018 at 01:49, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Fri, Jun 8, 2018 at 3:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>> That being said, I don't mind a bit if you want to look for further
>>>> speedups here, but if you do, keep in mind that a lot of queries won't
>>>> even use partition-wise join, so all of the arrays will be of length
>>>> 1. Even when partition-wise join is used, it is quite likely not to
>>>> be used for every table in the query, in which case it will still be
>>>> of length 1 in some cases. So pessimizing nappinfos = 1 even slightly
>>>> is probably a regression overall.
>>>
>>> TBH, I am way more concerned about the pessimization introduced for
>>> every pre-existing usage of these functions by putting search loops
>>> into them at all. I'd like very much to revert that. If we can
>>> replace those with something along the line of root->index_array[varno]
>>> we'll be better off across the board.
>>
>> I think David's analysis is correct -- this doesn't quite work. We're
>> trying to identify whether a given varno is one of the ones to be
>> translated, and it's hard to come up with a faster solution than
>> iterating over a (very short) array of those values. One thing we
>> could do is have two versions of each function, or else an optimized
>> path for the very common nappinfos=1 case. I'm just not sure it would
>> be worthwhile. Traversing a short array isn't free, but it's pretty
>> cheap.
>
> So this is the current situation as far as I see it:
>
> We could go and add a new version of adjust_appendrel_attrs() and
> adjust_appendrel_attrs_mutator() that accept a Relids for the child
> rels rather than an array of AppendRelInfos. However, that's quite a
> lot of code duplication. We could perhaps cut down on duplication by
> having a callback function stored inside of
> adjust_appendrel_attrs_context which searches for the correct
> AppendRelInfo to use. However, it does not seem to be inline with
> simplifying the code.
>
> We've not yet heard back from Tom with more details about his
> root->index_array[varno] idea. I can't quite see how this is possible
> and for the moment I assume Tom misunderstood that it's the parent
> varno that's known, not the varno of the child.
>
> I've attached a patch which cleans up my earlier version and moves the
> setup of the append_rel_array into its own function instead of
> sneaking code into setup_simple_rel_arrays(). I've also now updated
> the comment above find_childrel_appendrelinfo(), which is now an
> unused function.
>
> I tried the following test case:
>
> CREATE TABLE partbench (date TIMESTAMP NOT NULL, i1 INT NOT NULL, i2
> INT NOT NULL, i3 INT NOT NULL, i4 INT NOT NULL, i5 INT NOT NULL)
> PARTITION BY RANGE (date);
> \o /dev/null
> select 'CREATE TABLE partbench' || x::text || ' PARTITION OF partbench
> FOR VALUES FROM (''' || '2017-03-06'::date + (x::text || '
> hours')::interval || ''') TO (''' || '2017-03-06'::date + ((x+1)::text
> || ' hours')::interval || ''');'
> from generate_Series(0,9999) x;
> \gexec
> \o
>
> SELECT * FROM partbench WHERE date = '2018-04-23 00:00:00';
>
> Patched
>
> Time: 190.989 ms
> Time: 187.313 ms
> Time: 190.239 ms
>
> Unpatched
>
> Time: 775.771 ms
> Time: 781.631 ms
> Time: 762.777 ms
>
> Is this good enough for v11?

It works pretty well with your last patch. IMHO, this issue should be
addressed in v11. I used a pretty unrealistic test-case to show this
regression but it appear with a reasonnable count of partitions, v11 is
slower than v10 even with 10 partitions.

Thanks a lot !

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-06-18 10:29:58 Runtime partition pruning for MergeAppend
Previous Message Amit Khandekar 2018-06-18 09:41:18 Re: server crashed with TRAP: FailedAssertion("!(!parallel_aware || pathnode->path.parallel_safe)"