Re: Performance regression with PostgreSQL 11 and partitioning

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Thomas Reiss <thomas(dot)reiss(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance regression with PostgreSQL 11 and partitioning
Date: 2018-06-06 11:35:18
Message-ID: CAFjFpReM33EhNHrrDFEsOhzxCG68fg5=7DZQA4yf5AHudNj7KQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 6, 2018 at 11:27 AM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>
> I was trying to be realistic for something we can do to fix v11. It's
> probably better to minimise the risky surgery on this code while in
> beta. What I proposed was intended to fix a performance regression new
> in v11. I'm not sure what you're proposing has the same intentions.

Agreed that we want a less risky fix at this stage. What I am worried
is with your implementation there are two ways to get AppendRelInfo
corresponding to a child, append_rel_list and append_rel_array. Right
now we will change all the code to use append_rel_array but there is
no guarantee that some code in future will use append_rel_list. Bugs
would rise if these two get out of sync esp. considering
append_rel_list is a list which can be easily modified. I think we
should avoid that. See what we do to rt_fetch() and
planner_rt_fetch(), but we still have two ways to get an RTE. That's a
source of future bugs.

>
> Probably, if you do want an efficient way to store the children
> belonging to a parent we could just have another array of Bitmapsets
> which would just serve as a set of indexes into the array I've added.
>

I was actually wrong when I proposed that we store AppendRelInfos
indexed by parent_id in the same array. You are right; there can be
multiple children with same parent id. I like your solution of having
an array of bitmapsets, members within which are indexes into
append_rel_array.

> I'd prefer to get a committers thoughts on this before doing any further work.

Yes. I think so too.

>
> I've attached a cleaned up patch from the last one. This just adds
> some sanity checks to make sure we get an ERROR if we do ever see two
> AppendRelInfos with the same child relation id. I've also made it so
> the append_rel_array is only allocated when there are append rels.
>

In find_appinfos_by_relids(), we should Assert that the child_id in
the AppendRelInfo obtained from the array is same as its index. My
guess is that we could actually get rid of child_relid member of
AppendRelInfo altogether if we directly store AppendRelInfos in the
array without creating a list first. But that may be V12 material.

Also in the same function we want to Assert that cnt never exceeds *nappinfo.

+ Assert(root->append_rel_array);
root->append_rel_array != NULL seems to be a preferred style in the code.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2018-06-06 11:51:06 Re: Remove mention in docs that foreign keys on partitioned tables are not supported
Previous Message Etsuro Fujita 2018-06-06 11:30:52 Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.