Re: [PoC] Reducing planning time when tables have many partitions

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Date: 2022-08-09 07:10:00
Message-ID: CAApHDvoWaCEOA2PbsBWoF95QFo1W8-8O-QW99ZYf-3S18ja6pg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 8 Aug 2022 at 23:28, Yuya Watari <watari(dot)yuya(at)gmail(dot)com> wrote:
> If you have already applied David's patch, please start the 'git am'
> command from 0002-Fix-bugs.patch. All regression tests passed with
> this patch on my environment.

Thanks for fixing those scope bugs.

In regards to the 0002 patch, you have;

+ * TODO: "bms_add_members(ec1->ec_member_indexes, ec2->ec_member_indexes)"
+ * did not work to combine two EquivalenceClasses. This is probably because
+ * the order of the EquivalenceMembers is different from the previous
+ * implementation, which added the ec2's EquivalenceMembers to the end of
+ * the list.

as far as I can see, the reason the code I that wrote caused the
following regression test failure;

- Index Cond: ((ff = '42'::bigint) AND (ff = '42'::bigint))
+ Index Cond: (ff = '42'::bigint)

was down to how generate_base_implied_equalities_const() marks the EC
as ec_broken = true without any regard to cleaning up the work it's
partially already complete.

Because the loop inside generate_base_implied_equalities_const() just
breaks as soon as we're unable to find a valid equality operator for
the two given types, with my version, since the EquivalenceMember's
order has effectively changed, we just discover the EC is broken
before we call process_implied_equality() ->
distribute_restrictinfo_to_rels(). In the code you've added, the
EquivalenceMembers are effectively still in the original order and the
process_implied_equality() -> distribute_restrictinfo_to_rels() gets
done before we discover the broken EC. The same qual is just added
again during generate_base_implied_equalities_broken(), which is why
the plan has a duplicate ff=42.

This is all just down to the order that the ECs are merged. If you'd
just swapped the order of the items in the query's WHERE clause to
become:

where ec1.ff = 42::int8 and ss1.x = ec1.f1 and ec1.ff = ec1.f1;

then my version would keep the duplicate qual. For what you've changed
the code to, the planner would not have produced the duplicate ff=42
qual if you'd written the WHERE clause as follows:

where ss1.x = ec1.f1 and ec1.ff = ec1.f1 and ec1.ff = 42::int8;

In short, I think the code I had for that was fine and it's just the
expected plan that you should be editing. If we wanted to this
behaviour to be consistent then the fix should be to make
generate_base_implied_equalities_const() better at only distributing
the quals down to the relations after it has discovered that the EC is
not broken, or at least cleaning up the partial work that it's done if
it discovers a broken EC. The former seems better to me, but I doubt
that it matters too much as broken ECs should be pretty rare and it
does not seem worth spending too much effort making this work better.

I've not had a chance to look at the 0003 patch yet.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-08-09 07:10:55 Re: [RFC] building postgres with meson
Previous Message John Naylor 2022-08-09 06:21:41 Re: optimize lookups in snapshot [sub]xip arrays