Re: [PoC] Reducing planning time when tables have many partitions

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Zhang Mingli <zmlpostgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Date: 2022-12-12 04:50:09
Message-ID: CAApHDvpELk-FTdwrJx0820i6d7N8LCo-ss5a_UYE0ZOO-HEQLg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for running all the benchmarks on v10.

On Thu, 8 Dec 2022 at 00:31, Yuya Watari <watari(dot)yuya(at)gmail(dot)com> wrote:
> The above results show that the reverts I have made in v9-0002 and
> v9-0003 are very important in avoiding degradation. I think we should
> apply these changes again. It is unclear whether v9 or v10 + v9-0002 +
> v9-0003 is better, but the latter performed better in my experiments.

I was hoping to keep the logic which decides to loop over ec_members
or use the bitmap indexes all in equivclass.c, ideally in the iterator
code.

I've looked at the v9-0002 patch and I'm thinking maybe it's ok since
it always loops over ec_nonchild_indexes. We process the base
relations first, so all the EquivalenceMember in PlannerInfo for these
will be at the start of the eq_members list and the Bitmapset won't
have many bitmapwords to loop over. Additionally, it's only looping
over the nonchild ones, so a large number of partitions existing has
no effect on the number of loops performed.

For v9-0003, I was really hoping to find some kind of workaround so we
didn't need the "if (root->simple_rel_array_size < 32)". The problem
I have with that is; 1) why is 32 a good choice?, and 2)
simple_rel_array_size is just not a great thing to base the decision
off of. For #1, we only need to look at the EquivalenceMembers
belonging to base relations here and simple_rel_array_size includes
all relations, including partitions, so even if there's just a few
members belonging to base rels, we may still opt to use the Bitmapset
method. Additionally, it does look like this patch should be looping
over ec_nonchild_indexes rather than ec_member_indexes and filtering
out the !em->em_is_const && !em->em_is_child EquivalenceMembers.

Since both the changes made in v9-0002 and v9-0003 can just be made to
loop over ec_nonchild_indexes, which isn't going to get big with large
numbers of partitions, then I wonder if we're ok just to do the loop
in all cases rather than conditionally try to do something more
fanciful with counting bits like I had done in
select_outer_pathkeys_for_merge(). I've made v11 work like what
v9-0003 did and I've used v9-0002. I also found a stray remaining
"bms_membership(eclass->ec_member_indexes) != BMS_MULTIPLE" in
eclass_useful_for_merging() that should have been put back to
"list_length(eclass->ec_members) <= 1".

I've still got a couple of things in mind that I'd like to see done to
this patch.

a) I think the iterator code should have some additional sanity checks
that the results of both methods match when building with
USE_ASSERT_CHECKING. I've got some concerns that we might break
something. The logic about what the em_relids is set to for child
members is a little confusing. See add_eq_member().
b) We still need to think about if adding a RelOptInfo to
PlannerInfo->simple_rel_array[0] is a good idea for solving the append
relation issue. Ideally, we'd have a proper varno for these Vars
instead of setting varno=0 per what's being done in
generate_append_tlist().

I've attached the v11 set of patches.

David

Attachment Content-Type Size
v11-0001-Add-Bitmapset-indexes-for-faster-lookup-of-Equiv.patch text/plain 81.6 KB
v11-0002-Adjust-bms_int_members-so-that-it-shortens-the-l.patch text/plain 2.5 KB
v11-0003-Add-iterators-for-looping-over-EquivalenceMember.patch text/plain 22.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zheng Li 2022-12-12 04:58:27 Re: Support logical replication of DDLs
Previous Message Michael Paquier 2022-12-12 04:09:40 Re: Checksum errors in pg_stat_database