Quick Links

Re: [PoC] Reducing planning time when tables have many partitions

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: [PoC] Reducing planning time when tables have many partitions
Date:	2022-07-03 21:28:22
Message-ID:	1384699.1656883702@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Yuya Watari <watari(dot)yuya(at)gmail(dot)com> writes:
> On Thu, Mar 24, 2022 at 11:03 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>> I think a better way to solve this would be just to have a single hash
>> table over all EquivalenceClasses that allows fast lookups of
>> EquivalenceMember->em_expr.

> If the predicate were "em->em_expr == something", the hash table whose
> key is em_expr would be effective. However, the actual predicates are
> not of this type but the following.

> // Find EquivalenceMembers whose relids is equal to the given relids
> (1) bms_equal(em->em_relids, relids)

> // Find EquivalenceMembers whose relids is a subset of the given relids
> (2) bms_is_subset(em->em_relids, relids)

Yeah, that's a really interesting observation, and I agree that
David's suggestion doesn't address it. Maybe after we fix this
problem, matching of em_expr would be the next thing to look at,
but your results say it isn't the first thing.

I'm not real thrilled with trying to throw hashtables at the problem,
though. As David noted, they'd be counterproductive for simple
queries. Sure, we could address that with duplicate code paths,
but that's a messy and hard-to-tune approach. Also, I find the
idea of hashing on all subsets of relids to be outright scary.
"m is not so large in most cases" does not help when m *is* large.

For the bms_equal class of lookups, I wonder if we could get anywhere
by adding an additional List field to every RelOptInfo that chains
all EquivalenceMembers that match that RelOptInfo's relids.
The trick here would be to figure out when to build those lists.
The simple answer would be to do it lazily on-demand, but that
would mean a separate scan of all the EquivalenceMembers for each
RelOptInfo; I wonder if there's a way to do better?

Perhaps the bms_is_subset class could be handled in a similar
way, ie do a one-time pass to make a List of all EquivalenceMembers
that use a RelOptInfo.

regards, tom lane

In response to

Re: [PoC] Reducing planning time when tables have many partitions at 2022-06-22 09:05:43 from Yuya Watari

Responses

Re: [PoC] Reducing planning time when tables have many partitions at 2022-07-05 08:57:14 from Yuya Watari
Re: [PoC] Reducing planning time when tables have many partitions at 2022-07-27 03:31:08 from David Rowley

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andrew Dunstan	2022-07-03 21:28:34	Re: JSON/SQL: jsonpath: incomprehensible error message
Previous Message	Przemysław Sztoch	2022-07-03 20:51:56	Re: [PATCH] Completed unaccent dictionary with many missing characters