Re: [PoC] Reducing planning time when tables have many partitions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Date: 2022-07-03 21:28:22
Message-ID: 1384699.1656883702@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Yuya Watari <watari(dot)yuya(at)gmail(dot)com> writes:
> On Thu, Mar 24, 2022 at 11:03 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>> I think a better way to solve this would be just to have a single hash
>> table over all EquivalenceClasses that allows fast lookups of
>> EquivalenceMember->em_expr.

> If the predicate were "em->em_expr == something", the hash table whose
> key is em_expr would be effective. However, the actual predicates are
> not of this type but the following.

> // Find EquivalenceMembers whose relids is equal to the given relids
> (1) bms_equal(em->em_relids, relids)

> // Find EquivalenceMembers whose relids is a subset of the given relids
> (2) bms_is_subset(em->em_relids, relids)

Yeah, that's a really interesting observation, and I agree that
David's suggestion doesn't address it. Maybe after we fix this
problem, matching of em_expr would be the next thing to look at,
but your results say it isn't the first thing.

I'm not real thrilled with trying to throw hashtables at the problem,
though. As David noted, they'd be counterproductive for simple
queries. Sure, we could address that with duplicate code paths,
but that's a messy and hard-to-tune approach. Also, I find the
idea of hashing on all subsets of relids to be outright scary.
"m is not so large in most cases" does not help when m *is* large.

For the bms_equal class of lookups, I wonder if we could get anywhere
by adding an additional List field to every RelOptInfo that chains
all EquivalenceMembers that match that RelOptInfo's relids.
The trick here would be to figure out when to build those lists.
The simple answer would be to do it lazily on-demand, but that
would mean a separate scan of all the EquivalenceMembers for each
RelOptInfo; I wonder if there's a way to do better?

Perhaps the bms_is_subset class could be handled in a similar
way, ie do a one-time pass to make a List of all EquivalenceMembers
that use a RelOptInfo.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2022-07-03 21:28:34 Re: JSON/SQL: jsonpath: incomprehensible error message
Previous Message Przemysław Sztoch 2022-07-03 20:51:56 Re: [PATCH] Completed unaccent dictionary with many missing characters