Re: [PoC] Reducing planning time when tables have many partitions

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Date: 2022-08-26 03:18:30
Message-ID: CAApHDvqMut8wjN8vgHAot6DoHZ1FcGjtuRGaeCrGVvfWVqcmdQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 26 Aug 2022 at 12:40, Yuya Watari <watari(dot)yuya(at)gmail(dot)com> wrote:
> This performance degradation is due to the heavy processing of the
> get_ec***_indexes***() functions. These functions are the core part of
> the optimization we are working on in this thread, but they are
> relatively heavy when the number of partitions is small.
>
> I noticed that these functions were called repeatedly with the same
> arguments. During planning Query B with one partition, the
> get_ec_source_indexes_strict() function was called 2087 times with
> exactly the same parameters. Such repeated calls occurred many times
> in a single query.

How about instead of doing this caching like this, why don't we code
up some iterators that we can loop over to fetch the required EMs.

I'll attempt to type out my thoughts here without actually trying to
see if this works:

typedef struct EquivalenceMemberIterator
{
EquivalenceClass *ec;
Relids relids;
Bitmapset *em_matches;
int position; /* last found index of em_matches or -1 */
bool use_index;
bool with_children;
bool with_norel_members;
} EquivalenceMemberIterator;

We'd then have functions like:

static void
get_ecmember_indexes_iterator(EquivalenceMemberIterator *it,
PlannerInfo *root, EquivalenceClass *ec, Relids relids, bool
with_children, bool with_norel_members)
{
it->ec = ec;
it->relids = relids;
it->position = -1;

it->use_index = (root->simple_rel_array_size > 32); /* or whatever
threshold is best */
it->with_children = with_children;
it->with_norel_members = with_norel_members;

if (it->use_index)
it->em_matches = get_ecmember_indexes(root, ec, relids,
with_children, with_norel_members);
else
it->em_matches = NULL;
}

static EquivalenceMember *
get_next_matching_member(PlannerInfo *root, EquivalenceMemberIterator *it)
{
if (it->use_index)
{
it->position = bms_next_member(it->ec_matches, it->position);
if (it->position >= 0)
return list_nth(root->eq_members, it->position);
return NULL;
}
else
{
int i = it->position;
while ((i = bms_next_member(it->ec->ec_member_indexes, i) >= 0)
{
/* filter out the EMs we don't want here "break" when
we find a match */
}
it->position = i;
if (i >= 0)
return list_nth(root->eq_members, i);
return NULL;
}
}

Then the consuming code will do something like:

EquivalenceMemberIterator iterator;
get_ecmember_indexes_iterator(&iterator, root, ec, relids, true, false);

while ((cur_em = get_next_matching_member(root, &it)) != NULL)
{
// do stuff
}

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-08-26 03:26:25 Re: use SSE2 for is_valid_ascii
Previous Message Nathan Bossart 2022-08-26 03:14:37 Re: [PATCH] Optimize json_lex_string by batching character copying