Re: Performance issue in foreign-key-aware join estimation

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: Performance issue in foreign-key-aware join estimation
Date: 2018-12-23 20:38:54
Message-ID: CAKJS1f8w+1a51L1pmiE_-YSUmV5qwQKXsfueQb7=Nnhft4_F0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Sat, 22 Dec 2018 at 10:53, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> Back in [1], I mentioned that I'd like to start moving away from our
> linked list based implementation of List and start using an array
> based version instead. If we had this we could easily further improve
> this code fkey matching code to not even look near equivalence classes
> that don't contain members for the relations in question with a design
> something like:
>
> 1. Make PlannerInfo->eq_classes an array list instead of an array,
> this will significantly improve the performance of list_nth().
> 2. Have a Bitmapset per relation that indexes which items in
> eq_classes have ec_members for the given relation.
> 3. In match_eclasses_to_foreign_key_col() perform a bms_overlap() on
> the eq_classes index bitmapsets for the relations at either side of
> the foreign key then perform a bms_next_member() type loop over the
> result of that in order to skip over the eq_classes items that can't
> match.

Using the above idea, but rather than going to the trouble of storing
PlannerInfo->eq_classes as an array type list, if we build the array
on the fly inside match_foreign_keys_to_quals(), then build a
Bitmapset type index to mark which of the eclasses contains members
for each relation, then I can get the run-time for the function down
to just 0.89%. Looking at other functions appearing high on the
profile I also see; have_relevant_eclass_joinclause() (14%),
generate_join_implied_equalities_for_ecs() (23%).

I think if we seriously want to improve planning performance when
there are many stored equivalence classes, then we need to have
indexing along the lines of what I've outlined above.

I've attached the patch I used to test this idea. It might be
possible to develop this into something committable, perhaps if we
invent a new function in equivclass.c that builds the index into a
single struct and we pass a pointer to that down to the functions that
require the index. Such a function could also optionally skip
indexing eclasses such as ones with ec_has_volatile or ec_has_const
when the use case for the index requires ignoring such eclasses.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
poc_eclass_indexing_v1.patch application/octet-stream 4.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2018-12-24 03:44:11 Move regression.diffs of pg_upgrade test suite
Previous Message Alexander Korotkov 2018-12-23 20:14:50 Re: Function to track shmem reinit time