Re: Performance issue in foreign-key-aware join estimation

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Performance issue in foreign-key-aware join estimation
Date: 2018-12-25 00:46:51
Message-ID: 3faaa4df-2fa1-9162-2e1b-db48d4334aca@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/24/18 1:07 PM, David Rowley wrote:
> On Mon, 24 Dec 2018 at 09:38, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>> Using the above idea, but rather than going to the trouble of storing
>> PlannerInfo->eq_classes as an array type list, if we build the array
>> on the fly inside match_foreign_keys_to_quals(), then build a
>> Bitmapset type index to mark which of the eclasses contains members
>> for each relation, then I can get the run-time for the function down
>> to just 0.89%. Looking at other functions appearing high on the
>> profile I also see; have_relevant_eclass_joinclause() (14%),
>> generate_join_implied_equalities_for_ecs() (23%).
>
> I've now expanded the proof of concept patch to use this indexing
> technique for have_relevant_eclass_joinclause() and
> generate_join_implied_equalities_for_ecs(). With Tom's test from
> up-thread, I get:
>
> Master:
> latency average = 14125.374 ms
>
> Patched:
> latency average = 2417.164 ms
>

Yes, I can confirm these measurements. On my machine, timing on master
is about 10530ms, with v1 of the patch it drops to 2600ms and v2 pushes
it down to 1610ms.

I however observe failures on 4 regression test suites - inherit,
equivclass, partition_join and partition_prune (diff attached). That's a
bit surprising, because AFAICS the patch merely optimizes the execution
and should not change the planning otherwise (all the failures seem to
be a query plan changing in some way). I'm not sure if the plans are
correct or better than the old ones.

The other thing is that we probably should not use a single test case to
measure the optimization - I doubt it can improve less extreme queries,
but maybe we should verify it does not regress them?

With the patch attached, bms_overlap gets quite high in the profiles. I
think one reason for that is that all bitmap operations (not just
bms_overlap) always start the loop at 0. For the upper boundary is
usually determined as Min() of the lengths, but we don't do that for
lower boundary because we don't track that. The bitmaps for eclasses are
likely sparse, so this is quite painful. Attaches is a simple patch that
adds tracking of "minword" and uses it in various bms_ methods to skip
initial part of the loop. On my machine this reduces the timings by
roughly 5% (from 1610 to 1530 ms).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
regression.diffs text/plain 17.3 KB
bitmapset.patch text/x-patch 8.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-12-25 00:55:29 Re: pg_dumpall --exclude-database option
Previous Message Mitar 2018-12-25 00:13:44 Re: Feature: triggers on materialized views