|From:||David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>|
|To:||Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>|
|Cc:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>|
|Subject:||Re: Performance issue in foreign-key-aware join estimation|
|Views:||Raw Message | Whole Thread | Download mbox|
On Tue, 25 Dec 2018 at 13:46, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> I however observe failures on 4 regression test suites - inherit,
> equivclass, partition_join and partition_prune (diff attached). That's a
> bit surprising, because AFAICS the patch merely optimizes the execution
> and should not change the planning otherwise (all the failures seem to
> be a query plan changing in some way). I'm not sure if the plans are
> correct or better than the old ones.
Seems I didn't run the tests after doing a last-minute move of the
create_eclass_index() call in make_one_rel(). I'd previously had it
just below set_base_rel_pathlists(), which meant that the index didn't
exist in some cases so it would fall back on the original code. It
appears the use case call from set_base_rel_pathlists() require
is_child eclass members to be indexed too, but these were not in v1 or
v2 since ec_relids does not record child rel members.
It seems simple enough to fix this just by adding ec_allrelids and
setting it for all members rather than just non-children members.
This also allows me to add the two additional cases to allow
has_relevant_eclass_joinclause() to also make use of the eclass index.
This further reduces the total planning time for the query on my
machine to 2304 ms.
> The other thing is that we probably should not use a single test case to
> measure the optimization - I doubt it can improve less extreme queries,
> but maybe we should verify it does not regress them?
Agreed. That still needs to be verified. Although I'm yet unsure what
sort of form we could use this idea in. I wonder if it's fine to
create this eclass index on the fly, or if it's something we should
keep maintained all the time. The problem I saw with keeping it all
the time is down to eq_classes being a List and list_nth() is slow,
which means the bms_next_member() loops I've added would perform
poorly when compiled with a linked list lookup.
> With the patch attached, bms_overlap gets quite high in the profiles. I
> think one reason for that is that all bitmap operations (not just
> bms_overlap) always start the loop at 0. For the upper boundary is
> usually determined as Min() of the lengths, but we don't do that for
> lower boundary because we don't track that. The bitmaps for eclasses are
> likely sparse, so this is quite painful. Attaches is a simple patch that
> adds tracking of "minword" and uses it in various bms_ methods to skip
> initial part of the loop. On my machine this reduces the timings by
> roughly 5% (from 1610 to 1530 ms).
Seems like an interesting idea, although for the most part, Bitmapsets
are small and this adds some overhead to all use cases. I doubt it is
worth the trouble for bms_is_member(), since that does not need to
perform a loop over each bitmapword. It looks like most of the
bms_overlap() usages are caused by functions like
check_outerjoin_delay() and add_paths_to_joinrel(), all of which are
performing a loop over the join_info_list. Perhaps that could be
indexed a similar way to the eq_classes List. I'd imagine there's
about another 20-30% of performance to squeeze out of this by doing
that, plus a bit more by making the fkey_list per RelOptInfo.
Attached is v3 of the hacked up proof of concept performance demo patch.
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
|Next Message||Michael Paquier||2018-12-25 04:34:38||Re: Cache lookup errors with functions manipulation object addresses|
|Previous Message||Mitar||2018-12-25 02:20:01||Re: Feature: triggers on materialized views|