Re: Performance issue in foreign-key-aware join estimation

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: Performance issue in foreign-key-aware join estimation
Date: 2019-07-21 05:43:27
Message-ID: CAKJS1f8Y=DR-xphKhqdvFHUh9ozA+hbyevxNBV9ys5uNHfavmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 18 Jul 2019 at 19:24, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> Unless there's some objection, I'll be looking into pushing both 0001
> and 0002 in a single commit in the next few days.

I've pushed this after doing a bit of final tweaking.

After a re-read, I didn't really like all the code that rechecked that
ec->ec_relids matched the relation we're searching for. The only code
that seems to be able to put additional members into eclass_indexes
that there's no mention of in the actual class was in
add_child_rel_equivalences(). The comments for the definition of
eclass_indexes said nothing about there being a possibility of the
field containing an index for an EC that knows nothing about the given
relation. Fixing that either meant fixing the comment to say that
"they *may* contain", or fixing up the code so that it's strict about
what ECs can be mentioned in eclass_indexes. I went with the fixing
the code option since it also allows us to get rid of some redundant
checks, to which I turned into Asserts() to catch any possible future
bugs that might be introduced by any code that might one day remove
rels from an EC, e.g something like
https://commitfest.postgresql.org/23/1712/

I also did some performance tests on the most simple query I could
think of that uses eclasses.

select2.sql: SELECT * FROM t1 INNER JOIN t2 ON t1.a=t2.a

Master:

$ pgbench -n -f select2.sql -T 60 postgres
tps = 12143.597276 (excluding connections establishing)
tps = 12100.773839 (excluding connections establishing)
tps = 12086.209389 (excluding connections establishing)
tps = 12098.194927 (excluding connections establishing)
tps = 12105.140058 (excluding connections establishing)

Patched:

$ pgbench -n -f select2.sql -T 60 postgres
tps = 12224.597530 (excluding connections establishing)
tps = 12097.286522 (excluding connections establishing)
tps = 12035.306729 (excluding connections establishing)
tps = 11965.848289 (excluding connections establishing)
tps = 12059.846474 (excluding connections establishing)

There's a bit of noise there, but on average we're just 0.25% slower
on the worse case and the benchmarks shown above put us ~406% better
on with the fairly complex query that Tom posted in the initial email
on this thread.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michail Nikolaev 2019-07-21 05:46:37 Re: thoughts on "prevent wraparound" vacuum
Previous Message Alexander Lakhin 2019-07-21 05:28:53 Fix typos and inconsistencies for HEAD (take 7)