Re: Botched estimation in eqjoinsel_semi for cases without reliable ndistinct

From: Casey Allen Shobe <casey(dot)shobe(at)messagesystems(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Botched estimation in eqjoinsel_semi for cases without reliable ndistinct
Date: 2012-01-12 18:03:25
Message-ID: 20120112130325.00fc8782@cshobe.int.messagesystems.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, 11 Jan 2012 19:40:34 -0500, Andres wrote:
> Unfortunately the pastebins where raptelan provided plans expired by
> now... Perhaps he can provide them again?

Sure, the original curiosity I noticed was that adjusting the block
size of results returned by the CTE had widely different effects, 5000
seemed to be some sort of "magic number", while either 6000 or 4000
worked poorly. Originally, our design used blocks of 100,000.

I then noticed a regression with the 5,000 block size. The difference
in the queries between a prototype (that was pretty fast (5-10s) and
what was generated (slow, (5m+)) was that the order of columns in one
of the join conditions in the main query was reversed. ON
source.column = joined.column was slow, while ON joined.column =
source.column was fast. Apparently this is enough to get a different
possible plan to hit the planner first, while another slow plan with
nearly identical estimates is hitting the planner first in other cases.

Attached are three files - one shows the fast plan, another the slow
plan, and another with the query in question. The ON clause where
reversal makes a difference is the td_13 one. I use a different range
in the CTE for both queries as otherwise filesystem cache makes the
timings look better, but in both cases, the CTE returns exactly 5,000
results.

Yes, the query could be written a lot better, but currently it's
generated this way to conform to an expectation of user-defined custom
where clauses that do not qualify column names. Breaking that
compatibility and redoing this better is a longer-term plan.

Please let me know if I'm omitting any important details.

Regards,
--
Casey Allen Shobe | Senior Software Engineer/DBA
Message Systems | http://messagesystems.com
casey(dot)shobe(at)messagesystems(dot)com | 443-656-3311 x248

Attachment Content-Type Size
230final.sql text/x-sql 3.9 KB
plan.fast.txt text/plain 8.7 KB
plan.slow.txt text/plain 8.4 KB

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2012-01-12 18:19:14 Re: BUG #6393: cluster sometime fail under heavy concurrent write load
Previous Message Tom Lane 2012-01-12 17:24:28 Re: BUG #6393: cluster sometime fail under heavy concurrent write load