planner weirdness: a join uses nestloop with checking condition when there are two subplan-or-hashed subqueries

From: Alexey Bashtanov <bashtanov(at)imap(dot)cc>
To: pgsql-bugs(at)postgresql(dot)org
Subject: planner weirdness: a join uses nestloop with checking condition when there are two subplan-or-hashed subqueries
Date: 2020-02-25 14:32:17
Message-ID: ff42b25b-ff03-27f8-ed11-b8255d658cd5@imap.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hello,

Planner seems to make a weird choice of join algorithm -- O(M*N)
nestloop -- under certain circumstances.
I'm not exactly sure what is the criteria but I have a self-contained
example, albeit a large one.

So if you unpack the pb-dump.sql.bz2 attached and run the pb-test.sql
you'll have some plan explained.
Despite an index present on "qux" it's accessed using a plain seq-seq
nestloop, see pb-plan.txt

Why? Cardinalities look like predicted reasonably well, and with those
predictions hash join or index scan would
be obviously faster: the planner thinks we are joining 149 and 175728 rows.
The distribution for "qux"."foo_id" is not too skew, the average number
of rows per "foo_id" in "qux" is about 9.
Slight data or query variations make it use the index.
With "set enable_nestloop to off; set enable_mergejoin to off;" the plan
generated is better. It has smaller cost of the final join, though the
costs for the outer relation increase, probably due to the "never
executed" path.

Playing with json/from_collapse_limit does not make any difference.

I can observe this on both master and v 10.11 .

I haven't investigated it any further yet, so for now just asking
whether it's a known behavior?
If not, I'll try to find out what's going on.

Best, Alex

Attachment Content-Type Size
pb-plan.txt text/plain 4.8 KB
pb-test.sql application/sql 1.6 KB
pb-dump.sql.bz2 application/x-bzip 6.8 MB

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tomas Vondra 2020-02-25 15:05:34 Re: BUG #16275: we are facing error as psycopg2.errors.ProgramLimitExceeded: row is too big: size 24520, maximum
Previous Message eli.mach 2020-02-25 13:59:27 Re: pretty_bool in pg_get_constraintdef has no effect since pg >= 9