Re: Bug in planner

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bug in planner
Date: 2015-04-25 10:44:08
Message-ID: CAApHDvpyO-wOp2RP+AghLDfwMo7XAsxgMm3GRTJ-tW3yG1gOvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 24 April 2015 at 21:43, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:

> Hi!
>
> I faced with planner error:
> ERROR: could not find RelOptInfo for given relids
>
>
Good find!

I've simplified your query a bit, the following still shows the issue
(using your schema):

SELECT *
FROM t1
WHERE NOT EXISTS (SELECT t2.c2 AS c1
FROM t2
LEFT OUTER JOIN t3 ON t2.c3 = t3.c1
LEFT OUTER JOIN (SELECT t5.c1 AS c1
FROM t4
LEFT OUTER JOIN t5 ON t4.c2 = t5.c1 --offset 0
) a1 ON a1.c1 = t3.c2
WHERE t1.c1 = t2.c2
);

I've done a little debugging on this too and I get the idea that in
eqjoinsel() that min_righthand incorrectly does not have a bit set for "t3"

When the failing call is made to find_join_rel() with the above query
relids being searched for has a decimal value of 388 (binary 110000100 i.e
t5, t4, t2)

find_join_rel makes a pass over join_rel_list to search for the 388 valued
relids.

join_rel_list contains the following:

1 -> 396 (110001100) t5, t4, t3, t2
2 -> 384 (110000000) t5, t4
3 -> 392 (110001000) t5, t4, t3
4 -> 396 (110001100) t5, t4, t3, t2

I looked up simple_rte_array to determine which bits are for which relation.

simple_rte_array:
1 -> t1
2 -> t2
3 -> t3
4 -> join
5 -> a1
6 -> join
7 -> t4
8 -> t5

I'd imagine that the find_join_input_rel() search should actually be for
relids 396. I need to spend more time in this area to get a better grasp of
what's actually meant to be happening, but I think the problem lies
in make_outerjoininfo() when it determines what min_righthand should be set
to with the following:

/*
* Similarly for required RHS. But here, we must also include any lower
* inner joins, to ensure we don't try to commute with any of them.
*/
min_righthand = bms_int_members(bms_union(clause_relids, inner_join_rels),
right_rels);

I think the problem seems to be down to the fact that inner_join_rels and
clause_relids are built from deconstruct_jointree() which I'd imagine does
not get modified when the subquery for t4 and t5 is pulled up, therefore is
out-of-date. </theory>

I've attached a patch which appears to fix the problem, but this is more
for the purposes of a demonstration of where the problem lies. I don't have
enough knowledge of what's meant to be happening here, I'll need to spend
more time reading code and debugging.

On a side note, I just discovered another join removal opportunity:

explain select * from t1 where not exists(select 1 from t2 left join t3 on
t2.c1 = t3.c1 where t1.c1=t2.c1);

The join to t3 here is useless, as since it's a left join, the join could
only cause duplication of t2 rows, and this does not matter as we're
performing an anti join anyway (same applies for semi join).

Regards

David Rowley

Attachment Content-Type Size
anti_join_with_pulledup_outer_joins_fix_attempt.patch application/octet-stream 553 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2015-04-25 11:30:24 Re: Feedback on getting rid of VACUUM FULL
Previous Message Andres Freund 2015-04-25 09:01:14 INSERT ... ON CONFLICT syntax issues