Re: BUG #17976: Inconsistent results of SELECT using CASE WHEN clause

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: zuming(dot)jiang(at)inf(dot)ethz(dot)ch
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17976: Inconsistent results of SELECT using CASE WHEN clause
Date: 2023-06-14 23:34:23
Message-ID: 2023628.1686785663@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> My fuzzer finds a logical bug in Postgres, which makes Postgres return
> inconsistent results.

Thanks for the report!

I poked at this a little bit. v16 is actually generating an incorrect
plan for both query variants; the bad plan for the second variant just
happens to not give visibly wrong answers for this input data.

regression=# explain (costs off)
regression-# select count(*) as c_6
regression-# from
regression-# (t1 as ref_15
regression(# left outer join t4 as ref_16
regression(# on (ref_15.pkey = ref_16.c_2))
regression-# where (case when true
regression(# then ref_16.c_3 else ref_16.c_3 end
regression(# ) = pg_catalog.dcbrt(case when (((ref_15.c5 like '7%z')
regression(# and (not (ref_15.c5 like '7%z')))
regression(# and ((ref_15.c5 like '7%z') is not null))
regression(# then ref_16.c_6 else ref_15.c8 end);
QUERY PLAN
---------------------------------------------------
Aggregate
-> Hash Join
Hash Cond: (ref_1.c9 = ref_15.pkey)
-> Nested Loop Left Join
Join Filter: ('1'::text = ref_1.c5)
-> Seq Scan on t1 ref_1
-> Materialize
-> Seq Scan on t2 ref_1_1
Filter: (c12 > c12)
-> Hash
-> Seq Scan on t1 ref_15
(11 rows)

v15 and before give

Aggregate
-> Hash Join
Hash Cond: (ref_1.c9 = ref_15.pkey)
Join Filter: (dcbrt(CASE WHEN ((ref_15.c5 ~~ '7%z'::text) AND (ref_15.c5 !~~ '7%z'::text) AND ((ref_15.c5 ~~ '7%z'::text) IS NOT NULL)) THEN ((1))::double precision ELSE ref_15.c8 END) = ref_1.c8)
-> Nested Loop Left Join
Join Filter: ('1'::text = ref_1.c5)
-> Seq Scan on t1 ref_1
-> Materialize
-> Hash Left Join
Hash Cond: (ref_1_1.c10 = ref_0.c2)
-> Seq Scan on t2 ref_1_1
Filter: (c12 > c12)
-> Hash
-> HashAggregate
Group Key: ref_0.c2
-> Seq Scan on t0 ref_0
-> Hash
-> Seq Scan on t1 ref_15

So the good news is that v16 correctly recognizes that the left join
to ref_0 can be discarded. (Older versions recognize this if you
just select directly from t3 or t4, but fail to make that deduction
when it's buried under an additional layer of outer join. I believe
this better result is due to the outer-join-aware-Vars changes.)
The bad news is that the top-level join filter condition has gone
missing. That still happens even with a greatly simplified WHERE
condition, ie this isn't really about CASE:

regression=# explain (costs off)
select count(*) as c_6
from
(t1 as ref_15
left outer join t4 as ref_16
on (ref_15.pkey = ref_16.c_2))
where ref_16.c_6 = ref_15.c8;
QUERY PLAN
---------------------------------------------------
Aggregate
-> Hash Right Join
Hash Cond: (ref_1.c9 = ref_15.pkey)
-> Nested Loop Left Join
Join Filter: ('1'::text = ref_1.c5)
-> Seq Scan on t1 ref_1
-> Materialize
-> Seq Scan on t2 ref_1_1
Filter: (c12 > c12)
-> Hash
-> Seq Scan on t1 ref_15
(11 rows)

We are converting this WHERE condition to an EquivalenceClass with
members "ref_16.c_6" and "ref_15.c8", and what seems to be the
problem is that analyzejoins.c fails to strip the removed rel(s)
from the EquivalenceMember for "ref_16.c_6", so it never looks
like we've reached a join level where it's time to enforce that.

I'd always kind of wondered how we got away with not updating
EquivalenceClasses during join removal, and the answer evidently
is that we can't anymore. I've not tried to write a patch yet.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2023-06-14 23:55:27 Re: BUG #17976: Inconsistent results of SELECT using CASE WHEN clause
Previous Message David Rowley 2023-06-14 23:22:55 Re: BUG #17976: Inconsistent results of SELECT using CASE WHEN clause