From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | zuming(dot)jiang(at)inf(dot)ethz(dot)ch |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17976: Inconsistent results of SELECT using CASE WHEN clause |
Date: | 2023-06-14 23:34:23 |
Message-ID: | 2023628.1686785663@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> My fuzzer finds a logical bug in Postgres, which makes Postgres return
> inconsistent results.
Thanks for the report!
I poked at this a little bit. v16 is actually generating an incorrect
plan for both query variants; the bad plan for the second variant just
happens to not give visibly wrong answers for this input data.
regression=# explain (costs off)
regression-# select count(*) as c_6
regression-# from
regression-# (t1 as ref_15
regression(# left outer join t4 as ref_16
regression(# on (ref_15.pkey = ref_16.c_2))
regression-# where (case when true
regression(# then ref_16.c_3 else ref_16.c_3 end
regression(# ) = pg_catalog.dcbrt(case when (((ref_15.c5 like '7%z')
regression(# and (not (ref_15.c5 like '7%z')))
regression(# and ((ref_15.c5 like '7%z') is not null))
regression(# then ref_16.c_6 else ref_15.c8 end);
QUERY PLAN
---------------------------------------------------
Aggregate
-> Hash Join
Hash Cond: (ref_1.c9 = ref_15.pkey)
-> Nested Loop Left Join
Join Filter: ('1'::text = ref_1.c5)
-> Seq Scan on t1 ref_1
-> Materialize
-> Seq Scan on t2 ref_1_1
Filter: (c12 > c12)
-> Hash
-> Seq Scan on t1 ref_15
(11 rows)
v15 and before give
Aggregate
-> Hash Join
Hash Cond: (ref_1.c9 = ref_15.pkey)
Join Filter: (dcbrt(CASE WHEN ((ref_15.c5 ~~ '7%z'::text) AND (ref_15.c5 !~~ '7%z'::text) AND ((ref_15.c5 ~~ '7%z'::text) IS NOT NULL)) THEN ((1))::double precision ELSE ref_15.c8 END) = ref_1.c8)
-> Nested Loop Left Join
Join Filter: ('1'::text = ref_1.c5)
-> Seq Scan on t1 ref_1
-> Materialize
-> Hash Left Join
Hash Cond: (ref_1_1.c10 = ref_0.c2)
-> Seq Scan on t2 ref_1_1
Filter: (c12 > c12)
-> Hash
-> HashAggregate
Group Key: ref_0.c2
-> Seq Scan on t0 ref_0
-> Hash
-> Seq Scan on t1 ref_15
So the good news is that v16 correctly recognizes that the left join
to ref_0 can be discarded. (Older versions recognize this if you
just select directly from t3 or t4, but fail to make that deduction
when it's buried under an additional layer of outer join. I believe
this better result is due to the outer-join-aware-Vars changes.)
The bad news is that the top-level join filter condition has gone
missing. That still happens even with a greatly simplified WHERE
condition, ie this isn't really about CASE:
regression=# explain (costs off)
select count(*) as c_6
from
(t1 as ref_15
left outer join t4 as ref_16
on (ref_15.pkey = ref_16.c_2))
where ref_16.c_6 = ref_15.c8;
QUERY PLAN
---------------------------------------------------
Aggregate
-> Hash Right Join
Hash Cond: (ref_1.c9 = ref_15.pkey)
-> Nested Loop Left Join
Join Filter: ('1'::text = ref_1.c5)
-> Seq Scan on t1 ref_1
-> Materialize
-> Seq Scan on t2 ref_1_1
Filter: (c12 > c12)
-> Hash
-> Seq Scan on t1 ref_15
(11 rows)
We are converting this WHERE condition to an EquivalenceClass with
members "ref_16.c_6" and "ref_15.c8", and what seems to be the
problem is that analyzejoins.c fails to strip the removed rel(s)
from the EquivalenceMember for "ref_16.c_6", so it never looks
like we've reached a join level where it's time to enforce that.
I'd always kind of wondered how we got away with not updating
EquivalenceClasses during join removal, and the answer evidently
is that we can't anymore. I've not tried to write a patch yet.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2023-06-14 23:55:27 | Re: BUG #17976: Inconsistent results of SELECT using CASE WHEN clause |
Previous Message | David Rowley | 2023-06-14 23:22:55 | Re: BUG #17976: Inconsistent results of SELECT using CASE WHEN clause |