BUG #17213: Wrong result from a query involving Merge Semi Join and Memoize

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: elprans(at)gmail(dot)com
Subject: BUG #17213: Wrong result from a query involving Merge Semi Join and Memoize
Date: 2021-10-04 23:04:38
Message-ID: 17213-988ed34b225a2862@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17213
Logged by: Elvis Pranskevichus
Email address: elprans(at)gmail(dot)com
PostgreSQL version: 14.0
Operating system: Linux
Description:

It appears a combination of Merge Semi Join and Memoize in PostgreSQL 14
produces incorrect results on a particular query. The bug might be
present
in earlier versions, but I was only able to reproduce a particular plan
under version 14.

======
Schema
======

CREATE TABLE issues (
id int PRIMARY KEY
);

CREATE TABLE users (
id int PRIMARY KEY
);

CREATE TABLE watchers (
issue_id int,
user_id int,

CONSTRAINT unique_pair UNIQUE (issue_id, user_id)
);

CREATE INDEX watchers_user_idx ON watchers(user_id);

INSERT INTO issues (id) VALUES (1);
INSERT INTO issues (id) VALUES (2);
INSERT INTO issues (id) VALUES (3);
INSERT INTO issues (id) VALUES (4);

INSERT INTO users (id) VALUES (1);
INSERT INTO users (id) VALUES (2);

INSERT INTO watchers (issue_id, user_id) VALUES (1, 2);
INSERT INTO watchers (issue_id, user_id) VALUES (2, 1);
INSERT INTO watchers (issue_id, user_id) VALUES (3, 1);

=====
Query
=====

SELECT
i1.id
FROM
issues AS i1
WHERE
EXISTS (
SELECT
FROM
(SELECT
i3.id
FROM
issues AS i3
WHERE
i3.id
IN
(SELECT
w3.issue_id
FROM
(SELECT
u1.id
FROM
users AS u1
WHERE
EXISTS (
SELECT
FROM
watchers AS w1
INNER JOIN users AS u2 ON (w1.user_id = u2.id)
INNER JOIN watchers AS w2 ON (u1.id =
w2.user_id)
WHERE
(i1.id = w1.issue_id)
AND (w2.issue_id != i1.id)
AND u1.id = u2.id
)
) AS q
INNER JOIN users AS superfluous ON (q.id = superfluous.id)
INNER JOIN watchers AS w3 ON (q.id = w3.user_id)
)
) AS q
);

===============
Expected Output
===============

id
----
2
3
(2 rows)

===============================
Actual Output (with below plan)
===============================

id
----
(0 rows)

=================
Plan with Memoize
=================

Seq Scan on issues i1 (cost=0.00..11577.47 rows=145 width=248) (actual
time=0.091..0.093 rows=0 loops=1)
Filter: (SubPlan 2)
Rows Removed by Filter: 4
SubPlan 2
-> Merge Semi Join (cost=0.61..5694.44 rows=145 width=0) (actual
time=0.020..0.020 rows=0 loops=4)
Merge Cond: (i3.id = w3.issue_id)
-> Index Only Scan using issues_pkey on issues i3
(cost=0.15..52.50 rows=290 width=16) (actual time=0.002..0.003 rows=1
loops=4)
Heap Fetches: 4
-> Nested Loop (cost=0.46..5632.72 rows=680 width=16) (actual
time=0.017..0.017 rows=0 loops=4)
-> Nested Loop (cost=0.30..344.25 rows=1360 width=48)
(actual time=0.004..0.010 rows=3 loops=4)
-> Index Only Scan using unique_pair on watchers w3
(cost=0.15..68.55 rows=1360 width=32) (actual time=0.001..0.002 rows=3
loops=4)
Heap Fetches: 12
-> Index Only Scan using users_pkey on users
superfluous (cost=0.15..0.20 rows=1 width=16) (actual time=0.001..0.001
rows=1 loops=12)
Index Cond: (id = w3.user_id)
Heap Fetches: 12
-> Memoize (cost=0.16..7.02 rows=1 width=16) (actual
time=0.002..0.002 rows=0 loops=12)
Cache Key: superfluous.id
Hits: 10 Misses: 2 Evictions: 0 Overflows: 0
Memory Usage: 1kB
-> Index Only Scan using users_pkey on users u1
(cost=0.15..7.01 rows=1 width=16) (actual time=0.008..0.009 rows=0
loops=2)
Index Cond: (id = superfluous.id)
Filter: (SubPlan 1)
Rows Removed by Filter: 1
Heap Fetches: 2
SubPlan 1
-> Nested Loop (cost=0.45..44.73 rows=7
width=0) (actual time=0.006..0.006 rows=0 loops=2)
-> Index Scan using watchers_user_idx
on watchers w2 (cost=0.15..28.29 rows=7 width=0) (actual time=0.003..0.003
rows=1 loops=2)
Index Cond: (user_id = u1.id)
Filter: (issue_id <> i1.id)
Rows Removed by Filter: 0
-> Materialize (cost=0.30..16.36
rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=2)
-> Nested Loop
(cost=0.30..16.35 rows=1 width=0) (actual time=0.002..0.003 rows=0
loops=1)
-> Index Only Scan using
unique_pair on watchers w1 (cost=0.15..8.17 rows=1 width=16) (actual
time=0.002..0.002 rows=0 loops=1)
Index Cond:
((issue_id = i1.id) AND (user_id = u1.id))
Heap Fetches: 0
-> Index Only Scan using
users_pkey on users u2 (cost=0.15..8.17 rows=1 width=16) (never executed)
Index Cond: (id =
u1.id)
Heap Fetches: 0
Planning Time: 0.755 ms
Execution Time: 0.146 ms

=============================
With SET enable_memoize = off
=============================

Seq Scan on issues i1 (cost=0.00..19705.90 rows=145 width=32) (actual
time=0.076..0.116 rows=2 loops=1)
Filter: (SubPlan 2)
Rows Removed by Filter: 2
SubPlan 2
-> Merge Semi Join (cost=0.60..9760.10 rows=145 width=0) (actual
time=0.025..0.025 rows=0 loops=4)
Merge Cond: (i3.id = w3.issue_id)
-> Index Only Scan using issues_pkey on issues i3
(cost=0.15..52.50 rows=290 width=16) (actual time=0.003..0.003 rows=2
loops=4)
Heap Fetches: 6
-> Nested Loop (cost=0.45..9698.38 rows=680 width=16) (actual
time=0.021..0.022 rows=0 loops=4)
Join Filter: (u1.id = superfluous.id)
-> Nested Loop (cost=0.30..9552.04 rows=680 width=48)
(actual time=0.020..0.020 rows=0 loops=4)
-> Index Only Scan using unique_pair on watchers w3
(cost=0.15..68.55 rows=1360 width=32) (actual time=0.001..0.002 rows=2
loops=4)
Heap Fetches: 10
-> Index Only Scan using users_pkey on users u1
(cost=0.15..6.98 rows=1 width=16) (actual time=0.007..0.007 rows=0
loops=10)
Index Cond: (id = w3.user_id)
Filter: (SubPlan 1)
Rows Removed by Filter: 1
Heap Fetches: 10
SubPlan 1
-> Nested Loop (cost=0.45..44.73 rows=7
width=0) (actual time=0.005..0.005 rows=0 loops=10)
-> Index Scan using watchers_user_idx
on watchers w2 (cost=0.15..28.29 rows=7 width=0) (actual time=0.001..0.002
rows=1 loops=10)
Index Cond: (user_id = u1.id)
Filter: (issue_id <> i1.id)
Rows Removed by Filter: 0
-> Materialize (cost=0.30..16.36
rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=13)
-> Nested Loop
(cost=0.30..16.35 rows=1 width=0) (actual time=0.002..0.002 rows=0
loops=9)
-> Index Only Scan using
unique_pair on watchers w1 (cost=0.15..8.17 rows=1 width=16) (actual
time=0.001..0.001 rows=0 loops=9)
Index Cond:
((issue_id = i1.id) AND (user_id = u1.id))
Heap Fetches: 2
-> Index Only Scan using
users_pkey on users u2 (cost=0.15..8.17 rows=1 width=16) (actual
time=0.001..0.001 rows=1 loops=2)
Index Cond: (id =
u1.id)
Heap Fetches: 2
-> Index Only Scan using users_pkey on users superfluous
(cost=0.15..0.20 rows=1 width=16) (actual time=0.002..0.002 rows=1
loops=2)
Index Cond: (id = w3.user_id)
Heap Fetches: 2
Planning Time: 0.706 ms
Execution Time: 0.164 ms

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Peter Geoghegan 2021-10-04 23:10:05 Re: BUG #17212: pg_amcheck fails on checking temporary relations
Previous Message Mark Dilger 2021-10-04 22:36:06 Re: BUG #17212: pg_amcheck fails on checking temporary relations