Quick Links

Postgress 13.x: wrong result for delete with subquery

From:	s(dot)p(dot)e(at)gmx-topmail(dot)de
To:	pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject:	Postgress 13.x: wrong result for delete with subquery
Date:	2021-01-28 12:08:21
Message-ID:	trinity-1c565d44-159f-488b-a518-caf13883134f-1611835701633@3c-app-gmx-bap78
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs

Hallo PSQL-Team,

I found a dataset (attachment bug13.csv) which produces a wrong result on postgreSQL 13 for a special delete command:

Table t1 and t2 contain the same dataset, every row of the data (id1, id2) is unique

So the command

DELETE FROM t1 WHERE (id1,id2) IN (SELECT t2.id1, t2.id2 FROM t2);

should remove all lines. So it does in postgres 10 and 12.5. In Version 13.0 and 13.1 there is on line left after the command.

In detail:

I have tested on PostgreSQL 10 (local instalaltion on SuSE-Linux) and three docker-Images (12.5, 13.0, 13.1).

The docker container are installed from scratch

docker run --name psqlXX \

--shm-size=880MB \

-v $PWD/dXX:/var/lib/postgresql/data \

-v $PWD/extra:/var/extra \

-e POSTGRES_PASSWORD=comirnaty \

-d postgres:XX.X

docker exec -it psqlXX bash

psql -U postgres < /var/extra/bug13.sql > /var/extra/XXX.out

bug13.sql does the following:

-- wrong result on postgres 13.1 (docker)

-- also tested postgres 13.0 (docker): same bug

-- also tested postgres 12.5 (docker): no bug

\set VERBOSITY verbose

SELECT version();

-- prepare the test-data-set, all rows are unique

DROP TABLE IF EXISTS tstdata;

CREATE TABLE tstdata (

id1 VARCHAR( 3),

id2 VARCHAR( 25)

);

CREATE UNIQUE INDEX tst_idx ON tstdata (id1, id2);

COPY tstdata (id1, id2) FROM '/var/extra/bug13.csv';

-- tables for the test

DROP TABLE IF EXISTS t1, t2;

CREATE TEMPORARY TABLE t1 (

id1 VARCHAR( 32),

id2 VARCHAR( 32)

);

CREATE UNIQUE INDEX t1_idx ON t1 (id1,id2);

CREATE TEMPORARY TABLE t2 (

id1 VARCHAR( 32),

id2 VARCHAR( 32)

);

-- fill the tst tables with data

INSERT INTO t1 SELECT id1,id2 from tstdata;

INSERT INTO t2 SELECT id1,id2 from tstdata;

-- here ist the test for the bug with wrong result

-- t1 and t2 contain the identical set

-- so this command should remove all entries in t1

EXPLAIN DELETE FROM t1 WHERE (id1,id2) IN (SELECT id1,id2 FROM t2);

-- QUERY PLAN

-- ----------------------------------------------------------------------------------------------

-- Delete on t1 (cost=595.12..1201.93 rows=3602 width=12)

-- -> Hash Join (cost=595.12..1201.93 rows=3602 width=12)

-- Hash Cond: (((t1.id1)::text = (t2.id1)::text) AND ((t1.id2)::text = (t2.id2)::text))

-- -> Seq Scan on t1 (cost=0.00..487.06 rows=14406 width=170)

-- -> Hash (cost=573.50..573.50 rows=1441 width=170)

-- -> HashAggregate (cost=559.09..573.50 rows=1441 width=170)

-- Group Key: (t2.id1)::text, (t2.id2)::text

-- -> Seq Scan on t2 (cost=0.00..487.06 rows=14406 width=170)

DELETE FROM t1 WHERE (id1,id2) IN (SELECT id1,id2 FROM t2);

-- on version 13.x there is still one entry left

-- MED | 0000000000000010035567472

SELECT * from t1;

-- sometimes after multiple tries: the bug disappears, after recreating t2 the bug recurs

-- - creating an index on t2 -> correct result

-- - define t2 non temporary -> correct result

-- - analyse t2 -> correct result

-- postgres 12.5 uses the same query plan without any error

-- second service ----------------------------------------------------------------------

-- different query plan after analyse and the bug disappears

-- refill t1

DELETE FROM t1;

INSERT INTO t1 SELECT id1,id2 from tstdata;

ANALYSE t2;

EXPLAIN DELETE FROM t1 WHERE (id1,id2) IN (SELECT id1,id2 FROM t2);

-- QUERY PLAN

-- ----------------------------------------------------------------------------------------------

-- Delete on t1 (cost=1506.92..2718.77 rows=7192 width=12)

-- -> Hash Semi Join (cost=1506.92..2718.77 rows=7192 width=12)

-- Hash Cond: (((t1.id1)::text = (t2.id1)::text) AND ((t1.id2)::text = (t2.id2)::text))

-- -> Seq Scan on t1 (cost=0.00..972.70 rows=28770 width=170)

-- -> Hash (cost=808.57..808.57 rows=46557 width=36)

-- -> Seq Scan on t2 (cost=0.00..808.57 rows=46557 width=36)

DELETE FROM t1 WHERE (id1,id2) IN (SELECT id1,id2 FROM t2);

SELECT * from t1;

Data, Command and Output are attached to this mail.

I reduced the bug-inducing datsaset from initial 8 Mio entries to this example file. The bug is very sensitive to the exact data and disappears on the most changes.

I hope, that a added everything necessary for further analysis.

Best regards

Stephan Endres

Attachment	Content-Type	Size
unknown_filename	text/html	9.5 KB
125.out	application/octet-stream	1.8 KB
130.out	application/octet-stream	1.9 KB
131.out	application/octet-stream	1.9 KB
bug13.csv	text/csv	1.3 MB
bug13.sql	application/sql	1.9 KB

Responses

Re: Postgress 13.x: wrong result for delete with subquery at 2021-01-29 22:26:17 from Tom Lane

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	PG Bug reporting form	2021-01-28 12:19:52	BUG #16842: pg_dump uses seek calls on pipe files: suggesting adding a flag to disable seek calls
Previous Message	Ajay Chitale	2021-01-28 09:36:32	Re: BUG #16841: psql -- \d tablename , displays "Error : column c.relhasoids does not exit"