Re: delete statement returning too many results

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Ron <ronljohnsonjr(at)gmail(dot)com>, "Arlo Louis O'Keeffe" <gnomelver(at)k5d(dot)de>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: delete statement returning too many results
Date: 2022-11-28 17:11:53
Message-ID: 632644.1669655513@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> There is a nice big caution regarding the default read committed isolation
> mode, order by, and for update, in the documentation, but I cannot work out
> exactly why this example seems to be triggering it.

The <caution> is talking about a rather different scenario.

I managed to reproduce this locally. I find that initially, with an
empty queue table, you get a query plan like

Delete on queue (cost=0.38..8.42 rows=1 width=38)
-> Nested Loop (cost=0.38..8.42 rows=1 width=38)
-> HashAggregate (cost=0.23..0.24 rows=1 width=40)
Group Key: "ANY_subquery".id
-> Subquery Scan on "ANY_subquery" (cost=0.15..0.22 rows=1 width=40)
-> Limit (cost=0.15..0.21 rows=1 width=14)
-> LockRows (cost=0.15..74.15 rows=1200 width=14)
-> Index Scan using queue_pkey on queue queue_1 (cost=0.15..62.15 rows=1200 width=14)
-> Index Scan using queue_pkey on queue (cost=0.15..8.17 rows=1 width=14)
Index Cond: (id = "ANY_subquery".id)

which is fine because the LockRows bit will be run only once.

However, after the table's been stomped on for awhile (and probably
not till after autovacuum runs), that switches to

Delete on queue (cost=0.25..16.31 rows=1 width=38)
-> Nested Loop Semi Join (cost=0.25..16.31 rows=1 width=38)
Join Filter: (queue.id = "ANY_subquery".id)
-> Index Scan using queue_pkey on queue (cost=0.12..8.14 rows=1 width=14)
-> Subquery Scan on "ANY_subquery" (cost=0.12..8.16 rows=1 width=40)
-> Limit (cost=0.12..8.15 rows=1 width=14)
-> LockRows (cost=0.12..8.15 rows=1 width=14)
-> Index Scan using queue_pkey on queue queue_1 (cost=0.12..8.14 rows=1 width=14)

and then you start to get failures, because each re-execution of
the subquery produces a fresh row thanks to the silent SKIP LOCKED.

So basically it's unsafe to run the sub-select more than once,
but the query as written leaves it up to the planner whether
to do that. I'd suggest rephrasing as

WITH target_rows AS MATERIALIZED (
SELECT id
FROM queue
ORDER BY id
LIMIT 1
FOR UPDATE
SKIP LOCKED
)
DELETE FROM queue
WHERE id IN (SELECT * FROM target_rows)
RETURNING *;

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kirk Wolak 2022-11-28 22:22:19 Re: Getting PSQL in Windows to support TAB/Autocomplete via modified readline...
Previous Message David G. Johnston 2022-11-28 14:34:45 Re: delete statement returning too many results