Re: BUG #15715: UPDATE using "in (subquery for update limit 1)" does not respect the limit in subquery

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: lars(dot)vonk(at)gmail(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15715: UPDATE using "in (subquery for update limit 1)" does not respect the limit in subquery
Date: 2019-03-26 17:03:17
Message-ID: 1134.1553619797@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> We are using a ruby library called delayed-job
> (https://github.com/collectiveidea/delayed_job) that generates the following
> sql

> LOG: duration: 82797.554 ms execute <unnamed>: UPDATE "delayed_jobs" SET
> locked_at = '2019-03-26 13:25:20.808244', locked_by = 'host:myhost pid:9958'
> WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at
> <= '2019-03-26 13:25:20.807815' AND (locked_at IS NULL OR locked_at <
> '2019-03-26 09:25:20.807828') OR locked_by = 'host:myhost pid:9958') AND
> failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE)
> RETURNING *

> We noticed that all the "delayed_jobs" were getting locked instead of just
> 1.

That'd depend on the query plan you're getting, but in principle this
query is unstable if there are other transactions changing the table
contents: if the sub-SELECT is executed more than once then it could
return (and lock) a different row each time, since the FOR UPDATE
implies it will see latest values not those that applied when the
query started.

For that matter it could be unstable even without any concurrent
transactions, seeing that the UPDATE's action is to change some of
the fields the sub-SELECT's WHERE examines, and again the FOR UPDATE
means that the sub-SELECT can see those changes. Given these particular
values it seems like a row that satisfied the WHERE conditions
to start with still would do so after the UPDATE, but I wonder whether
that's always true in your usage.

Any given execution of the sub-SELECT is definitely going to lock only one
(or at most one) row, but whether it's the same row each time is harder to
say. You might be well advised to replace the sub-SELECT with a CTE so
that it's guaranteed to execute only once, ie

WITH x AS (SELECT id ... FOR UPDATE LIMIT 1)
UPDATE delayed_jobs SET ... WHERE id IN (SELECT id FROM x)

> We also found some related stackoverflow content which sort of hints that
> this is a bug:
> - https://dba.stackexchange.com/questions/69471/postgres-update-limit-1
> -
> https://github.com/feikesteenbergen/demos/blob/master/bugs/update_limit_bug.txt

In that case the query is intentionally changing the sub-query's results
(by changing the sort ordering field). I don't think it's a bug; it's
really closer to being unspecified behavior.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Lars Vonk 2019-03-26 20:19:22 Re: BUG #15715: UPDATE using "in (subquery for update limit 1)" does not respect the limit in subquery
Previous Message Anthony Sotolongo 2019-03-26 15:49:20 Re: BUG #15716: pgadmin 4.3: row_to_json error