Re: BUG #15715: UPDATE using "in (subquery for update limit 1)" does not respect the limit in subquery

From: Lars Vonk <lars(dot)vonk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15715: UPDATE using "in (subquery for update limit 1)" does not respect the limit in subquery
Date: 2019-03-26 20:19:22
Message-ID: CAMX1Thhr8X5e=t6n+m=kWR1LhOqoqe-wupyBZKuuxmpr3091FA@mail.gmail.com
Lists: pgsql-bugs

Hi,

Thanks for the explanation. Our hotfix was to replace the IN with an =, so
the query fails and the transaction is rolled back when more than one row
is returned from the sub-SELECT. We will take your advice and rewrite the
query as you recommend.
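
For reference, the hotfixed statement looks roughly like this (a sketch of
the logged query below, with its literal timestamps and strings replaced
by $n parameters):

UPDATE delayed_jobs
SET locked_at = $1, locked_by = $2
WHERE id = (SELECT id FROM delayed_jobs
            WHERE ((run_at <= $3 AND (locked_at IS NULL OR locked_at < $4)
                    OR locked_by = $2)
                   AND failed_at IS NULL)
            ORDER BY priority ASC, run_at ASC
            LIMIT 1
            FOR UPDATE)
RETURNING *;
-- "id = (...)" raises "more than one row returned by a subquery used as
-- an expression" if the sub-SELECT ever yields multiple rows; and since
-- the scalar sub-SELECT is uncorrelated, it is typically planned as an
-- InitPlan, i.e. evaluated only once per statement.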

I understand how this query is unstable and can lock an extra row. What I
still don't fully understand is how that explains what happened, since we
never saw this in almost 7 years, and it doesn't obviously account for the
rapid increase in rows being locked by this one process. So I hope you are
willing to help me understand (I am happy to take this to another mailing
list if needed).

To add a bit more context:

- The library delayed_job is a background processor: (this) one process
locks a row and then executes the "delayed_job" (a row in the table). It
deletes the row on success or marks it as failed afterwards.
- Multiple "delayed_job"s per second can be inserted by other processes.
- This is the first time this behavior happened in almost 7 years. We
recently upgraded to PostgreSQL 11.2 (from 9.6). The Ruby library is pretty
stable; the part of the code generating the query has not changed for a
long time.
- After 4 minutes or so, more than 20,000 rows were locked by this process.

Coming back to your explanation:

> That'd depend on the query plan you're getting, but in principle this
> query is unstable if there are other transactions changing the table
> contents: if the sub-SELECT is executed more than once then it could
> return (and lock) a different row each time, since the FOR UPDATE
> implies it will see latest values not those that applied when the
> query started.

So if I understand this correctly: if in the query plan the sub-SELECT is
executed 10 times, then 10 rows can end up locked?
Is it plausible for a query plan to execute a sub-SELECT that many times?
Otherwise I can't explain the rapid increase in the number of locked rows.
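
One way to check this directly is to run EXPLAIN ANALYZE and see how often
the sub-SELECT was executed. A sketch (EXPLAIN ANALYZE really performs the
UPDATE, hence the transaction wrapper; the now()-based values only
approximate the literals from the log):

BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE delayed_jobs
SET locked_at = now(), locked_by = 'host:myhost pid:9958'
WHERE id IN (SELECT id FROM delayed_jobs
             WHERE ((run_at <= now()
                     AND (locked_at IS NULL
                          OR locked_at < now() - interval '4 hours')
                     OR locked_by = 'host:myhost pid:9958')
                    AND failed_at IS NULL)
             ORDER BY priority ASC, run_at ASC
             LIMIT 1
             FOR UPDATE)
RETURNING *;
ROLLBACK;
-- Look for "SubPlan 1" in the output: the "loops=N" counts on its nodes
-- show how many times the sub-SELECT was executed, and each execution
-- can lock a (possibly different) row.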

What is also strange is the duration of that query (it normally takes a
couple of milliseconds).
It is almost as if it was in an endless loop, immediately updating new rows
right after they were inserted by other processes.
Is that also possible given your explanation?

Again thanks for your help.

Lars

On Tue, Mar 26, 2019 at 6:03 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> > We are using a ruby library called delayed-job
> > (https://github.com/collectiveidea/delayed_job) that generates the
> following
> > sql
>
> > LOG: duration: 82797.554 ms execute <unnamed>: UPDATE "delayed_jobs"
> SET
> > locked_at = '2019-03-26 13:25:20.808244', locked_by = 'host:myhost
> pid:9958'
> > WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE
> ((run_at
> > <= '2019-03-26 13:25:20.807815' AND (locked_at IS NULL OR locked_at <
> > '2019-03-26 09:25:20.807828') OR locked_by = 'host:myhost pid:9958') AND
> > failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE)
> > RETURNING *
>
> > We noticed that all the "delayed_jobs" were getting locked instead of
> just
> > 1.
>
> That'd depend on the query plan you're getting, but in principle this
> query is unstable if there are other transactions changing the table
> contents: if the sub-SELECT is executed more than once then it could
> return (and lock) a different row each time, since the FOR UPDATE
> implies it will see latest values not those that applied when the
> query started.
>
> For that matter it could be unstable even without any concurrent
> transactions, seeing that the UPDATE's action is to change some of
> the fields the sub-SELECT's WHERE examines, and again the FOR UPDATE
> means that the sub-SELECT can see those changes. Given these particular
> values it seems like a row that satisfied the WHERE conditions
> to start with still would do so after the UPDATE, but I wonder whether
> that's always true in your usage.
>
> Any given execution of the sub-SELECT is definitely going to lock only one
> (or at most one) row, but whether it's the same row each time is harder to
> say. You might be well advised to replace the sub-SELECT with a CTE so
> that it's guaranteed to execute only once, ie
>
> WITH x AS (SELECT id ... FOR UPDATE LIMIT 1)
> UPDATE delayed_jobs SET ... WHERE id IN (SELECT id FROM x)
>
> > We also found some related stackoverflow content which sort of hints that
> > this is a bug:
> > - https://dba.stackexchange.com/questions/69471/postgres-update-limit-1
> > -
> >
> https://github.com/feikesteenbergen/demos/blob/master/bugs/update_limit_bug.txt
>
> In that case the query is intentionally changing the sub-query's results
> (by changing the sort ordering field). I don't think it's a bug; it's
> really closer to being unspecified behavior.
>
> regards, tom lane
>
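
For reference, Tom's suggested CTE rewrite applied to the logged statement
would look something like this (a sketch; the $n parameters stand in for
the literal values from the log above):

WITH x AS (
    SELECT id FROM delayed_jobs
    WHERE ((run_at <= $1 AND (locked_at IS NULL OR locked_at < $2)
            OR locked_by = $3)
           AND failed_at IS NULL)
    ORDER BY priority ASC, run_at ASC
    LIMIT 1
    FOR UPDATE
)
UPDATE delayed_jobs
SET locked_at = $4, locked_by = $3
WHERE id IN (SELECT id FROM x)
RETURNING *;
-- On PostgreSQL 11 a CTE is always materialized, so the sub-SELECT runs
-- (and locks) at most once per statement.  On 12 and later, writing
-- "WITH x AS MATERIALIZED (...)" makes that guarantee explicit.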
