Re: Possible bug with SKIP LOCKED behaviour

From: Glen Mailer <glen(at)geckoboard(dot)com>
To: Zhang Mingli <zmlpostgres(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Possible bug with SKIP LOCKED behaviour
Date: 2022-09-29 08:50:50
Message-ID: CAHvdy4VGD+Jk2hc=m=iYL4amAohcUBoPzYcChM1mfHtPQ9AbWw@mail.gmail.com
Lists: pgsql-bugs

Hello

> With SKIP LOCKED, any selected rows that cannot be immediately locked are
> skipped. Skipping locked rows provides an inconsistent view of the data, so
> this is not suitable for general purpose work, but can be used to avoid
> lock contention with multiple consumers accessing a queue-like table.
>

Yes, I am specifically aiming to avoid lock contention with multiple
consumers accessing a queue-like table, and I'm seeing the same row being
retrieved by multiple workers.

> And a golang script is not convenient for hackers to reproduce. Could you
> provide some steps to produce the bug stably if it really was ?
>

Reproducing requires running a transaction with queries dependent on the
results of earlier queries, and then running a number of these transactions
concurrently, and then repeating the test until the unexpected result
happens. Currently I'm doing 20 concurrent transactions, and I find that if
I repeat the test 100 times I tend to get between zero and 3 failures.
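
To make that shape concrete without needing a database, here is a minimal
sketch of the harness logic in Go. It is not the code from the repository:
the names claimFunc, runRound and countFailures are made up for this
illustration, and claimFunc is a stand-in for one worker's whole
transaction (the SKIP LOCKED query plus the follow-up writes). The sketch
only shows the concurrency and the duplicate check, i.e. what I am counting
as a "failure".

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// claimFunc is a hypothetical stand-in for one worker's transaction.
// It returns the ID of the row that worker claimed, or ok=false if the
// query returned no row.
type claimFunc func(worker int) (id int, ok bool)

// runRound starts n workers at once and returns every row ID that was
// claimed by more than one worker, mapped to the workers that claimed it.
func runRound(n int, claim claimFunc) map[int][]int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		claimed = make(map[int][]int)
	)
	for w := 0; w < n; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			id, ok := claim(w)
			if !ok {
				return
			}
			mu.Lock()
			claimed[id] = append(claimed[id], w)
			mu.Unlock()
		}(w)
	}
	wg.Wait()

	dups := make(map[int][]int)
	for id, workers := range claimed {
		if len(workers) > 1 {
			sort.Ints(workers)
			dups[id] = workers
		}
	}
	return dups
}

// countFailures repeats the round and counts the rounds in which at least
// one row was claimed twice. newClaim builds a fresh claimFunc per round,
// which is where a real harness would reset its tables.
func countFailures(rounds, workers int, newClaim func() claimFunc) int {
	failures := 0
	for r := 0; r < rounds; r++ {
		if len(runRound(workers, newClaim())) > 0 {
			failures++
		}
	}
	return failures
}

func main() {
	// Fake claim function for demonstration only: every worker gets its
	// own row, so no round should report a duplicate.
	distinct := func() claimFunc {
		return func(w int) (int, bool) { return w, true }
	}
	fmt.Println("failures:", countFailures(100, 20, distinct))
}
```

In the real test the claim function opens a transaction, runs the queries
and commits, so a non-zero failure count means two workers were handed the
same row.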

What would be a more convenient way for me to provide this for reproduction?

Thanks
Glen

On Thu, 29 Sept 2022 at 03:41, Zhang Mingli <zmlpostgres(at)gmail(dot)com> wrote:

> Hi,
>
> On Sep 29, 2022, 00:56 +0800, Glen Mailer <glen(at)geckoboard(dot)com>, wrote:
>
> Hello everyone
>
> I believe I've run into a bug in the behaviour of SKIP LOCKED, where I
> have a program that implements a queue with concurrent workers SELECTing
> work from some shared tables.
>
> The code in question does a LEFT JOIN across two tables with a FOR UPDATE
> on the left table and a SKIP LOCKED clause, and then UPDATEs or INSERTs
> rows into the table on the right side of the JOIN in a way that causes
> subsequent executions of the same query to no longer match those rows.
> However, when run concurrently I'm seeing the same row be selected by
> multiple workers - which shouldn't be possible based on my understanding of
> the relevant semantics of these operations. Perhaps I'm just holding it
> wrong, but I would have expected the FOR UPDATE lock on the left table to
> be sufficient to avoid overlapping results.
>
> I have extracted a fairly minimal reproducing case from our production
> code, which includes some Go code as a test harness to run the queries
> concurrently enough to demonstrate the problem - this can be found at
> https://github.com/glenjamin/postgres-skip-locked-surprise
> I wasn't sure how much detail from that reproducing case to repeat in this
> email, so I've only gone with an outline of the observed and expected
> behaviour - but I can try and add more detail to this thread if desired
>
> Cheers
> Glen
>
> According to doc:
>
> With SKIP LOCKED, any selected rows that cannot be immediately locked are
> skipped. Skipping locked rows provides an inconsistent view of the data, so
> this is not suitable for general purpose work, but can be used to avoid
> lock contention with multiple consumers accessing a queue-like table.
>
> this can be found at
> https://github.com/glenjamin/postgres-skip-locked-surprise
>
> And a golang script is not convenient for hackers to reproduce. Could you
> provide some steps to produce the bug stably if it really was ?
>
> Regards,
> Zhang Mingli
>
>
