Re: BUG #17141: SELECT LIMIT WITH TIES FOR UPDATE SKIP LOCKED returns wrong number of rows

From: Emil Iggland <emil(at)iggland(dot)com>
To: David Christensen <david(dot)christensen(at)crunchydata(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, Surafel Temesgen <surafel3000(at)gmail(dot)com>
Subject: Re: BUG #17141: SELECT LIMIT WITH TIES FOR UPDATE SKIP LOCKED returns wrong number of rows
Date: 2021-08-12 06:47:56
Message-ID: 58fd3519-8866-daed-0ca5-768f5bab1c00@iggland.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I continued trying the variations in the linked thread.

* Sub-query
BEGIN;
SELECT * FROM (SELECT * FROM queue
ORDER BY task DESC
FETCH NEXT 1 ROWS WITH TIES) t
FOR UPDATE SKIP LOCKED;
COMMIT;

This behaves the same way, this does not work around the bug.
The same goes for my previous "find" with NEXT. I can not replicate the
working state, I must have done something wrong last night.

I added some more tasks with the same number in order to see if there
was a problem with the first row, or with the count.

I now have the following counts:
task count(*)
180 2
280 2
380 4
480 3
580 2

I attempted to select multiple tasks at the same time, representing a
case where a worker might select multiple tasks.

SELECT * FROM queue
ORDER BY task DESC
FETCH FIRST 3 ROWS WITH TIES
FOR UPDATE SKIP LOCKED;

Here I get three rows back (580), (580), (480)

If I run
SELECT * FROM queue ORDER BY task DESC
FETCH FIRST 3 ROWS WITH TIES;
I get back 5 rows (580), (580), (480), (480), (480) as expected.

/Emil

On 2021-08-11 23:39, David Christensen wrote:
>
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
>
>> On 2021-Aug-11, PG Bug reporting form wrote:
>>
>>> BEGIN;
>>> SELECT * FROM queue
>>> ORDER BY task DESC
>>> FETCH FIRST 1 ROWS WITH TIES
>>> FOR UPDATE SKIP LOCKED;
>>> /* Some work to be done here */
>>> COMMIT;
>>>
>>> select version();
>>> PostgreSQL 13.3, compiled by Visual C++ build 1914, 64-bit
>>>
>>> Expected result Worker 1: (580), (580), Actual result Worker 1: (580), (580)
>>> Expected result Worker 2: (480), (480), Actual result Worker 2: (480)
>>
>> Ouch, we already saw this actually:
>> https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org
>> The problem is that the first worker locks the first (480) row (even
>> though it does not return it), so the second worker skips it due to SKIP
>> LOCKED.
>>
>> I have this on my list of things to look at, but it's not at the top
>> yet sadly ...
>
> Yeah, I'd looked at this when I found it, and short of detecting the situation "WITH TIES FOR UPDATE
> SKIP LOCKED" and erroring out, it seems like it would require adding in infrastructure that we don't
> support (AFAIK) with unlocking an already locked row inside a transaction or reworking the order of
> LockRows and Limit such that Limit comes first (and itself handles the WITH TIES) before handing to
> LockRows. Either way (other than the error), it seems to be a fairly invasive change.
>
> If someone has another idea on how to handle this, I could take a stab at things. Detecting the
> situation and erroring seems like the easiest way to handle so you're at least not getting back bad
> results, though I agree that the functionality would be useful if we *could* support it somehow.
>
> David
>

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Heikki Linnakangas 2021-08-12 07:40:35 Re: BUG #17142: COPY ignores client_encoding for octal digit characters
Previous Message Masahiko Sawada 2021-08-12 02:45:59 Re: I/O timigns don't include time for temp buffers