RE: Row locks, SKIP LOCKED, and transactions

From: Steven Winfield <Steven(dot)Winfield(at)cantabcapital(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RE: Row locks, SKIP LOCKED, and transactions
Date: 2019-12-18 11:33:07
Message-ID: E9FA92C2921F31408041863B74EE4C2001E7665213@CCPMAILDAG03.cantab.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> Huh. SERIALIZABLE shouldn't allow two transactions to see no result row
> for a given ID and then insert a result row for that ID. One of those
> transactions should have to roll back, because otherwise it'd be
> incompatible with both serial orderings of the two transactions.

Sorry for the misunderstanding - I wasn't suggesting that.
Even at the serializable level, W2 can see a row that is unlocked by W1's commit despite W2's snapshot being taken before W1 commits.
Carrying on my example, W2 would indeed fail to insert a result(id=1) row.

> Conceptually, the thing you really need to lock for this to work is the
> result row that isn't there yet, so that some overlapping transaction
> doesn't try to lock the same absent thing. Unfortunately, our system for
> locking things that aren't there isn't there either.
> Some articles on serializability talk about "materialising the conflict",
> which means locking some other surrogate thing that "covers" a gap you are
> interested in. You might think the job row would do the trick, but since
> we don't recheck the condition (that is, recheck that there is no
> corresponding result because you don't update the job row), no cigar.

I like the concept of "materialising the conflict", that’s a useful way of thinking about it - thanks.

> You could also use plain old pg_try_advisory_xact_lock(id), because it just
> locks integers, and they're always there.

Yeah, I tried this, and might give it another go. A naïve attempt failed for a similar reason.

>
> SERIALIZABLE deals with that type of magic internally (it locks gaps in
> key ranges by predicate-locking a physical btree or hash page that you'd
> need to write on to insert a row with a matching key, which is how it
> discovers a conflict between one transaction that went looking for key=42
> but didn't find it and another that later writes key=42), but, as
> mentioned, SERIALIZABLE doesn't really allow concurrency with this
> workload, and you specified that you wanted concurrency with SKIP LOCKED
> (but I think you'd have the same problem without it; SKIP LOCKED just gets
> you the wrong answer faster).
>
> There are various ways you could deal with this, but I'd probably go for a
> simple scheme where you only have to consult a single row to know if you
> can claim it. You could still put the results into a separate table, but
> use job.state to find work, and set it to DONE when you insert the result.
> It may also be possible to add no new columns but do a dummy update to the
> job row to get the join qual rechecked, but I'm not sure if that'd work.
> Another reason to add a state column to the job table is so that you can
> put a conditional index on it so you can find jobs to be done very
> quickly, if you're not planning to remove the ones that are done.

Thanks. I rejected the idea of doing a dummy update to the locked row as I wanted to avoid too much extra WAL - the real table originally had quite a few more columns than the toy example, but it's much slimmer now so this could be a viable option.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Josef Šimánek 2019-12-18 12:10:17 Re: REINDEX VERBOSE unknown option
Previous Message Steven Winfield 2019-12-18 10:42:49 RE: Row locks, SKIP LOCKED, and transactions