Re: Exposing the lock manager's WaitForLockers() to SQL

From: Will Mortensen <will(at)extrahop(dot)com>
To: Marco Slot <marco(dot)slot(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, marco(at)citusdata(dot)com
Subject: Re: Exposing the lock manager's WaitForLockers() to SQL
Date: 2023-01-11 09:59:38
Message-ID: CAMpnoC4vnjs7PMgrmpsR1e8R7byGNS06=Z=Ey2WMMR8xc6+FHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Marco, thanks for the reply! Glad to know you'd find it useful too. :-)

On Tue, Jan 10, 2023 at 1:01 AM Marco Slot <marco(dot)slot(at)gmail(dot)com> wrote:
> I'm wondering whether it could be an option of the LOCK command.
> (LOCK WAIT ONLY?)

I assume that's doable, but just from looking at the docs, it might be
a little confusing. For example, at least if we use
WaitForLockersMultiple(), waiting for multiple tables would happen in
parallel (which I think is good), while locking them is documented to
happen sequentially. Also, normal LOCK is illegal outside a
transaction, but waiting makes perfect sense. (Actually, normal LOCK
makes sense too, if the goal was just to wait. :-) )

By contrast, while LOCK has NOWAIT, and SELECT's locking clause
has NOWAIT and SKIP LOCKED, they only change the blocking/failure
behavior, while success still means taking the lock and has the same
semantics.

But I'm really no expert on SQL syntax or typical practice for things like
this. Anything that works is fine with me. :-)

====

As a possibly superfluous sidebar, I wanted to correct this part of my
original message:

> On Fri, Dec 23, 2022 at 11:43 AM Will Mortensen <will(at)extrahop(dot)com> wrote:
> > pg_sequence_last_value() (still undocumented!) can be used to
> > obtain an instantaneous upper bound on the sequence values
> > that have been returned by nextval(), even if the transaction
> > that called nextval() hasn't yet committed.

This is true, but not the most important part of making this scheme
work: as you mentioned in the Citus blog post, to avoid missing rows,
we need (and this gives us) an instantaneous *lower* bound on the
sequence values that could be used by transactions that commit after
we finish waiting (and start processing). This doesn't work with
sequence caching, since without somehow inspecting all sessions'
sequence caches, rows with arbitrarily old/low cached sequence
values could be committed arbitrarily far into the future, and we'd
fail to process them.

As you also implied in the blog post, the upper bound is what
allows us to also process each row *exactly* once (instead of at
least once) and in sequence order, if desired.

So those are the respective justifications for both arms of the
WHERE clause: id > min_id AND id <= max_id .

On Tue, Jan 10, 2023 at 1:01 AM Marco Slot <marco(dot)slot(at)gmail(dot)com> wrote:
>
> On Fri, Dec 23, 2022 at 11:43 AM Will Mortensen <will(at)extrahop(dot)com> wrote:
> > We'd like to be able to call the lock manager's WaitForLockers() and
> > WaitForLockersMultiple() from SQL. Below I describe our use case, but
> > basically I'm wondering if this:
> >
> > 1. Seems like a reasonable thing to do
> >
> > 2. Would be of interest upstream
> >
> > 3. Should be done with a new pg_foo() function (taking an
> > oid?), or a whole new SQL command, or something else
>
> Definitely +1 on adding a function/syntax to wait for lockers without
> actually taking a lock. The get sequence value + lock-and-release
> approach is still the only reliable scheme I've found for reliably and
> efficiently processing new inserts in PostgreSQL. I'm wondering
> whether it could be an option of the LOCK command. (LOCK WAIT ONLY?)
>
> Marco

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2023-01-11 10:04:17 RE: [Proposal] Add foreign-server health checks infrastructure
Previous Message shveta malik 2023-01-11 09:57:13 Re: Time delayed LR (WAS Re: logical replication restrictions)