Re: Latches with weak memory ordering (Re: max_wal_senders must die)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Latches with weak memory ordering (Re: max_wal_senders must die)
Date: 2010-11-19 02:17:40
Message-ID: AANLkTimysOTF2sajOf1E=YG1ETVFqaUQKkqi6fuwaKhP@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 18, 2010 at 5:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, Nov 15, 2010 at 11:12 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Hmm ... I just remembered the reason why we didn't use a spinlock in
>>> these functions already.  Namely, that it's unsafe for a signal handler
>>> to try to acquire a spinlock that the interrupted code might be holding.
>
>> The signal handler just checks a process-local, volatile variable
>> called "waiting" (which should be fine) and then sends a self-pipe
>> byte.  It deliberately *doesn't* take a spinlock.
>
> I'm not talking about latch_sigusr1_handler.  I'm talking about the
> several already-existing signal handlers that call SetLatch.  Now maybe
> it's possible to prove that none of those can fire in a process that's
> mucking with the same latch in-line, but that sounds fragile as heck;
> and anyway what of future usage?  Given that precedent, somebody is
> going to write something unsafe at some point, and it'll fail only often
> enough to be seen in the field but not in our testing.

Oh, I get it. You're right. We can't possibly assume that that we're
not trying to set the latch for our own process, because that's the
whole point of having the self-pipe code in the first place.

How about changing the API so that the caller must use one function,
say, SetOwnedLatch(), to set a latch that they own, and another
function, say, SetNonOwnedLatch(), to set a latch that they do not
own? The first can simply set is_set (another process might fail to
see that the latch has been set, but the worst thing they can do is
set it over again, which should be fairly harmless) and send a
self-pipe byte. The second can take the spin lock, set is_set, read
the owner PID, release the spin lock, and send a signal. WaitLatch()
can similarly take the spin lock before reading is_set. This requires
the caller to know whether they are setting their own latch or someone
else's latch, but I don't believe that's a problem given current
usage.

I'm all in favor of having some memory ordering primitives so that we
can try to implement better algorithms, but if we use it here it
amounts to a fairly significant escalation in the minimum requirements
to compile PG (which is bad) rather than just a performance
optimization (which is good).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Itagaki Takahiro 2010-11-19 02:40:05 Re: UNNEST ... WITH ORDINALITY (AND POSSIBLY OTHER STUFF)
Previous Message Cédric Villemain 2010-11-19 01:34:34 Re: final patch - plpgsql: for-in-array