Re: Listen / Notify - what to do when the queue is full

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Listen / Notify - what to do when the queue is full
Date: 2010-01-19 09:58:33
Message-ID: dc7b844e1001190158m548a215fv5e6f62faaab051a1@mail.gmail.com
Lists: pgsql-hackers

Hi Jeff,

Thanks a lot for your review. I will reply to it in detail later, but
I'd like to answer your two main questions right away.

On Tue, Jan 19, 2010 at 8:08 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> * AsyncCommitOrderLock
>
> I believe this needs a re-think. What is the real purpose for
> AsyncCommitOrderLock, and can we achieve that another way? It seems
> that you're worried about a transaction that issues a LISTEN and
> committing not getting a notification from a NOTIFYing transaction
> that commits concurrently (and slightly after the LISTEN).

Yes, that is exactly the point. However, I am not worried about a
notification getting lost but rather about determining the visibility
of the notifications.

In the end we need to know the order in which the LISTEN, UNLISTEN
and NOTIFY transactions commit to find out who should receive which
notifications. Since you cannot determine after the fact whether xid1
committed before or after xid2, I enforced the order with an LWLock
and by saving the list of xids currently being committed.
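
To illustrate why the commit order matters, here is a minimal sketch in
C. It is not code from the patch: the CommitEvent type, the example_log
array and should_receive() are hypothetical names, and the sketch simply
assumes that the commit order is already known (which is what the lock
is meant to guarantee). Given that order, deciding who receives a
notification is a simple replay of the earlier events.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event kinds, recorded in the order the transactions commit. */
typedef enum { EV_LISTEN, EV_UNLISTEN, EV_NOTIFY } EventKind;

typedef struct {
    EventKind   kind;
    int         backend;   /* backend whose transaction committed the event */
    const char *channel;
} CommitEvent;

/* Illustrative commit log; array position is the commit order. */
static const CommitEvent example_log[] = {
    { EV_LISTEN,   1, "jobs" },
    { EV_NOTIFY,   2, "jobs" },
    { EV_LISTEN,   3, "jobs" },
    { EV_UNLISTEN, 1, "jobs" },
    { EV_NOTIFY,   2, "jobs" },
};

/*
 * Returns true if `backend` should receive the NOTIFY at log[notify_idx].
 * Only LISTEN / UNLISTEN events that committed before the NOTIFY count.
 */
static bool
should_receive(const CommitEvent *log, size_t notify_idx, int backend)
{
    bool        listening = false;
    const char *channel = log[notify_idx].channel;

    for (size_t i = 0; i < notify_idx; i++) {
        if (log[i].backend != backend || strcmp(log[i].channel, channel) != 0)
            continue;
        if (log[i].kind == EV_LISTEN)
            listening = true;
        else if (log[i].kind == EV_UNLISTEN)
            listening = false;
    }
    return listening;
}
```

With this log, backend 1 receives the first NOTIFY but not the second,
and backend 3 receives only the second. If the relative order of events
2 and 3 were unknown, the answer for backend 3 would be undefined.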

There are also two examples in
http://archives.postgresql.org/pgsql-hackers/2009-12/msg00790.php
about that issue.

> But SignalBackends() is called after transaction commit, and should signal
> all backends who committed a LISTEN before that time, right?

Yes, every listening backend is signaled, but that doesn't help to
find out the exact order of the almost-concurrent events that
happened before.

> * The transaction IDs are used because Send_Notify() is called before
> the AsyncCommitOrderLock acquire, and so the backend could potentially
> be reading uncommitted notifications that are "about" to be committed
> (or aborted). Then, the queue is not read further until that transaction
> completes. That's not really commented effectively, and I suspect the
> process could be simpler. For instance, why can't the backend always
> read all of the data from the queue, notifying if the transaction is
> committed and saving to a local list otherwise (which would be checked
> on the next wakeup)?

It's true that the backends could always read up to the end of the
queue and copy everything into local memory. However, you still
need to apply the same checks before you deliver the notifications:
you need to make sure that the transaction has committed and that you
were listening to the channels of the notifications at the time they
got sent / committed. Also, you really need to copy _everything_,
because you could start to listen to a channel after copying its
uncommitted notifications.
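
As a rough sketch of the "copy everything" alternative (hypothetical
names, not code from the patch), the per-entry decision for a locally
saved notification could look like the following. The point it shows is
that the listening check can only be made once the sender has committed,
so an entry from an in-progress transaction has to be kept even if the
backend is not listening to that channel at copy time.

```c
#include <stdbool.h>

/* Hypothetical state of the transaction that sent a notification. */
typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } XactState;

/* What to do with a notification held in the backend-local list. */
typedef enum { ACT_KEEP, ACT_DELIVER, ACT_DROP } Action;

/*
 * listening_at_commit: was this backend listening on the channel when the
 * sending transaction committed? It is only meaningful for committed
 * senders and is ignored otherwise.
 */
static Action
classify_saved_notification(XactState sender_state, bool listening_at_commit)
{
    if (sender_state == XACT_IN_PROGRESS)
        return ACT_KEEP;        /* outcome unknown: re-check on next wakeup */
    if (sender_state == XACT_ABORTED)
        return ACT_DROP;        /* never became visible */
    return listening_at_commit ? ACT_DELIVER : ACT_DROP;
}
```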

There are other reasons (tail pointer management and signaling
strategy) but in the end it seemed more straightforward to stop as
soon as we hit an uncommitted notification. We will receive a signal
for it eventually anyway and can then start again and read further.
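
The stop-at-first-uncommitted strategy can be sketched roughly as below.
This is a simplified model and not the patch itself: QueueEntry,
read_until_uncommitted() and the flat array standing in for the queue
are assumptions made for illustration, and the per-channel listening
check is left out.

```c
#include <stddef.h>

/* Hypothetical state of the transaction that wrote a queue entry. */
typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } XactState;

typedef struct {
    XactState sender_state;
    int       id;             /* stands in for channel + payload */
} QueueEntry;

/*
 * Reads entries from *pos up to (not including) head. Committed entries
 * are copied to `delivered`, aborted ones are skipped, and reading stops
 * at the first entry whose sender is still in progress. *pos is left at
 * that entry, so the next call (after the next signal) resumes there.
 * Returns the number of entries delivered.
 */
static size_t
read_until_uncommitted(const QueueEntry *queue, size_t head, size_t *pos,
                       int *delivered, size_t cap)
{
    size_t n = 0;

    while (*pos < head) {
        const QueueEntry *e = &queue[*pos];

        if (e->sender_state == XACT_IN_PROGRESS)
            break;                      /* wait for the sender to finish */
        if (e->sender_state == XACT_COMMITTED) {
            if (n == cap)
                break;                  /* output full; resume here later */
            delivered[n++] = e->id;
        }
        (*pos)++;
    }
    return n;
}
```

Because the read position never moves past an undecided entry, the
backend needs no local list of pending notifications in this model.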

Also, I think (though I have no numbers to back it up) that it makes
the backends work more on the same SLRU pages.

Joachim
