Re: Listen / Notify - what to do when the queue is full

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Arnaud Betremieux <arnaud(dot)betremieux(at)keyconsulting(dot)fr>
Subject: Re: Listen / Notify - what to do when the queue is full
Date: 2010-02-17 10:16:39
Message-ID: dc7b844e1002170216s3503f0fbta4148bf63361aaee@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 16, 2010 at 11:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Joachim Wieland <joe(at)mcknight(dot)de> writes:
>> [ listen/notify patch ]
>
> I found a number of implementation problems having to do with wraparound
> behavior and error recovery.  I think they're all fixed, but any
> remaining bugs are probably my fault not yours.

First, thanks for the rework you have done and thanks for applying this.

While I can see a lot of improvements over my version, I think the
logic in asyncQueueProcessPageEntries() needs to be reordered:

+ static bool
+ asyncQueueProcessPageEntries(QueuePosition *current,
+ QueuePosition stop,
+ char *page_buffer)
[...]
+ do
+ {
[...]
+ /*
+ * Advance *current over this message, possibly to the next page.
+ * As noted in the comments for asyncQueueReadAllNotifications, we
+ * must do this before possibly failing while processing the message.
+ */
+ reachedEndOfPage = asyncQueueAdvance(current, qe->length);
[...]
+ if (TransactionIdDidCommit(qe->xid))
[...]
+ else if (TransactionIdDidAbort(qe->xid))
[...]
+ else
+ {
+ /*
+ * The transaction has neither committed nor aborted so far,
+ * so we can't process its message yet. Break out of the loop.
+ */
+ reachedStop = true;
+ break;

In the beginning you are advancing *current but later on you could
find out that the transaction is still running. As the position in the
queue has already advanced you would miss one notification here
because you'd restart directly behind this notification in the
queue...

Joachim

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2010-02-17 10:27:38 Re: Streaming replication on win32, still broken
Previous Message Zdenek Kotala 2010-02-17 10:08:07 codlin_month is up and complain - PL/Python crash