LISTEN/NOTIFY race condition?

From: Laurent Birtz <laurent(dot)birtz(at)kryptiva(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: LISTEN/NOTIFY race condition?
Date: 2008-03-05 00:17:51
Message-ID: 47CDE6AF.5040400@kryptiva.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hello,

I am wondering if I have encountered a race condition in Postgres that
sometimes cause a notification not to be sent. This post is rather long
since the situation is rather complex and I tried to provide as much
details as possible.

Please note:

I have modified the file src/backend/commands/async.c to make the logging
more verbose. I have not otherwise modified the code. I am using
Postgres 8.3 on Debian/Linux. The sources of Postgres I am using were
taken from Debian's packages (unstable repos). Debian patches were
applied.

Organization of my setup:

I am running a multithreaded daemon that uses Postgres. This daemon has
two threads of interest: the command thread and the event thread. Each
thread uses a dedicated connection to the Postgres database, i.e. there
are two independent connections to the database. Both connections are
asynchronous, i.e. I have called PQsetnonblocking() on both connections
to make them non-blocking.

The command thread inserts data (which I call "events") in a table called
the "event log". The event thread polls the event log table from time to
time to retrieve and process the new events. The command thread never
read data from the event log and the event thread never write data to
the event log.

I use the LISTEN/NOTIFY mechanism to wake up the event thread when a
new event is written in the event log table. An AFTER-INSERT trigger is
created on the event log table so that any insertion in that table sends
a notification to the connections listening on that table, i.e. the event
thread.

The processing done by the event thread is as follow:

1) Connect to the database.
2) Do a LISTEN operation on the event log.
3) Do a SELECT operation on the event log to obtain all the events
inserted so far.
4) Enter the "wait-for-notification" loop. When a notification is
received, redo a SELECT operation to obtain the newly posted events.

The problem I am encountering is that sometimes, when an event is posted
between the steps 2) and 3) above, the following situation occurs:

a) No notification about this event is ever sent to the event thread.
b) The SELECT operation of step 3) does NOT retrieve the posted event.

Hence, the event thread does not retrieve the posted event until a
notification is received because another event was posted. The problem
occurs about 1 time of 100. All attempts to introduce sleep() calls
in the code of the daemon to make the problem more reproductible prevent
the problem from happening (heisenbug).

The following trace presents the steps being done until the problem
is encountered. The output sent to the daemon by Postgres is intermixed
with the output of the daemon itself. I provided comments to describe
what is going on. Clearly, what is shown here is the OBSERVED sequence
of events; the real chronological order can only be guessed at.

// The command thread begins a transaction.
Command thread: executing |start transaction|.
Command thread: executing |create sequence kws_961_event_id_seq|.

// The command thread creates the event table.
Command thread: executing
|create table kws_961_event_log (
id int primary key default nextval('kws_961_event_id_seq'),
type int,
event bytea)|.

NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"kws_961_event_log_pkey" for table "kws_961_event_log"

Command thread: executing
|create or replace function notify_log_961() returns trigger as '
begin if TG_OP = ''INSERT'' then notify kws_961_event_log;
end if; return NEW; end' language plpgsql|.

Command thread: executing
|create trigger notify_log_961 after insert on kws_961_event_log
execute procedure notify_log_961()|.

// The command thread posts an event in the event log.
Command thread: executing |INSERT INTO kws_961_event_log (bla bla bla...

DEBUG: Async_Notify(kws_961_event_log)
CONTEXT: SQL statement "notify kws_961_event_log"
PL/pgSQL function "notify_log_961" line 1 at SQL statement

// The command thread commits the transaction.
Command thread: executing |commit|.

DEBUG: AtCommit_Notify
DEBUG: Pending notification for relation kws_961_event_log. // My msg.
DEBUG: AtCommit_Notify: done

// NOT SHOWN: the command thread posts another event in the event log.

// The listen thread now wants to listen to the event log. At this point
// there are two events in the event log. NOTE: the event thread SENDS
// the query to Postgres, but does not RECEIVE the result yet (see
// below).
Event thread: executing |listen kws_961_event_log|.

DEBUG: Async_Listen(kws_961_event_log,17748)
DEBUG: Async_Listen(kws_961_event_log,17748): got exclusive lock on
listener table
DEBUG: Async_Listen(kws_961_event_log,17748): inserting new tuple
DEBUG: Async_Listen(kws_961_event_log,17748): about to release exclusive
lock
DEBUG: Async_Listen(kws_961_event_log,17748): released exclusive lock

// Meanwhile, the command thread posts the third event, in a transaction.
Command thread: executing |start transaction|.
Command thread: executing |INSERT INTO kws_961_event_log (bla bla bla

DEBUG: Async_Notify(kws_961_event_log)
CONTEXT: SQL statement "notify kws_961_event_log"
PL/pgSQL function "notify_log_961" line 1 at SQL statement

// The command thread commits the transaction.
Command thread: executing |commit|.
DEBUG: AtCommit_Notify

// Note: this message occurs because I printed the content of
// 'static List *pendingNotifies = NIL;' in async.c.
DEBUG: Pending notification for relation kws_961_event_log.

// Notice that NO notification is actually delivered.
DEBUG: AtCommit_Notify: done

// The event thread receives the result of the LISTEN operation.
Event thread: result for listen to workspace 961.

// The event thread polls the event log.
Event thread: |select id, type, event from kws_961_event_log
where id > 0|.

// The event thread receives the poll results. Notice that only two
// events are actually retrieved.
Event thread: fetched 2 events for workspace 961.
Event thread: got event ID 1 type 402784512.
Event thread: got event ID 2 type 402785024.

// No subsequent notification is received, even though PQconsumeInput()
// and PQnotifies() are called subsequently and repeatedly. A
// notification is only received when another event is posted.

From the observed sequence of events, it would appear that the event
thread inserts a tuple in the pg_listener table BEFORE the command thread
actually commits the transaction. However, when this transaction commits,
Postgres does not actually find the event thread's tuple in this table.
Consequently, one has to assume that the tuple in the pg_listener table
is inserted AFTER the AtCommit_Notify() function has executed.

But, when the event thread later polls the event log, it does not
actually retrieve the third event. This suggests that the transaction has
not yet been commited.

I have three hypotheseses at this point:

1) The pg_listener table is desynchronized.

2) The function AtCommit_Notify() does not hold the lock on the
pg_listener table until the transaction is fully committed, so the
event thread has the time to listen and poll the event log in that
small window of time.

3) I screwed up somewhere (most likely hypothesis :-).

Can somebody help me figure out what is going on? That would be much
appreciated!

Thanks in advance,
Laurent Birtz

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Jasen Betts 2008-03-05 00:20:51 BUG #4010: psql does not honour -c after database name
Previous Message Gurjeet Singh 2008-03-04 13:53:49 Fwd: pg_ctl problem