Re: Postgres-R: internal messaging

From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alexey Klyukin <alexk(at)commandprompt(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgres-R: internal messaging
Date: 2008-07-23 20:26:52
Message-ID: 4887940C.4090305@bluegap.ch
Lists: pgsql-hackers

Hi,

what follows are some comments after trying to understand how the
autovacuum launcher works and thoughts on how to apply this to the
replication manager in Postgres-R.

The initial comments in autovacuum.c say:

> If the fork() call fails in the postmaster, it sets a flag in the shared
> memory area, and sends a signal to the launcher.

I note that the shmem area that the postmaster is writing to is pretty
static and not dependent on any other state stored in shmem. That
certainly makes a difference compared to my imessages approach, where a
corruption in the shmem for imessages could also confuse the postmaster.
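The flag-plus-signal pattern quoted above can be sketched in a few lines of C. This is only an illustration, not the code in autovacuum.c: the names ForkStatusArea, postmaster_report_fork_failure and launcher_check_fork_failure are hypothetical, a static variable stands in for the shared memory area, and SIGUSR1 delivered via raise() stands in for the postmaster signalling a separate launcher process. The point it tries to show is that the postmaster side only writes a fixed-layout flag and never reads or follows any other state.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-in for a small, fixed-layout shared memory area. */
typedef struct ForkStatusArea
{
    volatile sig_atomic_t fork_failed;  /* set by postmaster, cleared by launcher */
} ForkStatusArea;

static ForkStatusArea status_area;
static volatile sig_atomic_t got_signal = 0;

/* Launcher's signal handler: only records that a signal arrived. */
void
launcher_sigusr1(int signo)
{
    (void) signo;
    got_signal = 1;
}

void
launcher_setup(void)
{
    status_area.fork_failed = 0;
    got_signal = 0;
    signal(SIGUSR1, launcher_sigusr1);
}

/* Postmaster side: writes one flag, reads nothing from the area. */
void
postmaster_report_fork_failure(void)
{
    status_area.fork_failed = 1;
    raise(SIGUSR1);     /* assumption: a real system would kill(launcher_pid, ...) */
}

/* Launcher side: called from the main loop after waking up. */
bool
launcher_check_fork_failure(void)
{
    if (!got_signal)
        return false;
    got_signal = 0;

    assert(status_area.fork_failed == 0 || status_area.fork_failed == 1);
    if (status_area.fork_failed)
    {
        status_area.fork_failed = 0;
        return true;
    }
    return false;
}
```

Because the postmaster never interprets anything it finds in the area, a corrupted flag can at worst confuse the launcher, not the postmaster.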

Reading on, the 'can_launch' flag in the launcher's main loop makes sure
that only one worker is requested concurrently, so that the launcher
doesn't miss a failure or success notice from either the postmaster or
the newly started worker. The replication manager currently shamelessly
requests as many helper backends as it wants. I think I can change that
without much trouble, and it would certainly make sense.
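As a rough sketch of the "one outstanding request" idea, the following C fragment models a manager that refuses to request another helper until the previous request has been answered. Only the can_launch name comes from the launcher discussed above; the HelperLauncher struct, its fields and the function names are hypothetical, and the real launcher loop does considerably more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for a manager that requests helpers one at a time. */
typedef struct HelperLauncher
{
    bool can_launch;     /* false while a request is still unanswered */
    int  pending_jobs;   /* jobs still waiting for a helper backend */
    int  requests_sent;  /* fork requests sent to the postmaster so far */
    int  started;        /* helpers that reported a successful start */
    int  failed;         /* fork failures reported by the postmaster */
} HelperLauncher;

void
launcher_init(HelperLauncher *l, int jobs)
{
    l->can_launch = true;
    l->pending_jobs = jobs;
    l->requests_sent = 0;
    l->started = 0;
    l->failed = 0;
}

/* One main-loop step: returns true if a request was sent. */
bool
launcher_maybe_request(HelperLauncher *l)
{
    if (!l->can_launch || l->pending_jobs == 0)
        return false;

    l->can_launch = false;      /* block further requests until answered */
    l->requests_sent++;
    return true;
}

/* Called on either notice: fork failure (postmaster) or startup (worker). */
void
launcher_on_result(HelperLauncher *l, bool success)
{
    assert(!l->can_launch);     /* a notice must match an outstanding request */

    if (success)
    {
        l->pending_jobs--;
        l->started++;
    }
    else
        l->failed++;            /* job stays pending and can be retried */

    l->can_launch = true;
}
```

With at most one request in flight, every success or failure notice can be attributed to exactly one request, which is what keeps the manager from missing one.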

What remains is notifying the replication manager when a helper backend
terminates or crashes. Upon normal errors (i.e. elog(ERROR, ...)), the
backend processes themselves should take care of notifying the
replication manager. But crashes are more difficult. IMO the replication
manager needs to stay alive during this reinitialization, to keep the
GCS connection. It can easily detach from shared memory temporarily
(the imessages stuff is the only shmem place it touches, IIRC). The more
difficult aspect is that it must be able to tell whether a backend
applied its transaction *before* it died or not. Thus, after all
backends have been killed, the postmaster needs to hold off on
reinitializing shared memory until the replication manager has consumed
all its messages. (Otherwise we would risk "losing" local transactions,
and probably remote ones as well.)
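The ordering constraint for crash recovery can be stated as a simple predicate. The sketch below is a toy model under stated assumptions, not Postgres-R code: RecoveryState, its counters and all function names are hypothetical, and the message queue is reduced to a count of unread messages. It shows only the rule that shared memory may be reinitialized once no backends are left and the replication manager has read everything they sent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view of the state the postmaster would have to respect. */
typedef struct RecoveryState
{
    int live_backends;    /* backends not yet killed after the crash */
    int unread_messages;  /* messages the manager has not consumed yet */
    int commits_seen;     /* commit notices the manager has consumed */
} RecoveryState;

void
recovery_init(RecoveryState *s, int backends, int messages)
{
    s->live_backends = backends;
    s->unread_messages = messages;
    s->commits_seen = 0;
}

/* Postmaster side: one backend has been killed. */
void
backend_killed(RecoveryState *s)
{
    assert(s->live_backends > 0);
    s->live_backends--;
}

/* Manager side: consume one message left behind by a backend. */
void
manager_consume_one(RecoveryState *s, bool is_commit_notice)
{
    assert(s->unread_messages > 0);
    s->unread_messages--;
    if (is_commit_notice)
        s->commits_seen++;   /* this transaction was applied before the crash */
}

/* Reinitialization is safe only when nothing can be lost anymore. */
bool
can_reinit_shmem(const RecoveryState *s)
{
    return s->live_backends == 0 && s->unread_messages == 0;
}
```

Checking both conditions matters: killing all backends first guarantees no new messages appear, and draining the queue afterwards guarantees that no commit notice is wiped along with the old shared memory.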

So, yes, after thinking about it, detaching the postmaster from shared
memory seems doable for Postgres-R (in the sense of "the postmaster does
not rely on possibly corrupted data in shared memory"). Reinitialization
needs some more thought, but in general that seems like the way to go.

Regards

Markus Wanner
