Re: Clustering features for upcoming developer meeting -- please claim yours!

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Hannu Krosing <hannu(at)2ndquadrant(dot)com>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-cluster-hackers(at)postgresql(dot)org
Subject: Re: Clustering features for upcoming developer meeting -- please claim yours!
Date: 2010-05-11 02:46:24
Message-ID: 4BE8C500.7000509@Yahoo.com
Lists: pgsql-cluster-hackers

On 5/10/2010 6:40 PM, Hannu Krosing wrote:
> On Mon, 2010-05-10 at 17:04 -0400, Jan Wieck wrote:
>> On 5/10/2010 4:25 PM, Marko Kreen wrote:
>> > AFAICS the "agreeable order" should take care of positioning:
>> >
>> > http://wiki.postgresql.org/wiki/ModificationTriggerGDQ#Suggestions_for_Implementation
>> >
>> > This combined with DML triggers that react to invalidate events (like
>> > PgQ ones) should already work fine?
>> >
>> > Are there situations where such setup fails?
>> >
>>
>> That explanation of an agreeable order only solves the problems of
>> placing the DDL into the replication stream between transactions,
>> possibly done by multiple clients.
>
> Why only "between transactions" (whatever that means)?
>
> If all transactions get their event ids from the same non-cached
> sequence, then the event id _is_ a reliable ordering within a set of
> concurrent transactions.
>
> Event id's get serialized (where it matters) by the very locks taken by
> DDL/DML statements on the objects they manipulate.
>
> Once more, for this to work over more than one backend, the sequence
> providing the event id's needs to be non-cached.
>
>> It in no way addresses the problem of one single client executing a
>> couple of updates, modifying the object, then continuing with updates. In
>> this case, there isn't even a transaction boundary at which the DDL
>> happened on the master. And this one transaction could indeed alter the
>> object several times.
>
> How is DDL here different from DML here?
>
> You need to replay DML in the right order too, no ?
>
>> This means that a generalized data queue needs to have hooks, so that
>> DDL triggers can inject their payload into it.
>
> Anything that needs to be replicated, needs "to have hooks" in the
> generalized data queue, so that
>
> a) they get replicated in the right order for each affected object
> a.1) this can be relaxed for related objects in case FKs are
> disabled or deferred until transaction end
> b) they get committed on the subscriber side at transaction (set)
> boundaries of provider.
>
> if you implement the data queue as something non-transactional (non
> pgQ-like), then you need to replicate (i.e. copy over and replay)
>
> c) events from both committed and rolled-back transactions
> d) commits/rollbacks themselves
> e) and apply and/or rollback each individual transaction separately
>
> IOW you mostly re-implement WAL, except at logical level. Which may or
> may not be a good thing, depending on other requirements of the system.
>
> If you do it using pgQ you save by not copying rolled-back data, but you
> do slightly more work on the provider side. You also end up not having
> dead tuples from aborted transactions on the subscriber.
>

So the idea is to have one queue that captures row-level DML events as
well as statement-level DDL. That is certainly possible, and in that case
the event id will indeed provide a usable order for applying these
actions, provided it is taken from a non-cached sequence after all locks
have been taken, as Marko explained.

That event id resembles Slony's action_seq.
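
Just to make that ordering assumption concrete, a minimal sketch of such
a queue could look like the following (all object names are made up for
illustration, and the DDL hook only appears as a comment, since that is
exactly the part we don't have yet):

    -- Non-cached sequence: nextval() is serialized across backends, so
    -- event ids reflect an order taken after the objects' locks are held.
    CREATE SEQUENCE generic_queue_ev_id_seq CACHE 1;

    -- One queue for row-level DML payloads and statement-level DDL alike.
    CREATE TABLE generic_queue (
        ev_id   bigint NOT NULL DEFAULT nextval('generic_queue_ev_id_seq'),
        ev_txid bigint NOT NULL DEFAULT txid_current(),
        ev_type text   NOT NULL,  -- 'INSERT', 'UPDATE', 'DELETE' or 'DDL'
        ev_data text   NOT NULL,  -- serialized row image or statement text
        PRIMARY KEY (ev_id)
    );

    -- Row-level capture trigger; by the time it fires, the row lock and
    -- therefore the event id ordering are already fixed.
    CREATE FUNCTION capture_dml() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'DELETE' THEN
            INSERT INTO generic_queue (ev_type, ev_data)
                VALUES (TG_OP, OLD::text);
        ELSE
            INSERT INTO generic_queue (ev_type, ev_data)
                VALUES (TG_OP, NEW::text);
        END IF;
        RETURN NULL;  -- AFTER trigger, return value is ignored
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER some_table_capture
        AFTER INSERT OR UPDATE OR DELETE ON some_table
        FOR EACH ROW EXECUTE PROCEDURE capture_dml();

    -- A DDL trigger (the hook discussed earlier in the thread) would
    -- insert the statement text into the same queue in the same way.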

What this event id alone does not provide is any point within that
sequence of event ids at which the replica can issue commits. On a busy
server, there may never be such a moment unless the replica applies
things the Slony way instead of in monotonically increasing event id
order. If the idea is to simply record things WAL-style and shove them
off to the replicas, you merely move some of the current overhead off
the master by duplicating it onto every replica.
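
One way to manufacture such commit points is roughly what pgQ does with
its ticks, as far as I understand it: periodically record a txid
snapshot and let the replica apply one batch per tick interval, still in
event id order, committing once per batch. That is also why the sketch
above carries an ev_txid column. Again only a sketch with made-up names:

    -- Record a consistent point ("tick") every few seconds.
    CREATE TABLE queue_tick (
        tick_id   bigserial PRIMARY KEY,
        tick_snap txid_snapshot NOT NULL DEFAULT txid_current_snapshot()
    );

    -- Batch between tick $1 and tick $1 + 1: all events whose transaction
    -- had not committed at the previous tick but had committed by the
    -- current one, applied in ev_id order, with one commit at the end.
    SELECT q.ev_id, q.ev_type, q.ev_data
      FROM generic_queue q, queue_tick prev, queue_tick cur
     WHERE prev.tick_id = $1
       AND cur.tick_id  = $1 + 1
       AND NOT txid_visible_in_snapshot(q.ev_txid, prev.tick_snap)
       AND     txid_visible_in_snapshot(q.ev_txid, cur.tick_snap)
     ORDER BY q.ev_id;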

There are more things to consider about such a generalized queue,
especially if we think about adding it to core.

One, for example, is version independence. Slony, and I think Londiste
too, can replicate across PostgreSQL server versions. And experience
shows that no communications protocol, on-disk format or the like is
ever set in stone. So we need to think about how this queue can remain
backwards compatible without introducing more overhead than we are
trying to save right now.
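
Purely as an illustration of the kind of escape hatch we would have to
leave ourselves (not a concrete proposal), even something as simple as a
format tag per event would already help:

    -- Stamp every event with the payload format version it was written
    -- in, so a consumer attached to an older or newer release can pick
    -- the matching decoder instead of assuming one wire format forever.
    ALTER TABLE generic_queue
        ADD COLUMN ev_format smallint NOT NULL DEFAULT 1;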

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
