Re: Clustering features for upcoming developer meeting -- please claim yours!

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Jan Wieck <JanWieck(at)yahoo(dot)com>
Cc: Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-cluster-hackers(at)postgresql(dot)org
Subject: Re: Clustering features for upcoming developer meeting -- please claim yours!
Date: 2010-05-11 09:24:59
Message-ID: AANLkTilw2BtuDGCy_jxNOJuKR1OpD7FLDytFlnxo1l08@mail.gmail.com
Lists: pgsql-cluster-hackers

On 5/11/10, Jan Wieck <JanWieck(at)yahoo(dot)com> wrote:
> On 5/10/2010 6:40 PM, Hannu Krosing wrote:
> > On Mon, 2010-05-10 at 17:04 -0400, Jan Wieck wrote:
> > > On 5/10/2010 4:25 PM, Marko Kreen wrote:
> > > > AFAICS the "agreeable order" should take care of positioning:
> > > >
> > > > http://wiki.postgresql.org/wiki/ModificationTriggerGDQ#Suggestions_for_Implementation
> > > > This combined with DML triggers that react to invalidate events (like
> > > > PgQ ones) should already work fine?
> > > > Are there situations where such setup fails?
> > > >

> So the idea is to have one queue that captures row level DML events as well
> as statement level DDL. That is certainly possible and in that case the
> event id will indeed provide a usable order for applying these actions, if
> it is taken from a non-cached sequence after all locks have been taken, as
> Marko explained.
>
> That event id resembles Slony's action_seq.
>
> The thing this event id alone does not provide is any point where inside
> that sequence of event id's the replica can issue commits. On a busy server,
> there may never be any such moment unless the replica applies things the
> Slony way instead of in monotonically increasing event id's. If your idea is
> to simply record things WAL style and shove them off to the replicas, you
> just move some of the current overhead from the master by duplicating it
> onto every replica.

I'm not sure what overhead you are talking about.

Are you trying to get rid of the current snapshot-based grouping
of events? Why?
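
To make "snapshot-based grouping" concrete, here is a minimal sketch,
not the actual PgQ or Slony code. It assumes each event carries the
txid that wrote it, and that a ticker periodically records a snapshot
with the usual xmin / xmax / in-progress list. The names Snapshot,
batch_between and the sample data are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a txid snapshot recorded at a tick."""
    xmin: int                      # every txid below this has finished
    xmax: int                      # every txid at or above this had not started
    xip: frozenset = frozenset()   # txids in [xmin, xmax) still in progress

    def sees(self, txid: int) -> bool:
        """True if txid had already finished when the snapshot was taken."""
        if txid < self.xmin:
            return True
        if txid >= self.xmax:
            return False
        return txid not in self.xip


def batch_between(events, prev_tick, cur_tick):
    """Events visible at cur_tick but not at prev_tick, in event id order."""
    picked = [e for e in events
              if cur_tick.sees(e["txid"]) and not prev_tick.sees(e["txid"])]
    return sorted(picked, key=lambda e: e["ev_id"])


# Sample data (assumed): txid 101 is a long-running transaction that is
# still open at tick1, so its event lands in the later batch.
events = [
    {"ev_id": 1, "txid": 100},
    {"ev_id": 2, "txid": 101},
    {"ev_id": 3, "txid": 102},
    {"ev_id": 4, "txid": 103},
]
tick0 = Snapshot(xmin=100, xmax=100)
tick1 = Snapshot(xmin=101, xmax=103, xip=frozenset({101}))
tick2 = Snapshot(xmin=104, xmax=104)

batch1 = [e["ev_id"] for e in batch_between(events, tick0, tick1)]
batch2 = [e["ev_id"] for e in batch_between(events, tick1, tick2)]
```

In this sketch each batch boundary is a point where the replica can
commit, even though event 2 has a lower id than event 3 and is applied
one batch later.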

> There are more things to consider about such a generalized queue,
> especially if we think about adding it to core.
>
> One for example is version independence. Slony and I think Londiste too can
> replicate across PostgreSQL server versions. And experience shows us that no
> communications protocol, on disk format or the like is ever set in stone. So
> we need to think how this queue can become backwards compatible without
> introducing more overhead than we try to save right now.

I'm guessing you are trying to do 2 more things:

1) Add queue operations to SQL syntax
2) Non-table custom storage.

I'm indifferent to 1) and dubious about how big a win 2) can bring,
but glad to be proven wrong.

But there's another issue - our experience with PgQ has shown
that a generic queue also means generic code operating on it,
which means bugs. And transactional queue readers are not
allowed to drop events when they hit a problem. So when something
goes wrong, admins need to examine the queue and delete/modify the events.

Of course, the bug causing the problem also needs to be fixed,
but bugfixing does not repair the queue, that must be done
manually.
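
A rough sketch of what such a manual repair looks like when the queue
is a plain table. This is not PgQ code: it uses an in-memory SQLite
database as a stand-in for PostgreSQL, and the table layout, the
apply_event consumer and the per-row delete are all assumptions made
only to keep the example self-contained.

```python
import sqlite3


def apply_event(payload: str) -> int:
    """Hypothetical consumer logic: it only accepts integer payloads."""
    return int(payload)


def consume(conn):
    """Apply events in id order; stop at the first failure, never drop it."""
    applied = []
    rows = conn.execute(
        "SELECT ev_id, ev_data FROM queue ORDER BY ev_id").fetchall()
    for ev_id, ev_data in rows:
        try:
            applied.append(apply_event(ev_data))
        except ValueError:
            return applied, ev_id          # reader is stuck on this event
        conn.execute("DELETE FROM queue WHERE ev_id = ?", (ev_id,))
    return applied, None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE queue (ev_id INTEGER PRIMARY KEY, ev_data TEXT)")
conn.executemany("INSERT INTO queue (ev_data) VALUES (?)",
                 [("1",), ("oops",), ("3",)])

# First pass: event 1 is applied, the reader stops at the broken event 2.
first_pass, stuck_on = consume(conn)

# The admin inspects the queue with ordinary SQL and fixes the event by hand.
conn.execute("UPDATE queue SET ev_data = '2' WHERE ev_id = ?", (stuck_on,))

# Second pass: the reader continues from where it stopped.
second_pass, still_stuck = consume(conn)
remaining = conn.execute("SELECT count(*) FROM queue").fetchone()[0]
```

The point is only that the UPDATE in the middle is ordinary SQL against
an ordinary table; no special tooling is needed to look at or fix the
stuck event.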

If 1) and/or 2) means that this possibility is removed,
it will be quite a big hit to the generic-ness of the GDQ.

In that respect I would prefer to fix any remaining problems
(what are they?) with plain queue tables, even if
the "NoSQL" queueing could perform significantly better.

--
marko
