GDQ implementation (was: Re: Clustering features for upcoming developer meeting -- please claim yours!)

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-cluster-hackers(at)postgresql(dot)org
Subject: GDQ implementation (was: Re: Clustering features for upcoming developer meeting -- please claim yours!)
Date: 2010-05-11 12:33:12
Message-ID: 4BE94E88.3010008@Yahoo.com
Lists: pgsql-cluster-hackers

I changed the subject line because we are diving deep into
implementation details.

On 5/11/2010 5:24 AM, Marko Kreen wrote:
> On 5/11/10, Jan Wieck <JanWieck(at)yahoo(dot)com> wrote:
>> The thing this event id alone does not provide is any point where inside
>> that sequence of event id's the replica can issue commits. On a busy server,
>> there may never be any such moment unless the replica applies things the
>> Slony way instead of in monotonically increasing event id's. If your idea is
>> to simply record things WAL style and shove them off to the replicas, you
>> just move some of the current overhead from the master by duplicating it
>> onto every replica.
>
> I'm not sure what overhead you are talking about.
>
> Are you trying to get rid of current snapshot-based grouping
> of events? Why?

The problem statement on the Wiki page and Itagaki's comments about
non-table storage of the queue made it look to me as if some WAL-style
flat-file approach was being sought.

I am glad we agree that we cannot get rid of the snapshot-based
grouping. That, and the (IMHO required) table storage, is the overhead I
was talking about. We should be clear that we cannot get rid of that
grouping, and that however many log segments are used (Slony currently 2,
Londiste default 3), the oldest running transaction on the master
determines which log segments can be truncated. The more log segments
are in use, the more UNION keywords may appear in the query that
selects from the log.
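
To make the three points above concrete, here is a rough Python sketch. It
is not code from Slony or Londiste. The table names (log_1, log_2, ...),
the column names (ev_id, ev_txid, ev_data) and the closed_at_txid
bookkeeping are all made up for illustration, and it ignores txid
wraparound and the question of whether consumers have finished with a
segment.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Snapshot:
    """Simplified transaction snapshot (hypothetical type for this sketch)."""
    xmin: int                     # every txid below this has finished
    xmax: int                     # every txid at or above this had not started
    in_progress: FrozenSet[int]   # txids in [xmin, xmax) that were still running

    def sees(self, txid: int) -> bool:
        """True if txid had already finished when the snapshot was taken."""
        if txid < self.xmin:
            return True
        if txid >= self.xmax:
            return False
        return txid not in self.in_progress


def in_batch(txid: int, prev: Snapshot, curr: Snapshot) -> bool:
    """Snapshot-based grouping: an event belongs to the batch between two
    snapshots if its transaction is visible in the newer snapshot but was
    not yet visible in the older one."""
    return curr.sees(txid) and not prev.sees(txid)


def batch_query(segments: List[str]) -> str:
    """Build the log selection; each extra segment adds one UNION ALL.
    Table and column names are assumptions, not a real schema."""
    parts = [f"SELECT ev_id, ev_txid, ev_data FROM {name}" for name in segments]
    return " UNION ALL ".join(parts) + " ORDER BY ev_id"


def can_truncate(closed_at_txid: int, oldest_running_txid: int) -> bool:
    """Assumed rule: closed_at_txid is the next txid to be assigned at the
    moment writers were switched to another segment. Any older transaction
    may still write into the closed segment, so it is only safe to truncate
    once the oldest running transaction is at or past that point."""
    return oldest_running_txid >= closed_at_txid
```

For example, with prev = Snapshot(10, 15, frozenset({12})) and
curr = Snapshot(14, 20, frozenset({16})), transaction 12 was still running
at prev and finished by curr, so its events land in this batch, while 16
has to wait for a later one. A single long-running transaction keeps
can_truncate() false for every segment closed after it started, no matter
how many segments are configured.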

>
>> There are more things to consider about such a generalized queue,
>> especially if we think about adding it to core.
>>
>> One for example is version independence. Slony and I think Londiste too can
>> replicate across PostgreSQL server versions. And experience shows us that no
>> communications protocol, on disk format or the like is ever set in stone. So
>> we need to think how this queue can become backwards compatible without
>> introducing more overhead than we try to save right now.
>
> I'm guessing you are trying to do 2 more things:
>
> 1) Add queue operations to SQL syntax
> 2) Non-table custom storage.

No. I don't know how you read 1) into the above, and 2) was my
misunderstanding from reading the Wiki. I don't want either.

> But there's another issue - our experience with PgQ has shown
> that generic queue means also generic code operating with it,
> which means bugs. And transactional queue readers are not
> allowed to drop events on problems. Which means on problems,
> admins need to examine queue and delete/modify the events.
>
> Of course, the bug causing the problem also needs to be fixed,
> but bugfixing does not repair the queue, that must be done
> manually.
>
> If the 1) and/or 2) means such possibility is removed,
> it will be quite big hit on the generic-ness of the GDQ.
>
> In that aspect I would prefer to fix any remaining problems
> (what are they?) with plain queue tables, even if
> the "NoSQL" queueing could perform significantly better.

A generic queue implementation needs to come with some advantage over
what we have now. Otherwise there is no incentive for any of the
existing systems to even consider switching to it.

What are the advantages of anything proposed over the current
implementations used by Londiste and Slony?

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
