Re: Spec discussion: Generalized Data Queue / Modification Trigger

From: Hannu Krosing <hannu(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Greg Sabino Mullane <greg(at)endpoint(dot)com>, PostgresCluster ML <pgsql-cluster-hackers(at)postgresql(dot)org>
Subject: Re: Spec discussion: Generalized Data Queue / Modification Trigger
Date: 2010-03-03 21:01:44
Message-ID: 1267650105.5157.36.camel@hvost
Lists: pgsql-cluster-hackers

On Wed, 2010-03-03 at 11:52 -0800, Josh Berkus wrote:
> Greg,
>
> >> (1) The ability to send asynchronous (or synchronous?) notifications, on
> >> a per-row basis, whenever data is modified *only after commit*. This
> >> has been generally described as "on-commit triggers", but could actually
> >> take a variety of forms.
> >
> > I'm not sure I like the idea of this. Could be potentially dangerous, as
> > listen/notify is not treated as a "reliable" process. What's wrong with
> > the current method, namely having a row trigger update an internal
> > table, and then a statement level trigger firing off a notify?
>
> Well, the main problem with that is that it doubles the number of writes
> you have to do ... or more. So it's a major efficiency issue.
>
> This isn't as much of a concern for a system like Slony or Londiste
> where the replication queue is a table in the database.

Yes. For Londiste, the only additional cost beyond the WAL writes (which
write bigger chunks of data but need the same number of seeks and syncs)
is the deferred writes to the heap and a single index, and even those
may never actually reach disk if replication is fast enough and the
event tables are rotated before the background writer and checkpoints
get around to writing them out.
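For reference, the "current method" Greg mentions (a row-level trigger
filling an event table, plus a statement-level trigger issuing a single
NOTIFY) can be sketched roughly as below. All the table, channel and
function names here are hypothetical, not Londiste's actual ones:

```sql
-- Minimal sketch of the queue-table pattern; names are illustrative only.
CREATE TABLE change_queue (
    ev_id    bigserial PRIMARY KEY,
    ev_table text      NOT NULL,
    ev_op    "char"    NOT NULL,   -- 'I', 'U' or 'D'
    ev_data  text,
    ev_txid  bigint    NOT NULL DEFAULT txid_current()
);

CREATE OR REPLACE FUNCTION log_row_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO change_queue (ev_table, ev_op, ev_data)
    VALUES (TG_TABLE_NAME, substr(TG_OP, 1, 1),
            CASE WHEN TG_OP = 'DELETE' THEN OLD::text ELSE NEW::text END);
    RETURN NULL;   -- AFTER trigger: return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION notify_consumers() RETURNS trigger AS $$
BEGIN
    NOTIFY change_queue;   -- delivered to listeners only after commit
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- Per-row capture, plus a single statement-level NOTIFY:
CREATE TRIGGER t_capture AFTER INSERT OR UPDATE OR DELETE ON some_table
    FOR EACH ROW EXECUTE PROCEDURE log_row_change();
CREATE TRIGGER t_notify AFTER INSERT OR UPDATE OR DELETE ON some_table
    FOR EACH STATEMENT EXECUTE PROCEDURE notify_consumers();
```

The doubled writes Josh complains about are visible here: every modified
row is written once to some_table and once to change_queue.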

> But if you
> were, say, replicating through ApacheMQ? Or replicating cached data to
> Redis? Then the whole queue-table, NOTIFY, poll structure is needless
> overhead.

It may seem easy to replace a database table with "something else" for
collecting the changes that have happened during the transaction, but
you have to answer the following questions:

1) Do I need persistence? What about 2PC?

2) Does the "something else" work well in all situations where an event
table would (say, for example, a load of 500GB of data in one
transaction)?

3) What would I gain in return for all the work needed to implement the
"something else"?

> >> (3) A method of marking DDL changes in the data modification stream.

Yes, DDL triggers or some such would be highly desirable.

> > Hmm..can you expand on what you have in mind here? Something more than
> > just treating the DDL as another item in the (txn ordered) queue?
>
> Yeah, that would be one way to handle it. Alternately, you could have
> the ability to mark rows with a DDL "version".

But the actual DDL would still need to be transferred, no?

--
Hannu Krosing http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
Services, Consulting and Training
