Re: logical changeset generation v4 - Heikki's thoughts about the patch state

From: Steve Singer <steve(at)ssinger(dot)info>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Phil Sorber <phil(at)omniti(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical changeset generation v4 - Heikki's thoughts about the patch state
Date: 2013-01-24 16:15:15
Message-ID: BLU0-SMTP936D5A9C6D202B07CB5AB1DC140@phx.gbl
Lists: pgsql-hackers

On 13-01-24 06:40 AM, Andres Freund wrote:
> Fair enough. I am also working on a user of this infrastructure but that
> doesn't help you very much. Steve Singer seemed to make some stabs at
> writing an output plugin as well. Steve, how far did you get there?

I was able to get something that generated output for INSERT statements
in a format similar to what a modified slony apply trigger would want.
This was with the list of tables to replicate hard-coded in the plugin.
This was with the patchset from the last commitfest. I had gotten a bit
hung up on the UPDATE and DELETE support because slony allows you to use
an arbitrary user-specified unique index as your key. It looks like
better support for tables with a unique non-primary key is in the most
recent patch set. I am hoping to have time this weekend to update my
plugin to use parameters passed in on the init and other updates in the
most recent version. If I make some progress I will post a link to my
progress at the end of the weekend. My big issue is that I have limited
time to spend on this.

>> BTW, why does all the transaction reordering stuff have to be in core?
> It didn't use to, but people argued pretty damned hard that no undecoded
> data should ever be allowed to leave the postgres cluster. And to be fair
> it makes writing an output plugin *way* much easier. Check
> If you skip over tuple_to_stringinfo(), which is just pretty generic
> scaffolding for converting a whole tuple to a string, writing out the
> changes in some format by now is pretty damn simple.

I think we will find that the replication systems won't be the only
users of this feature. I have often seen systems that have a logging
requirement for auditing purposes or to log then reconstruct the
sequence of changes made to a set of tables in order to feed a
downstream application. Triggers and a journaling table are the
traditional way of doing this but it should be pretty easy to write a
plugin to accomplish the same thing that should give better
performance. If the reordering stuff wasn't in core this would be much
harder.
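
To illustrate what in-core reordering saves a plugin author, here is a
minimal sketch of the idea. It is not code from the patch: the record
tuples, the reorder() helper and the sample data are all hypothetical,
and a real implementation also has to deal with subtransactions,
spilling to disk and catalog changes. Changes from concurrent
transactions arrive interleaved, are buffered per transaction id, and
are only handed on, grouped, when the transaction commits.

```python
from collections import defaultdict


def reorder(records):
    """Group interleaved change records by transaction, in commit order.

    Each record is a hypothetical tuple:
      ("change", xid, payload), ("commit", xid) or ("abort", xid).
    Returns a list of (xid, [payloads]) for committed transactions only.
    """
    pending = defaultdict(list)  # xid -> changes seen so far
    committed = []
    for rec in records:
        kind, xid = rec[0], rec[1]
        if kind == "change":
            pending[xid].append(rec[2])
        elif kind == "commit":
            # Emit the whole transaction at once, at its commit point.
            committed.append((xid, pending.pop(xid, [])))
        elif kind == "abort":
            # Changes of aborted transactions are never emitted.
            pending.pop(xid, None)
    return committed


# Interleaved stream: xid 11 commits before xid 10, xid 12 aborts.
stream = [
    ("change", 10, "INSERT a"),
    ("change", 11, "INSERT b"),
    ("change", 12, "INSERT c"),
    ("change", 10, "UPDATE a"),
    ("commit", 11),
    ("abort", 12),
    ("commit", 10),
]

result = reorder(stream)
for xid, changes in result:
    print(xid, changes)
```

An audit-logging plugin built on top of this only ever sees complete,
committed transactions, so it does not need its own buffering or abort
handling.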

>> How much of this infrastructure is to support replicating DDL changes? IOW,
>> if we drop that requirement, how much code can we slash?
> Unfortunately I don't think too much unless we add in other code that
> allows us to check whether the current definition of a table is still
> the same as it was back when the tuple was logged.
>> Any other features or requirements that could be dropped? I think it's clear at this stage that
>> this patch is not going to be committed as it is. If you can reduce it to a
>> fraction of what it is now, that fraction might have a chance. Otherwise,
>> it's just going to be pushed to the next commitfest as whole, and we're
>> going to be having the same doubts and discussions then.
> One thing that reduces complexity is to declare the following as
> unsupported:
> - CREATE TABLE foo(data text);
> - INSERT INTO foo(data) VALUES(very-long-to-be-externally-toasted-tuple);
> - DROP TABLE foo;
> but that's just a minor thing.
> I think what we can do more realistically than to chop off required parts
> of changeset extraction is to start applying some of the preliminary
> patches independently:
> - the relmapper/relfilenode changes + pg_relation_by_filenode(spc,
> relnode) should be independently committable if a bit boring
> - allowing walsenders to connect to a database possibly needs an interface change
> but otherwise it should be fine to go in independently. It also has
> other potential use-cases, so I think that's fair.
> - logging xl_running_xact's more frequently could also be committed
> independently and makes sense independently as it allows a standby to
> enter HS faster if the master is busy
> - Introducing InvalidCommandId should be relatively uncontroversial. The
> fact that no invalid value for command ids exists is imo an oversight
> - the *Satisfies change could be applied and they are imo ready but
> there's no use-case for it without the rest, so I am not sure whether
> there's a point
> - currently not separately available, but we could add wal_level=logical
> independently. There would be no user of it, but it would be partial
> work. That includes the relcache support for keeping track of the
> primary key which already is available separately.
> Greetings,
> Andres Freund
