Re: logical changeset generation v4 - Heikki's thoughts about the patch state

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, steve(dot)singer(at)awork2(dot)anarazel(dot)de
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Phil Sorber <phil(at)omniti(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical changeset generation v4 - Heikki's thoughts about the patch state
Date: 2013-01-24 11:40:10
Message-ID: 20130124114010.GB4231@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-01-24 12:38:25 +0200, Heikki Linnakangas wrote:
> On 24.01.2013 00:30, Andres Freund wrote:
> >Hi,
> >
> >I decided to reply on the patches thread to be able to find this later.
> >
> >On 2013-01-23 22:48:50 +0200, Heikki Linnakangas wrote:
> >>"logical changeset generation v4"
> >>This is a boatload of infrastructure for supporting logical replication, yet
> >>we have no code actually implementing logical replication that would go with
> >>this. The premise of logical replication over trigger-based was that it'd be
> >>faster, yet we cannot asses that without a working implementation. I don't
> >>think this can be committed in this state.
> >
> >Its a fair point that this is a huge amount of code without a user in
> >itself in-core.
> >But the reason it got no user included is because several people
> >explicitly didn't want a user in-core for now but said the first part of
> >this would be to implement the changeset generation as a separate
> >piece. Didn't you actually prefer not to have any users of this in-core
> >yourself?
>
> Yes, I certainly did. But we still need to see the other piece of the puzzle
> to see how this fits with it.

Fair enough. I am also working on a user of this infrastructure but that
doesn't help you very much. Steve Singer seemed to make some stabs at
writing an output plugin as well. Steve, how far did you get there?

> BTW, why does all the transaction reordering stuff has to be in core?

It didn't use to, but people argued pretty damned hard that no undecoded
data should ever allowed to leave the postgres cluster. And to be fair
it makes writing an output plugin *way* much easier. Check
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=contrib/test_decoding/test_decoding.c;hb=xlog-decoding-rebasing-cf4
If you skip over tuple_to_stringinfo(), which is just pretty generic
scaffolding for converting a whole tuple to a string, writing out the
changes in some format by now is pretty damn simple.

> How much of this infrastructure is to support replicating DDL changes? IOW,
> if we drop that requirement, how much code can we slash?

Unfortunately I don't think too much unless we add in other code that
allows us to check whether the current definition of a table is still
the same as it was back when the tuple was logged.

> Any other features or requirements that could be dropped? I think it's clear at this stage that
> this patch is not going to be committed as it is. If you can reduce it to a
> fraction of what it is now, that fraction might have a chance. Otherwise,
> it's just going to be pushed to the next commitfest as whole, and we're
> going to be having the same doubts and discussions then.

One thing that reduces complexity is to declare the following as
unsupported:
- CREATE TABLE foo(data text);
- DECODE UP TO HERE;
- INSERT INTO foo(data) VALUES(very-long-to-be-externally-toasted-tuple);
- DROP TABLE foo;
- DECODE UP TO HERE;

but thats just a minor thing.

I think what we can do more realistically than to chop of required parts
of changeset extraction is to start applying some of the preliminary
patches independently:
- the relmapper/relfilenode changes + pg_relation_by_filenode(spc,
relnode) should be independently committable if a bit boring
- allowing walsenders to connect to a database possibly needs an interface change
but otherwise it should be fine to go in independently. It also has
other potential use-cases, so I think thats fair.
- logging xl_running_xact's more frequently could also be committed
independently and makes sense independently as it allows a standby to
enter HS faster if the master is busy
- Introducing InvalidCommandId should be relatively uncontroversial. The
fact that no invalid value for command ids exists is imo an oversight
- the *Satisfies change could be applied and they are imo ready but
there's no use-case for it without the rest, so I am not sure whether
theres a point
- currently not separately available, but we could add wal_level=logical
independently. There would be no user of it, but it would be partial
work. That includes the relcache support for keeping track of the
primary key which already is available separately.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Phil Sorber 2013-01-24 11:40:40 Re: My first patch! (to \df output)
Previous Message Amit Kapila 2013-01-24 11:33:37 Re: CF3+4 (was Re: Parallel query execution)