Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: <andres(at)2ndquadrant(dot)com>,<heikki(dot)linnakangas(at)enterprisedb(dot)com>, <robertmhaas(at)gmail(dot)com>, <daniel(at)heroku(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node
Date: 2012-06-20 15:34:42
Message-ID: 4FE1A74202000025000487FB@gw.wicourts.gov
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
> Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>>> Heikki Linnakangas wrote:
>>
>>> I don't like the idea of adding the origin id to the record
>>> header. It's only required in some occasions, and on some record
>>> types.
>>
>> Right.
>
> Wrong, as explained.

The point is not wrong; you are simply not responding to what is
being said.

You have not explained why an origin ID is required when there is no
replication, or if there is master/slave logical replication, or
there are multiple masters with non-overlapping primary keys
replicating to a single table in a consolidated database, or each
master replicates to all other masters directly, or any of various
other scenarios raised on this thread. You've only explained why
it's necessary for certain configurations of multi-master
replication where all rows in a table can be updated on any of the
masters. I understand that this is the configuration you find most
interesting, at least for initial implementation. That does not
mean that the other situations don't exist as use cases or should
not be considered in the overall design.

I don't think there is anyone here who would not love to see this
effort succeed, all the way to multi-master replication in the
configuration you are emphasizing. What is happening is that people
are expressing concerns about parts of the design which they feel
are problematic, and brainstorming about possible alternatives. As
I'm sure you know, fixing a design problem at this stage in
development is a lot less expensive than letting the problem slide
and trying to deal with it later.

> It isn't true that this is needed only for some configurations of
> multi-master, per discussion.

I didn't get that out of the discussion; I saw a lot of cases
mentioned as not needing it to which you simply did not respond.

> This is not transaction metadata, it is WAL record metadata
> required for multi-master replication, see later point.
>
> We need to add information to every WAL record that is used as the
> source for generating LCRs.

If the origin ID of a transaction doesn't count as transaction
metadata (i.e., data about the transaction), what does? It may be a
metadata element about which you have special concerns, but it is
transaction metadata. You don't plan on supporting individual WAL
records within a transaction containing different values for origin
ID, do you? If not, why is it something to store in every WAL
record rather than once per transaction? That's not intended to be
a rhetorical question. I think it's because you're still thinking
of the WAL stream as *the medium* for logical replication data
rather than *the source* of logical replication data.
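
To make the question concrete, here is a minimal sketch of the
per-transaction alternative: the origin ID is recorded once, in the
commit record, and attached to each change only when logical changes
are generated from the WAL. Everything in it (ChangeRecord,
CommitRecord, changes_with_origin, the record layout) is hypothetical
and for illustration only; none of it comes from the patch or from
PostgreSQL's actual WAL format.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple, Union


@dataclass
class ChangeRecord:
    """Hypothetical data-change record; carries no origin ID of its own."""
    xid: int
    change: str


@dataclass
class CommitRecord:
    """Hypothetical commit record; the origin ID is stored here, once."""
    xid: int
    origin_id: int


Record = Union[ChangeRecord, CommitRecord]


def changes_with_origin(stream: Iterable[Record]) -> List[Tuple[int, int, str]]:
    """Return (xid, origin_id, change) tuples for committed transactions.

    Changes are buffered per transaction until its commit record supplies
    the origin ID; changes of uncommitted transactions are never emitted.
    """
    pending: Dict[int, List[str]] = {}
    out: List[Tuple[int, int, str]] = []
    for rec in stream:
        if isinstance(rec, ChangeRecord):
            pending.setdefault(rec.xid, []).append(rec.change)
        else:
            # Commit record: tag every buffered change with its origin ID.
            for change in pending.pop(rec.xid, []):
                out.append((rec.xid, rec.origin_id, change))
    return out


# Two interleaved transactions from different origins, plus one that
# never commits.
wal = [
    ChangeRecord(100, "INSERT a"),
    ChangeRecord(101, "UPDATE b"),
    ChangeRecord(100, "DELETE c"),
    CommitRecord(100, origin_id=1),
    CommitRecord(101, origin_id=2),
    ChangeRecord(102, "INSERT d"),
]
result = changes_with_origin(wal)
```

The trade-off in this sketch is that changes must be buffered until
the commit record is seen, since the origin ID is not known before
then.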

As long as the WAL stream is the medium, options are very
constrained. You can code a very fast engine to handle a single
type of configuration that way, and perhaps that should be a
supported feature, but it's not a configuration I've needed yet.
(Well, on reflection, if it had been available and easy to use, I
can think of *one* time I *might* have used it for a pair of nodes.)
It seems to me that you are so focused on this one use case that you
are not considering how design choices which facilitate fast
development of that use case paint us into a corner in terms of
expanding to other use cases.

>> I think removing origin ID from this patch and submitting a
>> separate patch for a generalized transaction metadata system is
>> the sensible way to go.
>
> We already have a very flexible WAL system for recording data of
> interest to various resource managers. If you wish to annotate a
> transaction, you can either generate a new kind of WAL record or
> you can enhance a commit record.

Right. Like many of us are suggesting should be done for origin ID.

> XLOG_NOOP records can already be generated by your application if
> you wish to inject additional metadata to the WAL stream. So no
> changes are required for you to implement the generalised
> transaction metadata scheme you say you require.

I'm glad it's that easy. Are there SQL functions for that yet?

> Not sure how or why that relates to requirements for multi-master.

That depends on whether you want to leave the door open to logical
replication configurations other than the one use case on which you
are currently focused. I even consider some of those other cases
multi-master,
especially when multiple databases are replicating to a single table
on another server. I'm not clear on your definition -- it seems to
be rather more narrow. Maybe we need to define some terms somewhere
to facilitate discussion. Is there a Wiki page where that would
make sense?

-Kevin
