Re: Logical Replication WIP

From: Andres Freund <andres(at)anarazel(dot)de>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Erik Rijkers <er(at)xs4all(dot)nl>, Steve Singer <steve(at)ssinger(dot)info>
Subject: Re: Logical Replication WIP
Date: 2016-09-14 19:53:59
Message-ID: 20160914195358.msst4gpebwqwifwn@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2016-09-14 21:17:42 +0200, Petr Jelinek wrote:
> > > +/*
> > > + * Gather Relations based o provided by RangeVar list.
> > > + * The gathered tables are locked in access share lock mode.
> > > + */
> >
> > Why access share? Shouldn't we make this ShareUpdateExclusive or
> > similar, to prevent schema changes?
> >
>
> Hm, I thought AccessShare would be enough to prevent schema changes that
> matter to us (which is basically just drop afaik).

Doesn't e.g. dropping an index matter as well?

> > > + if (strcmp($1, "replicate_insert") == 0)
> > > + $$ = makeDefElem("replicate_insert",
> > > + (Node *)makeInteger(TRUE), @1);
> > > + else if (strcmp($1, "noreplicate_insert") == 0)
> > > + $$ = makeDefElem("replicate_insert",
> > > + (Node *)makeInteger(FALSE), @1);
> > > + else if (strcmp($1, "replicate_update") == 0)
> > > + $$ = makeDefElem("replicate_update",
> > > + (Node *)makeInteger(TRUE), @1);
> > > + else if (strcmp($1, "noreplicate_update") == 0)
> > > + $$ = makeDefElem("replicate_update",
> > > + (Node *)makeInteger(FALSE), @1);
> > > + else if (strcmp($1, "replicate_delete") == 0)
> > > + $$ = makeDefElem("replicate_delete",
> > > + (Node *)makeInteger(TRUE), @1);
> > > + else if (strcmp($1, "noreplicate_delete") == 0)
> > > + $$ = makeDefElem("replicate_delete",
> > > + (Node *)makeInteger(FALSE), @1);
> > > + else
> > > + ereport(ERROR,
> > > + (errcode(ERRCODE_SYNTAX_ERROR),
> > > + errmsg("unrecognized publication option \"%s\"", $1),
> > > + parser_errposition(@1)));
> > > + }
> > > + ;
> >
> > I'm kind of inclined to do this checking at execution (or transform)
> > time instead. That allows extension to add options, and handle them in
> > utility hooks.
> >
>
> Thant's interesting point, I prefer the parsing to be done in gram.y, but it
> might be worth moving it for extensibility. Although there are so far other
> barriers for that.

Citus uses the lack of such check for COPY to implement copy over it's
distributed tables for example. So there's some benefit.

> > > + check_subscription_permissions();
> > > +
> > > + rel = heap_open(SubscriptionRelationId, RowExclusiveLock);
> > > +
> > > + /* Check if name is used */
> > > + subid = GetSysCacheOid2(SUBSCRIPTIONNAME, MyDatabaseId,
> > > + CStringGetDatum(stmt->subname));
> > > + if (OidIsValid(subid))
> > > + {
> > > + ereport(ERROR,
> > > + (errcode(ERRCODE_DUPLICATE_OBJECT),
> > > + errmsg("subscription \"%s\" already exists",
> > > + stmt->subname)));
> > > + }
> > > +
> > > + /* Parse and check options. */
> > > + parse_subscription_options(stmt->options, &enabled_given, &enabled,
> > > + &conninfo, &publications);
> > > +
> > > + /* TODO: improve error messages here. */
> > > + if (conninfo == NULL)
> > > + ereport(ERROR,
> > > + (errcode(ERRCODE_SYNTAX_ERROR),
> > > + errmsg("connection not specified")));
> >
> > Probably also makes sense to parse the conninfo here to verify it looks
> > saen. Although that's fairly annoying to do, because the relevant code
> > is libpq :(
> >
>
> Well the connection is eventually used (in later patches) so maybe that's
> not problem.

Well, it's nicer if it's immediately parsed, before doing complex and
expensive stuff, especially if that happens outside of the transaction.

> >
> > > diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
> > > index 65230e2..f3d54c8 100644
> > > --- a/src/backend/nodes/copyfuncs.c
> > > +++ b/src/backend/nodes/copyfuncs.c
> >
> > I think you might be missing outfuncs support.
> >
>
> I thought that we don't do outfuncs for DDL?

I think it's just readfuncs that's skipped.

> > > + Length of column name (including the NULL-termination
> > > + character).
> > > +</para>
> > > +</listitem>
> > > +</varlistentry>
> > > +<varlistentry>
> > > +<term>
> > > + String
> > > +</term>
> > > +<listitem>
> > > +<para>
> > > + Name of the column.
> > > +</para>
> > > +</listitem>
> > > +</varlistentry>
> >
> > Huh, no type information?
> >
>
> It's not necessary for the text transfer, it will be if we ever add binary
> data transfer but that will require protocol version bump anyway.

I'm *hugely* unconvinced of this. For one type information is useful for
error reporting and such as well. For another, it's one thing to add a
new protocol message (for differently encoded tuples), and something
entirely different to change the format of existing messages.

> > > +
> > > +/*
> > > + * COMMIT callback
> > > + */
> > > +static void
> > > +pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
> > > + XLogRecPtr commit_lsn)
> > > +{
> > > + OutputPluginPrepareWrite(ctx, true);
> > > + logicalrep_write_commit(ctx->out, txn, commit_lsn);
> > > + OutputPluginWrite(ctx, true);
> > > +}
> >
> > Hm, so we don't reset the context for these...
> >
>
> What?

We only use & reset the data-> memory context in the change
callback. I'm not sure that's good.

> > This however I'm not following. Why do we need multiple copies of this?
> > And why aren't we doing the assignments in _PG_init? Seems better to
> > just allocate one WalRcvCalllbacks globally and assign all these as
> > constants. Then the establishment function can just return all these
> > (as part of a bigger struct).
> >
>
> Meh, If I understand you correctly that will make the access bit more ugly
> (multiple layers of structs).

On the other hand, you right now need to access one struct, and pass the
other...

> > This does rather reinforce my opinion that the _PG_init removal in
> > libpqwalreceiver isn't useful.
>
> I don't see how it helps, you said we'd still return struct from some
> interface so this would be more or less the same?

Or we just set some global vars and use them directly.

Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2016-09-14 19:54:08 Re: Choosing parallel_degree
Previous Message Robert Haas 2016-09-14 19:48:46 Re: Choosing parallel_degree