Re: [PATCH] Logical decoding of TRUNCATE

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>
Cc: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Logical decoding of TRUNCATE
Date: 2018-04-04 02:31:14
Message-ID: a22a12cf-be0a-cef6-f152-6984a47f4878@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Here is an updated patch that addresses some of your concerns.

I've split it up into the decoding support and the pgoutput replication
support.

The problem I see now is that when we WAL-log a truncate, we include all
the relids in one record. That seems useful. But during decoding we
split this into one change per relid. So at the receiving end, truncate
can only process one relation at a time, which will fail if there are
foreign keys involved. The solution that had been proposed here was to
ignore foreign keys on the subscriber, which is clearly problematic.

I wonder why this approach was chosen. If we go through the trouble of
WAL-logging all the relations together, why not present them to the
output plugin as one so that they can be applied together? This will
clearly make questions of filtering and replication set membership and
so on more difficult, but that's just the nature of the thing. If you
are connecting tables by foreign keys and only replicate some of them,
then that's going to have confusing effects no matter what you do.

(That's perhaps another reason why having the option of replicating
truncate separately from delete could be useful.)

I'm going to try to hack up an alternative approach and see how it works
out.

On 4/1/18 16:01, Andres Freund wrote:
> On 2018-01-25 14:21:15 +0100, Marco Nenciarini wrote:
>> + if (SessionReplicationRole != SESSION_REPLICATION_ROLE_REPLICA)
>> + {
>> +
>> + /*
>> + * Check foreign key references. In CASCADE mode, this should be
>> + * unnecessary since we just pulled in all the references; but as a
>> + * cross-check, do it anyway if in an Assert-enabled build.
>> + */
>> #ifdef USE_ASSERT_CHECKING
>> heap_truncate_check_FKs(rels, false);
>> + #else
>> + if (stmt->behavior == DROP_RESTRICT)
>> + heap_truncate_check_FKs(rels, false);
>> #endif
>> + }
>
> That *can't* be right.

This is actually existing code that just looks funny in the diff after
being indented.

But I'd rather skip this patch altogether and find a different solution;
see above.

> I know this has been discussed in the thread already, but it really
> strikes me as wrong to basically do some mini DDL replication feature
> via per-command WAL records.

I think TRUNCATE is special in some ways and it's reasonable to treat it
specially. Real DDL is being worked on in the 2PC decoding thread.
What we are discussing here isn't going to be applicable there and vice
versa, I think. In fact, one of the reasons for this effort is that in
BDR TRUNCATE is currently handled like a DDL command, which doesn't work
well.

>> + <para>
>> + <command>TRUNCATE</command> is treated as a form of
>> + <command>DELETE</command> for the purpose of deciding whether
>> + to publish, or not.
>> + </para>
>> </listitem>
>> </varlistentry>
>> </variablelist>
>
> Why is this a good idea?

I have changed this by making this a separate property.

> Hm, it seems logicaldecoding.sgml hasn't been updated?

I re-checked but didn't find anything suitable to update.

>> + void
>> + ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
>> + DropBehavior behavior, bool restart_seqs)
>> + {
>> + List *rels = list_copy(explicit_rels);
>
> Why is this copied?

Because it is overwritten later. I have moved it down a bit to make
that a bit clearer.

>> + * Write a WAL record to allow this set of actions to be logically decoded.
>> + * We could optimize this away when !RelationIsLogicallyLogged(rel)
>> + * but that doesn't save much space or time.
>
> What you're saying isn't that you're not logging anything, but that
> you're allocating the header regardless? Because this certainly sounds
> like you unconditionally log a WAL record.

> I'm confused. Why do we need the resizing here, when we know the max
> upfront?

I have rewritten this a bit and removed the logging of the sequence
relids, which isn't used anywhere after, to make the code a bit simpler.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v18-0001-Logical-decoding-of-TRUNCATE.patch text/plain 31.1 KB
v18-0002-Logical-replication-support-for-TRUNCATE.patch text/plain 35.4 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Laurenz Albe 2018-04-04 06:11:56 Re: dblink: could not send query: another command is already in progress
Previous Message Thiemo Kellner 2018-04-04 00:35:28 Re: dblink: could not send query: another command is already in progress

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-04-04 02:31:17 Re: [HACKERS] Runtime Partition Pruning
Previous Message David Rowley 2018-04-04 02:27:08 Re: [HACKERS] Runtime Partition Pruning