Re: Logical decoding for operations on zheap tables

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Logical decoding for operations on zheap tables
Date: 2019-01-03 18:00:55
Message-ID: 20190103180055.6wjapbjlojcgittv@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2018-12-31 09:56:48 +0530, Amit Kapila wrote:
> To support logical decoding for zheap operations, we need a way to
> ensure zheap tuples can be registered as change streams. One idea
> could be that we make ReorderBufferChange aware of another kind of
> tuples as well, something like this:
>
> @@ -100,6 +123,20 @@ typedef struct ReorderBufferChange
> ReorderBufferTupleBuf *newtuple;
> } tp;
> + struct
> + {
> + /* relation that has been changed */
> + RelFileNode relnode;
> +
> + /* no previously reassembled toast chunks are necessary anymore */
> + bool clear_toast_afterwards;
> +
> + /* valid for DELETE || UPDATE */
> + ReorderBufferZTupleBuf *oldtuple;
> + /* valid for INSERT || UPDATE */
> + ReorderBufferZTupleBuf *newtuple;
> + } ztp;
> +
>
>
> +/* an individual zheap tuple, stored in one chunk of memory */
> +typedef struct ReorderBufferZTupleBuf
> +{
> ..
> + /* tuple header, the interesting bit for users of logical decoding */
> + ZHeapTupleData tuple;
> ..
> +} ReorderBufferZTupleBuf;
>
> Apart from this, we need to define different decode functions for
> zheap operations as the WAL data is different for heap and zheap, so
> same functions can't be used to decode.

I'm very strongly opposed to that. We shouldn't have to expose every
possible storage method to output plugins; that'll make extensibility
a farce. I think we'll either have to re-form a HeapTuple or decide
to bite the bullet and start exposing tuples via slots.
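
As a rough illustration of the "re-form into one common shape" option,
here is a minimal standalone sketch. It is not code from the patch or
from PostgreSQL: DemoRow, DemoHeapTuple, DemoZTuple and the demo_*
functions are hypothetical stand-ins, and the two layouts are invented
just to show that the consumer only ever sees one representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_ATTRS 8

/* Hypothetical storage-neutral row: the only shape an output plugin sees. */
typedef struct DemoRow
{
    int     natts;
    bool    isnull[DEMO_MAX_ATTRS];
    int64_t values[DEMO_MAX_ATTRS];
} DemoRow;

/* Hypothetical "heap-like" layout: one value slot per attribute. */
typedef struct DemoHeapTuple
{
    int     natts;
    uint8_t nullbits;               /* bit i set => attribute i is NULL */
    int64_t data[DEMO_MAX_ATTRS];
} DemoHeapTuple;

/* Hypothetical "zheap-like" layout: only non-NULL values, packed in order. */
typedef struct DemoZTuple
{
    int     natts;
    uint8_t nullbits;               /* bit i set => attribute i is NULL */
    int64_t packed[DEMO_MAX_ATTRS];
} DemoZTuple;

/* Re-form a heap-like tuple into the neutral row. */
void
demo_row_from_heap(const DemoHeapTuple *src, DemoRow *dst)
{
    assert(src->natts >= 0 && src->natts <= DEMO_MAX_ATTRS);
    dst->natts = src->natts;
    for (int i = 0; i < src->natts; i++)
    {
        dst->isnull[i] = (src->nullbits >> i) & 1;
        dst->values[i] = dst->isnull[i] ? 0 : src->data[i];
    }
}

/* Re-form a zheap-like tuple into the same neutral row. */
void
demo_row_from_ztuple(const DemoZTuple *src, DemoRow *dst)
{
    int     next = 0;               /* index of next packed non-NULL value */

    assert(src->natts >= 0 && src->natts <= DEMO_MAX_ATTRS);
    dst->natts = src->natts;
    for (int i = 0; i < src->natts; i++)
    {
        dst->isnull[i] = (src->nullbits >> i) & 1;
        dst->values[i] = dst->isnull[i] ? 0 : src->packed[next++];
    }
}

/* Consumer-side comparison; it knows nothing about the storage format. */
bool
demo_row_equal(const DemoRow *a, const DemoRow *b)
{
    if (a->natts != b->natts)
        return false;
    for (int i = 0; i < a->natts; i++)
    {
        if (a->isnull[i] != b->isnull[i])
            return false;
        if (!a->isnull[i] && a->values[i] != b->values[i])
            return false;
    }
    return true;
}
```

The storage-specific knowledge stays in the two conversion functions;
anything downstream of them takes a DemoRow and cannot tell which
storage method produced it.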

> This email is primarily to discuss about how the logical decoding for
> basic DML operations (Insert/Update/Delete) will work in zheap. We
> might need some special mechanism to deal with sub-transactions as
> zheap doesn't generate a transaction id for sub-transactions, but we
> can discuss that separately.

Subtransactions seem to be the hardest part besides the tuple format
issue, so I think we should discuss that very soon.

> +/*
> + * Write zheap's INSERT to the output stream.
> + */
> +void
> +logicalrep_write_zinsert(StringInfo out, Relation rel, ZHeapTuple newtuple)
> +{
> + pq_sendbyte(out, 'I'); /* action INSERT */
> +
> + Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
> + rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
> + rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
> +
> + /* use Oid as relation identifier */
> + pq_sendint32(out, RelationGetRelid(rel));
> +
> + pq_sendbyte(out, 'N'); /* new tuple follows */
> + //logicalrep_write_tuple(out, rel, newtuple);
> +}

Obviously we need to do better - I don't think we should have
tuple-specific replication messages.
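
For illustration only, a single INSERT writer that takes storage-neutral
column data could look roughly like the sketch below. MsgBuf, MsgColumn
and the msg_* helpers are hypothetical stand-ins (MsgBuf plays the role
of StringInfo). The 'I' / relation id / 'N' prefix follows the quoted
function; the per-column encoding after it is a simplified assumption,
not the actual wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size output buffer, standing in for StringInfo. */
typedef struct MsgBuf
{
    uint8_t data[256];
    size_t  len;
} MsgBuf;

/* Hypothetical storage-neutral column: text form, or NULL for SQL NULL. */
typedef struct MsgColumn
{
    const char *text;
} MsgColumn;

static void
msg_put_bytes(MsgBuf *out, const void *src, size_t n)
{
    assert(out->len + n <= sizeof(out->data));
    memcpy(out->data + out->len, src, n);
    out->len += n;
}

static void
msg_put_byte(MsgBuf *out, uint8_t b)
{
    msg_put_bytes(out, &b, 1);
}

/* Append a 16-bit value in network (big-endian) byte order. */
static void
msg_put_int16(MsgBuf *out, uint16_t v)
{
    uint8_t raw[2] = { (uint8_t) (v >> 8), (uint8_t) v };

    msg_put_bytes(out, raw, sizeof(raw));
}

/* Append a 32-bit value in network (big-endian) byte order. */
static void
msg_put_int32(MsgBuf *out, uint32_t v)
{
    uint8_t raw[4] = {
        (uint8_t) (v >> 24), (uint8_t) (v >> 16),
        (uint8_t) (v >> 8), (uint8_t) v
    };

    msg_put_bytes(out, raw, sizeof(raw));
}

/* One INSERT message for every storage method. */
void
msg_write_insert(MsgBuf *out, uint32_t relid,
                 const MsgColumn *cols, uint16_t ncols)
{
    msg_put_byte(out, 'I');         /* action INSERT */
    msg_put_int32(out, relid);      /* relation identifier */
    msg_put_byte(out, 'N');         /* new tuple follows */

    /* Assumed tuple body: column count, then one entry per column. */
    msg_put_int16(out, ncols);
    for (uint16_t i = 0; i < ncols; i++)
    {
        if (cols[i].text == NULL)
        {
            msg_put_byte(out, 'n'); /* NULL column */
            continue;
        }
        msg_put_byte(out, 't');     /* text-format value */
        msg_put_int32(out, (uint32_t) strlen(cols[i].text));
        msg_put_bytes(out, cols[i].text, strlen(cols[i].text));
    }
}
```

Nothing in msg_write_insert depends on how the row was stored, so there
would be no need for a separate zheap variant of the message.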

> /*
> * Write relation description to the output stream.
> */
> diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
> index 23466bade2..70fb5e2934 100644
> --- a/src/backend/replication/logical/reorderbuffer.c
> +++ b/src/backend/replication/logical/reorderbuffer.c
> @@ -393,6 +393,19 @@ ReorderBufferReturnChange(ReorderBuffer *rb, ReorderBufferChange *change)
> change->data.tp.oldtuple = NULL;
> }
> break;
> + case REORDER_BUFFER_CHANGE_ZINSERT:

This really needs to be indistinguishable from normal CHANGE_INSERT...
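
A minimal sketch of what that could mean, under invented assumptions:
the decode routines differ per storage method because the WAL records
differ, but both queue the same change kind. RbChange, HeapInsertRec,
ZHeapInsertRec and the rb_* functions are hypothetical and far simpler
than the real reorder buffer structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical change kinds: no storage-specific variants. */
typedef enum RbChangeKind
{
    RB_CHANGE_INSERT,
    RB_CHANGE_UPDATE,
    RB_CHANGE_DELETE
} RbChangeKind;

/* Hypothetical queued change, reduced to a single integer column. */
typedef struct RbChange
{
    RbChangeKind kind;
    uint32_t     relid;
    int64_t      new_value;
} RbChange;

/* Hypothetical WAL record shapes; they differ on purpose. */
typedef struct HeapInsertRec
{
    uint32_t relid;
    int64_t  value;
} HeapInsertRec;

typedef struct ZHeapInsertRec
{
    uint32_t relid;
    uint32_t hi;                    /* upper 32 bits of the value */
    uint32_t lo;                    /* lower 32 bits of the value */
} ZHeapInsertRec;

/* Storage-specific decoding, storage-neutral result. */
void
rb_decode_heap_insert(const HeapInsertRec *rec, RbChange *out)
{
    out->kind = RB_CHANGE_INSERT;
    out->relid = rec->relid;
    out->new_value = rec->value;
}

void
rb_decode_zheap_insert(const ZHeapInsertRec *rec, RbChange *out)
{
    out->kind = RB_CHANGE_INSERT;   /* same kind as the heap path */
    out->relid = rec->relid;
    out->new_value = (int64_t) (((uint64_t) rec->hi << 32) | rec->lo);
}

/* Consumer: one case per action, none per storage method. */
int
rb_count_inserts(const RbChange *changes, int nchanges)
{
    int     count = 0;

    for (int i = 0; i < nchanges; i++)
    {
        switch (changes[i].kind)
        {
            case RB_CHANGE_INSERT:
                count++;
                break;
            case RB_CHANGE_UPDATE:
            case RB_CHANGE_DELETE:
                break;
        }
    }
    return count;
}
```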

Greetings,

Andres Freund
