Re: Binary support for pgoutput plugin

From: Dave Cramer <davecramer(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Binary support for pgoutput plugin
Date: 2020-03-07 23:18:38
Message-ID: CADK3HH+BeJXB1_cS1CioMiCv8XLRmdwvjWXPMChLid9M9wTzzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 6 Mar 2020 at 08:54, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:

> Hi Dave,
>
> On 29/02/2020 18:44, Dave Cramer wrote:
> >
> >
> > rebased and removed the catversion bump.
>
> Looked into this and it generally seems okay, but I do have one gripe here:
>
> > + tuple->values[i].data = palloc(len
> + 1);
> > + /* and data */
> > +
> > + pq_copymsgbytes(in,
> tuple->values[i].data, len);
> > + tuple->values[i].len = len;
> > + tuple->values[i].cursor = 0;
> > + tuple->values[i].maxlen = len;
> > + /* not strictly necessary but the
> docs say it is required */
> > + tuple->values[i].data[len] = '\0';
> > + break;
> > + }
> > + case 't': /* text formatted
> value */
> > + {
> > + tuple->changed[i] = true;
> > + int len = pq_getmsgint(in, 4); /*
> read length */
> >
> > /* and data */
> > - tuple->values[i] = palloc(len + 1);
> > - pq_copymsgbytes(in,
> tuple->values[i], len);
> > - tuple->values[i][len] = '\0';
> > + tuple->values[i].data = palloc(len
> + 1);
> > + pq_copymsgbytes(in,
> tuple->values[i].data, len);
> > + tuple->values[i].data[len] = '\0';
> > + tuple->values[i].len = len;
>
> The cursor should be set to 0 in the text formatted case too if this is
> how we chose to encode data.
>
> However I am not quite convinced I like the StringInfoData usage here.
> Why not just change the struct to include additional array of lengths
> rather than replacing the existing values array with StringInfoData
> array, that seems generally both simpler and should have smaller memory
> footprint too, no?
>

Can you explain this a bit more? I don't really see a huge difference in
memory usage.
We still need length and the data copied into LogicalRepTupleData when
sending the data in binary, no?

>
> We could also merge the binary and changed arrays into single char array
> named something like format (as data can be either unchanged, binary or
> text) and just reuse same identifiers we have in protocol.
>

This seems like a good idea.

Dave Cramer

>
> --
> Petr Jelinek
> 2ndQuadrant - PostgreSQL Solutions for the Enterprise
> https://www.2ndQuadrant.com/
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2020-03-07 23:24:56 Re: What's difference among insert/write/flush lsn?
Previous Message James Coleman 2020-03-07 22:47:00 Re: [PATCH] Incremental sort (was: PoC: Partial sort)