Re: Binary support for pgoutput plugin

From: Dave Cramer <davecramer(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Binary support for pgoutput plugin
Date: 2020-04-07 19:45:57
Message-ID: CADK3HHKXjajsTDiEEaP_TiJtRbfQxkL0tSOHWF+OP308AP4apQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 3 Apr 2020 at 16:44, Dave Cramer <davecramer(at)gmail(dot)com> wrote:

>
>
> On Fri, 3 Apr 2020 at 03:43, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>
>> Hi,
>>
>> On 08/03/2020 00:18, Dave Cramer wrote:
>> > On Fri, 6 Mar 2020 at 08:54, Petr Jelinek <petr(at)2ndquadrant(dot)com
>> > <mailto:petr(at)2ndquadrant(dot)com>> wrote:
>> >
>> > Hi Dave,
>> >
>> > On 29/02/2020 18:44, Dave Cramer wrote:
>> > >
>> > >
>> > > rebased and removed the catversion bump.
>> >
>> > Looked into this and it generally seems okay, but I do have one
>> > gripe here:
>> >
>> > > + tuple->values[i].data =
>> > palloc(len + 1);
>> > > + /* and data */
>> > > +
>> > > + pq_copymsgbytes(in,
>> > tuple->values[i].data, len);
>> > > + tuple->values[i].len = len;
>> > > + tuple->values[i].cursor =
>> 0;
>> > > + tuple->values[i].maxlen =
>> len;
>> > > + /* not strictly necessary
>> > but the docs say it is required */
>> > > + tuple->values[i].data[len]
>> > = '\0';
>> > > + break;
>> > > + }
>> > > + case 't': /* text
>> > formatted value */
>> > > + {
>> > > + tuple->changed[i] = true;
>> > > + int len = pq_getmsgint(in,
>> > 4); /* read length */
>> > >
>> > > /* and data */
>> > > - tuple->values[i] =
>> > palloc(len + 1);
>> > > - pq_copymsgbytes(in,
>> > tuple->values[i], len);
>> > > - tuple->values[i][len] =
>> '\0';
>> > > + tuple->values[i].data =
>> > palloc(len + 1);
>> > > + pq_copymsgbytes(in,
>> > tuple->values[i].data, len);
>> > > + tuple->values[i].data[len]
>> > = '\0';
>> > > + tuple->values[i].len = len;
>> >
>> > The cursor should be set to 0 in the text formatted case too if
>> this is
>> > how we chose to encode data.
>> >
>> > However I am not quite convinced I like the StringInfoData usage
>> here.
>> > Why not just change the struct to include additional array of
>> lengths
>> > rather than replacing the existing values array with StringInfoData
>> > array, that seems generally both simpler and should have smaller
>> memory
>> > footprint too, no?
>> >
>> >
>> > Can you explain this a bit more? I don't really see a huge difference
>> in
>> > memory usage.
>> > We still need length and the data copied into LogicalRepTupleData when
>> > sending the data in binary, no?
>> >
>>
>> Yes but we don't need to have fixed sized array of 1600 elements that
>> contains maxlen and cursor positions of the StringInfoData struct which
>> we don't use for anything AFAICS.
>>
>
New patch that fixes a number of errors in the check for validity as well
as reduces the memory usage by
dynamically allocating the data changed as well as collapsing the changed
and binary arrays into a format array.

Dave Cramer

>

Attachment Content-Type Size
0001-First-pass-at-working-code-without-subscription-opti.patch application/octet-stream 21.4 KB
0002-add-binary-column-to-pg_subscription.patch application/octet-stream 11.5 KB
0003-document-new-binary-option-for-CREATE-SUBSCRIPTION.patch application/octet-stream 2.9 KB
0005-Remove-C99-declaration-and-code.patch application/octet-stream 1.3 KB
0004-get-relid-inside-of-logical_read_insert.patch application/octet-stream 2.5 KB
0006-make-sure-the-subcription-is-marked-as-binary.patch application/octet-stream 829 bytes
0007-check-that-the-subscriber-is-compatible-with-the-pub.patch application/octet-stream 7.6 KB
0008-Changed-binary-and-changed-to-format-and-use-existin.patch application/octet-stream 16.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Anna Akenteva 2020-04-07 19:58:01 Re: [HACKERS] make async slave to wait for lsn to be replayed
Previous Message Andres Freund 2020-04-07 19:43:27 Re: Improving connection scalability: GetSnapshotData()