RE: Add null termination to string received in parallel apply worker

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Add null termination to string received in parallel apply worker
Date: 2023-10-12 03:43:33
Message-ID: OS0PR01MB571687D3D8F4A70671717FE894D3A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thursday, October 12, 2023 12:04 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

Hi,

>
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > I was thinking about this when skimming the other StringInfo thread a
> > couple of days ago. I wondered if it wouldn't be more convenient to
> > change the convention that all StringInfos are null-terminated: what
> > is really the reason to have them all be like that?
>
> It makes sense for StringInfos containing text, not because the stringinfo.c
> routines need it but because callers inspecting the string will very likely do
> something that expects nul-termination.
> When the StringInfo contains binary data, that argument has little force of
> course.
>
> I could see extending the convention for caller-supplied buffers (as is under
> discussion in the other thread) to say that the caller needn't provide a
> nul-terminated buffer if it is confident that no reader of the StringInfo will need
> that. I'd be even more inclined than before to tie this to a specification that
> such a StringInfo is read-only, though.
>
> In any case, this does not immediately let us jump to the conclusion that it'd be
> safe to use such a convention in apply workers. Aren't the things being passed
> around here usually text strings?

I think the data passed to parallel apply worker is of mixed types. If we see
the data reading logic for it like logicalrep_read_attrs(), it uses both
pq_getmsgint/pq_getmsgbyte/pq_getmsgint(binary) and pq_getmsgstring(text).

> Do you really want to promise that no reader is depending on nul-termination?

I think we could not make an absolute guarantee in this regard, but currently
all the consumer uses pq_getxxx style functions to read the passed data and it
also takes care to read the text stuff(get the length separately e.g.
logicalrep_read_tuple). So it seems OK to release the rule for it.

OTOH, I am not opposed to keeping the rule intact for the apply worker, just to
share the information and to gather opinions from others.

Best Regards,
Hou zj

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2023-10-12 03:48:51 Re: Synchronizing slots from primary to standby
Previous Message Merlin Moncure 2023-10-12 03:05:50 Memory knob testing (was Re: Let's make PostgreSQL multi-threaded)