Aw: Re: Minor documentation error regarding streaming replication protocol

From: Brar Piening <Brar(at)gmx(dot)de>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Aw: Re: Minor documentation error regarding streaming replication protocol
Date: 2020-10-15 06:27:51
Message-ID: trinity-b5c0c9ec-957a-42f8-9f58-cb01cd00dd39-1602743271839@3c-app-gmx-bs50
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> Good point. The reporter was assuming the data would come to the client
>> in the bytea output format specified by the GUC, e.g. \x..., so that
>> doesn't happen either. As I said before, it is more raw bytes, but we
>> don't have an SQL data type for that.

> I did some more research on this. It turns out timeline 'content' is
> the only BYTEA listed in the protocol docs, even though it just passes C
> strings to pq_sendbytes(), just like many other fields like the field
> above it, the timeline history filename. The proper fix is to change
> the code to return the timeline history file contents as TEXT instead of
> BYTEA.

In the light of what Michael wrote above, I don't think that this is really enough.

If the timeline history file can contain strings which "may not be made just of ASCII characters", declaring the content as TEXT would probably make the client side assume that it is being sent as text encoded in client_encoding, which isn't true either.
In the worst case this could lead to nasty decoding bugs on the client side which could even result in security issues.
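
To illustrate the kind of decoding bug meant here, a minimal Python sketch follows. The history-file line, the restore point name and the helper decode_as_client_encoding() are all made up for illustration; the sketch only assumes that the name was written in LATIN1 while the client expects UTF8.

```python
# Hypothetical timeline history line. The restore point name "café" is
# assumed to have been written in LATIN1, so 'é' is the single byte 0xE9.
raw = b'1\t0/3000000\tat restore point "caf\xe9"\n'


def decode_as_client_encoding(data: bytes, client_encoding: str = "utf-8") -> str:
    """What a client would do if the field were declared as TEXT."""
    return data.decode(client_encoding)


# Strict decoding fails, because 0xE9 followed by '"' is not valid UTF-8.
try:
    decode_as_client_encoding(raw)
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding "works" but silently replaces the byte with U+FFFD.
lossy = raw.decode("utf-8", errors="replace")

# Treating the field as raw bytes keeps the content intact.
intact = bytes(raw)
```

Which encoding is the right one cannot be known from the file itself, which is the reason raw bytes are the safer interpretation.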

Since you probably can't tell in which encoding the aforementioned "recovery target name" was written to the timeline history file, I agree with Michael that BYTEA is probably the sanest way to send this file.

IMO the best way out of this is either to really encode the content as BYTEA by passing it through byteaout(), thereby escaping bytes <0x20 and >0x7e, or to document that the file is being sent "as raw bytes that can be read as 'bytea Escape Format' by parsers compatible with byteain()". The latter works because byteain() doesn't check whether bytes <0x20 or >0x7e are actually escaped.
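
The two options can be sketched in Python as follows. This is not the server code: bytea_escape_out() and bytea_escape_in() are hypothetical names that only mimic the escape-format rules as I understand them (backslash doubled, bytes <0x20 and >0x7e written as three-digit octal). The hex format (a leading \x) is not handled, and raw content that itself contains backslashes would not pass through the lenient reader unchanged.

```python
def bytea_escape_out(data: bytes) -> str:
    """Sketch of the escape output format: escape '\\' and non-printable bytes."""
    out = []
    for b in data:
        if b == 0x5C:                      # backslash is doubled
            out.append("\\\\")
        elif b < 0x20 or b > 0x7E:         # non-printable: \ooo (octal)
            out.append("\\%03o" % b)
        else:
            out.append(chr(b))
    return "".join(out)


def bytea_escape_in(data: bytes) -> bytes:
    """Sketch of a lenient escape-format reader: unescaped bytes pass through."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b != 0x5C:
            out.append(b)                  # taken as-is, even if <0x20 or >0x7e
            i += 1
        elif (i + 3 < n and data[i + 1] in b"0123"
              and data[i + 2] in b"01234567" and data[i + 3] in b"01234567"):
            out.append(int(data[i + 1:i + 4].decode("ascii"), 8))
            i += 4
        elif i + 1 < n and data[i + 1] == 0x5C:
            out.append(0x5C)               # doubled backslash
            i += 2
        else:
            raise ValueError("invalid input syntax for escape format")
    return bytes(out)


raw = b'at restore point "caf\xe9"'        # hypothetical content, LATIN1 name
escaped = bytea_escape_out(raw)            # option 1: really escape it
```

With these helpers, bytea_escape_in(escaped.encode("ascii")) and bytea_escape_in(raw) both return the original bytes, which is the property the second option relies on.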

Again, reading the raw bytes, either via byteain() or just as raw bytes, isn't really a problem and I don't want to bring you into a situation where the cure is worse than the disease.
