Re: pg_basebackup: Always return valid temporary slot names

From: Jelte Fennema <me(at)jeltef(dot)nl>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: Nishant Sharma <nishant(dot)sharma(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
Subject: Re: pg_basebackup: Always return valid temporary slot names
Date: 2023-09-05 10:21:46
Message-ID: CAGECzQTYPWwNNRnXi5f5AH6tk9VqXVTQG3BPE0nbPEgYoTHvTQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 5 Sept 2023 at 11:39, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>
> > On 5 Sep 2023, at 09:09, Nishant Sharma <nishant(dot)sharma(at)enterprisedb(dot)com> wrote:
>
> > With this, I was thinking, isn't this a problem of pgbouncer filling
> > be_pid with random bits? Maybe it should have filled the be_pid
> > with a random positive integer instead of any integer as it
> > represents a pid?
>
> I'm inclined to agree that anyone sending a value which is supposed to
> represent a PID should be expected to send a value which corresponds to the
> format of a PID.

When there is a pooler in the middle it already isn't a PID anyway. I
took a look at a few other connection poolers and all the ones I
looked at (Odyssey and pgcat) do the same: They put random bytes in
the be_pid field (and thus can result in negative values). This normally
does not cause any problems, because the be_pid value is simply sent
back verbatim to the server when canceling a query, which is it's main
purpose according to the docs:

> This message provides secret-key data that the frontend must save if it wants to be able to issue cancel requests later.

Source: https://www.postgresql.org/docs/current/protocol-flow.html#id-1.10.6.7.3

For that purpose it's actually more secure to use all bits for random
data, instead of keeping one bit always 0.

Its main other purpose that I know if is displaying it in a psql
prompt, so you know where to attach a debugger. This is completely
broken either way as soon as you have a connection pooler in the
middle, because you would want to display the Postgres backend PID
instead of the random ID that the connection pooler sends back. So if
it's negative that's no issue (it displays fine and it's useless
either way).

So, while I agree that putting a negative value in the process ID field of
BackendData, is arguably incorrect. Given the simplicity of the fix on
the pg_basebackup side, I think addressing it in pg_basebackup is
easier than fixing this in all connection poolers.

Sidenote: When PgBouncer is run in peering mode it actually uses the
first two bytes of the PID to encode the peer_id into it. That way it
knows to which peer it should forward the cancellation message. Thus
fixing this in PgBouncer would require using other bytes for that.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nazir Bilal Yavuz 2023-09-05 10:25:29 Remove unnecessary 'always:' from CompilerWarnings task
Previous Message tender wang 2023-09-05 10:10:32 Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()