Re: PROXY protocol support

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Jacob Champion <pchampion(at)vmware(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: PROXY protocol support
Date: 2021-03-06 15:17:18
Message-ID: CABUevEygBkZWyw0qYt8pMZ4jp3DOjMYKoj-68T_qFjeYqoi+6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 5, 2021 at 8:11 PM Jacob Champion <pchampion(at)vmware(dot)com> wrote:
>
> On Fri, 2021-03-05 at 10:22 +0100, Magnus Hagander wrote:
> > On Fri, Mar 5, 2021 at 12:21 AM Jacob Champion <pchampion(at)vmware(dot)com> wrote:
> > > The original-host logging isn't working for me:
> > >
> > > [...]
> >
> > That's interesting -- it works perfectly fine here. What platform are
> > you testing on?
>
> Ubuntu 20.04.

Curious. It doesn't show up on my debian.

But either way -- it was clearly wrong :)

> > (I sent for sizeof(SockAddr) to make it
> > easier to read without having to look things up, but the net result is
> > the same)
>
> Cool. Did you mean to attach a patch?

I didn't, I had some other hacks that were broken :) I've attached one
now which includes those changes.

> == More Notes ==
>
> (Stop me if I'm digging too far into a proof of concept patch.)

Definitely not -- much appreciated, and just what was needed to take
it from poc to a proper one!

> > + proxyaddrlen = pg_ntoh16(proxyheader.len);
> > +
> > + if (proxyaddrlen > sizeof(proxyaddr))
> > + {
> > + ereport(COMMERROR,
> > + (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > + errmsg("oversized proxy packet")));
> > + return STATUS_ERROR;
> > + }
>
> I think this is not quite right -- if there's additional data beyond
> the IPv6 header size, that just means there are TLVs tacked onto the
> header that we should ignore. (Or, eventually, use.)

Yeah, you're right. Fallout of too much moving around. I think inthe
end that code should just be removed, in favor of the discard path as
you mentinoed below.

> Additionally, we need to check for underflow as well. A misbehaving
> proxy might not send enough data to fill up the address block for the
> address family in use.

I used to have that check. I seem to have lost it in restructuring. Added back!

> > + /* If there is any more header data present, skip past it */
> > + if (proxyaddrlen > sizeof(proxyaddr))
> > + pq_discardbytes(proxyaddrlen - sizeof(proxyaddr));
>
> This looks like dead code, given that we'll error out for the same
> check above -- but once it's no longer dead code, the return value of
> pq_discardbytes should be checked for EOF.

Yup.

> > + else if (proxyheader.fam == 0x11)
> > + {
> > + /* TCPv4 */
> > + port->raddr.addr.ss_family = AF_INET;
> > + port->raddr.salen = sizeof(struct sockaddr_in);
> > + ((struct sockaddr_in *) &port->raddr.addr)->sin_addr.s_addr = proxyaddr.ip4.src_addr;
> > + ((struct sockaddr_in *) &port->raddr.addr)->sin_port = proxyaddr.ip4.src_port;
> > + }
>
> I'm trying to reason through the fallout of setting raddr and not
> laddr. I understand why we're not setting laddr -- several places in
> the code rely on the laddr to actually refer to a machine-local address
> -- but the fact that there is no actual connection from raddr to laddr
> could cause shenanigans. For example, the ident auth protocol will just
> break (and it might be nice to explicitly disable it for PROXY
> connections). Are there any other situations where a "faked" raddr
> could throw off Postgres internals?

That's a good point to discuss. I thought about it initially and
figured it'd be even worse to actually copy over laddr since that
woudl then suddenly have the IP address belonging to a different
machine.. And then I forgot to enumerate the other cases.

For ident, disabling the method seems reasonable.

Another thing that shows up with added support for running the proxy
protocol over Unix sockets, is that PostgreSQL refuses to do SSL over
Unix sockets. So that check has to be updated to allow it over proxy
connections. Same for GSSAPI.

An interesting thing is what to do about
inet_server_addr/inet_server_port. That sort of loops back up to the
original question of where/how to expose the information about the
proxy in general (since right now it just logs). Right now you can
actually use inet_server_port() to see if the connection was proxied
(as long as it was over tcp).

Attached is an updated, which covers your comments, as well as adds
unix socket support (per your question and Alvaros confirmed usecase).
It allows proxy connections over unix sockets, but I saw no need to
get into unix sockets over the proxy protocol (dealing with paths
between machines etc).

I changed the additional ListenSocket array to instead declare
ListenSocket as an array of structs holding two fields. Seems cleaner,
and especially should there be further extensions needed in the
future.

I've also added some trivial tests (man that took an ungodly amount of
fighting perl -- it's clearly been a long time since I used perl
properly). They probably need some more love but it's a start.

And of course rebased.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Attachment Content-Type Size
proxy_protocol_3.patch text/x-patch 35.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-03-06 15:28:46 Re: is cfbot's apply aging intentional?
Previous Message Tom Lane 2021-03-06 15:08:55 Re: Inquiries about PostgreSQL's system catalog development——from a student developer of Nanjing University