PoC patch: expose TCP socket stats for walsenders

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: PoC patch: expose TCP socket stats for walsenders
Date: 2020-09-30 08:23:02
Message-ID: CAMsr+YFG0yRmpb-2sZ_ZwfoY9ZOvkok7OQ0wMmE12YY3Tmy5rA@mail.gmail.com
Lists: pgsql-hackers

Hi all

I've attached a PoC patch demonstrating how to use the Linux ioctls
SIOCINQ and SIOCOUTQ and the Linux getsockopt option TCP_INFO to expose a
lot of useful network socket info directly in system views.

Sample output from pg_stat_replication, \x format, trimmed, from a
test run where I deliberately stopped the walreceiver (SIGSTOP) to
cause the send buffer to fill up:

pid | 391966
...
client_addr | 127.0.0.1
...
sync_state | async
reply_time | 2020-09-30 16:13:59.179852+08
...
sock_tx_bufsz | 4194304
sock_tx_bufcontentsz | 4124021
sock_rx_bufsz | 3537513
sock_rx_bufcontentsz | 0
sock_tx_windowsz | 6912
...
sock_rtt | 13655
sock_rtt_variance | 19783
sock_recv_rtt | 204761
sock_packets_lost | 0
sock_packets_retransmitted | 0

The kernel can tell us even more than this if we use linux/tcp.h
instead of netinet/tcp.h - the full-size struct tcp_info can report
things like the delivery rate, how long we spent waiting for receive
window space, send buffer space, totals sent/received, etc.

So we can see, right there in Pg views, whether a walsender /
walreceiver / logical worker is limited by the remote side, local
side, or the connection itself, and some strong indications as to why.
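
As a sketch of what that could look like with the full-size struct from
linux/tcp.h: the code below reads tcpi_busy_time, tcpi_rwnd_limited and
tcpi_sndbuf_limited (all in microseconds) and applies a crude "more than
half the busy time" rule. The SendLimit enum, both function names and the
threshold are hypothetical and only for illustration; none of this is in
the PoC. It assumes kernel headers new enough to declare those fields.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <linux/tcp.h>    /* full struct tcp_info; don't mix with netinet/tcp.h */

typedef enum SendLimit
{
    LIMIT_OTHER,          /* neither counter dominates */
    LIMIT_PEER_RWND,      /* mostly waiting for the peer's receive window */
    LIMIT_LOCAL_SNDBUF    /* mostly waiting for local send buffer space */
} SendLimit;

/* Hypothetical heuristic: blame whichever wait exceeds half the busy time. */
static SendLimit
classify_send_limit(unsigned long long busy_us,
                    unsigned long long rwnd_limited_us,
                    unsigned long long sndbuf_limited_us)
{
    if (busy_us == 0)
        return LIMIT_OTHER;
    if (rwnd_limited_us >= sndbuf_limited_us && rwnd_limited_us * 2 > busy_us)
        return LIMIT_PEER_RWND;
    if (sndbuf_limited_us * 2 > busy_us)
        return LIMIT_LOCAL_SNDBUF;
    return LIMIT_OTHER;
}

/* Returns 0 and sets *limit, or -1 if the call fails or the kernel is too old. */
static int
get_send_limit(int fd, SendLimit *limit)
{
    struct tcp_info ti;
    socklen_t   len = sizeof(ti);

    memset(&ti, 0, sizeof(ti));
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) < 0)
        return -1;

    /* The kernel reports how much it filled in; older ones stop short. */
    if (len < offsetof(struct tcp_info, tcpi_sndbuf_limited) +
        sizeof(ti.tcpi_sndbuf_limited))
        return -1;

    *limit = classify_send_limit(ti.tcpi_busy_time,
                                 ti.tcpi_rwnd_limited,
                                 ti.tcpi_sndbuf_limited);
    return 0;
}
```

The length check matters because getsockopt() silently truncates the
struct to whatever the running kernel knows about, so a field can be
declared by the headers yet never written.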

I think we're pretty clearly going to want this in some form,
especially given the increasing adoption of sealed systems where you
can't expect the user to track down which socket is associated with
which pg worker and then run some `ss` commands themselves.

However, the demo implementation isn't especially pretty. It bloats
struct WalSnd, and it grabs the socket info frequently whether or not
it is required. I can't see that being acceptable for real world use.

I'm aiming to clean this up into something submittable, and something
we can apply to the walreceiver and logical receiver workers too. That
will require some design changes since I can't see anyone being happy
about bloating shmem with these big stats structs or collecting them
constantly just in case.

I'd welcome design input. Especially because this seems like a case of
a more general problem - how can we ask backends to self-report
various diagnostic state or instrument their internals efficiently,
and only when requested, without having to rely on the logfile?

The obvious choice would seem to be to use shm_mq to make requests and
send replies, but there are a few issues with that:

* A shm_mq requires a lot of space. sizeof(struct shm_mq) = 56 on my
x64 system. That's not easy to justify adding to struct WalSnd, struct
WalRcv, struct LogicalRepWorker, etc.
* A shm_mq isn't well suited to a listen/respond model where the queue
is attached to, used to exchange a couple of messages, detached,
reset, and ready for re-use by the next request. From experience I
know it's not particularly easy to get that approach right.

I'm thinking some kind of request/response interface will be needed.
Something like setting a shmem variable to a dsa_pointer with storage
for the requested stats and the latch of the requesting process to
notify when it's stored. That way when stats aren't being collected no
shmem is wasted and no syscalls are being made to collect data that's
thrown away or overwritten.
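
To make the shape of that concrete, here is a toy single-process model of
the handshake. It deliberately uses plain C11 atomics instead of
PostgreSQL's shmem, dsa_pointer and latch APIs: the atomic pointer stands
in for the dsa_pointer, and the done flag stands in for setting the
requester's latch. Every name in it (StatsRequest, WorkerSlot,
post_stats_request, worker_check_stats_request, fake_collect) is
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Owned by the requester: reply storage plus a completion flag. */
typedef struct StatsRequest
{
    int         tx_queued;   /* reply payload, filled in by the worker */
    atomic_bool done;        /* stand-in for SetLatch() on the requester */
} StatsRequest;

/* The only per-worker cost in "shared memory": one pointer, NULL when idle. */
typedef struct WorkerSlot
{
    _Atomic(StatsRequest *) pending;
} WorkerSlot;

/* Stand-in for the real ioctl()/getsockopt() work; counts its calls. */
static int  fake_collect_calls = 0;

static int
fake_collect(void)
{
    fake_collect_calls++;
    return 4124021;
}

/* Requester side: publish a request unless one is already pending. */
static bool
post_stats_request(WorkerSlot *slot, StatsRequest *req)
{
    StatsRequest *expected = NULL;

    atomic_init(&req->done, false);
    return atomic_compare_exchange_strong(&slot->pending, &expected, req);
}

/*
 * Worker side: called from the main loop. Does nothing, and makes no
 * syscalls, unless somebody has asked for stats.
 */
static bool
worker_check_stats_request(WorkerSlot *slot, int (*collect) (void))
{
    StatsRequest *req = atomic_exchange(&slot->pending, NULL);

    if (req == NULL)
        return false;

    req->tx_queued = collect();
    atomic_store(&req->done, true);   /* "wake" the requester */
    return true;
}
```

The requester would post a request, wait on its latch, then read the
reply once done is set. A real version would also need to handle the
requester going away before the worker replies, which this toy ignores.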

The downside is that stats couldn't be shown in pg_stat_replication
etc. There would need to be a separate function. And sometimes a
backend could take a while to re-enter its mainloop or other locations
where it checks for stats requests, so it might take a while to get
results.

But that way we could easily run a bgworker that periodically requests
stats and writes them to an unlogged relation for inspection. Or have
monitoring tools poll it, that sort of thing.

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise

Attachment Content-Type Size
0001-Add-libpq_be-function-to-report-bytes-pending-send-i.patch text/x-patch 5.2 KB
0002-PoC-to-expose-TCP_INFO-for-walsender-socket-in-pg_st.patch text/x-patch 21.6 KB
