Re: Why does backend send buffer size hardcoded at 8KB?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Artemiy Ryabinkov <getlag(at)ya(dot)ru>, pgsql-general(at)postgresql(dot)org
Subject: Re: Why does backend send buffer size hardcoded at 8KB?
Date: 2019-07-27 22:58:54
Message-ID: 20190727225854.w2f2e5n6lepqbbui@alap3.anarazel.de
Lists: pgsql-general

Hi,

On 2019-07-27 18:34:50 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > It might be better to just use larger send sizes however. I think most
> > kernels are going to be better than us knowing how to chop up the send
> > size.

> Yeah. The existing commentary about that is basically justifying 8K
> as being large enough to avoid performance issues; if somebody can
> show that that's not true, I wouldn't have any hesitation about
> kicking it up.

You think that unnecessary fragmentation, which I did show, isn't good
enough? That does have cost on the network level, even if it possibly
doesn't show up that much in timing.

I wonder if we ought to just query SO_SNDBUF/SO_RCVBUF or such, and use
those (although that's not quite perfect, because there's some added
overhead before data ends up in SNDBUF). Probably with some clamping, to
defend against a crazy sysadmin setting it extremely high.
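
A minimal sketch of the query-and-clamp idea, assuming a POSIX socket. The names clamp_size() and choose_send_buffer_size() and the bounds SEND_BUF_MIN/SEND_BUF_MAX are hypothetical and not taken from the PostgreSQL source; only getsockopt() with SO_SNDBUF is a real API here.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical bounds for illustration; not values used by PostgreSQL. */
#define SEND_BUF_MIN ((size_t) 8192)
#define SEND_BUF_MAX ((size_t) 1024 * 1024)

/* Clamp value into [lo, hi]; negative values end up at lo. */
static size_t
clamp_size(long value, size_t lo, size_t hi)
{
    assert(lo <= hi);
    if (value < (long) lo)
        return lo;
    if (value > (long) hi)
        return hi;
    return (size_t) value;
}

/*
 * Ask the kernel for the socket's SO_SNDBUF and clamp the answer, so an
 * extreme system-wide setting cannot produce a huge userspace buffer.
 * Falls back to SEND_BUF_MIN if the query fails.
 */
static size_t
choose_send_buffer_size(int sock)
{
    int       sndbuf = 0;
    socklen_t len = sizeof(sndbuf);

    if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len) != 0)
        return SEND_BUF_MIN;
    return clamp_size(sndbuf, SEND_BUF_MIN, SEND_BUF_MAX);
}
```

The result would only be a starting point for sizing the userspace buffer, since the value the kernel reports includes its own bookkeeping overhead.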

Additionally we perhaps ought to just not use the send buffer when
internal_putbytes() is called with more data than can fit in the
buffer. We should fill it with as much data as fits in it (so the
pending data like the message header, or smaller previous messages, are
flushed out in the largest size), and then just call secure_write()
directly on the rest. It's not free to memcpy all that data around, when
we already have a buffer.
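
Roughly, the bypass could look like the sketch below. It is not the actual internal_putbytes() code: SendBuf, sendbuf_put(), and the write_all callback are hypothetical stand-ins, and the callback is assumed to write everything or fail (the real secure_write() can return short writes). One further assumption: a remainder smaller than the buffer is buffered rather than written directly, to avoid tiny writes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEND_BUF_SIZE 8192

/* Hypothetical stand-in for secure_write(): returns 0 once all bytes are sent. */
typedef int (*write_fn)(const void *data, size_t len);

typedef struct SendBuf
{
    char     data[SEND_BUF_SIZE];
    size_t   used;          /* bytes currently pending in data[] */
    write_fn write_all;
} SendBuf;

/* Recording writer, used only to observe the sizes of the writes issued. */
static size_t write_sizes[16];
static int    n_writes = 0;

static int
record_write(const void *data, size_t len)
{
    (void) data;
    if (n_writes < 16)
        write_sizes[n_writes] = len;
    n_writes++;
    return 0;
}

static int
sendbuf_put(SendBuf *b, const char *src, size_t len)
{
    size_t room;

    assert(b->used <= SEND_BUF_SIZE);
    room = SEND_BUF_SIZE - b->used;

    if (len <= room)
    {
        /* Fits: plain buffered path. */
        memcpy(b->data + b->used, src, len);
        b->used += len;
        return 0;
    }

    if (b->used > 0)
    {
        /* Top up the buffer so pending data leaves in one full-size write. */
        memcpy(b->data + b->used, src, room);
        src += room;
        len -= room;
        if (b->write_all(b->data, SEND_BUF_SIZE) != 0)
            return -1;
        b->used = 0;
    }

    /* Large remainder: hand the caller's memory straight to the writer. */
    if (len >= SEND_BUF_SIZE)
        return b->write_all(src, len);

    /* Small remainder: buffer it (an assumption of this sketch). */
    memcpy(b->data, src, len);
    b->used = len;
    return 0;
}
```

With a 5-byte header pending and a 20000-byte payload, this issues one 8192-byte write from the buffer and one 11813-byte write directly from the caller's memory, so only 8187 payload bytes are copied.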

> (Might be worth malloc'ing it rather than having it as part of the
> static process image if we do so, but that's a trivial change.)

We already do for the send buffer, because we repalloc it in
socket_putmessage_noblock(). Oddly enough we never reduce its size
after that...

While the receive side is statically allocated, I don't think it ends up
in the process image as-is - as the contents aren't initialized, it ends
up in .bss.

Greetings,

Andres Freund
