Re: Why does backend send buffer size hardcoded at 8KB?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Artemiy Ryabinkov <getlag(at)ya(dot)ru>, pgsql-general(at)postgresql(dot)org
Subject: Re: Why does backend send buffer size hardcoded at 8KB?
Date: 2019-07-27 21:08:50
Message-ID: 20190727210850.iierdj5fed7jyox7@alap3.anarazel.de
Lists: pgsql-general

Hi,

On 2019-07-27 11:09:06 -0400, Tom Lane wrote:
> Artemiy Ryabinkov <getlag(at)ya(dot)ru> writes:
> > Does it make sense to make this parameter configurable?
>
> Not without some proof that it makes a performance difference on
> common setups (which you've not provided).

I think the fact that we unnecessarily fragment into smaller packets every
time we send a full 8kB buffer, unless there's already network congestion,
is kind of evidence enough? The combination of a relatively small send
buffer + TCP_NODELAY isn't great.
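
To put rough numbers on the fragmentation, here is a back-of-the-envelope
sketch. It assumes an MSS of 1448 bytes (1500-byte Ethernet MTU with TCP
timestamps), that each flush is a single send() of the full buffer, and that
with TCP_NODELAY the kernel transmits each send() right away without
coalescing its tail with the next one. These are assumptions for
illustration, not measurements; segmentation offload or an already non-empty
send queue would change the picture. The function names are made up for this
example.

```python
import math

MSS = 1448  # assumed: 1500-byte MTU minus IP/TCP headers and TCP timestamps


def segments_for_flush(buf_size, mss=MSS):
    """Return (segment count, size of last segment) for one send() of buf_size bytes."""
    full, tail = divmod(buf_size, mss)
    count = full + (1 if tail else 0)
    last = tail if tail else mss
    return count, last


def segments_for_payload(total, buf_size, mss=MSS):
    """Segments for a payload sent as independent, buffer-sized flushes."""
    full_flushes, rest = divmod(total, buf_size)
    count = full_flushes * segments_for_flush(buf_size, mss)[0]
    if rest:
        count += segments_for_flush(rest, mss)[0]
    return count


if __name__ == "__main__":
    total = 1024 * 1024  # 1 MiB of result data
    print(segments_for_flush(8192))            # one 8kB flush
    print(segments_for_payload(total, 8192))   # 1 MiB in 8kB flushes
    print(math.ceil(total / MSS))              # 1 MiB as one continuous stream
```

Under these assumptions an 8192-byte flush goes out as five full segments
plus one 952-byte segment, and 1 MiB takes 768 segments instead of the 725 a
continuous stream would need.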

I'm not quite sure what the smaller buffer is supposed to achieve, at
least these days. In blocking mode (emulated in PG code, using latches,
so we can accept interrupts) we'll always just loop back to another
send() in internal_flush(). In non-blocking mode, we'll fall out of the
loop as soon as the kernel didn't send any data. Isn't the outcome of
using such a small send buffer that we end up performing a) more
syscalls, which has gotten a lot worse in the last two years due to all the
CPU vulnerability mitigations making syscalls a *lot* more expensive b)
unnecessary fragmentation?
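
The two behaviours described above can be sketched roughly as follows. This
is a simplified model, not the actual internal_flush() code: FakeSocket,
flush_blocking, flush_nonblocking and wait_writable are hypothetical names,
and the fake socket only imitates a kernel buffer that accepts a limited
number of bytes before it would block.

```python
class FakeSocket:
    """Hypothetical stand-in for a socket with a limited kernel send buffer."""

    def __init__(self, capacity, per_call):
        self.capacity = capacity  # bytes accepted before send() would block
        self.per_call = per_call  # at most this many bytes taken per send()
        self.calls = 0            # number of send() "syscalls" made

    def send(self, data):
        self.calls += 1
        n = min(len(data), self.per_call, self.capacity)
        if n == 0:
            raise BlockingIOError  # models EAGAIN / EWOULDBLOCK
        self.capacity -= n
        return n

    def drain(self, n):
        """Model the peer reading n bytes, freeing kernel buffer space."""
        self.capacity += n


def flush_blocking(sock, buf, wait_writable):
    """Emulated blocking mode: loop back to send() until the buffer is empty."""
    sent = 0
    while sent < len(buf):
        try:
            sent += sock.send(buf[sent:])
        except BlockingIOError:
            wait_writable(sock)  # a real server would wait on the socket here
    return sent


def flush_nonblocking(sock, buf):
    """Non-blocking mode: fall out of the loop once the kernel takes no more."""
    sent = 0
    while sent < len(buf):
        try:
            sent += sock.send(buf[sent:])
        except BlockingIOError:
            break
    return sent


if __name__ == "__main__":
    buf = bytes(8192)
    nb = FakeSocket(capacity=6000, per_call=4096)
    print(flush_nonblocking(nb, buf), nb.calls)
    bl = FakeSocket(capacity=6000, per_call=4096)
    print(flush_blocking(bl, buf, lambda s: s.drain(8192)), bl.calls)
```

Either way the 8kB buffer is only a staging area: the amount handed to the
kernel per call is bounded by the buffer size, so a smaller buffer means
more send() calls for the same amount of data.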

The situation for receiving data is a bit different. For one, we don't
cause unnecessary fragmentation by using a buffer of a relatively
limited size. But more importantly, copying data into the buffer takes
time, and we could actually be responding to queries earlier in the
data. In contrast to the send case we don't loop around recv() until all
the data has been received.

I suspect we could still do with a bigger buffer, just to reduce the
number of syscalls in bulk loading cases, however.

Greetings,

Andres Freund
