Re: [PATCH] guc-ify the formerly hard-coded MAX_SEND_SIZE to max_wal_send

From: Greg Stark <stark(at)mit(dot)edu>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Jonathon Nelson <jdnelson(at)dyn(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] guc-ify the formerly hard-coded MAX_SEND_SIZE to max_wal_send
Date: 2017-01-08 17:36:29
Message-ID: CAM-w4HMAP4wYQ09LcJiyP9qySyMbOQB2u78DJpuno2MEm7KRfg@mail.gmail.com
Lists: pgsql-hackers

On 8 January 2017 at 17:26, Greg Stark <stark(at)mit(dot)edu> wrote:
> On 5 January 2017 at 19:01, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> That's a bit odd - shouldn't the OS network stack take care of this in
>> both cases? I mean either is too big for TCP packets (including jumbo
>> frames). What type of OS and network is involved here?
>
> 2x may be plausible. The first 128k goes out, then the rest queues up
> until the first ack comes back. Then the next 128kB goes out again
> without waiting... I think this is what Nagle is supposed to actually
> address but either it may be off by default these days or our usage
> pattern may be defeating it in some way.

Hm. That wasn't very clear. And the more I think about it, it's not right.

The first block of data -- one byte in the worst case, 128kB in our
case -- gets put in the output buffers and, since there's nothing
stopping it, it immediately gets sent out. Then all the subsequent data
gets put in output buffers but is held back by Nagle until there's
a full packet of data buffered, the ack arrives, or the timeout
expires, at which point the buffered data drains efficiently in full
packets. Eventually it all drains away and the next 128kB arrives and
is sent out immediately.

So most packets are full size with the occasional 128kB packet thrown
in whenever the buffer empties. And I think even when the 128kB packet
is pending, Nagle only stops small packets, not full packets, and the
window should allow more than one packet of data to be pending.
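
To make that rule concrete, here is a rough sketch of the Nagle send
decision as described above: full-size segments always go out, and a
partial segment goes out only when nothing is unacknowledged. It is a
simplified model, not kernel or PostgreSQL code. The function name, the
1460-byte MSS, and the omission of window limits and timers are all
assumptions made for illustration.

```python
# Simplified model of the Nagle send decision. Illustrative only:
# it ignores the congestion/receive window and any timers.

MSS = 1460  # assumed maximum segment size in bytes


def nagle_segments(buffered, unacked, mss=MSS):
    """Return the sizes of the segments that may be sent right now.

    buffered: bytes waiting in the send buffer
    unacked:  bytes already sent but not yet acknowledged
    """
    segments = []
    # Full-size segments are never held back by Nagle.
    while buffered >= mss:
        segments.append(mss)
        buffered -= mss
        unacked += mss
    # A partial segment goes out only when nothing is in flight.
    if buffered > 0 and unacked == 0:
        segments.append(buffered)
    return segments
```

Under these assumptions, a 128kB write (131072 bytes) on an idle
connection yields 89 full 1460-byte segments, and the 1132-byte tail
waits for an ack or for more data. A small write on an idle connection
goes out at once, while the same write with data in flight is held.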

So, uh, forget what I said. Nagle should be our friend here.
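
Whether Nagle is in effect on a particular socket can be checked by
reading the TCP_NODELAY option: a value of zero means Nagle has not been
disabled. A minimal sketch follows; the helper name nagle_enabled is
hypothetical, and a freshly created socket only shows the OS default,
not whatever the application sets on its own connections.

```python
import socket


def nagle_enabled(sock):
    """Return True if TCP_NODELAY is unset, i.e. Nagle is not disabled."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print("default:", nagle_enabled(s))
        # Setting TCP_NODELAY disables Nagle for this socket.
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        print("after TCP_NODELAY:", nagle_enabled(s))
    finally:
        s.close()
```

To learn anything about the case in this thread, the check would have
to run against the actual sending socket rather than a fresh one.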

I think you should get network dumps and use xplot to understand
what's really happening. See
https://fasterdata.es.net/assets/Uploads/20131016-TCPDumpTracePlot.pdf

--
greg
