Inefficient shutdown of pg_basebackup

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Inefficient shutdown of pg_basebackup
Date: 2017-04-27 03:31:24
Message-ID: 6456.1493263884@sss.pgh.pa.us
Lists: pgsql-hackers

I griped before that the src/test/recovery/ tests take an unreasonably
long time. My interest in that was piqued further when I noticed that
the tests consume not very much CPU time, and aren't exactly saturating
my disks either. That suggests that the problem isn't so much that the
tests do too much work, as that we've got dire performance problems in
either the test harness or the code under test.

While I'm continuing to poke at it, I've identified one such problem:
the system basically stops dead for about ten seconds at the end of
the pg_basebackup run invoked by t/001_stream_rep.pl. The length of
the idle time corresponds to pg_basebackup's -s (standby_message_timeout)
parameter; you can make it even worse by increasing that parameter or
setting it to zero. (In principle, setting it to zero ought to cause
pg_basebackup to never terminate at all :-( ... but apparently there is
some other effect that will wake it up after 30 seconds or so. I've not
found out what yet.)

The reason for this appears to be that by the time the pg_basebackup
parent process has determined the xlogend position and sent it down
the bgpipe to the child process, the child process has already read
all the WAL that the source server is going to send, and is waiting
for more such input with a timeout corresponding to
standby_message_timeout. Only after that timeout elapses does it
get around to noticing that some input is available from the bgpipe
and then realizing that it's time to stop streaming.

The attached draft patch fixes this by expanding the StreamCtl API
with a socket that the low-level wait routine should check for input.
For me, this consistently knocks about 10 seconds off the runtime of
001_stream_rep.pl.
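
As a rough illustration of the idea (this is not the attached patch, and
every name in it is hypothetical), a low-level wait routine can hand both
the replication socket and the read end of the stop pipe to select(), so
that input on either one ends the wait early. The sketch assumes a POSIX
system; wait_for_input, WaitResult, data_fd and stop_fd are made up for
this example, and a negative timeout is used here to mean "wait forever".

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not names from the patch. */
typedef enum
{
	WAIT_ERROR = -1,			/* select() failed (includes EINTR) */
	WAIT_TIMEOUT = 0,			/* nothing arrived before the timeout */
	WAIT_DATA = 1,				/* the data socket is readable */
	WAIT_STOP = 2				/* the stop socket is readable */
} WaitResult;

/*
 * Wait until data_fd or stop_fd becomes readable, or until timeout_ms
 * milliseconds pass.  Pass stop_fd = -1 to watch only data_fd, which
 * mimics the behavior described above.  A negative timeout_ms blocks
 * indefinitely.
 */
static WaitResult
wait_for_input(int data_fd, int stop_fd, long timeout_ms)
{
	fd_set		readfds;
	struct timeval tv;
	struct timeval *tvp = NULL;
	int			maxfd = data_fd;
	int			rc;

	assert(data_fd >= 0 && data_fd < FD_SETSIZE);

	FD_ZERO(&readfds);
	FD_SET(data_fd, &readfds);

	if (stop_fd >= 0)
	{
		assert(stop_fd < FD_SETSIZE);
		FD_SET(stop_fd, &readfds);
		if (stop_fd > maxfd)
			maxfd = stop_fd;
	}

	if (timeout_ms >= 0)
	{
		tv.tv_sec = timeout_ms / 1000;
		tv.tv_usec = (timeout_ms % 1000) * 1000;
		tvp = &tv;
	}

	rc = select(maxfd + 1, &readfds, NULL, NULL, tvp);
	if (rc < 0)
		return WAIT_ERROR;		/* caller decides whether to retry */
	if (rc == 0)
		return WAIT_TIMEOUT;

	/* Report a stop request first, even if data is also pending. */
	if (stop_fd >= 0 && FD_ISSET(stop_fd, &readfds))
		return WAIT_STOP;
	return WAIT_DATA;
}
```

With a pipe standing in for the bgpipe, writing one byte to its write end
makes wait_for_input(data_fd, stop_fd, 10000) return WAIT_STOP as soon as
the byte arrives. The same call with stop_fd = -1 would sit out the full
ten seconds when no WAL arrives, which is the stall described above.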

It could be argued that this isn't too significant in the real world
because pg_basebackup would always run far longer than 10 seconds
anyway for non-toy data. But it still looks like a bug to me.

regards, tom lane

Attachment Content-Type Size
pg_basebackup-notice-socket-input-sooner.patch text/x-diff 9.4 KB
