Unduly short fuse in RequestCheckpoint

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Unduly short fuse in RequestCheckpoint
Date:	2019-03-16 16:07:55
Message-ID:	27830.1552752475@sss.pgh.pa.us
Lists:	pgsql-hackers

I noticed an odd buildfarm failure today:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2019-03-16%2012%3A12%3A20

of which the key bit seems to be

2019-03-16 15:20:43.835 UTC [10879304] 003_promote.pl LOG: received replication command: BASE_BACKUP LABEL 'pg_basebackup base backup' NOWAIT
2019-03-16 15:20:45.857 UTC [10879304] 003_promote.pl ERROR: could not request checkpoint because checkpointer not running
2019-03-16 15:20:47.227 UTC [61604144] LOG: received immediate shutdown request

Digging in the buildfarm archives finds seven other occurrences of the
same error in the past three months (I didn't look back further).

The cause of this error is that RequestCheckpoint will give up and fail
after just 2 seconds, which evidently is not long enough on slow or
heavily loaded machines. Since there isn't any good reason why the
checkpointer wouldn't be running, I'm inclined to swing a large hammer
and kick this timeout up to 60 seconds. Thoughts?

regards, tom lane
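
[Editorial illustration, not part of the original message.] The failure mode described above is a bounded polling wait: the requester repeatedly checks whether the checkpointer is available and gives up once a fixed budget is spent. The following is a minimal, self-contained sketch of that pattern, not the PostgreSQL source. The names wait_until_ready, fake_checkpointer, fake_ready and fake_sleep are hypothetical, and the 100 ms poll interval is an assumption chosen so that 20 tries correspond to the 2-second limit mentioned in the message and 600 tries to the proposed 60 seconds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types for this sketch; these are not PostgreSQL APIs. */
typedef bool (*ready_fn) (void *state);
typedef void (*sleep_fn) (long microseconds);

/*
 * Poll until ready() reports true, sleeping poll_usec between attempts.
 * Returns the number of failed attempts that preceded success, or -1
 * once max_tries failed attempts have been followed by one more failure.
 * The total time slept is therefore at most max_tries * poll_usec.
 */
static int
wait_until_ready(ready_fn ready, void *state, sleep_fn do_sleep,
                 int max_tries, long poll_usec)
{
    for (int ntries = 0;; ntries++)
    {
        if (ready(state))
            return ntries;
        if (ntries >= max_tries)
            return -1;          /* budget exhausted: give up */
        do_sleep(poll_usec);
    }
}

/* Stand-in for a slow-starting process: ready after a number of polls. */
struct fake_checkpointer
{
    int         polls_until_up;
};

static bool
fake_ready(void *state)
{
    struct fake_checkpointer *c = state;

    if (c->polls_until_up > 0)
    {
        c->polls_until_up--;
        return false;
    }
    return true;
}

/* Simulated clock so the sketch can be exercised without real delays. */
static long slept_usec = 0;

static void
fake_sleep(long usec)
{
    slept_usec += usec;
}
```

With a simulated process that needs 50 polls (about 5 seconds at the assumed interval) to come up, a budget of 20 tries gives up after 2 simulated seconds, while a budget of 600 tries succeeds after 5. A process that is already up returns on the first check in both cases, so the larger budget only costs time when something is actually slow or missing.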

Responses

Re: Unduly short fuse in RequestCheckpoint at 2019-03-17 19:41:43 from Tom Lane
