Re: Race conditions with checkpointer and shutdown

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Race conditions with checkpointer and shutdown
Date: 2019-04-19 03:48:07
Message-ID: 6528.1555645687@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> On Fri, Apr 19, 2019 at 10:22 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Maybe what we should be looking for is "why doesn't the walreceiver
>> shut down"? But the dragonet log you quote above shows the walreceiver
>> exiting, or at least starting to exit. Tis a puzzlement.

> ... Is there some way that the exit code could hang
> *after* that due to corruption of libc resources (FILE streams,
> malloc, ...)? It doesn't seem likely to me (we'd hopefully see some
> more clues) but I thought I'd mention the idea.

I agree it's not likely ... but that's part of the reason I was thinking
about adding some postmaster logging. Whatever we're chasing here is
"not likely", per the observed buildfarm failure rate.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2019-04-19 04:00:24 Re: bug in update tuple routing with foreign partitions
Previous Message Thomas Munro 2019-04-19 03:41:30 Re: Race conditions with checkpointer and shutdown