Re: [RFC] Should we fix postmaster to avoid slow shutdown?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Cc: "'Robert Haas'" <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Should we fix postmaster to avoid slow shutdown?
Date: 2016-09-26 14:18:36
Message-ID: 31376.1474899516@sss.pgh.pa.us
Lists: pgsql-hackers

"Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> writes:
>> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Robert Haas
>> I think that we shouldn't start changing things based on guesses about what
>> the problem is, even if they're fairly smart guesses. The thing to do would
>> be to construct a test rig, crash the server repeatedly, and add debugging
>> instrumentation to figure out where the time is actually going.

> We have tried to reproduce the problem over the past several days, putting much more stress on our environment than the customer's sees -- 1,000 tables, aiming for a stats file dozens of times larger, and repeated reconnection requests from hundreds of clients -- but we have not succeeded.

>> I do think your theory about the stats collector might be worth pursuing.
>> It seems that the stats collector only responds to SIGQUIT, ignoring SIGTERM.
>> Making it do a clean shutdown on SIGTERM and a fast exit on SIGQUIT seems
>> possibly worthwhile.

> Thank you for giving me the confidence to proceed. I also believe that postmaster should close the listening ports earlier. Although I'm not confident that these changes will solve this problem, I think it'd be better to fix these two points so that postmaster doesn't take longer than necessary. I think I'll create a patch after giving it a bit more thought.

FWIW, I'm pretty much -1 on messing with the timing of the socket close
actions. I broke that once within recent memory, so maybe I'm gun-shy,
but I think that the odds of unpleasant side effects greatly outweigh
any likely benefit there.

Allowing SIGQUIT to prompt fast shutdown of the stats collector seems
sane, though. Try to make sure it doesn't leave partly-written stats
files behind.
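
For illustration only, one common way to avoid a partly-written file is to
write the new contents to a temporary file and then rename() it over the
final name, so that a process killed mid-write can leave a stray temporary
file but never a truncated final one. The sketch below is generic C, not
the stats collector's code; the function name write_file_atomically and
the ".tmp" suffix are hypothetical, and it assumes a POSIX-like filesystem
where rename() replaces the target atomically. It does not fsync, so it
says nothing about durability across an OS crash.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not PostgreSQL code): write "len" bytes of "data"
 * to "path" such that a reader sees either the old file or the complete
 * new one. Returns 0 on success, -1 on failure.
 */
static int
write_file_atomically(const char *path, const char *data, size_t len)
{
    char    tmppath[1024];
    FILE   *fp;
    int     n;

    /* Build the temporary name next to the target, e.g. "foo.tmp". */
    n = snprintf(tmppath, sizeof(tmppath), "%s.tmp", path);
    if (n < 0 || (size_t) n >= sizeof(tmppath))
        return -1;

    fp = fopen(tmppath, "wb");
    if (fp == NULL)
        return -1;

    /* On a short write, discard the temporary file and report failure. */
    if (fwrite(data, 1, len, fp) != len)
    {
        fclose(fp);
        remove(tmppath);
        return -1;
    }

    /* fclose() flushes stdio buffers; a failure here means lost data. */
    if (fclose(fp) != 0)
    {
        remove(tmppath);
        return -1;
    }

    /* Only a fully written file is ever moved to the final name. */
    if (rename(tmppath, path) != 0)
    {
        remove(tmppath);
        return -1;
    }
    return 0;
}
```

With this pattern, a fast exit that interrupts the write costs at most the
new copy; the previously renamed file is still intact.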

regards, tom lane
