Re: [RFC] Should we fix postmaster to avoid slow shutdown?

From: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: 'Robert Haas' <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Should we fix postmaster to avoid slow shutdown?
Date: 2016-09-23 05:04:28
Message-ID: 0A3221C70F24FB45833433255569204D1F5F0D78@G01JPEXMBYT05
Lists: pgsql-hackers

> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Robert Haas
> On Tue, Sep 20, 2016 at 2:20 AM, Tsunakawa, Takayuki
> <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
> > There's no apparent evidence to indicate the cause, but I could guess
> > a few reasons. Do you think these are correct and that PostgreSQL
> > should be fixed? (I think so)
>
> I think that we shouldn't start changing things based on guesses about what
> the problem is, even if they're fairly smart guesses. The thing to do would
> be to construct a test rig, crash the server repeatedly, and add debugging
> instrumentation to figure out where the time is actually going.

Over the past several days we have tried to reproduce the problem, putting much more stress on our environment than the customer's sees -- 1,000 tables, aiming for a stats file dozens of times larger, plus repeated reconnection requests from hundreds of clients -- but we have not succeeded.

> I do think your theory about the stats collector might be worth pursuing.
> It seems that the stats collector only responds to SIGQUIT, ignoring SIGTERM.
> Making it do a clean shutdown on SIGTERM and a fast exit on SIGQUIT seems
> possibly worthwhile.

Thank you for giving me the confidence to proceed. I also believe that postmaster should close the listening ports earlier. I'm not confident these changes will solve this problem, but regardless, I think it'd be better to fix these two points so that postmaster doesn't take longer than necessary to shut down. I think I'll create a patch after giving it a bit more thought.
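
For illustration only, the two points above (a clean shutdown on SIGTERM with a fast exit on SIGQUIT, and closing the listening sockets as the first step of shutdown) might look roughly like the following C sketch. This is not PostgreSQL source code and not the eventual patch: every name in it (shutdown_requested, install_shutdown_handlers, close_listen_sockets) is hypothetical, and it assumes a POSIX system where listening sockets are plain file descriptors with -1 marking an unused slot.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names throughout; this is not PostgreSQL source code. */

/* Set by the SIGTERM handler and polled by the worker's main loop. */
volatile sig_atomic_t shutdown_requested = 0;

/*
 * SIGTERM: only record the request.  The main loop is expected to notice
 * the flag, finish its work (e.g. write out its stats file), and exit
 * normally.
 */
void
handle_sigterm(int signo)
{
    (void) signo;
    shutdown_requested = 1;
}

/*
 * SIGQUIT: leave immediately, skipping all cleanup.  _exit() is
 * async-signal-safe, unlike exit().
 */
void
handle_sigquit(int signo)
{
    (void) signo;
    _exit(2);
}

/* Install both handlers; returns 0 on success, -1 on failure. */
int
install_shutdown_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);

    sa.sa_handler = handle_sigterm;
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;

    sa.sa_handler = handle_sigquit;
    if (sigaction(SIGQUIT, &sa, NULL) != 0)
        return -1;

    return 0;
}

/*
 * Close every open listening socket so the process stops accepting new
 * connection requests.  Closed slots are set to -1, which makes a second
 * call a no-op.  Returns the number of sockets closed by this call.
 */
int
close_listen_sockets(int *socks, int nsocks)
{
    int closed = 0;

    for (int i = 0; i < nsocks; i++)
    {
        if (socks[i] >= 0)
        {
            close(socks[i]);
            socks[i] = -1;
            closed++;
        }
    }
    return closed;
}
```

In this sketch the SIGTERM handler does nothing but set a flag, so all real cleanup happens in the main loop rather than in signal context, while SIGQUIT remains the way to get out without any cleanup at all.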

Regards
Takayuki Tsunakawa
