Re: rare avl shutdown slowness (related to signal handling)

From: Qingqing Zhou <zhouqq(dot)postgres(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Qingqing Zhou <zhouqq(dot)postgres(at)gmail(dot)com>
Subject: Re: rare avl shutdown slowness (related to signal handling)
Date: 2015-04-07 21:11:22
Message-ID: CAJjS0u0oAjBqeZdHE3_ebCzr5hKFToyKG83_CaN=FQQif1FAXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I got another repro with the shutdown slowness (DEBUG5 with verbosed
log are attached).

It gives a finer picture of what's going on:
1. Avl ereport("autovacuum launcher shutting down");
2. At the end of errfinish(), it honors a pending SIGINT;
3. SIGINT handler longjmp to the start of avl error handling;
4. The error handling continues and rebuild_database_list() (that's
why we see begin/commit pair);
5. In main loop, it WaitLatch(60 seconds);
6. Finally it ereport() again and proc_exit().

This looks like a general pattern - don't think *nix is immune. Notice
that this ereport() is special as there is way to go back. So we can
insert HOLD_INTERRUPTS() just before it.

Thoughts?

Regards,
Qingqing

On Tue, Apr 7, 2015 at 10:54 AM, Qingqing Zhou
<zhouqq(dot)postgres(at)gmail(dot)com> wrote:
> I am playing git tip on windows 7/32 bits, with the backend compiled
> with visual studio 2005 (I know, it is very old :-( ).
>
> I encountered avl shutdown slowness twice, last night and this
> morning: after a ctrl_c is hit, wait for another 60 seconds, server
> shuts down. Here is one log:
>
> D:\pgsql\bin>postgres -D../data --log_line_prefix="%t %p"
> 2015-04-07 10:30:04 PDT 3148LOG: database system was shut down at
> 2015-04-07 10:29:24 PDT
> 2015-04-07 10:30:04 PDT 19548LOG: database system is ready to accept
> connections
> 2015-04-07 10:30:04 PDT 27008LOG: autovacuum launcher started
> 2015-04-07 10:30:08 PDT 19548LOG: received fast shutdown request
> 2015-04-07 10:30:08 PDT 19548LOG: aborting any active transactions
> 2015-04-07 10:30:08 PDT 27008LOG: autovacuum launcher shutting down
> 2015-04-07 10:30:08 PDT 27008ERROR: canceling statement due to user request
> 2015-04-07 10:31:09 PDT 27008LOG: autovacuum launcher shutting down
> 2015-04-07 10:31:09 PDT 15656LOG: shutting down
> 2015-04-07 10:31:09 PDT 15656LOG: database system is shut down
>
> The interesting part is on PID 27008: avl first ereport() shutdown,
> which is at the very end of the main loop and just one step away from
> proc_exit(). Then it seems honors a SIGINT within ereport(), longjmp
> to the loop head, and waits for another 60 seconds. After timeout, it
> ereports shutdown again, and finally exits.
>
> Not sure if this is windows only or general. I can hardly repro it.
> Anyone has ever seen this?
>
> Regards,
> Qingqing

Attachment Content-Type Size
avlexit.log application/octet-stream 13.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-04-07 21:32:37 Re: rare avl shutdown slowness (related to signal handling)
Previous Message David G. Johnston 2015-04-07 21:05:26 Re: pg_restore -t should match views, matviews, and foreign tables