Fw: Windows 10 got stuck with PostgreSQL at starting up. Adding delay lets it avoid.

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Fw: Windows 10 got stuck with PostgreSQL at starting up. Adding delay lets it avoid.
Date: 2018-07-20 08:58:13
Message-ID: 20180720175813.154db441.nagata@sraoss.co.jp
Lists: pgsql-hackers

Hi,

Recently, one of our clients reported that Windows 10 sometimes
(approximately once in 300 tries) hangs during OS startup while the PostgreSQL
9.3.x service is starting. My co-worker analyzed this and found that
PostgreSQL's auxiliary process and the Windows logon processes are in a
deadlock.

Although this problem has so far been found only with PostgreSQL 9.3.x on
Windows 10 in our client's environment, the same problem may occur with other
versions of PostgreSQL.

He reported this problem to the pgsql-general list, as forwarded below. He also
created a patch that adds a build-time option to insert a 0.5- or 3.0-second
delay after each sub process starts. The attached patch is the same one. Our
client confirmed that this patch resolves the deadlock problem. Would it be
acceptable to add this option to PostgreSQL? Any comments would be appreciated.

Regards,

Begin forwarded message:

Date: Fri, 29 Jun 2018 15:03:10 +0900
From: TAKATSUKA Haruka <harukat(at)sraoss(dot)co(dot)jp>
To: pgsql-general(at)postgresql(dot)org
Subject: Windows 10 got stuck with PostgreSQL at starting up. Adding delay lets it avoid.

I ran into trouble with PostgreSQL 9.3.x on Windows 10.
I would like to add new delay code as an official build option.

Windows 10 sometimes (approximately once in 300 tries) hung
during OS startup. The logs say it happened while the PostgreSQL
service was starting. When the OS stopped, some postgres auxiliary
processes had been started and some had not yet started.

The Windows dump says that some threads of the postgres auxiliary process
are waiting for OS-level locks, and the logon processes' threads are
also waiting for a lock. The MS help desk said that PostgreSQL's OS-level
deadlock caused the OS freeze. I find that a strange story. In fact,
though, the hang did not happen in repeated tests once I removed
PostgreSQL from the services that start automatically at boot.

I tweaked PostgreSQL 9.3.x (the newest from the repository) to add
a 0.5- or 3.0-second delay after each sub process starts,
and the hang went away. This test patch is attached.
It is implemented only for Windows. Also, I did not use the existing
pg_usleep because it contains locking code (e.g. WaitForSingleObject
and Enter/LeaveCriticalSection).

Although the Windows OS may have some problems, I think we should have
a means to avoid this. Could PostgreSQL accept such delay code
as a build-time option controlled by preprocessor variables?
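
To illustrate the idea only, here is a minimal sketch of a delay gated by a
preprocessor variable. It is not the attached patch. The macro name
SUBPROCESS_START_DELAY_MS and the functions delay_after_subprocess_start and
start_subprocesses are hypothetical, and using the Win32 Sleep() call for the
pause is an assumption made for the example. As described above, the delay
applies only on Windows; other platforms compile it to a no-op.

```c
#include <stddef.h>

#ifdef WIN32
#include <windows.h>
#endif

/*
 * Hypothetical build-time knob (not the name used in the attached patch).
 * Build with e.g. -DSUBPROCESS_START_DELAY_MS=500 or =3000 to enable it.
 */
#ifndef SUBPROCESS_START_DELAY_MS
#define SUBPROCESS_START_DELAY_MS 0
#endif

/*
 * Pause after a sub process has been launched.
 * Returns the number of milliseconds waited (0 when the option is off
 * or the platform is not Windows).
 */
static int
delay_after_subprocess_start(void)
{
#if defined(WIN32) && SUBPROCESS_START_DELAY_MS > 0
	/* Assumption: plain Sleep() is used instead of pg_usleep. */
	Sleep(SUBPROCESS_START_DELAY_MS);
	return SUBPROCESS_START_DELAY_MS;
#else
	return 0;
#endif
}

/*
 * Hypothetical launcher: start each named sub process, then pause.
 * Returns the total delay in milliseconds.
 */
static int
start_subprocesses(const char *const *names, size_t count)
{
	int			total_delay_ms = 0;
	size_t		i;

	for (i = 0; i < count; i++)
	{
		/* The real code would launch names[i] here. */
		(void) names[i];
		total_delay_ms += delay_after_subprocess_start();
	}
	return total_delay_ms;
}
```

With the variable left undefined, the build behaves exactly as before, which
is the point of making it a build-time option.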

Thanks,
Takatsuka Haruka

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

Attachment                           Content-Type  Size
postmaster.c_0619_win32delay.diff    text/plain    1.8 KB
