[bug fix] postgres.exe crashes with access violation on Windows while starting up

From: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: [bug fix] postgres.exe crashes with access violation on Windows while starting up
Date: 2017-10-27 02:10:21
Message-ID: 0A3221C70F24FB45833433255569204D1F80CC73@G01JPEXMBYT05
Lists: pgsql-hackers


We encountered a rare and hard-to-investigate problem on Windows, reported by one of our customers. The attached patch fixes it. I'll add this to the next CF.


PostgreSQL sometimes crashes with the following messages. The crash is rare overall, but frequent enough to trouble the customer: it occurred about 10 times in the past 5 months.

LOG: server process (PID 2712) was terminated by exception 0xC0000005
HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: all server processes terminated; reinitializing

"server process" shows that a client backend crashed. The above messages indicate that the process was not running an SQL command.

PostgreSQL runs as a Windows service.

No crash dump was produced anywhere, despite the following facts:
- The <PGDATA>/crashdumps folder exists and is writable by the PostgreSQL user account (the user postgres.exe runs as)
- The Windows registry configuration allows crash dumps to be written


We believe WSAStartup() in main.c failed. The only conceivable error is:

Too many processes.
A Windows Sockets implementation may have a limit on the number of applications that can use it simultaneously. WSAStartup may fail with this error if the limit has been reached.

However, I couldn't find out what the limit is or whether it can be tuned, and we couldn't reproduce the problem.

When I simulated a WSAStartup() failure while a client backend was starting up, I saw the same phenomenon as the customer. The problem only occurs when PostgreSQL runs as a Windows service.

The bug is in write_eventlog(). It calls pgwin32_message_to_utf16() which in turn calls palloc(), which requires the memory management system to be set up (CurrentMemoryContext != NULL).


The fix is to add the check "CurrentMemoryContext != NULL" to write_eventlog(), as write_console() already does.
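
To illustrate the idea only, here is a minimal portable sketch of such a guard. It is not the attached patch and not PostgreSQL source: MemoryContextData, convert_with_context() and report_message() are hypothetical stand-ins, and CurrentMemoryContext is redefined locally so the example compiles on its own. The conversion routine models a palloc()-based helper that cannot work before memory management is initialized, and the reporting function falls back to the unconverted message in that case.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the memory context type; not PostgreSQL's. */
typedef struct MemoryContextData
{
	const char *name;
} MemoryContextData;

/* Local stand-in for the global; NULL until memory management is set up. */
static MemoryContextData *CurrentMemoryContext = NULL;

/*
 * Models a conversion helper that allocates from the current context.
 * Calling it with no context is a programming error (the crash case).
 */
static char *
convert_with_context(const char *msg)
{
	char	   *copy;

	assert(CurrentMemoryContext != NULL);
	copy = malloc(strlen(msg) + 1);
	if (copy != NULL)
		strcpy(copy, msg);
	return copy;
}

/*
 * Guarded reporting: use the converting path only when a memory context
 * exists, otherwise emit the message unconverted.
 * Returns 1 if the converting path was used, 0 for the fallback.
 */
static int
report_message(const char *msg, char *out, size_t outlen)
{
	if (CurrentMemoryContext != NULL)
	{
		char	   *converted = convert_with_context(msg);

		if (converted != NULL)
		{
			snprintf(out, outlen, "converted: %s", converted);
			free(converted);
			return 1;
		}
	}

	/* Too early in startup (or conversion failed): no allocation needed. */
	snprintf(out, outlen, "raw: %s", msg);
	return 0;
}
```

With the guard in place, a message reported very early in startup takes the fallback path instead of reaching the allocator with no context.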


The crash dump was not output because a) the crash occurred before the Windows exception handler was installed (the pgwin32_install_crashdump_handler() call), and b) the effect of the following call in the postmaster is inherited by the child process.

/* In case of general protection fault, don't show GUI popup box */

But I'm not sure in what order we should call pgwin32_install_crashdump_handler(), startup_hacks() (and the steps within it), and MemoryContextInit(). I think that's another patch.

Takayuki Tsunakawa

Attachment: write_eventlog_crash.patch (application/octet-stream, 887 bytes)
