Re: Backbranch releases and Win32 locking

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Magnus Hagander <mha(at)sollentuna(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Backbranch releases and Win32 locking
Date: 2006-10-09 12:05:05
Message-ID: 452A3AF1.7090005@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Analyzing locking state, lock occurs when backend wants to send data to stat
collector. So state is:
backend waits FD_WRITE event, stat collector waits FD_READ.

I suspect follow sequence of events in backend:
0 Let us work only with one socket, and socket associated with statically
defined event object in pgwin32_waitforsinglesocket.
1. pgwin32_send:WSASend fails with WSAEWOULDBLOCK ( or its equivalent )
2. socket s becomes writable and Windows signals event defined statically
in pgwin32_waitforsinglesocket.
3. pgwin32_waitforsinglesocket(): ResetEvent resets event
4. pgwin32_waitforsinglesocket(): WaitForMultipleObjectsEx waits indefinitely...

If I'm right, it's needed to move ResetEvent after WaitForMultipleObjectsEx. But
comment in pgwin32_select() says that we should send something before test
socket for FD_WRITE. pgwin32_send calls WSASend before
pgwin32_waitforsinglesocket(), but there is a call of
pgwin32_waitforsinglesocket in libpq/be-secure.c. So, attached patch adds call
of WSASend with void buffer.

It's a pity, but locking problem occurs only on SMP box and requires several
hours to reproduce. So we are in testing now.

What are opinions?

PS Backtraces
backend:

ntdll.dll!KiFastSystemCallRet
postgres.exe!pgwin32_waitforsinglesocket+0x197
postgres.exe!pgwin32_send+0xaf
postgres.exe!pgstat_report_waiting+0x1bd
postgres.exe!pgstat_report_tabstat+0xda
postgres.exe!PostgresMain+0x1040
postgres.exe!ClosePostmasterPorts+0x1bce
postgres.exe!SubPostmasterMain+0x1be
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49

logger:

ntdll.dll!KiFastSystemCallRet
kernel32.dll!WaitForSingleObject+0x12
postgres.exe!pg_usleep+0x54
postgres.exe!SysLoggerMain+0x422
postgres.exe!SubPostmasterMain+0x370
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49

bgwriter:

ntdll.dll!KiFastSystemCallRet
kernel32.dll!WaitForSingleObject+0x12
postgres.exe!pg_usleep+0x54
postgres.exe!BackgroundWriterMain+0x63a
postgres.exe!BootstrapMain+0x61f
postgres.exe!SubPostmasterMain+0x22c
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49

stat collector:

ntdll.dll!KiFastSystemCallRet
postgres.exe!pgwin32_select+0x4f3
postgres.exe!PgstatCollectorMain+0x32f
postgres.exe!SubPostmasterMain+0x32a
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
win32.patch text/plain 761 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Cave-Ayland 2006-10-09 12:45:24 Re: 8.2beta1 crash possibly in libpq
Previous Message Tzahi Fadida 2006-10-09 11:19:46 OT: Is there a LinkedIn group for Postgresql?