From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Magnus Hagander <mha(at)sollentuna(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Backbranch releases and Win32 locking |
Date: | 2006-10-09 12:05:05 |
Message-ID: | 452A3AF1.7090005@sigaev.ru |
Lists: | pgsql-hackers |
Analyzing the locked state: the lockup occurs when a backend wants to send data
to the stat collector. So the state is:
the backend waits for an FD_WRITE event, the stat collector waits for FD_READ.
I suspect the following sequence of events in the backend:
0. Let us consider only one socket, and that socket is associated with the
statically defined event object in pgwin32_waitforsinglesocket.
1. pgwin32_send: WSASend fails with WSAEWOULDBLOCK (or its equivalent)
2. Socket s becomes writable and Windows signals the event defined statically
in pgwin32_waitforsinglesocket.
3. pgwin32_waitforsinglesocket(): ResetEvent resets the event
4. pgwin32_waitforsinglesocket(): WaitForMultipleObjectsEx waits indefinitely...
If I'm right, ResetEvent needs to be moved after WaitForMultipleObjectsEx. But
the comment in pgwin32_select() says that we should send something before
testing the socket for FD_WRITE. pgwin32_send calls WSASend before
pgwin32_waitforsinglesocket(), but there is also a call of
pgwin32_waitforsinglesocket in libpq/be-secure.c. So the attached patch adds a
call of WSASend with an empty buffer.
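To make the suspected ordering problem concrete, here is a minimal single-threaded model of steps 1-4 and of the proposed reordering. It is only a sketch: SimEvent and the sim_* functions are hypothetical stand-ins for the event object, ResetEvent and WaitForMultipleObjectsEx, not the Windows API and not the code in the attached patch. The wait is modelled as a non-blocking check that reports whether a real wait would return. The WSASend call with an empty buffer is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the statically defined event object. */
typedef struct {
    bool signaled;
} SimEvent;

/* Models Windows signalling the event when the socket becomes writable. */
static void sim_set_event(SimEvent *ev)   { ev->signaled = true; }

/* Models ResetEvent: clears any pending signal. */
static void sim_reset_event(SimEvent *ev) { ev->signaled = false; }

/*
 * Non-blocking model of the wait: returns true if a real wait would
 * return at once, false if it would block.
 */
static bool sim_wait_would_return(const SimEvent *ev) { return ev->signaled; }

/* Suspected order: the signal arrives, is reset, and only then do we wait. */
static bool reset_before_wait(void)
{
    SimEvent ev = { false };
    /* step 1: the send fails with would-block (nothing to model here) */
    sim_set_event(&ev);                 /* step 2: socket becomes writable */
    sim_reset_event(&ev);               /* step 3: the signal is discarded */
    return sim_wait_would_return(&ev);  /* step 4: the wait would block */
}

/* Proposed order: wait first, reset only after the wait has returned. */
static bool reset_after_wait(void)
{
    SimEvent ev = { false };
    sim_set_event(&ev);                 /* socket becomes writable */
    bool woke = sim_wait_would_return(&ev);
    if (woke)
        sim_reset_event(&ev);           /* consume the signal afterwards */
    return woke;
}
```

In this model reset_before_wait() returns false (the wait would never be satisfied) and reset_after_wait() returns true. The real race depends on timing between the backend and the kernel, which a single-threaded model cannot reproduce.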
Unfortunately, the locking problem occurs only on an SMP box and takes several
hours to reproduce, so we are still testing.
Any opinions?
PS Backtraces
backend:
ntdll.dll!KiFastSystemCallRet
postgres.exe!pgwin32_waitforsinglesocket+0x197
postgres.exe!pgwin32_send+0xaf
postgres.exe!pgstat_report_waiting+0x1bd
postgres.exe!pgstat_report_tabstat+0xda
postgres.exe!PostgresMain+0x1040
postgres.exe!ClosePostmasterPorts+0x1bce
postgres.exe!SubPostmasterMain+0x1be
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49
logger:
ntdll.dll!KiFastSystemCallRet
kernel32.dll!WaitForSingleObject+0x12
postgres.exe!pg_usleep+0x54
postgres.exe!SysLoggerMain+0x422
postgres.exe!SubPostmasterMain+0x370
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49
bgwriter:
ntdll.dll!KiFastSystemCallRet
kernel32.dll!WaitForSingleObject+0x12
postgres.exe!pg_usleep+0x54
postgres.exe!BackgroundWriterMain+0x63a
postgres.exe!BootstrapMain+0x61f
postgres.exe!SubPostmasterMain+0x22c
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49
stat collector:
ntdll.dll!KiFastSystemCallRet
postgres.exe!pgwin32_select+0x4f3
postgres.exe!PgstatCollectorMain+0x32f
postgres.exe!SubPostmasterMain+0x32a
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
Attachment | Content-Type | Size |
---|---|---|
win32.patch | text/plain | 761 bytes |