Hang in pldebugger after git commit : 98a64d0

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Hang in pldebugger after git commit : 98a64d0
Date: 2016-12-07 09:46:09
Message-ID: CAE9k0PmG8tC27xj5MZ6ZU_C7ua5F=q+YgiOx2cC-pdkRFHKzRw@mail.gmail.com
Lists: pgsql-hackers

Hi All,

I have noticed that any SQL query executed immediately after attaching
to a particular debugging target (pldebugger) either takes longer to
execute or hangs on Windows. I have checked PostgreSQL 9.5 and
PostgreSQL 9.6 and found that the issue exists only on 9.6. Using
'git bisect', I found that the following commit on the PostgreSQL 9.6
branch is the culprit.

commit 98a64d0bd713cb89e61bef6432befc4b7b5da59e
Author: Andres Freund <andres(at)anarazel(dot)de>
Date: Mon Mar 21 09:56:39 2016 +0100

Introduce WaitEventSet API.

Commit ac1d794 ("Make idle backends exit if the postmaster dies.")
introduced a regression on, at least, large linux systems. Constantly
adding the same postmaster_alive_fds to the OSs internal datastructures
for implementing poll/select can cause significant contention; leading
to a performance regression of nearly 3x in one example.

This can be avoided by using e.g. linux' epoll, which avoids having to
add/remove file descriptors to the wait datastructures at a high rate.
Unfortunately the current latch interface makes it hard to allocate any
persistent per-backend resources.

.................
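
To illustrate the idea in the quoted commit message (register a file
descriptor once and then wait on it repeatedly), here is a minimal
Linux-only sketch using epoll. It is not the WaitEventSet
implementation; the helper names make_persistent_wait_set() and
wait_readable() are hypothetical and exist only for this example.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create an epoll instance and register fd for
 * readability exactly once. Returns the epoll fd, or -1 on error. */
int make_persistent_wait_set(int fd)
{
    struct epoll_event ev;
    int epfd = epoll_create1(0);

    if (epfd < 0)
        return -1;

    ev.events = EPOLLIN;        /* level-triggered "readable" */
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
    {
        close(epfd);
        return -1;
    }
    return epfd;
}

/* Hypothetical helper: wait up to timeout_ms on the already-registered
 * set. Returns 1 if the fd is readable, 0 on timeout, -1 on error.
 * Nothing is added to or removed from the kernel's wait structures
 * here, which is the property the commit message is after. */
int wait_readable(int epfd, int timeout_ms)
{
    struct epoll_event out;

    return epoll_wait(epfd, &out, 1, timeout_ms);
}
```

A caller would invoke make_persistent_wait_set() once per descriptor
and then call wait_readable() in its loop, in contrast to poll() or
select(), where the descriptor list is passed in on every call.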

Following are the steps to reproduce the issue:

1) Download pldebugger from the URL below and copy it into the contrib directory.

git clone git://git.postgresql.org/git/pldebugger.git

2) Start a new backend session (psql -d postgres).
3) Create a plpgsql function, say func1().
4) Get the OID of func1 and enable debugging for it using the pldbgapi
function as shown below:

select plpgsql_oid_debug(16487);

5) Execute the function func1: select func1();

After executing the above query, we get the message below and the
terminal stops responding, as the session goes into listen mode.
NOTICE: PLDBGBREAK:2

6) Start another backend session.
7) Execute the query below.
SELECT * FROM pldbg_attach_to_port(2)
NOTE: We need to extract the port number from the NOTICE message in
step 5 (the number after the 'PLDBGBREAK:' string) and use it as input here.

8) Execute any SQL query now and the problem starts. I have tried the
queries below.

SELECT 1;
OR
SELECT pg_backend_pid();
OR
SELECT * FROM pldbg_wait_for_breakpoint(1::INTEGER);

....
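
The port extraction mentioned in the NOTE of step 7 can be sketched in
C as follows. This is only an illustration under the assumption that
the NOTICE text has the form shown in step 5; the function name
parse_pldbgbreak_port() is hypothetical and is not part of pldebugger.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of pldebugger): return the number that
 * follows "PLDBGBREAK:" in a NOTICE line, or -1 if the marker is
 * missing or is not followed by a non-negative integer. */
int parse_pldbgbreak_port(const char *notice)
{
    static const char marker[] = "PLDBGBREAK:";
    const char *p = strstr(notice, marker);
    int port;

    if (p == NULL)
        return -1;

    /* Read the integer immediately after the marker. */
    if (sscanf(p + strlen(marker), "%d", &port) != 1 || port < 0)
        return -1;

    return port;
}
```

The returned value is what would be passed to pldbg_attach_to_port()
in the second session.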

Problem Analysis:
-------------------------
Although I am very new to Windows, I tried debugging the issue and
found that the backend does not receive the query executed after
"SELECT pldbg_attach_to_port(2)" and waits indefinitely in
WaitEventSetWaitBlock(), at the WaitForMultipleObjects() call, while
trying to read the input command. Below is the backtrace.

postgres.exe!WaitEventSetWaitBlock(WaitEventSet * set, int cur_timeout, WaitEvent * occurred_events, int nevents) Line 1384 + 0x2b bytes C
postgres.exe!WaitEventSetWait(WaitEventSet * set, long timeout, WaitEvent * occurred_events, int nevents) Line 936 + 0x18 bytes C
postgres.exe!secure_read(Port * port, void * ptr, unsigned __int64 len) Line 168 C
postgres.exe!pq_recvbuf() Line 921 + 0x33 bytes C
postgres.exe!pq_getbyte() Line 963 + 0x5 bytes C
postgres.exe!SocketBackend(StringInfoData * inBuf) Line 334 + 0x5 bytes C
postgres.exe!ReadCommand(StringInfoData * inBuf) Line 507 + 0xa bytes C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4004 + 0xd bytes C
postgres.exe!BackendRun(Port * port) Line 4259 C
postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4750 C
postgres.exe!main(int argc, char * * argv) Line 216 C
postgres.exe!__tmainCRTStartup() Line 555 + 0x19 bytes C
postgres.exe!mainCRTStartup() Line 371 C

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
