Hang in pldebugger after git commit : 98a64d0

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Hang in pldebugger after git commit : 98a64d0
Date: 2016-12-07 09:46:09
Message-ID: CAE9k0PmG8tC27xj5MZ6ZU_C7ua5F=q+YgiOx2cC-pdkRFHKzRw@mail.gmail.com
Lists: pgsql-hackers

Hi All,

I have noticed that, on Windows, any SQL query executed immediately
after attaching to a debugging target (pldebugger) either takes much
longer than usual to execute or hangs. I have checked this on
PostgreSQL 9.5 and PostgreSQL 9.6 and found that the issue exists
only on 9.6. Using 'git bisect', I found that the following commit in
the PostgreSQL 9.6 branch is the culprit.

commit 98a64d0bd713cb89e61bef6432befc
Author: Andres Freund <andres(at)anarazel(dot)de>
Date: Mon Mar 21 09:56:39 2016 +0100

Introduce WaitEventSet API.

Commit ac1d794 ("Make idle backends exit if the postmaster dies.")
introduced a regression on, at least, large linux systems. Constantly
adding the same postmaster_alive_fds to the OSs internal datastructures
for implementing poll/select can cause significant contention; leading
to a performance regression of nearly 3x in one example.

This can be avoided by using e.g. linux' epoll, which avoids having to
add/remove file descriptors to the wait datastructures at a high rate.
Unfortunately the current latch interface makes it hard to allocate any
persistent per-backend resources.


Following are the steps to reproduce the issue:

1) Download pldebugger from the URL below and copy it into the contrib directory.

git clone git://git.postgresql.org/git/pldebugger.git

2) Start a new backend session (psql -d postgres)
3) Create a plpgsql function, say func1().
4) Get the OID of func1 and enable debugging for it using the pldbgapi
function shown below:

select plpgsql_oid_debug(16487);

5) Execute function func1: select func1();

After executing the above query, a NOTICE message containing the
'PLDBGBREAK:' string followed by a port number is printed, and the
terminal stops responding because the session goes into listen mode.

6) Start another backend session.
7) Execute the query below.
SELECT * FROM pldbg_attach_to_port(2);
NOTE: The port number must be taken from the NOTICE message printed in
step 5 (the value after the 'PLDBGBREAK:' string) and used as the input here.

8) Now execute any SQL query and the problem appears. I have tried the
queries below.

SELECT pg_backend_pid();
SELECT * FROM pldbg_wait_for_breakpoint(1::INTEGER);
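
For reference, steps 3 to 5 in the first session might look roughly like
the sketch below. The body of func1 is only a placeholder (any plpgsql
function should do), and the script assumes the pldbgapi extension is
already installed on the server. The OID is looked up by name here rather
than hard-coded, since 16487 is just the value from my setup.

```sql
-- Session 1 (steps 3 to 5).
-- Assumption: pldebugger is built and installed, so pldbgapi is available.
CREATE EXTENSION IF NOT EXISTS pldbgapi;

-- Placeholder function; the body is illustrative only.
CREATE OR REPLACE FUNCTION func1() RETURNS integer AS $$
DECLARE
    total integer := 0;
BEGIN
    FOR i IN 1..3 LOOP
        total := total + i;
    END LOOP;
    RETURN total;
END;
$$ LANGUAGE plpgsql;

-- Enable debugging for func1, resolving its OID by name
-- instead of hard-coding it.
SELECT plpgsql_oid_debug('func1'::regproc::oid);

-- This call blocks after printing a NOTICE that contains 'PLDBGBREAK:'
-- followed by the port number needed in step 7.
SELECT func1();
```

The second session then uses the port number from that NOTICE as the
argument to pldbg_attach_to_port(), as in step 7.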


Problem Analysis:
Although I am very new to Windows, I tried debugging the issue and
found that the backend does not receive the query executed after
"SELECT pldbg_attach_to_port(2)" and waits indefinitely in
"WaitEventSetWaitBlock()", at the WaitForMultipleObjects() call, to
read the input command. Below is the backtrace.

postgres.exe!WaitEventSetWaitBlock(WaitEventSet * set, int cur_timeout, WaitEvent * occurred_events, int nevents) Line 1384 + 0x2b bytes C
postgres.exe!WaitEventSetWait(WaitEventSet * set, long timeout, WaitEvent * occurred_events, int nevents) Line 936 + 0x18 bytes C
postgres.exe!secure_read(Port * port, void * ptr, unsigned __int64 len) Line 168 C
postgres.exe!pq_recvbuf() Line 921 + 0x33 bytes C
postgres.exe!pq_getbyte() Line 963 + 0x5 bytes C
postgres.exe!SocketBackend(StringInfoData * inBuf) Line 334 + 0x5 bytes C
postgres.exe!ReadCommand(StringInfoData * inBuf) Line 507 + 0xa bytes C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4004 + 0xd bytes C
postgres.exe!BackendRun(Port * port) Line 4259 C
postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4750 C
postgres.exe!main(int argc, char * * argv) Line 216 C
postgres.exe!__tmainCRTStartup() Line 555 + 0x19 bytes C
postgres.exe!mainCRTStartup() Line 371 C

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

