Re: plperlu stored procedure seems to freeze for a minute

From: "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: plperlu stored procedure seems to freeze for a minute
Date: 2015-12-02 15:26:14
Message-ID: 20151202152613.GB10220@hjp.at
Lists: pgsql-general

On 2015-12-01 20:55:02 +0100, Peter J. Holzer wrote:
> On 2015-12-01 18:58:31 +0100, Peter J. Holzer wrote:
> > I suspect such an interaction because I cannot reproduce the problem
> > outside of a stored procedure. A standalone Perl script doing the same
> > requests doesn't get a timeout.
[...]
> The strace doesn't show a reason for the SIGALRM, though. No alarm(2) or
> setitimer(2) system call (I connected strace to a running postgres
> process just after I got the prompt from "psql" and before I typed
> "select * from mb_search('export');" (I used a different (but very
> similar) stored procedure for those tests because it is much easier to
> find a search which is slow enough to trigger a timeout at least
> sometimes than a data request (which normally finishes in
> milliseconds))).
>
> So I guess my next task will be to find out where that SIGALRM comes
> from and/or whether I can just restart the zmq_msg_recv if it happens.

Ok, I think I know where that SIGALRM comes from: It's the
AuthenticationTimeout. What I'm seeing in strace (if I attach it early
enough) is that during authentication the postgres worker process calls
setitimer with a 60 second timeout twice. This matches the comment in
backend/postmaster/postmaster.c:

* Note: AuthenticationTimeout is applied here while waiting for the
* startup packet, and then again in InitPostgres for the duration of any
* authentication operations. So a hostile client could tie up the
* process for nearly twice AuthenticationTimeout before we kick him off.

As explained in backend/utils/misc/timeout.c, the timers are never
cancelled: if a timeout is cancelled, the timer still fires, but
postgres just sees that it has nothing to do and resumes whatever it
was doing.

This is also what I'm seeing: 60 seconds after start, the process
receives a SIGALRM.
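
To illustrate the mechanism, here is a minimal C sketch. It is not
postgres code: the flag, the function names and the 200 ms interval
(standing in for the 60 seconds) are made up for the example, and it
assumes the SIGALRM handler is installed without SA_RESTART. The timeout
is "cancelled" only in bookkeeping, the timer stays armed, and the
signal still interrupts whatever blocking call happens to be running.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical bookkeeping flag: "is any timeout still wanted?" */
static volatile sig_atomic_t timeout_active = 0;
static volatile sig_atomic_t timeouts_acted_on = 0;

static void on_alarm(int signo)
{
    (void) signo;
    if (timeout_active)
        timeouts_acted_on++;    /* a live timeout would be handled here */
    /* otherwise there is nothing to do; just return */
}

/*
 * Arm a one-shot timer, cancel the timeout logically without disarming
 * the timer, then block in read() on an empty pipe.
 * Returns the errno left by read(), or 0 if read() did not fail.
 */
int stale_alarm_errno(void)
{
    int fds[2];
    char byte;
    struct sigaction sa;
    struct itimerval tv;
    int rc;

    rc = pipe(fds);
    assert(rc == 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* assumption: no SA_RESTART */
    rc = sigaction(SIGALRM, &sa, NULL);
    assert(rc == 0);

    memset(&tv, 0, sizeof tv);
    tv.it_value.tv_usec = 200000;   /* 200 ms instead of 60 s */
    timeout_active = 1;
    rc = setitimer(ITIMER_REAL, &tv, NULL);
    assert(rc == 0);

    timeout_active = 0;         /* "cancelled", but the timer is still armed */

    ssize_t n = read(fds[0], &byte, 1);   /* blocks: nobody ever writes */
    int err = (n < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

Calling stale_alarm_errno() from a small main() should return EINTR
after roughly 200 ms, although the handler itself found nothing to do.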

If the process is idle or in a "normal" SQL statement at the time, that's
not a problem. But if it is in one of my stored procedures which is
currently calling a ØMQ function which is waiting for some I/O
(zmq_msg_recv(), most likely), that gets interrupted and returns an
error which my code doesn't know how to handle (yet). So the error gets
back to the user.
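
One common way to cope with such an interruption is to restart the call
when it fails only because a signal arrived. The following C sketch
shows that pattern with a plain read() on a pipe as a stand-in for
zmq_msg_recv(); the function names, the forked "slow peer" and the
timings are assumptions made up for the example. Whether restarting is
appropriate for a given ØMQ call would still have to be checked against
the ØMQ documentation.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static void on_alarm(int signo) { (void) signo; /* nothing to do */ }

/* Retry read() for as long as it fails only because a signal arrived. */
ssize_t read_retrying(int fd, void *buf, size_t len, int *interruptions)
{
    ssize_t n;
    while ((n = read(fd, buf, len)) < 0 && errno == EINTR)
        (*interruptions)++;
    return n;
}

/*
 * A child process answers after 400 ms; a stray SIGALRM hits the parent
 * after 100 ms. Returns the number of interruptions survived, or -1 if
 * the reply did not arrive intact.
 */
int demo_retry(void)
{
    int fds[2], interruptions = 0, rc;
    char byte = 0;
    struct sigaction sa;
    struct itimerval tv;

    rc = pipe(fds);
    assert(rc == 0);

    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {                      /* child: the slow peer */
        struct timespec delay = { 0, 400000000L };
        nanosleep(&delay, NULL);
        _exit(write(fds[1], "x", 1) == 1 ? 0 : 1);
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                     /* assumption: no SA_RESTART */
    rc = sigaction(SIGALRM, &sa, NULL);
    assert(rc == 0);

    memset(&tv, 0, sizeof tv);
    tv.it_value.tv_usec = 100000;        /* stray alarm after 100 ms */
    rc = setitimer(ITIMER_REAL, &tv, NULL);
    assert(rc == 0);

    ssize_t n = read_retrying(fds[0], &byte, 1, &interruptions);

    waitpid(pid, NULL, 0);
    close(fds[0]);
    close(fds[1]);
    return (n == 1 && byte == 'x') ? interruptions : -1;
}
```

With these timings demo_retry() is expected to see one interruption and
then still receive the reply; without the retry loop the first read()
would simply have failed with EINTR.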

A strange interaction between postgres and ØMQ indeed. But now that I
know what's causing it I can handle that. Thanks for your patience.

hp

--
   _  | Peter J. Holzer    | I want to forget all about both belts and
|_|_) |                    | suspenders; instead, I want to buy pants
| |   | hjp(at)hjp(dot)at  | that actually fit.
__/   | http://www.hjp.at/ |  -- http://noncombatant.org/
