pgbench stuck with 100% cpu usage

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: pgbench stuck with 100% cpu usage
Date: 2017-09-28 12:46:36
Message-ID: CABOikdPhfXTypckMC1Ux6Ko+hKBWwUBA=EXsvamXYSg8M9J94w@mail.gmail.com
Lists: pgsql-hackers

Hello,

While running some tests, I encountered a situation where pgbench gets
stuck in an infinite loop, consuming 100% CPU. The setup was:

- Start postgres server from the master branch
- Initialise pgbench
- Run pgbench -c 10 -T 100
- Stop postgres with -m immediate

Now it seems that pgbench gets stuck and its state machine does not
advance. Attaching a debugger to it, I saw that one of the clients remains
stuck in this loop forever.

    if (command->type == SQL_COMMAND)
    {
        if (!sendCommand(st, command))
        {
            /*
             * Failed. Stay in CSTATE_START_COMMAND state, to
             * retry. ??? What the point or retrying? Should
             * rather abort?
             */
            return;
        }
        else
            st->state = CSTATE_WAIT_RESULT;
    }

sendCommand() returns false because the underlying connection is bad
and PQsendQuery() returns 0. Since the client stays in
CSTATE_START_COMMAND, the same send is attempted again immediately and
fails again, hence the busy loop. Reading the comment, it seems that the
author thought about this situation but decided to retry instead of
aborting. I am not sure what the rationale for that decision was; maybe
to deal with transient failures?
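
To make the alternative concrete, here is a minimal, self-contained sketch
of the "abort instead of retry" behaviour that the code comment hints at.
It is not pgbench code and not a proposed patch: the Client struct, the
conn_ok flag, send_command(), advance() and run_client() are hypothetical
stand-ins, and the three-state enum is a simplification that only borrows
the naming style of the loop quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified, hypothetical client states (not pgbench's full set). */
typedef enum
{
    CSTATE_START_COMMAND,
    CSTATE_WAIT_RESULT,
    CSTATE_ABORTED
} ClientState;

typedef struct
{
    ClientState state;
    bool        conn_ok;    /* stand-in for a usable server connection */
} Client;

/* Stand-in for sendCommand(): fails whenever the connection is bad. */
static bool
send_command(const Client *st)
{
    return st->conn_ok;
}

/*
 * One step of the state machine. On a send failure the client moves to
 * CSTATE_ABORTED instead of staying in CSTATE_START_COMMAND, so the
 * caller's loop cannot spin on a dead connection.
 */
static void
advance(Client *st)
{
    if (st->state != CSTATE_START_COMMAND)
        return;

    if (!send_command(st))
    {
        fprintf(stderr, "client aborted: could not send command\n");
        st->state = CSTATE_ABORTED;
        return;
    }
    st->state = CSTATE_WAIT_RESULT;
}

/*
 * Drive the client until it leaves CSTATE_START_COMMAND, giving up after
 * max_steps iterations. Returns the number of steps taken.
 */
static int
run_client(Client *st, int max_steps)
{
    int         steps = 0;

    assert(max_steps > 0);
    while (st->state == CSTATE_START_COMMAND && steps < max_steps)
    {
        advance(st);
        steps++;
    }
    return steps;
}
```

With conn_ok set to false, run_client() returns after a single step with
the client in CSTATE_ABORTED. If advance() instead returned without
changing the state, as in the quoted loop, only the max_steps cap would
stop it.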

The commit that introduced this code is 12788ae49e1933f463bc, so I am
copying Heikki.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
