psql is hanging

From: John Smith <localdevjs(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: psql is hanging
Date: 2018-11-30 15:53:21
Message-ID: CAK6G+57yG5W+ZSShNjHMKjKPpYB-enTaWKcwC5k5wa4Oew5zXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

We have a long script of sql that we run, several thousand lines of sql. If
I execute the script from start to finish, somewhere in the middle of it,
one sql command will hang and take 2 to 3 hours. During this time, "htop"
shows 100% cpu usage with a postgresql process, so it appears to be doing
something.

If I stop that command, then immediately rerun the rest of the script
starting from the command that "hung", the "hung" command then completes in
5 seconds and the rest of the script just continues on perfectly. I can
interrupt the command within 30 mins of hanging, or 2 hours of hanging;
regardless of when I interrrupt the hung command, it always immediately
completes successully in < 5 seconds when I rerun the script immediately
after that.

I have noticed this in multiple scripts, multiple commands; one was an
"insert into", a different script hung on a "update" and a third one would
hang on a "create table x as (select ... )". These are all short running
commands normally, and we have indexes on all the columns for the
appropriate "where" clauses.

I can't send the sql (forbidden), but the actual command does not appear to
be the issue. Rerunning the same little section of sql manually which
includes the hung command, always takes less than a minute, but the entire
sql which should only take 30 minutes in total will always hang at the same
spot; and then interrupting it (regardless of how long it was running) and
rerunning immediately completes.

The "hung" command does eventually finish at times after 2 to 3 hours, most
often we can't afford to wait for it to run since this batch of sql has to
run daily.

Some other notes:
- Environment is Ubuntu 16.04; postgresql version was originaly 9.5(?) but
we have upgraded to 11.1 using postgresql packages for ubuntu and the
problem has persisted across postgresql versions.
- There are no other queries running on the postgresql server during this
time, and server load is idling (other than whatever psql is doing).
- It is an amazon server, with ZFS as the files system, it has 50% disk
free on this storage partition that postgresql database is on.
- We have 60 gigs of ram total, 8 cores.
- We have run the entire script using "create unlogged table" everywhere as
well, no difference.
- We have made tunings to the postgresql configuration to make it use more
RAM.
- All sql commands are sent through a custom Java application; so we use
java + JDBC to connect to postgresql for this. The script has custom
commands which are interpreted by the Java app, so we can't test running
the exact script start to finish in psql native tools since the native
tools won't understand our custom sql. However the java app is very simple.
All individual sql commands are committed immediately after they run so
there is no long transaction in play here. The psql jdbc driver is
"postgresql-42.2.4".
- One may suspect the custom java app; but again, htop shows postgresql
using 100% CPU usage on one core, so postgresql is chewing on the command,
doing something.

I have attached our current postgresql.conf file.

We're kind of pulling out our hair here, any ideas?

Thanks!

Attachment Content-Type Size
postgresql.conf.zip application/zip 6.6 KB

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2018-11-30 16:02:27 Re: psql is hanging
Previous Message Stephen Frost 2018-11-30 15:33:27 Re: surprising query optimisation