What else could I've done? COPY to unlogged tbl "hung"/locked the table

From: amacvar <kahitarich-postgresForum(at)yahoo(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: What else could I've done? COPY to unlogged tbl "hung"/locked the table
Date: 2012-08-30 19:16:02
Message-ID: 1346354162985-5721983.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi,

I'm relatively new to Pg.

Question:
What could have happened to cause the COPY to unlogged table to hang?
And what would the recreate of the unlogged table have "released" that a
reboot of the cluster did not.
Any other place that I could've looked or done something else other than
recreating the unlogged table and/or rebooting the cluster.

Timeline:
Day before, the cluster host had to be rebooted since it was unresponsive
due to perhaps "a massive loop in vi re substitution" (as dev mentioned it)

Yesterday, a copy to an unlogged table thru java was hung. Usually it is
instantaneous, but it was hung for 2hrs before they contacted me.

pg_locks showed me the following:
pid | locktype | dbase | table |
virtualtransaction | virtualxid | mode | granted
26554 | relation | dbname | pg_class | 8/13
| | AccessShareLock | t
26554 | relation | dbname | pg_class_oid_index | 8/13
| | AccessShareLock | t
26554 | relation | dbname | pg_class_relname_nsp_index | 8/13
| | AccessShareLock | t
26554 | relation | | pg_database | 8/13
| | AccessShareLock | t
26554 | relation | | pg_database_datname_index | 8/13
| | AccessShareLock | t
26554 | relation | | pg_database_oid_index | 8/13
| | AccessShareLock | t
26554 | relation | dbname | pg_locks | 8/13
| | AccessShareLock | t
26554 | virtualxid | | | 8/13
| 8/13 | ExclusiveLock | t
27288 | relation | dbname | unlogged_tbl_idx | 11/33
| | RowExclusiveLock | t
27288 | relation | dbname | unlogged_tbl | 11/33
| | RowExclusiveLock | t
27288 | relation | dbname | unlogged_tbl_key | 11/33
| | RowExclusiveLock | t
27288 | relation | dbname | unlogged_tbl_pkey | 11/33
| | RowExclusiveLock | t
27288 | transactionid | | | 11/33
| | ExclusiveLock | t
27288 | virtualxid | | | 11/33
| 11/33 | ExclusiveLock | t

When the developer killed the shell script from his end, it didn't release
the locks.
neither pg_cancel nor pg_terminate of the copy job worked.
so finally a kill -9 of the COPY os process caused the cluster to reboot.

2012-08-29 14:29:58 EDT [8119]: [2-1] LOG: server process (PID 30468) was
terminated by signal 9: Killed
2012-08-29 14:29:58 EDT [8119]: [3-1] LOG: terminating any other active
server processes
2012-08-29 14:29:58 EDT [25798]: [3-1] WARNING: terminating connection
because of crash of another server process
2012-08-29 14:29:58 EDT [25798]: [4-1] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2012-08-29 14:29:58 EDT [25798]: [5-1] HINT: In a moment you should be able
to reconnect to the database and repeat your command.

The load script was kicked off and it hung again.
A fast stop of the server failed, so an immediate stop had to be executed.

2012-08-29 14:45:44 EDT [8119]: [9-1] LOG: received immediate shutdown
request
2012-08-29 14:45:44 EDT [27288]: [1-1] WARNING: terminating connection
because of crash of another server process
2012-08-29 14:45:44 EDT [27288]: [2-1] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2012-08-29 14:45:44 EDT [27288]: [3-1] HINT: In a moment you should be able
to reconnect to the database and repeat your command.
2012-08-29 14:45:44 EDT [27288]: [4-1] CONTEXT: COPY unlogged_tbl, line 1:
"50005001|<abacus><address>5291 math
drive</address><address2/><alt_names/><city>atlantis..."

The developer tried to execute using only the java class and not thru a
shell script.
The copy process was kicked off and it hung again.

I created another table using

db=# create table tmp_tbl as select * from unlogged_tbl;

and COPY to this tmp_tbl from the same file was instantaneous.
However i later noticed that this creates a logged table, NOT an unlogged
table (as i had expected :) )

I dropped and recreated the unlogged tables and that seems to have worked.
No more hanging of the load script. And so far so good.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/What-else-could-I-ve-done-COPY-to-unlogged-tbl-hung-locked-the-table-tp5721983.html
Sent from the PostgreSQL - admin mailing list archive at Nabble.com.

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message amacvar 2012-08-31 18:21:21 Re: What else could I've done? COPY to unlogged tbl "hung"/locked the table
Previous Message Robert Haas 2012-08-30 18:58:30 Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)