Re: Errors on missing pg_subtrans/ files with 9.3

From: J Smith <dark(dot)panda+lists(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Errors on missing pg_subtrans/ files with 9.3
Date: 2013-11-13 17:29:37
Message-ID: CADFUPge4ySeoRNtNsG8S2hThNXFT+RD8CDkVgw7nb8BaPk1cfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Looks like we got another set of errors overnight. Here's the log file
from the errors. (Log file scrubbed slightly to remove private data,
but still representative of the problem I believe.)

Nov 13 05:34:34 dev postgres[6084]: [4-1] user=dev,db=dev ERROR:
could not access status of transaction 6337381
Nov 13 05:34:34 dev postgres[6084]: [4-2] user=dev,db=dev DETAIL:
Could not open file "pg_subtrans/0060": No such file or directory.
Nov 13 05:34:34 dev postgres[6084]: [4-3] user=dev,db=dev CONTEXT:
SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
Nov 13 05:34:34 dev postgres[6084]: [4-4] user=dev,db=dev STATEMENT:
update listings set deactivated_at=$1 where id=$2 and lock_version=$3
Nov 13 05:34:34 dev postgres[6076]: [4-1] user=dev,db=dev ERROR:
could not access status of transaction 6337381
Nov 13 05:34:34 dev postgres[6076]: [4-2] user=dev,db=dev DETAIL:
Could not open file "pg_subtrans/0060": No such file or directory.
Nov 13 05:34:34 dev postgres[6076]: [4-3] user=dev,db=dev CONTEXT:
SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
Nov 13 05:34:34 dev postgres[6076]: [4-4] user=dev,db=dev STATEMENT:
update listings set deactivated_at=$1 where id=$2 and lock_version=$3
Nov 13 05:34:34 dev postgres[6087]: [4-1] user=dev,db=dev ERROR:
could not access status of transaction 6337381
Nov 13 05:34:34 dev postgres[6087]: [4-2] user=dev,db=dev DETAIL:
Could not open file "pg_subtrans/0060": No such file or directory.
Nov 13 05:34:34 dev postgres[6087]: [4-3] user=dev,db=dev CONTEXT:
SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
Nov 13 05:34:34 dev postgres[6087]: [4-4] user=dev,db=dev STATEMENT:
update listings set deactivated_at=$1 where id=$2 and lock_version=$3
Nov 13 05:34:34 dev postgres[6086]: [4-1] user=dev,db=dev ERROR:
could not access status of transaction 6337381
Nov 13 05:34:34 dev postgres[6086]: [4-2] user=dev,db=dev DETAIL:
Could not open file "pg_subtrans/0060": No such file or directory.
Nov 13 05:34:34 dev postgres[6086]: [4-3] user=dev,db=dev CONTEXT:
SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
Nov 13 05:34:34 dev postgres[6086]: [4-4] user=dev,db=dev STATEMENT:
update listings set deactivated_at=$1 where id=$2 and lock_version=$3
Nov 13 05:34:34 dev postgres[6088]: [4-1] user=dev,db=dev ERROR:
could not access status of transaction 6337381
Nov 13 05:34:34 dev postgres[6088]: [4-2] user=dev,db=dev DETAIL:
Could not open file "pg_subtrans/0060": No such file or directory.
Nov 13 05:34:34 dev postgres[6088]: [4-3] user=dev,db=dev CONTEXT:
SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
Nov 13 05:34:34 dev postgres[6088]: [4-4] user=dev,db=dev STATEMENT:
update listings set deactivated_at=$1 where id=$2 and lock_version=$3
Nov 13 05:34:34 dev postgres[6085]: [4-1] user=dev,db=dev ERROR:
could not access status of transaction 6337381
Nov 13 05:34:34 dev postgres[6085]: [4-2] user=dev,db=dev DETAIL:
Could not open file "pg_subtrans/0060": No such file or directory.
Nov 13 05:34:34 dev postgres[6085]: [4-3] user=dev,db=dev CONTEXT:
SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
Nov 13 05:34:34 dev postgres[6085]: [4-4] user=dev,db=dev STATEMENT:
update listings set deactivated_at=$1 where id=$2 and lock_version=$3

Several processes all seemed to hit the problem at the same moment,
and all of them refer to the same transaction ID. Again, the file
pg_subtrans/0060 doesn't exist, and the only file that does exist is
pg_subtrans/005A which appears to be a zeroed-out file 245760 bytes in
length.

Still don't have a clue as to how I can reproduce the problem. It
seems that in all cases the error occurred during either an UPDATE to
a table_X or an INSERT to table_Y. In all cases, the error occurred in
a manner identical to those shown in the log above, the only
difference being either an UPDATE on table_X or an INSERT on table_Y.

Not sure what direction I should head to now. Perhaps some aggressive
logging would help, so we can see the queries surrounding the
problems? I could reconfigure things to capture all statements and set
up monit or something to send an alert when the problem resurfaces,
for instance.

Cheers all.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2013-11-13 17:43:00 Re: Clang 3.3 Analyzer Results
Previous Message Heikki Linnakangas 2013-11-13 16:49:55 Handling GIN incomplete splits