Re: [HACKERS] per-sesson errors after interrupting CLUSTER pg_attribute (not attrdef)

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] per-sesson errors after interrupting CLUSTER pg_attribute (not attrdef)
Date: 2018-03-20 18:03:57
Message-ID: 20180320180357.GE24022@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 24, 2017 at 04:56:27PM -0700, Michael Paquier wrote:
> On Fri, Oct 20, 2017 at 9:01 AM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > This was briefly scary but seems to have been limited to my psql session (no
> > other errors logged). Issue with catcache (?)
> >
> > I realized that the backup job I'd kicked off was precluding the CLUSTER from
> > running, but that CLUSTER was still holding lock and stalling everything else
> > under the sun.
> >
> > ts=# \df qci_add*
> > ERROR: could not read block 8 in file "base/16400/999225102": read only 0 of 8192 bytes
> > ts=# \dt+ pg_att
> >
> > ts=# \dt+ pg_attrdef
> > ERROR: could not read block 8 in file "base/16400/999225102": read only 0 of 8192 bytes
>
> Perhaps that's too late, but to which relation is this relfilenode
> actually referring to?

Sorry for the late response :)

..but I've had mixed success reproducign this, and wasn't sure if it even an
issue under current postgres (the original issue was a 9.6 server+10.0client).

Now I know, this is still an issue under PG10.2:

[pryzbyj(at)united-telsasoft ~]$ psql ts
psql (10.2)
Type "help" for help.

ts=# CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
INFO: clustering "pg_catalog.pg_attribute" using sequential scan and sort
INFO: "pg_attribute": found 18074 removable, 2488511 nonremovable row versions in 80769 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 9.66 s, system: 1.92 s, elapsed: 22.29 s.
^CCancel request sent
ERROR: canceling statement due to user request
ts=# CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
INFO: clustering "pg_catalog.pg_attribute" using sequential scan and sort
INFO: "pg_attribute": found 18102 removable, 2488511 nonremovable row versions in 80769 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 10.00 s, system: 1.59 s, elapsed: 27.60 s.
^CCancel request sent
ERROR: canceling statement due to user request
ts=# CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
ERROR: could not open file "base/16400/948150297": No such file or directory
ts=# SELECT * FROM pg_attribute WHERE pg_relation_filenode=948150297;
ERROR: column "pg_relation_filenode" does not exist
LINE 1: SELECT * FROM pg_attribute WHERE pg_relation_filenode=948150...
^
ts=# SELECT * FROM pg_attribute WHERE pg_relation_filenode()=948150297;
ERROR: function pg_relation_filenode() does not exist
LINE 1: SELECT * FROM pg_attribute WHERE pg_relation_filenode()=9481...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
ts=# SELECT * FROM pg_class WHERE pg_relation_filenode(oid)=948150297;
ERROR: could not open file "base/16400/948150297": No such file or directory
ts=#

In another session:
ts=# SELECT * FROM pg_class WHERE pg_relation_filenode(oid)=948150297;
relname | relnamespace | reltype | reloftype | relowner | relam | relfilenode | reltablespace | relpages | reltuples | relallvisible | reltoastrelid | relhasindex | relisshared | relpersistence | relkind | relnatts | relchecks | relhasoids | rel
haspkey | relhasrules | relhastriggers | relhassubclass | relrowsecurity | relforcerowsecurity | relispopulated | relreplident | relispartition | relfrozenxid | relminmxid | relacl | reloptions | relpartbound
---------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+----
--------+-------------+----------------+----------------+----------------+---------------------+----------------+--------------+----------------+--------------+------------+--------+------------+--------------
(0 rows)

postgres=# SELECT session_line, message, query FROM postgres_log_2018_03_20_1200 WHERE pid=29855 AND session_id='5ab147d9.749f' ORDER BY 1;
session_line|message|query
1|statement: SELECT pg_catalog.quote_ident(c2.relname) FROM pg_catalog.pg_class c1, pg_catalog.pg_class c2, pg_catalog.pg_index i WHERE c1.oid=i.indrelid and i.indexrelid=c2.oid and (0 = pg_catalog.length('')) and pg_catalog.quote_ident(c1.relname)='pg_attribute' and pg_catalog.pg_table_is_visible(c2.oid)
LIMIT 1000|
2|statement: SELECT pg_catalog.quote_ident(c2.relname) FROM pg_catalog.pg_class c1, pg_catalog.pg_class c2, pg_catalog.pg_index i WHERE c1.oid=i.indrelid and i.indexrelid=c2.oid and (24 = pg_catalog.length('pg_attribute_relid_attnu')) and pg_catalog.quote_ident(c1.relname)='pg_attribute' and pg_catalog.pg_table_is_visible(c2.oid)
LIMIT 1000|
3|statement: CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;|
4|clustering "pg_catalog.pg_attribute" using sequential scan and sort|
5|temporary file: path "base/pgsql_tmp/pgsql_tmp29855.0", size 374013952|CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
6|"pg_attribute": found 18074 removable, 2488511 nonremovable row versions in 80769 pages|
7|canceling statement due to user request|CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
8|statement: CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;|
9|clustering "pg_catalog.pg_attribute" using sequential scan and sort|
10|temporary file: path "base/pgsql_tmp/pgsql_tmp29855.1", size 374013952|CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
11|"pg_attribute": found 18102 removable, 2488511 nonremovable row versions in 80769 pages|
12|canceling statement due to user request|CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
13|statement: CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;|
14|could not open file "base/16400/948150297": No such file or directory|CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
15|statement: SELECT * FROM pg_attribute WHERE pg_relation_filenode=948150297;|
16|column "pg_relation_filenode" does not exist|SELECT * FROM pg_attribute WHERE pg_relation_filenode=948150297;
17|statement: SELECT * FROM pg_attribute WHERE pg_relation_filenode()=948150297;|
18|function pg_relation_filenode() does not exist|SELECT * FROM pg_attribute WHERE pg_relation_filenode()=948150297;
19|statement: SELECT * FROM pg_class WHERE pg_relation_filenode(oid)=948150297;|
20|could not open file "base/16400/948150297": No such file or directory|SELECT * FROM pg_class WHERE pg_relation_filenode(oid)=948150297;
(20 rows)

I believe lines 1-2 were tab completion, which was also present in my original report.

In my original report, I believe \df wasn't working (or maybe tab completion
for a function?). This is different, as the CLUSTER isn't working.

Also note the subject of my original report was wrong (pg_attrdef not pg_attribute).

I'll leave the broken session opened for awhile in cases someone has questions
about it other than relfilenode, which apparently isn't working..

I don't know much but..is this related to pg_filenode.map?

Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-03-20 18:04:50 Re: configure's checks for --enable-tap-tests are insufficient
Previous Message Pavel Stehule 2018-03-20 17:38:47 Re: [HACKERS] proposal: schema variables