BUG #17568: unexpected zero page at block 0 during REINDEX CONCURRENTLY

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: sk(at)zsrv(dot)org
Subject: BUG #17568: unexpected zero page at block 0 during REINDEX CONCURRENTLY
Date: 2022-08-03 14:23:34
Message-ID: 17568-ef121b956ec1559c@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17568
Logged by: Sergei Kornilov
Email address: sk(at)zsrv(dot)org
PostgreSQL version: 14.4
Operating system: Ubuntu 20.04
Description:

Hello,
I recently ran "REINDEX INDEX CONCURRENTLY i_sess_uuid;" (PostgreSQL 14.4, table
of around 700 GB). Shortly after the start of the phase "index validation:
scanning index", insert and update operations suddenly started returning an
error:

ERROR: index "i_sess_uuid_ccnew" contains unexpected zero page at block 0
HINT: Please REINDEX it.

i_sess_uuid_ccnew is exactly the new index that REINDEX CONCURRENTLY was
building at that time. It is clear that the errors started after
index_set_state_flags with INDEX_CREATE_SET_READY, because insert and update
queries now need to update this index too. But it remains unclear how
exactly page 0 turned out to be all zeros at this point.
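
For anyone following along, the state flags of the transient index can be watched from another session while the REINDEX runs. This is only a sketch: it assumes the new index still carries the name i_sess_uuid_ccnew from the error message, and it reads the standard pg_index columns (indisready is the flag that INDEX_CREATE_SET_READY turns on).

```sql
-- Sketch: show the state flags of the index being built by REINDEX CONCURRENTLY.
-- The index name is taken from the error message above.
SELECT c.relname,
       i.indisready,   -- true once inserts/updates must maintain the index
       i.indisvalid,   -- true once the index may be used by queries
       i.indislive
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relname = 'i_sess_uuid_ccnew';
```

Once indisready is true while indisvalid is still false, writers have to touch the index, which matches the moment the errors began.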

I think some process may have loaded the btree metapage (page 0) into shared
buffers prior to the end of _bt_load. In this case, the error can be reproduced
(14.4, 14 STABLE, HEAD):

create extension pageinspect;
create table test as select generate_series(1,1e4) as id;
create index test_id_idx on test(id);
-- prepare gdb for this backend with a breakpoint on _bt_uppershutdown
reindex index concurrently test_id_idx ;

While gdb is stopped at the breakpoint, run from a second session:

insert into test values (0);
SELECT * FROM bt_metap('test_id_idx_ccnew');
-[ RECORD 1 ]-------------+---
magic | 0
version | 0
root | 0
level | 0
fastroot | 0
fastlevel | 0
last_cleanup_num_delpages | 0
last_cleanup_num_tuples | -1
allequalimage | f

Then let the reindex backend continue. New inserts, along with the reindex
itself, will fail with the error "index "test_id_idx_ccnew" contains
unexpected zero page at block 0". The metapage on disk will be written
correctly after the _bt_uppershutdown call and replicated correctly to the
standby. But it is still wrong in shared buffers on the primary.
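
Whether block 0 of the new index is already sitting in shared buffers can be checked with the contrib module pg_buffercache. The query below is a sketch under the assumptions of the reproduction above (index name test_id_idx_ccnew, run in the same database); it is not part of the original test case.

```sql
-- Sketch: is the metapage (block 0, main fork) of the new index cached?
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

SELECT b.bufferid,
       b.relblocknumber,
       b.isdirty,
       b.usagecount
FROM pg_buffercache b
WHERE b.relfilenode = pg_relation_filenode('test_id_idx_ccnew'::regclass)
  AND b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database())
  AND b.relforknumber = 0      -- main fork
  AND b.relblocknumber = 0;    -- btree metapage
```

A row returned while gdb is still stopped at the breakpoint would mean the zeroed page was cached before _bt_load finished writing the metapage; no row would mean nothing has read it yet.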

I still don't know whether this is what happened to my database. Monitoring
queries (like pg_total_relation_size, pg_stat_user_indexes,
pg_statio_user_indexes) do not load the metapage into shared buffers. Normal
select/insert/update/delete should not touch an index that is not yet ready
in any way. This database does not have any extensions installed other than
those available in contrib.

Thoughts?

regards, Sergei
