Re: error updating a tuple after promoting a standby

From: Tom DalPozzo <t(dot)dalpozzo(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: error updating a tuple after promoting a standby
Date: 2016-12-21 18:06:16
Message-ID: CAK77FCRkEhwezYwr1BPAarCoYi98PzHdc=01ksJa--TN3zhoJQ@mail.gmail.com
Lists: pgsql-general

> Is there an index on this table?
>
> Have you tried a REINDEX on it?

Yes, there is an index on the id field. I tried REINDEX and nothing changed,
but I now notice (perhaps it was already like that before reindexing) that
every time I issue that UPDATE query, the number of the block it can't read
increases by one. Now, after some attempts: ERROR: could not read block
12289 in file "base/16384/29153": read only 0 of 8192 bytes.
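
To make the numbers in that error message concrete, here is a minimal
sketch of the arithmetic, assuming the 8192-byte block size shown in the
message and that the block lies in the first segment file of the relation.
The function names are hypothetical, used only for illustration:

```python
# Sketch only: relates a block number from the error message to byte
# positions in the relation file. BLOCK_SIZE is taken from the message
# ("read only 0 of 8192 bytes"); the function names are made up.
BLOCK_SIZE = 8192


def block_offset(block_number, block_size=BLOCK_SIZE):
    """Byte offset at which the given block would start in the file."""
    return block_number * block_size


def min_file_size(block_number, block_size=BLOCK_SIZE):
    """Smallest file size (in bytes) that fully contains the given block."""
    return (block_number + 1) * block_size


if __name__ == "__main__":
    blk = 12289  # block number reported in the error
    print("block starts at byte", block_offset(blk))
    print("file must be at least", min_file_size(blk), "bytes")
```

A read that returns 0 bytes at that offset suggests the file on disk ends
at or before the start of block 12289, so comparing these figures with the
actual size of base/16384/29153 would show how many blocks are missing.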

> In your original post you mention this error occurred while testing
> backup/replication/standby promotion.
>
> What was the procedure you followed in doing the testing?
>
Unfortunately I don't remember every step, as I was focused on completely
different things. Anyway, in summary:
1 ran pg_basebackup on the primary and added the needed WAL files,
according to the .label file, to the pg_xlog dir of the just-created backup
(I'm trying without archiving).
2 copied the backup dir to my two standby PCs (one is for sync streaming
replication, the other for async).
3 configured recovery.conf etc. on the standby PCs.
4 started the two standby servers. The first was in sync replication, the
second in async. Messages were OK.
5 updated some thousands of rows on the primary just to check that it
worked fine.
6 stopped the primary.
7 promoted the 1st standby (new primary).
8 stopped/reconfigured/restarted the 2nd standby (async replication) to
point to the 1st standby.
9 checked that all messages were OK on both active PCs.
10 tried to update on the new primary and got the error (perhaps after some
successful updates, but I'm not sure).

A new observation:
I noticed that, always restarting from the corrupted cluster (without the
reindex, I mean), if I first update the row id=409 with a small amount of
data (3 bytes), it works, and after that even the update with the long data
works.

Regards
Pupillo
