Re: After replication failover: could not read block X in file Y read only 0 of 8192 bytes

From: Brian Sutherland <brian(at)vanguardistas(dot)net>
To: Venkata Balaji N <nag1010(at)gmail(dot)com>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: After replication failover: could not read block X in file Y read only 0 of 8192 bytes
Date: 2016-05-31 09:22:12
Message-ID: 20160531092212.GA63217@Admins-MacBook-Air-2.local
Lists: pgsql-general

On Tue, May 31, 2016 at 04:49:26PM +1000, Venkata Balaji N wrote:
> On Mon, May 30, 2016 at 11:37 PM, Brian Sutherland <brian(at)vanguardistas(dot)net>
> wrote:
>
> > I'm running a streaming replication setup with PostgreSQL 9.5.2 and have
> > started seeing these errors on a few INSERTs:
> >
> > ERROR: could not read block 8 in file "base/3884037/3885279": read
> > only 0 of 8192 bytes
> >
>
> These errors are occurring on master or slave ?

On the master (which was previously a slave)

> > on a few tables. If I look at that specific file, it's only 6 blocks
> > long:
> >
> > # ls -la base/3884037/3885279
> > -rw------- 1 postgres postgres 49152 May 30 12:56 base/3884037/3885279
> >
> > It seems that this is the case on most tables in this state. I haven't
> > seen any error on SELECT and I can SELECT * on all the tables I know
> > have this problem. The database machine is under reasonable load.
> >
>
> So, the filenodes generating this error belong to a Table ? or an Index ?

So far I have found 3 tables with this issue; 2 of them were pg_statistic
in different databases. The one referenced above is definitely a table:
"design_file".

The usage pattern on that table is to DELETE and later INSERT a few
hundred rows at a time on an occasional basis. The table is very small,
680 rows.
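
As a side note on the numbers quoted above: the file is 49152 bytes and the
error message reports 8192-byte blocks, so the file holds 6 blocks (0 to 5),
and a read of block 8 starts past the end of the file. A minimal sketch of
that arithmetic follows. The function names are made up for illustration,
and it assumes the block lives in the first segment file of the relation,
which is the case for a file this small.

```python
BLOCK_SIZE = 8192  # bytes per block, as reported in the error message


def blocks_in_file(size_bytes, block_size=BLOCK_SIZE):
    """Number of whole blocks in a relation file of the given size."""
    return size_bytes // block_size


def bytes_available(block_no, size_bytes, block_size=BLOCK_SIZE):
    """Bytes that a read of block_no could return from a file of size_bytes."""
    offset = block_no * block_size
    # A read starting at or past end-of-file returns nothing.
    return max(0, min(block_size, size_bytes - offset))


size = 49152  # file size taken from the ls output quoted above
print(blocks_in_file(size))      # 6
print(bytes_available(8, size))  # 0, matching "read only 0 of 8192 bytes"
print(bytes_available(5, size))  # 8192, the last block actually present
```

This only shows that the error text is consistent with the file size; it
says nothing about why something still points at block 8.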

> > On some tables an "ANALYZE tablename" causes the error.

I discovered why ANALYZE raised an error: pg_statistic itself was
affected. "vacuum full verbose pg_statistic;" fixed it. I'm hoping any
missing statistics get regenerated.

> > We recently had a streaming replication failover after loading a large
> > amount of data with pg_restore. The problems seem to have started after
> > that, but I'm not perfectly sure.
>
> pg_restore has completed successfully ?

pg_restore did complete successfully

> When pg_restore was running, did
> you see anything suspicious in the postgresql logfiles ?

The restore happened on the old master. The logfile was long since
deleted :(

> > I have data_checksums switched on so am suspecting a streaming
> > replication bug. Anyone know of a recent bug which could have caused
> > this?
> >
>
> I cannot conclude at this point. I encountered these kinds of errors with
> indexes, and re-indexing fixed them.

This is actually the second time I am seeing these kinds of errors. In
the past, after verifying that no data was lost, I used VACUUM FULL to
recover the ability to INSERT. There was no pitchfork uprising...

> Regards,
> Venkata B N
>
> Fujitsu Australia

--
Brian Sutherland
