Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again

From: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
Date: 2012-02-21 04:50:57
Message-ID: CAK-MWwThExSgg74OaRk-fe36CsKfBhj+K_n6vmY20hr11WDcYA@mail.gmail.com
Lists: pgsql-general pgsql-hackers

On Tue, Feb 21, 2012 at 3:47 PM, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com> wrote:

>
>
> On Tue, Feb 21, 2012 at 3:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> I wrote:
>> > OK, so that pretty much explains where the visible symptoms are coming
>> > from: somehow, the table got truncated but its pkey index did not get
>> > cleared out. So an insert creates an empty page zero, inserts a heap
>> > tuple there, tries to insert an index entry. The btree code sees there
>> > is an index entry for that key already, and tries to fetch the heap
>> > tuple for that index entry to see if it's dead (which would allow the
>> > insertion to proceed). But the block number the index is pointing at
>> > isn't there, so you get the quoted error message. The insertion rolls
>> > back, leaving a dead tuple that can be garbage-collected by autovacuum,
>> > after which it truncates the table again --- but of course without
>> > removing any index entries, except maybe one for TID (0,1) if that's
>> > still there. Lather rinse repeat.
>>
>> Hmm ... actually there is a point that this theory doesn't explain
>> entirely. If the probability of a collision with an existing index
>> entry was near 100%, then each hourly cron job should only have been
>> able to insert one or a few heap tuples before failing. That would
>> not trigger an autovacuum right away. Eventually the number of dead
>> tuples would build up to the point where autovacuum got interested,
>> but it strains credulity a bit to assume that this happened exactly
>> after the last hourly run before you renamed the table. Yet, if that
>> didn't happen just that way, how come the size of the table is exactly
>> zero now?
>>
>> The theory would be more satisfactory if we could expect that an hourly
>> run would be able to insert some thousands of tuples before failing,
>> enough to trigger an autovacuum run. So I'm wondering if maybe the
>> index is *partially* cleaned out, but not completely. Does this
>> materialized view have a fairly predictable number of rows, and if so
>> how does that compare to the number of entries in the index? (If you
>> have no other way to get the number of entries in the index, try
>> inserting a dummy row, deleting it, and then VACUUM VERBOSE.)
>>
>> regards, tom lane
>>
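
A minimal rendering of Tom's dummy-row suggestion, assuming the table will
accept a single row copied from production (all names below are the ones
already in this thread):

insert into agency_statistics_old select * from agency_statistics limit 1;
-- create one dead tuple so VACUUM has a reason to report on the index
delete from agency_statistics_old;
vacuum verbose agency_statistics_old;
-- then read N from the INFO line:
--   index "agency_statistics_pkey" now contains N row versions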
>
> There are some funny results:
>
> hh=# VACUUM verbose agency_statistics_old;
> INFO: vacuuming "public.agency_statistics_old"
> INFO: index "agency_statistics_pkey" now contains 0 row versions in 605
> pages
> DETAIL: 0 index row versions were removed.
> 595 index pages have been deleted, 595 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.01 sec.
> INFO: "agency_statistics_old": found 0 removable, 0 nonremovable row
> versions in 0 out of 0 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> There were 0 unused item pointers.
> 0 pages are entirely empty.
> CPU 0.00s/0.00u sec elapsed 0.01 sec.
> VACUUM
>
> However, when I try to populate that table with production data, I get an error:
>
> hh=# insert into agency_statistics_old select * from agency_statistics;
> ERROR: could not read block 228 in file "base/16404/118881486": read only
> 0 of 8192 bytes
>
> I.e. the database sees that the index has zero rows, but an insert still fails.
>
> Maybe I should use the pageinspect extension to see the actual content of the index pages?
>
>
>
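As a cross-check, the filenode from the error message can be mapped back to
a relation, to confirm whether block 228 is being read from the heap or from
the pkey index (a sketch; pg_relation_filepath() is available on 9.0+):

select relname, relkind from pg_class where relfilenode = 118881486;
-- or going the other way:
select pg_relation_filepath('agency_statistics_old');
select pg_relation_filepath('agency_statistics_pkey');
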
What makes your idea that the "index is *partially* cleaned out, but not
completely" highly probable is that the next query produces no error:

hh=# insert into agency_statistics_old select * from agency_statistics
limit 1;
INSERT 0 1
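
To follow up on the pageinspect question above, an inspection session might
look roughly like this (a sketch: on 9.1+ the extension is installed with
CREATE EXTENSION, on older releases via the contrib SQL script; stale
entries would show up as ctids pointing at heap blocks that no longer exist):

create extension pageinspect;
-- metapage: where the (fast) root is and how deep the tree is
select * from bt_metap('agency_statistics_pkey');
-- items on a given page: each ctid is the heap tuple an entry points at
select itemoffset, ctid from bt_page_items('agency_statistics_pkey', 1);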

--
Maxim Boguk
Senior Postgresql DBA.
