Re: FlushRelationBuffers error

From: Gaetano Mendola <mendola(at)bigfoot(dot)com>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Subject: Re: FlushRelationBuffers error
Date: 2004-09-30 16:56:27
Message-ID: 415C3ABB.4070206@bigfoot.com
Lists: pgsql-hackers

Jan Wieck wrote:
> Any chance for bad memory?
>

I'd say close to zero, but you never know. The server is now up and
running again without glitches.

I suspect a race condition somewhere in the reindex operation.

With the 7.3 engine (see the archives) I got a duplicate error during
reindexes at least once a month. That was on a different server, and at
the time I worked around it by no longer reindexing the DB daily (which
reduced the chances of hitting it).

With 7.4 this is the first time since November 2003 that I have seen this
error (and, by coincidence, again during a reindex), so I suspect that the
race condition is still there but is less likely to show up.

Would it be so dangerous to teach the postmaster to solve this kind of
problem without direct user intervention?

Regards
Gaetano Mendola

> On 9/30/2004 6:16 AM, Gaetano Mendola wrote:
>
>> Hi all,
>> I'm running postgres 7.4.5 on a Linux box. This morning I got this
>> error in my logs:
>>
>> WARNING: FlushRelationBuffers("exp_provider", 1836): block 1460 is
>> referenced (private 0, global 1)
>> ERROR: FlushRelationBuffers returned -2
>> DEBUG: AbortCurrentTransaction
>> PANIC: cannot abort transaction 354676201, it was already committed
>>
>> after the recovery:
>>
>> ERROR: could not access status of transaction 352975274
>> DEBUG: AbortCurrentTransaction
>>
>> These messages repeated for 5 hours.
>>
>>
>>
>> I had verbosity set to terse (I run the server at the debug2 level), so
>> I didn't see the exact reason for this. After setting verbosity to
>> "verbose" I got the entire message:
>>
>> ERROR: 58P01: could not access status of transaction 352975274
>> DETAIL: could not open file "/var/lib/pgsql/data/pg_clog/0150": No
>> such file or directory
>> LOCATION: SlruReportIOError, slru.c:609
>> DEBUG: 00000: AbortCurrentTransaction
>> LOCATION: PostgresMain, postgres.c:2721
>>
>> In the pg_clog directory I had only the file 0152 !
>>
>>
>> I had to create an 8k file filled with zeroes, and then I discovered the offset:
>>
>> ERROR: XX000: could not access status of transaction 352975274
>> DETAIL: could not read from file "/var/lib/pgsql/data/pg_clog/0150"
>> at offset 155648: Success
>> LOCATION: SlruReportIOError, slru.c:630
>> DEBUG: 00000: AbortCurrentTransaction
>> LOCATION: PostgresMain, postgres.c:2721
>>
>> After extending that file to cover that offset, the problem seems to
>> be fixed.
>>
>> Info for hackers: exp_provider is an index, and a reindex was in
>> progress when that message appeared.
>>
>> Some questions:
>> What about the 0151 file?
>> Don't you think that the message about the missing file should appear
>> even with verbosity terse?
>> Why is the offset emitted only if the file was found?
>>
>> I have to thank Neil Conway, who was helping me on IRC with this error.
>>
>> If you need further info, please let me know.
>>
>> Regards
>> Gaetano Mendola
>>
>>
>
>
>
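The file name and offset in the quoted errors appear consistent with how the
commit log is laid out. As a rough sketch, assuming the default 8 kB block
size, two status bits per transaction (four transactions per byte), and 32
pages per segment file, transaction 352975274 maps to file 0150 at offset
155648, and the already-committed transaction 354676201 from the PANIC maps
to file 0152. The function names below are made up for illustration and are
not PostgreSQL APIs.

```python
# Sketch only: these constants are assumptions about a default 7.4 build.
BLCKSZ = 8192            # assumed default block (page) size in bytes
XACTS_PER_BYTE = 4       # assumed two status bits per transaction
PAGES_PER_SEGMENT = 32   # assumed number of pages per pg_clog segment file

XACTS_PER_PAGE = BLCKSZ * XACTS_PER_BYTE
XACTS_PER_SEGMENT = XACTS_PER_PAGE * PAGES_PER_SEGMENT


def clog_location(xid):
    """Return (segment file name, byte offset of the page holding xid)."""
    segment = xid // XACTS_PER_SEGMENT
    page_in_segment = (xid % XACTS_PER_SEGMENT) // XACTS_PER_PAGE
    return "%04X" % segment, page_in_segment * BLCKSZ


def min_segment_size(xid):
    """Smallest file size in bytes that fully contains the page for xid."""
    _, offset = clog_location(xid)
    return offset + BLCKSZ


if __name__ == "__main__":
    # The two transaction IDs that appear in the quoted log messages.
    for xid in (352975274, 354676201):
        name, offset = clog_location(xid)
        print(xid, name, offset, min_segment_size(xid))
```

Under the same assumptions, the page for 352975274 ends at byte 163840, which
would explain why a single 8k zero-filled file was not enough, and file 0151
would cover transactions 353370112 through 354418687.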
