Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

From: Greg Stark <gsstark(at)mit(dot)edu>
To: daveg <daveg(at)sonic(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
Date: 2011-03-08 02:07:53
Message-ID: AANLkTi=54GF1MJN3FSXL2gr9xG_aB09y_taT8daUwz4Z@mail.gmail.com
Lists: pgsql-admin pgsql-hackers

On Mon, Mar 7, 2011 at 11:53 PM, daveg <daveg(at)sonic(dot)net> wrote:
>> Looking at the code, I don't see how that situation could arise, though.
>> The value calculated by GetOldestXmin() should never move backwards. And
>> GetOldestXmin() is called in lazy_vacuum_rel(), after it has acquired a
>> lock on the table, which should protect from a race condition where two
>> vacuums could run on the table one after another, in a way where the
>> later vacuum runs with an OldestXmin calculated before the first vacuum.
>>
>> Hmm, fiddling with vacuum_defer_cleanup_age on the fly could cause that,
>> though. You don't do that, do you?
>
> No.
>
> I've updated the patch to collect db and schema and added Merlins patch as
> well and run it for a while. The attached log is all the debug messages
> for pg_statistic page 333 from one database. I've also attached the two
> most recent page images for that particular page, the last digits in the
> filename are the hour and minute of when the page was saved.

Well, from that log you definitely have OldestXmin going backwards, and
not by a little bit either. At 6:33 it set the all_visible flag, and
then at 7:01 OldestXmin was almost 1.3 million transactions earlier. In
fact, it went back to precisely the same value that was in use for a
transaction at 1:38. That seems like a bit of a coincidence, though it's
not repeated earlier.
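
For anyone who wants to check a log like this mechanically, here is a
minimal sketch of the comparison. It assumes the debug lines have already
been parsed into (timestamp, OldestXmin) pairs; the names xid_precedes and
find_regressions and the sample values are hypothetical and are not taken
from the attached log. The comparison is modulo 2^32, in the spirit of how
normal transaction IDs are ordered, so wraparound alone is not reported as
a regression.

```python
from typing import Iterable, List, Tuple

XID_MASK = 0xFFFFFFFF  # transaction IDs are 32-bit counters


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is logically older than XID b.

    The difference is taken modulo 2^32 and read as a signed 32-bit
    value, so a counter that has wrapped around still compares sanely.
    """
    diff = (a - b) & XID_MASK
    return diff >= 0x80000000  # "negative" as a signed 32-bit integer


def find_regressions(
    samples: Iterable[Tuple[str, int]],
) -> List[Tuple[str, str, int]]:
    """Report each consecutive pair of samples where OldestXmin moved backwards.

    Each result is (earlier_label, later_label, transactions_moved_back).
    """
    regressions = []
    prev = None
    for label, xmin in samples:
        if prev is not None and xid_precedes(xmin, prev[1]):
            distance = (prev[1] - xmin) & XID_MASK
            regressions.append((prev[0], label, distance))
        prev = (label, xmin)
    return regressions


# Hypothetical samples, shaped like the pattern described above.
samples = [("01:38", 5_000_000), ("06:33", 6_300_000), ("07:01", 5_000_000)]
for earlier, later, distance in find_regressions(samples):
    print(f"OldestXmin went back {distance} xids between {earlier} and {later}")
```

Only consecutive samples are compared, which is enough to flag the kind of
jump seen between the 6:33 and 7:01 entries.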

It also seems odd that it happens only with this one block of this one table.

What does SHOW ALL show for the current settings in effect? And what
was process 23896? Are there any other log messages from it? When did
it start?

--
greg
