Re: Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: daveg <daveg(at)sonic(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
Date: 2011-03-08 08:00:01
Message-ID: 4D75E201.20402@enterprisedb.com
Lists: pgsql-admin pgsql-hackers

On 08.03.2011 04:07, Greg Stark wrote:
> Well from that log you definitely have OldestXmin going backwards. And
> not by a little bit either. At 6:33 it set the all_visible flag and
> then at 7:01 it was almost 1.3 million transactions earlier. In fact
> to precisely the same value that was in use for a transaction at 1:38.
> That seems like a bit of a coincidence though it's not repeated
> earlier.

Yep. After staring at GetOldestXmin() again, it finally struck me how
OldestXmin can move backwards. You need two databases for it, which
probably explains why this has been so elusive.

Here's how to reproduce that:

CREATE DATABASE foodb;
CREATE DATABASE bardb;

session 1, in foodb:

foodb=# begin isolation level serializable;
BEGIN
foodb=# CREATE TABLE foo (a int4); -- just something to force this xact to have an xid
CREATE TABLE
foodb=#

(leave the transaction open)

session 2, in bardb:

bardb=# CREATE TABLE foo AS SELECT 1;
SELECT
bardb=# vacuum foo; -- to set the PD_ALL_VISIBLE flag
VACUUM

session 3, in bardb:
bardb=# begin isolation level serializable;
BEGIN
bardb=# SELECT 1;
 ?column?
----------
        1
(1 row)

(leave transaction open)

session 2, in bardb:

bardb=# vacuum foo;
WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "foo" page 0 (OldestXmin 803)
VACUUM
bardb=#

When there are no other transactions active in the same database,
GetOldestXmin() returns just latestCompletedXid. When you open a
transaction in the same database after that, its xid will be above
latestCompletedXid, but its xmin includes transactions from all
databases, and there might be a transaction in some other database with
an xid that precedes the value that GetOldestXmin() returned earlier.
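
To make the mechanism concrete, here is a toy model of the scenario above
in Python. It is only a sketch: ProcArray, Backend, take_snapshot(),
get_oldest_xmin() and the xid values 800-802 are made up for
illustration, off-by-one details and xid wraparound are ignored, and none
of this is the actual PostgreSQL code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Backend:
    database: str
    xid: Optional[int] = None    # assigned transaction id, if any
    xmin: Optional[int] = None   # snapshot xmin, computed over ALL databases


@dataclass
class ProcArray:
    latest_completed_xid: int
    backends: List[Backend] = field(default_factory=list)

    def take_snapshot(self, backend: Backend) -> None:
        # A snapshot's xmin takes running xids in every database into account.
        xmin = self.latest_completed_xid
        for b in self.backends:
            if b.xid is not None:
                xmin = min(xmin, b.xid)
        backend.xmin = xmin

    def get_oldest_xmin(self, database: str) -> int:
        # Models the allDbs = false case: only backends connected to
        # `database` are examined, but their xmin may reflect other databases.
        result = self.latest_completed_xid
        for b in self.backends:
            if b.database != database:
                continue
            for x in (b.xid, b.xmin):
                if x is not None:
                    result = min(result, x)
        return result


def reproduce():
    procs = ProcArray(latest_completed_xid=800)

    # Session 1: open transaction in foodb that holds xid 801.
    procs.backends.append(Backend("foodb", xid=801))

    # Session 2: CREATE TABLE AS in bardb commits as xid 802, then VACUUM.
    procs.latest_completed_xid = 802
    first = procs.get_oldest_xmin("bardb")

    # Session 3: new transaction in bardb takes a snapshot and stays open.
    session3 = Backend("bardb")
    procs.backends.append(session3)
    procs.take_snapshot(session3)

    # Session 2: second VACUUM in bardb.
    second = procs.get_oldest_xmin("bardb")
    return first, second
```

In this model reproduce() returns (802, 801): the second VACUUM sees a
lower value than the first one did, because session 3's xmin was dragged
down by the open transaction in foodb.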

I'm not sure what to do about that. One idea is to track two xmin values
in the proc array: one that includes transactions in all databases, and
another that only includes transactions in the same database.
GetOldestXmin() (when allDbs is false) would only pay attention to the
latter. It would add a few instructions to GetSnapshotData(), though.
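
As a rough illustration of the two-xmin idea, the following Python sketch
keeps both values per backend. The field names xmin_all_dbs and
xmin_same_db, the function signatures and the xid values are all
hypothetical; it ignores locking, wraparound and everything else the real
proc array has to deal with.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    database: str
    xid: Optional[int] = None
    xmin_all_dbs: Optional[int] = None   # used for visibility checks
    xmin_same_db: Optional[int] = None   # hypothetical second value


def take_snapshot(backends: List[Backend], me: Backend,
                  latest_completed_xid: int) -> None:
    xmin_all = xmin_db = latest_completed_xid
    for b in backends:
        if b.xid is None:
            continue
        xmin_all = min(xmin_all, b.xid)
        if b.database == me.database:
            # The extra work the snapshot code would have to do.
            xmin_db = min(xmin_db, b.xid)
    me.xmin_all_dbs = xmin_all
    me.xmin_same_db = xmin_db


def get_oldest_xmin(backends: List[Backend], database: str,
                    latest_completed_xid: int) -> int:
    # allDbs = false: look only at same-database backends, and only at
    # their per-database xmin.
    result = latest_completed_xid
    for b in backends:
        if b.database != database:
            continue
        for x in (b.xid, b.xmin_same_db):
            if x is not None:
                result = min(result, x)
    return result


def replay():
    backends = [Backend("foodb", xid=801)]   # session 1, left open
    latest_completed_xid = 802               # session 2's CREATE TABLE AS
    first = get_oldest_xmin(backends, "bardb", latest_completed_xid)

    session3 = Backend("bardb")              # session 3 takes a snapshot
    backends.append(session3)
    take_snapshot(backends, session3, latest_completed_xid)

    second = get_oldest_xmin(backends, "bardb", latest_completed_xid)
    return first, second, session3.xmin_all_dbs
```

Here replay() returns (802, 802, 801): the value seen by VACUUM in bardb
no longer moves backwards, while session 3's snapshot still accounts for
the open transaction in foodb.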

Another idea is to give up on the warning when it appears that
OldestXmin has moved backwards, and assume that it's actually fine. We
could still warn in other cases where the flag appears to be incorrectly
set, like if there is a deleted tuple on the page.
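
A minimal sketch of what such a relaxed check might look like, in Python
rather than C. TupleState, check_flag() and the three verdict strings are
invented for this example, the plain integer comparison stands in for
proper xid comparison, and the real vacuum code distinguishes more tuple
states than this.

```python
from enum import Enum, auto
from typing import Iterable, Tuple


class TupleState(Enum):
    LIVE = auto()
    DEAD = auto()


def check_flag(page_all_visible: bool,
               tuples: Iterable[Tuple[TupleState, int]],
               oldest_xmin: int) -> str:
    """Return "ok", "tolerate" or "warn" for one heap page.

    `tuples` holds (state, xmin) pairs for the tuples on the page.
    """
    if not page_all_visible:
        return "ok"          # flag not set, nothing to cross-check
    verdict = "ok"
    for state, xmin in tuples:
        if state is TupleState.DEAD:
            # A deleted tuple on an all-visible page is a real problem.
            return "warn"
        if xmin >= oldest_xmin:
            # Live tuple that is not yet visible to everyone according to
            # the current OldestXmin: assume OldestXmin moved backwards
            # since the flag was set, and stay quiet.
            verdict = "tolerate"
    return verdict
```

With this sketch, check_flag(True, [(TupleState.LIVE, 802)], 801) gives
"tolerate", while a page containing a dead tuple still gives "warn".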

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
