Re: "page is not marked all-visible" warning in regression tests

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: "page is not marked all-visible" warning in regression tests
Date: 2012-06-06 17:46:11
Message-ID: 201206061946.11827.andres@2ndquadrant.com
Lists: pgsql-hackers

On Tuesday, June 05, 2012 04:18:44 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On Tuesday, June 05, 2012 03:32:08 PM Tom Lane wrote:
> >> I got this last night in a perfectly standard build of HEAD:
> >> + WARNING: page is not marked all-visible but visibility map bit is set
> >> in relation "pg_db_role_setting" page 0 --
> >
> > I have seen that twice just yesterday. Couldn't reproduce it so far.
> > Workload was (pretty exactly):
> >
> > initdb
> > postgres -c fsync=off
> > pgbench -i -s 100
> > CREATE TABLE data(id serial primary key, data int);
> > ALTER SEQUENCE data_id_seq INCREMENT 2;
> > VACUUM FREEZE;
> > normal shutdown
> > postgres -c fsync=on
> > pgbench -c 20 -j 20 -T 100
> > WARNING: ... pg_depend ...
> > WARNING: ... can't remember ...
>
> Hmm ... from memory, what I did was
>
> configure/build/install from a fresh pull
> initdb
> start postmaster, fsync off
> make installcheck
> stop postmaster
> apply Hanada-san's json patch, replace postgres executable
> start postmaster, fsync off
> make installcheck
>
> and it was the second of these runs that failed. Could we be missing
> flushing some blocks out to disk at shutdown? Maybe fsync off is a
> contributing factor?

On a cursory look, it might just be a race condition in
vacuumlazy.c:lazy_scan_heap. If scan_all is set, which it has to be for the
warning to be visible, all_visible_according_to_vm is determined before we
loop over all blocks. By the time one specific heap block is actually read
and locked, that knowledge might have been completely invalidated by a
concurrent backend. Am I missing something?
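
To make the suspected race concrete, here is a minimal single-threaded sketch.
It is a toy model, not the vacuumlazy.c code: ToyRelation, toy_scan, and the
"between" hook are hypothetical names, and the hook merely stands in for
whatever a concurrent backend might do between the unlocked visibility map
read and the page lock. The recheck_under_lock flag shows one possible remedy
under these assumptions, not necessarily the right fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 4

/* Toy model: one all-visible flag per heap page, one map bit per page. */
typedef struct
{
    bool page_all_visible[NBLOCKS]; /* stands in for the page-level flag */
    bool vm_bit[NBLOCKS];           /* stands in for the visibility map bit */
} ToyRelation;

/* Hypothetical hook: runs after the unlocked map read, before the "lock". */
typedef void (*ConcurrentStep) (ToyRelation *rel, size_t blkno);

/* Start with every page all-visible and every map bit set. */
void
toy_init(ToyRelation *rel)
{
    for (size_t blkno = 0; blkno < NBLOCKS; blkno++)
    {
        rel->page_all_visible[blkno] = true;
        rel->vm_bit[blkno] = true;
    }
}

/* A concurrent backend modifying the page clears both flag and map bit. */
void
clear_all_visible(ToyRelation *rel, size_t blkno)
{
    assert(blkno < NBLOCKS);
    rel->page_all_visible[blkno] = false;
    rel->vm_bit[blkno] = false;
}

/* Returns how many pages would trigger the "not marked all-visible" warning. */
int
toy_scan(ToyRelation *rel, ConcurrentStep between, bool recheck_under_lock)
{
    int warnings = 0;

    for (size_t blkno = 0; blkno < NBLOCKS; blkno++)
    {
        /* Unlocked read of the map bit; may be stale by the time we use it. */
        bool all_visible_according_to_vm = rel->vm_bit[blkno];

        if (between != NULL)
            between(rel, blkno);    /* another backend gets to run here */

        /* The page would be read and locked at this point. */
        if (recheck_under_lock)
            all_visible_according_to_vm = rel->vm_bit[blkno];

        if (all_visible_according_to_vm && !rel->page_all_visible[blkno])
            warnings++;             /* map bit "set", page flag clear */
    }
    return warnings;
}
```

With clear_all_visible as the hook and no recheck, every page in the toy model
produces the warning even though flag and map bit are consistent with each
other at all times; only the cached copy of the bit is out of date.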

I have to say that visibility map correctness and crash safety seem to be
quite underdocumented, especially as the topic seems to be somewhat intricate
(to me). For example, there is no note explaining why visibilitymap_test
doesn't need locking. (I guess the theory is that a 1-byte read will always be
consistent. But how does that ensure other backends see an up-to-date value?)
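
The distinction behind that question can be sketched with C11 atomics. This is
a generic illustration, not how visibilitymap_test is implemented: ToyVisMap
and the toy_vm_* functions are hypothetical. A 1-byte load cannot be torn, so
atomicity comes for free; what the explicit release/acquire ordering adds is a
guarantee that a reader who observes the new bit also observes the writes made
before it was set. Neither property stops the value from going stale right
after it has been read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAP_BYTES 8             /* covers 64 heap blocks, one bit each */

typedef struct
{
    _Atomic uint8_t bytes[TOY_MAP_BYTES];
} ToyVisMap;

/* Set the bit for blkno; release ordering publishes earlier writes. */
void
toy_vm_set(ToyVisMap *map, unsigned blkno)
{
    assert(blkno < TOY_MAP_BYTES * 8);
    atomic_fetch_or_explicit(&map->bytes[blkno / 8],
                             (uint8_t) (1u << (blkno % 8)),
                             memory_order_release);
}

/* Clear the bit for blkno, again with release ordering. */
void
toy_vm_clear(ToyVisMap *map, unsigned blkno)
{
    assert(blkno < TOY_MAP_BYTES * 8);
    atomic_fetch_and_explicit(&map->bytes[blkno / 8],
                              (uint8_t) ~(1u << (blkno % 8)),
                              memory_order_release);
}

/* Test the bit without a lock; acquire pairs with the release above. */
bool
toy_vm_test(ToyVisMap *map, unsigned blkno)
{
    assert(blkno < TOY_MAP_BYTES * 8);
    uint8_t byte = atomic_load_explicit(&map->bytes[blkno / 8],
                                        memory_order_acquire);

    return ((byte >> (blkno % 8)) & 1u) != 0;
}
```

Even in this form, the result of toy_vm_test is only a snapshot: a caller that
acts on it later still needs some other mechanism, such as rechecking under
the page lock, to know the bit has not changed in the meantime.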

Andres

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
