visibility map

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: visibility map
Date: 2010-06-14 03:08:13
Message-ID: AANLkTilxav5NzXHdZ_K8M8gi0ARXt4jOsumLcIImXZRv@mail.gmail.com
Lists: pgsql-hackers

visibilitymap.c begins with a long and useful comment, but the part
quoted below seems to have a bit of split personality disorder.

* Currently, the visibility map is not 100% correct all the time.
* During updates, the bit in the visibility map is cleared after releasing
* the lock on the heap page. During the window between releasing the lock
* and clearing the bit in the visibility map, the bit in the visibility map
* is set, but the new insertion or deletion is not yet visible to other
* backends.
*
* That might actually be OK for the index scans, though. The newly inserted
* tuple wouldn't have an index pointer yet, so all tuples reachable from an
* index would still be visible to all other backends, and deletions wouldn't
* be visible to other backends yet. (But HOT breaks that argument, no?)

I believe that the answer to the parenthesized question here is "yes"
(in which case we might want to just delete this paragraph).

* There's another hole in the way the PD_ALL_VISIBLE flag is set. When
* vacuum observes that all tuples are visible to all, it sets the flag on
* the heap page, and also sets the bit in the visibility map. If we then
* crash, and only the visibility map page was flushed to disk, we'll have
* a bit set in the visibility map, but the corresponding flag on the heap
* page is not set. If the heap page is then updated, the updater won't
* know to clear the bit in the visibility map. (Isn't that prevented by
* the LSN interlock?)

I *think* that the answer to this parenthesized question is "no".
When we vacuum a page, we set the LSN on both the heap page and the
visibility map page. Therefore, neither of them can get written to
disk until the WAL record is flushed, but they could get flushed in
either order. So the visibility map page could get flushed before the
heap page, as the non-parenthesized portion of the comment indicates.
However, at least in theory, it seems like we could fix this up during
redo.
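
To make the ordering argument concrete, here is a toy model of the
scenario. It is only a sketch and not PostgreSQL code: every name in it
(ToyPage, toy_write_page, toy_redo, and so on) is hypothetical, and it
assumes, as described above, that vacuum stamps both pages with the LSN
of the same WAL record. The redo step is one guess at what "fix this up
during redo" could mean, namely re-applying the change to whichever page
is behind the record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef struct { uint64_t lsn; bool flag; } ToyImage;
typedef struct { ToyImage mem; ToyImage disk; } ToyPage;

typedef struct {
    uint64_t wal_flushed_upto; /* WAL is durable up to this LSN */
    ToyPage  heap;             /* flag models PD_ALL_VISIBLE */
    ToyPage  vm;               /* flag models the visibility map bit */
} ToyCluster;

/* Vacuum sets both flags in memory and stamps both pages with one LSN. */
static void toy_vacuum_set_all_visible(ToyCluster *c, uint64_t record_lsn)
{
    c->heap.mem.flag = true; c->heap.mem.lsn = record_lsn;
    c->vm.mem.flag   = true; c->vm.mem.lsn   = record_lsn;
}

static void toy_flush_wal(ToyCluster *c, uint64_t upto)
{
    if (upto > c->wal_flushed_upto)
        c->wal_flushed_upto = upto;
}

/* LSN interlock: refuse to write a page whose LSN is not yet durable. */
static bool toy_write_page(const ToyCluster *c, ToyPage *p)
{
    if (p->mem.lsn > c->wal_flushed_upto)
        return false;
    p->disk = p->mem;
    return true;
}

/* A crash throws away everything that only existed in memory. */
static void toy_crash(ToyCluster *c)
{
    c->heap.mem = c->heap.disk;
    c->vm.mem   = c->vm.disk;
}

/* Assumed redo behaviour: re-apply the record to any page behind it. */
static void toy_redo(ToyCluster *c, uint64_t record_lsn)
{
    if (record_lsn > c->wal_flushed_upto)
        return; /* the record never reached disk, nothing to replay */
    if (c->heap.mem.lsn < record_lsn) {
        c->heap.mem.flag = true; c->heap.mem.lsn = record_lsn;
    }
    if (c->vm.mem.lsn < record_lsn) {
        c->vm.mem.flag = true; c->vm.mem.lsn = record_lsn;
    }
}

/* The hole from the comment: map bit set, heap page flag not set. */
static bool toy_vm_bit_is_stale(const ToyCluster *c)
{
    return c->vm.mem.flag && !c->heap.mem.flag;
}

/* Only the visibility map page reaches disk before the crash. */
static bool toy_scenario_vm_first(bool run_redo)
{
    ToyCluster c = {0};

    toy_vacuum_set_all_visible(&c, 100);
    if (toy_write_page(&c, &c.vm))
        return false;             /* interlock must refuse: WAL not flushed */
    toy_flush_wal(&c, 100);
    toy_write_page(&c, &c.vm);    /* allowed now; heap page is NOT written */
    toy_crash(&c);
    if (run_redo)
        toy_redo(&c, 100);
    return toy_vm_bit_is_stale(&c);
}
```

In this model the interlock only orders each page write after the WAL
flush; it says nothing about the order of the two page writes relative
to each other. So toy_scenario_vm_first(false) ends with the map bit set
and the heap flag clear, while toy_scenario_vm_first(true) ends with the
two in agreement.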

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
