Re: crash-safe visibility map, take three

From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: crash-safe visibility map, take three
Date: 2011-01-05 20:22:54
Message-ID: 4D24D31E.2040906@krogh.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2010-11-30 05:57, Robert Haas wrote:
> Last week, I posted a couple of possible designs for making the
> visibility map crash-safe, which did not elicit much comment. Since
> this is an important prerequisite to index-only scans, I'm trying
> again.

The logic seems to be:

* If the visibillity map should be crash-safe if should be WAL-logged.
* PD_ALL_VISIBLE is currently not being WAL-logged when vacuum sets it.
* WAL logging the visibillity map bit is not "that" bad (small size).
* WAL-logging the PD_ALL_VISIBLE bit would can WAL-record for the entire
relation to be written out (potentially huge).

Would the problem not be solved by not "trying to keep the two bits in
sync" but
simply removing the PD_ALL_VISIBLE bit in the page-header in favor
for the bit in the visibillity map, that is now WAL-logged and thus safe
to trust?

Then vacuum could emit WAL records for setting the visibillity map bits,
combined
with changes on the page could clear it?

The question probably boils down to:

Given a crash-safe visibility map, what purpuse does the PD_ALL_VISIBLE
bit serve?

I've probably just missed some logic?

Having index-only scans per-table ajustable would make quite some sense..

I have a couple of tables with a high turn-over rate that never get out
of the OS-cache
anyway, the benefit of index-only scans are quite small, especially if
they come with
additional overhead on INSERT/UPDATE/DELETE operations, whereas I also have
huge tables with a very small amount of changes. Just the saved IO of
not having to
go to the heap in some cases would be good.

I could see some benefits in having the index-only scan work on
tuple-level visibillity information
and not page-level, but that would require a bigger map
(allthough still less than 1% of the heap size if my calculations are
correct), but
would enable visibillity testing of a tuple without going to the heap
even other (unrelated)
changes happend on the same page.

Jesper

--
Jesper

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2011-01-05 20:58:24 Re: making an unlogged table logged
Previous Message Robert Haas 2011-01-05 20:08:13 Re: Re: [COMMITTERS] pgsql: Reduce spurious Hot Standby conflicts from never-visible records