Re: Visibility map thoughts

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Visibility map thoughts
Date: 2007-11-05 19:46:11
Message-ID: 1194291971.22428.123.camel@dogma.ljc.laika.com
Lists: pgsql-hackers

On Mon, 2007-11-05 at 09:52 +0000, Heikki Linnakangas wrote:
> Reducing VACUUM time is important, but the real big promise is the
> ability to do index-only-scans. Because that's the main focus of this
> exercise, I'm calling it the Visibility Map from now on, because
> it's not about tracking dead space, but tuple visibility in general.
> Don't worry, reduced VACUUM times on read-mostly tables with hot spots
> will still fall out of it.

I like "Visibility map" because it's a positive name, and that prevents
confusion over double-negatives (for the same reason it's called
"synchronous_commit" even though the new feature is the ability to be
asynchronous). With Dead Space Map, I wouldn't immediately know whether
a 1 means "always visible" or "might be invisible".

However, "DSM" is a much less overloaded acronym than "VM" ;)

Regarding the focus, it will depend a lot on the user. Some people will
care more about VACUUM and some will care more about index-only scans.

> It's not useful for VACUUM FREEZE, unless we're willing to freeze much
> more aggressively, and change the meaning of a set bit to "all tuples on
> heap page are frozen".
>

This means that a regular VACUUM will no longer be enough to ensure
safety from transaction id wraparound.
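
To illustrate the concern, here is a toy sketch in Python (not PostgreSQL code). It assumes a VACUUM that skips every page whose visibility map bit is set; the Page class, the vacuum() function and the freeze_limit parameter are all made up for this example. A page that is all-visible but not frozen keeps its old xmin values, so they are never frozen:

```python
from dataclasses import dataclass

# Marker for a frozen tuple. PostgreSQL reserves xid 2 for this, but the
# exact value does not matter for the sketch.
FROZEN_XID = 2


@dataclass
class Page:
    xmins: list                 # inserting transaction id of each tuple
    all_visible: bool = False   # hypothetical visibility map bit for the page


def vacuum(pages, freeze_limit, use_map):
    """Freeze every tuple whose xmin is older than freeze_limit.

    With use_map=True, pages whose bit is set are skipped entirely,
    which is the behaviour assumed in this sketch.
    """
    for page in pages:
        if use_map and page.all_visible:
            continue
        page.xmins = [FROZEN_XID if x < freeze_limit else x for x in page.xmins]


def oldest_unfrozen(pages):
    """Return the oldest xmin that is still unfrozen, or None."""
    xs = [x for p in pages for x in p.xmins if x != FROZEN_XID]
    return min(xs) if xs else None


pages = [Page([100, 101], all_visible=True), Page([5000, 5001])]
vacuum(pages, freeze_limit=1000, use_map=True)
print(oldest_unfrozen(pages))   # xid 100 survives on the skipped page
```

Running the same vacuum() with use_map=False freezes the first page as well, which is roughly what a forced full scan would have to do.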

I don't think this will be hard to fix, but it's an extra detail that
would need to be decided. The most apparent options appear to be:

1) Do as you say above. What are some of the cost trade-offs here? It
seems that frequent VACUUM FREEZE runs would keep the visibility map
mostly full, but will also cause more writing. I suppose the worst case
is that every tuple write results in two data page writes, one
normal write and another to freeze it later, which sounds bad. Maybe
there's a way to try to freeze the tuples on a page before it's written
out?

The ideas of "frozen" and "always visible" seem very close in concept to
me.

2) Change autovacuum to FREEZE on the forced autovacuum to prevent
wraparound.

3) Use multiple bits per heap page in the visibility map

4) Have multiple types of visibility maps
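
As a rough illustration of option 3, here is a minimal Python sketch of a map that keeps two bits per heap page, one for "all tuples visible" and one for "all tuples frozen". The class name, the flag names and the bit layout are assumptions made for this example, not a proposed on-disk format:

```python
ALL_VISIBLE = 0b01      # hypothetical flag: every tuple on the page is visible
ALL_FROZEN = 0b10       # hypothetical flag: every tuple on the page is frozen
BITS_PER_PAGE = 2
PAGES_PER_BYTE = 8 // BITS_PER_PAGE


class VisibilityMap:
    """Toy bitmap holding BITS_PER_PAGE status bits for each heap page."""

    def __init__(self, n_pages):
        n_bytes = (n_pages + PAGES_PER_BYTE - 1) // PAGES_PER_BYTE
        self.bits = bytearray(n_bytes)

    def _locate(self, page_no):
        # Return (byte index, bit offset within that byte) for a heap page.
        return page_no // PAGES_PER_BYTE, (page_no % PAGES_PER_BYTE) * BITS_PER_PAGE

    def set(self, page_no, flags):
        byte, shift = self._locate(page_no)
        self.bits[byte] |= flags << shift

    def clear(self, page_no):
        # Any modification of the page would clear both bits.
        byte, shift = self._locate(page_no)
        self.bits[byte] &= ~(0b11 << shift) & 0xFF

    def test(self, page_no, flags):
        byte, shift = self._locate(page_no)
        return ((self.bits[byte] >> shift) & flags) == flags


vm = VisibilityMap(10)
vm.set(5, ALL_VISIBLE)            # usable for index-only scans and plain VACUUM
print(vm.test(5, ALL_VISIBLE))    # True
print(vm.test(5, ALL_FROZEN))     # False: a wraparound scan must still visit page 5
```

With a layout like this, a regular VACUUM could skip pages with ALL_VISIBLE set, while an anti-wraparound scan would skip only pages with ALL_FROZEN set. The cost is a map twice the size of the one-bit version.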

The more I think about the visibility map, the more I think it will be a
huge win for PostgreSQL. It's especially nice that it works so well with
HOT. Thanks for working on it!

Regards,
Jeff Davis
