Re: Visibility map, partial vacuums

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Visibility map, partial vacuums
Date: 2008-11-23 19:05:12
Message-ID: 26361.1227467112@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> I committed the changes to FSM truncation yesterday, that helps with the
> truncation of the visibility map as well. Attached is an updated
> visibility map patch.

I looked over this patch a bit ...

> 1. The bits in the visibility map are set in the 1st phase of lazy
> vacuum. That works, but it means that after a delete or update, it takes
> two vacuums until the bit in the visibility map is set. The first vacuum
> removes the dead tuple, and only the second sees that there's no dead
> tuples and sets the bit.

I think this is probably not a big issue really. The point of this change
is to optimize things for pages that are static over the long term; one
extra vacuum cycle before the page is deemed static doesn't seem like a
problem. You could even argue that this saves I/O because we don't set
the bit (and perhaps later have to clear it) until we know that the page
has stayed static across a vacuum cycle and thus has a reasonable
probability of continuing to do so.

A possible problem is that if a relation is filled all in one shot,
autovacuum would trigger a single vacuum cycle on it and then never have
a reason to trigger another; leading to the bits never getting set (or
at least not till an antiwraparound vacuum occurs). We might want to
tweak autovac so that an extra vacuum cycle occurs in this case. But
I'm not quite sure what a reasonable heuristic would be.

Some other points:

* ISTM that the patch is designed on the plan that the PD_ALL_VISIBLE
page header flag *must* be correct, but it's really okay if the backing
map bit *isn't* correct --- in particular we don't trust the map bit
when performing antiwraparound vacuums. This isn't well documented.

* Also, I see that vacuum has a provision for clearing an incorrectly
set PD_ALL_VISIBLE flag, but shouldn't it fix the map too?

* It would be good if the visibility map fork were never created until
there is occasion to set a bit in it; this would for instance typically
mean that temp tables would never have one. I think that
visibilitymap.c doesn't get this quite right --- in particular
vm_readbuf seems willing to create/extend the fork whether its extend
argument is true or not, so it looks like an inquiry operation would
cause the map fork to be created. It should be possible to act as
though a nonexistent fork just means "all zeroes".

* heap_insert's all_visible_cleared variable doesn't seem to get
initialized --- didn't your compiler complain?

* You missed updating SizeOfHeapDelete and SizeOfHeapUpdate

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Scott Marlowe 2008-11-23 19:08:30 Re: [Q]updating multiple rows with Different values
Previous Message Tom Lane 2008-11-23 16:56:54 Re: portability of "designated initializers"