forcing a rebuild of the visibility map

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: forcing a rebuild of the visibility map
Date: 2016-06-16 03:06:37
Message-ID: CA+TgmoZvY0dTaDvQG-vNU2OEaKdXzY-GP5GS0gTM9v8-=S0FrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

[ Changing subject line in the hopes of keeping separate issues in
separate threads. ]

On Mon, Jun 6, 2016 at 11:00 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I'm intuitively sympathetic to the idea that we should have an option
>> for this, but I can't figure out in what case we'd actually tell
>> anyone to use it. It would be useful for the kinds of bugs listed
>> above to have VACUUM (rebuild_vm) to blow away the VM fork and rebuild
>> it, but that's different semantics than what we proposed for VACUUM
>> (even_frozen_pages). And I'd be sort of inclined to handle that case
>> by providing some other way to remove VM forks (like a new function in
>> the pg_visibilitymap contrib module, maybe?) and then just tell people
>> to run regular VACUUM afterwards, rather than putting the actual VM
>> fork removal into VACUUM.
>
> There's a lot to be said for that approach. If we do it, I'd be a bit
> inclined to offer an option to blow away the FSM as well.

After having been scared out of my mind by the discovery of
longstanding breakage in heap_update[1], it occurred to me that this
is an excellent example of a case in which the option for which Andres
was agitating - specifically forcing VACUUM to visit ever single page
in the heap - would be useful. Even if there's functionality
available to blow away the visibility map forks, you might not want to
just go remove them all and re-vacuum everything, because while you
were rebuilding the VM forks, index-only scans would stop being
index-only, and that might cause an operational problem. Being able
to simply force every page to be visited is better. Patch for that
attached. I decided to call the option disable_page_skipping rather
than even_frozen_pages because it should also force visiting pages
that are all-visible but not all-frozen. (I was sorely tempted to go
with the competing proposal of calling it VACUUM (THEWHOLEDAMNTHING)
but I refrained.)

However, I also think that the approach described above - providing a
way to excise VM forks specifically - is useful. Patch for that
attached, too. It turns out that can't be done without either adding
a new WAL record type or extending an existing one; I chose to add a
"flags" argument to XLOG_SMGR_TRUNCATE, specifying the forks to be
truncated. Since this will require bumping the XLOG version, if we're
going to do it, and I think we should, it would be good to try to get
it done before beta2 rather than closer to release. However, I don't
want to commit this without some review, so please review this.
Actually, please review both patches.[2] The same core support could
be used to add an option to truncate the FSM, but I didn't write a
patch for that because I'm incredibly { busy | lazy }.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] https://www.postgresql.org/message-id/CA+TgmoZ-1TXCsxWcp3i-KMo5DNB-6p-odqw7c-N_q3pzZY1TJg@mail.gmail.com
[2] This request is directed at the list generally, not at any
specific individual ... well, OK, I mean you. Yeah, you!

Attachment Content-Type Size
disable-page-skipping-v1.patch text/x-diff 10.3 KB
truncate-vm-v1.patch text/x-diff 10.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-06-16 03:08:59 Re: PgQ and pg_dump
Previous Message Michael Paquier 2016-06-16 03:06:32 Re: Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it