Re: Freeze avoidance of very large table.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Greg S <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Freeze avoidance of very large table.
Date: 2016-03-08 13:30:33
Message-ID: CA+TgmoZuHkFnVXnA_k8FcEef8+2hc3Ob3GK=L1TZuu3=6SDegQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 8, 2016 at 7:26 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> Regarding the pg_visibility module, I'd like to share some bugs and
> propose adding a relation type condition to each function.

OK, thanks.

> Including it, I've attached the remaining 2 patches; one removes the page
> conversion code from pg_upgrade, and the other adds pg_upgrade support
> for the frozen bit.

Committed 001 with minor tweaks.

I find rewrite_vm_table to be pretty opaque. There's not even a
comment explaining what it is supposed to do. And I wonder why we
really need to be this efficient about it anyway. Like, would it be
too expensive to just do this:

    for (i = 0; i < BITS_PER_BYTE; ++i)
        if ((old & (1 << i)) != 0)
            new |= 1 << (2 * i);
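
To make the intent concrete, the loop above can be wrapped in a small
standalone function. This is only a sketch: the name expand_vm_byte and the
fixed-width types are my assumptions, not anything from the patch. Each old
bit i lands in the low bit of the new two-bit pair at position 2 * i
(all-visible), and the high bit of each pair (all-frozen) stays clear.

```c
#include <stdint.h>

#define BITS_PER_BYTE 8

/*
 * Hypothetical helper, not part of the patch: expand one byte of an
 * old-format visibility map (one bit per heap page) into 16 bits of the
 * new format (two bits per heap page).
 */
static uint16_t
expand_vm_byte(uint8_t old)
{
    uint16_t    new_bits = 0;
    int         i;

    for (i = 0; i < BITS_PER_BYTE; ++i)
    {
        /* Old bit i becomes the low bit of the pair starting at bit 2 * i. */
        if ((old & (1 << i)) != 0)
            new_bits |= (uint16_t) (1 << (2 * i));
    }

    /* The high bit of every pair is left clear. */
    return new_bits;
}
```

For example, an old byte of 0x05 (bits 0 and 2 set) comes out as 0x0011
(bits 0 and 4 set), and 0xFF comes out as 0x5555.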

And how about adding some more comments explaining why we are doing
this rewriting, like this:

In versions of PostgreSQL prior to catversion 201602181, PostgreSQL's
visibility map included one bit per heap page; it now includes two.
When upgrading a cluster from before that time to a current PostgreSQL
version, we could refuse to copy visibility maps from the old cluster
to the new cluster; the next VACUUM would recreate them, but at the
price of scanning the entire table. So, instead, we rewrite the old
visibility maps in the new format. That way, the all-visible bit
remains set for the pages for which it was set previously. The
all-frozen bit is never set by this conversion; we leave that to
VACUUM.

Also, I'm slightly perplexed by the fact that I can't see how this
code succeeds in turning each page into two pages, which is something
that it seems like it would need to do. Wouldn't we need to write out
the old page header twice, one for the first of the two new pages and
again for the second? I probably need more caffeine here, so please
tell me what I'm missing.
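
To illustrate what I would expect to see, here is a rough sketch of
splitting one old page into two new ones. It is not the patch's code: the
function name, the tiny header and page sizes, and the flat byte layout are
all made up for illustration, and a real version would work with BLCKSZ and
the actual page header. The point is only that the header is copied into
both output pages and each half of the old bitmap fills one new page.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes for illustration only; not PostgreSQL's values. */
#define SKETCH_HDR_SIZE   4
#define SKETCH_PAGE_SIZE  12
#define SKETCH_MAP_BYTES  (SKETCH_PAGE_SIZE - SKETCH_HDR_SIZE)

/*
 * Hypothetical helper: rewrite one old-format page (one bit per heap page)
 * as two new-format pages (two bits per heap page).  new_pages must have
 * room for 2 * SKETCH_PAGE_SIZE bytes.
 */
static void
split_vm_page(const uint8_t *old_page, uint8_t *new_pages)
{
    int         half, j, i;

    for (half = 0; half < 2; half++)
    {
        uint8_t    *dst = new_pages + half * SKETCH_PAGE_SIZE;
        const uint8_t *src = old_page + SKETCH_HDR_SIZE +
            half * (SKETCH_MAP_BYTES / 2);

        /* The old page header is written out once per new page. */
        memcpy(dst, old_page, SKETCH_HDR_SIZE);

        /* Each old bitmap byte expands to two new bitmap bytes. */
        for (j = 0; j < SKETCH_MAP_BYTES / 2; j++)
        {
            uint16_t    wide = 0;

            for (i = 0; i < 8; i++)
                if ((src[j] & (1 << i)) != 0)
                    wide |= (uint16_t) (1 << (2 * i));

            /* Low byte first, so heap page order is preserved. */
            dst[SKETCH_HDR_SIZE + 2 * j] = (uint8_t) (wide & 0xFF);
            dst[SKETCH_HDR_SIZE + 2 * j + 1] = (uint8_t) (wide >> 8);
        }
    }
}
```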

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
