Re: Freeze avoidance of very large table.

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Freeze avoidance of very large table.
Date: 2015-07-13 15:05:00
Message-ID: CANP8+jLD0MbgwHwpXOvAuWEuEiLKW6UzkbhKiD0cN3gCjMq7iA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 13 July 2015 at 15:48, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com> wrote:

> On Mon, Jul 13, 2015 at 9:22 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2015-07-13 21:03:07 +0900, Sawada Masahiko wrote:
> >> Even If we implement rewriting tool for vm into pg_upgrade, it will
> >> take time as much as revacuum because it need whole scanning table.
> >
> > Why would it? Sure, you can only set allvisible and not the frozen bit,
> > but that's fine. That way the cost for freezing can be paid over time.
> >
> > If we require terrabytes of data to be scanned, including possibly
> > rewriting large portions due to freezing, before index only scans work
> > and most vacuums act in a partial manner the migration to 9.6 will be a
> > major pain for our users.
>
> Ah, If we set all bit as not all-frozen, we don't need to whole table
> scanning, only scan vm.
> And I agree with this.
>
> But please image the case where old cluster has table which is very
> large, read-only and vacuum freeze is done.
> In this case, the all-frozen bit of such table in new cluster will not
> set, unless we do vacuum freeze again.
> The information of all-frozen of such table is lacked.
>

The contents of the VM fork is essential to retain after an upgrade because
it is used for Index Only Scans. If we destroy that information it could
send SQL response times to unacceptable levels after upgrade.

It takes time to scan the VM and create the new VFM, but the time taken is
proportional to the size of VM, which seems like it will be acceptable.

Example calcs:
An 8TB PostgreSQL installation would need us to scan 128MB of VM into about
256MB of VFM. Probably the fsyncs will occupy the most time.
In comparison, we would need to scan all 8TB to rebuild the VMs, which will
take much longer (and fsyncs will still be needed).

Since we don't record freeze map information now it is acceptable to begin
after upgrade with all freeze info set to zero.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-07-13 15:05:13 Re: Freeze avoidance of very large table.
Previous Message Andrew Dunstan 2015-07-13 14:59:41 Re: [PATCH] Generalized JSON output functions