Re: VACUUM (DISABLE_PAGE_SKIPPING on)

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: VACUUM (DISABLE_PAGE_SKIPPING on)
Date: 2020-11-20 14:47:33
Message-ID: CANP8+jJmjCxNLv-mNeMUibARx0WGMKN45Sh=7z6rOqx5Ca-GGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 20 Nov 2020 at 14:07, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2020-Nov-20, Masahiko Sawada wrote:
>
> > I'm concerned that always freezing all tuples when we're going to make
> > the page dirty would affect the existing vacuum workload much. The
> > additional cost of freezing multiple tuples would be low but if we
> > freeze tuples we would also need to write WAL, which is not negligible
> > overhead I guess. In the worst case, if a table has dead tuples on all
> > pages we process them, but with this patch, in addition to that, we
> > will end up not only freezing all live tuples but also writing
> > XLOG_HEAP2_FREEZE_PAGE WAL for all pages. So I think it would be
> > better either to freeze all tuples if we find a tuple that needs to be
> > frozen or to make this behavior work only if the new VACUUM option is
> > specified.
>
> There are two costs associated with this processing. One is dirtying
> the page (which means it needs to be written down when evicted), and the
> other is to write WAL records for each change. The cost for the latter
> is going to be the same in both cases (with this change and without)
> because the same WAL will have to be written -- the only difference is
> *when* do you pay it. The cost of the former is quite different; with
> Simon's patch we dirty the page once, and without the patch we may dirty
> it several times before it becomes "stable" and no more writes are done
> to it.
>
> (If you have tables whose pages change all the time, there would be no
> difference with or without the patch.)
>
> Dirtying the page less times means less full-page images to WAL, too,
> which can be significant.

Summary of patch effects:

one_freeze_then_max_freeze.v5.patch
doesn't increase/decrease the number of pages that are dirtied by
freezing, but when a page will be dirtied **by any activity** it
maximises the number of rows frozen, so in some cases it may write a
WAL freeze record when it would not have done previously.

one_freeze_then_max_freeze.v6.patch
doesn't increase/decrease the number of pages that are dirtied by
freezing, but when a page will be dirtied **by freezing only** it
maximises the number of rows frozen, so the number of WAL records for
freezing is the same, but they may contain more tuples than before and
will increase the probability that the page is marked all_frozen.

So yes, as you say, the net effect will be to reduce the number of
write I/Os in subsequent vacuums required to move forward
relfrozenxid.

--
Simon Riggs http://www.EnterpriseDB.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2020-11-20 14:50:33 Re: Online verification of checksums
Previous Message Laurenz Albe 2020-11-20 14:41:05 Re: Add session statistics to pg_stat_database