Re: New strategies for freezing, advancing relfrozenxid early

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Date: 2023-01-23 11:17:09
Message-ID: CAFiTN-srf+PV0aj5oUhT1Ohxw2WbQuKhcu0d-=rA6odYEN5KmQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 18, 2023 at 1:47 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Jan 17, 2023 at 10:05 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

My final set of comments for 0002

1.
+struct vmsnapshot
+{
+ /* Target heap rel */
+ Relation rel;
+ /* Scanning strategy used by VACUUM operation */
+ vmstrategy strat;
+ /* Per-strategy final scanned_pages */
+ BlockNumber rel_pages;
+ BlockNumber scanned_pages_lazy;
+ BlockNumber scanned_pages_eager;

I do not understand much use of maintaining these two
'scanned_pages_lazy' and 'scanned_pages_eager' variables. I think
just maintaining 'scanned_pages' should be sufficient. I do not see
in patches also they are really used. lazy_scan_strategy() is using
these variables but this is getting values of these out parameters
from visibilitymap_snap_acquire(). And visibilitymap_snap_strategy()
is also using this, but it seems there we just need the final result
of 'scanned_pages' instead of these two variables.

2.

+#define MAX_PAGES_YOUNG_TABLEAGE 0.05 /* 5% of rel_pages */
+#define MAX_PAGES_OLD_TABLEAGE 0.70 /* 70% of rel_pages */

Why is the logic behind 5% and 70% are those based on some
experiments? Should those be tuning parameters so that with real
world use cases if we realise that it would be good if the eager scan
is getting selected more frequently or less frequently then we can
tune those parameters?

3.
+ /*
+ * VACUUM's DISABLE_PAGE_SKIPPING option overrides our decision by forcing
+ * VACUUM to scan every page (VACUUM effectively distrusts rel's VM)
+ */
+ if (force_scan_all)
+ vacrel->vmstrat = VMSNAP_SCAN_ALL;

I think this should be moved as first if case, I mean why to do all
the calculations based on the 'tableagefrac' and
'TABLEAGEFRAC_XXPOINT' if we are forced to scan them all. I agree the
extra computation we are doing might not really matter compared to the
vacuum work we are going to perform but still seems logical to me to
do the simple check first.

4. Should we move prefetching as a separate patch, instead of merging
with the scanning strategy?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2023-01-23 11:20:06 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message David Rowley 2023-01-23 11:08:15 Re: heapgettup refactoring