Re: Eager page freeze criteria clarification

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Eager page freeze criteria clarification
Date: 2023-11-09 02:23:01
Message-ID: CAAKRu_ZQ_UNaV=LBRVq=Lm44fKBYXWexw=sOyUEC-Lj_=QxW=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 11, 2023 at 8:43 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> A rough sketch of a freezing heuristic:
>
> - We concluded that to intelligently control opportunistic freezing we need
> statistics about the number of freezes and unfreezes
>
> - We should track page freezes / unfreezes in shared memory stats on a
> per-relation basis
>
> - To use such statistics to control heuristics, we need to turn them into
> rates. For that we need to keep snapshots of absolute values at certain
> times (when vacuuming), allowing us to compute a rate.
>
> - If we snapshot some stats, we need to limit the amount of data that occupies
>
> - evict based on wall clock time (we don't care about unfreezing pages
> frozen a month ago)
>
> - "thin out" data when exceeding limited amount of stats per relation
> using random sampling or such
>
> - need a smarter approach than just keeping N last vacuums, as there are
> situations where a table is (auto-) vacuumed at a high frequency
>
>
> - only looking at recent-ish table stats is fine, because we
> - a) don't want to look at too old data, as we need to deal with changing
> workloads
>
> - b) if there aren't recent vacuums, falsely freezing is of bounded cost
>
> - shared memory stats being lost on crash-restart/failover might be a problem
>
> - we certainly don't want to immediate store these stats in a table, due
> to the xid consumption that'd imply
>
>
> - Attributing "unfreezes" to specific vacuums would be powerful:
>
> - "Number of pages frozen during vacuum" and "Number of pages unfrozen that
> were frozen during the same vacuum" provides numerator / denominator for
> an "error rate"
>
> - We can perform this attribution by comparing the page LSN with recorded
> start/end LSNs of recent vacuums
>
> - If the freezing error rate of recent vacuums is low, freeze more
> aggressively. This is important to deal with insert mostly workloads.
>
> - If old data is "unfrozen", that's fine, we can ignore such unfreezes when
> controlling "freezing aggressiveness"
>
> - Ignoring unfreezing of old pages is important to e.g. deal with
> workloads that delete old data
>
> - This approach could provide "goals" for opportunistic freezing in a
> somewhat understandable way. E.g. aiming to rarely unfreeze data that has
> been frozen within 1h/1d/...

I have taken a stab at implementing the freeze stats tracking
infrastructure we would need to drive a heuristic based on error rate.

Attached is a series of patches which adds a ring buffer of vacuum
freeze statistics to each table's stats (PgStat_StatTabEntry).

It also introduces a guc: target_page_freeze_duration. This is the time
in seconds which we would like pages to stay frozen for. The aim is to
adjust opportunistic freeze behavior such that pages stay frozen for at
least target_page_freeze_duration seconds.

When a table is vacuumed the next entry is initialized in the ring
buffer (PgStat_StatTabEntry->frz_buckets[frz_current]). If all of the
buckets are in use, the two oldest buckets both ending either before or
after now - target_page_freeze_duration are combined.

When vacuum freezes a page, we increment the freeze counter in the
current PgStat_Frz entry in the ring buffer and update the counters used
to calculate the max, min, and average page age.

When a frozen page is modified, we increment "unfreezes" in the bucket
spanning the page freeze LSN. We also update the counters used to
calculate the min, max, and average frozen duration. Because we have
both the LSNs and time elapsed during the vacuum which froze the page,
we can derive the approximate time at which the page was frozen (using
linear interpolation) and use that to speculate how long it remained
frozen. If we are unfreezing it sooner than target_page_freeze_duration,
this is counted as an "early unfreeze". Using early unfreezes and page
freezes, we can calculate an error rate.

The guc, opp_freeze_algo, remains to allow us to test different freeze
heuristics during development. I included a dummy algorithm (algo 2) to
demonstrate what we could do with the data. If the error rate is above a
hard-coded threshold and the page is older than the average page age, it
is frozen. This is missing an "aggressiveness" knob.

There are two workloads I think we can focus on. The first is a variant
of our former workload I. The second workload, J, is the pgbench
built-in simple-update workload.

I2. Work queue
WL 1: 32 clients inserting a single row, updating an indexed
column in another row twice, then deleting the last updated row
pgbench scale 100
WL 2: 2 clients, running the TPC-B like built-in workload
rate-limited to 10,000 TPS

J. simple-update
pgbench scale 450
WL 1: 16 clients running built-in simple-update workload

The simple-update workload works well because it only updates
pgbench_accounts and inserts into pgbench_history. This gives us an
insert-only table and a table with uniform data modification. The former
should be frozen aggressively, the latter should be frozen as little as
possible.

The convenience function pg_stat_get_table_vacuums(tablename) returns
all of the collected vacuum freeze statistics for all of the vacuums of
the specified table (the contents of the ring buffer in
PgStat_StatTabEntry).

To start with, I ran workloads I2 and J with the criteria from master
and with the criteria that we freeze any page that is all frozen and all
visible. Having run the benchmarks for 40 minutes with the
target_page_freeze_duration set to 300 seconds, the following was my
error and efficacy (I am calling efficacy the % of pages frozen at the
end of the benchmark run).

I2 (work queue)
+-------+---------+---------+-----------+--------+----------+----------+
| table | algo | freezes |early unfrz| error | #frzn end| %frzn end|
+-------+---------+---------+-----------+--------+----------+----------+
| queue | av + af | 274,459 | 272,794 | 99% | 906 | 75% |
| queue | master | 274,666 | 272,798 | 99% | 908 | 56% |
+-------+---------+---------+-----------+--------+----------+----------+

J (simple-update)

+-----------+-------+---------+-----------+-------+----------+---------+
| pgbench | algo | freezes |early unfrz| error |#frzn end |%frzn end|
| tab | | | | | | |
+-----------+-------+---------+-----------+-------+----------+---------+
| accounts |av + af| 258,482 | 258,482 | 100% | 0 | 0% |
| history |av + af| 287,362 | 357 | 0% | 287,005 | 86% |
| accounts | master| 0 | 0 | | 0 | 0% |
| history | master| 0 | 0 | | 0 | 0% |
+-----------+-------+---------+-----------+-------+----------+---------+

The next step is to devise different heuristics and measure their
efficacy. IMO, the goal of the algorithm it is to freeze pages in a
relation such that we drive early unfreezes/freezes -> 0 and pages
frozen/number of pages of a certain age -> 1.

We have already agreed to consider early unfreezes/freezes to be the
"error". I have been thinking of pages frozen/number of pages of a
certain age as efficacy.

An alternative measure of efficacy is the average page freeze duration.
I keep track of this as well as the min and max page freeze durations.

I think we also will want some measure of magnitude in order to
determine how much more or less aggressively we should freeze. To this
end, I track the total number of pages scanned by a vacuum, the total
number of pages in the relation at the end of the vacuum, and the total
number of frozen pages at the beginning and end of the vacuum.

Besides error, efficacy, and magnitude, we need a way to compare a
freeze candidate page to the relation-level metrics. We had settled on
using the page's age -- how long it has been since the page was last
modified. I track the average, min, and max page ages at the time of
freezing. The difference between the min and max gives us some idea how
reliable the average page age may be.

For example, if we have high error, low variation (max - min page age),
and high magnitude (high # pages frozen), freezing was uniformly
ineffective.

Right now, I don't have an "aggressiveness" knob. Vacuum has the average
error rate over the past vacuums, the average page age calculated over
the past vacuums, and the target freeze duration (from the user set
guc). We need some kind of aggressiveness knob, but I'm not sure if it
is a page age threshold or something else.

- Melanie

Attachment Content-Type Size
v1-0002-Rename-frozen_pages-to-pages_frozen-for-clarity.patch text/x-patch 2.7 KB
v1-0001-Make-LVRelState-accessible-to-stats.patch text/x-patch 12.0 KB
v1-0004-visibilitymap_set-clear-return-previous-vm-bits.patch text/x-patch 6.5 KB
v1-0005-Add-guc-target_page_freeze_duration.patch text/x-patch 2.9 KB
v1-0003-Add-pg_visibility_map_summary_extended.patch text/x-patch 7.2 KB
v1-0008-Display-freeze-statistics.patch text/x-patch 11.1 KB
v1-0007-Track-vacuum-freeze-and-page-unfreeze-stats.patch text/x-patch 29.7 KB
v1-0009-Make-opportunistic-freezing-heuristic-configurabl.patch text/x-patch 9.6 KB
v1-0006-Add-vacuum-freeze-statistics-structures.patch text/x-patch 12.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Wienhold 2023-11-09 02:40:24 Re: Fix output of zero privileges in psql
Previous Message Peter Smith 2023-11-09 02:14:05 Re: pg_upgrade and logical replication