Re: Show various offset arrays for heap WAL records

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: Show various offset arrays for heap WAL records
Date: 2023-03-22 01:31:09
Message-ID: CAH2-WznFCs25LSYS9wtzY+zk+sh3QjXHF2upRouxLEVRVcwQdQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 21, 2023 at 3:37 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> One problem that I often run into when performing analysis of VACUUM
> using pg_walinspect is the issue of *who* pruned which heap page, for
> any given PRUNE record. Was it VACUUM/autovacuum, or was it
> opportunistic pruning? There is no way of knowing for sure right now.
> You *cannot* rely on an xid of 0 as an indicator of a given PRUNE
> record coming from VACUUM; it could just have been an opportunistic
> prune operation that happened to take place when a SELECT query ran,
> before any XID was ever allocated.

In case it's unclear how much of a problem this can be, here's an example:

The misc.sql regression test does a bulk update of the table "onek". A
little later, one of the queries that appears under the section "copy"
from the same file SELECTs from "onek". This produces a succession of
opportunistic prune records that look exactly like what you'd expect from
a VACUUM when viewed through pg_walinspect (without this patch). Each
PRUNE record has XID 0. The records appear in ascending heap block
number order, since there is a sequential scan involved (we go through
heapgetpage() to get to heap_page_prune_opt(), where the query prunes
opportunistically).

Another slightly surprising fact revealed by the patch is the ratio of
opportunistic prunes ("Heap2/PRUNE") to prunes run during VACUUM
("Heap2/PRUNE+BYVACUUM") with the regression tests:

│ resource_manager/record_type │ Heap2/PRUNE │
│ count │ 4,521 │
│ count_perc │ 0.220 │
│ rec_size │ 412,442 │
│ avg_rec_size │ 91 │
│ rec_size_perc │ 0.194 │
│ fpi_size │ 632,828 │
│ fpi_size_perc │ 1.379 │
│ combined_size │ 1,045,270 │
│ combined_size_perc │ 0.404 │
├─[ RECORD 61 ]────────────────┼─────────────────────────────┤
│ resource_manager/record_type │ Heap2/PRUNE+BYVACUUM │
│ count │ 2,784 │
│ count_perc │ 0.135 │
│ rec_size │ 467,057 │
│ avg_rec_size │ 167 │
│ rec_size_perc │ 0.219 │
│ fpi_size │ 546,344 │
│ fpi_size_perc │ 1.190 │
│ combined_size │ 1,013,401 │
│ combined_size_perc │ 0.391 │
├─[ RECORD 62 ]────────────────┼─────────────────────────────┤
│ resource_manager/record_type │ Heap2/VACUUM │
│ count │ 3,463 │
│ count_perc │ 0.168 │
│ rec_size │ 610,038 │
│ avg_rec_size │ 176 │
│ rec_size_perc │ 0.286 │
│ fpi_size │ 893,964 │
│ fpi_size_perc │ 1.948 │
│ combined_size │ 1,504,002 │
│ combined_size_perc │ 0.581 │
├─[ RECORD 63 ]────────────────┼─────────────────────────────┤
│ resource_manager/record_type │ Heap2/VISIBLE │
│ count │ 7,293 │
│ count_perc │ 0.354 │
│ rec_size │ 431,382 │
│ avg_rec_size │ 59 │
│ rec_size_perc │ 0.202 │
│ fpi_size │ 1,794,048 │
│ fpi_size_perc │ 3.909 │
│ combined_size │ 2,225,430 │
│ combined_size_perc │ 0.859 │

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shiy.fnst@fujitsu.com 2023-03-22 02:13:13 RE: Dropped and generated columns might cause wrong data on subs when REPLICA IDENTITY FULL
Previous Message Michael Paquier 2023-03-22 01:28:19 Re: Remove nonmeaningful prefixes in PgStat_* fields