From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: longfin and tamandua aren't too happy but I'm not sure why
Date: 2022-09-28 20:20:37
Message-ID: CAH2-WznN50FWap8Q0zhUq9aMTTaQA6M8FXcBjy+sxO824qvQjQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 28, 2022 at 12:20 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> What do you think would constitute a test here?

I would start with something simple. Focus on the record types that we
know are the most common. WAL volume is heavily skewed towards heap and
nbtree record types, plus a few transaction rmgr record types.
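
Something like the following (an untested sketch -- it assumes the
new-in-15 pg_walinspect extension, and that the first word of its
description column is always the record type name, as in the docs'
examples) would show that skew directly for any given workload:

    CREATE EXTENSION IF NOT EXISTS pg_walinspect;
    SELECT pg_current_wal_lsn() AS start_lsn \gset
    -- ... run the workload we want to characterize ...
    SELECT resource_manager,
           split_part(description, ' ', 1) AS record_type,
           count(*) AS n,
           sum(record_length) AS total_bytes
    FROM pg_get_wal_records_info(:'start_lsn', pg_current_wal_lsn())
    GROUP BY 1, 2
    ORDER BY total_bytes DESC;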

> Say: insert N records to a heapam table with one index of each kind
> (under controlled conditions: no checkpoint, no autovacuum, no FPIs),
> then measure the total number of bytes used by WAL records of each rmgr.
> Have a baseline and see how that changes over time.
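
A minimal version of that could look like this sketch (hypothetical
table name; assumes pg_walinspect, and that checkpoints, autovacuum,
and FPIs really are kept out of the LSN range):

    CREATE TABLE wal_test (k int PRIMARY KEY, v text); -- one btree index
    SELECT pg_current_wal_lsn() AS start_lsn \gset
    INSERT INTO wal_test SELECT i, 'filler'
    FROM generate_series(1, 10000) i;
    SELECT resource_manager,
           count(*) AS n,
           sum(record_length) AS total_bytes
    FROM pg_get_wal_records_info(:'start_lsn', pg_current_wal_lsn())
    GROUP BY 1
    ORDER BY total_bytes DESC;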

There are multiple flavors of alignment involved here, which makes it
tricky. For example, in index AMs the lp_len field from each line
pointer is always MAXALIGN()'d, whereas with heap tuples lp_len is
aligned only as required by the underlying tuple attributes. There are
also alignment considerations for the record itself -- buffer data can
usually be stored with no alignment at all, but WAL header alignment
can still have an impact, especially for record types that tend to be
relatively small, like nbtree index tuple inserts on leaf pages.
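
If I have the xlogreader behavior right, pg_walinspect reports the raw
xl_tot_len as record_length, while end_lsn already points past any
alignment padding (plus any page header the record happens to cross),
so the padding is at least observable -- e.g. for leaf page inserts:

    SELECT record_length,                     -- raw xl_tot_len
           (end_lsn - start_lsn) AS consumed, -- includes padding
           count(*) AS n
    FROM pg_get_wal_records_info(:'start_lsn', pg_current_wal_lsn())
    WHERE resource_manager = 'Btree'
      AND description LIKE 'INSERT_LEAF%'
    GROUP BY 1, 2
    ORDER BY 1;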

I think that the most interesting variation is among boundary cases
for those records that affect a variable number of page items, since
those record types can be impacted by alignment in subtle though
important ways. Things like PRUNE records often don't have very many
items. So having coverage of the overhead of every variation of a
small PRUNE record could be an effective way of catching regressions
that would otherwise be very hard to notice.
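
Concretely, bucketing PRUNE records by their item counts -- parsing
pg_walinspect's description text, which in 15 reads 'PRUNE
latestRemovedXid ... nredirected ... ndead ...' -- should make a size
jump in any one bucket stand out:

    SELECT (regexp_match(description, 'nredirected (\d+)'))[1]::int
             AS nredirected,
           (regexp_match(description, 'ndead (\d+)'))[1]::int AS ndead,
           record_length,
           count(*) AS n
    FROM pg_get_wal_records_info(:'start_lsn', pg_current_wal_lsn())
    WHERE resource_manager = 'Heap2'
      AND description LIKE 'PRUNE%'
    GROUP BY 1, 2, 3
    ORDER BY 1, 2, 3;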

Continuing with that example, we could probably cover every possible
permutation of PRUNE records that affect 5 or so items. Let's say that
we have a regression where PRUNE records with exactly 3 items that
must all be set LP_DEAD increase in size by one MAXALIGN() quantum.
That will probably make a huge difference in many workloads, but it's
difficult to spot after the fact when it only affects records whose
item count falls in some narrow but critical range -- it might not
affect PRUNE records with (say) 5 items at all. So if we're looking at
the macro picture with (say) pgbench and pg_waldump, we'll tend to
miss the regression; it'll be obscured by the fact that it only
affects a minority of all PRUNE records.

This is just a made-up example, so the specifics might be off
significantly -- I'd have to work through it to be sure. Hopefully it
still gets the general idea across.

--
Peter Geoghegan
