Re: Proposal to add page headers to SLRU pages

From: "Li, Yong" <yoli(at)ebay(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Bagga, Rishu" <bagrishu(at)amazon(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Debnath, Shawn" <sdn(at)ebay(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, "Shyrabokau, Anton" <antons(at)ebay(dot)com>
Subject: Re: Proposal to add page headers to SLRU pages
Date: 2024-03-08 07:58:09
Message-ID: 85EAAC01-6B03-4777-8538-ED5A74312A82@ebay.com


> On Mar 7, 2024, at 03:09, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> Greetings,
>
> * Alvaro Herrera (alvherre(at)alvh(dot)no-ip(dot)org) wrote:
>> I suppose this is important to do if we ever want to move SLRUs into
>> shared buffers. However, I wonder about the extra time this adds to
>> pg_upgrade. Is this something we should be concerned about? Is there
>> any measurement/estimates to tell us how long this would be? Right now,
>> if you use a cloning strategy for the data files, the upgrade should be
>> pretty quick ... but the amount of data in pg_xact and pg_multixact
>> could be massive, and the rewrite is likely to take considerable time.
>
> While I definitely agree that there should be some consideration of
> this concern, it feels on-par with the visibility-map rewrite which was
> done previously. Larger systems will likely have more to deal with than
> smaller systems, but it's still a relatively small portion of the data
> overall.
>
> The benefit of this change, beyond just the possibility of moving them
> into shared buffers some day in the future, is that this would mean that
> SLRUs will have checksums (if the cluster has them enabled). That
> benefit strikes me as well worth the cost of the rewrite taking some
> time and the minor loss of space due to the page header.
>
> Would it be useful to consider parallelizing this work? There's already
> parts of pg_upgrade which can be parallelized and so this isn't,
> hopefully, a big lift to add, but I'm not sure if there's enough work
> being done here CPU-wise, compared to the amount of IO being done, to
> have it make sense to run it in parallel. Might be worth looking into
> though, at least, as disks have gotten to be quite fast.
>
> Thanks!
>
> Stephen
>

Thanks, Alvaro and Stephen, for your thoughtful comments.

I did a quick benchmark regarding pg_upgrade time, and here are the results.

Hardware spec:
MacBook Pro M1 Max - 10 cores, 64GB memory, 1TB Apple SSD

Operating system:
macOS 14.3.1

Compiler:
Apple clang 15.0.0

Compiler optimization level: -O2

====
PG setups:
Old cluster: PG 16.2 release (source build)
New cluster: PG Git HEAD plus the patch (source build)

====
Benchmark steps:

1. Initdb for PG 16.2.
2. Initdb for PG HEAD.
3. Run pg_upgrade on the empty cluster above, and measure the overall wall-clock time.
4. In the old cluster, write 512MB all-zero dummy segment files (2048 segments) under pg_xact.
5. In the old cluster, write 512MB all-zero dummy segment files under pg_multixact/members.
6. In the old cluster, write 512MB all-zero dummy segment files under pg_multixact/offsets.
7. Purge the OS page cache.
8. Run pg_upgrade again, and measure the overall wall-clock time.
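
For anyone who wants to repeat the dummy-segment steps, here is a minimal sketch of how the all-zero files could be generated. It is not the script used for this benchmark. The 256KB segment size (32 pages of 8KB) and the four-hex-digit file names are assumptions based on the default PG 16 SLRU layout, and write_zero_segments is a hypothetical helper name.

```python
import os
import tempfile

# Assumption: default build, 32 pages per SLRU segment x 8KB block size.
SEGMENT_BYTES = 32 * 8192


def write_zero_segments(directory, count, segment_bytes=SEGMENT_BYTES):
    """Write `count` all-zero segment files into `directory`.

    Files are named 0000, 0001, ... in uppercase hex, which is assumed
    to match the PG 16 SLRU segment naming. Returns total bytes written.
    """
    os.makedirs(directory, exist_ok=True)
    zeros = bytes(segment_bytes)
    for segno in range(count):
        path = os.path.join(directory, f"{segno:04X}")
        with open(path, "wb") as f:
            f.write(zeros)
    return count * segment_bytes


if __name__ == "__main__":
    # Demo against a scratch directory rather than a real data directory.
    with tempfile.TemporaryDirectory() as scratch:
        total = write_zero_segments(os.path.join(scratch, "pg_xact"), 8)
        print(f"wrote {total} bytes")
```

With count=2048 and the default segment size this yields 512MB per directory, which matches the sizes in steps 4 through 6. Only point it at a throwaway cluster, with the server stopped.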

====
Test result:

On the empty database, pg_upgrade took 4.8 seconds to complete.

With 1.5GB combined SLRU data to convert, pg_upgrade took 11.5 seconds to complete.

So converting the 1.5GB of SLRU files added 6.7 seconds to pg_upgrade.
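
The arithmetic behind that figure, as a small sketch. The effective rate is a derived number, not a separate measurement, and it lumps together reading, rewriting, and any other per-file overhead. conversion_overhead is a hypothetical helper name.

```python
EMPTY_UPGRADE_S = 4.8    # pg_upgrade on the empty cluster
LOADED_UPGRADE_S = 11.5  # pg_upgrade with the dummy SLRU files
SLRU_MB = 3 * 512        # pg_xact + multixact members + multixact offsets


def conversion_overhead(empty_s, loaded_s, data_mb):
    """Return (extra seconds, effective MB/s) attributable to the SLRU data."""
    extra_s = loaded_s - empty_s
    return extra_s, data_mb / extra_s


extra_s, mb_per_s = conversion_overhead(EMPTY_UPGRADE_S, LOADED_UPGRADE_S, SLRU_MB)
print(f"extra time: {extra_s:.1f} s, effective rate: {mb_per_s:.0f} MB/s")
```

That works out to roughly 229 MB/s on this machine.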

====

For clog, 2048 segments can host about 2 billion transactions, right at the limit for wraparound.
That’s the maximum we can have. 2048 segments are also big for pg_multixact SLRUs.
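
A quick check of the clog capacity claim, assuming the default 8KB block size and the pre-patch page layout of the old cluster (no page header, 2 status bits per transaction, 32 pages per segment). The function names are hypothetical; the constants mirror the ones in the PostgreSQL source.

```python
BLCKSZ = 8192                # default block size in bytes
CLOG_BITS_PER_XACT = 2       # commit status bits per transaction
SLRU_PAGES_PER_SEGMENT = 32  # pages per SLRU segment file


def clog_capacity(segments):
    """Number of transactions whose status fits in `segments` clog segments."""
    xacts_per_page = BLCKSZ * 8 // CLOG_BITS_PER_XACT
    return segments * SLRU_PAGES_PER_SEGMENT * xacts_per_page


def clog_bytes(segments):
    """On-disk size in bytes of `segments` clog segments."""
    return segments * SLRU_PAGES_PER_SEGMENT * BLCKSZ


print(clog_capacity(2048))        # 2**31 transactions
print(clog_bytes(2048) // 2**20)  # size in MB
```

So 2048 segments come to 512MB and cover 2^31 (about 2.1 billion) transactions.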

Therefore, on modern hardware, pg_upgrade will run about 7 seconds longer in the worst case.

Regards,

Yong
