Re: Add recovery to pg_control and remove backup_label

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add recovery to pg_control and remove backup_label
Date: 2023-11-21 17:59:18
Message-ID: 20231121175918.3s7k6iijg2zm6m4v@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-11-21 13:41:15 -0400, David Steele wrote:
> On 11/20/23 16:41, Andres Freund wrote:
> >
> > On 2023-11-20 15:56:19 -0400, David Steele wrote:
> > > I understand this is an option -- but does it need to be? What is the
> > > benefit of excluding the manifest?
> >
> > It's not free to create the manifest, particularly if checksums are enabled.
>
> It's virtually free, even with the basic CRCs.

Huh?

perf stat src/bin/pg_basebackup/pg_basebackup -h /tmp/ -p 5440 -D - -cfast -Xnone --format=tar > /dev/null

4,423.81 msec task-clock # 0.626 CPUs utilized
433,475 context-switches # 97.987 K/sec
5 cpu-migrations # 1.130 /sec
599 page-faults # 135.404 /sec
12,208,261,153 cycles # 2.760 GHz
6,805,401,520 instructions # 0.56 insn per cycle
1,273,896,027 branches # 287.964 M/sec
14,233,126 branch-misses # 1.12% of all branches

7.068946385 seconds time elapsed

1.106072000 seconds user
3.403793000 seconds sys

perf stat src/bin/pg_basebackup/pg_basebackup -h /tmp/ -p 5440 -D - -cfast -Xnone --format=tar --manifest-checksums=CRC32C > /dev/null

4,324.64 msec task-clock # 0.640 CPUs utilized
433,306 context-switches # 100.195 K/sec
3 cpu-migrations # 0.694 /sec
598 page-faults # 138.277 /sec
11,952,475,908 cycles # 2.764 GHz
6,816,888,845 instructions # 0.57 insn per cycle
1,275,949,455 branches # 295.042 M/sec
13,721,376 branch-misses # 1.08% of all branches

6.760321433 seconds time elapsed

1.113256000 seconds user
3.302907000 seconds sys

perf stat src/bin/pg_basebackup/pg_basebackup -h /tmp/ -p 5440 -D - -cfast -Xnone --format=tar --no-manifest > /dev/null

3,925.38 msec task-clock # 0.823 CPUs utilized
257,467 context-switches # 65.590 K/sec
4 cpu-migrations # 1.019 /sec
552 page-faults # 140.624 /sec
11,577,054,842 cycles # 2.949 GHz
5,933,731,797 instructions # 0.51 insn per cycle
1,108,784,719 branches # 282.466 M/sec
11,867,511 branch-misses # 1.07% of all branches

4.770347012 seconds time elapsed

1.002521000 seconds user
2.991769000 seconds sys

I'd not call 7.06->4.77 or 6.76->4.77 "virtually free".

And this is actually *under*selling the cost - we waste a lot of cycles due to
bad buffering decisions. Once we fix that, the cost differential increases
further.
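
To put the elapsed times above in relative terms, here is a minimal sketch in Python. It is not part of the original measurement: the dictionary labels and the relative_overhead helper are names made up for illustration, and only the three "seconds time elapsed" figures come from the perf stat output. It looks at wall-clock time only, not at the task-clock or context-switch counters.

```python
# Elapsed wall-clock seconds copied from the three perf stat runs above.
# The dictionary keys are illustrative labels, not pg_basebackup output.
ELAPSED = {
    "default manifest": 7.068946385,
    "--manifest-checksums=CRC32C": 6.760321433,
    "--no-manifest": 4.770347012,
}


def relative_overhead(with_manifest: float, without_manifest: float) -> float:
    """Return the extra elapsed time as a fraction of the no-manifest run."""
    return (with_manifest - without_manifest) / without_manifest


baseline = ELAPSED["--no-manifest"]
for label, seconds in ELAPSED.items():
    if label == "--no-manifest":
        continue
    overhead = relative_overhead(seconds, baseline)
    print(f"{label}: {overhead:.1%} slower than --no-manifest")
```

On these single runs, both manifest-enabled invocations come out well over 40% slower in elapsed time than the --no-manifest run; one run per configuration is of course not a rigorous benchmark.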

> Anyway, would you really want a backup without a manifest? How would you
> know something is missing? In particular, for page incremental how do you
> know something is new (but not WAL logged) if there is no manifest? Is the
> plan to just recopy anything not WAL logged with each incremental?

Shrug. If you just want to create a new standby by copying the primary, I
don't think creating and then validating the manifest buys you much. Long term
backups are a different story, particularly if data files are stored
individually, rather than in a single checksummed file.

> > Also, for external backups, there's no manifest...
>
> There certainly is a manifest for many external backup solutions. Not having
> a manifest is just running with scissors, backup-wise.

You mean that you have an external solution gin up a backup manifest? I fail
to see how that's relevant here?

Greetings,

Andres Freund
