Re: Seeking guidance on extremely slow pg_restore despite strong I/O performance

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: MentionTheElephant <MentionTheElephant(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Seeking guidance on extremely slow pg_restore despite strong I/O performance
Date: 2025-12-08 00:53:40
Message-ID: CAMkU=1yak3bAWyTkhdtko5FCpT9i=Rzv8u3djafKT=eNpT1KPg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Dec 5, 2025 at 5:30 AM MentionTheElephant <
MentionTheElephant(at)gmail(dot)com> wrote:

> The central question is: What should I examine further to understand
> why checkpoint processing becomes the dominant bottleneck during
> restore, despite fsync=off, synchronous_commit=off, and excellent

random write latency?
>

I don't see any evidence that checkpoint is the dominant bottleneck.
Checkpoints are not (usually) meant to be fast. They pace themselves to
finish in the allotted time, while causing the least amount of drama while
they are underway. The whole system shouldn't freeze up while waiting for
a checkpoint to run.

> Checkpoints appear to

> stall the entire restore process.
>

What makes it appear that way? Is it just that they take a long time to
finish, or is there some other evidence you haven't described?

> Example log excerpts show checkpoint write times consistently in the
> range of 600–1100 seconds, with large numbers of buffers written (from
> hundreds of thousands to over 1.6 million).

You told it to shoot for checkpoint_timeout * checkpoint_completion_target
= 3600 * 0.9 = 3240 seconds. If anything, it is weird that it takes so
little time, compared to what you authorized it to take. As for the number
of buffers, it writes as many as it needs to write to get the job done.
Are the checkpoints driven by time, or by WAL volume?

The time reported for the write phase includes not only time spent doing
the writes, but also time spent sleeping so as to pace the checkpoint to
finish on schedule. Perhaps those times should be reported separately.

I think that you have misinterpreted the evidence you have on hand, and are
now looking for an explanation for something which isn't true in the first
place.

What is the percent IO wait (called "wa") reported by `top` or `vmstat`
while the restore is running?

If you run with --verbose, you can see each step as it starts them. Which
steps take the most time?

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Pavel Stehule 2025-12-08 08:15:50 Re: proposal: schema variables
Previous Message Jim Jones 2025-12-06 11:29:48 Re: proposal: schema variables