Re: pg_upgrade parallelism

From: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
To: Jacob Champion <pchampion(at)vmware(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade parallelism
Date: 2022-01-12 04:51:07
Message-ID: Yd5eOwB6uBQdA11T@ahch-to
Lists: pgsql-hackers

On Wed, Nov 17, 2021 at 08:04:41PM +0000, Jacob Champion wrote:
> On Wed, 2021-11-17 at 14:44 -0500, Jaime Casanova wrote:
> > I'm trying to add more parallelism by copying individual segments
> > of a relfilenode in different processes. Does anyone one see a big
> > problem in trying to do that? I'm asking because no one did it before,
> > that could not be a good sign.
>
> I looked into speeding this up a while back, too. For the use case I
> was looking at -- Greenplum, which has huge numbers of relfilenodes --
> spinning disk I/O was absolutely the bottleneck and that is typically
> not easily parallelizable. (In fact I felt at the time that Andres'
> work on async I/O might be a better way forward, at least for some
> filesystems.)
>
> But you mentioned that you were seeing disks that weren't saturated, so
> maybe some CPU optimization is still valuable? I am a little skeptical
> that more parallelism is the way to do that, but numbers trump my
> skepticism.
>

Sorry for being unresponsive for so long. I did add a new --jobs-per-disk
option. This is a simple patch I made for the customer, and it ignores all
the WIN32 parts because I don't know anything about that side. I wanted
to complete that part, but it has been in the same state for two months now.

AFAIU, the WIN32 code uses a different struct for the parameters of the
function that is called on the thread.

I also decided to create a new reap_*_child() function for use with
the new option.
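
To make the idea concrete, here is a minimal sketch of the pattern, not
the attached patch. It is written in Python for brevity (pg_upgrade itself
is C), and the names copy_segment(), reap_one_child() and
copy_segments_parallel() are hypothetical. It assumes a POSIX system and
that each segment of a relfilenode (16384, 16384.1, ...) is copied by its
own child process, with at most `jobs` children running at once:

```python
import os

CHUNK = 1024 * 1024  # copy buffer size; an arbitrary choice for this sketch


def copy_segment(src, dst):
    """Copy one segment file with a plain read()/write() loop."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(CHUNK)
            if not buf:
                break
            fout.write(buf)


def reap_one_child(running):
    """Wait for any child to exit; return True if it exited with status 0."""
    pid, status = os.wait()
    running.discard(pid)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


def copy_segments_parallel(pairs, jobs):
    """Copy (src, dst) pairs using at most `jobs` child processes at once."""
    running = set()
    ok = True
    for src, dst in pairs:
        # All slots busy: wait for one child to finish before forking again.
        if len(running) >= jobs:
            ok = reap_one_child(running) and ok
        pid = os.fork()
        if pid == 0:
            # Child: copy exactly one segment, then leave immediately so it
            # never falls back into the parent's loop.
            status = 1
            try:
                copy_segment(src, dst)
                status = 0
            finally:
                os._exit(status)
        running.add(pid)
    # Collect the children that are still running.
    while running:
        ok = reap_one_child(running) and ok
    return ok
```

Calling copy_segments_parallel(pairs, jobs=4) with a list of
(old path, new path) tuples returns False if any child failed. How many
jobs a given disk can take before the extra processes stop helping is
something only measurements can tell.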

Now, the customer went from copying 25TB in 6 hours to 4h 45min, which is
an improvement of about 20%!
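
For reference, the arithmetic behind that figure can be checked with a
few lines. This is only a sketch: it assumes decimal terabytes
(1 TB = 10^12 bytes) and that the full 25 TB was copied in both runs, and
percent_time_saved() is a helper made up for the example.

```python
def percent_time_saved(before_min, after_min):
    """Percentage reduction in wall-clock time."""
    return 100.0 * (before_min - after_min) / before_min


before_min = 6 * 60        # 6 hours
after_min = 4 * 60 + 45    # 4 hours 45 minutes

saved = percent_time_saved(before_min, after_min)
speedup = before_min / after_min

# Average throughput in MB/s, assuming 25 TB = 25e12 bytes (an assumption).
total_bytes = 25e12
mb_per_s_before = total_bytes / (before_min * 60) / 1e6
mb_per_s_after = total_bytes / (after_min * 60) / 1e6

print(f"time saved: {saved:.1f}%  speedup: {speedup:.2f}x")
print(f"throughput: {mb_per_s_before:.0f} MB/s -> {mb_per_s_after:.0f} MB/s")
```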

> > - why we read()/write() at all? is not a faster way of copying the file?
> > i'm asking that because i don't actually know.
>
> I have idly wondered if something based on splice() would be faster,
> but I haven't actually tried it.
>

I tried it and got no better results.

> But there is now support for copy-on-write with the clone mode, isn't
> there? Or are you not able to take advantage of it?
>

That's sadly not possible because those are different disks. And yes, I
know that's something pg_upgrade normally doesn't allow, but it is not
difficult to make it happen.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

Attachment Content-Type Size
0001-Add-jobs-per-disk-option-to-allow-multiple-processes.patch text/x-diff 8.9 KB
