Re: refactoring basebackup.c

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: refactoring basebackup.c
Date: 2020-05-08 21:27:45
Message-ID: 20200508212745.dbyxzteeok3pbkop@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-05-08 16:53:09 -0400, Robert Haas wrote:
> They represent closely-related concepts, so much so that I initially
> thought we could get by with just one new abstraction layer. I found
> on experimentation that this did not work well, so I split it up into
> two and that worked a lot better. The distinction is this: a bbsink is
> something to which you can send a bunch of archives -- currently, each
> would be a tarfile -- and also a backup manifest. A bbarchiver is
> something to which you send every file in the data directory
> individually, or at least the ones that are getting backed up, plus
> any that are being injected into the backup (e.g. the backup_label).
> Commonly, a bbsink will do something with the data and then forward it
> to a subsequent bbsink, or a bbarchiver will do something with the
> data and then forward it to a subsequent bbarchiver or bbsink. For
> example, there's a bbarchiver_tar object which, like any bbarchiver,
> sees all the files and their contents as input. The output is a
> tarfile, which gets send to a bbsink. As things stand in the patch set
> now, the tar archives are ultimately sent to the "libpq" bbsink, which
> sends them to the client.

Hm.

I wonder if there's cases where recursively forwarding like this will
cause noticable performance effects. The only operation that seems
frequent enough to potentially be noticable would be "chunks" of the
file. So perhaps it'd be good to make sure we read in large enough
chunks?

> 0010 invents two new bbarchivers, a tar bbarchiver and a tarsize
> bbarchiver, and refactors basebackup.c to make use of them. The tar
> bbarchiver puts the files it sees into tar archives and forwards the
> resulting archives to a bbsink. The tarsize bbarchiver is used to
> support the PROGRESS option to the BASE_BACKUP command. It just
> estimates the size of the backup by summing up the file sizes without
> reading them. This approach is good for a couple of reasons. First,
> without something like this, it's impossible to keep basebackup.c from
> knowing something about the tar format, because the PROGRESS option
> doesn't just figure out how big the files to be backed up are: it
> figures out how big it thinks the archives will be, and that involves
> tar-specific considerations.

ISTM that it's not actually good to have the progress calculations
include the tar overhead. As you say:

> This area needs more work, as the whole idea of measuring progress by
> estimating the archive size is going to break down as soon as
> server-side compression is in the picture.

This, to me, indicates that we should measure the progress solely based
on how much of the "source" data was processed. The overhead of tar, the
reduction due to compression, shouldn't be included.

> What do you all think?

I've not though enough about the specifics, but I think it looks like
it's going roughly in a better direction.

One thing I wonder about is how stateful the interface is. Archivers
will pretty much always track which file is currently open etc. Somehow
such a repeating state machine seems a bit ugly - but I don't really
have a better answer.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-05-08 22:02:29 Re: [PATCH] Fix division by zero (explain.c)
Previous Message Heikki Linnakangas 2020-05-08 21:21:18 pendingOps table is not cleared with fsync=off