Re: pg_basebackup, walreceiver and wal_sender_timeout

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Nick B <nbedxp(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup, walreceiver and wal_sender_timeout
Date: 2019-01-28 08:05:26
Message-ID: CABUevEw3_ufUbzRSo-zvJNLWvMqeD27VO4y4wJtH4dy6nkM4Kg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jan 27, 2019 at 1:59 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> On Sat, Jan 26, 2019 at 01:45:46PM +0100, Magnus Hagander wrote:
> > One workaround you could perhaps look at here is to run pg_basebackup
> > with --no-sync. That way there will be no fsyncs issued while running.
> > You will then of course have to take care of syncing all the files to
> > disk after it's done, but a network filesystem might be happier in
> > dealing with a large "batch-sync" like that rather than piece-by-piece
> > sync.
>
> Hm. Aren't we actually wrong in letting the WAL receive method use
> the value of do_sync depending on the command line arguments, with
> true being the default for pg_basebackup? In plain format, we flush
> the full data directory anyway when the backup ends. In tar format,
> each individual tar file is flushed one-by-one after being received,
> and we issue a final sync on the parent directory at the end. So
> what's missing is just to make sure that the fully generated
> pg_wal.tar is synced once completed. This would be way cheaper than
> letting the stream process issue syncs for each segments, which does
> not matter much in the event of a host crash because the base backup
> may finish in an inconsistent state, and one should not use it.
>

Yeah, that could be done without giving up any of the guarantees -- we only
give the guarantee at the end of the completed backup. I wouldn't
necessarily say we're wrong now, but it could definitely be a nice
performance improvement.

And for plain format, we'd do the same -- sync after each file segment, and
then a final one of the directory when done, right?
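
For anyone trying the --no-sync workaround in the meantime, the "batch-sync" afterwards could look roughly like the sketch below. This is only an illustration, not what pg_basebackup does internally: the helper names (fsync_path, sync_tree) are made up for this example, it assumes a POSIX system where a directory can be opened read-only and fsync'ed, and it skips symlinks, so tablespaces reached through pg_tblspc would have to be synced separately.

```python
import os


def fsync_path(path):
    """Flush one file or directory to stable storage (POSIX assumption)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def sync_tree(root):
    """fsync every regular file under root, then each directory, bottom-up.

    Returns the number of paths that were synced. Symlinks are skipped
    (hypothetical simplification for this sketch).
    """
    synced = 0
    # topdown=False visits the deepest directories first, so a directory
    # is only synced after everything inside it has been flushed.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                fsync_path(full)
                synced += 1
        fsync_path(dirpath)
        synced += 1
    return synced
```

You would run pg_basebackup with --no-sync and then call sync_tree() on the target directory once it has finished. Syncing the parent of the target directory as well makes the new directory entry itself durable. Until that final pass completes, the backup should not be treated as safe against a host crash.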

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
