Re: Backup solution over unreliable network

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Backup solution over unreliable network
Date: 2018-11-30 12:06:13
Message-ID: 20181130120613.GW3415@tamriel.snowman.net
Lists: pgsql-admin

Greetings,

* Achilleas Mantzios (achill(at)matrix(dot)gatewaynet(dot)com) wrote:
> we've been running our backup solution for the last 5 months to a second
> site which has an unreliable network connection. We had problems with
> barman, since it doesn't support backup resume, also no option to disable
> the replication slot, in the sense, that it is better to sacrifice the
> backup rather than fill up the primary with WALs and bring the primary down.
> Another issue is not supporting entirely backing up from the secondary. With
> barman this is not possible, streaming (or archiving) must originate from
> the primary. So I want to ask two things here:
> - Backing up to a remote site over an unreliable channel is a limited use
> case by itself, it is useful for local PITR restores on specific
> tables/data, or in case the whole primary suffers a disaster. Is there any
> other benefit that would justify building a solution for it?

Please don't build your own solution, it's really quite difficult to get
backups done correctly.

> - I have only read the best reviews about PgBackRest, can PgBackRest address those issues?

Glad to hear you've read good reviews about pgbackrest. As for
addressing these issues, pgbackrest has:

- Backup resume
- Max WAL lag (in other words, you can have it simply start throwing WAL
  away if it can't archive it, rather than allowing the primary to run
  out of disk space)
- Backup using the replica, primarily (note that this, currently,
  requires access to the primary, but the bulk of the data comes from
  the replica)
- Incremental/differential backup
- Parallel backup/resume and parallel archiving/fetching
- Backup verification: we checksum every file backed up and verify those
  checksums on a resume, and we make sure that every WAL file needed to
  restore the backup has made it into the archive.
- Delta restore

Which I believe covers most of the use-cases you brought up.
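
To make the "verify those checksums on a resume" item a bit more concrete,
here is a minimal Python sketch of the general idea. It is not pgbackrest's
implementation: the function names, the in-memory manifest, and the choice
of SHA-1 are all assumptions made for illustration, and the sketch ignores
files that change on the source between attempts. A file left over from an
interrupted run is reused only if it still matches the checksum recorded
for it; anything missing or damaged is copied again.

```python
import hashlib
import shutil
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the hex SHA-1 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resume_backup(source: Path, dest: Path, manifest: dict[str, str]) -> list[str]:
    """Copy files from source to dest, reusing verified files from a prior run.

    manifest (hypothetical structure) maps a relative path to the checksum
    recorded when that file was last copied. It is updated in place.
    Returns the relative paths that had to be copied or re-copied.
    """
    copied = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source).as_posix()
        target = dest / rel
        recorded = manifest.get(rel)
        # Skip the copy only when the earlier file exists and still
        # matches the checksum recorded for it.
        if recorded is not None and target.is_file() and sha1_of(target) == recorded:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, target)
        manifest[rel] = sha1_of(target)
        copied.append(rel)
    return copied
```

Calling resume_backup() again after an interruption, with the same
manifest, transfers only the files that are missing or fail verification,
which is what matters over an unreliable link.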

When we first implemented backup using the replica we had concerns
regarding doing a 'full' replica-based backup, and we didn't really see
there being a lot of demand for such a use-case (the replica has access
to the primary in general if it's a streaming replica, after all...),
but we might be open to revisiting that.

Thanks!

Stephen
