Re: trying again to get incremental backup

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: trying again to get incremental backup
Date: 2023-06-19 13:46:12
Message-ID: CA+TgmoaTZPKzcTW0pf30n+_uQtrEKRWB8qoOgkmnVh+QK2Wn_A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 14, 2023 at 4:40 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > But I'm not sure that's a great approach, because that LSN gap might be
> > large and then we're duplicating a lot of work that the summarizer has
> > probably already done most of.
>
> I guess that really depends on what the summary granularity is. If you create
> a separate summary every 32MB or so, recomputing just the required range
> shouldn't be too bad.

Yeah, but I don't think that's the right approach, for two reasons.
First, one of the things I'm rather worried about is what happens when
the WAL distance between the prior backup and the incremental backup
is large. It could be a terabyte. If we have a WAL summary for every
32MB of WAL, that's 32k files (1TB / 32MB) we have to read, and I'm
concerned that's too many. Maybe it isn't, but it's something that has really
been weighing on my mind as I've been thinking through the design
questions here. The files are really very small, and having to open a
bazillion tiny little files to get the job done sounds lame. Second, I
don't see what problem it actually solves. Why not just signal the
summarizer to write out the accumulated data to a file instead of
re-doing the work ourselves? Or else adopt the
WAL-record-at-the-redo-pointer approach, and then the whole thing is
moot?
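
To make the file-count concern concrete, here is a rough back-of-the-envelope
sketch. It assumes one summary file per fixed-size chunk of WAL, which is the
scheme being discussed above and not necessarily what any patch implements;
the function name and the granularities other than 32MB are made up for
illustration.

```python
# Rough estimate of how many WAL summary files an incremental backup
# would have to read. Assumes (hypothetically) exactly one summary file
# per fixed-size chunk of WAL.

MIB = 1024 ** 2
TIB = 1024 ** 4


def summary_file_count(wal_distance_bytes: int, granularity_bytes: int) -> int:
    """Number of summary files covering the WAL range, rounded up."""
    if wal_distance_bytes < 0:
        raise ValueError("WAL distance must not be negative")
    if granularity_bytes <= 0:
        raise ValueError("granularity must be positive")
    # Ceiling division without floating point.
    return -(-wal_distance_bytes // granularity_bytes)


if __name__ == "__main__":
    # 1TB of WAL between the prior backup and the incremental backup.
    for granularity_mib in (32, 256, 1024):
        count = summary_file_count(TIB, granularity_mib * MIB)
        print(f"{granularity_mib:>5} MB per summary: {count} files")
```

With 1TB of WAL and 32MB per summary this gives 32768 files, the "32k"
figure above; coarser summaries shrink the count proportionally, at the cost
of recomputing more if the range of interest does not line up with a file
boundary.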

--
Robert Haas
EDB: http://www.enterprisedb.com
