Re: trying again to get incremental backup

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: trying again to get incremental backup
Date: 2023-11-16 17:13:53
Message-ID: CA+TgmobRHE20ZA+_L96H8GKFGynU+fTf5_8KsBWEOETq9=NbXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 16, 2023 at 5:21 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> I meant code like this
>
> memcpy(&key.rlocator, rlocator, sizeof(RelFileLocator));
> key.forknum = forknum;
> entry = blockreftable_lookup(brtab->hash, key);

Ah, I hadn't thought about that. Another way of handling that might be
to add = {0} to the declaration of key. But I can do the initializer
thing too if you think it's better. I'm not sure if there's an
argument that the initializer might optimize better.

> An output pointer, you mean :-) (Should it be const?)

I'm bad at const, but that seems to work, so sure.

> When the return value is BACK_UP_FILE_FULLY, it's not clear what happens
> to these output values; we modify some, but why? Maybe they should be
> left alone? In that case, the "if size == 0" test should move a couple
> of lines up, in the brtentry == NULL block.

OK.

> BTW, you could do the qsort() after deciding to backup the file fully if
> more than 90% needs to be replaced.

OK.

> BTW, in sendDir() why do
> lookup_path = pstrdup(pathbuf + basepathlen + 1);
> when you could do
> lookup_path = pstrdup(tarfilename);
> ?

No reason, changed.

> > If we want to inject more underscores here, my vote is to go all the
> > way and make it per_wal_range_cb.
>
> +1

Will look into this.

> Yeah, I just think that endless stream of hex chars are hard to read,
> and I've found myself following digits in the screen with my fingers in
> order to parse file names. I guess you could say thousands separators
> for regular numbers aren't needed either, but we do have them for
> readability sake.

Sigh.

> I think a new section in chapter 30 "Reliability and the Write-Ahead
> Log" is warranted. It would explain the summarization process, what the
> summary files are used for, and the deletion mechanism. I can draft
> something if you want.

Sure, if you want to take a crack at it, that's great.

> It's not clear to me if WalSummarizerCtl->pending_lsn if fulfilling some
> purpose or it's just a leftover from prior development. I see it's only
> read in an assertion ... Maybe if we think this cross-check is
> important, it should be turned into an elog? Otherwise, I'd remove it.

I've been thinking about that. One thing I'm not quite sure about
though is introspection. Maybe there should be a function that shows
summarized_tli and summarized_lsn from WalSummarizerData, and maybe it
should expose pending_lsn too.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2023-11-16 17:23:00 Re: trying again to get incremental backup
Previous Message Robert Haas 2023-11-16 17:13:44 Re: trying again to get incremental backup