Re: On markers of changed data

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Greg Stark <stark(at)mit(dot)edu>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: On markers of changed data
Date: 2017-10-10 22:50:47
Message-ID: 20171010225047.GD4628@tamriel.snowman.net
Lists: pgsql-hackers

Alvaro,

* Alvaro Herrera (alvherre(at)alvh(dot)no-ip(dot)org) wrote:
> Greg Stark wrote:
>
> > The general shape of what I would like to see is some log which lists
> > where each checkpoint starts and ends and what blocks are modified
> > since the previous checkpoint. Then to generate an incremental backup
> > from any point in time to the current you union all the block lists
> > between them and fetch those blocks. There are other ways of using
> > this aside from incremental backups on disk too -- you could imagine a
> > replica that has fallen behind requesting the block lists and then
> > fetching just those blocks instead of needing to receive and apply all
> > the wal.
>
> Hmm, this sounds pretty clever. And we already have the blocks touched
> by each record thanks to the work for pg_rewind (so we don't have to do
> any nasty tricks like the stuff Suzuki-san did for pg_lesslog, where
> each WAL record had to be processed individually to know what blocks it
> referenced), so it shouldn't be *too* difficult ...

Yeah, it sounds interesting, but I was just chatting with David about it
and we were thinking about how checkpoints happen rather often, so you
end up with quite a few of these lists out there.

Now, if the lists were always kept in a sorted fashion, then perhaps we
would be able to essentially merge-sort them all back together and
de-dup that way, but even then, you're talking about an awful lot if
you're looking at daily incrementals: that's 288 standard 5-minute
checkpoints, each with some 128k pages changed (1GB of WAL at 8kB per
page), assuming max_wal_size of 1GB, and I think we can all agree that
the default max_wal_size is for rather small systems. That ends up being
something around 2MB per checkpoint to store the page list, or half a
gig per day just to keep track of the pages which changed in each
checkpoint across that day.
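
To make the numbers and the merge step concrete, here is a rough sketch
rather than a proposal for an actual on-disk format. The 8kB page size is
the PostgreSQL default; the 16 bytes per list entry is an assumption picked
because it reproduces the ~2MB figure, and the (relation, block) tuples and
function names are purely hypothetical.

```python
import heapq
from itertools import groupby

PAGE_SIZE = 8 * 1024                  # default PostgreSQL block size
MAX_WAL_SIZE = 1024 ** 3              # max_wal_size of 1GB, as in the text
ENTRY_SIZE = 16                       # ASSUMED bytes per block reference
CHECKPOINTS_PER_DAY = 24 * 60 // 5    # one checkpoint every 5 minutes


def estimate_list_sizes():
    """Return (pages per checkpoint, bytes per checkpoint, bytes per day)."""
    pages = MAX_WAL_SIZE // PAGE_SIZE
    per_checkpoint = pages * ENTRY_SIZE
    return pages, per_checkpoint, per_checkpoint * CHECKPOINTS_PER_DAY


def merge_block_lists(sorted_lists):
    """Merge per-checkpoint block lists and drop duplicates.

    Each input list must already be sorted; entries are hypothetical
    (relation, block) tuples. heapq.merge streams the inputs in order,
    so duplicates end up adjacent and groupby can collapse them.
    """
    merged = heapq.merge(*sorted_lists)
    return [entry for entry, _ in groupby(merged)]


if __name__ == "__main__":
    pages, per_ckpt, per_day = estimate_list_sizes()
    print(pages, per_ckpt / 2**20, per_day / 2**20)  # pages, MiB, MiB
    print(merge_block_lists([[(1, 0), (1, 5)], [(1, 5), (2, 3)]]))
```

The merge itself is cheap per entry; the concern is the total volume of
entries that has to be kept and read back for a day's worth of checkpoints.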

There's a bit of hand-waving in there, but I don't think it takes much
to reach the conclusion that this might not be the best approach.
David and I were kicking around the notion of a 'last LSN' which is kept
on a per-relation basis, but, of course, that ends up not really being
granular enough, and would likely be a source of contention unless we
could work out a way to have it updated lazily somehow, or similar.
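
As a toy illustration of the granularity problem only (nothing like this
exists in PostgreSQL, and the class and method names are made up), a
per-relation 'last LSN' could look like the sketch below. LSNs are modelled
as plain integers and relations as strings.

```python
from dataclasses import dataclass, field


@dataclass
class RelationLsnMap:
    """Hypothetical map from relation name to the last LSN that touched it."""

    last_lsn: dict = field(default_factory=dict)

    def record_change(self, relation, lsn):
        # Keep only the highest LSN seen for the relation.
        if lsn > self.last_lsn.get(relation, 0):
            self.last_lsn[relation] = lsn

    def relations_changed_since(self, backup_lsn):
        # Everything modified after the previous backup's LSN.
        return sorted(
            rel for rel, lsn in self.last_lsn.items() if lsn > backup_lsn
        )


if __name__ == "__main__":
    lsn_map = RelationLsnMap()
    lsn_map.record_change("big_table", 100)    # a single block modified
    lsn_map.record_change("small_table", 50)
    print(lsn_map.relations_changed_since(60))
```

A single modified block is enough to flag all of big_table for the next
incremental, which is the "not granular enough" part; and in a real server
every writer to a relation would be updating the same marker, which is
where the contention worry comes from.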

> > It would also be useful for going in the reverse direction: look up
> > all the records (or just the last record) that modified a given block.
>
> Well, a LSN map is what I was suggesting.

Not sure I entirely followed what you were getting at here..?

Thanks!

Stephen
