Re: finding changed blocks using WAL scanning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: finding changed blocks using WAL scanning
Date: 2019-04-18 19:43:30
Message-ID: CA+Tgmob3EppkRCJiMsZxUTtM3P9Aj99Aou891NKALe+mdB3ukA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 15, 2019 at 10:22 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > > I am thinking tools could retain modblock files along with WAL, could
> > > pull full-page-writes from WAL, or from PGDATA. It avoids the need to
> > > scan 16MB WAL files, and the WAL files and modblock files could be
> > > expired independently.
> >
> > That is pretty much exactly what I was intending to propose.
>
> OK, good. Some of your wording was vague so I was unclear exactly what
> you were suggesting.

Well, I guess the part that isn't like what I was suggesting is the
idea that there should be exactly one modified block file per segment.
The biggest problem with that idea is that a single WAL record can be
split across two segments (or, in pathological cases, perhaps more).
I think it makes sense to talk about the blocks modified by WAL
between LSN A and LSN B, but it doesn't make much sense to talk about
the block modified by the WAL in segment XYZ.

You can make it kinda make sense by saying "the blocks modified by
records *beginning in* segment XYZ" or alternatively "the blocks
modified by records *ending in* segment XYZ", but that seems confusing
to me. For example, suppose you decide on the first one --
000000010000000100000068.modblock will contain all blocks modified by
records that begin in 000000010000000100000068. Well, that means that
to generate the 000000010000000100000068.modblock, you will need
access to 000000010000000100000068 AND probably also
000000010000000100000069 and in rare cases perhaps
00000001000000010000006A or even later files. I think that's actually
pretty confusing.

It seems better to me to give the files names like
${TLI}.${STARTLSN}.${ENDLSN}.modblock, e.g.
00000001.0000000168000058.00000001687DBBB8.modblock, so that you can
see exactly which *records* are covered by that segment.

And I suspect it may also be a good idea to bunch up the records from
several WAL files. Especially if you are using 16MB WAL files,
collecting all of the block references from a single WAL file is going
to produce a very small file. I suspect that the modified block files
will end up being 100x smaller than the WAL itself, perhaps more, and
I don't think anybody will appreciate us adding another PostgreSQL
systems that spews out huge numbers of tiny little files. If, for
example, somebody's got a big cluster that is churning out a WAL
segment every second, they would probably still be happy to have a new
modified block file only, say, every 10 seconds.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-04-18 19:50:44 Re: Runtime pruning problem
Previous Message Tom Lane 2019-04-18 19:41:37 Re: pgsql: Fix plan created for inherited UPDATE/DELETE with all tables exc