Re: finding changed blocks using WAL scanning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: finding changed blocks using WAL scanning
Date: 2019-04-18 19:51:14
Message-ID: CA+Tgmobo+sd+T1b_NQpEEL=MMf55Y9NzJJJ7Ym5OZ5FUs4-u-g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 15, 2019 at 11:45 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Mon, Apr 15, 2019 at 09:04:13PM -0400, Robert Haas wrote:
> > That is pretty much exactly what I was intending to propose.
>
> Any caller of XLogWrite() could switch to a new segment once the
> current one is done, and I am not sure that we would want some random
> backend to potentially slow down to do that kind of operation.
>
> Or would a separate background worker do this work by itself? An
> external tool can do that easily already:
> https://github.com/michaelpq/pg_plugins/tree/master/pg_wal_blocks

I was thinking that a dedicated background worker would be a good
option, but Stephen Frost seems concerned (over on the other thread)
about how much load that would generate. That never really occurred
to me as a serious issue and I suspect for many people it wouldn't be,
but there might be some for whom it is.

It's cool that you have a command-line tool that does this as well.
Over there, it was also discussed that we might want to have both a
command-line tool and a background worker. I think, though, that we
would want to get the output in some kind of compressed binary format,
rather than text, for example:

4-byte database OID
4-byte tablespace OID
any number of relation OID/block number pairings for that
database/tablespace combination
4-byte zero to mark the end of the relation OID/block number list
and then repeat all of the above any number of times

That might be too dumb and I suspect we want some headers and a
checksum, but we should try to somehow exploit the fact that there
aren't likely to be many distinct databases or many distinct
tablespaces mentioned -- whereas relation OID and block number will
probably have a lot more entropy.
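
As a rough illustration of the layout described above, here is a minimal
Python sketch of an encoder and decoder. It is only one possible reading of
the idea: the function names are hypothetical, and the little-endian byte
order and 4-byte width for relation OIDs and block numbers are assumptions
that the description does not pin down. It also omits the headers and
checksum mentioned above.

```python
import struct
from collections import defaultdict

# Assumptions (not specified in the message): little-endian byte order,
# and 4 bytes each for relation OID and block number.
_U32 = struct.Struct("<I")
_PAIR = struct.Struct("<II")


def encode_changed_blocks(changes):
    """Encode (db_oid, ts_oid, rel_oid, block_no) tuples.

    Entries are grouped by (database, tablespace) so that each of those
    OIDs is written once per group rather than once per block.
    """
    groups = defaultdict(set)
    for db, ts, rel, blk in changes:
        if rel == 0:
            # A zero relation OID is reserved as the end-of-list marker.
            raise ValueError("relation OID 0 cannot be encoded")
        groups[(db, ts)].add((rel, blk))

    out = bytearray()
    for (db, ts), pairs in sorted(groups.items()):
        out += _PAIR.pack(db, ts)          # database OID, tablespace OID
        for rel, blk in sorted(pairs):
            out += _PAIR.pack(rel, blk)    # relation OID, block number
        out += _U32.pack(0)                # 4-byte zero ends the list
    return bytes(out)


def decode_changed_blocks(data):
    """Decode bytes produced by encode_changed_blocks into a list of tuples."""
    result = []
    pos = 0
    while pos < len(data):
        db, ts = _PAIR.unpack_from(data, pos)
        pos += _PAIR.size
        while True:
            (rel,) = _U32.unpack_from(data, pos)
            pos += _U32.size
            if rel == 0:
                break
            (blk,) = _U32.unpack_from(data, pos)
            pos += _U32.size
            result.append((db, ts, rel, blk))
    return result
```

Grouping by database and tablespace means each group costs 8 bytes of
header plus a 4-byte terminator, and every changed block then costs only 8
bytes, which is where the low number of distinct databases and tablespaces
pays off.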

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
