Re: finding changed blocks using WAL scanning

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: finding changed blocks using WAL scanning
Date: 2019-04-22 02:20:43
Message-ID: 20190422022043.GA2712@paquier.xyz
Lists: pgsql-hackers

On Sat, Apr 20, 2019 at 12:21:36AM -0400, Robert Haas wrote:
> The segment size doesn't have much to do with it. If you make
> segments bigger, you'll have to scan fewer larger ones; if you make
> them smaller, you'll have more smaller ones. The only thing that
> really matters is the amount of I/O and CPU required, and that doesn't
> change very much as you vary the segment size.

If you create the extra file when a segment is finished and we switch
to a new one, then the extra work would fall on a random backend.
Scanning a 1GB segment as a one-time operation is more costly than
scanning a 16MB segment, though fewer backends would see a slowdown
for the same amount of WAL generated. From what I can see, you are
not planning to do such operations when a segment finishes being
written, which is much better.
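
To put rough numbers on that trade-off, here is a small back-of-the-envelope
sketch. It is only an illustration: the 64GB total and the helper name
scan_profile are assumptions made up for this example, and it models nothing
beyond the arithmetic of segment switches.

```python
MB = 1024 * 1024
GB = 1024 * MB

def scan_profile(total_wal_bytes, segment_bytes):
    """Return (segment switches, bytes scanned per switch).

    Hypothetical helper: assumes one full scan of the finished segment
    happens at every segment switch, which is the scenario discussed above.
    """
    switches = total_wal_bytes // segment_bytes
    return switches, segment_bytes

# Assumed workload: 64GB of WAL generated in total.
total = 64 * GB
small = scan_profile(total, 16 * MB)   # many short stalls
large = scan_profile(total, 1 * GB)    # few long stalls

for label, (switches, per_scan) in (("16MB", small), ("1GB", large)):
    print(f"{label} segments: {switches} switches, "
          f"{per_scan // MB} MB scanned per switch, "
          f"{switches * per_scan // GB} GB scanned in total")
```

The total amount scanned is identical in both cases; what changes is how
often a backend gets hit and how long each hit lasts.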

> As to that, what I'm proposing here is no different than what we are
> already doing with physical and logical replication, except that it's
> probably a bit cheaper. Physical replication reads all the WAL and
> sends it all out over the network. Logical replication reads all the
> WAL, does a bunch of computation, and then sends the results, possibly
> filtered, out over the network. This would read the WAL and then
> write a relatively small file to your local disk.
>
> I think the impact will be about the same as having one additional
> standby, give or take.

If you put the load on an extra process, yeah I don't think that it
would be noticeable.
--
Michael
