Re: storing an explicit nonce

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Kincaid <tomjohnkincaid(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Subject: Re: storing an explicit nonce
Date: 2021-05-27 15:18:59
Message-ID: 20210527151859.GE5646@momjian.us
Lists: pgsql-hackers

On Thu, May 27, 2021 at 10:47:13AM -0400, Robert Haas wrote:
> On Wed, May 26, 2021 at 4:40 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > You are saying that by using a non-LSN nonce, you can write out the page
> > with a new nonce, but the same LSN, and also discard the page during
> > crash recovery and use the WAL copy?
>
> I don't know what "discard the page during crash recovery and use the
> WAL copy" means.

I was asking how decoupling the nonce from the LSN allows us to avoid
full page writes for hint bit changes. I am guessing you are saying
that on recovery, if we see a hint-bit-only change in the WAL (with a
new nonce), we just throw away the on-disk page, because it could be
torn, and use the full page write version from the WAL instead.

> > I am confused why checksums, which are widely used, acceptably require
> > wal_log_hints, but there is concern that file encryption, which is
> > heavier, cannot acceptably require wal_log_hints. I must be missing
> > something.
>
> I explained this in the first complete paragraph of my first email
> with this subject line: "For example, right now, we only need to WAL
> log hints for the first write to each page after a checkpoint, but in
> this approach, if the same page is written multiple times per
> checkpoint cycle, we'd need to log hints every time." That's a huge
> difference. Page eviction in some workloads can push the same pages
> out of shared buffers every few seconds, whereas something that has to
> be done once per checkpoint cycle cannot affect each page nearly so
> often. A checkpoint is only going to occur every 5 minutes by default,
> or more realistically every 10-15 minutes in a well-tuned production
> system. In other words, we're not holding up some kind of double
> standard, where the existing feature is allowed to depend on doing a
> certain thing but your feature isn't allowed to depend on the same
> thing. Your design depends on doing something which is potentially
> 100x+ more expensive than the existing thing. It's not always going to
> be that expensive, but it can be.

Yes, it might be 1e100+ times more expensive too, but we don't know, and
I am not ready to add a lot of complexity for such an unknown.
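
As a rough illustration of the arithmetic in the quoted paragraph (this
sketch is not from the thread and is not PostgreSQL code): it counts
hint-related WAL records for a single page over one checkpoint cycle.
The 300-second checkpoint interval is the default cited above; the
3-second eviction interval is only an assumed stand-in for "every few
seconds", and the function name is hypothetical. It also assumes every
eviction follows a hint-type change.

```python
# Illustrative cost model only; not PostgreSQL code.

def hint_wal_records_per_cycle(checkpoint_interval_s, eviction_interval_s,
                               log_every_write):
    """Count hint-related WAL records for one page in one checkpoint cycle."""
    # Number of times the page is written out during the cycle.
    writes = checkpoint_interval_s // eviction_interval_s
    if writes == 0:
        return 0
    # Existing behavior: only the first write after a checkpoint logs hints.
    # LSN-as-nonce behavior: every write needs a new LSN, so every write logs.
    return writes if log_every_write else 1


# Assumed values: 5-minute checkpoints, same page evicted every 3 seconds.
first_write_only = hint_wal_records_per_cycle(300, 3, log_every_write=False)
every_write = hint_wal_records_per_cycle(300, 3, log_every_write=True)
ratio = every_write // first_write_only
print(first_write_only, every_write, ratio)
```

Under these assumed numbers the per-page ratio matches the "100x+"
figure in the quote; with longer checkpoint intervals or faster
eviction it grows, and with rare eviction it shrinks toward 1.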

> > Why can't checksums also throw away hint bit changes like you want to do
> > for file encryption and not require wal_log_hints?
>
> Well, I don't want to throw away hint bit changes, just like we don't
> throw them away right now. And I want to do that by making sure that
> each time the page is written, we use a different nonce, but without
> the expense of having to advance the LSN.
>
> Now, another option is to do what you suggest here. We could say that
> if a dirty page is evicted, but the page is only dirty because of
> hint-type changes, we don't actually write it out. That does avoid
> using the same nonce for multiple writes, because now there's only one
> write. It also fixes the problem on standbys that Andres was
> complaining about, because on a standby, the only way a page can
> possibly be dirtied without an associated WAL record is through a
> hint-type change. However, I think we'd find that this, too, is pretty
> expensive in certain workloads. It's useful to write hint bits -
> that's why we do it.

Oh, that does sound nice. It is kind of an escape hatch if we are
evicting pages often for hint bit changes. I like it.
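
A minimal sketch of the "don't write hint-only dirty pages" option
described in the quote above, to show why it avoids nonce reuse. This
is a toy model, not PostgreSQL code: the Page class, its fields, and
the helper functions are all hypothetical, and the LSN stands in for
the nonce.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # Hypothetical buffer model; field names are illustrative only.
    lsn: int = 0              # LSN of the last WAL-logged change (used as nonce)
    wal_dirty: bool = False   # dirtied by a change that has a WAL record
    hint_dirty: bool = False  # dirtied by a hint-type change (no WAL record)


def wal_logged_change(page, new_lsn):
    """A normal change: it has a WAL record, so the LSN advances."""
    page.lsn = new_lsn
    page.wal_dirty = True


def set_hint_bit(page):
    """A hint-type change: the page is dirtied but the LSN stays the same."""
    page.hint_dirty = True


def evict(page, nonces_used):
    """Write the page only if it has a WAL-logged change; return True if written."""
    if not page.wal_dirty:
        # Hint-only dirty (or clean): skip the write, so the current nonce
        # is never used a second time. The hint bits are lost.
        page.hint_dirty = False
        return False
    nonces_used.append(page.lsn)
    page.wal_dirty = False
    page.hint_dirty = False
    return True


nonces = []
page = Page()
wal_logged_change(page, 100)
results = [evict(page, nonces)]        # written with nonce 100
set_hint_bit(page)
results.append(evict(page, nonces))    # hint-only: skipped
wal_logged_change(page, 200)
set_hint_bit(page)
results.append(evict(page, nonces))    # written with nonce 200
print(results, nonces)
```

In this model each nonce is used for at most one write, at the price
of discarding hint bits on hint-only evictions, which is the cost the
quoted text warns about.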

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.
