Re: [Patch] ALTER SYSTEM READ ONLY

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amul Sul <sulamul(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: [Patch] ALTER SYSTEM READ ONLY
Date: 2020-08-28 19:53:29
Message-ID: CA+TgmoYMyw-m3O5XQ8tRy4mdEArGcfXr+9niO5Fmq1wVdKxYmQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 19, 2020 at 6:28 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> Attached is a rebased on top of the latest master head (# 3e98c0bafb2).

Does anyone, especially anyone named Andres Freund, have comments on
0001? That work is somewhat independent of the rest of this patch set
from a theoretical point of view, and it seems like if nobody sees a
problem with the line of attack there, it would make sense to go ahead
and commit that part. Considering that this global barrier stuff is
new and that I'm not sure how well we really understand the problems
yet, there's a possibility that we might end up revising these details
again. I understand that most people, including me, are somewhat
reluctant to see experimental code get committed, in this case that
ship has basically sailed already, since neither of the patches that
we thought would use the barrier mechanism end up making it into v13.
I don't think it's really making things any worse to try to improve
the mechanism.

0002 isn't separately committable, but I don't see anything wrong with it.

Regarding 0003:

I don't understand why ProcessBarrierWALProhibit() can safely assert
that the WALPROHIBIT_STATE_READ_ONLY is set.

+ errhint("Cannot continue a
transaction if it has performed writes while system is read only.")));

This sentence is bad because it makes it sound like the current
transaction successfully performed a write after the system had
already become read-only. I think something like errdetail("Sessions
with open write transactions must be terminated.") would be better.

I think SetWALProhibitState() could be in walprohibit.c rather than
xlog.c. Also, this function appears to have obvious race conditions.
It fetches the current state, then thinks things over while holding no
lock, and then unconditionally updates the current state. What happens
if somebody else has changed the state in the meantime? I had sort of
imagined that we'd use something like pg_atomic_uint32 for this and
manipulate it using compare-and-swap operations. Using some kind of
lock is probably fine, too, but you have to hold it long enough that
the variable can't change under you while you're still deciding
whether it's OK to modify it, or else recheck after reacquiring the
lock that the value doesn't differ from what you expect.

I think the choice to use info_lck to synchronize
SharedWALProhibitState is very strange -- what is the justification
for that? I thought the idea might be that we frequently need to check
SharedWALProhibitState at times when we'd be holding info_lck anyway,
but it looks to me like you always do separate acquisitions of
info_lck just for this, in which case I don't see why we should use it
here instead of a separate lock. For that matter, why does this need
to be part of XLogCtlData rather than a separate shared memory area
that is private to walprohibit.c?

- else
+ /*
+ * Can't perform checkpoint or xlog rotation without writing WAL.
+ */
+ else if (XLogInsertAllowed())

Not project style.

+ case WAIT_EVENT_SYSTEM_WALPROHIBIT_STATE_CHANGE:

Can we drop the word SYSTEM here to make this shorter, or would that
break some convention?

+/*
+ * NB: The return string should be the same as the _ShowOption() for boolean
+ * type.
+ */
+ static const char *
+ show_system_is_read_only(void)
+{

I'm not sure the comment is appropriate here, but I'm very sure the
extra spaces before "static" and "show" are not per style.

+ /* We'll be done once in-progress flag bit is cleared */

Another whitespace mistake.

+ elog(DEBUG1, "WALProhibitRequest: Waiting for checkpointer");
+ elog(DEBUG1, "Done WALProhibitRequest");

I think these should be removed.

Can WALProhibitRequest() and performWALProhibitStateChange() be moved
to walprohibit.c, just to bring more of the code for this feature
together in one place? Maybe we could also rename them to
RequestWALProhibitChange() and CompleteWALProhibitChange()?

- * think it should leave the child state in place.
+ * think it should leave the child state in place. Note that the upper
+ * transaction will be a force to ready-only irrespective of
its previous
+ * status if the server state is WAL prohibited.
*/
- XactReadOnly = s->prevXactReadOnly;
+ XactReadOnly = s->prevXactReadOnly || !XLogInsertAllowed();

Both instances of this pattern seem sketchy to me. You don't expect
that reverting the state to a previous state will instead change to a
different state that doesn't match up with what you had before. What
is the bad thing that would happen if we did not make this change?

- * Else, must check to see if we're still in recovery.
+ * Else, must check to see if we're still in recovery

Spurious change.

+ /* Request checkpoint */
+ RequestCheckpoint(CHECKPOINT_IMMEDIATE);
+ ereport(LOG, (errmsg("system is now read write")));

This does not seem right. Perhaps the intention here was that the
system should perform a checkpoint when it switches to read-write
state after having skipped the startup checkpoint. But why would we do
this unconditionally in all cases where we just went to a read-write
state?

There's probably quite a bit more to say about 0003 but I think I'm
running too low on mental energy to say more now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-08-28 19:56:53 Re: new heapcheck contrib module
Previous Message Ranier Vilela 2020-08-28 18:54:57 Re: Clang UndefinedBehaviorSanitize (Postgres14) Detected undefined-behavior