Re: [Patch] ALTER SYSTEM READ ONLY

From: Amul Sul <sulamul(at)gmail(dot)com>
To: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] ALTER SYSTEM READ ONLY
Date: 2020-07-24 05:13:54
Message-ID: CAAJ_b94HChmPoax9MCS_sWa6Bj56c8hV_72DxQKp1y_MGMY8ZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 24, 2020 at 6:28 AM Soumyadeep Chakraborty <
soumyadeep2007(at)gmail(dot)com> wrote:
>
> On Thu, Jun 18, 2020 at 7:54 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > I think we'd want the FIRST write operation to be the end-of-recovery
> > checkpoint, before the system is fully read-write. And then after that
> > completes you could do other things.
>
> I can't see why this is necessary from a correctness or performance
> point of view. Maybe I'm missing something.
>
> In case it is necessary, the patch set does not wait for the checkpoint to
> complete before marking the system as read-write. Refer:
>
> /* Set final state by clearing in-progress flag bit */
> if (SetWALProhibitState(wal_state &
~(WALPROHIBIT_TRANSITION_IN_PROGRESS)))
> {
> if ((wal_state & WALPROHIBIT_STATE_READ_ONLY) != 0)
> ereport(LOG, (errmsg("system is now read only")));
> else
> {
> /* Request checkpoint */
> RequestCheckpoint(CHECKPOINT_IMMEDIATE);
> ereport(LOG, (errmsg("system is now read write")));
> }
> }
>
> We should RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT) before
> we SetWALProhibitState() and do the ereport(), if we have a read-write
> state change request.
>
+1, I too have the same question.

FWIW, I don't we can request CHECKPOINT_WAIT for this place, otherwise, it
think
it will be deadlock case -- checkpointer process waiting for itself.

> Also, we currently request this checkpoint even if there was no startup
> recovery and we don't set CHECKPOINT_END_OF_RECOVERY in the case where
> the read-write request does follow a startup recovery.
> So it should really be:
> RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT |
> CHECKPOINT_END_OF_RECOVERY);
> We would need to convey that an end-of-recovery-checkpoint is pending in
> shmem somehow (and only if one such checkpoint is pending, should we do
> it as a part of the read-write request handling).
> Maybe we can set CHECKPOINT_END_OF_RECOVERY in ckpt_flags where we do:
> /*
> * Skip end-of-recovery checkpoint if the system is in WAL prohibited
state.
> */
> and then check for that.
>
Yep, we need some indication that end-of-recovery was skipped at the
startup,
but I haven't added that since I wasn't sure do we really need
CHECKPOINT_END_OF_RECOVERY as part of the previous concern?

> Some minor comments about the code (some of them probably doesn't
> warrant immediate attention, but for the record...):
>
> 1. There are some places where we can use a local variable to store the
> result of RelationNeedsWAL() to avoid repeated calls to it. E.g.
> brin_doupdate()
>
Ok.

> 2. Similarly, we can also capture the calls to GetWALProhibitState() in
> a local variable where applicable. E.g. inside WALProhibitRequest().
>
I don't think so.

> 3. Some of the functions that were added such as GetWALProhibitState(),
> IsWALProhibited() etc could be declared static inline.
>
IsWALProhibited() can be static but not GetWALProhibitState() since it
needed to
be accessible from other files.

> 4. IsWALProhibited(): Shouldn't it really be:
> bool
> IsWALProhibited(void)
> {
> uint32 walProhibitState = GetWALProhibitState();
> return (walProhibitState & WALPROHIBIT_STATE_READ_ONLY) != 0
> && (walProhibitState & WALPROHIBIT_TRANSITION_IN_PROGRESS) == 0;
> }
>
I think the current one is better, this allows read-write transactions from
existing backend which has absorbed barrier or from new backend while we
changing stated to read-write in the assumption that we never fallback.

> 5. I think the comments:
> /* Must be performing an INSERT or UPDATE, so we'll have an XID */
> and
> /* Can reach here from VACUUM, so need not have an XID */
> can be internalized in the function/macro comment header.
>
Ok.

> 6. Typo: ConditionVariable readonly_cv; /* signaled when ckpt_started
> advances */
> We need to update the comment here.
>
Ok.

Will try to address all the above review comments in the next version along
with
Andres' concern/suggestion. Thanks again for your time.

Regards,
Amul

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2020-07-24 07:10:52 Re: Parallel worker hangs while handling errors.
Previous Message David Rowley 2020-07-24 04:56:49 Re: Row estimates for empty tables