Re: CheckpointLock needed in CreateCheckPoint()?

From: Amul Sul <sulamul(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CheckpointLock needed in CreateCheckPoint()?
Date: 2021-01-07 07:51:25
Message-ID: CAAJ_b95tQ4F5g=+mrCQqmdvAKG-FEBSqXOsihqL=HZNxsicPCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 7, 2021 at 12:45 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Thu, Jan 7, 2021 at 11:32 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > Snip from CreateCheckPoint():
> > --
> > /*
> > * Acquire CheckpointLock to ensure only one checkpoint happens at a time.
> > * (This is just pro forma, since in the present system structure there is
> > * only one process that is allowed to issue checkpoints at any given
> > * time.)
> > */
> > LWLockAcquire(CheckpointLock, LW_EXCLUSIVE);
> > --
> >
> > As per this comment, it seems to be that we don't really need this LW lock. We
> > could have something else instead if we are afraid of having multiple
> > checkpoints at any given time which isn't possible, btw.
>
> Looks like CheckpointLock() can also be called in standalone backends
> (RequestCheckpoint->CreateCheckPoint) along with the checkpointer
> process. Having said that, I'm not quite sure whether it can get
> called at the same time from a backend(may be with checkpoint;
> command) and the checkpointer process.
>
> /*
> * If in a standalone backend, just do it ourselves.
> */
> if (!IsPostmasterEnvironment)
> {
> /*
> * There's no point in doing slow checkpoints in a standalone backend,
> * because there's no other backends the checkpoint could disrupt.
> */
> CreateCheckPoint(flags | CHECKPOINT_IMMEDIATE);
>
> See the below comment for IsPostmasterEnvironment:
>
> /*
> * IsPostmasterEnvironment is true in a postmaster process and any postmaster
> * child process; it is false in a standalone process (bootstrap or
> * standalone backend).
>

I am not sure I understood your point completely. Do you mean there could be
multiple bootstrap or standalone backends that could exist at a time and can
perform checkpoint?

> > This CheckpointLock is acquired almost at the start of CreateCheckPoint() and
> > released at the end. The in-between code can take a while to be complete. All
> > that time interrupt will be on hold which doesn't seem to be a great idea but
> > that wasn't problematic until now. I am having trouble due to this interrupt
> > hold for a long since I was trying to add code changes where the checkpointer
> > process is suppose to respond to the system read-write state change request as
> > soon as possible[1]. That cannot be done if the interrupt is disabled
> > since read-write state change uses the global barrier mechanism to convey system
> > state changes to other running processes.
> >
> > So my question is, can we completely get rid of the CheckpointLock need in
> > CreateCheckPoint()?
> >
> > Thoughts/Comments/Suggestions?
>
> I don't think we can completely get rid of CheckpointLock in
> CreateCheckPoint given the fact that it can get called from multiple
> processes.
>

How?

> How about serving that ASRO barrier request alone for checkpointer
> process in ProcessInterrupts even though the CheckpointLock is held by
> the checkpointer process? And also while the checkpointing is
> happening, may be we should call CHECK_FOR_INTERRUPTS to see if the
> ASRO barrier has arrived. This may sound bit hacky, but just a
> thought. I'm thinking that in ProcessInterrupts we can get to know the
> checkpointer process type via MyAuxProcType, and whether the lock is
> held or not using CheckpointLock, and then we can handle the ASRO
> global barrier request.
>

I am afraid that this will easily get rejected by the community. Note that this
is a very crucial code path of the database server. There are other options
too, but don't want to propose them until clarity on the CheckpointLock
necessity.

Regards,
Amul

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-01-07 07:52:21 Re: Single transaction in the tablesync worker?
Previous Message Heikki Linnakangas 2021-01-07 07:51:00 Re: Incorrect allocation handling for cryptohash functions with OpenSSL