Re: CheckpointLock needed in CreateCheckPoint()?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amul Sul <sulamul(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CheckpointLock needed in CreateCheckPoint()?
Date: 2021-01-18 19:36:48
Message-ID: CA+TgmoYmFduRyajd6Bp7GjN=WUqgHLUA3PX5nX5PctzCn5cbTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 14, 2021 at 10:21 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Yeah, I think this lock is useless. In fact, I think it's harmful,
> because it makes large sections of the checkpointer code, basically
> all of it really, run with interrupts held off for no reason.
>
> It's not impossible that removing the lock could reveal bugs
> elsewhere: suppose we have segments of code that actually aren't safe
> to interrupt, but because of this LWLock, it never happens. But, if
> that happens to be true, I think we should just view those cases as
> bugs to be fixed.
>
> One thing that blunts the impact of this quite a bit is that the
> checkpointer doesn't use rely much on CHECK_FOR_INTERRUPTS() in the
> first place. It has its own machinery: HandleCheckpointerInterrupts().
> Possibly that's something we should be looking to refactor somehow as
> well.

Here's a patch to remove CheckpointLock completely. For
ProcessInterrupts() to do anything, one of the following things would
have to be true:

1. ProcDiePending = true. Normally this is set by die(), but the
checkpointer does not use die(). Perhaps someday it should, or maybe
not, but that's an issue for another day.

2. ClientConnectionLost = true. This gets set in pqcomm.c's
internal_flush(), but the checkpointer has no client connection.

3. DoingCommandRead = true. Can't happen; only set in PostgresMain().

4. QueryCancelPending = true. Can be set by recovery conflicts, but
those shouldn't happen in the checkpointer. Normally set by
StatementCancelHandler, which the checkpointer doesn't use.

5. IdleInTransactionSessionTimeoutPending = true. Set up by
InitPostgres(), which is not used by auxiliary processes.

6. ProcSignalBarrierPending = true. This is the case we actually care
about allowing for the ASRO patch, so if this occurs, it's good.

7. ParallelMessagePending = true. The checkpointer definitely never
starts parallel workers.

So I don't see any problem with just ripping this out entirely, but
I'd like to know if anybody else does.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v1-0001-Remove-CheckpointLock.patch application/octet-stream 6.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2021-01-18 19:44:58 Re: Deleting older versions in unique indexes to avoid page splits
Previous Message Zhihong Yu 2021-01-18 19:23:42 Re: search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault