Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?
Date: 2021-03-01 04:46:03
Message-ID: CA+hUKGKq+N7_eMQN8c9zNrOC=iQPHpfqLEeDxOK5JkKvwS0hSg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Feb 27, 2021 at 4:14 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Here's a new version. The condition variable patch 0001 fixes a bug:
> CleanupProcSignalState() also needs to broadcast. The hunk that
> allows the interrupt handlers to use CVs while you're already waiting
> on a CV is now in a separate patch 0002. I'm thinking of going ahead
> and committing those two.

Done. Of course nothing in the tree reaches any of this code yet.
It'll be exercised by cfbot in this thread and (I assume) Amul's
"ALTER SYSTEM READ { ONLY | WRITE }" thread.

> The 0003 patch to achieve $SUBJECT needs
> more discussion.

Rebased.

The more I think about it, the more I think that this approach is good
enough for an initial solution to the problem. It only affects
Windows, dropping tablespaces is hopefully rare, and it's currently
broken on that OS. That said, it's complex enough, and I guess more
to the point, enough of a compromise, that I'm hoping to get some
explicit consensus about that.

A better solution would probably have to be based on the sinval queue,
somehow. Perhaps with a new theory or rule making it safe to process
at every CFI(), or by deciding that we're prepared to have the operation
wait arbitrarily long until backends reach a point where it is known
to be safe (probably near ProcessClientReadInterrupt()'s call to
ProcessCatchupInterrupt()), or by inventing a new kind of lightweight
"sinval peek" that is safe to run at every CFI() point, being based on
reading (but not consuming!) the sinval queue and performing just
vfd-close of referenced smgr relations in this case. The more I think
about all that complexity for a super rare event on only one OS, the
more I want to just do it the stupid way and close all the fds.
Robert opined similarly in an off-list chat about this problem.
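
To make the "stupid way" concrete, here is a toy sketch of the idea, not
the patch itself: a dropping process bumps a shared barrier generation,
and each backend closes every cached fd the next time it reaches an
interrupt check. Every name here (ToyBackend, toy_emit_barrier(),
toy_check_for_interrupts() and so on) is hypothetical and only simulates
the behaviour in a single process. None of it is PostgreSQL's actual
ProcSignalBarrier or vfd API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VFDS 8

/* Hypothetical stand-in for one backend's cache of open file descriptors. */
typedef struct ToyBackend
{
    uint64_t seen_generation;   /* last barrier generation this backend absorbed */
    bool     fd_open[MAX_VFDS]; /* true if the slot currently holds an open fd */
} ToyBackend;

/* Assumed shared state; a real server would keep this in shared memory. */
static uint64_t toy_barrier_generation = 0;

/* Mark a slot as holding an open fd. */
static void
toy_open(ToyBackend *b, int slot)
{
    assert(slot >= 0 && slot < MAX_VFDS);
    b->fd_open[slot] = true;
}

/* Count how many fds the backend still holds open. */
static int
toy_count_open(const ToyBackend *b)
{
    int n = 0;

    for (int i = 0; i < MAX_VFDS; i++)
        if (b->fd_open[i])
            n++;
    return n;
}

/* Called by the process about to drop a tablespace: request a new barrier. */
static uint64_t
toy_emit_barrier(void)
{
    return ++toy_barrier_generation;
}

/* Called at every interrupt check: absorb a pending barrier, if any. */
static void
toy_check_for_interrupts(ToyBackend *b)
{
    if (b->seen_generation == toy_barrier_generation)
        return;                 /* nothing new to absorb */

    /* The blunt approach: close everything rather than work out which
     * fds belong to the tablespace being dropped. */
    for (int i = 0; i < MAX_VFDS; i++)
        b->fd_open[i] = false;

    b->seen_generation = toy_barrier_generation;
}

/* The dropper may proceed only once every backend has absorbed the barrier. */
static bool
toy_barrier_absorbed(const ToyBackend *backends, int n, uint64_t generation)
{
    for (int i = 0; i < n; i++)
        if (backends[i].seen_generation < generation)
            return false;
    return true;
}
```

In this sketch the dropper would call toy_emit_barrier(), wait until
toy_barrier_absorbed() returns true, and only then unlink files. The
cost is that unrelated fds get closed too and have to be reopened
later, which is the compromise described above.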

Attachment Content-Type Size
v5-0001-Use-a-global-barrier-to-fix-DROP-TABLESPACE-on-Wi.patch text/x-patch 8.1 KB
