Re: dropdb --force

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Ryan Lambert <ryan(at)rustprooflabs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Anthony Nowocien <anowocien(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Filip Rembiałkowski <filip(dot)rembialkowski(at)gmail(dot)com>
Subject: Re: dropdb --force
Date: 2019-10-25 02:55:43
Message-ID: CAA4eK1JJPChUam3zCzarohdTWo+BueFVMfE4c5GwbUcM5gEhSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 24, 2019 at 8:22 PM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
> čt 24. 10. 2019 v 11:10 odesílatel Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> napsal:
>>
>> While making some changes in the patch, I noticed that in
>> TerminateOtherDBBackends, there is a race condition where after we
>> release the ProcArrayLock, the backend process to which we decided to
>> send a signal exits by itself and the same pid can be assigned to
>> another backend which is connected to some other database. This leads
>> to terminating a wrong backend. I think we have some similar race
>> condition at few other places like in pg_terminate_backend(),
>> ProcSleep() and CountOtherDBBackends(). I think here the risk is a
>> bit more because there could be a long list of pids.
>>
>> One idea could be that we write a new function similar to IsBackendPid
>> which takes dbid and ensures that pid belongs to that database and use
>> that before sending kill signal, but still it will not be completely
>> safe. But, I think it can be closer to cases like we already have in
>> code.
>>
>> Another possible idea could be to use the SendProcSignal mechanism
>> similar to how we have used it in CancelDBBackends() to allow the
>> required backends to exit by themselves. This might be safer.
>>
>> I am not sure if we can entirely eliminate this race condition and
>> whether it is a good idea to accept such a race condition even though
>> it exists in other parts of code. What do you think?
>>
>> BTW, I have added/revised some comments in the code and done few other
>> cosmetic changes, the result of which is attached.
>
>
> Tomorrow I'll check variants that you mentioned.
>
> We sure so there are not any new connect to removed database, because we hold lock there.
>

Right.

> So check if oid db is same should be enough.
>

We can do this before sending a kill signal but is it enough? Because
as soon as we release ProcArrayLock anytime the other process can exit
and a new process can use its pid. I think this is more of a
theoretical risk and might not be easy to hit, but still, we can't
ignore it.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2019-10-25 03:44:16 Re: [HACKERS] Block level parallel vacuum
Previous Message Andres Freund 2019-10-25 02:46:36 Re: EXPLAIN BUFFERS and I/O timing accounting questions