Re: ERROR: invalid spinlock number: 0

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ERROR: invalid spinlock number: 0
Date: 2021-02-16 14:47:52
Message-ID: efca765d-a523-dd5d-549d-04f8d76837ac@oss.nttdata.com
Lists: pgsql-hackers

On 2021/02/16 15:50, Michael Paquier wrote:
> On Tue, Feb 16, 2021 at 12:43:42PM +0900, Fujii Masao wrote:
>> On 2021/02/16 6:28, Andres Freund wrote:
>>> So what? It's just about free to initialize a spinlock, whether it's
>>> using the fallback implementation or not. Initializing upon walsender
>>> startup adds a lot of complications, because e.g. somebody could already
>>> hold the spinlock because the previous walsender just disconnected, and
>>> they were looking at the stats.
>
> Okay.
>
>> Even if we initialize "writtenUpto" in WalRcvShmemInit(), WalReceiverMain()
>> still needs to initialize (reset to 0) by using pg_atomic_write_u64().
>
> Yes, you have to do that.
>
>> Basically we should not acquire new spinlock while holding another spinlock,
>> to shorten the spinlock duration. Right? If yes, we need to move
>> pg_atomic_read_u64() of "writtenUpto" after the release of spinlock in
>> pg_stat_get_wal_receiver.
>
> It would not matter much as a NULL tuple is returned as long as the
> WAL receiver information is not ready to be displayed. The only
> reason why all the fields are read before checking for
> ready_to_display is that we can be sure that everything is consistent
> with the PID. So reading writtenUpto before or after does not really
> matter logically. I would just move it after the check, as you did
> previously.

OK.

>
> + /*
> + * Read "writtenUpto" without holding a spinlock. So it may not be
> + * consistent with other WAL receiver's shared variables protected by a
> + * spinlock. This is OK because that variable is used only for
> + * informational purpose and should not be used for data integrity checks.
> + */
> What about the following?
> "Read "writtenUpto" without holding a spinlock. Note that it may not
> be consistent with the other shared variables of the WAL receiver
> protected by a spinlock, but this should not be used for data
> integrity checks."

Sounds good. Attached is the updated version of the patch.

>
> I agree that what has been done with MyProc->waitStart in 46d6e5f is
> not safe, and that initialization should happen once at postmaster
> startup, with a write(0) when starting the backend. There are two of
> them in proc.c, one in twophase.c. Do you mind if I add an open item
> for this one?

Yeah, please feel free to do that! BTW, I have already posted a patch
addressing that issue at [1].

[1] https://postgr.es/m/1df88660-6f08-cc6e-b7e2-f85296a2bdab@oss.nttdata.com

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment: bugfix_pg_stat_wal_receiver_v3.patch (text/plain, 3.0 KB)
