Re: Concurrent-update problem in bufmgr.c

From: Hiroshi Inoue <Inoue(at)tpf(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Concurrent-update problem in bufmgr.c
Date: 2000-09-25 00:26:38
Message-ID: 39CE9BBD.3AEFEEDF@tpf.co.jp
Lists: pgsql-hackers

Tom Lane wrote:

[snip]

>
> The window of vulnerability is considerably wider in 7.0 than in prior
> releases, because in prior releases *any* transaction commit will write
> all dirty pages. In 7.0 the dirtied page will not get written out until
> we commit a transaction that modified that particular page (or decide to
> recycle the buffer). The odds of seeing a problem are still pretty
> small, but the risk is definitely there.
>
> I believe the correct fix for this problem is for bufmgr.c to grab
> a read lock (BUFFER_LOCK_SHARE) on any page that it is writing out.
> A read lock is sufficient since there's no need to prevent other
> backends from reading the page, we just need to prevent them from
> changing it during the I/O.
>
> Comments anyone?

Your proposal seems to be almost the same as what I posted 4 months ago
("smgrwrite() without LockBuffer(was RE: ...)").
Maybe Vadim will take care of it in the implementation of WAL.
The following was Vadim's reply to you and me.

>
> > "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> writes:
> > > As far as I can see, PostgreSQL doesn't call LockBuffer() before
> > > calling smgrwrite(). This seems to mean that smgrwrite()
> > > could write buffers to disk which are being changed by
> > > another backend. If the other backend were aborted for
> > > some reason, the buffer page would remain half-changed.
> >
> > Hmm ... looks fishy to me too. Seems like we ought to hold
> > BUFFER_LOCK_SHARE on the buffer while dumping it out. It
> > wouldn't matter under normal circumstances, but as you say
> > there could be trouble if the other backend crashed before
> > it could mark the buffer dirty again, or if we had a system
> > crash before the dirtied page got written again.
>
> Well, known issue. Buffer latches were implemented only in 6.5
> and there was no time to think hard about subj -:)
> Yes, we have to shlock the buffer before writing, and this is what
> bufmgr must do for WAL anyway (to ensure that the last buffer
> changes are already logged)... but please check that the buffer is not
> exc-locked by the backend itself somewhere before smgrwrite()...
>
> Vadim
>
>

Regards.

Hiroshi Inoue
