Re: Think I see a btree vacuuming bug

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Think I see a btree vacuuming bug
Date: 2002-09-02 02:51:52
Message-ID: 200209020251.g822pqr21689@candle.pha.pa.us
Lists: pgsql-hackers


Any status on this? I know we discussed it but never settled on a
good solution. Should it go on the TODO list?

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Is this fixed, and if not, can I have some TODO text?
>
> It's not fixed. I'd like to fix it for 7.3, but I was hoping someone
> would think of a better way to fix it than I did ...
>
> regards, tom lane
>
> > ---------------------------------------------------------------------------
>
> > Tom Lane wrote:
> >> If a VACUUM running concurrently with someone else's indexscan were to
> >> delete the index tuple that the indexscan is currently stopped on, then
> >> we'd get a failure when the indexscan resumes and tries to re-find its
> >> place. (This is the infamous "my bits moved right off the end of the
> >> world" error condition.) What is supposed to prevent that from
> >> happening is that the indexscan retains a buffer pin (but not a read
> >> lock) on the index page containing the tuple it's stopped on. VACUUM
> >> will not delete any tuple until it can get a "super exclusive" lock on
> >> the page (cf. LockBufferForCleanup), and the pin prevents it from doing
> >> so.
> >>
> >> However: suppose that some other activity causes the index page to be
> >> split while the indexscan is stopped, and that the tuple it's stopped
> >> on gets relocated into the new righthand page of the pair. Then the
> >> indexscan is holding a pin on the wrong page --- not the one its tuple
> >> is in. It would then be possible for the VACUUM to arrive at the tuple
> >> and delete it before the indexscan is resumed.
> >>
> >> This is a pretty low-probability scenario, especially given the new
> >> index-tuple-killing mechanism (which renders it less likely that an
> >> indexscan will stop on a vacuum-able tuple). But it could happen.
> >>
> >> The only solution I've thought of is to make btbulkdelete acquire
> >> "super exclusive" lock on *every* leaf page of the index as it scans,
> >> rather than only locking the pages it actually needs to delete something
> >> from. And we'd need to tweak _bt_restscan to chain its pins (pin the
> >> next page to the right before releasing pin on the previous page).
> >> This would prevent a btbulkdelete scan from overtaking ordinary
> >> indexscans, and thereby ensure that it couldn't arrive at the tuple
> >> on which an indexscan is stopped, even with splitting.
> >>
> >> I'm somewhat concerned that the more stringent locking will slow down
> >> VACUUM a good deal when there's lots of concurrent activity, but I don't
> >> see another answer. Ideas anyone?
> >>
> >> regards, tom lane
>
>
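
For anyone skimming the thread, here is a toy model of the race Tom
describes and of his proposed fix. It is only a sketch under loud
assumptions: Page, split, refind, vacuum and demo are hypothetical names
invented for illustration, not PostgreSQL code; a "pin" is just a
counter; and where the real VACUUM would wait for a cleanup lock, the
model simply stops scanning.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Page:
    """Toy leaf page: index tuples, a pin count, and a right-sibling link."""
    tuples: List[str]
    pins: int = 0
    right: Optional["Page"] = None


def split(page: Page) -> Page:
    """Move the upper half of the tuples to a new right sibling (unpinned)."""
    mid = len(page.tuples) // 2
    new = Page(tuples=page.tuples[mid:], right=page.right)
    page.tuples = page.tuples[:mid]
    page.right = new
    return new


def refind(page: Page, item: str) -> Page:
    """Resume a scan: move right until item is found, chaining pins.

    The next page is pinned before the pin on the current page is released.
    Assumes the caller already holds a pin on the starting page.
    """
    while item not in page.tuples:
        nxt = page.right
        if nxt is None:
            raise LookupError(item)
        nxt.pins += 1      # pin the right sibling first ...
        page.pins -= 1     # ... then release the pin on the old page
        page = nxt
    return page


def vacuum(leftmost: Page, dead: Set[str], lock_every_page: bool) -> List[str]:
    """Scan leaf pages left to right and delete dead tuples.

    A page needs a cleanup lock (modeled as pins == 0) either only when it
    holds something to delete, or, with lock_every_page, always. Returns the
    tuples deleted before the scan had to wait on a pinned page.
    """
    deleted: List[str] = []
    page: Optional[Page] = leftmost
    while page is not None:
        if lock_every_page or any(t in dead for t in page.tuples):
            if page.pins > 0:
                break  # the real VACUUM would block here, not give up
            deleted += [t for t in page.tuples if t in dead]
            page.tuples = [t for t in page.tuples if t not in dead]
        page = page.right
    return deleted


def demo(lock_every_page: bool) -> List[str]:
    """Scan stops on "d", the page splits, then VACUUM runs."""
    left = Page(tuples=["a", "b", "c", "d"])
    left.pins += 1   # the indexscan keeps a pin on the page it stopped on
    split(left)      # "d" moves to the new, unpinned right page
    return vacuum(left, {"d"}, lock_every_page)


if __name__ == "__main__":
    print(demo(lock_every_page=False))  # ['d']: tuple deleted under the scan
    print(demo(lock_every_page=True))   # []: VACUUM cannot pass the pinned page
```

With lock_every_page=False the vacuum skips the pinned left page, since
it holds nothing to delete, and removes "d" from the right page. With
lock_every_page=True it has to stop at the pinned left page, so it never
overtakes the stopped scan.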

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
