Re: Brain dump: btree collapsing

From: "Curtis Faith" <curtis(at)galtcapital(dot)com>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'Hannu Krosing'" <hannu(at)tm(dot)ee>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Brain dump: btree collapsing
Date: 2003-02-14 20:41:06
Message-ID: 001801c2d469$66ff09c0$a200a8c0@curtislaptop
Lists: pgsql-hackers

Tom Lane wrote:
> Sorry, that *does* create deadlocks. Remember the deletion
> process is going to need superexclusive lock (not only a
> BT_WRITE buffer lock, but no concurrent pins) in order to be
> sure there are no scans stopped on the page it wants to
> delete. (In the above pseudocode, the fact that you still
> hold a pin on the previously-current page makes you look
> exactly like someone who's in the middle of scanning that
> page, rather than trying to leave it.) The same would be
> true of both pages if it's trying to merge.

First, recall that under my very first proposal, the VACUUM process
would try to acquire locks but NOT WAIT. The merge would proceed only if
superexclusive locks could be obtained on all pages; otherwise it would
DROP all the locks, sleep and retry. This would prevent the VACUUM merge
from participating in deadlocks, since it would never wait while holding
any lock.
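
The no-wait rule above can be sketched roughly as follows. This is only an
illustration in Python, not PostgreSQL code: the names try_lock_all and
merge_with_retry, the retry limit, and the sleep interval are all made up
for the example, and a plain threading.Lock stands in for a superexclusive
page lock (it does not model the "no concurrent pins" condition).

```python
import threading
import time


def try_lock_all(locks):
    """Try to take every lock without waiting.

    On the first failure, release whatever was taken and return False,
    so the caller never waits while holding a lock.
    """
    held = []
    for lock in locks:
        if lock.acquire(blocking=False):
            held.append(lock)
        else:
            for taken in reversed(held):
                taken.release()
            return False
    return True


def merge_with_retry(locks, do_merge, max_attempts=5, sleep_seconds=0.01):
    """Run do_merge() only once all locks are held; otherwise sleep and retry.

    max_attempts and sleep_seconds are hypothetical tuning knobs.
    Returns True if the merge ran, False if every attempt failed.
    """
    for _ in range(max_attempts):
        if try_lock_all(locks):
            try:
                do_merge()
            finally:
                for lock in reversed(locks):
                    lock.release()
            return True
        # Nothing is held at this point, so sleeping cannot cause a deadlock.
        time.sleep(sleep_seconds)
    return False
```

Because the merge side sleeps only while holding nothing, it can be delayed
or starved by busy scans, but it cannot be part of a wait cycle.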

I was assuming that here as well but did not explicitly restate this,
sorry.

The scan also needs to drop the mutex if it cannot get the lock on the
next page, after first placing itself in the waiter list for that page.

This waiter-list entry will prevent a VACUUM that wants to merge from
gaining the superexclusive lock until after the scan has finished, since
the scan's waiting lock request will block it and, as you point out, so
will the pin.

The mutex only needs to guard the crossing of the pages, so the pin
existing outside the mutex won't cause a problem.

> "Stored in the index"? And how will you do that portably?

Sorry for the lack of rigorous language. I meant that there would be one
mutex per index, kept in the header or in the internal data structures
associated with each index, probably in the same structure where the
root node reference for each btree is stored.

- Curtis
