Re: B-tree descend for insertion locking

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: B-tree descend for insertion locking
Date: 2014-03-19 11:35:42
Message-ID: CA+U5nMLJ4pvAH8bhAn5HTmWu8OqxPVx83-tRVu4q=DcO86jHjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 18 March 2014 11:12, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> When inserting into a B-tree index, all the pages are read-locked when
> descending the tree. When we reach the leaf page, the read-lock is exchanged
> for a write-lock.
>
> There's nothing wrong with that, but why don't we just directly grab a
> write-lock on the leaf page? When descending, we know the level we're on,
> and what level the child page is. The only downside I can see is that we
> would unnecessarily hold a write-lock when a read-lock would suffice, if the
> page was just split and we have to move right. But that seems like a really
> bad bet - hitting the page when it was just split is highly unlikely.

Sounds good.

Grabbing write lock directly will reduce contention on the buffer, not
just reduce the code path.

If we have a considerable number of duplicates we would normally step
thru until we found a place to insert. Presumably that will happen
with write lock enabled, rather than read lock. Would this slow down
the insertion of highly duplicate keys under concurrent load? i.e. is
this a benefit for nearly-unique but not for other cases?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sandro Santilli 2014-03-19 11:43:25 Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Previous Message Greg Stark 2014-03-19 10:57:10 Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)