From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Subject: | Re: B-tree parent pointer and checkpoints |
Date: | 2011-09-06 13:45:35 |
Message-ID: | 4E6623FF.1070500@enterprisedb.com |
Lists: | pgsql-hackers |
On 06.09.2011 16:40, Robert Haas wrote:
> On Tue, Sep 6, 2011 at 6:21 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> The way it would work is that on page split the right page is flagged with
>> a MISSING_DOWNLINK flag. When the downlink is inserted into the parent, the
>> flag is cleared in the same critical section as the WAL record for the
>> insertion of the parent is written. Normally, a backend would never see the
>> flag set, because the locks on the split pages are not released until the
>> parent record is written and the flag cleared again. But if inserting the
>> downlink fails for any reason, the next inserter or vacuum that steps on the
>> page can finish the split by inserting the downlink.
>>
>> Unfortunately that means holding the locks on the split pages longer than we
>> do at the moment. Currently they are released as soon as the parent page is
>> locked; with this change they would need to be held until the WAL record of
>> the downlink insertion is done. B-tree is so heavily used that I'm a bit
>> hesitant to sacrifice any concurrency there, but I don't think it would be
>> noticeable in practice.
>
> Do you really need to hold the page locks for all that time, or could
> you cheat? Like... release the locks on the split pages but then go
> back and reacquire them to clear the flag...
Hmm, there are two issues with that:
1. While you're not holding the locks on the child pages, someone can
step onto the page and see that the MISSING_DOWNLINK flag is set, and
try to finish the split for you.
2. If you don't hold the page locked while you clear the flag, someone
can start and finish a checkpoint after you've inserted the downlink,
and before you've cleared the flag. You end up in a scenario where the
flag is set, but the page in fact *does* have a downlink in the parent.
So, nope, we can't cheat.
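
As a rough illustration of the second issue, here is a toy Python model (not PostgreSQL code). It assumes that replaying the downlink-insertion WAL record also clears the flag, that no separate WAL record is written for clearing the flag, and that a checkpoint simply snapshots all buffers and advances the redo pointer. The names ToyCluster, finish_split and stale_flag_after_crash are made up for this sketch.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Page:
    missing_downlink: bool = False
    downlinks: set = field(default_factory=set)


class ToyCluster:
    """Toy model of shared buffers, on-disk pages and a WAL (hypothetical)."""

    def __init__(self):
        # State right after a page split: the right page is flagged and the
        # parent has no downlink for it yet.
        self.buffers = {"parent": Page(), "right": Page(missing_downlink=True)}
        self.disk = copy.deepcopy(self.buffers)
        self.wal = []        # WAL records, oldest first
        self.redo_from = 0   # index of the first record replayed after a crash

    def checkpoint(self):
        # Assumption: a checkpoint flushes every buffer and moves the redo
        # pointer past all WAL written so far.
        self.disk = copy.deepcopy(self.buffers)
        self.redo_from = len(self.wal)

    def crash_and_recover(self):
        # Unflushed buffer changes are lost; WAL after the redo pointer is replayed.
        self.buffers = copy.deepcopy(self.disk)
        for record in self.wal[self.redo_from:]:
            if record == "insert_downlink":
                self.buffers["parent"].downlinks.add("right")
                self.buffers["right"].missing_downlink = False


def finish_split(cluster, cheat):
    """Insert the downlink, then clear the flag, with or without holding the lock."""
    cluster.buffers["parent"].downlinks.add("right")
    cluster.wal.append("insert_downlink")
    if cheat:
        # Lock on the right page was released: a whole checkpoint fits in the gap.
        cluster.checkpoint()
    cluster.buffers["right"].missing_downlink = False
    if not cheat:
        # Lock was held: the checkpoint can only see the page after the flag is cleared.
        cluster.checkpoint()


def stale_flag_after_crash(cheat):
    """Return True if, after a crash, the flag is set although the downlink exists."""
    cluster = ToyCluster()
    finish_split(cluster, cheat)
    cluster.crash_and_recover()
    right, parent = cluster.buffers["right"], cluster.buffers["parent"]
    return right.missing_downlink and "right" in parent.downlinks


if __name__ == "__main__":
    print("cheating:", stale_flag_after_crash(cheat=True))
    print("holding the lock:", stale_flag_after_crash(cheat=False))
```

In this model the cheating variant recovers to a right page that still has the flag set while the parent already contains the downlink, because the checkpoint moved the redo pointer past the insertion record before the flag was cleared. Holding the lock keeps the two changes together from the checkpoint's point of view.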
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com