Re: [RFC] LSN Map

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] LSN Map
Date: 2015-02-23 17:52:26
Message-ID: 54EB68DA.6000006@vmware.com
Lists: pgsql-hackers

On 01/13/2015 01:22 PM, Marco Nenciarini wrote:
> On 08/01/15 20:18, Jim Nasby wrote:
>> On 1/7/15, 3:50 AM, Marco Nenciarini wrote:
>>> The current implementation tracks only heap LSN. It currently does not
>>> track any kind of indexes, but this can be easily added later.
>>
>> Would it make sense to do this at a buffer level, instead of at the heap
>> level? That means it would handle both heap and indexes.
>> I don't know if LSN is visible that far down though.
>
> Where exactly are you thinking of handling it?

Dunno, but Jim's got a point. This is a maintenance burden to all
indexams, if they all have to remember to update the LSN map separately.
It needs to be done in some common code, like in PageSetLSN or
XLogInsert or something.

Aside from that, isn't this horrible from a performance point of view?
The patch doubles the buffer manager traffic, because any update to any
page will also need to modify the LSN map. This code is copied from the
visibility map code, but we got away with it there because the VM only
needs to be updated the first time a page is modified. Subsequent
updates will know the visibility bit is already cleared, and don't need
to access the visibility map.
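
To make the difference concrete, here is a toy model in C. It is not
PostgreSQL code: ToyMaps, toy_init and toy_modify_page are made-up names,
and the "all_visible" flag merely stands in for whatever lets a backend
know, without touching the map, that the bit is already cleared.

```c
#include <stdbool.h>
#include <stdint.h>

#define NPAGES 8

/* Toy stand-ins for illustration only, not PostgreSQL structures. */
typedef struct {
    bool     all_visible[NPAGES]; /* flag that can be checked without a map access */
    uint64_t page_lsn[NPAGES];    /* hypothetical LSN map entry per heap page */
    unsigned vm_accesses;         /* times the visibility map had to be touched */
    unsigned lsnmap_accesses;     /* times the LSN map had to be touched */
} ToyMaps;

static void toy_init(ToyMaps *m)
{
    for (int i = 0; i < NPAGES; i++) {
        m->all_visible[i] = true;
        m->page_lsn[i] = 0;
    }
    m->vm_accesses = 0;
    m->lsnmap_accesses = 0;
}

/* Model one modification of heap page 'page', WAL-logged at 'lsn'. */
static void toy_modify_page(ToyMaps *m, int page, uint64_t lsn)
{
    /* VM-style: only the first modification has to touch the map. */
    if (m->all_visible[page]) {
        m->all_visible[page] = false;
        m->vm_accesses++;
    }

    /* LSN-map-style, as described above: every modification touches the map. */
    m->page_lsn[page] = lsn;
    m->lsnmap_accesses++;
}
```

Modifying the same page 100 times in this model touches the visibility
map once and the LSN map 100 times, which is the extra traffic in question.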

And scalability: Whether you store one value for every N pages, or the
LSN of every page, this is going to have a huge effect of focusing
contention on the LSN map pages. Currently, if ten backends operate on ten
different heap pages, for example, they can run in parallel. There will
be some contention on the WAL insertions (much less in 9.4 than before).
But with this patch, they will all fight for the exclusive lock on the
single LSN map page.
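
A rough sketch of the arithmetic behind that, under assumptions that are
mine and not taken from the patch: one 8-byte LSN per heap page, 8192-byte
map pages, and no page header overhead. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for illustration only; the patch may differ. */
#define TOY_BLCKSZ               8192u
#define TOY_ENTRY_SIZE           8u
#define TOY_ENTRIES_PER_MAP_PAGE (TOY_BLCKSZ / TOY_ENTRY_SIZE)

/* Which map page holds the entry for a given heap block? */
static uint32_t toy_map_page_for(uint32_t heap_block)
{
    return heap_block / TOY_ENTRIES_PER_MAP_PAGE;
}

/* Count how many distinct map pages a set of heap blocks lands on. */
static unsigned toy_distinct_map_pages(const uint32_t *heap_blocks, unsigned n)
{
    unsigned distinct = 0;

    for (unsigned i = 0; i < n; i++) {
        bool seen = false;

        for (unsigned j = 0; j < i; j++) {
            if (toy_map_page_for(heap_blocks[j]) == toy_map_page_for(heap_blocks[i]))
                seen = true;
        }
        if (!seen)
            distinct++;
    }
    return distinct;
}
```

With these numbers, heap blocks 0 through 1023 all share map page 0, so
ten backends working on ten neighbouring heap pages end up needing the
same map page.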

You'll need to find a way to not update the LSN map on every update. For
example, only update the LSN page on the first update after a checkpoint
(although that would still have a big contention focusing effect right
after a checkpoint).
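
As a minimal sketch of that idea (hypothetical names throughout, and it
assumes the caller knows the redo pointer of the last checkpoint and that
new_lsn is always past it):

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ToyLsn;

/* Hypothetical LSN map entry covering one heap page (or range of pages). */
typedef struct {
    ToyLsn   map_lsn;     /* LSN last recorded in the map */
    unsigned map_writes;  /* how often the map page had to be modified */
} ToyEntry;

/*
 * Record an update at 'new_lsn', but only touch the map if the entry has
 * not been updated since the last checkpoint started at 'checkpoint_redo'.
 * Returns true if the map was modified.
 */
static bool toy_note_update(ToyEntry *e, ToyLsn checkpoint_redo, ToyLsn new_lsn)
{
    if (e->map_lsn >= checkpoint_redo)
        return false;   /* already recorded since the last checkpoint */

    e->map_lsn = new_lsn;
    e->map_writes++;
    return true;
}
```

Note that the entry then holds the LSN of the first update after the
checkpoint, not the latest one, so it is only accurate to checkpoint
granularity. Whether that is good enough for the intended use is a
separate question this sketch does not answer.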

- Heikki
