Re: Reviewing freeze map code

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reviewing freeze map code
Date: 2016-06-21 03:38:34
Message-ID: CAEepm=3wY8mVUEBNDKYq7N2cdtr=D+ydLBZ0LzJM3gcyM3kSWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 21, 2016 at 3:29 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Tue, Jun 21, 2016 at 1:03 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> Well, I think generally nobody seriously looked at actually refactoring
>> heap_update(), even though that'd be a good idea. But in this instance,
>> the problem seems relatively fundamental:
>>
>> We need to lock the origin page, to do visibility checks, etc. Then we
>> need to figure out the target page. Even disregarding toasting - which
>> we could be doing earlier with some refactoring - we're going to have to
>> release the page level lock, to lock them in ascending order. Otherwise
>> we'll risk kinda likely deadlocks.
>
> Can we consider to use some strategy to avoid deadlocks without releasing
> the lock on old page? Consider if we could have a mechanism such that
> RelationGetBufferForTuple() will ensure that it always returns a new buffer
> which has targetblock greater than the old block (on which we already held a
> lock). I think here tricky part is whether we can get anything like that
> from FSM. Also, there could be cases where we need to extend the heap when
> there were pages in heap with space available, but we have ignored them
> because there block number is smaller than the block number on which we have
> lock.

Doesn't that mean that over time, given a workload that does only or
mostly updates, your records tend to migrate further and further away
from the start of the file, leaving a growing unusable space at the
beginning, until you eventually need to CLUSTER/VACUUM FULL?

I was wondering about speculatively asking for a free page with a
lower block number than the origin page, if one is available, before
locking the origin page. Then after locking the origin page, if it
turns out you need a page but didn't get it earlier, asking for a free
page with a higher block number than the origin page.

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2016-06-21 03:46:02 Re: Reviewing freeze map code
Previous Message Michael Paquier 2016-06-21 03:32:01 Re: primary_conninfo missing from pg_stat_wal_receiver