Re: Relation extension scalability

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Relation extension scalability
Date: 2015-07-19 16:07:39
Message-ID: 20150719160739.GJ25610@awork2.anarazel.de
Lists: pgsql-hackers

On 2015-07-19 11:56:47 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2015-07-19 11:28:25 -0400, Tom Lane wrote:
> >> At this point session 1 will go and create page 44, won't it, and you
> >> just wasted a page.
>
> > My local code now recognizes that case and uses the page. We just need
> > to do a PageIsNew().
>
> Er, what? How can you tell whether an all-zero page was or was not
> just written by another session?

The check is only done while holding the IO lock on the relevant page
(we have to hold that anyway), after reading it in ourselves, just before
setting BM_VALID. As we can only get to that point when there wasn't any
other entry for the page in the buffer table, that guarantees that no
other backend is currently extending into that page. Others might
wait to read it, but those will wait behind the IO lock.

The situation the read() protects us against is that two backends try to
extend to the same block, but after one of them succeeded the buffer is
written out and reused for an independent page. So there is no in-memory
state telling the slower backend that the page has already been used.
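
To make the ordering concrete, here is a minimal single-process sketch of
the check described above. It is not the actual patch. Every name
prefixed with sketch_ or Sketch is hypothetical, the page header is cut
down to two fields, and two plain flags stand in for the IO lock and
BM_VALID. The one thing taken from PostgreSQL is the idea behind
PageIsNew(): a page whose pd_upper is zero has never been initialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ  8192
#define SKETCH_NBLOCKS 4

/* Hypothetical, simplified page header; only pd_upper matters here. */
typedef struct SketchPageHeader
{
    uint16_t pd_lower;
    uint16_t pd_upper;
} SketchPageHeader;

/* Hypothetical buffer: two flags stand in for the IO lock and BM_VALID. */
typedef struct SketchBuffer
{
    bool          io_in_progress;
    bool          valid;
    unsigned char page[SKETCH_BLCKSZ];
} SketchBuffer;

/* In-memory stand-in for the relation file; zero-filled at startup. */
static unsigned char sketch_disk[SKETCH_NBLOCKS][SKETCH_BLCKSZ];

/* Same idea as PageIsNew(): pd_upper == 0 means never initialized. */
bool
sketch_page_is_new(const unsigned char *page)
{
    SketchPageHeader hdr;

    memcpy(&hdr, page, sizeof(hdr));
    return hdr.pd_upper == 0;
}

/* Simulate the faster backend: initialize the block and write it out. */
void
sketch_write_initialized_page(unsigned blkno)
{
    SketchPageHeader hdr = {
        (uint16_t) sizeof(SketchPageHeader),
        (uint16_t) SKETCH_BLCKSZ
    };

    assert(blkno < SKETCH_NBLOCKS);
    memcpy(sketch_disk[blkno], &hdr, sizeof(hdr));
}

/*
 * Simulate the slower backend: with the "IO lock" held, read the block
 * in, test whether it is still new, and only then mark the buffer valid.
 * Returns true if the block is still unused and may be extended into.
 */
bool
sketch_read_and_check_new(SketchBuffer *buf, unsigned blkno)
{
    bool is_new;

    assert(blkno < SKETCH_NBLOCKS);
    assert(!buf->valid);            /* no other entry for this page yet */

    buf->io_in_progress = true;     /* take the "IO lock" */
    memcpy(buf->page, sketch_disk[blkno], SKETCH_BLCKSZ);   /* the read() */
    is_new = sketch_page_is_new(buf->page);
    buf->valid = true;              /* stand-in for setting BM_VALID */
    buf->io_in_progress = false;    /* release the "IO lock" */

    return is_new;
}
```

In this model, calling sketch_read_and_check_new() on an untouched block
returns true, so the page can be used. If sketch_write_initialized_page()
ran on that block first (the faster backend extended it and the buffer
was since written out and evicted), the same call returns false, and that
on-disk state is the only thing telling the slower backend to move on.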

Andres
