Re: CLOG extension

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CLOG extension
Date: 2012-05-04 12:59:58
Message-ID: CA+TgmoaCtq5yd8yR3GkMOj=g8xeAqQpjdy7VWCBrmeV4u7XfCA@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 4, 2012 at 3:35 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Thu, May 3, 2012 at 9:56 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, May 3, 2012 at 3:20 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> Your two paragraphs have roughly opposite arguments...
>>>
>>> Doing it every 32 pages would give you 30 seconds to complete the
>>> fsync, if you kicked it off when half way through the previous file -
>>> at current maximum rates. So there is utility in doing it in larger
>>> chunks.
>>
>> Maybe, but I'd like to try changing one thing at a time.  If we change
>> too much at once, it's likely to be hard to figure out where the
>> improvement is coming from.  Moving the task to a background process
>> is one improvement; doing it in larger chunks is another.  Those
>> deserve independent testing.
>
> You gave a good argument why background pre-allocation wouldn't work
> very well if we do it a page at a time. I believe you.

Your confidence is sort of gratifying, but in this case I believe it's
misplaced. On more careful analysis, it seems that ExtendCLOG() does
just two things: (1) evict a CLOG buffer and replace it with a zeroed
page representing the new page and (2) write an XLOG record for the
change. Apparently, "extending" CLOG doesn't actually involve
extending anything on disk at all. We rely on the future buffer
eviction to do that, which is surprisingly different from the way
relation extension is handled.
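
To make the point above concrete, here is a toy model in C. It is not the
PostgreSQL source: every name (ToyClog, toy_extend_clog, the slot count)
is made up for illustration, and the round-robin victim choice is a
stand-in for the real replacement policy. It also assumes the freshly
zeroed page is marked dirty in memory. The sketch only shows that
"extension" touches a buffer and the WAL counter, while the on-disk page
count grows later, when a dirty buffer is evicted.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 8192   /* assumption: default PostgreSQL block size */
#define TOY_NUM_SLOTS 4      /* hypothetical; chosen small to force evictions */

typedef struct {
    int  page_no;                      /* CLOG page held by this slot, -1 if empty */
    bool dirty;                        /* changed in memory, not yet written out */
    unsigned char data[TOY_PAGE_SIZE];
} ToySlot;

typedef struct {
    ToySlot slots[TOY_NUM_SLOTS];
    int next_victim;                   /* round-robin stand-in for the real policy */
    int pages_on_disk;                 /* length of the pretend on-disk file, in pages */
    int wal_records;                   /* number of pretend XLOG records emitted */
} ToyClog;

static void toy_init(ToyClog *c)
{
    memset(c, 0, sizeof(*c));
    for (int i = 0; i < TOY_NUM_SLOTS; i++)
        c->slots[i].page_no = -1;
}

/* Pretend write+fsync: this is the only place the "file" ever grows. */
static void toy_write_page(ToyClog *c, ToySlot *s)
{
    if (s->page_no + 1 > c->pages_on_disk)
        c->pages_on_disk = s->page_no + 1;
    s->dirty = false;
}

/* Steps (1) and (2) from the text: swap in a zeroed page, then log it. */
static int toy_extend_clog(ToyClog *c, int new_page_no)
{
    int slot = c->next_victim;
    ToySlot *s = &c->slots[slot];

    c->next_victim = (c->next_victim + 1) % TOY_NUM_SLOTS;

    if (s->dirty)                      /* evicting a dirty page forces I/O */
        toy_write_page(c, s);

    memset(s->data, 0, TOY_PAGE_SIZE); /* (1) zeroed page for the new page number */
    s->page_no = new_page_no;
    s->dirty = true;                   /* assumption: new page starts out dirty */

    c->wal_records++;                  /* (2) one XLOG record for the change */
    return slot;
}
```

In this model, calling toy_extend_clog() four times on a fresh ToyClog
leaves pages_on_disk at 0; only the fifth call, which has to evict the
dirty buffer holding page 0, makes the pretend file grow.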

So CLOG extension is normally fast, but occasionally something goes
wrong. So far I see two ways that can happen: (1) the WAL insertion
stalls because wal_buffers are full, and we're forced to wait for WAL
to be written (and perhaps fsync'd, since both are covered by the same
lock) or (2) the page we choose to evict happens to be dirty, and we
have to write+fsync it before repurposing it.
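
The two slow paths can be written down as a small classifier. Again this
is only a sketch in C with hypothetical names (ToyExtendState,
toy_extend_stalls, the STALL_* flags); it does not reflect how the
server tracks these conditions, and it says nothing about how long each
wait takes. It just encodes the claim that an extension is fast unless
the WAL buffers are full, the victim buffer is dirty, or both.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags: an extension can hit neither, either, or both. */
#define STALL_NONE          0u
#define STALL_WAL_FULL      1u   /* case (1): wait for WAL to be written out */
#define STALL_DIRTY_VICTIM  2u   /* case (2): write+fsync the evicted page */

typedef struct {
    int  wal_buffers_free;       /* hypothetical count of free WAL buffers */
    bool victim_dirty;           /* is the buffer chosen for eviction dirty? */
} ToyExtendState;

/* Return the set of reasons this extension would have to wait. */
static unsigned toy_extend_stalls(const ToyExtendState *st)
{
    unsigned reasons = STALL_NONE;

    if (st->wal_buffers_free <= 0)
        reasons |= STALL_WAL_FULL;

    if (st->victim_dirty)
        reasons |= STALL_DIRTY_VICTIM;

    return reasons;
}
```

With free WAL buffers and a clean victim the function returns
STALL_NONE, which is the "normally fast" case; the other three
combinations are the occasional stalls.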

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
