Re: log_newpage header comment

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: log_newpage header comment
Date: 2012-06-08 12:03:17
Message-ID: CA+Tgmob5-7D4RA23ErhTP9piTRanswFjhnutoXtZBTuXJE_4uQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 7, 2012 at 8:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> It seems that in implementing ginbuildempty(), I falsified the first
>> "note" in the header comment for log_newpage():
>
>>  * Note: all current callers build pages in private memory and write them
>>  * directly to smgr, rather than using bufmgr.  Therefore there is no need
>>  * to pass a buffer ID to XLogInsert, nor to perform MarkBufferDirty within
>>  * the critical section.
>
>> 1. Considering that we're logging the entire page, is it necessary (or
>> even desirable) to include the buffer ID in the rdata structure?  If
>> so, why?  To put that another way, does my abuse of log_newpage
>> constitute a bug in gistbuildempty()?
>
> AFAICS, not passing the buffer ID to XLogInsert is not an issue, since
> we are logging the whole page in any case.  However, failing to perform
> MarkBufferDirty within the critical section definitely is an issue.

However, I'm not failing to do that: there's an enclosing critical section.

>> 2. Should we add a new function that does the same thing as
>> log_newpage for a shared buffer?  I'm imagining that the signature
>> would be:
>
> Either that or rethink building this data in shared buffers.  What's the
> point of that, exactly, for a page that we are most certainly not going
> to use in normal operation?

Well, right. I mean, I think the current implementation is mostly
design-by-accident, and my first thought was the same as yours: don't
clutter shared_buffers with pages we don't actually need for anything.
But there is a down side to doing it the other way, too. Look at
what btbuildempty does:

/* Write the page. If archiving/streaming, XLOG it. */
smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
if (XLogIsNeeded())
log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
BTREE_METAPAGE, metapage);

/*
* An immediate sync is require even if we xlog'd the page, because the
* write did not go through shared_buffers and therefore a concurrent
* checkpoint may have move the redo pointer past our xlog record.
*/
smgrimmedsync(index->rd_smgr, INIT_FORKNUM);

So we have to write the page out immediately, then we have to XLOG it,
and then even though we've XLOG'd it, we still have to fsync it
immediately. It might be better to go through shared_buffers, which
would allow the write and fsync to happen in the background. The
cache-poisoning effect is probably trivial - even on a system with
32MB of shared_buffers there are thousands of pages, and init forks
are never going to consume more than a handful unless you're creating
an awful lot of unlogged relations very quickly - in which case maybe
avoiding the immediate fsyncs is a more pressing concern. On the
other hand, we haven't had any complaints about the way it works now,
either. What's your thought?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-06-08 12:48:07 Re: Inconsistency in libpq connection parameters, and extension thereof
Previous Message Magnus Hagander 2012-06-08 11:53:56 Re: Inconsistency in libpq connection parameters, and extension thereof