Re: unlogged tables vs. GIST

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unlogged tables vs. GIST
Date: 2013-01-23 15:30:42
Message-ID: CA+Tgmoa4gd7CoREPFoVKrBU=RKbWQJ3ZebmngJtTfYMGOhkSiQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 23, 2013 at 4:04 AM, Jeevan Chalke
<jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> Yes.
>
> I guess my earlier patch, which was directly incrementing
> ControlFile->unloggedLSN counter was the concern as it will take
> ControlFileLock several times.
>
> In this version of patch I did what Robert has suggested. At start of the
> postmaster, copying unloggedLSn value to XLogCtl, a shared memory struct.
> And
> in all access to unloggedLSN, using this shared variable using a SpinLock.
> And since we want to keep this counter persistent across clean shutdown,
> storing it in ControlFile before updating it.
>
> With this approach, we are updating ControlFile only when we shutdown the
> server, rest of the time we are having a shared memory counter. That means
> we
> are not touching pg_control every other millisecond or so. Also since we are
> not caring about crashes, XLogging this counter like OID counter is not
> required as such.

On a quick read-through this looks reasonable to me, but others may
have different opinions, and I haven't reviewed in detail.

One obvious oversight is that bufmgr.c needs a detailed comment
explaining the reason behind the change you are making there.
Otherwise, someone might think to undo it in the future, and that
would be bad. Perhaps add something like:

However, this rule does not apply for unlogged relations, which will
be lost after a crash anyway. Most unlogged relation pages do not
bear LSNs since we never emit WAL records for them, and therefore
flushing up through the buffer LSN would be useless, but harmless.
However, GiST indexes use LSNs internally to track page-splits, and
therefore temporary and unlogged GiST pages bear "fake" LSNs generated
by GetXLogRecPtrForUnloggedRel. It is unlikely but possible that the
fake-LSN counter used for unlogged relations could advance past the
WAL insertion point; and if it did happen, attempting to flush WAL
through that location would fail, with disastrous system-wide
consequences. To make sure that can't happen, skip the flush if the
buffer isn't permanent.

A related problem is that you're examining the buffer header flags
without taking the buffer-header spinlock. I'm not sure this can
actually matter, because we'll always hold the content lock on the
buffer at this point, which presumably precludes any operation that
might flip that bit, but it looks like the callers all have the buffer
header flags conveniently available, so maybe they should pass that
information down to FlushBuffer. That would also avoid accessing that
word in memory both before and after releasing the spinlock (all
callers have just released it) which can lead to memory-access stalls.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-01-23 15:37:23 Re: [PATCH] PQping Docs
Previous Message Tom Lane 2013-01-23 15:29:56 Re: WIP: index support for regexp search