Re: unlogged tables vs. GIST

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables vs. GIST
Date: 2013-01-15 21:54:51
Message-ID: CA+Tgmoa+=+Lv0ppkuB2YSAh-eECJJAapbqaqCQiB5xX091g1uA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 15, 2013 at 4:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I think that might be acceptable from a performance point of view -
>> after all, if the index is unlogged, you're saving the cost of WAL -
>> but I guess I still prefer a generic solution to this problem (a
>> generalization of GetXLogRecPtrForTemp) rather than a special-purpose
>> solution based on the nitty-gritty of how GiST uses these values.
>> What's the difference between storing this value in pg_control and,
>> say, the OID counter?
>
> Well, the modularity argument is that GiST shouldn't have any special
> privileges compared to a third-party index AM. (I realize that
> third-party AMs already have problems plugging into WAL replay, but
> that doesn't mean we should create more problems.)
>
> We could possibly dodge that objection by regarding the global counter
> as some sort of generic "unlogged operation counter", available to
> anybody who needs it. It would be good to have a plausible example of
> something else needing it, but assume somebody can think of one.
>
> The bigger issue is that the reason we don't have to update pg_control
> every other millisecond is that the OID counter is capable of tracking
> its state between checkpoints without touching pg_control, that is it
> can emit WAL records to track its increments. I think that we should
> insist that GiST do likewise, even if we give it some space in
> pg_control. Remember that pg_control is a single point of failure for
> the database, and the more often it's written to, the more likely it is
> that something will go wrong there.
>
> So I guess what would make sense to me is that we invent an "unlogged
> ops counter" that is managed exactly like the OID counter, including
> having WAL records that are treated as consuming some number of values
> in advance. If it's 64 bits wide then the WAL records could safely be
> made to consume quite a lot of values, like a thousand or so, thus
> reducing the actual WAL I/O burden to about nothing.
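
For illustration, the prefetching counter proposed in the quoted text might look roughly like the sketch below. It is not taken from any patch: the names OpsCounterSketch, WalSketch, ops_counter_next, ops_counter_recover and the prefetch size of 1000 are all assumptions made up for this example, and the "WAL" is just an in-memory struct.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed prefetch size ("a thousand or so" in the quoted text). */
#define SKETCH_PREFETCH 1000

/* Hypothetical stand-in for the WAL: remembers only the latest counter record. */
typedef struct WalSketch {
    uint64_t last_logged_limit;  /* limit stored in the newest counter record */
    unsigned records_written;    /* number of counter records emitted so far */
} WalSketch;

/* Hypothetical in-memory state of the unlogged ops counter. */
typedef struct OpsCounterSketch {
    uint64_t next_value;    /* next value to hand out */
    uint64_t logged_limit;  /* values below this are already covered by WAL */
} OpsCounterSketch;

/* Hand out one value, emitting a "WAL record" only when the prefetched
 * range is used up, so one record covers SKETCH_PREFETCH values. */
static uint64_t ops_counter_next(OpsCounterSketch *c, WalSketch *wal)
{
    if (c->next_value >= c->logged_limit) {
        c->logged_limit = c->next_value + SKETCH_PREFETCH;
        wal->last_logged_limit = c->logged_limit;
        wal->records_written++;
    }
    assert(c->next_value < c->logged_limit);
    return c->next_value++;
}

/* After a crash, resume from the last logged limit. Some values are
 * skipped, but none that may have been handed out is ever reused. */
static void ops_counter_recover(OpsCounterSketch *c, const WalSketch *wal)
{
    c->next_value = wal->last_logged_limit;
    c->logged_limit = wal->last_logged_limit;
}
```

In this sketch, handing out 2500 values produces only three records, and recovery restarts the counter at 3000, past every value issued before the crash.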

I didn't look at the actual patch (silly me?), but the only time you
need to update the control file is when writing the shutdown
checkpoint just before stopping the database server. If the server
crashes, it's OK to roll the value back to some smaller value, because
unlogged relations will be reset anyway. And while the server is
running, the information can live in a shared memory copy protected by
a spinlock. So the control file traffic should be limited to once per
server lifetime, AFAICS.
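
A minimal sketch of that scheme, under loud assumptions: ControlFileSketch, SharedCounterSketch and the sketch_* functions are hypothetical names invented here, not anything in the tree, and a C11 atomic_flag stands in for the spinlock. It only shows the shape of the idea: load at startup, bump in shared memory, write back once at the shutdown checkpoint.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the on-disk control file. */
typedef struct ControlFileSketch {
    uint64_t unlogged_counter;  /* value saved by the last shutdown checkpoint */
    unsigned write_count;       /* how often the counter was written to "disk" */
} ControlFileSketch;

/* Hypothetical stand-in for the shared memory copy. */
typedef struct SharedCounterSketch {
    atomic_flag lock;           /* plays the role of the spinlock */
    uint64_t    next_value;
} SharedCounterSketch;

static void sketch_lock(SharedCounterSketch *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;  /* spin until the flag is released */
}

static void sketch_unlock(SharedCounterSketch *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Server start (clean or after a crash): take whatever the control file
 * holds. After a crash this may be an older, smaller value, which is
 * acceptable because unlogged relations are reset anyway. */
static void sketch_startup(SharedCounterSketch *s, const ControlFileSketch *cf)
{
    s->next_value = cf->unlogged_counter;
}

/* Normal operation: touch shared memory only, never the control file. */
static uint64_t sketch_next_value(SharedCounterSketch *s)
{
    sketch_lock(s);
    uint64_t v = s->next_value++;
    sketch_unlock(s);
    return v;
}

/* Shutdown checkpoint: the single control file write per server lifetime. */
static void sketch_shutdown_checkpoint(SharedCounterSketch *s,
                                       ControlFileSketch *cf)
{
    sketch_lock(s);
    assert(s->next_value >= cf->unlogged_counter);
    cf->unlogged_counter = s->next_value;
    sketch_unlock(s);
    cf->write_count++;
}
```

Restarting without calling sketch_shutdown_checkpoint models a crash: the counter falls back to the last saved value, and write_count shows that the control file was not touched in between.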

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
