Re: update on global temporary and unlogged tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: update on global temporary and unlogged tables
Date: 2010-09-13 02:49:32
Message-ID: AANLkTi=jDv=5FE_KpUysNQrCR687h2ei0a-E0MU_ZLrS@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 6, 2010 at 10:55 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> 3. With respect to unlogged tables, the major obstacle seems to be
> figuring out a way for these to get automatically truncated at startup
> time.  As with temporary table cleanup in general, the problem here is
> that you can't do the obvious thing of iterating through pg_class and
> truncating each unlogged table you find without greatly complicating
> the startup sequence.  However, I think there's a fairly easy way
> around this problem: truncating a table basically means removing all
> segments and relation forks other than the first segment of the main
> fork, and truncating that one to zero length.  So we could do it the
> same way we clean up temporary files - namely, based on the file name
> - if we made the filenames for unlogged tables distinguishable from
> those for regular and temporary tables.  What I'm thinking about is
> reserving a backend ID of -2 for this purpose via some suitable
> constant definition, just as -1 (InvalidBackendId) represents a
> permanent table in this context.
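
The cleanup rule quoted above can be sketched roughly as follows. This is only an illustration: the SKETCH_* names and sketch_startup_action() are hypothetical, the value -2 is the proposed (not existing) reserved backend ID, and fork number 0 is assumed to be the main fork.

```c
#include <assert.h>

/* Hypothetical constants: -1 mirrors InvalidBackendId (permanent table),
 * -2 is the value proposed above for unlogged tables. */
#define SKETCH_INVALID_BACKEND_ID   (-1)
#define SKETCH_UNLOGGED_BACKEND_ID  (-2)

typedef enum SketchCleanupAction
{
    SKETCH_LEAVE_ALONE,       /* not an unlogged-table file; handled elsewhere */
    SKETCH_TRUNCATE_TO_ZERO,  /* first segment of the main fork */
    SKETCH_UNLINK             /* any other segment or fork */
} SketchCleanupAction;

/*
 * Decide what startup cleanup would do with one relation file, given the
 * backend ID, fork number, and segment number recovered from its file name.
 * Fork number 0 is assumed to be the main fork.
 */
SketchCleanupAction
sketch_startup_action(int backend_id, int fork_number, int segment_number)
{
    assert(fork_number >= 0 && segment_number >= 0);

    if (backend_id != SKETCH_UNLOGGED_BACKEND_ID)
        return SKETCH_LEAVE_ALONE;
    if (fork_number == 0 && segment_number == 0)
        return SKETCH_TRUNCATE_TO_ZERO;
    return SKETCH_UNLINK;
}
```

The point is that the decision needs nothing from pg_class, only what can be parsed out of the file name.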

I tried this approach and got fairly far with it, but ran into a snag
in the buffer manager. It's fairly obvious that the buffer manager
has to know whether a particular buffer is from an unlogged relation
or not; for example, FlushBuffer() should skip the XLOG flush for an
unlogged buffer, and must pass the correct backend ID to smgropen().
So my first thought was just to define a bit BM_IS_UNLOGGED, with the
obvious interpretation.
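
To make the "obvious interpretation" concrete, here is a minimal sketch of what such a flag would control. It is not the actual buffer manager code: SketchBufferDesc, the bit position chosen for BM_IS_UNLOGGED, and both helper functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position; a real patch would pick a free BM_* bit. */
#define BM_IS_UNLOGGED              (1u << 8)

/* Hypothetical backend ID values: -1 permanent, -2 unlogged (as proposed). */
#define SKETCH_INVALID_BACKEND_ID   (-1)
#define SKETCH_UNLOGGED_BACKEND_ID  (-2)

/* Simplified stand-in for a buffer descriptor; only the flags matter here. */
typedef struct SketchBufferDesc
{
    uint16_t flags;
} SketchBufferDesc;

/* True if WAL has to be flushed before this buffer may be written out. */
bool
sketch_needs_xlog_flush(const SketchBufferDesc *buf)
{
    assert(buf != NULL);
    return (buf->flags & BM_IS_UNLOGGED) == 0;
}

/* Backend ID a FlushBuffer()-like routine would hand to smgropen(). */
int
sketch_backend_for_smgropen(const SketchBufferDesc *buf)
{
    assert(buf != NULL);
    return (buf->flags & BM_IS_UNLOGGED)
        ? SKETCH_UNLOGGED_BACKEND_ID
        : SKETCH_INVALID_BACKEND_ID;
}
```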

That's not quite good enough, though, because GetNewRelFileNode()
doesn't guarantee that the OID chosen is absolutely unique; it just
guarantees that it's unique within the space defined by the database
ID and backend ID. So it's possible that you could have a logged
relation and an unlogged relation with the same value for
pg_class.relfilenode, which means that the buffer manager can't store
the unlogged status as a random bit someplace, but actually needs to
have it as part of the buffer tag (otherwise, a buffer descriptor hash
table lookup might find the wrong buffer). You could maybe work
around this problem by having GetNewRelFileNode(), when generating an
OID for either a permanent or unlogged relation, check that the OID
isn't in use for either of those things already, but that requires an
extra system call, so it doesn't seem ideal. I'd be willing to go
that route if people think it's cheap enough and more desirable for
some reason, though.
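
The collision can be illustrated with a simplified tag comparison. The structs below flatten the real buffer tag into plain fields, and the "unlogged" member is a hypothetical addition; none of this is the actual definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified tag with no unlogged information in it. */
typedef struct SketchTagPlain
{
    uint32_t spcNode;
    uint32_t dbNode;
    uint32_t relNode;
    int32_t  forkNum;
    uint32_t blockNum;
} SketchTagPlain;

/* Same tag, but with the unlogged status carried as part of the key. */
typedef struct SketchTagWithFlag
{
    uint32_t spcNode;
    uint32_t dbNode;
    uint32_t relNode;
    int16_t  forkNum;
    uint16_t unlogged;   /* hypothetical: 1 if the relation is unlogged */
    uint32_t blockNum;
} SketchTagWithFlag;

bool
sketch_plain_equal(const SketchTagPlain *a, const SketchTagPlain *b)
{
    assert(a != NULL && b != NULL);
    return a->spcNode == b->spcNode && a->dbNode == b->dbNode &&
           a->relNode == b->relNode && a->forkNum == b->forkNum &&
           a->blockNum == b->blockNum;
}

bool
sketch_flag_equal(const SketchTagWithFlag *a, const SketchTagWithFlag *b)
{
    assert(a != NULL && b != NULL);
    return a->spcNode == b->spcNode && a->dbNode == b->dbNode &&
           a->relNode == b->relNode && a->forkNum == b->forkNum &&
           a->unlogged == b->unlogged && a->blockNum == b->blockNum;
}
```

With the plain tag, block 0 of a logged relation and block 0 of an unlogged relation that happen to share a relfilenode compare equal, so a hash lookup cannot tell them apart; with the flag in the tag they are distinct keys.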

So I went looking for bit-space in the buffer tag and quickly found
some. ForkNumber is an enum, which I suppose means a 32-bit integer,
but we've only got three forks right now and it's hard to imagine more
than a handful of additional ones, so what I'm tempted to do is change
this from an enum to a 2-byte integer and replace the enum values with
#defines. That frees up 2 bytes in the buffer tag, which is more than
enough.
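
A rough sketch of the layout change, under some assumptions: the structs are simplified stand-ins for the buffer tag, the #define values mirror the current enum as I understand it, the "spare" member is hypothetical, and the size of an enum is implementation-defined (4 bytes on the usual platforms).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Current style: fork number as an enum. */
typedef enum SketchForkNumberEnum
{
    SKETCH_INVALID_FORK = -1,
    SKETCH_MAIN_FORK = 0,
    SKETCH_FSM_FORK,
    SKETCH_VM_FORK
} SketchForkNumberEnum;

typedef struct SketchOldTag
{
    uint32_t spcNode;
    uint32_t dbNode;
    uint32_t relNode;
    SketchForkNumberEnum forkNum;   /* typically 4 bytes */
    uint32_t blockNum;
} SketchOldTag;

/* Proposed style: a 2-byte integer plus #defines for the values. */
typedef int16_t ForkNumber;
#define InvalidForkNumber       (-1)
#define MAIN_FORKNUM            0
#define FSM_FORKNUM             1
#define VISIBILITYMAP_FORKNUM   2

typedef struct SketchNewTag
{
    uint32_t   spcNode;
    uint32_t   dbNode;
    uint32_t   relNode;
    ForkNumber forkNum;   /* 2 bytes */
    int16_t    spare;     /* hypothetical: the 2 bytes that are freed up */
    uint32_t   blockNum;
} SketchNewTag;

/* Returns how many bytes the new layout leaves free next to forkNum. */
size_t
sketch_freed_bytes(void)
{
    assert(sizeof(ForkNumber) == 2);
    return offsetof(SketchNewTag, blockNum)
         - offsetof(SketchNewTag, forkNum)
         - sizeof(ForkNumber);
}
```

On a platform where the enum is 4 bytes, both structs come out the same size, so the tag does not grow.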

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
