including PID or backend ID in relpath of temp rels

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: including PID or backend ID in relpath of temp rels
Date: 2010-04-26 01:07:46
Message-ID: y2j603c8f071004251807udf91480bwabb36a126d1bf396@mail.gmail.com
Lists: pgsql-hackers

Time for a new thread specific to this subject. For previous
discussion, see here:

http://archives.postgresql.org/pgsql-hackers/2010-04/msg01140.php
http://archives.postgresql.org/pgsql-hackers/2010-04/msg01152.php

I attempted to implement this by adding an isTemp argument to relpath,
but ran into problems. It turns out that when we create a temporary
relation and then exit the backend, the relation is merely truncated,
and it's the background writer which actually removes the file
following the next checkpoint. Therefore, relpath() for the temprel
must return the same answer in the background writer as it does in the
original backend, so passing isTemp isn't enough - we actually need to
pass whatever identifier we're including in the file name. As far as
I can see, though I'm not 100% sure of this, it looks like we never
actually ask the background writer to fsync any of these files because
we never fsync them at all; but we do ask it to remove them, which is
enough to create a problem. So, what to do about this? Ideas:
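
To make the constraint concrete, here is a minimal sketch, not actual
PostgreSQL code. The function name sketch_relpath, the sentinel
SKETCH_INVALID_BACKEND, and the "t<backend>_<relnode>" file name layout
are all assumptions made up for illustration. The point is only that the
path is a pure function of its arguments, so any process that is handed
the same backend identifier computes the same file name:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sentinel meaning "this is not a temp relation". */
#define SKETCH_INVALID_BACKEND (-1)

/*
 * Build a relation path into buf and return the length snprintf reports.
 * For a temp relation the owning backend's identifier is embedded in the
 * file name, so a bare isTemp flag would not be enough to rebuild it.
 * The "t<backend>_<relnode>" layout is an illustrative assumption.
 */
int
sketch_relpath(char *buf, size_t buflen,
               unsigned int dbnode, unsigned int relnode, int backend)
{
    assert(buf != NULL && buflen > 0);

    if (backend == SKETCH_INVALID_BACKEND)
        return snprintf(buf, buflen, "base/%u/%u", dbnode, relnode);

    return snprintf(buf, buflen, "base/%u/t%d_%u", dbnode, backend, relnode);
}
```

Under these assumptions, the unlink request sent to the background
writer would have to carry the backend value along with the
relfilenode, since the background writer has no other way to recover
it.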

1. We could move the responsibility for removing the files associated
with temp rels from the background writer to the owning backend. I
think the reason why we initially truncate the files and only later
remove them is because somebody else might have 'em open, so it
mightn't be necessary for temp rels.

2. Instead of embedding a PID or backend ID in the filename, we could
just embed a boolean: isTemp or not? This seems like cutting
ourselves off from quite a bit of useful information but maybe it
would be OK. We could nuke all the temp stuff on cluster startup, but
we'd have to rely on catalog entries to identify orphaned files that
accumulated during normal running, which isn't ideal since one of our
long-term goals is to eliminate the need for those catalog entries.

3. We could change RelFileNode.relNode from an OID to an unsigned
32-bit integer driven off a separate counter, and reserve some
portion of the 4 billion available values for temp relations. I doubt
we'd have enough bits to embed something like a PID though, so this
would end up being basically an embedded boolean, along the lines of
#2.

4. We could add an additional 32-bit value to RelFileNode to identify
the backend (or a sentinel value when not temp) and create a separate
structure XLogRelFileNode or PermRelFileNode or somesuch for use in
contexts where no temp rels are allowed.

Either #3 or #4 has some possible advantages for Hot Standby in terms
of perhaps making it feasible to assign relfilenodes on a standby
server without danger of conflicting with one already assigned on the
master.

5. ???
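
For what it's worth, options 3 and 4 might look roughly like the
following. This is a sketch under stated assumptions, not a patch: every
Sketch*/sketch_* name, the SKETCH_NOT_TEMP sentinel, and the choice of
the top bit as the reserved temp range are hypothetical. Only the
spcNode/dbNode/relNode field names come from the existing RelFileNode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Option 3 (assumed split): relNode values with the top bit set are temp. */
#define SKETCH_TEMP_BASE 0x80000000u

int
sketch_relnode_is_temp(uint32_t relnode)
{
    return relnode >= SKETCH_TEMP_BASE;
}

/* Option 4: hypothetical sentinel for "not a temp relation". */
#define SKETCH_NOT_TEMP UINT32_MAX

/* The narrow form, for contexts where no temp rels are allowed. */
typedef struct SketchPermRelFileNode
{
    uint32_t    spcNode;
    uint32_t    dbNode;
    uint32_t    relNode;
} SketchPermRelFileNode;

/* The wide form: the same three fields plus the owning backend. */
typedef struct SketchRelFileNode
{
    SketchPermRelFileNode perm;
    uint32_t    backend;        /* SKETCH_NOT_TEMP when not temp */
} SketchRelFileNode;

int
sketch_is_temp(const SketchRelFileNode *rnode)
{
    return rnode->backend != SKETCH_NOT_TEMP;
}

/* Narrow to the permanent-only form; returns -1 for a temp relation. */
int
sketch_to_perm(const SketchRelFileNode *rnode, SketchPermRelFileNode *out)
{
    assert(rnode != NULL && out != NULL);

    if (sketch_is_temp(rnode))
        return -1;
    *out = rnode->perm;
    return 0;
}
```

The option 3 variant keeps RelFileNode at its current size but can only
say "temp or not", whereas the option 4 variant costs four more bytes
per RelFileNode and keeps the backend identity available.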

Thoughts?

...Robert
