Re: including PID or backend ID in relpath of temp rels

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: including PID or backend ID in relpath of temp rels
Date: 2010-04-28 01:59:31
Message-ID: p2t603c8f071004271859hce160f31zafb6995812bd7da4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 25, 2010 at 9:07 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> 4. We could add an additional 32-bit value to RelFileNode to identify
> the backend (or a sentinel value when not temp) and create a separate
> structure XLogRelFileNode or PermRelFileNode or somesuch for use in
> contexts where no temp rels are allowed.

I experimented with this approach and created LocalRelFileNode and
GlobalRelFileNode and, for use in the buffer headers,
BufferRelFileNode (same as GlobalRelFileNode, but named differently
for clarity). LocallRelFileNode = GlobalRelFileNode + the ID of the
owning backend for temp rels; or InvalidBackendId if referencing a
non-temporary rel. These might not be the greatest names, but I think
the concept is good, because it really breaks the things that need to
be adjusted quite thoroughly. In the course of repairing the damage I
came across a couple of things I wasn't sure about:

[relcache.c] RelationInitPhysicalAddr can't initialize
relation->rd_node.backend properly for a non-local temporary relation,
because that information isn't available. But I'm not clear on why we
would need to create a relcache entry for a non-local temporary
relation. If we do need to, then we'll probably need to store the
backend ID in pg_class. That seems like something that would be best
avoided, all things being equal, especially since I can't see how to
generalize it to global temporary tables.

[smgr.c,inval.c] Do we need to call CacheInvalidSmgr for temporary
relations? I think the only backend that can have an smgr reference
to a temprel other than the owning backend is bgwriter, and AFAICS
bgwriter will only have such a reference if it's responding to a
request by the owning backend to unlink the associated files, in which
case (I think) the owning backend will have no reference.

[dbsize.c] As with relcache.c, there's a problem if we're asked for
the size of a temporary relation that is not our own: we can't call
relpath() without knowing the ID of the owning backend, and there's no
way to acquire that information for pg_class. I guess we could just
refuse to answer the question in that case, but that doesn't seem real
cool. Or we could physically scan the directory for files that match
a suitably constructed wildcard, I suppose.

[storage.c,xact.c,twophase.c] smgrGetPendingDeletes returns via an out
parameter (its second argument) a list of RelFileNodes pending delete,
which we then write to WAL or to the two-phase state file. Of course,
if the backend ID (or pid, but I picked backend ID somewhat
arbitrarily) is part of the filename, then we need to write that to
WAL, too. It seems somewhat unfortunate to have to WAL-log temprels
here; as best I can tell, this is the only case where it's necessary.
But if we implement a more general mechanism for cleaning up temp
files, then might the need to do this go away? Not sure.

[syncscan.c] It seems we pursue this optimization even for temprels; I
can't think of why that would be useful in practice. If it's useless
overhead, should we skip it? This is really independent of this
project; just a side thought.

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-04-28 02:34:42 Error handling for ShmemInitStruct and ShmemInitHash
Previous Message Tom Lane 2010-04-28 00:47:32 Re: Schema.Table.Col resolution seems broken in Alpha5