Re: CompactCheckpointerRequestQueue versus pad bytes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: CompactCheckpointerRequestQueue versus pad bytes
Date: 2012-07-16 23:17:35
Message-ID: 8687.1342480655@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Jul 16, 2012 at 2:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> So I started to do this, and immediately hit a roadblock: although it's
>> clear that we can discard any attempt to fsync a backend-local relation,
>> it's not so clear that we don't need to queue UNLINK_RELATION_REQUEST
>> operations for local relations.
>>
>> I think that this is in fact okay. The reason for delaying unlink until
>> after the next checkpoint is that if we did not, and the relfilenode got
>> re-used for an unrelated relation, and then we crashed and had to replay
>> WAL, we would replay any WAL referencing the old relation into the
>> unrelated relation's storage, with potential bad consequences if you're
>> unlucky. However, no WAL should ever be generated for a backend-local
>> relation, so there is nothing to guard against and hence no harm in
>> immediately unlinking backend-local rels when they are deleted. So we
>> can adjust mdunlink to include SmgrIsTemp() among the reasons to unlink
>> immediately rather than doing the truncate-and-register_unlink dance.
>>
>> If anybody sees a hole in this reasoning, speak now ...

> Hmm, yeah, I have a feeling this might be why I didn't do this when I
> created RelFileNodeBackend. But I think your reasoning is correct.
> Sticking the above text in a comment might not be out of order,
> however.

Most of this info was already in the comment for mdunlink, so I just
added a bit there.

The attached patch covers everything discussed in this thread, except
for the buggy handling of stats, which I think should be fixed in a
separate patch since it's only relevant to 9.2+.

I had thought that we might get a performance boost here by saving fsync
queue traffic, but I see that md.c was already not calling
register_dirty_segment for temp rels, so there's no joy there. We might
win a bit by not queuing deletion of temp rels, though. There's also
some distributed savings by eliminating one field from the request queue
and hash table, but probably not enough to notice. I think the main
reason to do this is just to tighten up the question of whether pad
bytes matter.

Haven't started on back-patching yet. Some but not all of this will
need to go back as far as we back-patched the queue compaction logic.

regards, tom lane

Attachment Content-Type Size
fsync-request-queue-cleanup.patch text/x-patch 27.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Samuel Vogel 2012-07-16 23:51:08 b-tree index search algorithms
Previous Message Josh Berkus 2012-07-16 20:50:56 Re: Closing out the June commitfest