Re: heap vacuum & cleanup locks

From: Jim Nasby <jim(at)nasby(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap vacuum & cleanup locks
Date: 2011-06-06 06:35:13
Message-ID: EDB81868-4996-4D91-8CDF-1BAFA4FA42DC@nasby.net
Lists: pgsql-hackers

On Jun 6, 2011, at 1:00 AM, Robert Haas wrote:
> On Mon, Jun 6, 2011 at 12:19 AM, Itagaki Takahiro
> <itagaki(dot)takahiro(at)gmail(dot)com> wrote:
>> On Sun, Jun 5, 2011 at 12:03, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> If other buffer pins do exist, then we can't
>>> defragment the page, but that doesn't mean no useful work can be done:
>>> we can still mark used line pointers dead, or dead line pointers
>>> unused. We cannot defragment, but that can be done either by the next
>>> VACUUM or by a HOT cleanup.
>>
>> This is just an idea -- could we use a copy-on-write technique?
>> VACUUM would allocate a duplicate page for the pinned page and copy
>> the valid tuples into the new page. Buffer readers arriving after the
>> VACUUM would then see the cloned page instead of the old pinned one.
>
> Heikki suggested the same thing, and it's not a bad idea, but I think
> it would be more work to implement than what I proposed. The caller
> would need to be aware that, if it tries to re-acquire a content lock
> on the same page, the offset of the tuple within the page might
> change. I'm not sure how much work would be required to cope with
> that possibility.

I've had a related idea that I haven't looked into... if you're scanning a relation (i.e., an index scan or seq scan), I've wondered if it would be more efficient to deal with the entire page at once, possibly by making a copy of it. That would reduce the number of times you pin the page (often quite dramatically). I realize that means copying the entire page, but I suspect the copy would occur entirely in the L1 cache, which would be fast.
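
To make that concrete, here's a minimal sketch of the kind of loop I mean, written against the existing bufmgr/bufpage calls (scan_page_copy() is a name I made up; this is an illustration, not proposed code):

    /*
     * Sketch only: copy the page once under a share lock, drop the
     * pin immediately, then examine tuples from the private copy.
     */
    #include "postgres.h"
    #include "storage/bufmgr.h"
    #include "storage/bufpage.h"
    #include "utils/rel.h"

    static void
    scan_page_copy(Relation rel, BlockNumber blkno)
    {
        Page        pagecopy = (Page) palloc(BLCKSZ);   /* palloc is maxaligned */
        Buffer      buf;
        OffsetNumber off,
                    maxoff;

        buf = ReadBuffer(rel, blkno);
        LockBuffer(buf, BUFFER_LOCK_SHARE);

        /* One memcpy per page instead of one pin/unpin cycle per tuple. */
        memcpy(pagecopy, BufferGetPage(buf), BLCKSZ);
        UnlockReleaseBuffer(buf);   /* pin held only long enough to copy */

        maxoff = PageGetMaxOffsetNumber(pagecopy);
        for (off = FirstOffsetNumber; off <= maxoff; off = OffsetNumberNext(off))
        {
            ItemId      itemid = PageGetItemId(pagecopy, off);

            if (!ItemIdIsNormal(itemid))
                continue;
            /* examine the tuple in the private copy ... */
            (void) PageGetItem(pagecopy, itemid);
        }
        pfree(pagecopy);
    }

Whether it's actually safe to keep working from the copy after the pin is dropped (e.g., for anything that wants to set hint bits, or anything that must recheck against the live page) is exactly the part I haven't looked into.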

So perhaps instead of copy-on-write we should try for copy-on-read in all appropriate plan nodes.

On a related note, I've also wondered if it would be useful to allow nodes to deal with more than one tuple at a time; the idea being that it's better to execute a smaller chunk of code over a bigger chunk of data instead of dribbling tuples through an entire execution tree one at a time. Perhaps that will only be useful if nodes are executing in parallel...
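
Purely as a strawman, the interface change might look something like this (TupleBatch, ExecProcNodeBatch_fn, drain_node, and BATCH_SIZE are all invented names; nothing like this exists today):

    /*
     * Strawman batch-at-a-time executor interface. Each call fills as
     * many slots as the node can produce, so per-call costs are
     * amortized over the whole batch.
     */
    #include "postgres.h"
    #include "executor/tuptable.h"
    #include "nodes/execnodes.h"

    #define BATCH_SIZE 64

    typedef struct TupleBatch
    {
        int             ntuples;                /* slots actually filled */
        TupleTableSlot *slots[BATCH_SIZE];
    } TupleBatch;

    /* Returns the number of tuples produced; 0 means the node is done. */
    typedef int (*ExecProcNodeBatch_fn) (PlanState *node, TupleBatch *batch);

    static void
    drain_node(PlanState *node, ExecProcNodeBatch_fn procbatch)
    {
        TupleBatch  batch;
        int         n;
        int         i;

        while ((n = procbatch(node, &batch)) > 0)
        {
            for (i = 0; i < n; i++)
            {
                /* consume batch.slots[i] ... */
            }
        }
    }

The point is just that the per-tuple costs -- function dispatch, pin/unpin, expression context setup -- get paid once per batch instead of once per tuple.
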
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
