Re: BUG #5946: Long exclusive lock taken by vacuum (not full)

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jan Wieck <JanWieck(at)yahoo(dot)com>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5946: Long exclusive lock taken by vacuum (not full)
Date: 2011-03-28 20:21:50
Message-ID: AANLkTimB_F=KMHUnhFE196AQ0J4QYbaua3Kt4FLjDc9P@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Mar 28, 2011 at 4:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Christopher Browne <cbbrowne(at)gmail(dot)com> writes:
>> - Grab timestamp
>> - Grab exclusive lock
>> - Process [Some number of pages]
>> - Check time.
>> - If [# of ms] have passed then check to see if anyone else has a lock
>> O/S on the table.
>>   - Commit & give up the lock for a bit if they do
>>   - Go back and process more pages if they don't
>
> Actually, we could simplify that even further.  Keep the code exactly
> as-is, but every small-number-of-pages, check to see if someone is
> waiting on a conflicting lock, and if so, fall out of the page checking
> loop.  Truncate away however many pages we know at that time are safe,
> and end the vacuum normally.
>
> We'd have to rejigger the stuff in the lock manager that tries to boot
> autovacuum off the lock forcibly, but with a bit of luck that would get
> less crocky not more so.
>
> This wouldn't really have any parameters that require tuning, I think,
> and the max delay till the lock is released is not too much more than
> the time needed for ftruncate().  The really good thing about it is that
> vacuum's work is never wasted.
>
>                        regards, tom lane

That mostly sounds excellent.

One caveat is that this has the risk, for a busy table, of having it
take nearly forever to get through the truncation of the empty space
at the end.

If the VACUUM falls out, under this logic, after truncating only a few
pages, then there's the considerable cost of rummaging through the
table, over and over, truncating only a few pages each time.

Supposing we set it up to truncate 32 pages (assuming that to be the
"safe" level), and there are 10 empty 1GB files at the end of the
table, then it's potentially going to take tens of thousands of VACUUM
requests to empty that whole chunk of space out. That seems close
enough to "forever" for my purposes :-), assuming I'm understanding
that correctly.

I hope I'm wrong, and that there's potential here to get quite a bit
more pages than that dropped out.
--
http://linuxfinances.info/info/linuxdistributions.html

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Jan Wieck 2011-03-28 23:55:57 Re: BUG #5946: Long exclusive lock taken by vacuum (not full)
Previous Message Tom Lane 2011-03-28 20:01:33 Re: BUG #5946: Long exclusive lock taken by vacuum (not full)