Re: autovacuum truncate exclusive lock round two

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Jan Wieck'" <JanWieck(at)Yahoo(dot)com>
Cc: "'Stephen Frost'" <sfrost(at)snowman(dot)net>, "'PostgreSQL Development'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum truncate exclusive lock round two
Date: 2012-10-26 10:35:41
Message-ID: 001901cdb365$a6677800$f3366800$@kapila@huawei.com
Lists: pgsql-hackers

On Friday, October 26, 2012 11:50 AM Jan Wieck wrote:
> On 10/26/2012 1:29 AM, Amit Kapila wrote:
> > On Thursday, October 25, 2012 9:46 PM Jan Wieck wrote:
> >> On 10/25/2012 10:12 AM, Stephen Frost wrote:
> >> > Jan,
> >> >
> >> > * Jan Wieck (JanWieck(at)Yahoo(dot)com) wrote:
> >> >> The problem case this patch is dealing with is rolling window tables
> >> >> that experienced some bloat. The typical example is a log table,
> >> >> that has new data constantly added and the oldest data constantly
> >> >> purged out. This data normally rotates through some blocks like a
> >> >> rolling window. If for some reason (purging turned off for example)
> >> >> this table bloats by several GB and later shrinks back to its normal
> >> >> content, soon all the used blocks are at the beginning of the heap
> >> >> and we find tens of thousands of empty pages at the end. Only now
> >> >> does the second scan take more than 1000ms and autovacuum is at risk
> >> >> to get killed while at it.
> >> >
> >> > My concern is that this could certainly also happen to a heavily updated
> >> > table in an OLTP type of environment where the requirement to take a
> >> > heavy lock to clean it up might prevent it from ever happening.. I was
> >> > simply hoping we could find a mechanism to lock just those pages we're
> >> > getting ready to nuke rather than the entire relation. Perhaps we can
> >> > consider how to make those changes alongside of changes to eliminate or
> >> > reduce the extent locking that has been painful (for me at least) when
> >> > doing massive parallel loads into a table.
> >>
> >> I've been testing this with loads of 20 writes/s to that bloated table.
> >> Preventing not only the clean up, but the following ANALYZE as well is
> >> precisely what happens. There may be multiple ways how to get into this
> >> situation, but once you're there the symptoms are the same. Vacuum fails
> >> to truncate it and causing a 1 second hiccup every minute, while vacuum
> >> is holding the exclusive lock until the deadlock detection code of
> >> another transaction kills it.
> >>
> >> My patch doesn't change the logic how we ensure that we don't zap any
> >> data by accident with the truncate and Tom's comments suggest we should
> >> stick to it. It only makes autovacuum check frequently if the
> >> AccessExclusiveLock is actually blocking anyone and then get out of the
> >> way.
> >>
> >> I would rather like to discuss any ideas how to do all this without 3
> >> new GUCs.
> >>
> >> In the original code, the maximum delay that autovacuum can cause by
> >> holding the exclusive lock is one deadlock_timeout (default 1s). It
> >> would appear reasonable to me to use max(deadlock_timeout/10,10ms) as
> >> the interval to check for a conflicting lock request. For another
> >> transaction that needs to access the table this is 10 times faster than
> >> it is now and still guarantees that autovacuum will make some progress
> >> with the truncate.
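
As a rough illustration of the interval proposed above, here is a minimal
sketch. The function name is hypothetical and not taken from the patch, and
deadlock_timeout is assumed to be given in milliseconds.

```c
/*
 * Hypothetical helper (not from the patch): interval, in milliseconds,
 * between checks for a conflicting lock request, computed as
 * max(deadlock_timeout / 10, 10ms).
 */
static int
truncate_check_interval_ms(int deadlock_timeout_ms)
{
    int interval = deadlock_timeout_ms / 10;

    /* Never check more often than every 10 ms. */
    return (interval > 10) ? interval : 10;
}
```

With the default deadlock_timeout of 1s this yields 100 ms; for any setting
below 100 ms the 10 ms floor applies.
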
> >
> > One other way could be to check after every few pages for a conflicting
> > lock request.
>
> How is this any different from what my patch does?
The difference is that the patch checks for waiters using two conditions,
autovacuum_truncate_lock_check and blkno%32, whereas what I had suggested
was to check based on blkno alone.
Would it hurt much if we checked for waiters directly after every 32 (or
any other feasible number of) blocks?
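
To make the suggestion concrete, here is a minimal sketch of a backward scan
that checks for waiters based on blkno alone. It is not the patch's code: the
function names, the has_waiters callback, and the constant 32 are all
assumptions for illustration, and a real scan would read and inspect each
block where the comment indicates.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHECK_EVERY_N_BLOCKS 32     /* assumption: "any feasible number" */

/* Hypothetical callback: returns true if someone is waiting on our lock. */
typedef bool (*has_waiters_fn) (void *arg);

/*
 * Walk backwards from nblocks toward stop_at.  Whenever blkno is a
 * multiple of CHECK_EVERY_N_BLOCKS, ask whether anyone is waiting and,
 * if so, stop early.  Returns the block number at which the scan ended.
 */
static uint32_t
scan_tail_backwards(uint32_t nblocks, uint32_t stop_at,
                    has_waiters_fn has_waiters, void *arg)
{
    uint32_t blkno = nblocks;

    while (blkno > stop_at)
    {
        if ((blkno % CHECK_EVERY_N_BLOCKS) == 0 && has_waiters(arg))
            return blkno;       /* get out of the way */

        blkno--;
        /* a real scan would read and inspect block blkno here */
    }
    return blkno;
}

/* Example callback: reports a waiter once *arg earlier checks have passed. */
static bool
waiter_after_n_checks(void *arg)
{
    int *remaining = arg;

    return (*remaining)-- <= 0;
}
```

With 100 blocks and stop_at = 10, the waiter check runs only at blocks 96, 64
and 32, so the cost of checking is bounded by the block count rather than by
a timer.
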

> Did you even look at the code?
I hadn't looked at the code when I replied to your previous mail, but I
have checked it now.

With Regards,
Amit Kapila.
