heap vacuum & cleanup locks

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: heap vacuum & cleanup locks
Date: 2011-06-05 03:03:45
Message-ID: BANLkTinmWFR1-mPu4nduUjxUfvWXZni-7Q@mail.gmail.com

We've occasionally seen problems with VACUUM getting stuck because it
cannot acquire a cleanup lock when, for example, a cursor holds a pin
on the buffer page. In the worst case, this can cause an undetected
deadlock, if the backend holding the buffer pin blocks trying to
acquire a heavyweight lock that is in turn blocked by VACUUM. A while
back, someone (Greg Stark? me?) floated the idea of not waiting for
the cleanup lock. If we can't get it immediately, or within some
short period of time, then we just skip the page and continue on.

Today I had what might be a better idea: don't try to acquire a
cleanup lock at all. Instead, acquire an exclusive lock. After
having done so, observe the pin count. If there are no other buffer
pins, that means our exclusive lock is actually a cleanup lock, and we
proceed as now. If other buffer pins do exist, then we can't
defragment the page, but that doesn't mean no useful work can be done:
we can still mark used line pointers dead, or dead line pointers
unused. Defragmentation can then be left either to the next VACUUM or
to a HOT cleanup. We can even arrange, using the existing mechanism,
to leave behind a hint that the page is a good candidate for a HOT
cleanup, by setting pd_prune_xid to, say, FrozenXID.
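
A rough sketch of the decision described above, as a toy model rather
than actual PostgreSQL code. Everything named Toy* or LP_TOY_* is
hypothetical; the caller is assumed to already hold one pin and the
exclusive content lock on the page, prune_hint stands in for setting
pd_prune_xid, and index cleanup is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical line pointer states for this toy model. */
typedef enum { LP_TOY_UNUSED, LP_TOY_NORMAL, LP_TOY_DEAD } ToyLpState;

#define TOY_NITEMS 4

typedef struct ToyPage
{
    int        pin_count;                 /* all pins, including our own */
    bool       prune_hint;                /* stands in for pd_prune_xid */
    bool       defragmented;              /* set once the page is compacted */
    ToyLpState items[TOY_NITEMS];         /* line pointer states */
    bool       tuple_is_dead[TOY_NITEMS]; /* tuple behind a NORMAL pointer is dead */
} ToyPage;

/*
 * Vacuum one page while holding an exclusive lock (assumed, not modeled).
 * Returns true if the exclusive lock turned out to be a cleanup lock,
 * i.e. nobody else had the buffer pinned.
 */
static bool
toy_vacuum_page(ToyPage *page)
{
    bool have_cleanup_lock;

    assert(page->pin_count >= 1);         /* we must hold our own pin */
    have_cleanup_lock = (page->pin_count == 1);

    /* Line pointer state changes are possible with or without other pins. */
    for (int i = 0; i < TOY_NITEMS; i++)
    {
        if (page->items[i] == LP_TOY_DEAD)
            page->items[i] = LP_TOY_UNUSED;   /* dead -> unused */
        else if (page->items[i] == LP_TOY_NORMAL && page->tuple_is_dead[i])
            page->items[i] = LP_TOY_DEAD;     /* used -> dead */
    }

    if (have_cleanup_lock)
    {
        /* No other pins: moving tuple data around is safe. */
        page->defragmented = true;
        page->prune_hint = false;
    }
    else
    {
        /* Other pins exist: leave a hint so a later HOT cleanup compacts. */
        page->prune_hint = true;
    }

    return have_cleanup_lock;
}
```

With pin_count at 2, the function changes only line pointer states and
sets the hint; with pin_count at 1, it also defragments.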

Like the idea of skipping altogether any page on which we can't
acquire a cleanup lock, this should prevent VACUUM from getting stuck
trying to lock a heap page. While buffer pins can be held for extended
periods of time, I don't think there is any operation that holds a
buffer content lock more than very briefly. Furthermore, unlike the
idea of skipping the page altogether, we could use this approach even
during an anti-wraparound vacuum.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
