|From:||Simon Riggs <simon(at)2ndQuadrant(dot)com>|
|Subject:||Avoiding pin scan during btree vacuum|
|Views:||Raw Message | Whole Thread | Download mbox|
During VACUUM of btrees, we need to pin all pages, even those where tuples
are not removed, which I am calling here the "pin scan". This is especially
a problem during redo, where removing one tuple from a 100GB btree can take
a minute or more. That causes replication lags. Bad Thing.
Previously, I have suggested ways of optimizing that and code comments
reflect that. I would like to look at removing the pin scan entirely, on a
the focus was on removing pins from btrees.
In the commit message for that thread/patch,
we see that there are some preconditions to altering the locking.
"This patch leaves behavior unchanged for some cases, which can be
addressed separately so that each case can be evaluated on its own
merits. These unchanged cases are when a scan uses a non-MVCC
snapshot, an index-only scan, and a scan of a btree index for which
modifications are not WAL-logged. If later patches allow all of
these cases to drop the buffer pin after reading a leaf page, then
the btree vacuum process can be simplified; it will no longer need
the "super-exclusive" lock to delete tuples from a page."
The case for master and standby are different. The standby is simpler, yet
more important to change since the pin scan happens in the foreground.
Looking just at the case for standbys, we see there are 3 cases
* non-WAL logged indexes - does not apply on a standby, so ignore
* non-MVCC snapshot - looks like only TOAST scans are a problem on standbys
* index only scans (IOS) - the analysis of which looks wrong to me...
IOSs always check the visibility before using the tuple. If a tuple is
about to be removed by a VACUUM then the tuple will already be dead and the
visibility map will always be set to not-all-visible. So any tuple being
removed by vacuum can never cause a problem to an IOS. Hence the locking
interactions are not required, at least on standbys, for normal tables.
So it looks like we can skip the "pin scan" during redo unless we are
vacuuming a toast index.
Notice that the patch does not slacken the requirement to
super-exclusive-lock the block from which tuples are being removed. The
only thing it does is skip past the requirement to pin each of the
intervening blocks where nothing has happened.
|Next Message||Tom Lane||2015-12-21 14:57:04||Re: SET SESSION AUTHORIZATION superuser limitation.|
|Previous Message||Robert Haas||2015-12-21 14:54:07||Re: Freeze avoidance of very large table.|