Re: Turning off HOT/Cleanup sometimes

From: Jim Nasby <jim(at)nasby(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Turning off HOT/Cleanup sometimes
Date: 2014-01-09 21:25:08
Message-ID: 52CF13B4.3070400@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/9/14, 12:54 PM, Tom Lane wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
>> That said, I'm not entirely convinced that traversing these dead tuples
>> is all *that* painful during SELECT. If there's that many levels then
>> hopefully it's not long til an UPDATE comes along and cleans them up.
>
> There's always VACUUM ;-)
>
> If you take about ten steps back, what's happening here is that
> maintenance work that we'd originally delegated to VACUUM, precisely so
> that it wouldn't have to be done by foreground queries, is now being done
> by foreground queries. And oddly enough, people don't like that.
>
> There is a reasonable argument for forcing UPDATE queries to do it anyway,
> to improve the odds they can do same-page updates (whether HOT or
> otherwise). And probably an INSERT should do it on a page that it's
> selected as an insertion target. But I think the argument that the
> original do-maintenance-in-background-whenever-possible design was wrong
> is a lot harder to sustain for SELECT or even DELETE queries. As I said
> upthread, I think the current behavior was *not* chosen for performance
> reasons but just to limit the scope of what we had to change for HOT.

Instead of looking at how to avoid this work in SELECTs maybe it'd be more useful to look at how we can get it done more quickly in the background. The VSM is already a step in the right direction, but it seems the big use case here is when some bulk operation comes through and touches a sizeable number of blocks (but perhaps not enough to hit autovac thresholds).

ISTM it wouldn't be too difficult for a backend to track how many blocks in a relation it's dirtied (keep in mind that count doesn't have to be perfect). If we tracked that info, it could be put into a maintenance queue (LISTEN/NOTIFY?) along with our XID. That gives us a list of relations to vacuum and exactly when to vacuum them. Thanks to the VSM we wouldn't need to track individual pages (though it might be useful to track the minimum and maximum block IDs we hit, per relation).
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-01-09 21:27:23 Re: Turning off HOT/Cleanup sometimes
Previous Message Tom Lane 2014-01-09 21:14:10 Re: Planning time in explain/explain analyze