Turning off HOT/Cleanup sometimes

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Turning off HOT/Cleanup sometimes
Date: 2014-01-08 08:33:54
Message-ID: CA+U5nMJz5u_7Pm7jrabKkqsfRjeCMJYejVfN+-2aJ1Y4Y2P0Og@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

VACUUM cleans up blocks, which is nice because it happens offline in a
lazy manner.

We also make SELECT clean up blocks as it goes. That is useful in OLTP
workloads, but it means that large SQL queries and pg_dump effectively
do much the same work as VACUUM, generating huge amounts of I/O and
WAL on the master, the cost and annoyance of which is experienced
directly by the user. That is avoided on standbys.

Effects of that are that long running statements often run much longer
than we want, increasing bloat as a result. It also produces wildly
varying response times, depending upon extent of cleanup required.

It is a simple task to make that behaviour optional on the master.

I propose a USERSET parameter, prune_cost_limit (<---insert better name here)
which will make the behaviour optional, default -1, in normal user
processes. VACUUM will ignore this parameter and so its actions will
never be deferred.

In detail, this parameter would disable pruning for any scan larger
than the cost limit. So large scans will disable the behaviour. The
default, -1, means never disable pruning, which is the current
behavour.

We track the number of pages dirtied by the current statement. When
this reaches prune_cost_limit, we will apply these behaviours to all
shared_buffer block accesses...

(1) avoid running heap_page_prune_opt()

(2) avoid dirtying the buffer for hints. (This is safe because the
hinted changes will either be lost or will be part of the full page
image when we make a logged-change).

(i.e. doesn't apply to temp tables)

For example, if we set prune_cost_limit = 4 this behaviour allows
small index lookups via bitmapheapscan to continue to cleanup, while
larger index and seq scans will avoid cleanup.

There would be a postgresql.conf parameter prune_cost_limit, as well
as a table level parameter that would prevent pruning except via
VACUUM.

This will help in these ways
* Reduce write I/O from SELECTs and pg_dump - improving backups and BI queries
* Allow finer grained control over Hot Standby conflicts
* Potentially allow diagnostic inspection of older data via SeqScan

Prototype patch shows this is possible and simple enough for 9.4.
Major objections? Or should I polish up and submit?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message taskov 2014-01-08 08:44:36 Re: gSoC - ADD MERGE COMMAND - code patch submission
Previous Message Simon Riggs 2014-01-08 08:17:15 Re: Simple improvements to freespace allocation