Re: Turning off HOT/Cleanup sometimes

From: David Steele <david(at)pgmasters(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Turning off HOT/Cleanup sometimes
Date: 2015-04-14 23:12:41
Message-ID: 552D9EE9.7030507@pgmasters.net
Lists: pgsql-hackers

On 4/14/15 6:07 PM, Simon Riggs wrote:
> On 11 March 2015 at 20:55, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>
>> I don't know how to move forward. We could give users a knob: This
>> might make your queries faster or not -- good luck. But of course
>> nobody will like that either.
>
> What is clear is that large SELECT queries are doing the work VACUUM
> should do. We should not be doing large background tasks (block
> cleanup) during long running foreground tasks. But there is no need
> for changing behaviour during small SELECTs. So the setting of 4 gives
> current behaviour for small SELECTs and new behaviour for larger
> SELECTs.
>
> The OP said this...
> <op>
> We also make SELECT clean up blocks as it goes. That is useful in OLTP
> workloads, but it means that large SQL queries and pg_dump effectively
> do much the same work as VACUUM, generating huge amounts of I/O and
> WAL on the master, the cost and annoyance of which is experienced
> directly by the user. That is avoided on standbys.
>
> Effects of that are that long running statements often run much longer
> than we want, increasing bloat as a result. It also produces wildly
> varying response times, depending upon extent of cleanup required.
> </op>
>
> This is not a performance patch. This is about one user doing the
> cleanup work for another. People running large SELECTs should not be
> penalised. The patch has been shown to avoid that and no further
> discussion should be required.
>
> I don't really care whether we have a parameter for this or not. As
> long as we have the main feature.
>
> It's trivial to add/remove a parameter to control this. Currently
> there isn't one.
>
> I'd like to commit this.

+1 from me. One of the last databases I worked on had big raw
partitions that were written to and then sequentially scanned exactly
once before being dropped. It was painful to see all those writes
happening for nothing.

In other cases, sequential scans ran directly after the main writes, but
the next read might be days in the future (if ever). The system was
basically idle in between, which would have allowed vacuum to come in
and do the job without affecting the performance of the main job.

I think that in batch-oriented databases this patch will definitely be a
boon to performance.

--
- David Steele
david(at)pgmasters(dot)net
