Re: Turning off HOT/Cleanup sometimes

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Turning off HOT/Cleanup sometimes
Date: 2015-04-15 15:11:47
Message-ID: 552E7FB3.7090801@iki.fi
Lists: pgsql-hackers

On 04/15/2015 05:44 PM, Alvaro Herrera wrote:
> Simon Riggs wrote:
>> On 15 April 2015 at 09:10, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
>>> I don't really see the downside to this suggestion.
>>
>> The suggestion makes things better than they are now but is still less
>> than I have proposed.
>>
>> If what you both mean is "IMHO this is an acceptable compromise", I
>> can accept it also, at this point in the CF.
>
> Let me see if I understand things.
>
> What we have now is: when reading a page, we also HOT-clean it. This
> runs HOT-cleanup a large number of times, and causes many pages to
> become dirty.
>
> Your patch is "when reading a page, HOT-clean it, but only 5 times in
> each scan". This runs HOT-cleanup at most 5 times, and causes at most 5
> pages to become dirty.
>
> Robert's proposal is "when reading a page, if dirty HOT-clean it; if not
> dirty, also HOT-clean it but only 5 times in each scan". This runs
> HOT-cleanup some number of times (as many as there are dirty pages), and
> causes at most 5 pages to become dirty.
>
>
> Am I right in thinking that HOT-clean in a dirty page is something that
> runs completely within CPU cache? If so, it would be damn fast and
> would have benefits for future readers, for very little cost.

If there are many tuples on the page, it takes some CPU effort to scan
all the HOT chains and move tuples around. Also, it creates a WAL
record, which isn't free.

Another question is whether the patch can reliably detect whether it's
doing a "read-only" scan or not. I haven't tested, but I suspect it'd
not do pruning when you do something like "INSERT INTO foo SELECT * FROM
foo WHERE blah", i.e. when the target relation is referenced twice in
the same statement: once as the target and a second time as a source.
Maybe that's OK, though.

- Heikki
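
The three policies Alvaro summarizes above can be compared with a toy
model. This is only a sketch, not PostgreSQL code: the Page class, the
scan() function, and the policy names are hypothetical, and it assumes
that only pages with something to prune count against the limit of 5
and that every cleanup of a previously clean page dirties it. It
ignores the CPU and WAL cost of each cleanup.

```python
from dataclasses import dataclass


@dataclass
class Page:
    dirty: bool     # already dirty in the buffer when the scan reads it
    prunable: bool  # has dead HOT tuples that cleanup could remove


def scan(pages, policy, limit=5):
    """Return (cleanup_runs, newly_dirtied) for one scan under a policy.

    policy is one of (hypothetical names):
      "current"    - clean every page that is read
      "limit"      - clean at most `limit` pages per scan
      "dirty_free" - always clean already-dirty pages; clean at most
                     `limit` pages that were not dirty
    """
    if policy not in ("current", "limit", "dirty_free"):
        raise ValueError(f"unknown policy: {policy}")

    runs = 0
    newly_dirtied = 0
    budget = limit
    for page in pages:
        if not page.prunable:
            continue  # nothing to clean, so no cost in this model

        if policy == "current":
            do_clean = True
        elif policy == "dirty_free" and page.dirty:
            do_clean = True  # cleaning a dirty page does not use the budget
        else:
            do_clean = budget > 0
            if do_clean:
                budget -= 1

        if do_clean:
            runs += 1
            if not page.dirty:
                newly_dirtied += 1
    return runs, newly_dirtied


if __name__ == "__main__":
    # 20 prunable pages, alternating dirty and clean.
    pages = [Page(dirty=(i % 2 == 0), prunable=True) for i in range(20)]
    for policy in ("current", "limit", "dirty_free"):
        print(policy, scan(pages, policy))
```

In this model the third policy never dirties more pages than the limit
allows, but the number of cleanup runs grows with the number of
already-dirty pages, which is where the per-cleanup CPU and WAL cost
mentioned above would matter.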
