Re: [PERFORM] encouraging index-only scans

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] encouraging index-only scans
Date: 2013-09-07 22:47:35
Message-ID: 20130907224735.GG626072@alap2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Hi,

On 2013-09-07 12:50:59 -0400, Bruce Momjian wrote:
> That seems very complicated. I think it would be enough to record the
> current xid at the time of the vacuum, and when testing for later
> vacuums, if that saved xid is earlier than the RecentGlobalXmin, and
> there have been no inserts/updates/deletes, we know that all of
> the pages can now be marked as allvisible.

But that would constantly trigger vacuums, or am I missing something? Or
what are you suggesting this xid to be used for?

What I was talking about was how to evaluate the benefit of triggering
an VACUUM even if there's not a significant amount of new dead rows. If
we know that for a certain xmin horizon there's N pages that potentially
can be cleaned and marked all visible we have a change of making
sensible decisions.
We could just use one bin (i.e. use one cutoff xid as you propose) and
count the number of pages that would be affected. But that would mean
we'd only trigger vacuums very irregularly if you have a workload with
several longrunning transactions. When the oldest of a set of
longrunning transactions finishes you possibly can already clean up a
good bit reducing the chance of further bloat. Otherwise you have to
wait for all of them to finish.

> What this doesn't handle is the insert case. What we could do there is
> to record the total free space map space, and if the FSM has not changed
> between the last vacuum, we can even vacuum if inserts happened in that
> period because we assume the inserts are on new pages. One problem
> there is that the FSM is only updated if an insert will not fit on the
> page. We could record the table size and make sure the table size has
> increased before we allow inserts to trigger a vm-set vacuum.

Not sure why that's better than just counting the number of pages that
have unset vm bits?
Note that you cannot rely on the FSM data to be correct all the time, we
can only use such tricks to trigger vacuums not for the actual operation
in the vacuum.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2013-09-07 22:55:19 Re: [RFC] overflow checks optimized away
Previous Message Pavel Stehule 2013-09-07 19:22:48 Re: review: psql and pset without any arguments

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2013-09-08 21:05:00 Re: [PERFORM] encouraging index-only scans
Previous Message Bruce Momjian 2013-09-07 16:50:59 Re: [PERFORM] encouraging index-only scans