Re: Avoiding second heap scan in VACUUM

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Avoiding second heap scan in VACUUM
Date: 2008-05-30 10:01:55
Message-ID: 1212141715.4120.69.camel@ebony.site
Lists: pgsql-hackers


On Fri, 2008-05-30 at 14:50 +0530, Pavan Deolasee wrote:
> On Fri, May 30, 2008 at 2:41 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >
> > What I still
> > don't accept is that an unconstrained wait is justifiable. You've just
> > said it's a minor detail, but that's not the way I see it. It might be a
> > second, but it might be an hour or more.
> >
>
> I am suggesting a timed wait. Maybe, say, between 60 and 300 seconds.
> That's the maximum VACUUM would get delayed. If existing transactions
> don't finish within that time, VACUUM just works as it does today. So
> it certainly can't be much worse than what it is today.
>
> > A non-waiting solution seems like the only way to proceed.
> >

I understand what you're saying and agree that in (some) cases a small
wait is not important. I'm just saying some != all, and the gap covers
important cases:

If we have a database with 100 tables (very common) and we add 5 minutes
of waiting time to each vacuum, then we'll make a complete database VACUUM
take ~8 hours longer than it did before. 1000 tables would cause rioting
in the streets.
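
As a rough check of that arithmetic, here is a small sketch in Python. It
assumes the worst case, where every table waits the full 5 minutes (300
seconds) and tables are vacuumed one after another; the function name is
made up for illustration.

```python
def added_wait_hours(n_tables: int, wait_seconds: float) -> float:
    """Extra wall-clock hours if every table's vacuum waits wait_seconds."""
    return n_tables * wait_seconds / 3600.0


for n_tables in (100, 1000):
    # Worst case: each vacuum waits the full 300 seconds before starting.
    print(n_tables, "tables:", round(added_wait_hours(n_tables, 300), 1), "hours")
# 100 tables: 8.3 hours
# 1000 tables: 83.3 hours
```

With shorter or skipped waits the total would be lower, so these figures
are an upper bound for a 300 second limit.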

Waiting for 5 minutes for a 0.5 second vacuum isn't sensible either,
whatever the gain. It's clear Amdahl's Law would not advise us to
optimise that (in this way).

So if it's a large table and we submitted it with a non-zero vacuum wait,
then maybe a wait is an acceptable optimisation.

Perhaps we can start the first scan and check the xid after every few
blocks. Once we find the xid is older, we know the second scan can be
limited to only those blocks already scanned. So the two endpoints of
behaviour are that we skip the second scan completely or we do the whole
scan, but at least there is a saving in many cases without waiting.
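
To make the idea concrete, here is a minimal sketch in Python rather than
backend C. It is only a model of the bookkeeping, not PostgreSQL code: the
function name, the check_every parameter and the old_xacts_done callback
(standing in for "the xid check now passes") are all assumptions made for
illustration.

```python
from typing import Callable


def blocks_needing_second_scan(
    n_blocks: int,
    check_every: int,
    old_xacts_done: Callable[[], bool],  # hypothetical stand-in for the xid check
) -> int:
    """Return how many leading blocks the second scan would still have to visit."""
    done = old_xacts_done()  # check once before scanning anything
    needs_second_pass = 0
    for blkno in range(n_blocks):
        # Re-check only every few blocks, and stop checking once it has passed.
        if not done and blkno > 0 and blkno % check_every == 0:
            done = old_xacts_done()
        if not done:
            # This block was scanned before the check passed.
            needs_second_pass = blkno + 1
        # ... the normal first-pass work on block blkno would go here ...
    return needs_second_pass


# The two endpoints: no second scan at all, or the whole table.
print(blocks_needing_second_scan(20, 4, lambda: True))   # 0
print(blocks_needing_second_scan(20, 4, lambda: False))  # 20
```

Anything in between gives a partial saving: if the check first passes when
block 8 is reached, only blocks 0 to 7 are left for the second scan, and
no waiting was needed to get that.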

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
