Re: New VACUUM FULL

From: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: New VACUUM FULL
Date: 2009-12-22 09:11:41
Message-ID: 20091222181141.8B8E.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

> I think we
> either need to implement that or document that vacuum will not skip
> all-visible pages when running VACUUM FULL.

All-visible does not always mean "filled enough", no? We will need to
check both visibility and fillfactor when we optimize copying tuples.

> Old VACUUM FULL was substantially faster than this on tables that had
> nothing to remove.

Yeah, that's why traditional VACUUM FULL is still kept as INPLACE vacuum.

> Please can you arrange for the cluster operation to skip rebuilding
> indexes if the table is not reduced in size?

Do you think we should dispose the rewritten table when we find the
VACUUM FULL (or CLUSTER) is useless? We could save the time to reindex,
but we've already consumed time to rewrite tables. It will be still
slower than VACUUM FULL INPLACE because it is read-only.

Instead, I'd suggest to have conditional database-wide maintenance
commands, something like:
VACUUM FULL WHERE <the table is fragmented>

We don't have to support the feature by SQL syntax; it could be done
in client tools. How about pg_maintain or pg_ctl maintain that cleans
up each relation with appropriate command? We could replace vacuumdb,
clusterdb, and reindexdb with it then.

> Part of the reason why these happen is that the code hasn't been
> refactored much at all from its original use for cluster. There are
> almost zero comments to explain the additional use of this code for
> VACUUM, or at least to explain it still all works even when there is no
> index.

Ok, I'll check for additional comments. The flow of code might be still
confusable because vacuum() calls cluster(), but we need major replacement
of codes to refactor it. I'm not sure we need the code rewrites for it.

Also, I think we need additional messages for VACUUM VERBOSE
(and CLUSTER VERBOSE), though it might be adjusted in another patch.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Martin Pihlak 2009-12-22 09:30:42 Re: fdw validation function vs zero catalog id
Previous Message Simon Riggs 2009-12-22 08:34:32 Re: alpha3 release schedule?