Re: Proposal: Another attempt at vacuum improvements

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Another attempt at vacuum improvements
Date: 2011-05-25 11:31:46
Message-ID: BANLkTi=jM_OaXF4HXco++Y05ucXNeQ-=cQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 24, 2011 at 7:58 AM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:

> The biggest gripe today is that vacuum needs two heap scans and each scan
> dirties the buffer.

That's not that clear to me. The debate usually stalls because we
don't have sufficient info from real world analysis of where the time
goes.

> So the idea is to separate the index vacuum (removing index pointers to dead
> tuples) from the heap vacuum. When we do heap vacuum (either by HOT-pruning
> or using regular vacuum), we can spool the dead line pointers somewhere.

ISTM it will be complex to attempt to store the exact list of TIDs
between VACUUMs.

At the moment we scan indexes whenever there is even one row to
remove, which is probably wasteful. Perhaps it would be better to keep
a running total of rows to remove, by updating pg_stats, and do the
index scan only once that total crosses a certain threshold. So we
don't need to remember the TIDs, just remember how many there were and
use that to avoid cleaning too vigorously.
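
As a rough illustration of the running-total idea only, here is a minimal
sketch of the counting logic. The names (TableVacuumState,
record_heap_vacuum, threshold) are hypothetical and do not correspond to
any PostgreSQL structure or function. It models only the decision of when
to scan indexes, not the scan itself.

```python
from dataclasses import dataclass


@dataclass
class TableVacuumState:
    """Hypothetical per-table bookkeeping; not a real PostgreSQL structure."""
    dead_rows_pending: int = 0  # dead rows seen since the last index scan
    index_scans: int = 0        # number of index scans triggered so far


def record_heap_vacuum(state: TableVacuumState,
                       dead_rows_found: int,
                       threshold: int) -> bool:
    """Add one heap pass's dead-row count to the running total.

    Returns True when the total reaches the (assumed) threshold; in that
    case an index scan is counted and the running total is reset.
    """
    state.dead_rows_pending += dead_rows_found
    if state.dead_rows_pending >= threshold:
        state.index_scans += 1
        state.dead_rows_pending = 0
        return True
    return False
```

With a threshold of 100, a pass that finds 40 dead rows would not trigger
an index scan, while a following pass that finds 70 more would, since only
the count is remembered between passes and not the TIDs.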

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
