Re: synchronized scans for VACUUM

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jeff Davis" <pgsql(at)j-davis(dot)com>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: synchronized scans for VACUUM
Date: 2008-06-01 12:20:51
Message-ID: 87lk1pjqho.fsf@oxford.xeocode.com
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>> The objections to synchronized scans for VACUUM as listed in that thread
>> (summary):
>
>> 2. vacuum takes breaks from the scan to clean up the indexes when it
>> runs out of maintenance_work_mem.
>
>> 2. There have been suggestions about a more compact representation for
>> the tuple id list. If this works, it will solve this problem.
>
> It will certainly not "solve" the problem. What it will do is mean that
> the breaks are further apart and longer, which seems to me to make the
> conflict with syncscan behavior worse not better.

How would it make them longer? Each break still has the same amount of i/o to
do scanning the indexes. I suppose each pass would dirty more pages, which
might slow it down?

In any case, I think the representation you proposed back when this idea last
came up was so compact that pretty much any size table ought to be
representable in a reasonable maintenance_work_mem -- at least for the kind
of machine which would normally be dealing with a table of that size.
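
To put rough numbers on how often vacuum has to break off the heap scan, here
is a back-of-the-envelope sketch. It assumes the dead-tuple list is a flat
array of 6-byte tuple ids and that an index-cleanup pass starts exactly when
the array fills up; the function name and the 1-byte-per-tuple figure for a
"compact" representation are hypothetical, chosen only for illustration.

```python
import math

TID_BYTES = 6  # size of one heap tuple id in the flat dead-tuple array


def index_cleanup_passes(dead_tuples, maintenance_work_mem_bytes,
                         bytes_per_tuple=TID_BYTES):
    """Estimate how many index-cleanup passes a vacuum needs.

    Simplification: a pass is triggered exactly when the dead-tuple
    list is full, plus one final pass for whatever is left over.
    """
    if dead_tuples == 0:
        return 0
    capacity = int(maintenance_work_mem_bytes // bytes_per_tuple)
    if capacity < 1:
        raise ValueError("maintenance_work_mem too small for one tuple id")
    return math.ceil(dead_tuples / capacity)


if __name__ == "__main__":
    mem = 64 * 1024 * 1024  # 64MB, an arbitrary example setting
    print(index_cleanup_passes(100_000_000, mem))
    # Hypothetical compact representation averaging 1 byte per dead tuple.
    print(index_cleanup_passes(100_000_000, mem, bytes_per_tuple=1))
```

The number of passes scales with bytes per dead tuple, so a denser
representation means fewer breaks, each covering more dead tuples.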

> It still seems to me that vacuum is unlikely to be a productive member
> of a syncscan herd --- it just isn't going to have similar scan-speed
> behavior to typical queries.

That's my thinking too. Our general direction has been toward reducing
vacuum's i/o bandwidth requirements, not worrying about making it run as fast
as possible.

That said, if it happened to latch on to a sync scan herd it would have very
few cache misses, which would cause it to rack up very few vacuum cost delay
points. Perhaps the vacuum cost for a cache hit (vacuum_cost_page_hit) ought
to be 0?
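
A toy model of the cost accounting, to show why cache hits matter here. It is
only a sketch: the function name is made up, the dirty-page cost is ignored,
and the default values (hit = 1, miss = 10, limit = 200) are what I believe
the stock settings to be, so treat them as assumptions.

```python
def count_naps(page_events, cost_hit=1, cost_miss=10, cost_limit=200):
    """Count how many times vacuum would sleep while reading pages.

    page_events is an iterable of "hit" or "miss". Each page adds its
    cost to a running balance; when the balance reaches cost_limit,
    vacuum naps once and the balance is reset.
    """
    balance = 0
    naps = 0
    for event in page_events:
        balance += cost_hit if event == "hit" else cost_miss
        if balance >= cost_limit:
            naps += 1
            balance = 0
    return naps


if __name__ == "__main__":
    pages = 1000
    print(count_naps(["miss"] * pages))             # scanning cold pages
    print(count_naps(["hit"] * pages))              # riding a syncscan herd
    print(count_naps(["hit"] * pages, cost_hit=0))  # hits cost nothing
```

With these numbers a vacuum that only sees cache hits naps a tenth as often as
one that misses on every page, and never naps at all if a hit costs 0.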

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
