Re: [PERFORM] More detail on settings for pgavd?

From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: Shridhar Daithankar <shridhar_daithankar(at)myrealbox(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] More detail on settings for pgavd?
Date: 2003-11-20 14:30:43
Message-ID: 3FBCD013.3030402@zeut.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Shridhar Daithankar wrote:

> Josh Berkus wrote:
>
>> Shridhar,
>
> >>However I do not agree with this logic entirely. It pegs the next
> vacuum
>
>>> w.r.t current table size which is not always a good thing.
>>
Ok, what do you recommend? The point of two separate variables allows
you to specify if you want vacuum based on a fixed number, based on
table size or something inbetween.

>>
>> No, I think the logic's fine, it's the numbers which are wrong. We
>> want to vacuum when updates reach between 5% and 15% of total rows.
>> NOT when updates reach 110% of total rows ... that's much too late.
>
For small tables, you don't need to vacuum too often. In the testing I
did a small table ~100 rows, didn't really show significant performance
degredation until it had close to 1000 updates. For large tables,
vacuum is so expensive, that you don't want to do it very often, and
scanning the whole table when there is only 5% wasted space is not very
helpful.

>> Hmmm ... I also think the threshold level needs to be lowered; I
>> guess the purpose was to prevent continuous re-vacuuuming of small
>> tables? Unfortunately, in the current implementation, the result is
>> tha small tables never get vacuumed at all.
>>
>> So for defaults, I would peg -V at 0.1 and -v at 100, so our default
>> calculation for a table with 10,000 rows is:
>>
>> 100 + ( 0.1 * 10,000 ) = 1100 rows.
>
Yes, the I set the defaults a little high perhaps so as to err on the
side of caution. I didn't want people to say pg_autovacuum kills the
performance of my server. A small table will get vacuumed, just not
until it has reached the threshold. So a table with 100 rows, will get
vacuumed after 1200 updates / deletes. In my testing it showed that
there was no major performance problems until you reached several
thousand updates / deletes.

>>> Furthermore analyze threshold depends upon inserts+updates. I think it
>>> should also depends upon deletes for obvious reasons.
>>
>> Yes. Vacuum threshold is counting deletes, I hope?
>
> It does.
>
>> My comment about the frequency of vacuums vs. analyze is that
>> currently the *default* is to analyze twice as often as you
>> vacuum. Based on my experiece as a PG admin on a variety of
>> databases, I believe that the default should be to analyze half as
>> often as you vacuum.
>
HUH? analyze is very very cheap compared to vacuum. Why not do it more
often?

>>> I am all for experimentation. If you have real life data to play
>>> with, I
>>> can give you some patches to play around.
>>
>> I will have real data very soon .....
>
> I will submit a patch that would account deletes in analyze threshold.
> Since you want to delay the analyze, I would calculate analyze count as

deletes are already accounted for in the analyze threshold.

> I am still wary of inverting vacuum analyze frequency. You think it is
> better to set inverted default rather than documenting it?

I think inverting the vacuum and analyze frequency is wrong.

What I think I am hearing is that people would like very much to be able
to tweak the settings of pg_autovacuum for individual tables / databases
etc. So that you could set certain tables to be vacuumed more
agressivly than others. I agree this would be a good and welcome
addition. I hope have time to work on this at some point, but in the
near future I won't.

Matthew

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-11-20 14:44:17 Re: PG7.4 ordering operator
Previous Message Glenn Wiorek 2003-11-20 14:25:46 Re: question about fixes in v7.4...

Browse pgsql-performance by date

  From Date Subject
Next Message Shridhar Daithankar 2003-11-20 14:59:47 Re: [HACKERS] More detail on settings for pgavd?
Previous Message Shridhar Daithankar 2003-11-20 07:23:25 Re: More detail on settings for pgavd?