Re: More detail on settings for pgavd?

From: Shridhar Daithankar <shridhar_daithankar(at)myrealbox(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: More detail on settings for pgavd?
Date: 2003-11-20 07:23:25
Message-ID: 3FBC6BED.9090809@myrealbox.com
Lists: pgsql-hackers pgsql-performance

Josh Berkus wrote:

> Shridhar,
>>However I do not agree with this logic entirely. It pegs the next vacuum
>>w.r.t current table size which is not always a good thing.
>
>
> No, I think the logic's fine, it's the numbers which are wrong. We want to
> vacuum when updates reach between 5% and 15% of total rows. NOT when
> updates reach 110% of total rows ... that's much too late.

Well, it looks like thresholds below 1 should be the norm rather than the exception.

> Hmmm ... I also think the threshold level needs to be lowered; I guess the
> purpose was to prevent continuous re-vacuuming of small tables?
> Unfortunately, in the current implementation, the result is that small tables
> never get vacuumed at all.
>
> So for defaults, I would peg -V at 0.1 and -v at 100, so our default
> calculation for a table with 10,000 rows is:
>
> 100 + ( 0.1 * 10,000 ) = 1100 rows.

I would say -V in the range 0.2-0.4 could work well too. The point to emphasize
is that thresholds less than 1 should be used.
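
As a rough sketch of the calculation being discussed (the function and
parameter names below are hypothetical, chosen for illustration, and are not
taken from the pg_autovacuum source), the threshold is the base value (-v)
plus the scaling factor (-V) times the table's row count:

```python
def vacuum_threshold(base, scale, rows):
    """Number of changed rows that triggers a vacuum: base + scale * rows.

    base  -- the -v setting (hypothetical parameter name)
    scale -- the -V setting (hypothetical parameter name)
    rows  -- current number of rows in the table
    """
    return base + scale * rows


# Compare the proposed -V 0.1 with the 0.2-0.4 range, for -v 100
# and a table of 10,000 rows.
for scale in (0.1, 0.2, 0.4):
    threshold = vacuum_threshold(100, scale, 10_000)
    print(f"-V {scale}: vacuum after about {threshold:.0f} changed rows")
```

With -v 100 and -V 0.1 this gives the 1100 rows from the calculation quoted
above; any scaling factor below 1 keeps the trigger under the table size.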

>>Furthermore, the analyze threshold depends upon inserts+updates. I think it
>>should also depend upon deletes, for obvious reasons.
> Yes. Vacuum threshold is counting deletes, I hope?

It does.

> My comment about the frequency of vacuums vs. analyze is that currently the
> *default* is to analyze twice as often as you vacuum. Based on my
> experience as a PG admin on a variety of databases, I believe that the default
> should be to analyze half as often as you vacuum.

OK.

>>I am all for experimentation. If you have real life data to play with, I
>>can give you some patches to play around.
> I will have real data very soon .....

I will submit a patch that accounts for deletes in the analyze threshold. Since
you want to delay the analyze, I would calculate the analyze count as

n=updates + inserts *-* deletes

Rather than the current "n = updates + inserts". I will also update the README
with examples and a note on analyze frequency.
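
To make the difference concrete, here is a minimal sketch of the two counts
side by side. The function names and the sample numbers are made up for
illustration; this is not the patch itself.

```python
def analyze_count_current(inserts, updates, deletes):
    """Current behaviour as described above: deletes are ignored."""
    return updates + inserts


def analyze_count_proposed(inserts, updates, deletes):
    """Proposed behaviour: deletes are subtracted, so the count grows more slowly."""
    return updates + inserts - deletes


# Hypothetical activity on one table since the last analyze.
activity = {"inserts": 500, "updates": 300, "deletes": 200}

print("current n: ", analyze_count_current(**activity))
print("proposed n:", analyze_count_proposed(**activity))
```

For the same activity the proposed count is lower, so the analyze threshold is
reached later.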

What do the statistics gather, BTW? Just the number of rows, or something else
as well? I think I will ask that on Hackers separately.

I am still wary of inverting the vacuum/analyze frequency. Do you think it is
better to set an inverted default rather than just documenting it?

Shridhar
