Re: PostgreSQL 8.2.3 VACUUM Timings/Performance

From: "Bruce McAlister" <bruce(dot)mcalister(at)blueface(dot)ie>
To: pgsql-general(at)postgresql(dot)org pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL 8.2.3 VACUUM Timings/Performance
Date: 2007-03-16 19:06:57
Message-ID: etepst$18fc$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

Hi All,

Okay, I'm getting a little further now. I'm about to create entries in the
pg_autovacuum system tables. However, I'm a little confused as to how I go
about finding out the OID value of the tables. The pg_autovacuum table
requires the OID of the table you want to create settings for (vacrelid).
Can anyone shed some light on how I can extract the OID of the table? Also,
what happens if you create a table without OID's, are you still able to add
it's details in the pg_autovacuum table if there is no OID associated with a
table?

Name Type References Description
vacrelid oid pg_class.oid The table this entry is for
enabled bool If false, this table is never autovacuumed
vac_base_thresh integer Minimum number of modified tuples before
vacuum
vac_scale_factor float4 Multiplier for reltuples to add to
vac_base_thresh
anl_base_thresh integer Minimum number of modified tuples before
analyze
anl_scale_factor float4 Multiplier for reltuples to add to
anl_base_thresh
vac_cost_delay integer Custom vacuum_cost_delay parameter
vac_cost_limit integer Custom vacuum_cost_limit parameter
freeze_min_age integer Custom vacuum_freeze_min_age parameter
freeze_max_age integer Custom autovacuum_freeze_max_age parameter

Thanks
Bruce

"Bruce McAlister" <bruce(dot)mcalister(at)blueface(dot)ie> wrote in message
news:esrdsb$1klg$1(at)news(dot)hub(dot)org(dot)(dot)(dot)
> Hi Tom,
>
> Thanks for the suggestion. It's been a while since I replied to this as I
> had to go and do some further investigation of the docs with regards the
> autovacuum daemons configuration. According to the documentation, the
> formula's for the vacuum and analyze are as follows:
>
> Vacuum
> vacuum threshold = vacuum base threshold + vacuum scale factor * number
> of tuples
> Analyze
> analyze threshold = analyze base threshold + analyze scale factor *
> number of tuples
>
> My current settings for autovacuum are as follows:
>
> # - Cost-Based Vacuum Delay -
>
> vacuum_cost_delay = 200 # 0-1000 milliseconds
> vacuum_cost_page_hit = 1 # 0-10000 credits
> vacuum_cost_page_miss = 10 # 0-10000 credits
> vacuum_cost_page_dirty = 20 # 0-10000 credits
> vacuum_cost_limit = 200 # 0-10000 credits
>
> #---------------------------------------------------------------------------
> # AUTOVACUUM PARAMETERS
> #---------------------------------------------------------------------------
>
> autovacuum = on #
> enable autovacuum subprocess?
>
> # 'on' requires stats_start_collector
>
> # and stats_row_level to also be on
> autovacuum_naptime = 1min # time
> between autovacuum runs
> autovacuum_vacuum_threshold = 500 # min # of tuple
> updates before
>
> # vacuum
> autovacuum_analyze_threshold = 250 # min # of
> tuple updates before
>
> # analyze
> autovacuum_vacuum_scale_factor = 0.2 # fraction of rel
> size before
>
> # vacuum
> autovacuum_analyze_scale_factor = 0.1 # fraction of
> rel size before
>
> # analyze
> autovacuum_freeze_max_age = 200000000 # maximum XID age
> before forced vacuum
>
> # (change requires restart)
> autovacuum_vacuum_cost_delay = -1 # default vacuum
> cost delay for
>
> # autovacuum, -1 means use
>
> # vacuum_cost_delay
> autovacuum_vacuum_cost_limit = -1 # default
> vacuum cost limit for
>
> # autovacuum, -1 means use
>
> # vacuum_cost_limit
>
> Thus to make the autovacuum more aggressive I am thinking along the lines
> of changing the following parameters:
>
> autovacuum_vacuum_threshold = 250
> autovacuum_analyze_threshold = 125
>
> The documentation also mentions that when the autovacuum runs it selects a
> single database to process on that run. This means that the particular
> table that we are interrested in will only be vacuumed once every 17
> minutes, assuming we have 18 databases and the selection process is
> sequential through the database list.
>
> From my understanding of the documentation, the only way to work around
> this issue is to manually update the system catalog table pg_autovacuum
> and set the pg_autovacuum.enabled field to false to skip the autovacuum on
> tables that dont require such frequent vacuums. If I do enable this
> feature, and I manually run a vacuumdb from the command line against that
> particular disabled table, will the vacuum still process the table? I'm
> assuming too, that the best tables to disable autovacuum on will be ones
> with a minimal amount of update/delete queries run against it. For
> example, if we have a table that only has inserts applied to it, it is
> safe to assume that that table can safely be ignored by autovacuum.
>
> Do you have any other suggestions as to which tables generally can be
> excluded from the autovacuum based on the usage patterns?
> Can you see anything with respect to my new autovacuum parameters that may
> cause issue's and are there any other parameters that you suggest I need
> to change to make the autovacuum daemon more aggressive?
>
> PS: Currently we have the Cluster command running on the sipaccounts table
> as the vacuum full is taking too long. It would be nice though to have
> some piece of mind that the cluster command is mvcc safe, as Heikki and
> Aidan have mentioned that it is not and may break things in our
> environment, I'm a little afraid of running with the cluster command, and
> should possibly go back to the vacuum full :/
>
> Thanks all for any and all suggestions/comments.
>
> Thanks
> Bruce
>
>
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in message
> news:19106(dot)1173111068(at)sss(dot)pgh(dot)pa(dot)us(dot)(dot)(dot)
>> "Bruce McAlister" <bruce(dot)mcalister(at)blueface(dot)ie> writes:
>>> [1] AutoVacuum runs during the day over the entire PostgreSQL cluster,
>>
>> Good, but evidently you need to make it more aggressive.
>>
>>> [2] A Vacuum Full Verbose is run during our least busy period (generally
>>> 03:30) against the Database,
>>
>>> [3] A Re-Index on the table is performed,
>>
>>> [4] A Cluster on the table is performed against the most used index,
>>
>>> [5] A Vacuum Analyze Verbose is run against the database.
>>
>> That is enormous overkill. Steps 2 and 3 are a 100% waste of time if
>> you are going to cluster in step 4. Just do the CLUSTER and then
>> ANALYZE (or VACUUM ANALYZE if you really must, but the value is
>> marginal).
>>
>> regards, tom lane
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 3: Have you checked our extensive FAQ?
>>
>> http://www.postgresql.org/docs/faq
>>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Joshua D. Drake 2007-03-16 19:17:37 Re: planning issue
Previous Message Jonathan Vanasco 2007-03-16 18:31:49 planning issue

Browse pgsql-performance by date

  From Date Subject
Next Message Erik Jones 2007-03-16 19:22:04 Re: PostgreSQL 8.2.3 VACUUM Timings/Performance
Previous Message Carlos Moreno 2007-03-16 18:13:31 Re: Postgres batch write very slow - what to do