Re: optimizer behavior in the case of highly updated tables

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Mark Rostron <mrostron(at)ql2(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: optimizer behavior in the case of highly updated tables
Date: 2010-06-09 06:04:04
Message-ID: 4C0F2ED4.2080102@2ndquadrant.com
Lists: pgsql-admin

Mark Rostron wrote:
>
> - It was necessary to take a site outage and perform a “vacuum full
> analyze” on the table
>
> - Following this, the query plan reverted to the more efficient btree
> lookup
>
> Clearly, the garbage buildup resulting from transaction activity on
> the table is the villain here.
>
> - Is it possible to calculate expected space usage given row count and
> average row size
>
> - At what point might the ratio of “expected”/“actual” space usage be
> able to indicate the need to perform “full vacuum”, or similar maintenance
>

I think you're right to focus on this part, because with your usage
pattern--deleting all old data constantly--you have to get this under
control in order for the query planner to do the right thing here.

Start by reading http://wiki.postgresql.org/wiki/VACUUM_FULL

What you probably should have done in order to return to sane query
plans was to run CLUSTER instead of VACUUM FULL. The site would have
been down for less time, and you've actually made some future problems
a bit worse, because VACUUM FULL tends to leave your indexes more
bloated than they were before.
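
For reference, a minimal sketch of the CLUSTER approach; the table and
index names here are invented for illustration:

-- hypothetical table/index names
CLUSTER history USING history_pkey;
ANALYZE history;

CLUSTER rewrites the table in index order and rebuilds all of its
indexes from scratch, so you end up with compact indexes instead of the
bloated ones VACUUM FULL leaves behind.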

As for measuring what's going wrong here, what you want to do is run
this around once a week during a slow period and save the output for
analysis:

VACUUM VERBOSE;

This will give you a bunch of statistics about space usage, including a
summary at the end that will tell you whether there's a serious problem
you might already be running into (running out of space in the free
space map).
Combine that with a look at pg_stat_user_tables and you should have a
decent initial statistical picture of what's going on with your data.
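
For the pg_stat_user_tables side, a query along these lines (just one
reasonable way to slice it) shows which tables are accumulating dead
rows:

SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

Tables where n_dead_tup stays large relative to n_live_tup between
vacuum runs are the ones headed for the same problem you just hit.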

There are two large scale things you may need to consider if volume on
your site expands in the future:

1) Deleting the historical data in smaller chunks and doing an immediate
VACUUM afterwards, perhaps as often as daily. This keeps the amount of
dead space VACUUM has to clean up as small as possible, at the expense
of higher site overhead. See the first sketch below.

2) Start partitioning your tables. This allows you to just DROP old
partitions rather than deleting rows from a single master table. It can
make this whole class of problem go away; see the second sketch below.
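
For (1), a sketch of what the daily job might look like, assuming a
table named history with a timestamp column created_at (both names
invented for this example):

-- hypothetical names; adjust the retention window to taste
DELETE FROM history WHERE created_at < now() - interval '30 days';
VACUUM ANALYZE history;

Run daily, each pass only generates about one day's worth of dead rows
for VACUUM to reclaim.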
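For (2), with the inheritance-based partitioning available in current
versions, the setup looks roughly like this, again using the
hypothetical history table with one child table per month (insert
routing via trigger or rule is omitted here):

-- one child partition per month
CREATE TABLE history_2010_05 (
    CHECK (created_at >= DATE '2010-05-01'
       AND created_at <  DATE '2010-06-01')
) INHERITS (history);

-- when that month expires, just drop the whole partition:
DROP TABLE history_2010_05;

DROP TABLE simply removes the underlying files, so expiring old data
leaves no dead rows behind and gives VACUUM nothing to do.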

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us
