optimizer behavior in the case of highly updated tables

From: Mark Rostron <mrostron(at)ql2(dot)com>
To: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: optimizer behavior in the case of highly updated tables
Date: 2010-06-09 02:58:34
Message-ID: FD020D3E50E7FA479567872E5F5F31E304599C0B41@ex01.corp.ql2.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

I am looking for some specific information regarding optimizer behavior.
We recently experienced a situation where a query that was previously using a btree lookup (efficient) SWITCHED to using seqscan/hash lookup.

My questions would be:

- Under what circumstances is the optimizer likely to CHANGE behavior from using a btree index lookup to using a seq scan/hash lookup?

- What are the critical decision factors that would feed into the optimizer making such a change?

- Is it possible to measure any metrics in a way that would enable a prediction of such a change?

Platform

- 8.3.10 (64bit) on RHEL5.

- Linux xxxxx 2.6.18-164.10.1.el5xen #1 SMP Thu Jan 7 20:28:30 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

Application
The table in question is:

- 30m rows, variable length (contains varchar fields), rowlength avg about 120B

- Approx. 3m unique values in the index column

Activity on the table would be, per row: "Insert, multiple updates, delete after 90 days"

We vacuum analyze this table once/weekly.
No partitions are used.

Our experience which prompts this question was as follows:

- If the table is not "vacuum analyze'd" at least once/week, the query plans become unpredictable as to whether they will use btree or seqscan/hash lookup

- Until last week, "vacuum analyze" was sufficient

- Friday evening of last week, the query plan for selected queries against this index changed again, but "vacuum analyze" was insufficient

- Rebuilding index on primary key and on the column index was insufficient

- It was necessary to take a site outage and perform a "vacuum full analyze" on the table

- Following this, the query plan reverted to the more efficient btree lookup

Clearly, the garbage buildup resulting from transaction activity on the table is the villain here.

- Is it possible to calculate expected space usage given row count and average row size

- At what point might the ratio of "expected"/"actual" space usage be able to indicate the need to perform "full vacuum", or similar maintenance

Any observations/comments that anyone would care to make are welcome.
Thanks in advance for your time
Mr

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Greg Smith 2010-06-09 05:14:33 Re: Details about pg_stat_bgwriter
Previous Message Thomas Kellerer 2010-06-08 17:46:30 Re: Details about pg_stat_bgwriter