After VACUUM, statistics become skewed

From: Robert(dot)Farrugia(at)go(dot)com(dot)mt
To: pgsql-admin(at)postgresql(dot)org
Subject: After VACUUM, statistics become skewed
Date: 2003-05-21 15:00:21
Message-ID: 20030521150018.0CBB1924EC2@developer.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

I have noticed the following problem on various large tables for certain
queries.

Given a table around 4GB in size containing millions of records, before
vacuuming the following query used the correct index
(mo_200302_called_idx) which is built on answertime and callednumber_type.

Query is as follows:
select answertime::date, count(*), sum(callduration) from
mobileorig_200302 where answertime between '2003/2/3 00:00:00' and
'2003/2/14 23:59:59' and normal(dialleddigits_value) = '50043992' and
callednumber_type in ('P', 'M') group by answertime::date

After doing a database wide vacuum full analyze, the above query is
starting to use an incorrect index i.e. the callingnumber_type (which uses
answertime and callingnumber_type fields). This eventually slows down the
system since the query takes more time to finish.

NOTICE: QUERY PLAN:

Aggregate (cost=9218923.08..9218935.56 rows=166 width=12)
-> Group (cost=9218923.08..9218927.24 rows=1664 width=12)
-> Sort (cost=9218923.08..9218923.08 rows=1664 width=12)
-> Index Scan using mo_200302_calling_idx on
mobileorig_200302 (cost=0.00..9218834.06 rows=1664 width=12)

EXPLAIN

I managed to track down the problem to this. If I used only one
callednumber_type, i.e. the query becomes
select answertime::date, count(*), sum(callduration) from
mobileorig_200302 where answertime between '2003/2/3 00:00:00' and
'2003/2/14 23:59:59' and normal(dialleddigits_value) = '50043992' and callednumber_type in ('M') group by answertime::date

the query planner uses the correct index, while using more than one type,
it skews up.

NOTICE: QUERY PLAN:
Aggregate (cost=137870.25..137871.57 rows=18 width=12)
-> Group (cost=137870.25..137870.69 rows=177 width=12)
-> Sort (cost=137870.25..137870.25 rows=177 width=12)
-> Index Scan using mo_200302_called_idx on
mobileorig_200302 (cost=0.00..137863.66 rows=177 width=12)
EXPLAIN

Anyone else encountered something similar ? Any ideas on what is happening
and if is solvable ? Usually by dropping the table and reloading it from
disk may solve the problem (it may take more than one try), but this is
becoming unpractical due to the amount of data to restore each time. I'll
try re-indexing the tables and see if something happens. Unfortunately
this also happens on tables which have not been changed for a very long
time and have also been re-indexed so I'm not confident on this.

The system runs on postgres 7.2.3.

Regards
Robert

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Tom Lane 2003-05-21 16:51:06 Re: After VACUUM, statistics become skewed
Previous Message Tom Lane 2003-05-21 14:25:42 Re: Logfile removal