Skip site navigation (1) Skip section navigation (2)

Auto ANALYZE criteria

From: Joe Miller <joe(dot)d(dot)miller(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Auto ANALYZE criteria
Date: 2010-09-20 20:38:37
Message-ID: AANLkTi=W02Lcn63qqxoP9jjQ0CJ8wZfCJkKV7fry7CP=@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
The autovacuum daemon currently uses the number of inserted and
updated tuples to determine if it should run VACUUM ANALYZE on a
table.  Why doesn’t it consider deleted tuples as well?

For example, I have a table which gets initially loaded with several
million records. A batch process grabs the records 100 at a time, does
some processing and deletes them from the table in the order of the
primary key.  Eventually, performance degrades because an autoanalyze
is never run.  The planner decides that it should do a sequential scan
instead of an index scan because the stats don't reflect reality.  See
example below.

I can set up a cron job to run the ANALYZE manually, but it seems like
the autovacuum daemon should be smart enough to figure this out on its
own.  Deletes can have as big an impact on the stats as inserts and
updates.

Joe Miller

---------------------------

testdb=# \d test
     Table "public.test"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 data   | bytea   |
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)

testdb=# insert into public.test select s.a, gen_random_bytes(256)
from generate_series(1,10000000) as s(a);
INSERT 0 10000000

testdb=# SELECT *
FROM pg_stat_all_tables
WHERE schemaname='public' AND relname='test';
  relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan |
idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd |
n_live_tup | n_dead_tup | last_vacuum | last_autovacuum | last_analyze
| last_autoanalyze
---------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------+-----------------+--------------+------------------
 5608158 | public     | test    |        1 |            0 |        0 |
            0 |  10000000 |         0 |         0 |             0 |
      0 |          0 |             |                 |              |
2010-09-20 10:46:37.283775-04
(1 row)


testdb=# explain analyze delete from public.test where id <= 100;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using test_pkey on test  (cost=0.00..71.63 rows=1000
width=6) (actual time=13.251..22.916 rows=100 loops=1)
   Index Cond: (id <= 100)
 Total runtime: 23.271 ms
(3 rows)

{ delete records ad nauseum }

testdb=# explain analyze delete from public.test where id <= 7978800;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..410106.17 rows=2538412 width=6) (actual
time=48771.772..49681.562 rows=100 loops=1)
   Filter: (id <= 7978800)
 Total runtime: 49682.006 ms
(3 rows)

testdb=# SELECT *
FROM pg_stat_all_tables
WHERE schemaname='public' AND relname='test';
  relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan |
idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd |
n_live_tup | n_dead_tup | last_vacuum |        last_autovacuum
| last_analyze |       last_autoanalyze
---------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------+-------------------------------+--------------+-------------------------------
 5608158 | public     | test    |        1 |            0 |    54345 |
      5433206 |  10000000 |         0 |   5433200 |             0 |
5459506 |     725300 |             | 2010-09-20 14:45:54.757611-04 |
           | 2010-09-20 10:46:37.283775-04

Responses

pgsql-performance by date

Next:From: George SextonDate: 2010-09-20 21:54:32
Subject: Re: 3ware trivia overload
Previous:From: Josh KupershmidtDate: 2010-09-20 19:24:22
Subject: Re: cleanup on pg_ system tables?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group