Re: PostgreSQL slow after VACUUM

From: Pierre-Frédéric Caillaud <lists(at)boutiquenumerique(dot)com>
To: "PgSQL General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL slow after VACUUM
Date: 2004-11-26 13:00:48
Message-ID: opsh2o3mtacq72hf@musicbox
Lists: pgsql-general


It seems this issue has been mentioned several times lately...
I'd like to propose something to avoid it:

* When TRUNCATE'ing a table:
- the row count is reset to 0 (of course!)
- however, the column stats are kept, on the basis that the data which
will be inserted later into the table will most likely have at least some
correlation to the data that was in it before. Of course, VACUUM ANALYZE
will update the statistics with real ones when it is run.
This would make TRUNCATE behave more like DELETE FROM...
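For context, here is a sketch of how one can inspect what the planner currently knows about a table (the table name is illustrative; exact behavior of TRUNCATE on these values may vary by version):

```sql
-- Row-count and page-count estimates the planner uses:
SELECT relpages, reltuples
FROM pg_class
WHERE relname = 'mytable';

-- Per-column statistics gathered by the last ANALYZE:
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'mytable';
```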

* When INSERT'ing into a table:
The OP's problem was that the query was planned as if the table were almost
empty (hence the seq scan); it was indeed empty at planning time, but many
rows were then inserted. Of course, the query can't be replanned in the
middle of its execution (although that would be a nice feature to add to the
wishlist for postgres version 2020)...
When planning an INSERT, the planner currently uses the estimated
number of rows in the target table, taken from the statistics.
However, at that time, the planner already has an estimate of how many rows
it's going to insert:
- INSERT ... SELECT, or CREATE TABLE ... AS SELECT:
The SELECT from which the rows will come is planned, so there
should be an estimated number of rows readily available.
- COPY FROM...
The number of rows could be estimated by peeking at the first N
rows in the file (not on COPY FROM stdin, though), or an additional
parameter to COPY could be given.

Then, the INSERT can be planned as if the target table contained
(estimated number of rows in the target table) + (estimated number of rows
to be inserted), and it will choose a good plan for populating empty
tables...
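To make the failure mode concrete, here is a sketch of the scenario the proposal addresses, with the workaround available today (table and column names are made up for illustration):

```sql
TRUNCATE orders;                        -- planner now believes the table is empty
INSERT INTO orders
  SELECT * FROM orders_staging;         -- but millions of rows arrive right after

-- A query planned against the stale "empty" statistics will favor a seq scan:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Until the planner accounts for the incoming rows itself, the fix is explicit:
ANALYZE orders;
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;  -- index scan now considered
```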

What do you think ?

>
> The problem is, that while doing the vacuum full ANALYZE the table was
> empty. It therefore gathered statistics of a situation which isn't there
> anymore when you fill up the table. In an empty or small table, it is
> normal to do sequential scans. Which you most of the time don't want in
> a large table.
> My suggestion is to VACUUM (FULL) the table after you've deleted the
> data. Then fill up the table and do a ANALYZE when you're done filling
> it.
>
> That way, the analysis of the data will be much more accurate. My guess
> is, it'll use indexes much sooner and be much faster. Have a look at the
> output of EXPLAIN ANALYZE yourstatement; before doing ANALYZE and after.
>
> Best regards,
>
> Arjen
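For reference, the sequence Arjen describes would look something like this (table name illustrative):

```sql
DELETE FROM mytable;
VACUUM FULL mytable;        -- reclaim space while the table is empty

-- ... bulk-load the new data here (COPY, INSERT ... SELECT, etc.) ...

ANALYZE mytable;            -- gather statistics on the now-full table

-- Compare the plan before and after the ANALYZE:
EXPLAIN ANALYZE SELECT * FROM mytable WHERE id = 1;
```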
