Re: VACUUM and ANALYZE Follow-Up

From: Pierre-Frédéric Caillaud <lists(at)boutiquenumerique(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: VACUUM and ANALYZE Follow-Up
Date: 2004-11-30 01:05:04
Message-ID: opsh86mqz3cq72hf@musicbox
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> 4. Isn't ANALYZE on a totally empty table really a special case? The
> presumption should be that the table will not remain empty. To optimize
> the performance assuming that there will be zero (or close to zero) rows
> seems somewhat pointless. However, there are valid reasons why a table
> might be empty at the moment in time when the ANALYZE is run. (In our
> case, we use "work" tables that get cleared at the end of an application
> process.) And, as mentioned above, it is easier to VACUUM ANALYZE an
> entire database than it is to list tables individually.

Well, for instance I have a few tables which contain just a few rows, for
instance a list of states in which an object in another table may be, or a
list of tax rates... for these kinds of tables with like, 10 rows, or just
a few pages, you don't want index scans, so VACUUM and ANALYZE are doing
their job.

If you were going to insert 5 rows in an empty table, you would also want
this behaviour. The problems start when you make a large INSERT in an
empty or almost empty table.

So, how to solve your problem without slowing the other requests (ie.
selecting and inserting a few rows into a very small table) ?

Nobody responded to my suggestion that the planner take into account the
number of rows to be inserted into the table in its plan... so I'll repost
it :
- INSERT ... SELECT :
Planner has an estimate of how many rows the SELECT will yield. So it
could plan the queries involving SELECTs on the target table (like, UNIQUE
checks et al) using the number of rows in the table + number of rows to be
inserted. This solves your problem.

Problems with this approach :
- This only gives a number of rows, not more precise statistics
It's the only information available so why not use it ? And it's enough
to solve the OP's problem.

- Can get recursive
What if there is a self-join ? I guess, just fall back to the previous
behaviour...

- Does not work for COPY
argument : COPY should act like it's going to insert many rows. Most of
the time, that's how it's used.

- When the estimated number of rows to insert is imprecise
(for instance a SELECT with UNION's or DISTINCT or a huge join), the
outcome would be incertain.

What do you think ?

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Todd P Marek 2004-11-30 01:35:57 SQL Server stored procedures?
Previous Message Joshua D. Drake 2004-11-30 01:02:21 Re: VACUUM and ANALYZE Follow-Up