Skip site navigation (1) Skip section navigation (2)

Re: VACUUM and ANALYZE Follow-Up

From: Pierre-Frédéric Caillaud <lists(at)boutiquenumerique(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: VACUUM and ANALYZE Follow-Up
Date: 2004-11-30 01:05:04
Message-ID: opsh86mqz3cq72hf@musicbox (view raw or flat)
Thread:
Lists: pgsql-general
> 4. Isn't ANALYZE on a totally empty table really a special case?  The
> presumption should be that the table will not remain empty.  To optimize
> the performance assuming that there will be zero (or close to zero) rows
> seems somewhat pointless.  However, there are valid reasons why a table
> might be empty at the moment in time when the ANALYZE is run.  (In our
> case, we use "work" tables that get cleared at the end of an application
> process.)  And, as mentioned above, it is easier to VACUUM ANALYZE an
> entire database than it is to list tables individually.

	Well, for instance I have a few tables which contain just a few rows, for  
instance a list of states in which an object in another table may be, or a  
list of tax rates... for these kinds of tables with like, 10 rows, or just  
a few pages, you don't want index scans, so VACUUM and ANALYZE are doing  
their job.

	If you were going to insert 5 rows in an empty table, you would also want  
this behaviour. The problems start when you make a large INSERT in an  
empty or almost empty table.

	So, how to solve your problem without slowing the other requests (ie.  
selecting and inserting a few rows into a very small table) ?

	Nobody responded to my suggestion that the planner take into account the  
number of rows to be inserted into the table in its plan... so I'll repost  
it :
	- INSERT ... SELECT :
	Planner has an estimate of how many rows the SELECT will yield. So it  
could plan the queries involving SELECTs on the target table (like, UNIQUE  
checks et al) using the number of rows in the table + number of rows to be  
inserted. This solves your problem.

	Problems with this approach :
	- This only gives a number of rows, not more precise statistics
	It's the only information available so why not use it ? And it's enough  
to solve the OP's problem.
	
	- Can get recursive
	What if there is a self-join ? I guess, just fall back to the previous  
behaviour...

	- Does not work for COPY
	argument : COPY should act like it's going to insert many rows. Most of  
the time, that's how it's used.

	- When the estimated number of rows to insert is imprecise
	(for instance a SELECT with UNION's or DISTINCT or a huge join), the  
outcome would be incertain.

	What do you think ?








In response to

pgsql-general by date

Next:From: Todd P MarekDate: 2004-11-30 01:35:57
Subject: SQL Server stored procedures?
Previous:From: Joshua D. DrakeDate: 2004-11-30 01:02:21
Subject: Re: VACUUM and ANALYZE Follow-Up

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group