Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> My suggested hack for PostgreSQL is to have an option to *not* sample,
> just to scan the whole table and find n_distinct accurately.
> What price a single scan of a table, however large, when incorrect
> statistics could force scans and sorts to occur when they aren't
> actually needed?
It's not just the scan --- you also have to sort, or something like
that, if you want to count distinct values exactly. I doubt anyone is
really going to consider this a feasible answer for large tables.
regards, tom lane
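[Editorial illustration, not part of the original message: the cost Tom describes is that exact n_distinct requires touching every row and then sorting (or hashing) so equal values land adjacent. A minimal sketch of the sort-based approach, in Python for clarity; PostgreSQL's own executor does this differently.]

```python
def count_distinct_by_sort(values):
    """Exact distinct count via sort-then-scan.

    Cost: a full pass over the data plus an O(n log n) sort --
    the expense referred to in the message above. Sampling avoids
    this but can misestimate n_distinct badly.
    """
    ordered = sorted(values)
    if not ordered:
        return 0
    distinct = 1
    # After sorting, duplicates are adjacent, so one linear pass
    # counting value changes yields the exact distinct count.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur != prev:
            distinct += 1
    return distinct

print(count_distinct_by_sort([3, 1, 2, 3, 2, 2]))  # -> 3
```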