From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: default_statistics_target WAS: max_wal_senders must die
Date: 2010-10-20 22:15:15
Message-ID: 4CBF69F3.2070003@agliodbs.com
Lists: pgsql-hackers
>> Maybe what should be done about this is to have separate sizes for the
>> MCV list and the histogram, where the MCV list is automatically sized
>> during ANALYZE.
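To make the quoted idea concrete, here's a toy sketch (in Python, not the backend's C) of sizing the MCV list automatically from the sample itself. The keep-it-if-it's-1.25x-above-average cutoff is an invented illustration, not what analyze.c actually does:

```python
from collections import Counter

def auto_size_mcv(sample, max_mcv=100):
    """Keep a value in the MCV list only if its sample frequency is
    noticeably above the average frequency of the distinct values seen.
    The 1.25x threshold is an illustrative choice, not PostgreSQL's rule."""
    n = len(sample)
    counts = Counter(sample)
    avg = n / len(counts)  # expected count if all distinct values were equally common
    return [(v, c) for v, c in counts.most_common(max_mcv) if c > 1.25 * avg]

# A skewed sample: one very common value, one fairly common, many rare ones
sample = ["a"] * 50 + ["b"] * 20 + list("cdefghij")
print(auto_size_mcv(sample))  # -> [('a', 50), ('b', 20)]
```

With a uniform sample, nothing clears the cutoff and the MCV list collapses to empty, which is exactly the "size it automatically" behavior the quote describes.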
It's been suggested multiple times that we should base our sample size
on a percentage of the table, or at least offer that as an option. I've
pointed out (with math; Simon wrote a prototype) that doing block-based
sampling instead of random-row sampling would allow us to collect, say,
2% of a very large table without more I/O than we're doing now.
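As a back-of-the-envelope illustration (the table shape and row counts here are invented, not Simon's actual math): with ~100 rows per block, sampling 30,000 individual rows touches close to 30,000 distinct blocks anyway, so reading those same blocks whole yields roughly 100x the rows for the same I/O:

```python
import random

def blocks_touched_row_sampling(n_blocks, rows_per_block, sample_rows, seed=0):
    """Distinct blocks read when sampling individual rows uniformly at random."""
    rng = random.Random(seed)
    total_rows = n_blocks * rows_per_block
    rows = rng.sample(range(total_rows), sample_rows)
    return len({r // rows_per_block for r in rows})

n_blocks, rows_per_block = 1_000_000, 100   # ~100M-row table
sample_rows = 30_000                        # a typical row-sample size

io_blocks = blocks_touched_row_sampling(n_blocks, rows_per_block, sample_rows)
# Block-based sampling with the same I/O budget keeps every row in those blocks:
rows_from_blocks = io_blocks * rows_per_block
print(io_blocks, rows_from_blocks, rows_from_blocks / (n_blocks * rows_per_block))
```

The last figure printed is the fraction of the table captured (around 3% here), for the same number of blocks read as the scattered 30,000-row sample.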
Nathan Boley has also shown that we could get tremendously better
estimates without additional sampling if our statistics collector
recognized common patterns such as normal, linear and geometric
distributions. Right now our whole stats system assumes a completely
random distribution.
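A hypothetical sketch of that kind of pattern recognition: check whether the sorted sample values fall on a straight line from min to max; if they do, two endpoints describe the column better than a fixed-size histogram. The fit test and the 5% tolerance are invented for illustration, not Nathan's method:

```python
import statistics

def looks_linear(values, tolerance=0.05):
    """Compare each sorted value to the straight line from min to max;
    declare a linear (uniform) distribution if the mean relative
    deviation is small.  Threshold is an illustrative choice."""
    xs = sorted(values)
    n = len(xs)
    lo, hi = xs[0], xs[-1]
    span = hi - lo or 1
    errors = [abs(x - (lo + span * i / (n - 1))) / span for i, x in enumerate(xs)]
    return statistics.fmean(errors) < tolerance

print(looks_linear(range(0, 1000, 7)))            # evenly spaced -> True
print(looks_linear([2 ** i for i in range(20)]))  # geometric -> False
```

Analogous cheap tests could flag normal or geometric shapes, letting the planner interpolate from a few fitted parameters instead of assuming nothing about the distribution.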
So, I think we could easily be quite a bit smarter than just increasing
the size of the MCV list, although that might be a nice start.
--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com