Re: default_statistics_target WAS: max_wal_senders must die

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: default_statistics_target WAS: max_wal_senders must die
Date: 2010-10-20 22:15:15
Message-ID: 4CBF69F3.2070003@agliodbs.com
Lists: pgsql-hackers


>> Maybe what should be done about this is to have separate sizes for the
>> MCV list and the histogram, where the MCV list is automatically sized
>> during ANALYZE.

It's been suggested multiple times that we should base our sample size
on a percentage of the table, or at least offer that as an option. I've
pointed out (with math; Simon wrote a prototype) that doing block-based
sampling instead of random-row sampling would let us collect, say, 2% of
a very large table without more I/O than we're doing now.
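To make the I/O argument concrete, here is a back-of-the-envelope sketch (not PostgreSQL code; the rows-per-block and table-size figures are assumptions for illustration) comparing the block reads needed by the two sampling strategies:

```python
# Hypothetical illustration (not PostgreSQL's ANALYZE implementation):
# compare the I/O cost of random-row vs. block-based sampling.

ROWS_PER_BLOCK = 100        # assumed average tuples per 8 KB page
TABLE_BLOCKS = 1_000_000    # a "very large" table: ~100M rows

def blocks_read_row_sampling(sample_rows):
    # Each randomly chosen row almost certainly lands on a distinct block,
    # so the sample size is roughly the number of block reads.
    return min(sample_rows, TABLE_BLOCKS)

def blocks_read_block_sampling(sample_rows):
    # Reading whole blocks yields ROWS_PER_BLOCK rows per block read.
    return -(-sample_rows // ROWS_PER_BLOCK)  # ceiling division

# A 30,000-row random sample vs. a 2% (2M-row) block-based sample:
print(blocks_read_row_sampling(30_000))       # 30,000 block reads
print(blocks_read_block_sampling(2_000_000))  # 20,000 block reads
```

Under these assumptions, a 2% block-based sample of 100M rows actually reads fewer blocks than today's much smaller random-row sample, which is the point being made above. (The trade-off, not shown here, is that rows within a block are correlated, so the math has to correct for clustering.)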

Nathan Boley has also shown that we could get tremendously better
estimates without additional sampling if our statistics collector
recognized common patterns such as normal, linear and geometric
distributions. Right now our whole stats system assumes a completely
random distribution.
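As a toy sketch of that idea (my illustration, not Nathan's actual work): if the collector recognized a column as roughly normal, two fitted numbers (mean and stddev) could answer range-selectivity questions that today require a large histogram:

```python
# Hypothetical sketch: fit a normal distribution to a sample, then
# estimate the fraction of rows below a threshold analytically.
import math
import random

random.seed(1)
sample = [random.gauss(500, 50) for _ in range(1000)]

# Fit the two parameters of a normal distribution.
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)
stddev = math.sqrt(var)

def est_fraction_below(threshold):
    # Normal CDF evaluated from the fitted parameters.
    return 0.5 * (1 + math.erf((threshold - mean) / (stddev * math.sqrt(2))))

actual = sum(x < 550 for x in sample) / len(sample)
print(round(est_fraction_below(550), 3), round(actual, 3))
```

The fitted estimate tracks the observed fraction closely without storing any per-bucket data, whereas a uniform-randomness assumption would get this badly wrong for skewed columns.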

So, I think we could easily be quite a bit smarter than just increasing
the size of the MCV list, although that might be a nice start.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
