Re: ANALYZE sampling is too good

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE sampling is too good
Date: 2014-03-09 01:55:18
Message-ID: 20140309015518.GB32380@momjian.us
Lists: pgsql-hackers


I assume we never came up with a TODO from this thread:

---------------------------------------------------------------------------

On Tue, Dec 3, 2013 at 11:30:44PM +0000, Greg Stark wrote:
> At multiple conferences I've heard about people trying all sorts of
> gymnastics to avoid ANALYZE, which they expect to take too long and
> consume too much I/O. This is an especially big complaint after
> upgrades: the new database performs poorly until the new statistics
> are in, and people who did pg_upgrade to avoid extended downtime then
> complain about ANALYZE taking hours.
>
> I always gave the party line that ANALYZE only takes a small
> constant-sized sample, so even very large tables should be very
> quick. But after hearing the same story again at Heroku I looked into
> it a bit further. I was kind of shocked by the numbers.
>
> ANALYZE takes a sample of 300 * statistics_target rows. That sounds
> pretty reasonable, but with default_statistics_target set to 100
> that's 30,000 rows. If I'm reading the code right, it takes this
> sample by sampling 30,000 blocks and then (if the table is large
> enough) taking an average of one row per block. Each block is 8192
> bytes, so that means it's reading 240MB of each table. That's a lot
> more than I realized.
>
> It means that if your table is anywhere up to 240MB, you're
> effectively doing a full table scan and then throwing out nearly all
> the data read.
>
> Worse, my experience with the posix_fadvise benchmarking is that on
> spinning media, reading one out of every 16 blocks takes about the
> same time as reading them all. Presumably this is because the seek
> time between tracks dominates, and reading one out of every 16 blocks
> is still reading every track. So in fact, if your table is up to
> about 3-4G, ANALYZE is still effectively going to do a full table
> scan, at least as far as I/O time goes.
>
> The current algorithm seems like it was designed with a 100G+ table
> in mind, but the consequences for the more common 100M-100G tables
> weren't really considered. Consider what this means for partitioned
> tables. If they partition their terabyte table into 10 partitions,
> ANALYZE will suddenly want to use 10x as much I/O, which seems like a
> perverse consequence.
>
> I'm not sure I have a prescription, but my general feeling is that
> we're spending an awful lot of resources going after a statistically
> valid sample when we could spend far fewer resources and get
> something 90% as good. Or, if we're really going to read that much
> data, we might as well use more of the rows we find.
>
> --
> greg
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
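
For anyone who wants to check the arithmetic in the quoted message, here
is a minimal back-of-the-envelope sketch (plain C, not PostgreSQL source;
the constants simply restate the defaults Greg cites --
default_statistics_target = 100, 8192-byte blocks, roughly one sampled row
per block for a large table -- and the one-block-in-16 spinning-media
observation is taken as an assumption, not a measured fact):

#include <stdio.h>

int
main(void)
{
    const long stats_target = 100;            /* default_statistics_target */
    const long sample_rows = 300 * stats_target;  /* rows ANALYZE samples */
    const long block_size = 8192;             /* block size in bytes */

    /* For a sufficiently large table, roughly one sampled row per block. */
    const long blocks_read = sample_rows;
    const double bytes_read = (double) blocks_read * block_size;

    /*
     * Spinning-media assumption from the message: reading one block in 16
     * costs about as much wall-clock time as reading every block.
     */
    const double full_scan_equiv = bytes_read * 16;

    printf("sample rows:  %ld\n", sample_rows);
    printf("blocks read:  %ld (~%.0f MB)\n",
           blocks_read, bytes_read / 1e6);
    printf("full-scan-equivalent I/O for tables up to ~%.1f GB\n",
           full_scan_equiv / 1e9);
    return 0;
}

Running it prints 30,000 sampled rows, about 246 MB of blocks read, and a
full-scan-equivalent I/O threshold of roughly 3.9 GB, which matches the
240MB and 3-4G figures above.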

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
