Re: default statistics target testing (was: Simple postgresql.conf wizard)

From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
To: "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "Greg Smith" <gsmith(at)gregsmith(dot)com>
Subject: Re: default statistics target testing (was: Simple postgresql.conf wizard)
Date: 2008-12-06 04:52:15
Message-ID: 603c8f070812052052p283a70f1x7b7b80ffeb821c5@mail.gmail.com
Lists: pgsql-hackers

>> > That is interesting. It would also be interesting to total up the time it
>> > takes to run EXPLAIN (without ANALYZE) for a large number of queries.
> I wonder if we'd see anything dramatically different using PREPARE...

Well... the point here is to measure planning time. I would think
that EXPLAIN would be the best way to get that information without
confounding factors.
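For anyone who wants to reproduce this, a loop like the following sketch
is roughly the idea (not my actual harness; the connection string and
query are placeholders, and the only driver calls assumed are psycopg2's
connect/cursor/execute):

import time
import psycopg2  # any client able to run EXPLAIN would do

# Placeholders: point these at the database and queries under test.
DSN = "dbname=test"
QUERIES = ["SELECT * FROM t1 JOIN t2 USING (id) LIMIT 100"]
RUNS = 100

conn = psycopg2.connect(DSN)
cur = conn.cursor()
for q in QUERIES:
    start = time.perf_counter()
    for _ in range(RUNS):
        # EXPLAIN plans the query without executing it, so this loop
        # measures planning time (plus client/server round trips).
        cur.execute("EXPLAIN " + q)
        cur.fetchall()
    per_plan_ms = (time.perf_counter() - start) / RUNS * 1000.0
    print("%.2f ms/plan: %s" % (per_plan_ms, q))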

>> OK, I did this. I actually tried 10..100 in increments of 10 and
>> then 100..1000 in increments of 50, for 7 different queries of
>> varying complexity (but all generally similar, including all of them
>> having LIMIT 100 as is typical for this database). I planned each
>> query 100 times with each default_statistics_target. The results were
>> somewhat underwhelming.
> The one thing this test seems to overlook is at what point we see
> diminishing returns from increasing dst. I think the way to do this would be
> to plot dst setting vs. query time; Robert, do you think you could modify
> your test to measure prepare time and then execute time over a series of
> runs?

I did some previous testing on query #1 where I determined that it
runs just as fast with default_statistics_target=1 (no, that's not a
typo) as with default_statistics_target=1000. The plan is stable down
to values in the 5-7 range; below that it changes, but not appreciably
for the worse. I could test the other queries, but I suspect the
results would be similar because the tables are small and should be
well-modelled even when the MCV and histogram sizes are small. The
point here is to figure out how much we're paying in additional
planning time in the worst-case scenario where the statistics aren't
helping.
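
If you want to plot dst against plan and execute time, something along
these lines would do it (a sketch only: table and query are made up,
EXPLAIN time stands in for prepare time, and note that SET
default_statistics_target is per-session and only takes effect once the
tables are re-analyzed):

import time
import psycopg2

DSN = "dbname=test"                                      # placeholder
QUERY = "SELECT * FROM t1 JOIN t2 USING (id) LIMIT 100"  # placeholder
RUNS = 100
# 10..100 by 10, then 150..1000 by 50, matching the sweep above.
TARGETS = list(range(10, 101, 10)) + list(range(150, 1001, 50))

conn = psycopg2.connect(DSN)
conn.autocommit = True
cur = conn.cursor()

for dst in TARGETS:
    # The new target applies only to columns without a per-column
    # ALTER TABLE ... SET STATISTICS override, and the planner sees
    # it only after re-ANALYZE.
    cur.execute("SET default_statistics_target = %d" % dst)
    cur.execute("ANALYZE")

    start = time.perf_counter()
    for _ in range(RUNS):
        cur.execute("EXPLAIN " + QUERY)   # planning-time proxy
        cur.fetchall()
    plan_ms = (time.perf_counter() - start) / RUNS * 1000.0

    start = time.perf_counter()
    cur.execute(QUERY)                    # one timed execution
    cur.fetchall()
    exec_ms = (time.perf_counter() - start) * 1000.0

    print("dst=%4d  plan=%6.2f ms  execute=%8.1f ms"
          % (dst, plan_ms, exec_ms))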

...Robert
