Re: Stopgap solution for table-size-estimate updating problem

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Stopgap solution for table-size-estimate updating problem
Date: 2004-11-29 18:49:57
Message-ID: 87wtw4sd62.fsf@stark.xeocode.com
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Nonsense. You're assuming incremental changes (ie, only a small
> fractional change in table size), but we are getting killed by
> non-incremental cases. If the plan cost estimates are such that a small
> fractional change in table size will cause the planner to switch to a
> hugely worse plan, then you're living on the edge of disaster anyway.
> Or are you telling me that every time you VACUUM or ANALYZE, you
> immediately hand-inspect the plans for every query you use?

Well, with the current situation the best I can hope for is to run ANALYZE at
times when we can withstand minor outages and I can respond. Probably I would
run it during off-peak hours. So basically, while I don't hand-inspect plans,
I'm using the site to test them: if the site's still running 5 minutes after
the ANALYZE, the plans are probably ok.

I have actually written up a script that I intend to experiment with: it
EXPLAINs every query in the system, then runs ANALYZE within a transaction,
then re-runs EXPLAIN on every query to check for any changed plans. It only
commits if no plans have changed. This is all just an experiment though;
I'm not sure how effective it'll be.
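
Roughly, the idea is something like the sketch below. This is only an
illustration, not the actual script: psycopg2, the DSN, the query list, and
the 'accounts' table are all placeholders for the application's real queries.

    # Sketch of the explain / analyze / re-explain experiment.  All names
    # here (DSN, query list, 'accounts' table) are hypothetical placeholders.
    import psycopg2

    QUERIES = [
        "SELECT * FROM accounts WHERE aid = 1",   # the application's queries go here
    ]

    def plan_of(cur, query):
        """Return the EXPLAIN output for a query as one string."""
        cur.execute("EXPLAIN " + query)
        return "\n".join(row[0] for row in cur.fetchall())

    conn = psycopg2.connect("dbname=app")    # autocommit is off, so everything
    cur = conn.cursor()                      # below runs in a single transaction

    before = {q: plan_of(cur, q) for q in QUERIES}
    cur.execute("ANALYZE")                   # unlike VACUUM, ANALYZE may run in a transaction
    after = {q: plan_of(cur, q) for q in QUERIES}

    changed = [q for q in QUERIES if before[q] != after[q]]
    if changed:
        conn.rollback()                      # abandon the ANALYZE
        print("rolled back: %d plan(s) would have changed" % len(changed))
    else:
        conn.commit()                        # keep the new statistics
        print("committed: no plans changed")

Whether a rolled-back ANALYZE really leaves every statistic untouched is part
of what the experiment would have to show.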

> A further point is that only VACUUM can decrease the table size, and
> VACUUM already updates these stats anyway. The only "loss of control"
> involved here is prevention of a plan change in response to a
> significant increase in table size. Overestimates of result size
> usually don't produce as horrible plans as underestimates, so the
> downside doesn't seem as large as you make it out to be.

That's true. I don't think the proposed change makes the situation with
respect to plan stability any worse than the status quo. But it does seem to
lock us into the idea that plans could change at any time whatsoever.

I'm not sure why VACUUM without ANALYZE updates the statistics at all though.
Isn't that what ANALYZE is for?
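
For reference, the numbers in question are two different things: the
table-size estimates relpages and reltuples in pg_class, which both VACUUM
and ANALYZE refresh, and the per-column statistics gathered only by ANALYZE,
visible through the pg_stats view. A quick way to look at both, using the
same placeholder setup as the sketch above:

    # Peek at the two kinds of statistics under discussion.  The 'accounts'
    # table and the connection settings are placeholders.
    import psycopg2

    conn = psycopg2.connect("dbname=app")
    cur = conn.cursor()

    cur.execute("SELECT relpages, reltuples FROM pg_class"
                " WHERE relname = 'accounts'")
    print("size estimate (pages, tuples): %s" % (cur.fetchone(),))

    cur.execute("SELECT attname, n_distinct FROM pg_stats"
                " WHERE tablename = 'accounts'")
    for attname, n_distinct in cur.fetchall():
        print("column %s: n_distinct = %s" % (attname, n_distinct))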

> This is pure fantasy. It certainly has nothing to do with the current
> state of nor future directions for the planner, and you haven't even
> convinced me that it's a desirable goal. What you are describing is a
> brittle, inflexible system that is much more likely to break under
> unforeseen circumstances than it is to perform well reliably.

Huh. That's how I see the current setup. I find the current approach too
fragile precisely because there's no way to test it and guarantee it will
perform consistently. I want something that won't suddenly change behaviour in
ways I can't predict. I want something that will consistently run the same
code path every time, except at well-defined points in time according to
well-defined processes.

I'll point out that other databases have ended up treading the same ground.
Oracle started with a well-defined rule-based system that was too inflexible
to handle complex queries. So they went to a cost-based optimizer much like
Postgres's current optimizer. But DBAs resisted for a long time, precisely
because they couldn't control it or predict its behaviour as well. Now they
have a plan stability system where you can plan queries using the cost-based
optimizer but then store the plans for future use. You can even take the
stored plans and load them on development systems for testing.

Their system is awfully kludgy though. Postgres can probably do much better.

--
greg
