Re: Wishlist for 7.4: Plan stability

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Wishlist for 7.4: Plan stability
Date: 2002-12-03 22:35:57
Message-ID: 8765uaenn6.fsf@stark.dyndns.tv
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > Ok, someone else posted their data warehousing wishlist, I want to post my
> > single-item OLTP wishlist: Plan stability.
>
> That seems to me to translate to "I want the system to fail to react to
> changes in data statistics and all other variables relevant to query
> planning".
>
> You can pretty much get that by never doing [VACUUM] ANALYZE, but I'm
> quite lost as to why it's really a good idea.

Well, first of all, there's no guarantee that the genetic query optimizer will
actually produce the same plan twice, but that's a side issue.

The main issue is that you *do* want to vacuum and analyze the database
regularly to get good performance, but you don't want the database
spontaneously changing its behaviour without testing and verifying the new
behaviour personally. Not if it's a high availability production server.

I'm thinking that parsing a new query, one that doesn't already exist in the
query table, should require a specific privilege that can be access-controlled
separately.
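
To make the intent concrete, here's a rough application-level approximation,
assuming a DB-API driver such as psycopg2 and a hypothetical approved_queries
table (names here are invented); the actual proposal is for the server itself
to enforce this.

    import psycopg2

    def load_approved_queries(conn):
        """Load the set of query texts a production role may run.

        Assumes a hypothetical table:
          CREATE TABLE approved_queries (query_text text PRIMARY KEY);
        """
        with conn.cursor() as cur:
            cur.execute("SELECT query_text FROM approved_queries")
            return {row[0] for row in cur.fetchall()}

    def run_approved(conn, approved, sql, params=None):
        """Refuse to execute any statement not already in the query table."""
        if sql not in approved:
            raise PermissionError("query not in approved_queries: %r" % sql)
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchall()

    conn = psycopg2.connect("dbname=app")   # placeholder connection details
    approved = load_approved_queries(conn)
    rows = run_approved(conn, approved,
                        "SELECT balance FROM accounts WHERE id = %s", (42,))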

For a production server I would then expect the DBA to arrange for VACUUM
ANALYZE to run regularly during off-peak hours, and to have a job test all the
queries and report any changed optimiser behaviour. The DBA would then
sanity-check and test the performance of any new query plans before allowing
them into production.
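
The "report any changed optimiser behaviour" job could be as simple as the
following sketch, assuming a hypothetical approved_plans table that holds the
last DBA-vetted EXPLAIN text for each (parameter-free) query; again, the names
are invented for illustration.

    import psycopg2

    def current_plan(cur, sql):
        """Capture the optimiser's current plan as plain EXPLAIN text."""
        cur.execute("EXPLAIN " + sql)
        return "\n".join(row[0] for row in cur.fetchall())

    def report_plan_changes(conn):
        """Compare today's plans against the last DBA-approved ones.

        Assumes a hypothetical table:
          CREATE TABLE approved_plans (query_text text PRIMARY KEY,
                                       plan_text  text);
        """
        changed = []
        with conn.cursor() as cur:
            cur.execute("SELECT query_text, plan_text FROM approved_plans")
            baseline = cur.fetchall()        # fetch fully before reusing cur
            for sql, old_plan in baseline:
                new_plan = current_plan(cur, sql)
                if new_plan != old_plan:
                    changed.append((sql, old_plan, new_plan))
        return changed

    conn = psycopg2.connect("dbname=app")    # placeholder connection string
    for sql, old_plan, new_plan in report_plan_changes(conn):
        print("plan changed since last approval:", sql)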

But the threat of the optimiser changing behaviour after running ANALYZE isn't
even the main threat I see this addressing. The threat of untested queries
entering production when new code is pushed live is far greater. I've seen web
sites go down because of new queries with bad performance that were missed in
testing more often than for any other reason.

And I've seen security holes caused by applications that let untrusted users
slip unexpected SQL syntax into queries more often than holes from any other
cause.
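
That failure mode is worth spelling out. Once untrusted input can change the
statement text itself, the query is no longer one the application was written
to run; a purely hypothetical illustration, using the same Python driver
assumed above:

    user_input = "1; DROP TABLE accounts"

    # Vulnerable: the statement text itself changes with user input, so it is
    # both an injection hole and, under the scheme above, an unapproved query.
    unsafe_sql = "SELECT * FROM accounts WHERE id = " + user_input

    # Safer: the statement text stays constant and only bound values vary, so
    # it still matches a fixed entry in the approved-query table.
    safe_sql = "SELECT * FROM accounts WHERE id = %s"
    # e.g. cur.execute(safe_sql, (user_input,))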

Really it boils down to one point: there's no reason to assume a user should be
able to execute any new query he feels like. Creating a new query should be a
privileged operation, just like creating a new table or a new database.

For some systems, such as development systems, it of course makes sense for
users to be able to create new queries on the fly. For DSS systems it's pretty
much assumed as well.

But for OLTP systems it's very unlikely that a new query should suddenly be
necessary. These systems run the same queries millions of times a day and need
to return results within milliseconds. Any new query should be assumed to be a
bug or a security breach and reported as an immediate error, not cause the
database to valiantly attempt to figure out how best to handle the unexpected
query.

--
greg
