Re: Hints proposal

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Hints proposal
Date: 2006-10-12 15:19:10
Message-ID: 200610121519.k9CFJA416544@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance


Because DB2 doesn't like hints, and the fact that they have gotten to a
point where they feel they do not need them, I feel we too can get to a
point where we don't need them either. The question is whether we can
get there quickly enough for our userbase.

I perfer attacking the problem at the table definition level, like
something like "volatile", or adding to the existing table statistics.

---------------------------------------------------------------------------

Jim C. Nasby wrote:
> Posting here instead of hackers since this is where the thread got
> started...
>
> The argument has been made that producing a hints system will be as hard
> as actually fixing the optimizer. There's also been clamoring for an
> actual proposal, so here's one that (I hope) wouldn't be very difficult
> to implemen.
>
> My goal with this is to keep the coding aspect as simple as possible, so
> that implementation and maintenance of this isn't a big burden. Towards
> that end, these hints either tell the planner specifically how to handle
> some aspect of a query, or they tell it to modify specific cost
> estimates. My hope is that this information could be added to the
> internal representation of a query without much pain, and that the
> planner can then use that information when generating plans.
>
> The syntax these hints is something arbitrary. I'm borrowing Oracle's
> idea of embedding hints in comments, but we can use some other method if
> desired. Right now I'm more concerned with getting the general idea
> across.
>
> Since this is such a controversial topic, I've left this at a 'rough
> draft' stage - it's meant more as a framework for discussion than a
> final proposal for implementation.
>
> Forcing a Plan
> --------------
> These hints would outright force the planner to do things a certain way.
>
> ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */
>
> This would force the planner to access table via a seqscan or
> index_name. For the index case, you can also specify if the access must
> or must not be via a bitmap scan. If neither is specified, the planner
> is free to choose either one.
>
> Theoretically, we could also allow "ACCESS INDEX" without an index name,
> which would simply enforce that a seqscan not be used, but I'm not sure
> how useful that would be.
>
> ... FROM a JOIN b /* {HASH|NESTED LOOP|MERGE} JOIN */ ON (...)
> ... FROM a JOIN b ON (...) /* [HASH|NESTED LOOP|MERGE] JOIN */
>
> Force the specified join mechanism on the join. The first form would not
> enforce a join order, it would only force table b to be joined to the
> rest of the relations using the specified join type. The second form
> would specify that a joins to b in that order, and optionally specify
> what type of join to use.
>
> ... GROUP BY ... /* {HASH|SORT} AGGREGATE */
>
> Specify how aggregation should be handled.
>
> Cost Tweaking
> -------------
> It would also be useful to allow tweaking of planner cost estimates.
> This would take the general form of
>
> node operator value
>
> where node would be a planner node/hint (ie: ACCESS INDEX), operator
> would be +, -, *, /, and value would be the amount to change the
> estimate by. So "ACCESS INDEX my_index / 2" would tell the planner to
> cut the estimated cost of any index scan on a given table in half.
>
> (I realize the syntax will probably need to change to avoid pain in the
> grammar code.)
>
> Unlike the hints above that are ment to force a certain behavior on an
> operation, you could potentially have multiple cost hints in a single
> location, ie:
>
> FROM a /* HASH JOIN * 1.1 NESTED LOOP JOIN * 2 MERGE JOIN + 5000 */
> JOIN b ON (...) /* NESTED LOOP JOIN - 5000 */
>
> The first comment block would apply to any joins against a, while the
> second one would apply only to joins between a and b. The effects would
> be cumulative, so this example means that any merge join against a gets
> an added cost of 5000, unless it's a join with b (because +5000 + -5000
> = 0). I think you could end up with odd cases if the second form just
> over-rode the first, which is why it should be cummulative.
> --
> Jim Nasby jim(at)nasby(dot)net
> EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua Marsh 2006-10-12 15:26:24 Re: Hints proposal
Previous Message Jim C. Nasby 2006-10-12 15:14:39 Hints proposal

Browse pgsql-performance by date

  From Date Subject
Next Message Joshua Marsh 2006-10-12 15:26:24 Re: Hints proposal
Previous Message Jim C. Nasby 2006-10-12 15:14:39 Hints proposal