Limiting the number of parameterized indexpaths created

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Limiting the number of parameterized indexpaths created
Date: 2012-10-30 21:57:04
Message-ID: 21634.1351634224@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I looked into the complaint of unreasonable planner runtime in bug #7626,
http://archives.postgresql.org/pgsql-bugs/2012-10/msg00232.php

In the given example, the indexed relation "foo" has join clauses with
30 other relations. The code that I added in commit
3b8968f25232ad09001bf35ab4cc59f5a501193e will try all 2^30 combinations
of those rels as possible outer relations for a parameterized indexscan
:-(. So clearly, the idea that we can just try everything and not have
some kind of heuristic restriction isn't going to work.

This particular example, and probably a lot of similar examples, does
have a principled non-heuristic fix, which comes from the fact that we
recognize all the join clauses as belonging to the same equivalence
class. Therefore, using more than one of them in an indexscan is
useless. (The code already does know that and discards the extra
clauses, but that happens too far downstream to prevent the exponential
search time here.) The extra function eclass_already_used() added in
the attached patch attacks the problem this way, and is sufficient to
resolve the bug for the case presented.

However, we still have an issue for cases where the join clauses aren't
equivalence clauses. (For instance, if you just change all the "="
signs to "<" in Brian's example, it still takes forever.) So we also
need some heuristic rule to limit the number of cases considered.

I spent a good deal of time thinking about how the indexscan code might
make use of the joinlist data structure (which prevents exponential
planning time growth overall by limiting which joins we'll consider)
to fix this; the idea being to not consider outer-relation sets that
couldn't actually be presented for use because of joinlist restrictions.
But this looked messy, and probably expensive in itself. Moreover it
seemed like practical implementations of the idea would carry the
restriction that we could never generate parameterized paths for
joinlist subproblems, which would be a real shame. And we'd be doing a
lot of work for what is probably a very unusual corner case, given that
I think eclass_already_used() will fix the problem in most real cases.

So what I'm proposing instead, which is implemented in the other half of
the attached patch, is that we simply put an arbitrary limit on how many
outer-relation sets we'll consider while generating parameterized
indexscans. As written, the patch limits the number of relid sets to 10
times the number of join clauses it's working with, which is small
enough to keep the runtime of this test case to something reasonable.
I think that in practice this will still allow useful
combination-outer-rel cases to be found. I wonder though if anyone has
a better idea?

regards, tom lane

Attachment Content-Type Size
limit-indexpath-plan-time.patch text/x-patch 6.9 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2012-10-30 22:02:25 Re: Proposal for Allow postgresql.conf values to be changed via SQL
Previous Message Josh Berkus 2012-10-30 21:54:57 Re: Proposal for Allow postgresql.conf values to be changed via SQL