Re: BUG #7510: Very bad costing estimation on gin vs gist with FTS

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: daniel(at)heroku(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #7510: Very bad costing estimation on gin vs gist with FTS
Date: 2012-08-30 04:10:06
Message-ID: 20426.1346299806@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

daniel(at)heroku(dot)com writes:
> If one adds a GIN index and does a fresh analyze, the planner still produce
> a plan for the GiST index. Because there is no way to disable particular
> indexes in a session, it's impossible to quickly experiment with a new
> hypothetical situation with only the GIN index without possibly painting
> yourself into a corner where you've dropped a needed index.

FWIW, there is a pretty standard workaround for that:

begin;
drop index unwanted_index;
explain ...;
rollback;

This isn't ideal in a production database because it requires exclusive
lock on the table for long enough to run the EXPLAIN. But it's not true
that there's no way to handle this at all. If you want something more
flexible, I'd suggest working on improving the "index advisor" plugin
that was getting batted around a couple years ago. There are sufficient
hooks in the planner to let it be given an arbitrary hypothetical set of
indexes. But I digress...

Anyway, the meat of your complaint is that the planner is overestimating
the cost of a GIN scan relative to a GIST scan. I believe the reason
for this is that gincostestimate() is trying to make a fairly honest
estimate of the work involved, whereas gistcostestimate() is just a stub
around genericcostestimate(), which computes an estimate that's more or
less suitable for btree-equivalent index operations. There's certainly
not any intelligence in the latter that would be capable of dealing with
issues like how much a tsvector "*" operator is going to hurt. We could
stand to have less bogus estimates for GIST (not to mention SPGIST), but
I'm really not familiar enough with either to write better code for that.

It also seems possible that you've tripped over a plain old performance
bug in the GIST code, ie, the fault is not with the estimate but the
reality. It's hard to tell about that though. Do you want to try to
make up a smaller self-contained test case?

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message i 2012-08-30 04:34:18 BUG #7511: 9.2: pg_stat_activity.procpid renamed unnecessarily
Previous Message Bruce Momjian 2012-08-30 01:29:49 Re: BUG #6639: Manual uses boldface where it says italic, and monospace where it says boldface