AW: AW: More Performance

From: Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org>
Subject: AW: AW: More Performance
Date: 2000-05-25 08:03:43
Message-ID: 219F68D65015D011A8E000006F8590C604AF7D9C@sdexcsrv1.f000.d0188.sd.spardat.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> We could ensure that the planner still picks an indexscan as the known
> table size grows by reducing the default selectivity estimate a little
> bit (I experimented and determined that 0.005 would do the trick, for
> the current cost model parameters). That's pretty ad-hoc, but then
> the 0.01 number is pretty ad-hoc too. It's probably better to be able
> to say "if you haven't done VACUUM ANALYZE, you will get an indexscan
> from WHERE col = const" than to have to say "it depends". Comments?

Imho the initial goal why we said 0.01, was to make it use the index,
so reducing it to 0.005 would be ok. I would actually try to calculate
the value with the current costs for index vs seq scan, so that it
guarantees
use of the index regardless of table size.
But, it probably shows a problem with the chosen metric for selectivity
itself.
Imho the chances are better, that an = restriction will return an equal
amount
of rows while the table grows than that it will return a percentage of total
table size.

>
> Of course the real issue here is that the true selectivity of '=' is
> much smaller in this table, because the column being looked at is
> unique. But the planner doesn't know that without VACUUM stats.
>
> A hack I have been thinking about adding is that CREATE UNIQUE INDEX
> for a single-column index should immediately force the attdisbursion
> value for that column to "unique", so that the planner would know the
> column is unique even without VACUUM ANALYZE. That would help not
> at all for the MySQL benchmark (it has a two-column unique index,
> but you can't conclude anything about the properties of either column
> individually from that :-(). But it'd help in a lot of real-world
> scenarios.

Yes, that would imho be a real winner.
For the multi column index we would need some magic that actually notices
that all index columns are restricted with =.

Andreas

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zeugswetter Andreas SB 2000-05-25 08:31:59 AW: More Performance
Previous Message Michael Meskes 2000-05-25 08:00:28 7.0.1?