Re: [HACKERS] Slow count(*) again...

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "david(at)lang(dot)hm" <david(at)lang(dot)hm>, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [HACKERS] Slow count(*) again...
Date: 2011-02-03 00:03:06
Message-ID: 4D49F0BA.9000901@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Mladen Gogala wrote:
> Greg, how many questions about queries not using an index have you
> seen? There is a reason why people keep asking that. The sheer number
> of questions like that on this group should tell you that there is a
> problem there. There must be a relatively simple way of influencing
> optimizer decisions.

I think that's not quite the right question. For every person like
yourself who is making an informed "the optimizer is really picking the
wrong index" request, I think there are more who are asking for that but
are not actually right that it will help. I think you would agree that
this area is hard to understand, and easy to make mistakes about, yes?
So the right question is "how many questions about queries not using an
index would have actually benefitted from the behavior they asked for?"
That's a much fuzzier and harder to answer question.

I agree that it would be nice to provide a UI for the informed.
Unfortunately, the problem I was pointing out is that doing so could, on
average, make PostgreSQL appear to run worse to people who use it.
Things like which index and merge type are appropriate changes as data
comes in, and some of the plan switches that occur because of that are
the right thing to do--not a mistake on the optimizer's part. I'm sure
you've seen people put together plan rules for the RBO that worked fine
on small data sets, but were very wrong as production data volume went
up. That problem should be less likely to happen to a CBO approach. It
isn't always, of course, but trying to build a RBO-style approach from
scratch now to resolve those cases isn't necessarily the right way to
proceed.

Given limited resources as a development community, it's hard to justify
working on hinting--which has its own complexity to do right--when there
are so many things that I think are more likely to help *everyone* that
could be done instead. The unfortunate situation we're in, unlike
Oracle, is that there isn't a practically infinite amount of money
available to fund every possible approach here, then see which turn out
to work later after our customers suffer through the bad ones for a while.

> With all due respect, I consider myself smarter than the optimizer.
> I'm 6'4", 235LBS so telling me that you disagree and that I am more
> stupid than a computer program, would not be a smart thing to do.
> Please, do not misunderestimate me.

I remember when I used to only weigh that much. You are lucky to be
such a slim little guy!

Oh, I guess I should add, :)

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2011-02-03 00:13:44 Re: [HACKERS] Slow count(*) again...
Previous Message Bruce Momjian 2011-02-02 23:54:42 LIKE, CHAR(), and trailing spaces

Browse pgsql-performance by date

  From Date Subject
Next Message Bruce Momjian 2011-02-03 00:13:44 Re: [HACKERS] Slow count(*) again...
Previous Message Dan Birken 2011-02-02 23:15:26 Which RAID Controllers to pick/avoid?