Re: anti-join chosen even when slower than old plan

From: Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: anti-join chosen even when slower than old plan
Date: 2010-11-11 17:13:08
Message-ID: 4CDC2424.9010102@vmsinfo.com
Lists: pgsql-performance

Kevin Grittner wrote:
> Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com> wrote:
>
>
>> create a definitive bias toward one type of the execution plan.
>>
>
> We're talking about trying to support the exact opposite.
I understand this; that is precisely why I am weighing in on a discussion
among experts, which I am not.
> This all
> started because a database which was tuned for good response time
> for relatively small queries against a "hot" portion of some tables
> chose a bad plan for a weekend maintenance run against the full
> tables. We're talking about the possibility of adapting the cost
> factors based on table sizes as compared to available cache, to more
> accurately model the impact of needing to do actual disk I/O for
> such queries.
>
Kevin, in my experience, the hardest thing to tune is a so-called
mixed-type database. In practice, the workloads are usually separated: the
OLTP database on one group of servers, the reporting database and the data
warehouse on another group of servers. PostgreSQL 9.0 has made a great
stride toward making that separation possible with the new replication
facilities. Again, having an optimizer that chooses the plan with complete
accuracy is, at least in my opinion, less important than having the
possibility of manual control, the aforementioned "knobs and buttons," and
getting the same plan for the same statement. Trying to make the optimizer
smart enough for all types of loads is akin to looking for the Holy Grail.
Inevitably, you will face some hard questions, like the one about the
airspeed velocity of an unladen swallow, and the whole search is likely to
end in a pretty funny way, not producing the desired "optimizing genie in
the CPU".
>
> This also is very different from trying to adapt queries to what
> happens to be currently in cache. As already discussed on a recent
> thread, the instability in plans and the failure to get to an
> effective cache set make that a bad idea. The idea discussed here
> would maintain a stable plan for a given query, it would just help
> choose a good plan based on the likely level of caching.
>
Kevin, I am talking from the perspective of a DBA who works with
production databases on a day-to-day basis. I am no expert, but I do
believe I speak from the perspective of the users that Postgres has to win
over in order to make further inroads into corporate server rooms. Without
that kind of control and without plan stability, it is hard for me to
recommend more extensive use of PostgreSQL to my boss. Whatever solution
is chosen, it needs to have "knobs and buttons" and produce plans that
will not change when the CPU usage goes up.
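To be concrete about the kind of control I mean: today one can at least
nudge the planner per session with the existing cost settings, for example
for a weekend maintenance run against the full tables (the values below
are only illustrative, not recommendations):

```sql
BEGIN;
-- Tell the planner that this batch job will mostly hit disk,
-- since the full tables will not fit in cache.
SET LOCAL random_page_cost = 4.0;
SET LOCAL effective_cache_size = '1GB';
-- ... run the maintenance query here ...
-- SET LOCAL reverts automatically at COMMIT or ROLLBACK,
-- so the OLTP settings are untouched afterwards.
COMMIT;
```

But this only biases the cost model; it still does not guarantee that the
same statement gets the same plan, which is the stability I am asking for.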

--

Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
http://www.vmsinfo.com
The Leader in Integrated Media Intelligence Solutions
