Re: Joel's Performance Issues WAS : Opteron vs Xeon

From: "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>
To: "Joel Fradkin" <jfradkin(at)wazagua(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Joel's Performance Issues WAS : Opteron vs Xeon
Date: 2005-04-25 14:13:34
Message-ID: 6EE64EF3AB31D5448D0007DD34EEB3415C2676@Herge.rcsinc.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> I am waiting to here back from Josh on using cursors and trying to
flatten
> long running views.
>
> I am a little disappointed I have not understood enough to get my
analyzer
> to use the proper plan, we had to set seqscan off to get the select
from
> response_line to work fast and I had to turn off merge joins to get
assoc
> list to work fast. Once I am up I can try to learn more about it, I am
so
> glad there are so many folks here willing to take time to educate us
> newb's.

I am not a big fan of tweaking the optimizer because you are robbing
Peter to pay Paul, so to speak. pg 8.1 may come out with new optimizer
tweaks and you'll have to do it all over again. If the optimizer is not
'getting' your view, there are a few different approaches to fixing the
problem.

I am also not a big fan of de-normalizing your database. Essentially
you are lighting a fuse that may blow up later. Here are some general
approaches to planner optimization that can help out in tricky
situations.

1. Split up views. Often overlooked but can provide good enhancements.
If your view is based on 3 or more tables, has left/right joins,
consider breaking it up into two or more views. Views can be based on
views and it is easier to force the planner to pick good plans this way.
If you can find other uses for component views in other queries, so much
the better.

2. Materialize your view. Use lazy materialization, i.e. you query the
view into a table at scheduled times. Now we are trading disk spaces
and coherence for performance...this may not fit your requirements but
the nice thing about it is that it will help give us the 'ideal plan'
running time which we are shooting for.

3. pl/pgsql. Using combinations of loops, refcursors, and queries, you
can cut code that should give you comparable performance to the ideal
plan. If you can do the actual work here as well (no data returned to
client), you get a tremendous win. Also pl/pgsql works really well for
recursive sets and other things that are difficult to run in the context
of a single query. Just be aware of the disadvantages:
a. not portable
b. maintenance overhead
c. require relatively high developer skill set

I will go out on a limb and say that mastering the above approaches can
provide the solution to virtually any performance problem within the
limits of your hardware and the problem complexity.

Based on your questions, it sounds to me like your #1 problem is your
developer skillset relative to your requirements. However, this is
easily solvable...just keep attacking the problem and don't be afraid to
bring in outside help (which you've already done, that's a start!).

Merlin

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-04-25 15:23:00 Re: [HACKERS] Bad n_distinct estimation; hacks suggested?
Previous Message Joel Fradkin 2005-04-25 14:07:39 Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon