Re: Performace Optimization for Dummies

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performace Optimization for Dummies
Date: 2006-09-28 18:44:13
Message-ID: 20060928184413.GU34238@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Sep 28, 2006 at 01:47:44PM -0400, Carlo Stonebanks wrote:
> > How are you loading the tables? Copy? Insert?
>
> Once the data is transformed, it is inserted. I don't have stats, but the
> programs visual feedback does not spend a lot of time on the "inserting
> data" message. Then again, if there is an asynchronous component to an
> insert, perhaps I am not seeing how slow an insert really is until I query
> the table.

Well, individual inserts are slow, especially if they're not wrapped up
in a transaction. And you also mentioned checking for dupes. I suspect
that you're not going to find any huge gains in tuning the database...
it sounds like the application (as in: how it's using the database) is
what needs help.

> >> work_mem = 32768
> >
> > Depending on what you are doing, this is could be to low or to high.
>
> Is this like "You could be too fat or too thin"? Aren't you impressed with
> the fact that I managed to pick the one number that was not right for
> anything?

For what you're doing, it's probably fine where it is... but while
you're in the single-thread case, you can safely make that pretty big
(like 1000000).

> >
> >> maintenance_work_mem = 32768
> >> checkpoint_segments = 128
> >> effective_cache_size = 10000
> >
> > This coudl probably be higher.

I'd suggest setting it to about 3G, or 375000.
> >
> >> random_page_cost = 3
> >> stats_start_collector = on
> >> stats_command_string = on
> >> stats_row_level = on
> >> autovacuum = on
> >
> > Stats are a hit... you need to determine if you actually need them.
>
> Unfortunately, this is the only way I know of of getting the query string to
> appear in the PostgreSQL server status display. While trying to figure out
> what is slowing things down, having that is really helpful. I also imagined
> that this sort of thing would be a performance hit when you are getting lots
> of small, concurrent queries. In my case, we have queries which are taking
> around a second to perform outer joins. They aren't competing with any other
> requests as the site is not running, we are just running one app to seed the
> data.

stats_command_string can extract a huge penalty pre-8.2, on the order of
30%. I'd turn it off unless you *really* need it. Command logging (ie:
log_min_duration_statement) is much less of a burden.

The fact that you're doing outer joins while loading data really makes
me suspect that the application needs to be changed for any real
benefits to be had. But you should still look at what EXPLAIN ANALYZE is
showing you on those queries; you might be able to find some gains
there.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jim C. Nasby 2006-09-28 18:45:38 Re: Performace Optimization for Dummies
Previous Message Jim C. Nasby 2006-09-28 18:34:23 Re: Performace Optimization for Dummies