Re: Slow update

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: Abu Mushayeed <abumushayeed(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow update
Date: 2007-02-01 01:32:20
Message-ID: 45C14324.1020907@fuzzy.cz
Lists: pgsql-performance

> Following is one of the update query and it's explain plan which takes
> about 6 mins to execute. I am trying to find a way to execute it faster.
> The functions used in the update statement are if then else test and
> then return one value or the other.
> =====================================================
> update mdc_upc
> set total_curoh = ownedgoods(k.gmmid,k.divid, loc1_oh,
> loc2_oh,loc3_oh,loc120_oh, loc15_oh,chesh_curoh),
> total_curoo =orderedgoods(k.gmmid,k.divid, loc1_oo,
> loc2_oo,loc3_oo,loc120_oo, loc15_oo,chesh_curoo),
> total_oh = ownedgoods(k.gmmid,k.divid, 0, 0,loc3_oh,loc120_oh,
> loc15_oh,chesh_oh),
> total_oo =orderedgoods(k.gmmid,k.divid, 0, 0,loc3_oo,loc120_oo,
> loc15_oo,chesh_oo)
> from mdc_products p LEFT OUTER JOIN
> kst k on p.dvm_d = k.dept
> where p.keyp_products = mdc_upc.keyf_products;
>
> ---------------------------------------------------------------------------
> Hash Join (cost=48602.07..137331.77 rows=695899 width=391)
> Hash Cond: ("outer".keyf_products = "inner".keyp_products)
> -> Seq Scan on mdc_upc (cost=0.00..59153.99 rows=695899 width=383)
> -> Hash (cost=47274.60..47274.60 rows=530990 width=12)
> -> Hash Left Join (cost=43.85..47274.60 rows=530990 width=12)
> Hash Cond: ("outer".dvm_d = "inner".dept)
> -> Seq Scan on mdc_products p (cost=0.00..39265.90
> rows=530990 width=8)
> -> Hash (cost=41.48..41.48 rows=948 width=12)
> -> Seq Scan on kst k (cost=0.00..41.48 rows=948
> width=12)
>
> ======================================================
> I have seen that the updates are very slow on our system. What parameter
> should I test in order to find out why is it slow during update.
>
> Thanks
> Abu

Obviously the update is slow because of the sequential scans of the
tables, with about 1GB of data to read - that may not seem like much,
but there may be a lot of seeks, knocking the performance down. You can
check the iowait (with iostat / dstat, for example) to see how much time
is spent waiting for the data.

I have to admit I don't fully understand that query, as I've never used
UPDATE ... FROM, but it seems to me you could set up some indexes to
speed things up. I'd definitely start with indexes on the columns used
in the join conditions, namely

CREATE INDEX mdc_products_keyp_idx ON mdc_products(keyp_products);
CREATE INDEX mdc_upc_keyf_idx ON mdc_upc(keyf_products);
CREATE INDEX mdc_products_dvm_idx ON mdc_products(dvm_d);
CREATE INDEX kst_dept_idx ON kst(dept);

but this is just a guess, as I really know NOTHING about those tables
(structure, size, statistical features, etc.). Btw. don't forget to
analyze the tables after creating the indexes.
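
That means just a plain ANALYZE on the tables used in your query, i.e.
something like:

ANALYZE mdc_upc;
ANALYZE mdc_products;
ANALYZE kst;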

Another 'uncertainty' is related to the functions used in your query,
namely ownedgoods() / orderedgoods(). If these procedures do something
nontrivial (searches in tables, etc.) they might have a severe impact on
the query - and the planner / optimizer knows nothing about what happens
inside them, so it can't optimize them.
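
Just to illustrate - if ownedgoods() really is nothing more than an
if/then/else over its arguments, it might look roughly like this (the
signature and the logic are pure guesses on my part, and I've renamed it
so nobody mistakes it for the real thing):

CREATE OR REPLACE FUNCTION ownedgoods_sketch(gmmid integer, divid integer,
    loc1 numeric, loc2 numeric, loc3 numeric,
    loc120 numeric, loc15 numeric, chesh numeric)
RETURNS numeric AS $$
BEGIN
    -- hypothetical test, only to show the if/then/else shape
    IF divid = 1 THEN
        RETURN loc1 + loc2 + loc3 + loc120 + loc15 + chesh;
    ELSE
        RETURN loc3 + loc120 + loc15 + chesh;
    END IF;
END;
$$ LANGUAGE plpgsql;

A function like that (no table access) is cheap even when called four
times per row; if instead it runs its own SELECTs, you pay that cost for
every one of the ~700k rows, and that alone could explain the 6 minutes.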

If this doesn't help, we'll need some more information - for example,
what does 'our system' mean - how much memory does it have? What are the
important settings in your postgresql.conf? Especially the values of
effective_cache_size, work_mem, and some others (for example
checkpoint_segments, as this looks like a write-intensive query).
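
You can read the current values directly from psql, e.g.:

SHOW effective_cache_size;
SHOW work_mem;
SHOW checkpoint_segments;

or get the whole list at once with SELECT name, setting FROM pg_settings.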

And last but not least, information about the structure and size of the
tables (columns, indexes, number of rows, number of occupied blocks,
etc.).
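
The \d command in psql shows the columns and indexes, and the planner's
estimates of rows / blocks are kept in pg_class (only as fresh as the
last ANALYZE, but good enough here):

SELECT relname, reltuples, relpages
  FROM pg_class
 WHERE relname IN ('mdc_upc', 'mdc_products', 'kst');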

Tomas

In response to

  • Slow update at 2007-02-01 00:57:38 from Abu Mushayeed
