Re: atrocious update performance

From: "Rosser Schwarz" <rschwarz(at)totalcardinc(dot)com>
To: "'Greg Spiegelberg'" <gspiegelberg(at)cranel(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: atrocious update performance
Date: 2004-03-24 16:38:35
Message-ID: 004301c411be$74004650$2500fa0a@CardServices.TCI.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Greg Spiegelberg wrote:

> > Will advise.

After creating 100, 1K, 10K, 100K and 1M-row subsets of account.cust and
the corresponding rows/tables with foreign key constraints referring to
the table, I'm unable to reproduce the behavior at issue.

explain analyze looks like the following, showing the query run with the
join column indexed and not, respectively:

# explain analyze update test.cust100 set prodid = tempprod.prodid,
subprodid = tempprod.subprodid where origid = tempprod.debtid;
-- with index
QUERY PLAN
-----------------------------------------------------------------------
Merge Join (cost=0.00..25.64 rows=500 width=220) (actual
time=0.241..13.091 rows=100 loops=1)
Merge Cond: (("outer".origid)::text = ("inner".debtid)::text)
-> Index Scan using ix_origid_cust100 on cust100 (cost=0.00..11.50
rows=500 width=204) (actual time=0.125..6.465 rows=100 loops=1)
-> Index Scan using ix_debtid on tempprod (cost=0.00..66916.71
rows=4731410 width=26) (actual time=0.057..1.497 rows=101 loops=1)
Total runtime: 34.067 ms
(5 rows)

-- without index
QUERY PLAN
----------------------------------------------------------------------
Merge Join (cost=7.32..16.71 rows=100 width=220) (actual
time=4.415..10.918 rows=100 loops=1)
Merge Cond: (("outer".debtid)::text = "inner"."?column22?")
-> Index Scan using ix_debtid on tempprod (cost=0.00..66916.71
rows=4731410 width=26) (actual time=0.051..1.291 rows=101 loops=1)
-> Sort (cost=7.32..7.57 rows=100 width=204) (actual
time=4.311..4.450 rows=100 loops=1)
Sort Key: (cust100.origid)::text
-> Seq Scan on cust100 (cost=0.00..4.00 rows=100 width=204)
(actual time=0.235..2.615 rows=100 loops=1)
Total runtime: 25.031 ms
(7 rows)

With the join column indexed, it takes roughly .32ms/row on the first
four tests (100.. 100K), and about .48ms/row on 1M rows. Without the
index, it runs 100 rows @ .25/row, 1000 @ .26, 10000 @ .27, 100000 @
.48 and .5 @ 1M rows.

In no case does the query plan reflect foreign key validation. Failing
any other suggestions for diagnosis in the soon, I'm going to nuke the
PostgreSQL install, scour it from the machine and start from scratch.
Failing that, I'm going to come in some weekend and re-do the machine.

> Problem is when I recreate the indexes and add the constraints back
> on ORIG I end up with the same long running process. The original
> UPDATE runs for about 30 minutes on a table of 400,000 with the
> WHERE matching about 70% of the rows. The above runs for about 2
> minutes without adding the constraints or indexes however adding the
> constraints and creating the dropped indexes negates any gain.

Is this a frequently-run update?

In my experience, with my seemingly mutant install, dropping indices
and constraints to shave 14/15 off the update time would be worth the
effort. Just script dropping, updating and recreating into one large
transaction. It's a symptom-level fix, but re-creating the fifteen
indices on one of our 5M row tables doesn't take 28 minutes, and your
hardware looks to be rather less IO and CPU bound than ours. I'd also
second Tom's suggestion of moving to 7.4.

/rls

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2004-03-24 16:44:39 Re: Help with query plan inconsistencies
Previous Message scott.marlowe 2004-03-24 16:26:40 Re: slow vacuum performance