Re: Updates are slow..

From: "Tom Burke" <lists(at)spamex(dot)com>
To: <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <ffabrizio(at)mmrd(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Updates are slow..
Date: 2002-06-11 21:26:55
Message-ID: NDBBIIDKBFNMNPDNLCDDKEACDCAA.lists@spamex.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Well, interesting developments - mostly our error. I'm going to
overkill on information just to be thorough (and because I
already collected it :).

Fran Fabrizio had an extremely helpful suggestion:
> No idea if this would help but try REINDEX TABLE EMP; beforehand too.
> VACUUM ANALYZE does not free space from indexes; only tables. I just
> discovered this last month after a year of using Pg.

After trying this, the time was cut from 40 minutes to 20 minutes,
which is a huge improvement but still quite slow.

Tom, many thanks for testing it out.
Here are the answers to your questions.

> What are the datatypes of your columns, exactly?

Because we've solved the problem, I'll just cut it down
to the columns we have data in:

Column | Type
-------------------------+--------------------------
emp_id | integer | not null
email | character varying(255) |
emp_code | character varying(50) |
created_dt | timestamp with time zone | not null
fk_emp_status_id | smallint | not null
fk_dept_id | integer | not null
fk_role_id | integer |

We actually have 10 more columns, but i no longer think it's
relevant.

> What's the average
> tuple size (actually, showing the VACUUM VERBOSE stats for the table
> would be the most useful answer)?

eppend=# vacuum verbose clientdata;
NOTICE: --Relation clientdata--
NOTICE: Index emp_pk: Pages 4018; Tuples 1230703: Deleted 597153.
CPU 0.87s/9.20u sec elapsed 15.27 sec.
NOTICE: Index emp_ix01: Pages 5445; Tuples 1230703: Deleted 597153.
CPU 0.96s/6.62u sec elapsed 17.83 sec.
NOTICE: Index emp_ix02: Pages 13972; Tuples 1230703: Deleted 597153.
CPU 2.73s/8.95u sec elapsed 78.56 sec.
NOTICE: Removed 1194306 tuples in 16048 pages.
CPU 3.28s/2.91u sec elapsed 21.83 sec.
NOTICE: Pages 32576: Changed 16469, Empty 0; Tup 1230703: Vac 1194306,
Keep 0, UnUsed 5501.
Total CPU 11.32s/28.43u sec elapsed 155.81 sec.
VACUUM

The vacuum made me realize that there was an index I had forgotten
about. emp_pk is (emp_id) and emp_ix01 is (fk_dept_id, fk_emp_status_id).
More on emp_ix02 in a second, which is the real cause of the problem.

> Are you *sure* there are no foreign
> keys either from or to this table?

Yes, actually we've dropped all other tables for simplicity of the test.
This is the only table in the database.

> Also, what plan is shown by EXPLAIN for the query?
> regards, tom lane

This explain analyse I ran took some 20+ minutes itself.

eppend=# explain analyse
eppend-# update emp
eppend-# set fk_dept_id = 5
eppend-# where fk_dept_id= 4;
NOTICE: QUERY PLAN:

Seq Scan on clientdata (cost=0.00..47959.79 rows=915643 width=522)
(actual time=1764.06..362296.23 rows=597153 loops=1)
Total runtime: 1441799.02 msec

EXPLAIN

> It would be also a good idea if you could send us the output of your
> system "ipcs" command so we can get an idea of the amount of shared
> memory you are using.
>
> - Ericson Smith

bash-2.05$ ipcs

------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x0052e2c1 0 postgres 600 68878336 4
0x00000000 163841 root 777 196608 2 dest
0x00000000 851970 root 777 196608 2 dest
0x00000000 1900547 root 777 196608 2 dest
0x00000000 2031620 root 777 196608 2 dest

------ Semaphore Arrays --------
key semid owner perms nsems status
0x0052e2c1 0 postgres 600 17
0x0052e2c2 32769 postgres 600 17
0x0052e2c3 65538 postgres 600 17

------ Message Queues --------
key msqid owner perms used-bytes messages

However, the real kicker is that I looked up what the emp_ix02 index
was and it was (email, fk_dept_id). Therefore, there were actually
two indexes on this column that had to be updated, and this index
had a lot more pages. When I dropped this index and re-ran the updated
- it took only 2 minutes!

So with the combination of the REINDEX and dropping a troublesome
index (we'll recreate the email index without dept_id), we're down
to a good time. I'm hoping reindex will help with some of the
other slow queries we've seen.

Thanks again for everyone's help!

Tom Burke
Eppend, Inc.
http://www.eppend.com/

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Chris Gamache 2002-06-11 21:27:52 Re: "ERROR:" Messages
Previous Message Jan Wieck 2002-06-11 21:00:01 Re: details of postgres front