second concurrent update takes forever

From: Janning Vygen <vygen(at)kicktipp(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: second concurrent update takes forever
Date: 2010-02-08 17:10:08
Message-ID: 201002081810.08343.vygen@kicktipp.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi folks,

I don't need this list very often because postgresql works like a charm! But
today we encountered a rather complicated puzzle for us. We really need your
help!

we are using postgresql 8.4 on a debian lenny with latest security patches
applied.

We are running a rather complicated Update statement from time to time which
is only triggered by administrators. The statement updates about 50.000 rows.
It takes usually about 10-30 seconds to execute and that's fine for us.

This time two administrator did run this update at approximately the same
time, so the second update started before the first finished.

The first update took about 30 seconds which is quite long but not a problem
for us. the second update was useless, because everything was already updated.
Anyway, the second statement took 5 hours!

We were able to reproduce this scenario very easily. We had a local database,
switched off auto-vacuum daemon and had no other concurrent tasks.

We just ran the statement and while it was running we started the very same
statement in a different database session. the first statement finished in 3
seconds, the second statement never returned (we canceled it after an hour or
so).

so we read the documentation about transaction isolation
http://www.postgresql.org/docs/current/static/transaction-iso.html
we are in read committed isolation.

It says about UPDATE statements "The search condition of the command (the
WHERE clause) is re-evaluated to see if the updated version of the row still
matches the search condition."

the update statement has an additional from clause, I show you slightly
simplified version of it (to make it more readable)

UPDATE
queue
SET
queue_status =
CASE WHEN status = 0 OR status = 2
THEN status + 1
ELSE status
END,
queue_index =
CASE WHEN status = 0 OR status = 2
THEN updates.index
ELSE
CASE WHEN queue.index > updates.index
THEN updates.index
ELSE queue.index
END
END
FROM
(
SELECT
matchday.group_id AS group_id,
min (matchday.index) AS index
FROM event NATURAL LEFT JOIN matchday
WHERE event.event_id IN ( 3033445 )
GROUP BY matchday.group_id
) AS updates
WHERE
queue.group_id = updates.group_id
;

so the essence is: the statement has a rather long executing sub-select
statement which takes about a 800ms. I "suspect" postgresql to do the sub-
selection and while updating the first row of this sub-select it sees that this
row was updated since the update statement (or the transaction) was started.
And then it re-checks the condition just for this row (or maybe for a bunch of
rows on the same page). if it comes to updating the next row it happens again.

re-checking 40000+ rows with a very expensive sub-select plan is not good at
all. 40000 times 800 ms = 10 hours.

Am I right with my suspicion?
If yes, how can I fix it?

kind regards
Janning

Responses

Browse pgsql-general by date

  From Date Subject
Next Message John R Pierce 2010-02-08 17:24:45 Re: which the best way to start postgres.
Previous Message Greg Smith 2010-02-08 16:47:57 Re: Multiple buffer cache?