Re: BUG #5443: Undetected deadlock situation

From: Claudio Freire <claudio(at)livra(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5443: Undetected deadlock situation
Date: 2010-06-03 17:34:14
Message-ID: 1275586454.24950.8.camel@klauss.livra.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Fri, 2010-04-30 at 11:50 -0400, Tom Lane wrote:

> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> > Eliminating null columns and mangling column headers for length, I
> > get this:
>
> > locktype | tranid | virtualx | pid | mode | gr
> > transactionid | 39773877 | 63/15761 | 11157 | ShareLock | f
> > transactionid | 39773877 | 4/10902 | 6421 | ExclusiveLock | t
>
> > So it looks like two locks on the same transaction ID by different
> > transactions. How does that happen?
>
> That's perfectly normal --- it indicates that pid 11157 is waiting for
> a row-level lock that's currently held by pid 6421. We translate
> row-level locking delays into waits for XID locks in order to limit
> the number of distinct locks that have to be remembered in the shared
> lock table. (We'd soon blow out shared memory if per-row lock data had
> to be kept there.)
>
> As Peter stated, there's no evidence of an actual problem in this
> bug report. I'd go looking for clients sitting with open
> transactions...

It may be that indeed there isn't a deadlock, but an abnormal
performance drop.

I couldn't get a gdb trace before I just had to work around the issue
since it happens in a production server, and when it does it means
trouble for our app.

The workaround is to break the transaction into a SELECT for the rows to
be updated, followed by individual updates batched in transactions of a
couple hundred.

With that pattern, everything works as expected. The SELECT takes a
sweet time (but doesn't block any other transaction), and the updates go
pretty fast thanks to HOT.

I didn't try a SELECT FOR UPDATE followed by all the updates in a single
transaction yet, that would result in the same behavior as the massive
update, but perhaps without the "deadlock" (or performance drop).

What I did do is analyze server load during the events, and as I
suspected, disk activity during the "deadlocks" seems to suggest a
vacuuming taking place. Although there was no autovacuum entry in
pg_stat_activity every time I checked, disk activity precisely matches
the case when autovacuum decides to vacuum a big table.

That's about as much information I can give. We've worked around the
issue successfully and it hasn't happened since. Even if it is not a
proper deadlock, the performance drop is unacceptable. I've done massive
updates before, and that performance drop was not expected (more than
one day for updating 30k rows on a table with a couple indices).

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2010-06-03 17:42:11 Re: BUG #5443: Undetected deadlock situation
Previous Message Tom Lane 2010-06-03 17:21:29 Re: superuser unable to modify settings of a system table