From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: XLogInsert scaling, revisited |
Date: | 2013-06-22 11:32:46 |
Message-ID: | 51C58B5E.7030102@vmware.com |
Lists: | pgsql-hackers |
On 21.06.2013 21:55, Jeff Janes wrote:
> I think I'm getting an undetected deadlock between the checkpointer and a
> user process running a TRUNCATE command.
>
> This is the checkpointer:
>
> #0 0x0000003a73eeaf37 in semop () from /lib64/libc.so.6
> #1 0x00000000005ff847 in PGSemaphoreLock (sema=0x7f8c0a4eb730, interruptOK=0 '\000') at pg_sema.c:415
> #2 0x00000000004b0abf in WaitOnSlot (upto=416178159648) at xlog.c:1775
> #3 WaitXLogInsertionsToFinish (upto=416178159648) at xlog.c:2086
> #4 0x00000000004b657a in CopyXLogRecordToWAL (write_len=32, isLogSwitch=1 '\001', rdata=0x0, StartPos=<value optimized out>, EndPos=416192397312) at xlog.c:1389
> #5 0x00000000004b6fb2 in XLogInsert (rmid=0 '\000', info=<value optimized out>, rdata=0x7fff00000020) at xlog.c:1209
> #6 0x00000000004b7644 in RequestXLogSwitch () at xlog.c:8748

Hmm, it looks like the xlog-switch is trying to wait for itself to
finish. The concurrent TRUNCATE is just being blocked behind the
xlog-switch, which is stuck on itself.

I wasn't able to reproduce exactly that, but I got a PANIC by running
pgbench and concurrently doing "select pg_switch_xlog()" many times in psql.

Attached is a new version that fixes at least the problem I saw. Not
sure if it fixes what you saw, but it's worth a try. How easily can you
reproduce that?

> This is using the same testing harness as in the last round of this patch.
> Is there a way for me to dump the list of held/waiting lwlocks from gdb?
You can print out the held_lwlocks array. Or to make it more friendly,
write a function that prints it out and call that from gdb. There's no
easy way to print out who's waiting for what that I know of.
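
A minimal sketch of such a helper, in case it saves some typing. It is written as a standalone file so it compiles on its own: the held_lwlocks and num_held_lwlocks definitions, the LWLockId typedef and the MAX_SIMUL_LWLOCKS limit below are stand-ins (assumptions) for the backend's private bookkeeping in lwlock.c, and the two debug_* function names are hypothetical. In a real build the functions would have to live in the same source file as the array they read.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Stand-ins so the sketch compiles by itself. In a backend these would be
 * the existing private variables; the type and the limit are assumptions.
 */
#define MAX_SIMUL_LWLOCKS 100
typedef int LWLockId;

static int num_held_lwlocks = 0;
static LWLockId held_lwlocks[MAX_SIMUL_LWLOCKS];

/*
 * Format the currently held locks into buf as "held lwlocks (N): a b c".
 * Output is truncated if buf is too small. Returns the number of locks held.
 */
int
debug_format_held_lwlocks(char *buf, size_t buflen)
{
    int i;
    int off;

    off = snprintf(buf, buflen, "held lwlocks (%d):", num_held_lwlocks);

    /* Stop appending as soon as the buffer is full or snprintf failed. */
    for (i = 0; i < num_held_lwlocks && off > 0 && (size_t) off < buflen; i++)
        off += snprintf(buf + off, buflen - off, " %d", (int) held_lwlocks[i]);

    return num_held_lwlocks;
}

/* Convenience wrapper meant to be invoked from the debugger. */
void
debug_print_held_lwlocks(void)
{
    char buf[1024];

    debug_format_held_lwlocks(buf, sizeof(buf));
    fprintf(stderr, "%s\n", buf);
}
```

With something like that compiled in, attaching gdb to the stuck process and running "call debug_print_held_lwlocks()" should write one line to that process's stderr, so the output would show up wherever the server's stderr goes rather than in the gdb console.
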
Thanks for the testing!
- Heikki
Attachment | Content-Type | Size |
---|---|---|
xloginsert-scale-24.patch.gz | application/x-gzip | 26.5 KB |