Re: hung backends stuck in spinlock heavy endless loop

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hung backends stuck in spinlock heavy endless loop
Date: 2015-01-15 00:26:08
Message-ID: CAHyXU0z1Ea8XvzpXm5PTY0rwYvvkRs6UBU+h7ZwFrtWQn2CY0A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 14, 2015 at 5:39 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Wed, Jan 14, 2015 at 3:38 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> (gdb) print BufferGetBlockNumber(buf)
>> $15 = 9
>>
>> ...and it stays 9 after continuing several times with the breakpoint set.
>
>
> And the index involved? I'm pretty sure that this is an internal page, no?

The index is the oid index on pg_class. Some more info:

*) temp table churn is fairly high. Several dozen get spawned and
destroyed at the start of a replication run, all at once, due to some
dodgy coding via dblink. During the replication run, the temp table
churn rate drops.

*) running btreecheck, I see:
cds2=# select bt_index_verify('pg_class_oid_index');
NOTICE: page 7 of index "pg_class_oid_index" is deleted
NOTICE: page 10 of index "pg_class_oid_index" is deleted
NOTICE: page 12 of index "pg_class_oid_index" is deleted
bt_index_verify
─────────────────

cds2=# select bt_leftright_verify('pg_class_oid_index');
WARNING: left link/right link pair don't comport at level 0, block 9,
last: 2, current left: 4
WARNING: left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
WARNING: left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
WARNING: left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
WARNING: left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
[repeats indefinitely until cancelled]

which looks like the index is corrupted? ISTM _bt_moveright is
hanging because it's trying to move from block 9 to block 9 and so
loops forever.
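
To illustrate the suspected failure mode, here is a minimal sketch of
how a right link that points back at its own block turns a "move
right" walk into an endless loop, and how a hop limit would catch it.
This is a toy model, not the actual _bt_moveright code: ToyPage,
NO_PAGE and walk_right are hypothetical names made up for the example,
and real nbtree pages carry far more state than a single sibling link.

```c
#include <assert.h>
#include <stddef.h>

#define NO_PAGE (-1)    /* hypothetical sentinel: page has no right sibling */

/* Toy stand-in for a B-tree page; only the right-sibling link is modeled. */
typedef struct ToyPage
{
    int right;          /* block number of the right sibling, or NO_PAGE */
} ToyPage;

/*
 * Follow right links starting at block `start`.
 *
 * Returns the number of hops needed to reach the rightmost page, or -1
 * if the chain is broken: a page links to itself, a link points outside
 * the array, or the walk takes more hops than there are pages (which
 * can only happen if the links form a cycle).
 */
int
walk_right(const ToyPage *pages, int npages, int start)
{
    int cur = start;
    int hops = 0;

    assert(pages != NULL && start >= 0 && start < npages);

    while (pages[cur].right != NO_PAGE)
    {
        int next = pages[cur].right;

        if (next == cur)
            return -1;      /* self link: block N -> block N */
        if (next < 0 || next >= npages)
            return -1;      /* dangling link */
        if (++hops >= npages)
            return -1;      /* longer cycle through several pages */
        cur = next;
    }
    return hops;
}
```

Without the three checks inside the loop, a page whose right link is
its own block number would keep the while loop spinning forever, which
is consistent with BufferGetBlockNumber(buf) staying at 9 across
breakpoints.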

merlin
