Re: BUG #10542: infinite loop in index.c when trying to reindex system tables (probably corrupted db state)

From: "hannes(dot)janetzek(at)gmail(dot)com" <hannes(dot)janetzek(at)googlemail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #10542: infinite loop in index.c when trying to reindex system tables (probably corrupted db state)
Date: 2014-06-07 16:11:01
Message-ID: CA+jkZB3JhjKt2=ie=0gcixo2-37gM+fMKfGpcaRe5MMLC6nA-Q@mail.gmail.com
Lists: pgsql-bugs

Hi,

On Fri, Jun 6, 2014 at 6:34 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
wrote:

> Hi,
>
> On 2014-06-05 23:00:56 +0000, hannes(dot)janetzek(at)gmail(dot)com wrote:
> > While trying to get our database working again after a forced shutdown the
> > reindexing of the system tables in single user mode went into an infinite
> > loop.
>
> what happened in that infinite loop? The log excerpt below doesn't show
> one? Is it constantly echoing a message?
>

There were no messages from postgres while it was looping. I attached gdb,
stepped through the code, and found the lines around L2260-L2385 repeating.
perf also showed activity only below IndexBuildHeapScan while tracing for a
few minutes. From looking at the source, my guess is that a tuple that is
being indexed has a stale 'about-to-be-deleted' state. The *very* well
documented source states that 'we wait for the deleting transaction to
finish and check again'. I wonder whether a deleting transaction can be in
progress in single-user mode while a reindex is running, though I really
don't have any clue about pg internals :)
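
To make the guess above concrete, here is a toy model of a "wait for the
deleter, then check again" loop. It is only a sketch of my assumption, not
the code from index.c: ToyTuple, wait_for_deleter, scan_tuple and the
max_rechecks cap are all hypothetical names made up for illustration. If the
tuple claims a delete is in progress but no deleting transaction actually
exists, the wait returns at once, nothing changes, and the recheck never ends.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy visibility states (hypothetical, not PostgreSQL's real enum). */
typedef enum { TUPLE_LIVE, TUPLE_DEAD, TUPLE_DELETE_IN_PROGRESS } TupleState;

typedef struct {
    TupleState state;     /* what the visibility check reports */
    bool deleter_exists;  /* is there really a transaction to wait for? */
} ToyTuple;

/*
 * Hypothetical stand-in for "wait for the deleting transaction to finish".
 * A real deleter finishes here and the tuple becomes dead. With a stale
 * state there is nobody to wait for, so the call returns immediately and
 * the tuple is left unchanged.
 */
static void wait_for_deleter(ToyTuple *t)
{
    if (t->deleter_exists) {
        t->deleter_exists = false;
        t->state = TUPLE_DEAD;
    }
}

/*
 * Recheck the tuple until it is no longer "delete in progress".
 * Returns the number of rechecks needed, or -1 once max_rechecks is hit.
 * The cap exists only so that this sketch terminates; an uncapped loop
 * would spin forever on a stale tuple.
 */
static int scan_tuple(ToyTuple *t, int max_rechecks)
{
    int rechecks = 0;

    assert(max_rechecks > 0);
    while (t->state == TUPLE_DELETE_IN_PROGRESS) {
        if (rechecks == max_rechecks)
            return -1;
        wait_for_deleter(t);
        rechecks++;
    }
    return rechecks;
}
```

With a real deleter the loop settles after one recheck; with a stale state it
only stops because of the artificial cap, which would match a silent spin
with no log output.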

> could you explain how you got into the bad state? Are you using
> fsync=off?
>

Yes, the instance was running with fsync=off. We use it for rendering
OpenStreetMap map tiles, so the content is not that critical.

Regards,
Hannes

> > I can just roughly guess that between the lines:
> >
> > https://github.com/postgres/postgres/blob/REL9_3_STABLE/src/backend/catalog/index.c#L2260-L2385
> > the function is assuming that another process tries to delete a tuple that
> > is about to be indexed (even though in single user mode this should probably
> > not be possible)
>
> We've recently fixed a bug around that - but I don't immediately see how
> it could be applicable in that scenario.
>
>

> Greetings,
>
> Andres Freund
>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>

--
Hannes
