
From: john gale <john(at)smadness(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: cannot delete corrupted rows after DB corruption: tuple concurrently updated
Date: 2014-02-26 07:45:18
Message-ID: 5B4AB98C-0230-4C9B-8F76-BB0B140315D5@smadness.com
Lists: pgsql-general


Does anybody have any ideas about this?

We restarted the postmaster and the issue persists. So where we could previously clean up corruption under 9.0.4, it seems we can no longer do so under 9.3.2. I'm assuming this because our data-insert environment has not changed, so we shouldn't be hitting any different transaction concurrency / isolation problems than we did before.

Is there a way to force deletion of a row that bypasses the concurrent-update check? It looks like changing default_transaction_isolation did not affect this:

munin2=# delete from testruns where ctid = '(37069305,4)';
ERROR: tuple concurrently updated

2014-02-26 07:42:46 GMT LOG: received SIGHUP, reloading configuration files
2014-02-26 07:42:46 GMT LOG: parameter "default_transaction_isolation" changed to "read uncommitted"
2014-02-26 07:42:53 GMT ERROR: tuple concurrently updated
2014-02-26 07:42:53 GMT STATEMENT: delete from testruns where ctid = '(37069305,4)';
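For anyone finding this thread later: the step before deleting by ctid is locating which tuples are actually corrupted, and a row-at-a-time scan that forces detoasting is one way to do that, since casting the whole row to text makes Postgres read every TOAST chunk and raises the "unexpected chunk number" error only for the bad rows. A rough sketch of that approach (the table name, the 64-rows-per-page cap, and the psycopg2-style connection object are illustrative assumptions, not our exact script):

```python
def page_scan_query(table, page, rows_per_page=64):
    """Build a query that detoasts every tuple on one heap page.

    length(t::text) forces all columns of each row to be fully
    detoasted, so a corrupted TOAST chunk raises an error for
    exactly the row that is broken.
    """
    ctids = ", ".join("'(%d,%d)'" % (page, i) for i in range(1, rows_per_page + 1))
    return ("SELECT ctid, length(t::text) FROM %s AS t "
            "WHERE ctid = ANY (ARRAY[%s]::tid[])" % (table, ctids))


def find_bad_ctids(conn, table, pages, rows_per_page=64):
    """Probe rows one ctid at a time; return the (page, line) pairs
    whose detoast fails.  `conn` is any DB-API connection (e.g. psycopg2)."""
    bad = []
    for page in pages:
        for line in range(1, rows_per_page + 1):  # heap line pointers start at 1
            cur = conn.cursor()
            try:
                cur.execute(
                    "SELECT length(t::text) FROM %s AS t WHERE ctid = %%s" % table,
                    ("(%d,%d)" % (page, line),))
                cur.fetchall()
            except Exception:
                conn.rollback()  # the error aborts the transaction; reset it
                bad.append((page, line))
            finally:
                cur.close()
    return bad
```

Once the bad ctids are known, each one can be targeted with `DELETE FROM testruns WHERE ctid = '(page,line)'` as in the session above; the probe query itself is read-only.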

thanks,

~ john

On Feb 25, 2014, at 11:43 AM, john gale <john(at)smadness(dot)com> wrote:

> We ran into an open file limit on the DB host (Mac OS X 10.9.0, Postgres 9.3.2) and caused the familiar "ERROR: unexpected chunk number 0 (expected 1) for toast value 155900302 in pg_toast_16822" when selecting data.
>
> Previously when we've run into this kind of corruption we could find the specific corrupted rows in the table and delete by ctid. However, this time we're running into a persistent "ERROR: tuple concurrently updated" when deleting by ctid.
>
> munin2=# select ctid from testruns where id = 141889653;
> ctid
> --------------
> (37069816,3)
> (1 row)
>
> munin2=# delete from testruns where ctid = '(37069816,3)';
> ERROR: tuple concurrently updated
>
> This always occurs and seems to prevent us from cleaning up the database by removing the corrupted rows.
>
> Before attempting more drastic things like restarting the postgres instance, is there some known way of getting around this error and cleaning up the corruption (other than the full replicate / reindex suggestions from around the web, which are more involved than deleting corrupted rows by ctid)?
>
> thanks,
>
> ~ john
