Missing Toast Chunk

From: Sam Nelson <samn(at)consistentstate(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Missing Toast Chunk
Date: 2010-08-19 17:26:50
Message-ID: AANLkTinUYP32RpcB6-pDQwOz3_pZPuKUVOeUzh3WFXux@mail.gmail.com
Lists: pgsql-general

Good morning, list.

We've got a bit of a problem on a customer's production box. This week it
raised a "missing chunk number 0 for toast value N" error (N being a number).
We verified that only one row was affected, tried to fix it with updates,
and ended up deleting the row.
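
One way to confirm that only a single row is affected is to read the rows one
at a time and note which reads fail. The following is a minimal sketch of that
idea, not the procedure actually used here. The names find_unreadable_ids and
read_row are hypothetical; read_row stands for whatever callable fetches one
full row by id through your database driver (for example a SELECT * ... WHERE
id = ...), ideally with autocommit on so one failed read does not abort the
reads that follow.

```python
def find_unreadable_ids(ids, read_row):
    """Try to read every row and return the ids whose read raised an error.

    read_row is a hypothetical caller-supplied function that fetches the
    complete row for one id and raises if the server reports an error.
    """
    bad = []
    for row_id in ids:
        try:
            # Fetching all columns forces the server to read any TOASTed value.
            read_row(row_id)
        except Exception as exc:
            # Keep the message so the toast value can be matched up later.
            bad.append((row_id, str(exc)))
    return bad
```

Reading row by row is slow on large tables, but it isolates the damaged rows
without relying on a single failing COPY.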

To check for similar issues in other tables, we set up a script to run at
midnight and do a pg_dump on each individual table in the database where the
original error happened, sending stderr to a log file. Since the original
problem was discovered while running pg_dump, we figured this would show us
any tables that have similar issues.
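
A nightly check of this kind could look roughly like the sketch below. It is
an illustration under assumptions, not the script that was actually run: the
function names, the table list, and the log path are all made up, and it
assumes pg_dump is on the PATH and can connect without a password prompt.
The dump itself is discarded; only stderr is kept.

```python
import subprocess
from datetime import datetime, timezone


def build_dump_command(dbname, table):
    """Return the pg_dump argv that dumps a single table of dbname."""
    return ["pg_dump", "--table", table, dbname]


def _stamp():
    # Timestamp in the same style as the log markers, always in UTC.
    return datetime.now(timezone.utc).strftime("%a %b %d %H:%M:%S UTC %Y")


def dump_tables(dbname, tables, log_path, run=subprocess.run):
    """Dump each table in turn, append stderr to log_path, return failures."""
    failed = []
    with open(log_path, "a") as log:
        for table in tables:
            log.write(f"--| Table {table} dump start: {_stamp()} |--\n")
            # Flush so our marker lands before anything pg_dump writes.
            log.flush()
            result = run(
                build_dump_command(dbname, table),
                stdout=subprocess.DEVNULL,  # the dump output is not needed
                stderr=log,                 # errors go straight to the log
            )
            log.write(f"--| Table {table} dump end: {_stamp()} |--\n")
            if result.returncode != 0:
                failed.append(table)
    return failed
```

The run parameter exists only so the loop can be exercised without a live
server; in normal use it stays at its default.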

We found the same problem in a couple of other tables, but the big problem
is that the same table that we just fixed had that error again, in a
different row this time.

Some information on the customer's box: It's an Amazon EC2 box running
Debian (I believe Debian 5, but I'm not sure). They are using PostgreSQL
8.3.11, installed from apt. They are mainly using Ruby on Rails for their
application(s).

Here's the full error from the log file, minus (mildly) sensitive info:

--| Table schema.table dump start: Wed Aug 18 04:54:34 UTC 2010 |--
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: missing chunk number 0 for toast value N in pg_toast_M
pg_dump: The command was: COPY schema.table (id, foreign_key, some_text_stuff, timestamp1, timestamp2) TO stdout;
--| Table schema.table dump end: Wed Aug 18 04:54:44 UTC 2010 |--
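
To pull these errors out of a longer log, a small parser along the following
lines might help. It is only a sketch: the function name is hypothetical and
the pattern is inferred from the single message quoted in this post, so other
server versions or locales may word it differently. The pattern tolerates the
message being wrapped across lines.

```python
import re

# Matches "missing chunk number C for toast value V in pg_toast_M",
# allowing any whitespace (including newlines) between the words.
TOAST_ERROR = re.compile(
    r"missing\s+chunk\s+number\s+(\d+)\s+for\s+toast\s+value\s+(\d+)"
    r"\s+in\s+(pg_toast_\d+)"
)


def find_toast_errors(log_text):
    """Return (chunk number, toast value, toast relation) for each error."""
    return [
        (int(chunk), int(value), relation)
        for chunk, value, relation in TOAST_ERROR.findall(log_text)
    ]
```

The numeric suffix of the pg_toast relation is normally the OID of the table
that owns it, which gives a way to cross-check which table an error belongs
to.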

So the question is, what could be causing this? Finding that error once in
their database is not so terrible, but it happened again right after we
fixed it. Could it be Ruby? The customer's application(s)? Some weirdness
with Amazon EC2 and/or Debian? A bug in PostgreSQL itself? Any ideas?

-Sam
