Re: postgres bug report

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: jfaith <jfaith(at)cemsys(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: postgres bug report
Date: 2000-07-21 14:44:22
Message-ID: 13121.964190662@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

jfaith <jfaith(at)cemsys(dot)com> writes:
> A daily 'vacuum verbose' cron failed, the end of the log follows:-
> NOTICE: Index crd1: Pages 533; Tuples 104774: Deleted 6159. CPU
> 0.43s/0.81u sec.
> pqReadData() -- backend closed the channel unexpectedly.

Is there a coredump file from the vacuum crash? (Look in the
data/base/DBNAME/ directory for the database.) If so a backtrace
from it would be helpful.

> The prtc_ncn process died when the cron ran. On restart it seemed to be
> fine.

> At the time the cron runs prtcwri would not have any activity and did
> not die,
> when audits came though later its log file indicated they were performed
> correctly. However when I ran psql selects on cards did not show the new
> data!

This seems odd. A hard crash in the vacuum process should lead the
postmaster to abort all other active backends. Why would prtc_ncn's
backend quit but not prtcwri's ?

> If it will be helpfull I can simulate most of these cicumstances on a
> test machine here in an attempt to recreate the failure. What debugging
> options should I use?

Compile with -g and assert checking (configure --enable-cassert handles
the second, for the first the easiest way is "gmake PROFILE=-g all").

If you did not get a corefile above, this probably indicates you are
starting the postmaster with coredumps disabled --- on many systems
processes started from boot scripts or init run with "ulimit -c 0" by
default, and you have to change that to get cores from backends. Check
that before trying very hard to simulate the failure (an easy way to
verify whether you get cores from crashes is "kill -SEGV backendPID" on
a running backend...)

> I've just looked at the code for vacuum. If I understand correctly it
> locks each table in turn exclusively.

Yes. There is no alternative to that. A trick that many people use
is to drop indexes on a table, vacuum the table, rebuild indexes;
index-vacuuming is slower than a full rebuild at present :-(. However
if you are using the indexes to enforce UNIQUE constraints then this
may not be a good answer.

We have long-range plans to allow space recycling without VACUUM,
which'll make life easier for 24x7 operations.

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message DalTech - CTE 2000-07-21 15:06:27 Re: problem with view and case - please help
Previous Message Ange Michel POZZO 2000-07-21 14:41:06 Re: [SQL] problem with view and case - please help