| From: | Richard Huxton <dev(at)archonet(dot)com> | 
|---|---|
| To: | Justin Pasher <justinp(at)newmediagateway(dot)com> | 
| Cc: | pgsql-general(at)postgresql(dot)org | 
| Subject: | Re: Autovacuum daemon terminated by signal 11 | 
| Date: | 2009-01-15 09:47:41 | 
| Message-ID: | 496F063D.1020308@archonet.com | 
| Lists: | pgsql-general pgsql-hackers | 
Justin Pasher wrote:
> Hello,
> 
> I have a server running PostgreSQL 8.1.15-0etch1 (Debian etch) that was
> recently put into production. Last week a developer started having a problem
> with his psql connection being terminated every couple of minutes when he
> was running a query. When I looked through the logs, I noticed this message:
> 
> 2009-01-09 08:09:46 CST LOG:  autovacuum process (PID 15012) was terminated
> by signal 11

Signal 11 is a segmentation fault - probably a bug or bad RAM.

> I looked through the logs some more and I noticed that this was occurring
> every minute or so. The database is a pretty heavily utilized system
> (judging by the age(datfrozenxid) from pg_database, the system had run
> approximately 500 million transactions in less than a week). I noticed that
> right before every autovacuum termination, it tried to autovacuum a database.
> 
> 2009-01-09 08:09:46 CST LOG:  transaction ID wrap limit is 4563352, limited
> by database "database_name"
> 
> It was always showing the same database, so I decided to manually vacuum the
> database. Once that was done (it was successful the first time without
> errors), the problem seemed to go away. I went ahead and manually vacuumed
> the remaining databases just to take care of the potential xid wraparound
> issue.

I'd be suspicious of possible corruption in autovacuum's internal data.
Can you trace these problems back to a power outage or system crash? It
doesn't look like the problem is in "database_name" itself, since you
vacuumed that successfully. If autovacuum is running normally now, that
might indicate it was something in the way autovacuum was keeping track of
"database_name".

It's also probably worth running some memory tests on the server
(memtest86 or similar) to see if that shows anything. Was it *always*
the autovacuum process getting sig11? If not, then it might just be that a
particular pattern of usage makes it more likely to hit some bad RAM.
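
To check whether it was always autovacuum that died, something like the following rough sketch could tally signal-11 terminations per process type. It assumes the log lines look like the one you quoted (timestamp, "LOG:", process description, PID, signal number). The function name `count_signal_crashes` and the regular expression are only illustrative, so adjust them to your log_line_prefix.

```python
import re
from collections import Counter

# Assumed line shape, taken from the message quoted above:
#   2009-01-09 08:09:46 CST LOG:  autovacuum process (PID 15012) was terminated by signal 11
CRASH_RE = re.compile(
    r"LOG:\s+(?P<proc>.+?) \(PID (?P<pid>\d+)\) was terminated by signal (?P<sig>\d+)"
)


def count_signal_crashes(lines, signal=11):
    """Count terminations by the given signal, grouped by process description."""
    counts = Counter()
    for line in lines:
        match = CRASH_RE.search(line)
        if match and int(match.group("sig")) == signal:
            counts[match.group("proc")] += 1
    return counts


# The two log lines from the original report, each joined back onto one line.
sample = [
    "2009-01-09 08:09:46 CST LOG:  autovacuum process (PID 15012) was terminated by signal 11",
    '2009-01-09 08:09:46 CST LOG:  transaction ID wrap limit is 4563352, limited by database "database_name"',
]

print(dict(count_signal_crashes(sample)))  # {'autovacuum process': 1}
```

Run over the real log file (for example by passing an open file object as `lines`), any key other than "autovacuum process" in the result would suggest the crashes are not specific to autovacuum.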
-- 
  Richard Huxton
  Archonet Ltd