Postgresql Server Restart continuously

From: alvaro(at)audifarma(dot)com(dot)co
To: pgsql-admin(at)postgresql(dot)org
Subject: Postgresql Server Restart continuously
Date: 2004-08-26 19:52:01
Message-ID: 43765.200.31.204.249.1093549921.squirrel@www.audifarma.com.co
Lists: pgsql-admin


Hello you out there,

I'm having a strange problem with a PostgreSQL 7.4.3 server: sometimes
the server crashes and restarts immediately. Here is the error message
captured from the log file:

ERROR: cache lookup failed for namespace 105183855
LOG: server process (PID 3942) exited with exit code 1
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
.
.
.
.
LOG: all server processes terminated; reinitializing
LOG: could not open file "/data2/datos/postmaster.pid": No such file or
directory
LOG: database system was interrupted at 2004-08-26 09:58:21 COT
LOG: checkpoint record is at 24/6F7B343C
LOG: redo record is at 24/6F73C3C8; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 5006358; next OID: 176076757
LOG: database system was not properly shut down; automatic recovery in
progress
LOG: redo starts at 24/6F73C3C8
LOG: record with zero length at 24/6FC94B6C
LOG: redo done at 24/6FC94B48
LOG: recycled transaction log file "000000240000006C"
LOG: removing transaction log file "000000240000006D"
LOG: removing transaction log file "000000240000006E"
LOG: database system is ready

Here is the output of the ipcs command:

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 65536      postgres   600        278257664  63

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 32768      apache     600        1
0x00000000 65537      apache     600        1
0x0052e2c1 786434     postgres   600        17
0x0052e2c2 819203     postgres   600        17
0x0052e2c3 851972     postgres   600        17
0x0052e2c4 884741     postgres   600        17
0x0052e2c5 917510     postgres   600        17
0x0052e2c6 950279     postgres   600        17
0x0052e2c7 983048     postgres   600        17
0x0052e2c8 1015817    postgres   600        17
0x0052e2c9 1048586    postgres   600        17
0x0052e2ca 1081355    postgres   600        17

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
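
As a rough sanity check on the semaphore listing, here is a minimal sketch
that compares the number of postgres semaphore sets with what
max_connections = 150 (from the postgresql.conf values in this message)
would suggest. It assumes the rule from the 7.4 documentation that the
server needs ceil(max_connections / 16) sets of 17 semaphores each; the
variable names are only illustrative.

```python
import math

MAX_CONNECTIONS = 150   # value from postgresql.conf in this message
BACKENDS_PER_SET = 16   # assumption: one set serves 16 connections (7.4 docs)
SEMS_PER_SET = 17       # nsems reported by ipcs for every postgres set

# (owner, nsems) pairs copied from the ipcs semaphore listing.
ipcs_semaphores = [("apache", 1), ("apache", 1)] + [("postgres", 17)] * 10

expected_sets = math.ceil(MAX_CONNECTIONS / BACKENDS_PER_SET)
postgres_sets = [nsems for owner, nsems in ipcs_semaphores if owner == "postgres"]

print("expected semaphore sets:", expected_sets)
print("postgres sets in ipcs:  ", len(postgres_sets))
print("all sets have 17 semaphores:",
      all(nsems == SEMS_PER_SET for nsems in postgres_sets))
```

If the two counts agree, the semaphore allocation looks like what the
server would normally request for this max_connections setting.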

I've noticed that every time this happens, PostgreSQL reports a cache
lookup error pointing to a namespace, but I don't know which object it
refers to.

ERROR: cache lookup failed for namespace 105183855
ERROR: cache lookup failed for namespace 185104342
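
One way to see what these OIDs refer to is to look them up in the
pg_namespace system catalog. The following is a minimal sketch, assuming
the log lines look like the ones above; the function names are
hypothetical, and the script only builds the query text (it does not
connect to the server).

```python
import re

# Matches the error line as it appears in the log excerpt above.
NAMESPACE_ERROR = re.compile(r"cache lookup failed for namespace (\d+)")

def namespace_oids(log_lines):
    """Return the distinct namespace OIDs found in the log, in first-seen order."""
    seen = []
    for line in log_lines:
        match = NAMESPACE_ERROR.search(line)
        if match:
            oid = int(match.group(1))
            if oid not in seen:
                seen.append(oid)
    return seen

def lookup_query(oids):
    """Build a query against pg_namespace for the given OIDs."""
    if not oids:
        raise ValueError("no namespace OIDs to look up")
    oid_list = ", ".join(str(oid) for oid in oids)
    return f"SELECT oid, nspname FROM pg_namespace WHERE oid IN ({oid_list});"

log = [
    "ERROR: cache lookup failed for namespace 105183855",
    "LOG: server process (PID 3942) exited with exit code 1",
    "ERROR: cache lookup failed for namespace 185104342",
]
print(lookup_query(namespace_oids(log)))
```

The printed statement can be run in psql. An empty result would mean that
no row with that OID currently exists in pg_namespace.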

I've looked in the manual for advice on what to do when this kind of
thing happens, but I couldn't find anything (or maybe the answer is
right there and I just can't see it).

Searching the web, I found a message suggesting that the database be
reindexed when "cache lookup failed" occurs. Is this a real solution?

Here are some parameters that I've modified in the postgresql.conf file:

max_connections = 150
shared_buffers = 32768
sort_mem = 2048
vacuum_mem = 32568
max_fsm_pages = 200000
max_fsm_relations = 200
max_files_per_process = 10000
wal_buffers = 256
checkpoint_segments = 10
checkpoint_timeout = 600
effective_cache_size = 10000
random_page_cost = 2

kernel.shmmax = 4000000000
kernel.shmall = 4000000000
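
To put these settings in perspective, here is a rough sketch of the memory
they imply. It assumes the default 8 kB block size, that shared_buffers,
wal_buffers and effective_cache_size are counted in blocks, and that
sort_mem is in kB per sort, as documented for 7.4. The worst case for
sort_mem assumes exactly one sort per connection, which is only an
illustration.

```python
BLOCK_SIZE = 8192  # bytes; assumption: default PostgreSQL block size

max_connections = 150
shared_buffers = 32768        # blocks
wal_buffers = 256             # blocks
effective_cache_size = 10000  # blocks (planner hint only, not allocated)
sort_mem = 2048               # kB per sort operation

ipcs_segment_bytes = 278257664  # shared memory segment size reported by ipcs

def mib(n_bytes):
    """Convert bytes to MiB."""
    return n_bytes / (1024 * 1024)

shared_buffers_bytes = shared_buffers * BLOCK_SIZE
wal_buffers_bytes = wal_buffers * BLOCK_SIZE
effective_cache_bytes = effective_cache_size * BLOCK_SIZE
# Illustrative worst case: every connection runs one sort at the same time.
sort_mem_bytes = sort_mem * 1024 * max_connections
# Whatever the segment holds beyond the buffer pool itself.
segment_overhead_bytes = ipcs_segment_bytes - shared_buffers_bytes

print(f"shared_buffers:          {mib(shared_buffers_bytes):.1f} MiB")
print(f"wal_buffers:             {mib(wal_buffers_bytes):.1f} MiB")
print(f"effective_cache_size:    {mib(effective_cache_bytes):.1f} MiB")
print(f"sort_mem, 150 sorts:     {mib(sort_mem_bytes):.1f} MiB")
print(f"segment beyond buffers:  {mib(segment_overhead_bytes):.1f} MiB")
```

Under these assumptions the buffer pool accounts for most of the shared
memory segment shown by ipcs, and all of the totals are small next to
16 GB of RAM.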

Those values are high, but the hardware platform is robust (I guess):

Dell PowerEdge 6600, 16 GB RAM, SCSI RAID 5 (200 GB total), 4 CPUs.

Do you think these values are correct, or could one of them be the
origin of the problem?

Thanks in advance,

Alvaro
