Re: Postgres DB crashing

From: AI Rumman <rummandba(at)gmail(dot)com>
To: bhanu udaya <udayabhanu1984(at)hotmail(dot)com>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Chris Travers <chris(dot)travers(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Postgres DB crashing
Date: 2013-06-18 17:54:09
Message-ID: CAGoODpdz_hqW_8FOybfw2GkuTQ6e2P4H9Z+vp9C5X97Vxs8h2g@mail.gmail.com
Lists: pgadmin-support pgsql-general

Stop the autovacuum process and try again.
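
A minimal sketch of one way to do that, assuming it is enough to switch
autovacuum off for the tenant1.customer table named in the log below. The
per-table setting and the server-wide alternative are illustrative choices,
not something confirmed in this thread:

```sql
-- Assumption: disable autovacuum only for the table that appears in the log.
ALTER TABLE tenant1.customer SET (autovacuum_enabled = false);

-- Alternative (assumption): turn it off server-wide by setting
--   autovacuum = off
-- in postgresql.conf, then reload the configuration:
SELECT pg_reload_conf();

-- Re-enable it once the load test is finished:
ALTER TABLE tenant1.customer SET (autovacuum_enabled = true);
```

Even when disabled, PostgreSQL still launches autovacuum when it is needed to
prevent transaction ID wraparound, so this is best treated as a temporary
setting for the duration of the test.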

On Tue, Jun 18, 2013 at 1:31 PM, bhanu udaya <udayabhanu1984(at)hotmail(dot)com> wrote:

> Hello,
> Greetings.
>
> My PostgreSQL (9.2) is crashing after certain load tests. Currently,
> PostgreSQL crashes when 800 to 1000 threads run simultaneously against
> a schema with 10 million records. I am not sure whether we need to tune
> more Postgres parameters. PostgreSQL is currently configured as below
> on a machine with 7 GB of RAM and an Intel Xeon E5507 CPU at 2.27 GHz.
> Is it a Postgres limitation to support only 800 threads, or is some other
> configuration required? Please look at the log below, which shows the
> errors. Please reply.
>
>
> max_connections               5000
> shared_buffers                2024 MB
> synchronous_commit            off
> wal_buffers                   100 MB
> wal_writer_delay              1000ms
> checkpoint_segments           512
> checkpoint_timeout            5 min
> checkpoint_completion_target  0.5
> checkpoint_warning            30s
> work_mem                      1G
> effective_cache_size          5 GB
>
>
>
> 2013-06-11 15:11:17 GMT [26201]: [1-1]ERROR: canceling autovacuum task
>
> 2013-06-11 15:11:17 GMT [26201]: [2-1]CONTEXT: automatic vacuum of table
> "newrelic.tenant1.customer"
>
> 2013-06-11 15:11:17 GMT [25242]: [1-1]LOG: sending cancel to blocking
> autovacuum PID 26201
>
> 2013-06-11 15:11:17 GMT [25242]: [2-1]DETAIL: Process 25242 waits for
> ExclusiveLock on extension of relation 679054 of database 666546.
>
> 2013-06-11 15:11:17 GMT [25242]: [3-1]STATEMENT: UPDATE tenant1.customer
> SET lastmodifieddate = $1 WHERE id IN ( select random_range((select min(id)
> from tenant1.customer ), (select max(id) from tenant1.customer )) as id )
> AND softdeleteflag IS NOT TRUE
>
> 2013-06-11 15:11:17 GMT [25242]: [4-1]WARNING: could not send signal to
> process 26201: No such process
>
> 2013-06-11 15:22:29 GMT [22229]: [11-1]WARNING: worker took too long to
> start; canceled
>
> 2013-06-11 15:24:10 GMT [26511]: [1-1]WARNING: autovacuum worker started
> without a worker entry
>
> 2013-06-11 16:03:33 GMT [23092]: [1-1]LOG: could not receive data from
> client: Connection timed out
>
> 2013-06-11 16:06:05 GMT [23222]: [5-1]LOG: could not receive data from
> client: Connection timed out
>
> 2013-06-11 16:07:06 GMT [26869]: [1-1]FATAL: canceling authentication due
> to timeout
>
> 2013-06-11 16:23:16 GMT [25128]: [1-1]LOG: could not receive data from
> client: Connection timed out
>
> 2013-06-11 16:23:20 GMT [25128]: [2-1]LOG: unexpected EOF on client
> connection with an open transaction
>
> 2013-06-11 16:30:56 GMT [23695]: [1-1]LOG: could not receive data from
> client: Connection timed out
>
> 2013-06-11 16:43:55 GMT [24618]: [1-1]LOG: could not receive data from
> client: Connection timed out
>
> 2013-06-11 16:44:29 GMT [25204]: [1-1]LOG: could not receive data from
> client: Connection timed out
>
> 2013-06-11 16:54:14 GMT [22226]: [1-1]PANIC: stuck spinlock
> (0x2aaab54279d4) detected at bufmgr.c:1239
>
> 2013-06-11 16:54:14 GMT [32521]: [8-1]LOG: checkpointer process (PID
> 22226) was terminated by signal 6: Aborted
>
> 2013-06-11 16:54:14 GMT [32521]: [9-1]LOG: terminating any other active
> server processes
>
> 2013-06-11 16:54:14 GMT [26931]: [1-1]WARNING: terminating connection
> because of crash of another server process
>
> 2013-06-11 16:54:14 GMT [26931]: [2-1]DETAIL: The postmaster has commanded
> this server process to roll back the current transaction and exit, because
> another server process exited abnormally and possibly corrupted shared
> memory.
>
> 2013-06-11 16:54:14 GMT [26931]: [3-1]HINT: In a moment you should be able
> to reconnect to the database and repeat your command.
>
> 2013-06-11 16:54:14 GMT [26401]: [1-1]WARNING: terminating connection
> because of crash of another server process
>
> 2013-06-11 16:54:14 GMT [26401]: [2-1]DETAIL: The postmaster has commanded
> this server process to roll back the current transaction and exit, because
> another server process exited abnormally and possibly corrupted shared
> memory.
>
> 2013-06-11 16:55:08 GMT [27579]: [1-1]FATAL: the database system is in
> recovery mode
>
> 2013-06-11 16:55:08 GMT [24041]: [1-1]WARNING: terminating connection
> because of crash of another server process
>
> 2013-06-11 16:55:08 GMT [24041]: [2-1]DETAIL: The postmaster has commanded
> this server process to roll back the current
>
>
