Re: dsa_allocate() faliure

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, alvherre(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: dsa_allocate() faliure
Date: 2018-11-30 20:46:47
Message-ID: 20181130204647.GK24746@telsasoft.com
Lists: pgsql-hackers pgsql-performance

On Fri, Nov 30, 2018 at 08:20:49PM +0100, Jakub Glapa wrote:
> In the last days I've been monitoring no segfault occurred but the
> das_allocation did.
> I'm starting to doubt if the segfault I've found in dmesg was actually
> related.

The dmesg looks like a real crash, not just OOM. You can hopefully find the
timestamp of the segfaults in /var/log/syslog, and compare with postgres logs
if they go back far enough. All the postgres processes except the parent
would've been restarted at that time.
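The timestamp comparison could be done roughly as follows. This is only a sketch: the function names are made up for illustration, the syslog path varies by distribution, and the postgres log location depends on your configuration.

```shell
# Hypothetical helpers for lining up kernel segfaults with postgres restarts.

# Print kernel segfault lines that mention postgres.
# $1: syslog-style file (default path is an assumption; it may be
#     /var/log/messages or /var/log/kern.log on your system).
find_segfaults() {
    grep -i 'segfault' "${1:-/var/log/syslog}" | grep -i 'postgres'
}

# Print the postmaster's own record of a backend crash and the restart.
# $1: a postgres server log file.
find_crash_restarts() {
    grep -E 'was terminated by signal|all server processes terminated' "$1"
}
```

If a segfault line and a "terminated by signal" line share (nearly) the same timestamp, the dmesg entry really was a postgres backend crashing.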

> I've grepped the postgres log for dsa_allocated:
> Why do the messages occur sometimes as FATAL and sometimes as ERROR?

I believe it depends on whether it happens in a parallel worker or in the leader.
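To see how the two severities are distributed, something like the following might help. It is a sketch that assumes a plain-text server log where the severity is followed by a colon; the function name is hypothetical, and the message text is taken from the elog() call in dsa.c.

```shell
# Hypothetical helper: count dsa_allocate failures per severity level.
# $1: a postgres server log file.
count_dsa_errors() {
    grep -o -E '(ERROR|FATAL): +dsa_allocate could not find [0-9]+ free pages' "$1" \
        | cut -d: -f1 | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Each output line is a severity followed by its count, which makes it easy to check whether the FATAL entries line up with worker processes.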

You may get more log detail if you enable CSV logging (although unfortunately
as I recall it doesn't indicate it's a parallel worker).

You could force it to dump core if you recompile postgres with an abort() added
at the point of failure (see patch below).

You could build a .deb by running dpkg-buildpackage -rfakeroot or similar (I
haven't done this in a while), or you could compile, install, and launch
debugging binaries from your homedir (or similar).

You'd want to compile the same version (git checkout REL_10_6) with the
proper configure flags, perhaps starting with:
./configure --with-libxml --with-libxslt --enable-debug --prefix=$HOME/src/postgresql.bin --enable-cassert && time make && make install

If you have extensions installed, check that they still work with the new build.
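Launching the homedir build with core files enabled might look like the sketch below. PGBIN, the function name and the log file name are assumptions for illustration; the packaged server has to be stopped first, and the data directory must belong to the same major version.

```shell
# Sketch: start the privately built server with core files allowed.
# PGBIN defaults to the bin directory under the --prefix used above.
PGBIN="${PGBIN:-$HOME/src/postgresql.bin/bin}"

# $1: data directory. pg_ctl's -c option lifts the soft resource limit
# on core files for the server it starts.
start_with_cores() {
    "$PGBIN/pg_ctl" -D "$1" -c -l "$1/debug-server.log" start
}
```

After a crash, the core file can be opened with gdb against "$PGBIN/postgres" and a backtrace taken with bt. Where the core file ends up depends on the kernel's core_pattern setting.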

Justin

--- a/src/backend/utils/mmgr/dsa.c
+++ b/src/backend/utils/mmgr/dsa.c
@@ -727,4 +727,7 @@ dsa_allocate_extended(dsa_area *area, size_t size, int flags)
 		if (!FreePageManagerGet(segment_map->fpm, npages, &first_page))
+		{
+			abort();	/* must come first: elog(FATAL) does not return */
 			elog(FATAL,
 				 "dsa_allocate could not find %zu free pages", npages);
+		}
