Re: SIGSEGV in BRIN autosummarize

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SIGSEGV in BRIN autosummarize
Date: 2017-10-18 17:06:42
Message-ID: 20171018170642.GD17895@telsasoft.com
Lists: pgsql-hackers

On Wed, Oct 18, 2017 at 06:54:09PM +0200, Alvaro Herrera wrote:
> Justin Pryzby wrote:
>
> > No crashes in ~28hr. It occurs to me that it's a weaker test due to not
> > preserving most compilation options.
>
> And the previous code crashes in 45 minutes? That's solid enough for
> me; I'll clean up the patch and push in the next few days. I think what
> you have now should be sufficient for the time being for your production
> system.

No - the crash happened 4 times since adding BRIN+autosummarize 6 days ago, and
in one instance occurred twice within 3 hours (while I was trying to query logs
for the preceding crash).

[pryzbyj@database ~]$ sudo grep -hE 'in postgres|Saved core' /var/log/messages*
Oct 13 17:22:45 database kernel: postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 error:0 in postgres[400000+692000]
Oct 13 17:22:47 database abrt[32387]: Saved core dump of pid 32127 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-13-17:22:47-32127 (15040512 bytes)
Oct 14 18:05:35 database kernel: postmaster[26500] general protection ip:84a177 sp:7ffd9b349b88 error:0 in postgres[400000+692000]
Oct 14 18:05:35 database abrt[27564]: Saved core dump of pid 26500 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-14-18:05:35-26500 (24137728 bytes)
Oct 16 23:21:22 database kernel: postmaster[31543] general protection ip:4bd467 sp:7ffe08a94890 error:0 in postgres[400000+692000]
Oct 16 23:21:22 database abrt[570]: Saved core dump of pid 31543 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-16-23:21:22-31543 (25133056 bytes)
Oct 17 01:58:36 database kernel: postmaster[8646]: segfault at 8 ip 000000000084a177 sp 00007ffe08a94a88 error 4 in postgres[400000+692000]
Oct 17 01:58:38 database abrt[9192]: Saved core dump of pid 8646 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-17-01:58:38-8646 (7692288 bytes)
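
The four kernel lines above contain only two distinct faulting instruction
pointers (4bd467 and 84a177), each seen twice. A minimal sketch of how one
might tally that from such log lines, assuming Python 3 and the two kernel
message formats shown above; the names crash_ips and IP_RE are hypothetical
and not part of any PostgreSQL or abrt tooling:

```python
import re
from collections import Counter

# Kernel lines copied from the log excerpt above.
LOG_LINES = [
    "Oct 13 17:22:45 database kernel: postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 error:0 in postgres[400000+692000]",
    "Oct 14 18:05:35 database kernel: postmaster[26500] general protection ip:84a177 sp:7ffd9b349b88 error:0 in postgres[400000+692000]",
    "Oct 16 23:21:22 database kernel: postmaster[31543] general protection ip:4bd467 sp:7ffe08a94890 error:0 in postgres[400000+692000]",
    "Oct 17 01:58:36 database kernel: postmaster[8646]: segfault at 8 ip 000000000084a177 sp 00007ffe08a94a88 error 4 in postgres[400000+692000]",
]

# Assumption: the faulting address appears either as "ip:4bd467"
# (general protection) or as "ip 000000000084a177" (segfault).
IP_RE = re.compile(r"\bip[: ]([0-9a-f]+)")


def crash_ips(lines):
    """Count crashes per faulting instruction pointer (parsed as an int)."""
    counts = Counter()
    for line in lines:
        match = IP_RE.search(line)
        if match:
            # int(..., 16) drops the zero padding, so both formats compare equal.
            counts[int(match.group(1), 16)] += 1
    return counts


if __name__ == "__main__":
    for ip, n in sorted(crash_ips(LOG_LINES).items()):
        print(f"{ip:#x}: {n} crash(es)")
```

This only groups the log entries by address; mapping an address to a function
still needs the core dumps and debug symbols.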

> > If I understand, our crash isn't explained by the avw_database test
> > anyway (?)
>
> I don't see why you would think that -- I disagree.

No problem - apparently I read too far into Tom's thoughts regarding memory
context.

I'll continue running with the existing patch and come back if the issue
recurs.

Thanks
Justin
