Re: SIGSEGV in BRIN autosummarize

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SIGSEGV in BRIN autosummarize
Date: 2017-10-18 16:48:55
Message-ID: 20171018164855.GC17895@telsasoft.com
Lists: pgsql-hackers

On Tue, Oct 17, 2017 at 09:07:40AM -0500, Justin Pryzby wrote:
> On Tue, Oct 17, 2017 at 09:34:24AM -0400, Tom Lane wrote:
> > Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> > > On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
> > >> Anyway, can give this patch a try?
> >
> > The trick in this sort of situation is to make sure you build binaries
> > that match your existing install in every way except having the added
> > patch, and maybe getting installed into a different directory.
>
> I'm familiar with that process; but, these are PG10 binaries from PGDG for
> centos6 x64. I had to add symlinks for postgis library, but otherwise seems to
> be working fine (although I didn't preserve as many configure options as your
> message would suggest I should have).

On Tue, Oct 17, 2017 at 12:49:55PM -0400, Tom Lane wrote:
> So what I'm thinking is that you need an error during perform_work_item,
> and/or more than one work_item picked up in the calling loop, to make this
> bug manifest. You would need to enter perform_work_item in a

..in our case probably due to interruption by LOCK TABLE, yes?

On Tue, Oct 17, 2017 at 12:49:55PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > And I think that's because we're not
> > checking that the namespace OID is a valid value before calling
> > get_namespace_name on it.
>
> The part of your patch that adds a check on avw_database is clearly
> correct and necessary. I'm thinking the change you propose in
> perform_work_item is just overcomplicating code that's okay as it
> stands. We don't need to optimize for the schema-went-away case.

No crashes in ~28hr with the patched build. It occurs to me that this is a
weaker test, since I didn't preserve most of the compilation options. If I
understand correctly, our crash isn't explained by the avw_database test
anyway (?)

Should I make clean and recompile with all the non-prefix configure options
and a minimal patch (avw_database==MyDatabaseId || continue)?
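
To be concrete about what I mean by the minimal patch, here is a rough
standalone sketch of the check. It is not the actual autovacuum.c code:
WorkItem, avw_relation, count_items_for_my_db() and the flat array are
hypothetical stand-ins for illustration; only avw_database and MyDatabaseId
are names taken from this thread.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;	/* stand-in for PostgreSQL's Oid type */

/* Hypothetical, simplified work item; the real struct has more fields. */
typedef struct WorkItem
{
	Oid			avw_database;	/* database the item was queued for */
	Oid			avw_relation;	/* relation to process (illustrative) */
} WorkItem;

/*
 * Walk the queue and count the items this worker would process,
 * skipping any item that belongs to a different database.
 */
static size_t
count_items_for_my_db(const WorkItem *items, size_t nitems, Oid MyDatabaseId)
{
	size_t		processed = 0;

	assert(items != NULL || nitems == 0);

	for (size_t i = 0; i < nitems; i++)
	{
		/* the minimal patch: ignore work items for other databases */
		if (items[i].avw_database != MyDatabaseId)
			continue;

		processed++;		/* real code would perform the work item here */
	}
	return processed;
}
```

The only point of the sketch is the early "continue": an item queued for
another database is never handed on for processing by this worker.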

Or recompile with the existing options but no patch, to first verify that
the crash occurs with a locally compiled binary?

Justin
