Re: [sqlsmith] crash in RestoreLibraryState during low-memory testing

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andreas Seltenreich <seltenreich(at)gmx(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [sqlsmith] crash in RestoreLibraryState during low-memory testing
Date: 2017-10-03 05:05:25
Message-ID: CAA4eK1+TwGt--17oZCEdvFADai58xXRyAwTRPWFjSkJ7vha+tw@mail.gmail.com

On Tue, Oct 3, 2017 at 8:31 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Tue, Oct 3, 2017 at 3:04 AM, Andreas Seltenreich <seltenreich(at)gmx(dot)de> wrote:
>> Hi,
>>
>> doing low-memory testing with REL_10_STABLE at 1f19550a87 also produced
>> a couple of parallel worker core dumps with the backtrace below.
>> Although most of the backtrace is inside the dynamic linker, it looks
>> like it was passed a pointer to gone-away shared memory.
>>
>
> It appears to be some dangling pointer, but not sure how it is
> possible. Can you provide some more details, like do you have any
> other library which you want to get loaded in the backend (like by
> using shared_preload_libraries or by some other way)? I think without
> that we shouldn't try to load anything in the parallel worker.
>

Another possibility is that the memory for the library space has been
overwritten, either in the master backend or in the worker backend. I
think that is possible in low-memory conditions if somewhere we write
to memory without first ensuring that the space has been allocated. I
have browsed the nearby code and didn't find any such instance. One idea
to narrow down the problem is to check whether the other members in the
worker backend are sane. For example, can you try printing the value of
MyFixedParallelState, since we get that value from shared memory in the
same way as libraryspace? It seems from the call stack that the memory of
libraryspace is corrupted, so we could move the call to
lookup/RestoreLibraryState to immediately after we assign
MyFixedParallelState. If the memory for libraryspace is still corrupted
after that change, then probably something bad has happened in the
master backend.
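
To make the "is libraryspace sane" check concrete, here is a rough
sketch of the kind of test one could run on the chunk before handing it
to RestoreLibraryState. It is not PostgreSQL code:
library_state_looks_sane() and its maxlen parameter are hypothetical
names. It assumes the serialized library state is a run of
NUL-terminated file names closed by an empty string, and that the
caller knows an upper bound on the chunk size. The printable-ASCII test
is only a heuristic and would reject legitimate non-ASCII paths.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of PostgreSQL.
 *
 * Assumption: "space" holds NUL-terminated library file names back to
 * back, and an empty string marks the end of the list. "maxlen" is an
 * upper bound on the chunk size supplied by the caller.
 *
 * Returns true only if a well-formed list ends within maxlen bytes.
 */
bool
library_state_looks_sane(const char *space, size_t maxlen)
{
    size_t pos = 0;

    if (space == NULL)
        return false;

    while (pos < maxlen)
    {
        const char *start = space + pos;
        const char *end = memchr(start, '\0', maxlen - pos);
        size_t len;

        if (end == NULL)
            return false;       /* ran off the chunk without a NUL */

        len = (size_t) (end - start);
        if (len == 0)
            return true;        /* empty string closes the list */

        for (size_t i = 0; i < len; i++)
        {
            unsigned char c = (unsigned char) start[i];

            /* Heuristic only: treat non-printable bytes as corruption. */
            if (c < 0x20 || c > 0x7e)
                return false;
        }

        pos += len + 1;         /* skip the name and its NUL */
    }

    return false;               /* no closing empty string within maxlen */
}
```

If a check like this fails in the worker while MyFixedParallelState
still looks fine, that would point at the library chunk specifically
rather than at the whole shared memory segment.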

Any other ideas?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
