Re: seawasp failing, maybe in glibc allocator

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: seawasp failing, maybe in glibc allocator
Date: 2021-06-20 10:59:34
Message-ID: 20210620105934.vkweficqaw4t6uw4@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-06-19 17:07:51 +1200, Thomas Munro wrote:
> On Sat, May 22, 2021 at 12:25 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2021-05-21 15:57:01 -0700, Andres Freund wrote:
> > > I found the LLVM commit to blame (c8fc5e3ba942057d6c4cdcd1faeae69a28e7b671).
> > > Contacting the author and reading the change to see if I can spit the
> > > issue myself.
> >
> > Hrmpf. It's a silent API breakage. The author intended to email us about
> > it, but apparently forgot. One now needs to increment a string-pool
> > refcount. The reason that didn't trigger a reliable crash is that
> > there's a path where the refcount of string-pool entries aren't asserted
> > to be above before decrementing the refcount... And that there
> > practically never are references to the pool entries after use.
> >
> > Continuing to discusss whether there's a better way to deal with this.
>
> Any news?

Nothing really. I'd discussed it a bit with the relevant LLVM
maintainer, but we didn't come to a real resolution. He apologized for
not giving a proper heads up - he'd planned to send out an email, but
somehow lost track of that.

Given that any potential solution would be also end up being a versioned
ifdef, I think adding something like what you suggest here is the least
unreasonable solution.

> FWIW this change appears to fix the problem for my system (LLVM 13
> build from a couple of days ago). No more weird results, valgrind
> errors gone. I ran the leak checker to see if I now had the opposite
> problem, and although there are various leaks reported, I didn't see
> obvious intern pool related stacks.
>
> diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
> index 71029a39a9..7b09e520f5 100644
> --- a/src/backend/jit/llvm/llvmjit.c
> +++ b/src/backend/jit/llvm/llvmjit.c
> @@ -1116,6 +1116,11 @@
> llvm_resolve_symbols(LLVMOrcDefinitionGeneratorRef GeneratorObj, void
> *Ctx,
> if (error != LLVMErrorSuccess)
> LLVMOrcDisposeMaterializationUnit(mu);
>
> +#if LLVM_VERSION_MAJOR > 12
> + for (int i = 0; i < LookupSetSize; i++)
> + LLVMOrcRetainSymbolStringPoolEntry(symbols[i].Name);
> +#endif
> +
> pfree(symbols);

I think this should be part of the earlier loop? Once
LLVMOrcAbsoluteSymbols() is called that owns the reference, so there
doesn't seem to be a reason to increase the refcount only later?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-06-20 11:01:07 Re: seawasp failing, maybe in glibc allocator
Previous Message Fabien COELHO 2021-06-20 09:02:14 Re: pgbench logging broken by time logic changes