Re: remaining sql/json patches

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Erik Rijkers <er(at)xs4all(dot)nl>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: remaining sql/json patches
Date: 2023-10-04 13:26:47
Message-ID: CA+HiwqG+v_bm3mTBNyUj0qPQ6AqR8iJR28bkx0g+iSBtPesuWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 3, 2023 at 10:11 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Mon, Oct 2, 2023 at 2:26 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Mon, Oct 2, 2023 at 1:24 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > Pushed this 30 min ago (no email on -committers yet!) and am looking
> > > at the following llvm crash reported by buildfarm animal pogona [1]:
> > >
> > > #4 0x00007f5bceb673d5 in __assert_fail_base (fmt=0x7f5bcecdbdc8
> > > "%s%s%s:%u: %s%sAssertion `%s' failed.\\n%n",
> > > assertion=assertion(at)entry=0x7f5bc1336419 "(i >= FTy->getNumParams() ||
> > > FTy->getParamType(i) == Args[i]->getType()) && \\"Calling a function
> > > with a bad signature!\\"", file=file(at)entry=0x7f5bc1336051
> > > "/home/bf/src/llvm-project-5/llvm/lib/IR/Instructions.cpp",
> > > line=line(at)entry=299, function=function(at)entry=0x7f5bc13362af "void
> > > llvm::CallInst::init(llvm::FunctionType *, llvm::Value *,
> > > ArrayRef<llvm::Value *>, ArrayRef<llvm::OperandBundleDef>, const
> > > llvm::Twine &)") at ./assert/assert.c:92
> > > #5 0x00007f5bceb763a2 in __assert_fail (assertion=0x7f5bc1336419 "(i
> > > >= FTy->getNumParams() || FTy->getParamType(i) == Args[i]->getType())
> > > && \\"Calling a function with a bad signature!\\"",
> > > file=0x7f5bc1336051
> > > "/home/bf/src/llvm-project-5/llvm/lib/IR/Instructions.cpp", line=299,
> > > function=0x7f5bc13362af "void llvm::CallInst::init(llvm::FunctionType
> > > *, llvm::Value *, ArrayRef<llvm::Value *>,
> > > ArrayRef<llvm::OperandBundleDef>, const llvm::Twine &)") at
> > > ./assert/assert.c:101
> > > #6 0x00007f5bc110f138 in llvm::CallInst::init (this=0x557a91f3e508,
> > > FTy=0x557a91ed9ae0, Func=0x557a91f8be88, Args=..., Bundles=...,
> > > NameStr=...) at
> > > /home/bf/src/llvm-project-5/llvm/lib/IR/Instructions.cpp:297
> > > #7 0x00007f5bc0fa579d in llvm::CallInst::CallInst
> > > (this=0x557a91f3e508, Ty=0x557a91ed9ae0, Func=0x557a91f8be88,
> > > Args=..., Bundles=..., NameStr=..., InsertBefore=0x0) at
> > > /home/bf/src/llvm-project-5/llvm/include/llvm/IR/Instructions.h:1934
> > > #8 0x00007f5bc0fa538c in llvm::CallInst::Create (Ty=0x557a91ed9ae0,
> > > Func=0x557a91f8be88, Args=..., Bundles=..., NameStr=...,
> > > InsertBefore=0x0) at
> > > /home/bf/src/llvm-project-5/llvm/include/llvm/IR/Instructions.h:1444
> > > #9 0x00007f5bc0fa51f9 in llvm::IRBuilder<llvm::ConstantFolder,
> > > llvm::IRBuilderDefaultInserter>::CreateCall (this=0x557a91f9c6a0,
> > > FTy=0x557a91ed9ae0, Callee=0x557a91f8be88, Args=..., Name=...,
> > > FPMathTag=0x0) at
> > > /home/bf/src/llvm-project-5/llvm/include/llvm/IR/IRBuilder.h:1669
> > > #10 0x00007f5bc100edda in llvm::IRBuilder<llvm::ConstantFolder,
> > > llvm::IRBuilderDefaultInserter>::CreateCall (this=0x557a91f9c6a0,
> > > Callee=0x557a91f8be88, Args=..., Name=..., FPMathTag=0x0) at
> > > /home/bf/src/llvm-project-5/llvm/include/llvm/IR/IRBuilder.h:1663
> > > #11 0x00007f5bc100714e in LLVMBuildCall (B=0x557a91f9c6a0,
> > > Fn=0x557a91f8be88, Args=0x7ffde6fa0b50, NumArgs=6, Name=0x7f5bc30b648c
> > > "funccall_iocoerce_in_safe") at
> > > /home/bf/src/llvm-project-5/llvm/lib/IR/Core.cpp:2964
> > > #12 0x00007f5bc30af861 in llvm_compile_expr (state=0x557a91fbeac0) at
> > > /home/bf/bf-build/pogona/HEAD/pgsql.build/../pgsql/src/backend/jit/llvm/llvmjit_expr.c:1373
> > >
> > > This seems to me to be complaining about the following addition:
> > >
> > > + {
> > > + Oid ioparam = op->d.iocoerce.typioparam;
> > > + LLVMValueRef v_params[6];
> > > + LLVMValueRef v_success;
> > > +
> > > + v_params[0] = l_ptr_const(op->d.iocoerce.finfo_in,
> > > + l_ptr(StructFmgrInfo));
> > > + v_params[1] = v_output;
> > > + v_params[2] = l_oid_const(lc, ioparam);
> > > + v_params[3] = l_int32_const(lc, -1);
> > > + v_params[4] = l_ptr_const(op->d.iocoerce.escontext,
> > > +
> > > l_ptr(StructErrorSaveContext));
> > >
> > > - LLVMBuildStore(b, v_retval, v_resvaluep);
> > > + /*
> > > + * InputFunctionCallSafe() will write directly into
> > > + * *op->resvalue.
> > > + */
> > > + v_params[5] = v_resvaluep;
> > > +
> > > + v_success = LLVMBuildCall(b, llvm_pg_func(mod,
> > > "InputFunctionCallSafe"),
> > > + v_params, lengthof(v_params),
> > > + "funccall_iocoerce_in_safe");
> > > +
> > > + /*
> > > + * Return null if InputFunctionCallSafe() encountered
> > > + * an error.
> > > + */
> > > + v_resnullp = LLVMBuildICmp(b, LLVMIntEQ, v_success,
> > > + l_sbool_const(0), "");
> > > + }
> >
> > Although most animals except pogona looked fine, I've decided to revert the patch for now.
> >
> > IIUC, LLVM is complaining that the code in the above block is not passing the arguments of InputFunctionCallSafe() using the correct types. I'm not exactly sure which particular argument is not handled correctly in the above code, but perhaps it's:
> >
> >
> > + v_params[1] = v_output;
> >
> > which maps to char *str argument of InputFunctionCallSafe(). v_output is set in the code preceding the above block as follows:
> >
> > /* and call output function (can never return NULL) */
> > v_output = LLVMBuildCall(b, v_fn_out, &v_fcinfo_out,
> > 1, "funccall_coerce_out");
> >
> > I thought that it would be fine to pass it as-is to the call of InputFunctionCallSafe() given that v_fn_out is a call to a function that returns char *, but perhaps not.
>
> OK, I think I could use some help from LLVM experts here.
>
> So, the LLVM code involving setting up a call to
> InputFunctionCallSafe() seems to *work*, but BF animal pogona's debug
> build (?) is complaining that the parameter types don't match up.
> Parameters are set up as follows:
>
> + {
> + Oid ioparam = op->d.iocoerce.typioparam;
> + LLVMValueRef v_params[6];
> + LLVMValueRef v_success;
> +
> + v_params[0] = l_ptr_const(op->d.iocoerce.finfo_in,
> + l_ptr(StructFmgrInfo));
> + v_params[1] = v_output;
> + v_params[2] = l_oid_const(lc, ioparam);
> + v_params[3] = l_int32_const(lc, -1);
> + v_params[4] = l_ptr_const(op->d.iocoerce.escontext,
> +
> l_ptr(StructErrorSaveContext));
> + /*
> + * InputFunctionCallSafe() will write directly into
> + * *op->resvalue.
> + */
> + v_params[5] = v_resvaluep;
> +
> + v_success = LLVMBuildCall(b, llvm_pg_func(mod,
> "InputFunctionCallSafe"),
> + v_params, lengthof(v_params),
> + "funccall_iocoerce_in_safe");
> +
> + /*
> + * Return null if InputFunctionCallSafe() encountered
> + * an error.
> + */
> + v_resnullp = LLVMBuildICmp(b, LLVMIntEQ, v_success,
> + l_sbool_const(0), "");
> + }
>
> And here's InputFunctionCallSafe's signature:
>
> bool
> InputFunctionCallSafe(FmgrInfo *flinfo, Datum d,
> Oid typioparam, int32 typmod,
> fmNodePtr escontext,
> Datum *result)
>
> I suspected that assignment to either param[1] or param[5] might be wrong.
>
> param[1] in InputFunctionCallSafe's signature is char *, but the code
> assigns it v_output, which is an LLVMValueRef for the output
> function's output, a Datum, so I thought LLVM's type checker is
> complaining that I'm trying to pass the Datum to char * without
> appropriate conversion.
>
> param[5] in InputFunctionCallSafe's signature is Node *, but the above
> code is assigning it an LLVMValueRef for iocoerce's escontext whose
> type is ErrorSaveContext.
>
> Maybe some other param is wrong.
>
> I tried various ways to fix both, but with no success. My way of
> checking for failure is to disassemble the IR code in .bc files
> (generated with jit_dump_bitcode) with llvm-dis and finding that it
> gives me errors such as:
>
> $ llvm-dis 58536.0.bc
> llvm-dis: error: Invalid record (Producer: 'LLVM7.0.1' Reader: 'LLVM 7.0.1')
>
> $ llvm-dis 58536.0.bc
> llvm-dis: error: Invalid cast (Producer: 'LLVM7.0.1' Reader: 'LLVM 7.0.1')

So I built LLVM sources to get asserts like pogona:

$ llvm-config --version
15.0.7
$ llvm-config --assertion-mode
ON

and I do now get a crash with bt that looks like this (not same as pogona):

#0 0x00007fe31e83c387 in raise () from /lib64/libc.so.6
#1 0x00007fe31e83da78 in abort () from /lib64/libc.so.6
#2 0x00007fe31e8351a6 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007fe31e835252 in __assert_fail () from /lib64/libc.so.6
#4 0x00007fe3136d8132 in llvm::CallInst::init(llvm::FunctionType*,
llvm::Value*, llvm::ArrayRef<llvm::Value*>,
llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine
const&) ()
from /home/amit/llvm/lib/libLLVMCore.so.15
#5 0x00007fe31362137a in
llvm::IRBuilderBase::CreateCall(llvm::FunctionType*, llvm::Value*,
llvm::ArrayRef<llvm::Value*>, llvm::Twine const&, llvm::MDNode*) ()
from /home/amit/llvm/lib/libLLVMCore.so.15
#6 0x00007fe31362d627 in LLVMBuildCall () from
/home/amit/llvm/lib/libLLVMCore.so.15
#7 0x00007fe3205e7e92 in llvm_compile_expr (state=0x1114e48) at
llvmjit_expr.c:1374
#8 0x0000000000bd3fbc in jit_compile_expr (state=0x1114e48) at jit.c:177
#9 0x000000000072442b in ExecReadyExpr (state=0x1114e48) at execExpr.c:880
#10 0x000000000072387c in ExecBuildProjectionInfo
(targetList=0x1110840, econtext=0x1114a20, slot=0x1114db0,
parent=0x1114830, inputDesc=0x1114ab0) at execExpr.c:484
#11 0x000000000074e917 in ExecAssignProjectionInfo
(planstate=0x1114830, inputDesc=0x1114ab0) at execUtils.c:547
#12 0x000000000074ea02 in ExecConditionalAssignProjectionInfo
(planstate=0x1114830, inputDesc=0x1114ab0, varno=2)
at execUtils.c:585
#13 0x0000000000749814 in ExecAssignScanProjectionInfo
(node=0x1114830) at execScan.c:276
#14 0x0000000000790bf0 in ExecInitValuesScan (node=0x1045020,
estate=0x1114600, eflags=32)
at nodeValuesscan.c:257
#15 0x00000000007451c9 in ExecInitNode (node=0x1045020,
estate=0x1114600, eflags=32) at execProcnode.c:265
#16 0x000000000073a952 in InitPlan (queryDesc=0x1070760, eflags=32) at
execMain.c:968
#17 0x0000000000739828 in standard_ExecutorStart (queryDesc=0x1070760,
eflags=32) at execMain.c:266
#18 0x000000000073959d in ExecutorStart (queryDesc=0x1070760,
eflags=0) at execMain.c:145
#19 0x00000000009c1aaa in PortalStart (portal=0x10bf7d0, params=0x0,
eflags=0, snapshot=0x0) at pquery.c:517
#20 0x00000000009bbba8 in exec_simple_query (
query_string=0x10433c0 "select i::pg_lsn from (values ('x/a'),
('b/b')) a(i);") at postgres.c:1233
#21 0x00000000009c0263 in PostgresMain (dbname=0x1079750 "postgres",
username=0x1079738 "amit")
at postgres.c:4652
#22 0x00000000008f72d6 in BackendRun (port=0x106e740) at postmaster.c:4439
#23 0x00000000008f6c6f in BackendStartup (port=0x106e740) at postmaster.c:4167
#24 0x00000000008f363e in ServerLoop () at postmaster.c:1781
#25 0x00000000008f300e in PostmasterMain (argc=5, argv=0x103dc60) at
postmaster.c:1465
#26 0x00000000007bbfb4 in main (argc=5, argv=0x103dc60) at main.c:198

The LLVMBuildCall() in frame #6 is added by the patch that I also
mentioned in the previous replies. I haven't yet pinpointed down
which of the LLVM's asserts it is, nor have I been able to walk
through LLVM source code using gdb to figure what the new code is
doing wrong. Maybe I'm still missing a trick or two...

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2023-10-04 13:29:16 Re: Infinite Interval
Previous Message Robert Haas 2023-10-04 13:22:14 Re: --sync-method isn't documented to take an argument