Server crash on RHEL 9/s390x platform against PG16

From: Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Server crash on RHEL 9/s390x platform against PG16
Date: 2023-09-12 09:57:21
Message-ID: CAF1DzPXjpPxnsgySz2Zjm8d2dx7=J070C+MQBFh+9NBNcBKCAg@mail.gmail.com
Lists: pgsql-hackers

Hi,

I found a server crash on the RHEL 9/s390x platform with the test case below.

*Machine details:*

[edb@9428da9d2137 postgres]$ cat /etc/redhat-release
AlmaLinux release 9.2 (Turquoise Kodkod)
[edb@9428da9d2137 postgres]$ lscpu
Architecture: s390x
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Big Endian

*Configure command:*
./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm
--with-perl --with-python --with-tcl --with-openssl --enable-nls
--with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
--enable-debug --enable-cassert --with-pgport=5414

*Test case:*
CREATE TABLE rm32044_t1
(
pkey integer,
val text
);
CREATE TABLE rm32044_t2
(
pkey integer,
label text,
hidden boolean
);
CREATE TABLE rm32044_t3
(
pkey integer,
val integer
);
CREATE TABLE rm32044_t4
(
pkey integer
);
insert into rm32044_t1 values ( 1 , 'row1');
insert into rm32044_t1 values ( 2 , 'row2');
insert into rm32044_t2 values ( 1 , 'hidden', true);
insert into rm32044_t2 values ( 2 , 'visible', false);
insert into rm32044_t3 values (1 , 1);
insert into rm32044_t3 values (2 , 1);

postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
= rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

*Backtrace:*
[edb@9428da9d2137 postgres]$ gdb bin/postgres data/qemu_postgres_20230911-140628_65620.core
Core was generated by `postgres: edb postgres [local] SELECT '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000010a8366 in heap_compute_data_size (tupleDesc=tupleDesc@entry=0x1ba3d10,
    values=values@entry=0x1ba4168, isnull=isnull@entry=0x1ba41a8) at heaptuple.c:227
227 VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
[Current thread is 1 (LWP 65597)]
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.34-60.el9.s390x libcap-2.48-8.el9.s390x
libedit-3.1-37.20210216cvs.el9.s390x libffi-3.4.2-7.el9.s390x
libgcc-11.3.1-4.3.el9.alma.s390x libgcrypt-1.10.0-10.el9_2.s390x
libgpg-error-1.42-5.el9.s390x libstdc++-11.3.1-4.3.el9.alma.s390x
libxml2-2.9.13-3.el9_2.1.s390x libzstd-1.5.1-2.el9.s390x
llvm-libs-15.0.7-1.el9.s390x lz4-libs-1.9.3-5.el9.s390x
ncurses-libs-6.2-8.20210508.el9.s390x openssl-libs-3.0.7-17.el9_2.s390x
systemd-libs-252-14.el9_2.3.s390x xz-libs-5.2.5-8.el9_0.s390x
(gdb) bt
#0 0x00000000010a8366 in heap_compute_data_size (tupleDesc=tupleDesc@entry=0x1ba3d10,
    values=values@entry=0x1ba4168, isnull=isnull@entry=0x1ba41a8) at heaptuple.c:227
#1 0x00000000010a9bb0 in heap_form_minimal_tuple (tupleDescriptor=0x1ba3d10,
    values=0x1ba4168, isnull=0x1ba41a8) at heaptuple.c:1484
#2 0x00000000016553fa in ExecCopySlotMinimalTuple (slot=<optimized out>)
    at ../../../../src/include/executor/tuptable.h:472
#3 tuplesort_puttupleslot (state=state@entry=0x1be4d18, slot=slot@entry=0x1ba4120)
    at tuplesortvariants.c:610
#4 0x00000000012dc0e0 in ExecIncrementalSort (pstate=0x1acb4d8)
    at nodeIncrementalSort.c:716
#5 0x00000000012b32c6 in ExecProcNode (node=0x1acb4d8)
    at ../../../src/include/executor/executor.h:273
#6 ExecutePlan (execute_once=<optimized out>, dest=0x1ade698, direction=<optimized out>,
    numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT,
    use_parallel_mode=<optimized out>, planstate=0x1acb4d8, estate=0x1acb258)
    at execMain.c:1670
#7 standard_ExecutorRun (queryDesc=0x19ad338, direction=<optimized out>, count=0,
    execute_once=<optimized out>) at execMain.c:365
#8 0x00000000014a6ae2 in PortalRunSelect (portal=portal@entry=0x1a63558,
    forward=forward@entry=true, count=0, count@entry=9223372036854775807,
    dest=dest@entry=0x1ade698) at pquery.c:924
#9 0x00000000014a84e0 in PortalRun (portal=portal@entry=0x1a63558,
    count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
    run_once=run_once@entry=true, dest=dest@entry=0x1ade698, altdest=0x1ade698,
    qc=0x40007ff7b0) at pquery.c:768
#10 0x00000000014a3c1c in exec_simple_query (
    query_string=0x19ea0e8 "SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey = rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;")
    at postgres.c:1274
#11 0x00000000014a57aa in PostgresMain (dbname=<optimized out>, username=<optimized out>)
    at postgres.c:4637
#12 0x00000000013fdaf6 in BackendRun (port=0x1a132c0, port=0x1a132c0)
    at postmaster.c:4464
#13 BackendStartup (port=0x1a132c0) at postmaster.c:4192
#14 ServerLoop () at postmaster.c:1782
#15 0x00000000013fec34 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x19a59a0)
    at postmaster.c:1466
#16 0x0000000001096faa in main (argc=<optimized out>, argv=0x19a59a0) at main.c:198

(gdb) p val
$1 = 0
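
One possible reading of `val = 0` at heaptuple.c:227: the datum for a variable-length column is a NULL pointer while its `isnull` flag is false, so `VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))` reads the varlena header through address 0. The sketch below is a toy Python model of that situation, not PostgreSQL code. The names `Attr`, `compute_data_size`, and `NullPointerDatum` are hypothetical, and it ignores alignment and header sizes. It only illustrates why a zero datum is harmless when the column is flagged NULL and fatal when it is not.

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class Attr:
    """Toy stand-in for one tuple descriptor entry (hypothetical type)."""
    name: str
    attlen: int  # -1 marks a variable-length (varlena) attribute


class NullPointerDatum(Exception):
    """Raised at the point where the C code would dereference address 0."""


def compute_data_size(attrs: Sequence[Attr],
                      values: Sequence[Any],
                      isnull: Sequence[bool]) -> int:
    """Sum the payload sizes of the non-NULL attributes (simplified model)."""
    total = 0
    for attr, val, null in zip(attrs, values, isnull):
        if null:
            continue  # flagged NULL: the datum is never inspected
        if attr.attlen == -1:
            # Varlena: the length lives behind the pointer, so it must be read.
            if val is None:  # None models a datum of 0, i.e. a NULL pointer
                raise NullPointerDatum(attr.name)
            total += len(val)
        else:
            total += attr.attlen  # fixed width: size comes from the descriptor
    return total


attrs = [Attr("pkey", 4), Attr("label", -1)]
print(compute_data_size(attrs, [1, b"hidden"], [False, False]))  # 10
print(compute_data_size(attrs, [1, None], [False, True]))        # 4
try:
    compute_data_size(attrs, [1, None], [False, False])
except NullPointerDatum as exc:
    print("would dereference NULL for column:", exc)
```

If this reading is right, the open question is which node upstream of the Incremental Sort produced a slot whose `values[]` and `isnull[]` entries disagree.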

Does anybody have any idea about this?

--

Thanks & Regards,
Suraj Kharage,

edbpostgres.com
