pg13.2: invalid memory alloc request size NNNN

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: pg13.2: invalid memory alloc request size NNNN
Date: 2021-02-12 01:48:37
Message-ID: 20210212014837.GE1793@telsasoft.com
Lists: pgsql-hackers

ts=# \errverbose
ERROR: XX000: invalid memory alloc request size 18446744073709551613

#0 pg_re_throw () at elog.c:1716
#1 0x0000000000a33b12 in errfinish (filename=0xbff20e "mcxt.c", lineno=959, funcname=0xbff2db <__func__.6684> "palloc") at elog.c:502
#2 0x0000000000a6760d in palloc (size=18446744073709551613) at mcxt.c:959
#3 0x00000000009fb149 in text_to_cstring (t=0x2aaae8023010) at varlena.c:212
#4 0x00000000009fbf05 in textout (fcinfo=0x2094538) at varlena.c:557
#5 0x00000000006bdd50 in ExecInterpExpr (state=0x2093990, econtext=0x20933d8, isnull=0x7fff5bf04a87) at execExprInterp.c:1112
#6 0x00000000006d4f18 in ExecEvalExprSwitchContext (state=0x2093990, econtext=0x20933d8, isNull=0x7fff5bf04a87) at ../../../src/include/executor/executor.h:316
#7 0x00000000006d4f81 in ExecProject (projInfo=0x2093988) at ../../../src/include/executor/executor.h:350
#8 0x00000000006d5371 in ExecScan (node=0x20932c8, accessMtd=0x7082e0 <SeqNext>, recheckMtd=0x708385 <SeqRecheck>) at execScan.c:238
#9 0x00000000007083c2 in ExecSeqScan (pstate=0x20932c8) at nodeSeqscan.c:112
#10 0x00000000006d1b00 in ExecProcNodeInstr (node=0x20932c8) at execProcnode.c:466
#11 0x00000000006e742c in ExecProcNode (node=0x20932c8) at ../../../src/include/executor/executor.h:248
#12 0x00000000006e77de in ExecAppend (pstate=0x2089208) at nodeAppend.c:267
#13 0x00000000006d1b00 in ExecProcNodeInstr (node=0x2089208) at execProcnode.c:466
#14 0x000000000070964f in ExecProcNode (node=0x2089208) at ../../../src/include/executor/executor.h:248
#15 0x0000000000709795 in ExecSort (pstate=0x2088ff8) at nodeSort.c:108
#16 0x00000000006d1b00 in ExecProcNodeInstr (node=0x2088ff8) at execProcnode.c:466
#17 0x00000000006d1ad1 in ExecProcNodeFirst (node=0x2088ff8) at execProcnode.c:450
#18 0x00000000006dec36 in ExecProcNode (node=0x2088ff8) at ../../../src/include/executor/executor.h:248
#19 0x00000000006df079 in fetch_input_tuple (aggstate=0x2088a20) at nodeAgg.c:589
#20 0x00000000006e1fad in agg_retrieve_direct (aggstate=0x2088a20) at nodeAgg.c:2368
#21 0x00000000006e1bfd in ExecAgg (pstate=0x2088a20) at nodeAgg.c:2183
#22 0x00000000006d1b00 in ExecProcNodeInstr (node=0x2088a20) at execProcnode.c:466
#23 0x00000000006d1ad1 in ExecProcNodeFirst (node=0x2088a20) at execProcnode.c:450
#24 0x00000000006c6ffa in ExecProcNode (node=0x2088a20) at ../../../src/include/executor/executor.h:248
#25 0x00000000006c966b in ExecutePlan (estate=0x2032f48, planstate=0x2088a20, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0xbb3400 <donothingDR>, execute_once=true) at execMain.c:1632

#3 0x00000000009fb149 in text_to_cstring (t=0x2aaae8023010) at varlena.c:212
212 result = (char *) palloc(len + 1);

(gdb) l
207 /* must cast away the const, unfortunately */
208 text *tunpacked = pg_detoast_datum_packed(unconstify(text *, t));
209 int len = VARSIZE_ANY_EXHDR(tunpacked);
210 char *result;
211
212 result = (char *) palloc(len + 1);

(gdb) p len
$1 = -4
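
The huge request size follows from the negative length: `len + 1` is evaluated as an `int` (-3) and only then converted to the unsigned size argument of `palloc()`. Below is a minimal sketch of that arithmetic, assuming a 64-bit `size_t`. The names `alloc_request_size`, `alloc_size_is_valid`, and `MAX_ALLOC_SIZE` are hypothetical and used only for illustration; the limit is assumed to match PostgreSQL's `MaxAllocSize` (1 GB - 1).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to match PostgreSQL's MaxAllocSize (1 GB - 1). */
#define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/*
 * Hypothetical helper: mirrors the palloc(len + 1) call in text_to_cstring().
 * The addition is done in int; the result is then converted to size_t,
 * so a negative value wraps around to a very large unsigned one.
 */
static size_t
alloc_request_size(int len)
{
    return (size_t) (len + 1);
}

/* Hypothetical helper: the kind of range check that rejects such a request. */
static int
alloc_size_is_valid(size_t size)
{
    return size <= MAX_ALLOC_SIZE;
}
```

With `len = -4`, `alloc_request_size()` returns `SIZE_MAX - 2`, which is 18446744073709551613 when `size_t` is 64 bits wide, and `alloc_size_is_valid()` rejects it. That matches the size reported in the error, so the reported size is consistent with the bad length read from the datum header rather than with a genuinely large value.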

This VM had an issue earlier today and I killed it, causing PG to run
recovery. I'm tentatively blaming the VM issue on zfs, so this could
conceivably be a data error (although recovery supposedly would have resolved
it). I just checked, and data_checksums=off.

The query uses mode(), string_agg(), and DISTINCT.

Here's a redacted plan for the query:

GroupAggregate (cost=15681340.44..20726393.56 rows=908609 width=618)
  Group Key: (((COALESCE(a.ii, $0) || lpad(a.ii, 5, '0'::text)) || lpad(a.ii, 5, '0'::text))), a.ii, (COALESCE(a.ii, $2)), (CASE (a.ii)::integer WHEN 1 THEN 'qq'::text WHEN 2 THEN 'qq'::text WHEN 3 THEN 'qq'::text WHEN 4 THEN 'qq'::text WHEN 5 THEN 'qq qq'::text WHEN 6 THEN 'qq-qq'::text ELSE a.ii END), (CASE WHEN (COALESCE(a.ii, $3) = substr(a.ii, 1, length(COALESCE(a.ii, $4)))) THEN 'qq qq'::text WHEN (hashed SubPlan 7) THEN 'qq qq'::text ELSE 'qq qq qq'::text END)
  InitPlan 1 (returns $0)
    -> Seq Scan on d
  InitPlan 3 (returns $2)
    -> Seq Scan on d d
  InitPlan 4 (returns $3)
    -> Seq Scan on d d
  InitPlan 5 (returns $4)
    -> Seq Scan on d d
  InitPlan 6 (returns $5)
    -> Seq Scan on d d
  -> Sort (cost=15681335.39..15704050.62 rows=9086093 width=313)
       Sort Key: (((COALESCE(a.ii, $0) || lpad(a.ii, 5, '0'::text)) || lpad(a.ii, 5, '0'::text))), a.ii, (COALESCE(a.ii, $2)), (CASE (a.ii)::integer WHEN 1 THEN 'qq'::text WHEN 2 THEN 'qq'::text WHEN 3 THEN 'qq'::text WHEN 4 THEN 'qq'::text WHEN 5 THEN 'qq qq'::text WHEN 6 THEN 'qq-qq'::text ELSE a.ii END), (CASE WHEN (COALESCE(a.ii, $3) = substr(a.ii, 1, length(COALESCE(a.ii, $4)))) THEN 'qq qq'::text WHEN (hashed SubPlan 7) THEN 'qq qq'::text ELSE 'qq qq qq'::text END)
       -> Append (cost=1.01..13295792.30 rows=9086093 width=313)
            -> Seq Scan on a a (cost=1.01..5689033.34 rows=3948764 width=328)
                 Filter: ((ii >= '2021-02-10 00:00:00+10'::timestamp with time zone) AND (ii < '2021-02-11 00:00:00+10'::timestamp with time zone))
                 SubPlan 7
                   -> Seq Scan on d d (cost=0.00..1.01 rows=1 width=7)
            -> Seq Scan on b (cost=1.01..12.75 rows=1 width=417)
                 Filter: ((ii >= '2021-02-10 00:00:00+10'::timestamp with time zone) AND (ii < '2021-02-11 00:00:00+10'::timestamp with time zone))
                 SubPlan 11
                   -> Seq Scan on d d (cost=0.00..1.01 rows=1 width=7)
            -> Seq Scan on c c (cost=1.01..7561315.74 rows=5137328 width=302)
                 Filter: ((ii >= '2021-02-10 00:00:00+10'::timestamp with time zone) AND (ii < '2021-02-11 00:00:00+10'::timestamp with time zone))
                 SubPlan 14
                   -> Seq Scan on d d (cost=0.00..1.01 rows=1 width=7)

I restored to a test cluster, but so far I have not been able to reproduce the
issue there, so I'm soliciting suggestions on how to debug it further.

--
Justin
