BUG #17619: AllocSizeIsValid violation in parallel hash join

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: dastapov(at)gmail(dot)com
Subject: BUG #17619: AllocSizeIsValid violation in parallel hash join
Date: 2022-09-21 15:57:21
Message-ID: 17619-0de62ceda812b8b5@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17619
Logged by: Dmitry Astapov
Email address: dastapov(at)gmail(dot)com
PostgreSQL version: 13.8
Operating system: Linux (CentOS 7)
Description:

This is how the issue looks from the user's perspective (it happens with both "explain analyze" and a plain "select"):

# explain analyze select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';
ERROR: XX000: invalid memory alloc request size 1702125924
LOCATION: MemoryContextAlloc, mcxt.c:804

(The requested size is well above MaxAllocSize, 1 GB, which is the limit that AllocSizeIsValid enforces on ordinary allocations.)

The query plan (as shown by plain explain) is rather simple:

# explain select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';
                                       QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────
 Gather  (cost=186066.70..198955.08 rows=13 width=52)
   Workers Planned: 2
   ->  Parallel Hash Left Join  (cost=185066.70..197953.78 rows=5 width=52)
         Hash Cond: (lhs.id = rhs.id)
         ->  Parallel Seq Scan on lhs  (cost=0.00..12887.04 rows=5 width=20)
               Filter: (n > '3240754568'::bigint)
         ->  Parallel Hash  (cost=180233.53..180233.53 rows=386653 width=44)
               ->  Parallel Seq Scan on rhs  (cost=0.00..180233.53 rows=386653 width=44)

I managed to add a "print pid and sleep" hack to that function, attach gdb to the sleeping worker, and this is the backtrace I got:

(gdb) bt
#0  0x00007fbb83da59e0 in __nanosleep_nocancel () from /lib64/libc.so.6
#1  0x00007fbb83da5894 in sleep () from /lib64/libc.so.6
#2  0x0000000000926584 in MemoryContextAlloc (context=0x134d190, size=size@entry=1702125924) at mcxt.c:806
#3  0x000000000092cd9d in sts_read_tuple (meta_data=0x7ffc88d1264c, accessor=0x1360e00) at sharedtuplestore.c:450
#4  sts_parallel_scan_next (accessor=accessor@entry=0x1360e00, meta_data=meta_data@entry=0x7ffc88d1264c) at sharedtuplestore.c:530
#5  0x0000000000682911 in ExecParallelHashJoinNewBatch (hjstate=<optimized out>, hjstate=<optimized out>) at nodeHashjoin.c:1153
#6  ExecHashJoinImpl (parallel=true, pstate=0x13112a0) at nodeHashjoin.c:560
#7  ExecParallelHashJoin (pstate=0x13112a0) at nodeHashjoin.c:607
#8  0x0000000000665f88 in ExecProcNodeInstr (node=0x13112a0) at execProcnode.c:466
#9  0x000000000067b8a4 in ExecProcNode (node=0x13112a0) at ../../../src/include/executor/executor.h:248
#10 gather_getnext (gatherstate=0x1311098) at nodeGather.c:295
#11 ExecGather (pstate=0x1311098) at nodeGather.c:227
#12 0x0000000000665f88 in ExecProcNodeInstr (node=0x1311098) at execProcnode.c:466
#13 0x000000000065e542 in ExecProcNode (node=0x1311098) at ../../../src/include/executor/executor.h:248
#14 ExecutePlan (execute_once=<optimized out>, dest=0xa79fe0 <donothingDR>, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x1311098, estate=0x1310e10) at execMain.c:1632
#15 standard_ExecutorRun (queryDesc=0x130f980, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:350
#16 0x00000000005f281f in ExplainOnePlan (plannedstmt=plannedstmt@entry=0x13056d8, into=into@entry=0x0, es=es@entry=0x12f63b0, queryString=queryString@entry=0x122daa0 "explain analyze select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';", params=params@entry=0x0, queryEnv=queryEnv@entry=0x0, planduration=planduration@entry=0x7ffc88d12820, bufusage=0x0) at explain.c:571
#17 0x00000000005f2b8d in ExplainOneQuery (query=<optimized out>, cursorOptions=<optimized out>, into=0x0, es=0x12f63b0, queryString=0x122daa0 "explain analyze select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';", params=0x0, queryEnv=0x0) at explain.c:404
#18 0x00000000005f31f3 in ExplainQuery (pstate=pstate@entry=0x12517c0, stmt=stmt@entry=0x122f848, params=params@entry=0x0, dest=dest@entry=0x12f6318) at explain.c:275
#19 0x00000000007e2785 in standard_ProcessUtility (pstmt=0x12d6168, queryString=0x122daa0 "explain analyze select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x12f6318, qc=0x7ffc88d12ac0) at utility.c:846
#20 0x00000000007e082e in PortalRunUtility (portal=portal@entry=0x128fb10, pstmt=0x12d6168, isTopLevel=<optimized out>, setHoldSnapshot=setHoldSnapshot@entry=true, dest=dest@entry=0x12f6318, qc=qc@entry=0x7ffc88d12ac0) at pquery.c:1153
#21 0x00000000007e0ce5 in FillPortalStore (portal=portal@entry=0x128fb10, isTopLevel=isTopLevel@entry=true) at pquery.c:1026
#22 0x00000000007e105c in PortalRun (portal=portal@entry=0x128fb10, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x12ef350, altdest=altdest@entry=0x12ef350, qc=qc@entry=0x7ffc88d12ca0) at pquery.c:760
#23 0x00000000007dc774 in exec_simple_query (query_string=0x122daa0 "explain analyze select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';") at postgres.c:1238
#24 0x00000000007dda1a in PostgresMain (argc=<optimized out>, argv=argv@entry=0x1257bd8, dbname=0x1257af0 "dastapov", username=<optimized out>) at postgres.c:4347
#25 0x000000000048409f in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4550
#26 BackendStartup (port=0x124fad0) at postmaster.c:4234
#27 ServerLoop () at postmaster.c:1739
#28 0x0000000000758e05 in PostmasterMain (argc=argc@entry=9, argv=argv@entry=0x1228420) at postmaster.c:1412
#29 0x000000000048546e in main (argc=9, argv=0x1228420) at main.c:210

If I go to frame 3 (sts_read_tuple), it lands in this part of the function:

    if (size > accessor->read_buffer_size)
    {
        size_t      new_read_buffer_size;

        if (accessor->read_buffer != NULL)
            pfree(accessor->read_buffer);
        new_read_buffer_size = Max(size, accessor->read_buffer_size * 2);
        accessor->read_buffer =
            MemoryContextAlloc(accessor->context, new_read_buffer_size);
            /* <<<<----- HERE is where the failing allocation happens */
        accessor->read_buffer_size = new_read_buffer_size;
    }

Note that, as far as I can tell from sharedtuplestore.c, "size" was read straight out of the batch file a few lines earlier, so the read buffer is being sized by whatever four bytes happened to be at the current read position in that file.

At this point:
(gdb) frame 3
#3  0x000000000092cd9d in sts_read_tuple (meta_data=0x7ffc88d1264c, accessor=0x1360e00) at sharedtuplestore.c:450
450                     MemoryContextAlloc(accessor->context, new_read_buffer_size);
(gdb) p accessor->read_buffer_size
$1 = 5641470
(gdb) p size
$2 = 1702125924
(gdb) p accessor->read_file->name
$5 = 0x1336af8 "i12of16.p1"
(gdb) p accessor->read_file->numFiles
$6 = 1
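
A side observation, for whatever it is worth (this is just my own decoding): the bogus request size does not look like a length at all, it looks like ASCII text. Decoding 1702125924 as raw bytes (int4send emits them in big-endian order):

# select encode(int4send(1702125924), 'escape');
 encode
────────
 etad
(1 row)

Read in little-endian memory order that is "date", i.e. it looks like tuple contents, which makes me suspect that the read position in the batch file is off and sts_read_tuple is interpreting tuple data as the size word — though that is just my guess.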

Looking in base/pgsql_tmp, I see pgsql_tmp4511.0.sharedfileset there, which is 5.1 GB in total and contains:
$ ls -ltr
-rw------- 1 user group 94M Sep 21 16:05 i7of8.p2.0
-rw------- 1 user group 90M Sep 21 16:05 i7of8.p1.0
-rw------- 1 user group 88M Sep 21 16:05 i7of8.p0.0
-rw------- 1 user group 96M Sep 21 16:05 i6of8.p2.0
-rw------- 1 user group 74M Sep 21 16:05 i6of8.p1.0
-rw------- 1 user group 75M Sep 21 16:05 i6of8.p0.0
-rw------- 1 user group 84M Sep 21 16:05 i5of8.p2.0
-rw------- 1 user group 84M Sep 21 16:05 i5of8.p1.0
-rw------- 1 user group 77M Sep 21 16:05 i5of8.p0.0
-rw------- 1 user group 66M Sep 21 16:05 i4of8.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i4of8.p1.0
-rw------- 1 user group 78M Sep 21 16:05 i4of8.p0.0
-rw------- 1 user group 69M Sep 21 16:05 i3of8.p2.0
-rw------- 1 user group 86M Sep 21 16:05 i3of8.p1.0
-rw------- 1 user group 83M Sep 21 16:05 i3of8.p0.0
-rw------- 1 user group 80M Sep 21 16:05 i2of8.p2.0
-rw------- 1 user group 87M Sep 21 16:05 i2of8.p1.0
-rw------- 1 user group 88M Sep 21 16:05 i2of8.p0.0
-rw------- 1 user group 97M Sep 21 16:05 i1of8.p2.0
-rw------- 1 user group 77M Sep 21 16:05 i1of8.p1.0
-rw------- 1 user group 90M Sep 21 16:05 i1of8.p0.0
-rw------- 1 user group 86M Sep 21 16:05 i9of16.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i9of16.p1.0
-rw------- 1 user group 82M Sep 21 16:05 i9of16.p0.0
-rw------- 1 user group 78M Sep 21 16:05 i8of16.p2.0
-rw------- 1 user group 70M Sep 21 16:05 i8of16.p1.0
-rw------- 1 user group 73M Sep 21 16:05 i8of16.p0.0
-rw------- 1 user group 78M Sep 21 16:05 i7of16.p2.0
-rw------- 1 user group 72M Sep 21 16:05 i7of16.p1.0
-rw------- 1 user group 73M Sep 21 16:05 i7of16.p0.0
-rw------- 1 user group 71M Sep 21 16:05 i6of16.p2.0
-rw------- 1 user group 75M Sep 21 16:05 i6of16.p1.0
-rw------- 1 user group 74M Sep 21 16:05 i6of16.p0.0
-rw------- 1 user group 77M Sep 21 16:05 i5of16.p2.0
-rw------- 1 user group 74M Sep 21 16:05 i5of16.p1.0
-rw------- 1 user group 74M Sep 21 16:05 i5of16.p0.0
-rw------- 1 user group 73M Sep 21 16:05 i4of16.p2.0
-rw------- 1 user group 74M Sep 21 16:05 i4of16.p1.0
-rw------- 1 user group 67M Sep 21 16:05 i4of16.p0.0
-rw------- 1 user group 74M Sep 21 16:05 i3of16.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i3of16.p1.0
-rw------- 1 user group 72M Sep 21 16:05 i3of16.p0.0
-rw------- 1 user group 79M Sep 21 16:05 i2of16.p2.0
-rw------- 1 user group 73M Sep 21 16:05 i2of16.p1.0
-rw------- 1 user group 74M Sep 21 16:05 i2of16.p0.0
-rw------- 1 user group 70M Sep 21 16:05 i1of16.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i1of16.p1.0
-rw------- 1 user group 80M Sep 21 16:05 i1of16.p0.0
-rw------- 1 user group 86M Sep 21 16:05 i15of16.p2.0
-rw------- 1 user group 85M Sep 21 16:05 i15of16.p1.0
-rw------- 1 user group 86M Sep 21 16:05 i15of16.p0.0
-rw------- 1 user group 78M Sep 21 16:05 i14of16.p2.0
-rw------- 1 user group 80M Sep 21 16:05 i14of16.p1.0
-rw------- 1 user group 71M Sep 21 16:05 i14of16.p0.0
-rw------- 1 user group 75M Sep 21 16:05 i13of16.p2.0
-rw------- 1 user group 71M Sep 21 16:05 i13of16.p1.0
-rw------- 1 user group 78M Sep 21 16:05 i13of16.p0.0
-rw------- 1 user group 69M Sep 21 16:05 i12of16.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i12of16.p1.0
-rw------- 1 user group 80M Sep 21 16:05 i12of16.p0.0
-rw------- 1 user group 73M Sep 21 16:05 i11of16.p2.0
-rw------- 1 user group 72M Sep 21 16:05 i11of16.p1.0
-rw------- 1 user group 75M Sep 21 16:05 i11of16.p0.0
-rw------- 1 user group 79M Sep 21 16:05 i10of16.p2.0
-rw------- 1 user group 77M Sep 21 16:05 i10of16.p1.0
-rw------- 1 user group 80M Sep 21 16:05 i10of16.p0.0

So it looks like we are reading i12of16.p1.0, which is only 78M in size. (The directory contains both iNof8.* and iNof16.* files, which I presume is left over from the number of batches being increased from 8 to 16 during the build.) I have retained a copy of the files in this location for further analysis, if needed.

Unfortunately, I cannot share my dataset with you, and my attempts to come up with a synthetic dataset that reproduces this issue (roughly along the lines of the sketch below) have so far failed. I am very confident, though, that the issue is not caused by database corruption: I can dump my dataset to two csv files, \copy them into a freshly initdb-ed database, and immediately reproduce the issue.
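
For reference, my generate_series attempts were roughly of this shape (a reconstruction from memory, with row counts and approximate column widths taken from the stats shown below; as noted, this does not trigger the error for me):

create table lhs as
    select left(md5(i::text), 11) as id,        -- ~12 bytes avg_width
           i::bigint as n
    from generate_series(1, 953201) as i;

create table rhs as
    select left(md5(i::text), 11) as id,
           date '2020-01-01' + (i % 609) as date1,
           date '2020-01-01' + (i % 11) as date2,
           repeat('s', 17) as source,           -- ~17 bytes avg_width
           repeat('p', 1383) as payload         -- ~1383 bytes avg_width
    from generate_series(1, 927968) as i;

analyze lhs;
analyze rhs;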

The tables in question (lhs and rhs) have 953201 and 927968 rows respectively and are this large:
\dt+
                                  List of relations
 Schema │ Name │ Type  │  Owner   │ Persistence │ Access method │  Size   │ Description
────────┼──────┼───────┼──────────┼─────────────┼───────────────┼─────────┼─────────────
 public │ lhs  │ table │ dastapov │ permanent   │ heap          │ 49 MB   │
 public │ rhs  │ table │ dastapov │ permanent   │ heap          │ 1747 MB │
\d+ lhs
                                 Table "public.lhs"
 Column │  Type  │ Collation │ Nullable │ Default │ Storage  │ Stats target │ Description
────────┼────────┼───────────┼──────────┼─────────┼──────────┼──────────────┼─────────────
 id     │ text   │           │          │         │ extended │              │
 n      │ bigint │           │          │         │ plain    │              │
Access method: heap

\d+ rhs
                                 Table "public.rhs"
 Column  │ Type │ Collation │ Nullable │ Default │ Storage  │ Stats target │ Description
─────────┼──────┼───────────┼──────────┼─────────┼──────────┼──────────────┼─────────────
 id      │ text │           │          │         │ extended │              │
 date1   │ date │           │          │         │ plain    │              │
 date2   │ date │           │          │         │ plain    │              │
 source  │ text │           │          │         │ extended │              │
 payload │ text │           │          │         │ extended │              │
Access method: heap

# select tablename, attname, avg_width, n_distinct from pg_stats where
tablename in ('lhs', 'rhs') order by 1;
tablename │ attname │ avg_width │ n_distinct
───────────┼─────────┼───────────┼─────────────
lhs │ id │ 12 │ -0.95371073
lhs │ n │ 8 │ -1
rhs │ payload │ 1383 │ -1
rhs │ id │ 12 │ -1
rhs │ date1 │ 4 │ 609
rhs │ date2 │ 4 │ 11
rhs │ source │ 17 │ 4275

A potentially interesting piece of the puzzle is that there are some long outliers in rhs.payload and rhs.source, while the rest of the columns hold values of exactly avg_width bytes:

# select log_len, count(*) from (select log(length(payload))::int as log_len
from rhs) foo group by 1 order by 2 desc;
log_len │ count
─────────┼────────
3 │ 840852
4 │ 77776
5 │ 8003
6 │ 1317
7 │ 20
(5 rows)

# select log_len, count(*) from (select log(length(source))::int as log_len
from rhs) foo group by 1 order by 2 desc;
log_len │ count
─────────┼────────
1 │ 907692
3 │ 11793
2 │ 4109
4 │ 2920
5 │ 1454
(5 rows)

As I said, I tried to create tables with similar data using generate_series, but failed to reproduce the issue. I am happy to provide additional information and do additional debugging as necessary, given that I am unable to share the offending dataset.

pg_settings:
# select name, setting from pg_settings where setting <> boot_val;
name │ setting
────────────────────────────┼───────────────────────────────
application_name │ psql
archive_command │ (disabled)
client_encoding │ UTF8
data_directory_mode │ 0700
default_text_search_config │ pg_catalog.english
lc_collate │ en_US.utf8
lc_ctype │ en_US.utf8
lc_messages │ en_US.utf8
lc_monetary │ en_US.utf8
lc_numeric │ en_US.utf8
lc_time │ en_US.utf8
log_file_mode │ 0600
log_filename │ postgresql-%a.log
log_rotation_size │ 0
log_timezone │ Europe/London
log_truncate_on_rotation │ on
logging_collector │ on
max_locks_per_transaction │ 128
max_stack_depth │ 2048
server_encoding │ UTF8
shared_buffers │ 4194304
synchronous_commit │ off
temp_buffers │ 32768
TimeZone │ Europe/London
unix_socket_directories │ /home/dastapov/local/pg13-bug
unix_socket_permissions │ 0777
wal_buffers │ 2048
work_mem │ 262144
(28 rows)
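
Since pg_settings reports raw values in internal units (kB for work_mem, 8 kB pages for the buffer settings), the memory-related settings above translate to the following (same values, just pretty-printed):

# select name, current_setting(name)
  from (values ('shared_buffers'), ('work_mem'), ('temp_buffers'), ('wal_buffers')) as t(name);
      name      │ current_setting
────────────────┼─────────────────
 shared_buffers │ 32GB
 work_mem       │ 256MB
 temp_buffers   │ 256MB
 wal_buffers    │ 16MB
(4 rows)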

Let me know how I can assist further.
