Re: Parallel INSERT (INTO ... SELECT ...)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-11 18:00:44
Message-ID: 2309260.1615485644@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The buildfarm has revealed that this patch doesn't work under
CLOBBER_CACHE_ALWAYS:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky&dt=2021-03-10%2021%3A09%3A32

I initially thought that that was a problem with c3ffe34863, but after
reproducing it here I get this stack trace:

#0 target_rel_trigger_max_parallel_hazard (context=0x7fff9cbfc220,
trigdesc=0x7fdd7dfa9718) at clauses.c:971
#1 target_rel_max_parallel_hazard_recurse (command_type=CMD_INSERT,
context=0x7fff9cbfc220, rel=0x7fdd7df819e0) at clauses.c:929
#2 target_rel_max_parallel_hazard_recurse (rel=0x7fdd7df819e0,
command_type=<optimized out>, context=0x7fff9cbfc220) at clauses.c:893
#3 0x000000000075d26e in target_rel_max_parallel_hazard (
context=0x7fff9cbfc220) at clauses.c:884
#4 max_parallel_hazard_walker (node=node(at)entry=0x146e590,
context=context(at)entry=0x7fff9cbfc220) at clauses.c:831
#5 0x0000000000700812 in range_table_entry_walker (rte=0x146e590,
walker=0x75cf00 <max_parallel_hazard_walker>, context=0x7fff9cbfc220,
flags=16) at nodeFuncs.c:2467
#6 0x0000000000700927 in range_table_walker (rtable=0x11fdd70,
walker=walker(at)entry=0x75cf00 <max_parallel_hazard_walker>,
context=context(at)entry=0x7fff9cbfc220, flags=16) at nodeFuncs.c:2446
#7 0x0000000000700ada in query_tree_walker (flags=<optimized out>,
context=0x7fff9cbfc220, walker=0x75cf00 <max_parallel_hazard_walker>,
query=0x11fdc58) at nodeFuncs.c:2423
#8 query_tree_walker (query=query(at)entry=0x700927 <range_table_walker+87>,
walker=walker(at)entry=0x75cf00 <max_parallel_hazard_walker>,
context=context(at)entry=0x11fdc58, flags=<optimized out>) at nodeFuncs.c:2336
#9 0x000000000075d138 in max_parallel_hazard_walker (
node=node(at)entry=0x11fdc58, context=0x11fdc58, context(at)entry=0x7fff9cbfc220)
at clauses.c:853
#10 0x000000000075dc98 in max_parallel_hazard (parse=parse(at)entry=0x11fdc58,
glob=glob(at)entry=0x11fdb40) at clauses.c:585
#11 0x000000000074cd22 in standard_planner (parse=0x11fdc58,
query_string=<optimized out>, cursorOptions=256,
boundParams=<optimized out>) at planner.c:345
#12 0x0000000000814947 in pg_plan_query (querytree=0x11fdc58,
query_string=0x11fc740 "insert into fk_notpartitioned_pk (a, b)\n select 2048, x from generate_series(1,10) x;", cursorOptions=256, boundParams=0x0)
at postgres.c:809
#13 0x0000000000814a43 in pg_plan_queries (querytrees=0x14725d0,
query_string=query_string(at)entry=0x11fc740 "insert into fk_notpartitioned_pk (a, b)\n select 2048, x from generate_series(1,10) x;",
cursorOptions=cursorOptions(at)entry=256, boundParams=boundParams(at)entry=0x0)
at postgres.c:900
#14 0x0000000000814d35 in exec_simple_query (
query_string=0x11fc740 "insert into fk_notpartitioned_pk (a, b)\n select 2048, x from generate_series(1,10) x;") at postgres.c:1092

The problem is that target_rel_trigger_max_parallel_hazard and its caller
think they can use a relcache TriggerDesc field across other cache
accesses, which they can't because the relcache doesn't guarantee that
that won't move.

One approach would be to add logic to RelationClearRelation similar to
what it does for tupdescs, rules, etc, to avoid moving them when their
contents haven't changed. But given that we've not needed that for the
past several decades, I'm disinclined to add the overhead. I think this
code ought to be adjusted to not make its own copy of the trigdesc
pointer, but instead fetch it out of the relcache struct each time it is
accessed. There's no real reason why
target_rel_trigger_max_parallel_hazard shouldn't be passed the (stable)
Relation pointer instead of just the trigdesc pointer.

BTW, having special logic for FK triggers in
target_rel_trigger_max_parallel_hazard seems quite loony to me.
Why isn't that handled by setting appropriate proparallel values
for those trigger functions?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-03-11 18:13:39 Re: Performance degradation of REFRESH MATERIALIZED VIEW
Previous Message Tom Lane 2021-03-11 17:44:37 Re: Huge memory consumption on partitioned table with FKs