Re: Combination of geqo and enable_partitionwise_join leads to crashes in the regression tests

From: Onder Kalaci <onderk(at)microsoft(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Combination of geqo and enable_partitionwise_join leads to crashes in the regression tests
Date: 2020-10-21 10:54:48
Message-ID: DM6PR21MB121177B425FFC0FE27143CC7D81C0@DM6PR21MB1211.namprd21.prod.outlook.com
Lists: pgsql-hackers

Hi,

I think this has already been discussed here: https://www.postgresql.org/message-id/flat/CAExHW5tgiLsYC_CLcaKHFFc8H56C0s9mCu_0OpahGxn%3DhUi_Pg%40mail.gmail.com#db54218ab7bb9e1484cdcc52abf2d324

Sorry for missing that thread before sending the mail.

From: Onder Kalaci <onderk(at)microsoft(dot)com>
Date: Wednesday, 21 October 2020 12:49
To: pgsql-hackers(at)postgresql(dot)org <pgsql-hackers(at)postgresql(dot)org>
Subject: Combination of geqo and enable_partitionwise_join leads to crashes in the regression tests

Hi,

I was running “make installcheck” with the following settings:

SET geqo_threshold=2;
SET geqo_generations=1000;
SET geqo_pool_size=1000;
SET enable_partitionwise_join to true;
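
(One way to apply settings like these to every session of the run is libpq's PGOPTIONS environment variable. The snippet below is only a sketch under that assumption, not necessarily the exact method used here; "enable_partitionwise_join=on" is the same as "to true". The "make installcheck" step is left as a comment because it needs a source tree and a running server.)

```shell
# Sketch only: hand the planner settings to each new connection through
# libpq's PGOPTIONS environment variable ("-c name=value" per setting).
PGOPTIONS="-c geqo_threshold=2 -c geqo_generations=1000 -c geqo_pool_size=1000 -c enable_partitionwise_join=on"
export PGOPTIONS

# Show what will be sent to the server at connection time.
echo "PGOPTIONS=$PGOPTIONS"

# Then, from a PostgreSQL source tree with a server already running:
#   make installcheck
```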

With these settings, the “partition_join” test crashed. The crash is reproducible on both 12.3 and 13.0 (I have not tested other versions).

Minimal steps to reproduce:

SET geqo_threshold=2;
SET geqo_generations=1000;
SET geqo_pool_size=1000;
SET enable_partitionwise_join to true;

CREATE TABLE prt1 (a int, b int, c varchar) PARTITION BY RANGE(a);
CREATE TABLE prt1_p1 PARTITION OF prt1 FOR VALUES FROM (0) TO (250);
CREATE TABLE prt2 (a int, b int, c varchar) PARTITION BY RANGE(b);
CREATE TABLE prt2_p1 PARTITION OF prt2 FOR VALUES FROM (0) TO (250);

EXPLAIN (COSTS OFF)
SELECT t1.a,
       ss.t2a,
       ss.t2c
FROM prt1 t1
LEFT JOIN LATERAL
  (SELECT t2.a AS t2a,
          t3.a AS t3a,
          t2.b t2b,
          t2.c t2c,
          least(t1.a, t2.a, t3.b)
   FROM prt1 t2
   JOIN prt2 t3 ON (t2.a = t3.b)) ss ON t1.c = ss.t2c
WHERE (t1.b + coalesce(ss.t2b, 0)) = 0
ORDER BY t1.a;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 4.966 ms
@:-!>

Top of the backtrace on PG 13.0:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x1700000100)
* frame #0: 0x0000000108b255c0 postgres`bms_is_subset(a=0x0000001700000100, b=0x00007fac37834db8) at bitmapset.c:327:13
frame #1: 0x0000000108b65b55 postgres`generate_join_implied_equalities_normal(root=0x00007fac37815640, ec=0x00007fac3781d2c0, join_relids=0x00007fac37834db8, outer_relids=0x00007fac3781a9f8, inner_relids=0x00007fac37087608) at equivclass.c:1324:8
frame #2: 0x0000000108b659a9 postgres`generate_join_implied_equalities(root=0x00007fac37815640, join_relids=0x00007fac37834db8, outer_relids=0x00007fac3781a9f8, inner_rel=0x00007fac370873f0) at equivclass.c:1197:14
frame #3: 0x0000000108ba71a3 postgres`build_joinrel_restrictlist(root=<unavailable>, joinrel=0x00007fac37834ba0, outer_rel=0x00007fac37802f10, inner_rel=0x00007fac370873f0) at relnode.c:1079:8
frame #4: 0x0000000108ba6fe0 postgres`build_join_rel(root=0x00007fac37815640, joinrelids=0x00007fac370873c8, outer_rel=0x00007fac37802f10, inner_rel=0x00007fac370873f0, sjinfo=0x00007fac3781c540, restrictlist_ptr=0x00007ffee72c9668) at relnode.c:709:17
frame #5: 0x0000000108b6e552 postgres`make_join_rel(root=0x00007fac37815640, rel1=0x00007fac37802f10, rel2=0x00007fac370873f0) at joinrels.c:746:12
frame #6: 0x0000000108b58d68 postgres`merge_clump(root=0x00007fac37815640, clumps=0x00007fac37087348, new_clump=0x00007fac37087320, num_gene=3, force=<unavailable>) at geqo_eval.c:260:14
frame #7: 0x0000000108b58bee postgres`gimme_tree(root=<unavailable>, tour=0x00007fac378248c8, num_gene=<unavailable>) at geqo_eval.c:199:12
frame #8: 0x0000000108b58ab9 postgres`geqo_eval(root=0x00007fac37815640, tour=0x00007fac378248c8, num_gene=3) at geqo_eval.c:102:12
frame #9: 0x0000000108b592b8 postgres`random_init_pool(root=0x00007fac37815640, pool=0x00007fac37824828) at geqo_pool.c:109:25
frame #10: 0x0000000108b58fb7 postgres`geqo(root=0x00007fac37815640, number_of_rels=<unavailable>, initial_rels=<unavailable>) at geqo_main.c:114:2
frame #11: 0x0000000108b5988f postgres`make_one_rel(root=0x00007fac37815640, joinlist=0x00007fac3781cf08) at allpaths.c:227:8
frame #12: 0x0000000108b7f187 postgres`query_planner(root=0x00007fac37815640, qp_callback=<unavailable>, qp_extra=0x00007ffee7
….

Thanks,
Onder
