"could not find pathkey item to sort" for TPC-DS queries 94-96

From: Luc Vlaming <luc(at)swarm64(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: "could not find pathkey item to sort" for TPC-DS queries 94-96
Date: 2021-04-12 12:24:32
Message-ID: 91f3ec99-85a4-fa55-ea74-33f85a5c651f@swarm64.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

When trying to run on master (but afaik also PG-13) TPC-DS queries 94,
95 and 96 on a SF10 I get the error "could not find pathkey item to sort".
When I disable enable_gathermerge the problem goes away and then the
plan for query 94 looks like below. I tried figuring out what the
problem is but to be honest I would need some pointers as the code that
tries to matching equivalence members in prepare_sort_from_pathkeys is
something i'm really not familiar with.

To reproduce you can either ingest and test using the toolkit I used too
(see https://github.com/swarm64/s64da-benchmark-toolkit/), or
alternatively just use the schema (see
https://github.com/swarm64/s64da-benchmark-toolkit/tree/master/benchmarks/tpcds/schemas/psql_native)

Best,
Luc

------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=229655.62..229655.63 rows=1 width=72)
-> Sort (cost=229655.62..229655.63 rows=1 width=72)
Sort Key: (count(DISTINCT ws1.ws_order_number))
-> Aggregate (cost=229655.60..229655.61 rows=1 width=72)
-> Nested Loop Semi Join (cost=1012.65..229655.59
rows=1 width=16)
-> Nested Loop (cost=1012.22..229653.73 rows=1
width=20)
Join Filter: (ws1.ws_web_site_sk =
web_site.web_site_sk)
-> Nested Loop (cost=1012.22..229651.08
rows=1 width=24)
-> Gather (cost=1011.80..229650.64
rows=1 width=28)
Workers Planned: 2
-> Nested Loop Anti Join
(cost=11.80..228650.54 rows=1 width=28)
-> Hash Join
(cost=11.37..227438.35 rows=2629 width=28)
Hash Cond:
(ws1.ws_ship_date_sk = date_dim.d_date_sk)
-> Parallel Seq
Scan on web_sales ws1 (cost=0.00..219548.92 rows=3000992 width=32)
-> Hash
(cost=10.57..10.57 rows=64 width=4)
-> Index Scan
using idx_d_date on date_dim (cost=0.29..10.57 rows=64 width=4)
Index
Cond: ((d_date >= '2000-03-01'::date) AND (d_date <= '2000-04-30'::date))
-> Index Only Scan using
idx_wr_order_number on web_returns wr1 (cost=0.42..0.46 rows=2 width=4)
Index Cond:
(wr_order_number = ws1.ws_order_number)
-> Index Scan using
customer_address_pkey on customer_address (cost=0.42..0.44 rows=1 width=4)
Index Cond: (ca_address_sk =
ws1.ws_ship_addr_sk)
Filter: ((ca_state)::text =
'GA'::text)
-> Seq Scan on web_site (cost=0.00..2.52
rows=10 width=4)
Filter: ((web_company_name)::text =
'pri'::text)
-> Index Scan using idx_ws_order_number on
web_sales ws2 (cost=0.43..1.84 rows=59 width=8)
Index Cond: (ws_order_number =
ws1.ws_order_number)
Filter: (ws1.ws_warehouse_sk <> ws_warehouse_sk)

The top of the stacktrace is:
#0 errfinish (filename=0x5562dc1a5125 "createplan.c", lineno=6186,
funcname=0x5562dc1a54d0 <__func__.14> "prepare_sort_from_pathkeys") at
elog.c:514
#1 0x00005562dbc2d7de in prepare_sort_from_pathkeys
(lefttree=0x5562dc5a2f58, pathkeys=0x5562dc4eabc8, relids=0x0,
reqColIdx=0x0, adjust_tlist_in_place=<optimized out>,
p_numsortkeys=0x7ffc0b8cda84, p_sortColIdx=0x7ffc0b8cda88,
p_sortOperators=0x7ffc0b8cda90, p_collations=0x7ffc0b8cda98,
p_nullsFirst=0x7ffc0b8cdaa0) at createplan.c:6186
#2 0x00005562dbe8d695 in make_sort_from_pathkeys (lefttree=<optimized
out>, pathkeys=<optimized out>, relids=<optimized out>) at createplan.c:6313
#3 0x00005562dbe8eba3 in create_sort_plan (flags=1,
best_path=0x5562dc548d68, root=0x5562dc508cf8) at createplan.c:2118
#4 create_plan_recurse (root=0x5562dc508cf8, best_path=0x5562dc548d68,
flags=1) at createplan.c:489
#5 0x00005562dbe8f315 in create_gather_merge_plan
(best_path=0x5562dc5782f8, root=0x5562dc508cf8) at createplan.c:1885
#6 create_plan_recurse (root=0x5562dc508cf8, best_path=0x5562dc5782f8,
flags=<optimized out>) at createplan.c:541
#7 0x00005562dbe8ddad in create_nestloop_plan
(best_path=0x5562dc585668, root=0x5562dc508cf8) at createplan.c:4237
#8 create_join_plan (best_path=0x5562dc585668, root=0x5562dc508cf8) at
createplan.c:1062
#9 create_plan_recurse (root=0x5562dc508cf8, best_path=0x5562dc585668,
flags=<optimized out>) at createplan.c:418
#10 0x00005562dbe8ddad in create_nestloop_plan
(best_path=0x5562dc5c4428, root=0x5562dc508cf8) at createplan.c:4237
#11 create_join_plan (best_path=0x5562dc5c4428, root=0x5562dc508cf8) at
createplan.c:1062
#12 create_plan_recurse (root=0x5562dc508cf8, best_path=0x5562dc5c4428,
flags=<optimized out>) at createplan.c:418
#13 0x00005562dbe8ddad in create_nestloop_plan
(best_path=0x5562dc5d3bd8, root=0x5562dc508cf8) at createplan.c:4237
#14 create_join_plan (best_path=0x5562dc5d3bd8, root=0x5562dc508cf8) at
createplan.c:1062
#15 create_plan_recurse (root=0x5562dc508cf8, best_path=0x5562dc5d3bd8,
flags=<optimized out>) at createplan.c:418
#16 0x00005562dbe8e428 in create_agg_plan (best_path=0x5562dc5d6f08,
root=0x5562dc508cf8) at createplan.c:2238
#17 create_plan_recurse (root=0x5562dc508cf8, best_path=0x5562dc5d6f08,
flags=3) at createplan.c:509
#18 0x00005562dbe8eb73 in create_sort_plan (flags=1,
best_path=0x5562dc5d7378, root=0x5562dc508cf8) at createplan.c:2109
#19 create_plan_recurse (root=0x5562dc508cf8, best_path=0x5562dc5d7378,
flags=1) at createplan.c:489
#20 0x00005562dbe8e7e8 in create_limit_plan (flags=1,
best_path=0x5562dc5d7a08, root=0x5562dc508cf8) at createplan.c:2784
#21 create_plan_recurse (root=0x5562dc508cf8, best_path=0x5562dc5d7a08,
flags=1) at createplan.c:536
#22 0x00005562dbe914ae in create_plan (root=root(at)entry=0x5562dc508cf8,
best_path=<optimized out>) at createplan.c:349

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Luc Vlaming 2021-04-12 12:31:53 interaction between csps with dummy tlists and set_customscan_references
Previous Message vignesh C 2021-04-12 12:16:37 Re: Replication slot stats misgivings