BUG #15717: Index not used when ordering by left joined table column

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: cdalxndr(at)yahoo(dot)com
Subject: BUG #15717: Index not used when ordering by left joined table column
Date: 2019-03-27 15:30:27
Message-ID: 15717-e8df3cbc92d5d474@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15717
Logged by: Alex
Email address: cdalxndr(at)yahoo(dot)com
PostgreSQL version: 10.0
Operating system: Windows 10
Description:

The following query runs slow, as it seq scans tables, without using any
index:
select
wp.id as id
from
web_page wp
left join web_url_path url on wp.id=url.page
order by
url.priority asc
limit 1

Explain analyze:
"Limit (cost=180140.55..180140.55 rows=1 width=8) (actual
time=2402.310..2402.310 rows=1 loops=1)"
" -> Sort (cost=180140.55..185578.31 rows=2175106 width=8) (actual
time=2402.309..2402.309 rows=1 loops=1)"
" Sort Key: url.priority"
" Sort Method: top-N heapsort Memory: 25kB"
" -> Hash Right Join (cost=68981.25..169265.02 rows=2175106
width=8) (actual time=588.010..2232.158 rows=2159417 loops=1)"
" Hash Cond: (url.page = wp.id)"
" -> Seq Scan on web_url_path url (cost=0.00..47376.06
rows=2175106 width=8) (actual time=0.044..403.351 rows=2175106 loops=1)"
" -> Hash (cost=33480.89..33480.89 rows=2163789 width=4)
(actual time=581.473..581.473 rows=2158888 loops=1)"
" Buckets: 131072 Batches: 32 Memory Usage: 3403kB"
" -> Seq Scan on web_page wp (cost=0.00..33480.89
rows=2163789 width=4) (actual time=0.076..251.148 rows=2158888 loops=1)"
"Planning time: 0.295 ms"
"Execution time: 2402.956 ms"

Note that I have indexes (btree) on columns "web_url_path.priority" and
"web_url_path.page".
I was expecting the "priority" column index would be used to retrieve first
url then do a reverse join to get the corresponding page, and continue if no
page is found.

Note that the following query uses the "priority" index:
select u.id
from web_url_path u
order by u.priority
limit 1

Explain analyze:
"Limit (cost=0.43..0.50 rows=1 width=8) (actual time=0.014..0.014 rows=1
loops=1)"
" -> Index Scan using web_url_path_priority_idx on web_url_path u
(cost=0.43..144920.91 rows=2175106 width=8) (actual time=0.014..0.014 rows=1
loops=1)"
"Planning time: 0.114 ms"
"Execution time: 0.026 ms"

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Lars Vonk 2019-03-27 16:17:51 Re: BUG #15715: UPDATE using "in (subquery for update limit 1)" does not respect the limit in subquery
Previous Message Tom Lane 2019-03-27 14:14:21 Re: BUG #15715: UPDATE using "in (subquery for update limit 1)" does not respect the limit in subquery