slow joins?

From: Joe Van Dyk <joe(at)tanga(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: slow joins?
Date: 2013-04-06 01:38:01
Message-ID: CACfv+pLwqw-c=kR65WjA_pFy9M2qoKM2Usy77BoD8QWNwcfGHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 9.2.4, running two identical queries except for the value of a column in
the WHERE clause. Postgres is picking very different query plans, the first
is much slower than the second.

Any ideas on how I can speed this up? I have btree indexes for all the
columns used in the query.

explain analyze

SELECT COUNT(*)

FROM purchased_items pi

inner join line_items li on li.id = pi.line_item_id

inner join products on products.id = li.product_id

WHERE products.drop_shipper_id = 221;

Aggregate (cost=193356.31..193356.32 rows=1 width=0) (actual
time=2425.225..2425.225 rows=1 loops=1)
-> Hash Join (cost=78864.43..193160.41 rows=78360 width=0) (actual
time=726.612..2424.206 rows=8413 loops=1)
Hash Cond: (pi.line_item_id = li.id)
-> Seq Scan on purchased_items pi (cost=0.00..60912.39
rows=3724639 width=4) (actual time=0.008..616.812 rows=3724639 loops=1)
-> Hash (cost=77937.19..77937.19 rows=56499 width=4) (actual
time=726.231..726.231 rows=8178 loops=1)
Buckets: 4096 Batches: 4 Memory Usage: 73kB
-> Hash Join (cost=1684.33..77937.19 rows=56499 width=4)
(actual time=1.270..723.222 rows=8178 loops=1)
Hash Cond: (li.product_id = products.id)
-> Seq Scan on line_items li (cost=0.00..65617.18
rows=2685518 width=8) (actual time=0.081..392.926 rows=2685499 loops=1)
-> Hash (cost=1676.60..1676.60 rows=618 width=4)
(actual time=0.835..0.835 rows=618 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 22kB
-> Bitmap Heap Scan on products
(cost=13.07..1676.60 rows=618 width=4) (actual time=0.185..0.752 rows=618
loops=1)
Recheck Cond: (drop_shipper_id = 221)
-> Bitmap Index Scan on
index_products_on_drop_shipper_id (cost=0.00..12.92 rows=618 width=0)
(actual time=0.125..0.125 rows=618 loops=1)
Index Cond: (drop_shipper_id = 221)
Total runtime: 2425.302 ms

explain analyze

SELECT COUNT(*)

FROM purchased_items pi

inner join line_items li on li.id = pi.line_item_id

inner join products on products.id = li.product_id

WHERE products.drop_shipper_id = 2;

Aggregate (cost=29260.40..29260.41 rows=1 width=0) (actual
time=0.906..0.906 rows=1 loops=1)
-> Nested Loop (cost=0.00..29254.38 rows=2409 width=0) (actual
time=0.029..0.877 rows=172 loops=1)
-> Nested Loop (cost=0.00..16011.70 rows=1737 width=4) (actual
time=0.021..0.383 rows=167 loops=1)
-> Index Scan using index_products_on_drop_shipper_id on
products (cost=0.00..80.41 rows=19 width=4) (actual time=0.010..0.074
rows=70 loops=1)
Index Cond: (drop_shipper_id = 2)
-> Index Scan using index_line_items_on_product_id on
line_items li (cost=0.00..835.70 rows=279 width=8) (actual
time=0.002..0.004 rows=2 loops=70)
Index Cond: (product_id = products.id)
-> Index Only Scan using purchased_items_line_item_id_idx on
purchased_items pi (cost=0.00..7.60 rows=2 width=4) (actual
time=0.002..0.003 rows=1 loops=167)
Index Cond: (line_item_id = li.id)
Heap Fetches: 5
Total runtime: 0.955 ms
(11 rows)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Joe Van Dyk 2013-04-06 01:42:29 Re: slow joins?
Previous Message Merlin Moncure 2013-04-05 18:40:16 Re: Picking out the most recent row using a time stamp column