How to force Nested Loop plan?

From: Rob Nagler <nagler(at)bivio(dot)biz>
To: pgsql-performance(at)postgresql(dot)org
Subject: How to force Nested Loop plan?
Date: 2003-08-30 13:16:13
Message-ID: 16208.41885.960000.12706@gargle.gargle.HOWL
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I'm trying to understand how I can get the planner to always do the
right thing with this query:

EXPLAIN ANALYZE
SELECT
aa_t.min_date_time
FROM
aa_t
, bb_t
, cc_t
WHERE bb_t.bb_id = aa_t.bb_id
AND aa_t.realm_id = cc_t.realm_id
AND aa_t.server_id = 21
ORDER BY aa_t.min_date_time desc
LIMIT 1
OFFSET 674
;

There's a extreme elbow in the performance curve around the sum of
LIMIT and OFFSET. The two plans follow. First for the query above:

Limit (cost=21569.56..21601.56 rows=1 width=84) (actual time=59.60..59.69 rows=1 loops=1)
-> Nested Loop (cost=0.00..110535.66 rows=3454 width=84) (actual time=0.19..59.20 rows=676 loops=1)
-> Nested Loop (cost=0.00..93177.46 rows=3454 width=65) (actual time=0.14..44.41 rows=676 loops=1)
-> Index Scan Backward using aa_t20 on aa_t (cost=0.00..76738.77 rows=3454 width=46) (actual time=0.10..31.30 rows=676 loops=1)
Filter: (server_id = 21::numeric)
-> Index Scan using cc_t1 on cc_t (cost=0.00..4.75 rows=1 width=19) (actual time=0.01..0.01 rows=1 loops=676)
Index Cond: ("outer".realm_id = cc_t.realm_id)
-> Index Scan using bb_t1 on bb_t (cost=0.00..5.01 rows=1 width=19) (actual time=0.02..0.02 rows=1 loops=676)
Index Cond: (bb_t.bb_id = "outer".bb_id)
Total runtime: 59.89 msec
(10 rows)

Setting OFFSET to 675 in the above query, results in this 100 times
slower plan:

Limit (cost=21614.48..21614.48 rows=1 width=84) (actual time=4762.39..4762.39 rows=1 loops=1)
-> Sort (cost=21612.79..21621.42 rows=3454 width=84) (actual time=4761.45..4761.92 rows=677 loops=1)
Sort Key: aa_t.min_date_time
-> Merge Join (cost=21139.96..21409.80 rows=3454 width=84) (actual time=4399.80..4685.24 rows=41879 loops=1)
Merge Cond: ("outer".bb_id = "inner".bb_id)
-> Sort (cost=8079.83..8184.53 rows=41879 width=19) (actual time=936.99..967.37 rows=41879 loops=1)
Sort Key: bb_t.bb_id
-> Seq Scan on bb_t (cost=0.00..4864.79 rows=41879 width=19) (actual time=0.06..729.60 rows=41879 loops=1)
-> Sort (cost=13060.13..13068.76 rows=3454 width=65) (actual time=3462.76..3493.97 rows=41879 loops=1)
Sort Key: aa_t.bb_id
-> Merge Join (cost=12794.42..12857.14 rows=3454 width=65) (actual time=2923.62..3202.78 rows=41879 loops=1)
Merge Cond: ("outer".realm_id = "inner".realm_id)
-> Sort (cost=12762.78..12771.41 rows=3454 width=46) (actual time=2920.78..2950.87 rows=41879 loops=1)
Sort Key: aa_t.realm_id
-> Index Scan using aa_t5 on aa_t (cost=0.00..12559.79 rows=3454 width=46) (actual time=0.18..2589.22 rows=41879 loops=1)
Index Cond: (server_id = 21::numeric)
-> Sort (cost=31.64..32.78 rows=455 width=19) (actual time=2.54..33.12 rows=42163 loops=1)
Sort Key: cc_t.realm_id
-> Seq Scan on cc_t (cost=0.00..11.55 rows=455 width=19) (actual time=0.04..0.86 rows=455 loops=1)
Total runtime: 4792.84 msec
(20 rows)

Twiddling effective_cache_size and random_page_cost allows for a large
LIMIT+OFFSET number but not enough. These tests are made with 400000
effective_cache_size and random_page_cost of 4.

I can increase the LIMIT+OFFSET elbow to 1654 by changing the
query thusly:

< AND aa_t.server_id = 21
---
> AND aa_t.server_id IN (21, 0)

The value 0 is an invalid server_id, so I know it won't be returned.
However, I've got 41K rows that could be returned by this query and
growing, and 1654 is obviously not enough. (aa is 690K rows, bb is
41K rows, and cc is 500 rows.)

If I drop the ORDER BY, the query goes much faster, but the query is
useless without the ORDER BY.

I've figured out that the second plan is slow, because it is writing a
huge result set to disk (+200MB). This doesn't make sense to me,
since sort_mem is 32000.

Is there a way to tell the optimizer to use Nested Loop plan always
instead of the Merge/Join plan? Turning off enable_mergejoin is
obviously not an option.

Thanks,
Rob

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Rod Taylor 2003-08-30 13:58:12 Re: Hardware recommendations to scale to silly load
Previous Message Richard Jones 2003-08-30 13:09:03 Selecting random rows efficiently