Re: How to force Nested Loop plan?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Rob Nagler <nagler(at)bivio(dot)biz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: How to force Nested Loop plan?
Date: 2003-08-30 15:02:01
Message-ID: 24637.1062255721@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Rob Nagler <nagler(at)bivio(dot)biz> writes:
> I'm trying to understand how I can get the planner to always do the
> right thing with this query:

> SELECT
> aa_t.min_date_time
> FROM
> aa_t
> , bb_t
> , cc_t
> WHERE bb_t.bb_id = aa_t.bb_id
> AND aa_t.realm_id = cc_t.realm_id
> AND aa_t.server_id = 21
> ORDER BY aa_t.min_date_time desc
> LIMIT 1
> OFFSET 674

> -> Index Scan Backward using aa_t20 on aa_t (cost=0.00..76738.77 rows=3454 width=46) (actual time=0.10..31.30 rows=676 loops=1)
> Filter: (server_id = 21::numeric)

The reason the planner does not much like this plan is that it's
estimating that quite a lot of rows will have to be hit in min_date_time
order before it finds enough rows with server_id = 21. Thus the high
cost estimate for the above step.

I suspect that the reason you like this plan is that there's actually
substantial correlation between server_id and min_date_time, such that
the required rows are found quickly. Before trying to force the planner
into what you consider an optimal plan, you had better ask yourself
whether you can expect that correlation to hold up in the future.
If not, your plan could become pessimal pretty quickly.

I'd suggest creating a double-column index:

create index aa_ti on aa_t(server_id, min_date_time);

and altering the query to read

ORDER BY aa_t.server_id DESC, aa_t.min_date_time DESC

(you need this kluge to make sure the planner recognizes that the new
index matches the ORDER BY request). Then you should get a plan with
a much smaller cost coefficient for this step.

regards, tom lane

PS: does server_id really need to be NUMERIC? Why not integer, or at
worst bigint?

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2003-08-30 15:14:04 Re: Selecting random rows efficiently
Previous Message Tom Lane 2003-08-30 14:47:42 Re: Selecting random rows efficiently