Re: Optimizer improvements: to do or not to do?

From: "Say42" <andrews42(at)yandex(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Optimizer improvements: to do or not to do?
Date: 2006-09-13 04:58:44
Message-ID: 1158123524.490590.99590@e63g2000cwd.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Peter Eisentraut wrote:

> If you know of a more effective way to do that beyond the effective_cache_size
> parameter that we have now, let us know.

I don't know the better way and it is not my goal at all. I think about
more accurate cost estimation of nested loops join and subqueries.
Usual case in data request is a joining detail and some master tables
into a single relation. Often master tables are small and after some
nested loops iterations are well (perhaps wholly) cached. Cost
estimation of the tables access path don't care about the such caching
and cause overestimation. In some cases it can lead up to choosing not
the best plan.

Example from real life. The following request return count of national
calls from the call registration table.

select count(*) from conn.conn20060803 c
where
exists (select code from trunk_codes tc
where c.bnum >= tc.code and c.bnum like tc.code || '%'
order by tc.code desc limit 1)

enable_seqscan = off:

"Aggregate (cost=103185258.68..103185258.69 rows=1 width=0) (actual
time=13385.674..13385.676 rows=1 loops=1)"
" -> Seq Scan on conn20060803 c (cost=100000000.00..103184640.52
rows=247264 width=0) (actual time=0.409..13307.254 rows=38739 loops=1)"
" Filter: (subplan)"
" SubPlan"
" -> Limit (cost=0.00..6.42 rows=1 width=10) (actual
time=0.020..0.020 rows=0 loops=494527)"
" -> Index Scan Backward using belg_mobile_pkey on
belg_mobile tc (cost=0.00..6.42 rows=1 width=10) (actual
time=0.012..0.012 rows=0 loops=494527)"
" Index Cond: (($0)::text >= (code)::text)"
" Filter: (($0)::text ~~ ((code)::text ||
'%'::text))"
"Total runtime: 13385.808 ms"

enable_seqscan =on:

"Aggregate (cost=1101623.47..1101623.48 rows=1 width=0) (actual
time=63724.508..63724.509 rows=1 loops=1)"
" -> Seq Scan on conn20060803 c (cost=0.00..1101005.30 rows=247264
width=0) (actual time=2.244..63640.413 rows=38739 loops=1)"
" Filter: (subplan)"
" SubPlan"
" -> Limit (cost=2.20..2.20 rows=1 width=10) (actual
time=0.121..0.121 rows=0 loops=494527)"
" -> Sort (cost=2.20..2.20 rows=1 width=10) (actual
time=0.114..0.114 rows=0 loops=494527)"
" Sort Key: code"
" -> Seq Scan on belg_mobile tc (cost=0.00..2.19
rows=1 width=10) (actual time=0.096..0.099 rows=0 loops=494527)"
" Filter: ((($0)::text >= (code)::text) AND
(($0)::text ~~ ((code)::text || '%'::text)))"
"Total runtime: 63724.630 ms"

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-09-13 05:04:46 Re: Optimizer improvements: to do or not to do?
Previous Message andy 2006-09-13 02:20:55 Re: Vacuum error on database postgres