Re: Performance With Joins on Large Tables

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Joshua Marsh <icub3d(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, jim(at)nasby(dot)net
Subject: Re: Performance With Joins on Large Tables
Date: 2006-09-13 17:04:14
Message-ID: 1158167054.29889.72.camel@dogma.v10.wvs
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, 2006-09-13 at 10:19 -0600, Joshua Marsh wrote:
> > Right, it's just used for planning. Avoid setting it too low, if it's
> > below about 2.0 you would most likely see some very strange plans.
> > Certainly it doesn't make sense at all to set it below 1.0, since that
> > is saying it's cheaper to get a random page than a sequential one.
> >
> > What was your original random_page_cost, and what is the new value you
> > set it to?
> >
> > Regards,
> > Jeff Davis
> >
> >
> >
> >
>
> I tried it at several levels. It was initially at 4 (the default). I
> tried 3 and 2 with no changes. When I set it to 1, it used and index
> on view_505 but no r3s169:
>
> data=# EXPLAIN SELECT v.phonedirect, v.editdrop, EXTRACT (EPOCH FROM
> v.custdate), EXTRACT (YEAR FROM s.custdate) || '-' || EXTRACT (MONTH
> FROM s.custdate) FROM view_505 v INNER JOIN r3s169 s ON v.dsiacctno =
> s.dsiacctno;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
> Merge Join (cost=154730044.01..278318711.49 rows=285230272 width=11)
> Merge Cond: ("outer".dsiacctno = "inner".dsiacctno)
> -> Index Scan using view_505_dsiacctno on view_505 v
> (cost=0.00..111923570.63 rows=112393848 width=20)
> -> Sort (cost=154730044.01..155443119.69 rows=285230272 width=17)
> Sort Key: s.dsiacctno
> -> Seq Scan on r3s169 s (cost=100000000.00..106873675.72
> rows=285230272 width=17)
>
>
> Setting to 0.1 finally gave me the result I was looking for. I know
> that the index scan is faster though. The seq scan never finished (i
> killed it after 24+ hours) and I'm running the query now with indexes
> and it's progressing nicely (will probably take 4 hours).

Hmm... that sounds bad. I'm sure your system will always choose indexes
with that value.

Is it overestimating the cost of using indexes or underestimating the
cost of a seq scan, or both? Maybe explain with the 0.1 setting will
help?

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Christoph Nelles 2006-09-13 17:36:29 Unsubscribe
Previous Message Joshua Marsh 2006-09-13 16:19:04 Re: Performance With Joins on Large Tables