Re: Performance With Joins on Large Tables

From: "Joshua Marsh" <icub3d(at)gmail(dot)com>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>, pgsql-performance(at)postgresql(dot)org, jim(at)nasby(dot)net
Subject: Re: Performance With Joins on Large Tables
Date: 2006-09-13 16:19:04
Message-ID: 38242de90609130919k33208204ka1006ff0ca25f75b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 9/13/06, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Wed, 2006-09-13 at 08:49 -0600, Joshua Marsh wrote:
> > That seems to have done it. Are there any side effects to this
> > change? I read about random_page_cost in the documentation and it
> > seems like this is strictly for planning. All the tables on this
> > database will be indexed and of a size similar to these two, so I
> > don't see it causing any other problems. Though I would check though
> > :)
> >
>
> Right, it's just used for planning. Avoid setting it too low, if it's
> below about 2.0 you would most likely see some very strange plans.
> Certainly it doesn't make sense at all to set it below 1.0, since that
> is saying it's cheaper to get a random page than a sequential one.
>
> What was your original random_page_cost, and what is the new value you
> set it to?
>
> Regards,
> Jeff Davis
>
>
>
>

I tried it at several levels. It was initially at 4 (the default). I
tried 3 and 2 with no changes. When I set it to 1, it used and index
on view_505 but no r3s169:

data=# EXPLAIN SELECT v.phonedirect, v.editdrop, EXTRACT (EPOCH FROM
v.custdate), EXTRACT (YEAR FROM s.custdate) || '-' || EXTRACT (MONTH
FROM s.custdate) FROM view_505 v INNER JOIN r3s169 s ON v.dsiacctno =
s.dsiacctno;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Merge Join (cost=154730044.01..278318711.49 rows=285230272 width=11)
Merge Cond: ("outer".dsiacctno = "inner".dsiacctno)
-> Index Scan using view_505_dsiacctno on view_505 v
(cost=0.00..111923570.63 rows=112393848 width=20)
-> Sort (cost=154730044.01..155443119.69 rows=285230272 width=17)
Sort Key: s.dsiacctno
-> Seq Scan on r3s169 s (cost=100000000.00..106873675.72
rows=285230272 width=17)

Setting to 0.1 finally gave me the result I was looking for. I know
that the index scan is faster though. The seq scan never finished (i
killed it after 24+ hours) and I'm running the query now with indexes
and it's progressing nicely (will probably take 4 hours).

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Davis 2006-09-13 17:04:14 Re: Performance With Joins on Large Tables
Previous Message Jeff Davis 2006-09-13 16:07:34 Re: Performance With Joins on Large Tables