Re: Performance With Joins on Large Tables

From: "Joshua Marsh" <icub3d(at)gmail(dot)com>
To: "Terje Elde" <terje(at)elde(dot)net>
Cc: "Jeff Davis" <pgsql(at)j-davis(dot)com>, pgsql-performance(at)postgresql(dot)org, jim(at)nasby(dot)net
Subject: Re: Performance With Joins on Large Tables
Date: 2006-09-13 20:27:47
Message-ID: 38242de90609131327o68d2866fh92c12ddcf364aba7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 9/13/06, Terje Elde <terje(at)elde(dot)net> wrote:
> Jeff Davis wrote:
> > Is it overestimating the cost of using indexes or underestimating the
> > cost of a seq scan, or both? Maybe explain with the 0.1 setting will
> > help?
> >
>
> If enable_seqscan is off, and cost is still set to 100000000, it could
> be that it's quite simply forcibly underestimating the cost of a seqscan
> in this case.
>
> If enable_secscan was off for the mentioned plan, it'd be interesting to
> see if things would be saner with seqscans enabled, and a more
> reasonable random page cost. If more 'sane' values still produce the
> desired plan, it might be better for other plans etc.
>
> Terje
>
>

I turned enable_seqscan to off and got similar results.

random_age_cost at 4.0:
data=# explain SELECT v.phonedirect, v.editdrop, EXTRACT (EPOCH FROM
v.custdate), EXTRACT (EPOCH FROM s.custdate) FROM view_505 v INNER
JOIN r3s169 s on v.dsiacctno = s.dsiacctno;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Merge Join (cost=293737539.01..301430139.34 rows=285230272 width=11)
Merge Cond: ("outer".dsiacctno = "inner".dsiacctno)
-> Sort (cost=127311593.00..127592577.62 rows=112393848 width=20)
Sort Key: v.dsiacctno
-> Seq Scan on view_505 v (cost=100000000.00..104602114.48
rows=112393848 width=20)
-> Sort (cost=166425946.01..167139021.69 rows=285230272 width=17)
Sort Key: s.dsiacctno
-> Seq Scan on r3s169 s (cost=100000000.00..106873675.72
rows=285230272 width=17)
(8 rows)

random_page_cost at 3.0:
data=# explain SELECT v.phonedirect, v.editdrop, EXTRACT (EPOCH FROM
v.custdate), EXTRACT (EPOCH FROM s.custdate) FROM view_505 v INNER
JOIN r3s169 s on v.dsiacctno = s.dsiacctno;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Merge Join (cost=288303269.01..295995869.34 rows=285230272 width=11)
Merge Cond: ("outer".dsiacctno = "inner".dsiacctno)
-> Sort (cost=125775957.00..126056941.62 rows=112393848 width=20)
Sort Key: v.dsiacctno
-> Seq Scan on view_505 v (cost=100000000.00..104602114.48
rows=112393848 width=20)
-> Sort (cost=162527312.01..163240387.69 rows=285230272 width=17)
Sort Key: s.dsiacctno
-> Seq Scan on r3s169 s (cost=100000000.00..106873675.72
rows=285230272 width=17)
(8 rows)

random_age_cost ad 2,0:
data=# explain SELECT v.phonedirect, v.editdrop, EXTRACT (EPOCH FROM
v.custdate), EXTRACT (EPOCH FROM s.custdate) FROM view_505 v INNER
JOIN r3s169 s on v.dsiacctno = s.dsiacctno;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Merge Join (cost=282868999.01..290561599.34 rows=285230272 width=11)
Merge Cond: ("outer".dsiacctno = "inner".dsiacctno)
-> Sort (cost=124240321.00..124521305.62 rows=112393848 width=20)
Sort Key: v.dsiacctno
-> Seq Scan on view_505 v (cost=100000000.00..104602114.48
rows=112393848 width=20)
-> Sort (cost=158628678.01..159341753.69 rows=285230272 width=17)
Sort Key: s.dsiacctno
-> Seq Scan on r3s169 s (cost=100000000.00..106873675.72
rows=285230272 width=17)
(8 rows)

random_page_cost at 1.0:
data=# explain SELECT v.phonedirect, v.editdrop, EXTRACT (EPOCH FROM
v.custdate), EXTRACT (EPOCH FROM s.custdate) FROM view_505 v INNER
JOIN r3s169 s on v.dsiacctno = s.dsiacctno;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Merge Join (cost=154730044.01..274040257.41 rows=285230272 width=11)
Merge Cond: ("outer".dsiacctno = "inner".dsiacctno)
-> Index Scan using view_505_dsiacctno on view_505 v
(cost=0.00..111923570.63 rows=112393848 width=20)
-> Sort (cost=154730044.01..155443119.69 rows=285230272 width=17)
Sort Key: s.dsiacctno
-> Seq Scan on r3s169 s (cost=100000000.00..106873675.72
rows=285230272 width=17)
(6 rows)

random_page_cost ad 0.1:
data=# explain SELECT v.phonedirect, v.editdrop, EXTRACT (EPOCH FROM
v.custdate), EXTRACT (EPOCH FROM s.custdate) FROM view_505 v INNER
JOIN r3s169 s on v.dsiacctno = s.dsiacctno;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Merge Join (cost=0.00..51808909.26 rows=285230272 width=11)
Merge Cond: ("outer".dsiacctno = "inner".dsiacctno)
-> Index Scan using view_505_dsiacctno on view_505 v
(cost=0.00..12755411.69 rows=112393848 width=20)
-> Index Scan using r3s169_dsiacctno on r3s169 s
(cost=0.00..32357747.90 rows=285230272 width=17)
(4 rows)

I have a suspision that pgsql isn't tuned to properly deal with tables
of this size. Are there other things I should look at when dealing
with a database of this size.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Luke Lonergan 2006-09-13 20:33:48 Re: Poor performance on seq scan
Previous Message Jamal Ghaffour 2006-09-13 20:00:19 Unsubscribe