Re: Help Me Understand Why I'm Getting a Bad Query Plan

From: Bryan Murphy <bmurphy1976(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Help Me Understand Why I'm Getting a Bad Query Plan
Date: 2009-03-25 22:14:45
Message-ID: 7fd310d10903251514x1dd522f3o8d13a9218a6eda6c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Mar 25, 2009 at 4:55 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Oh, I see.  It thinks that it'll need to pull 260,000 redundant rows in
> order to get 1800 unique ones.  Only it's wrong; you're only pulling about
> 4000.
>
> Try increasing some stats still further: itemexperiencelog.visitorid and
> visitors.user_id both to 500.

I tried that already, but I decided to try again in case I messed up
something last time. Here's what I ran. As you can see, it still
chooses to do a sequential scan. Am I changing the stats for those
columns correctly?

Thanks,
Bryan

First, the query:

SELECT MAX(l.Id) as Id, l.ItemId
FROM ItemExperienceLog l
INNER JOIN Items_Primary p ON p.Id = l.ItemId
INNER JOIN Feeds f ON f.Id = p.FeedId
INNER JOIN Visitors v ON v.Id = l.VisitorId
WHERE
v.UserId = 'fbe2537f21d94f519605612c0bf7c2c5'
AND LOWER(f.Slug) = LOWER('Wealth_Building_by_NightingaleConant')
GROUP BY l.ItemId

The query plan:

HashAggregate (cost=130291.23..130291.35 rows=9 width=37) (actual
time=8385.428..8385.433 rows=4 loops=1)
-> Nested Loop (cost=2649.02..130291.19 rows=9 width=37) (actual
time=3707.336..8385.388 rows=11 loops=1)
-> Nested Loop (cost=2649.02..129744.01 rows=1888 width=70)
(actual time=8.881..8322.029 rows=3210 loops=1)
-> Hash Join (cost=2649.02..125273.81 rows=1888
width=37) (actual time=8.836..8196.469 rows=3210 loops=1)
Hash Cond: (l.visitorid = v.id)
-> Seq Scan on itemexperiencelog l
(cost=0.00..112491.03 rows=2697303 width=70) (actual
time=0.048..4459.139 rows=2646177 loops=1)
-> Hash (cost=2631.24..2631.24 rows=1422
width=33) (actual time=7.751..7.751 rows=899 loops=1)
-> Bitmap Heap Scan on visitors v
(cost=23.44..2631.24 rows=1422 width=33) (actual time=0.577..6.347
rows=899 loops=1)
Recheck Cond: (userid =
'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
-> Bitmap Index Scan on
visitors_userid_index2 (cost=0.00..23.08 rows=1422 width=0) (actual
time=0.419..0.419 rows=899 loops=1)
Index Cond: (userid =
'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
-> Index Scan using items_primary_pkey on items_primary
p (cost=0.00..2.36 rows=1 width=66) (actual time=0.024..0.025 rows=1
loops=3210)
Index Cond: (p.id = l.itemid)
-> Index Scan using feeds_pkey on feeds f (cost=0.00..0.28
rows=1 width=33) (actual time=0.018..0.018 rows=0 loops=3210)
Index Cond: (f.id = p.feedid)
Filter: (lower((f.slug)::text) =
'wealth_building_by_nightingaleconant'::text)
Total runtime: 8385.538 ms

Bump up the stats:

ALTER TABLE ItemExperienceLog ALTER COLUMN VisitorId SET STATISTICS 500;
ALTER TABLE ItemExperienceLog ALTER COLUMN ItemId SET STATISTICS 500;
ANALYZE ItemExperienceLog;

ALTER TABLE Visitors ALTER COLUMN UserId SET STATISTICS 500;
ALTER TABLE Visitors ALTER COLUMN Id SET STATISTICS 500;
ANALYZE Visitors;

The new query plan:

HashAggregate (cost=127301.63..127301.72 rows=7 width=37) (actual
time=11447.033..11447.037 rows=4 loops=1)
-> Nested Loop (cost=1874.67..127301.60 rows=7 width=37) (actual
time=4717.880..11446.987 rows=11 loops=1)
-> Nested Loop (cost=1874.67..126923.09 rows=1306 width=70)
(actual time=20.565..11345.756 rows=3210 loops=1)
-> Hash Join (cost=1874.67..123822.53 rows=1306
width=37) (actual time=20.445..8292.235 rows=3210 loops=1)
Hash Cond: (l.visitorid = v.id)
-> Seq Scan on itemexperiencelog l
(cost=0.00..112010.04 rows=2646604 width=70) (actual
time=0.065..4438.481 rows=2646549 loops=1)
-> Hash (cost=1862.32..1862.32 rows=988
width=33) (actual time=19.360..19.360 rows=899 loops=1)
-> Bitmap Heap Scan on visitors v
(cost=18.08..1862.32 rows=988 width=33) (actual time=0.666..17.788
rows=899 loops=1)
Recheck Cond: (userid =
'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
-> Bitmap Index Scan on
visitors_userid_index2 (cost=0.00..17.83 rows=988 width=0) (actual
time=0.520..0.520 rows=899 loops=1)
Index Cond: (userid =
'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
-> Index Scan using items_primary_pkey on items_primary
p (cost=0.00..2.36 rows=1 width=66) (actual time=0.944..0.945 rows=1
loops=3210)
Index Cond: (p.id = l.itemid)
-> Index Scan using feeds_pkey on feeds f (cost=0.00..0.28
rows=1 width=33) (actual time=0.029..0.029 rows=0 loops=3210)
Index Cond: (f.id = p.feedid)
Filter: (lower((f.slug)::text) =
'wealth_building_by_nightingaleconant'::text)
Total runtime: 11447.155 ms

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Stef Telford 2009-03-26 00:50:20 Re: Raid 10 chunksize
Previous Message Josh Berkus 2009-03-25 21:55:40 Re: Help Me Understand Why I'm Getting a Bad Query Plan