select count(idn) is slow (Seq Scan) instead of Bitmap Heap.. why?

From: Adam PAPAI <wooh(at)wooh(dot)hu>
To: pgsql-performance(at)postgresql(dot)org
Subject: select count(idn) is slow (Seq Scan) instead of Bitmap Heap.. why?
Date: 2009-08-02 16:28:37
Message-ID: 4A75BEB5.2070002@wooh.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,

I have a problem with an inner join + count().

my query is:

explain analyze select
k.idn,k.kerdes_subject,k.kerdes_text,u.vezeteknev,u.keresztnev,u.idn as
user_id, kg.kategoria_neve, count(v.idn)

FROM kategoriak as kg

INNER JOIN kerdesek as k on kg.idn = k.kategoria_id
INNER JOIN users as u ON k.user_id = u.idn
INNER JOIN valaszok as v ON k.idn = v.kerdes_id

where kg.idn=15 group by k.idn, k.kerdes_subject,k.kerdes_text,
u.idn,u.vezeteknev,u.keresztnev,kg.kategoria_neve

The problem is with the count(v.idn).

This column has a relation with: v.kerdes_id = k.idn => k.kategoria_id =
kg.idn

and the WHERE says: kg.idn = 15.

Why does it run through all lines in v?

the explain sais:

GroupAggregate (cost=103238.59..103602.66 rows=10402 width=1382)
(actual time=8531.405..8536.633 rows=73 loops=1)
-> Sort (cost=103238.59..103264.59 rows=10402 width=1382) (actual
time=8531.339..8533.199 rows=1203 loops=1)
Sort Key: k.idn, k.kerdes_subject, k.kerdes_text, u.idn,
u.vezeteknev, u.keresztnev, kg.kategoria_neve
-> Hash Join (cost=3827.79..89951.54 rows=10402 width=1382)
(actual time=1778.590..8523.015 rows=1203 loops=1)
Hash Cond: (v.kerdes_id = k.idn)
-> Seq Scan on valaszok v (cost=0.00..78215.98
rows=2080998 width=8) (actual time=59.714..5009.171 rows=2080998 loops=1)
-> Hash (cost=3823.42..3823.42 rows=350 width=1378)
(actual time=12.553..12.553 rows=74 loops=1)
-> Nested Loop (cost=14.98..3823.42 rows=350
width=1378) (actual time=0.714..12.253 rows=74 loops=1)
-> Nested Loop (cost=14.98..1056.38
rows=350 width=830) (actual time=0.498..5.952 rows=117 loops=1)
-> Seq Scan on kategoriak kg
(cost=0.00..1.30 rows=1 width=278) (actual time=0.066..0.076 rows=1 loops=1)
Filter: (idn = 15)
-> Bitmap Heap Scan on kerdesek k
(cost=14.98..1051.58 rows=350 width=560) (actual time=0.374..5.430
rows=117 loops=1)
Recheck Cond: (15 = kategoria_id)
-> Bitmap Index Scan on
kategoria_id_id_idx (cost=0.00..14.89 rows=350 width=0) (actual
time=0.212..0.212 rows=117 loops=1)
Index Cond: (15 =
kategoria_id)
-> Index Scan using users_pkey on users u
(cost=0.00..7.89 rows=1 width=552) (actual time=0.047..0.048 rows=1
loops=117)
Index Cond: (k.user_id = u.idn)
Total runtime: 8536.936 ms

So it run through more than 2 mill lines... but why? It should only
count those lines which has the category_id = 15...

What am I doing wrong?

--
Adam PAPAI

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Stephen Frost 2009-08-02 17:29:50 Re: Performance 8.4.0
Previous Message Robert Haas 2009-08-02 15:25:30 Re: Performance 8.4.0