Re: 7.3 vs 7.2 - different query plan, bad performance

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: siaco(at)allegro(dot)pl
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 7.3 vs 7.2 - different query plan, bad performance
Date: 2003-06-16 10:31:08
Message-ID: 373rev0fotgpq2osvlldf41b8flambulk9@4ax.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, 16 Jun 2003 08:38:50 +0200, siaco(at)allegro(dot)pl wrote:
>[After VACUUM ANALYSE ...] I don't see a big difference:
>
>siaco=# explain analyze select count(*) from v_c;
> QUERY PLAN
>---------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=210.83..210.83 rows=1 width=8) (actual time=5418.09..5418.09 rows=1 loops=1)
> -> Subquery Scan v_c (cost=28.40..197.63 rows=5281 width=8) (actual time=4.59..5414.13 rows=5281 loops=1)
> -> Hash Join (cost=28.40..197.63 rows=5281 width=8) (actual time=4.58..5407.73 rows=5281 loops=1)
> Hash Cond: ("outer".id = "inner".id)
> -> Seq Scan on b t1 (cost=0.00..76.81 rows=5281 width=4) (actual time=0.01..9.68 rows=5281 loops=1)
> -> Hash (cost=24.32..24.32 rows=1632 width=4) (actual time=3.29..3.29 rows=0 loops=1)
> -> Seq Scan on a t2 (cost=0.00..24.32 rows=1632 width=4) (actual time=0.01..1.88 rows=1632 loops=1)
> SubPlan
> -> Aggregate (cost=28.41..28.41 rows=1 width=0) (actual time=1.02..1.02 rows=1 loops=5281)
> -> Seq Scan on a t3 (cost=0.00..28.40 rows=3 width=0) (actual time=0.76..1.01 rows=1 loops=5281)
> Filter: (parent_id = $0)
> Total runtime: 5433.65 msec

Ok, now we have something to work on.

.) I guess you are not really interested in

SELECT count(*) FROM v_c;

If you were, you would simply

SELECT count(*) from b;

Try

EXPLAIN ANALYSE SELECT * FROM v_c;

and you will see that 7.2 produces a plan that is almost equal to that
produced by 7.3.

.) Without any index a seq scan is the best you can get. A scan of a
takes only 1 ms, but doing it 5000 times gives 5 seconds. Try

CREATE INDEX a_parent ON a(parent_id);

.) Wouldn't

CREATE VIEW v_c AS
SELECT t1.id, count(t3.id) AS children_count
FROM (b t1 LEFT JOIN a t2 ON (t1.id = t2.id))
LEFT JOIN a t3 ON (t3.parent_id = t2.id)
GROUP BY t1.id;

give the same results as your view definition with the subselect? And
under some assumptions about your data even

CREATE VIEW v_c AS
SELECT b.id, count(a.id) AS children_count
FROM b
LEFT JOIN a ON (a.parent_id = b.id)
GROUP BY b.id;

might work. But I think I don't understand your requirements. Why
are you not interested in the children_count for an id that doesn't
have a parent itself?

.) To answer your original question: The difference seems to be that
7.2 does not evaluate the subselect in the SELECT list, when you are
only asking for count(*).

Servus
Manfred

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Ryszard Lach 2003-06-16 11:41:47 Re: 7.3 vs 7.2 - different query plan, bad performance
Previous Message Dawn Hollingsworth 2003-06-16 08:57:35 Re: Postgres Connections Requiring Large Amounts of Memory