Re: osdl-dbt3 run results - puzzled by the execution plans

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: Jenny Zhang <jenny(at)osdl(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org, "Matt Clark" <matt(at)ymogen(dot)net>
Subject: Re: osdl-dbt3 run results - puzzled by the execution plans
Date: 2003-09-19 15:08:26
Message-ID: 3c6mmvcmvlr59kng6tnee2a0q4vf4spctt@email.aon.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On Thu, 18 Sep 2003 15:36:50 -0700, Jenny Zhang <jenny(at)osdl(dot)org>
wrote:
>We thought the large effective_cache_size should lead us to better
>plans. But we found the opposite.

The common structure of your query plans is:

Sort
Sort Key: sum((partsupp.ps_supplycost * partsupp.ps_availqty))
InitPlan
-> Aggregate
-> SubPlan
-> Aggregate
Filter: (sum((ps_supplycost * ps_availqty)) > $0)
-> Group
-> Sort
Sort Key: partsupp.ps_partkey
-> SubPlan (same as above)

where the SubPlan is

-> Merge Join (cost=519.60..99880.05 rows=32068 width=65)
(actual time=114.78..17435.28 rows=30400 loops=1)
ctr=5.73
Merge Cond: ("outer".ps_suppkey = "inner".s_suppkey)
-> Index Scan using i_ps_suppkey on partsupp
(cost=0.00..96953.31 rows=801712 width=34)
(actual time=0.42..14008.92 rows=799361 loops=1)
ctr=6.92
-> Sort (cost=519.60..520.60 rows=400 width=31)
(actual time=106.88..143.49 rows=30321 loops=1)
ctr=3.63
Sort Key: supplier.s_suppkey
-> SubSubPlan

for large effective_cache_size and

-> Nested Loop (cost=0.00..130168.30 rows=32068 width=65)
(actual time=0.56..1374.41 rows=30400 loops=1)
ctr=94.71
-> SubSubPlan
-> Index Scan using i_ps_suppkey on partsupp
(cost=0.00..323.16 rows=80 width=34)
(actual time=0.16..2.98 rows=80 loops=380)
ctr=108.44
Index Cond: (partsupp.ps_suppkey = "outer".s_suppkey)

for small effective_cache_size. Both subplans have an almost
identical subsubplan:

-> Nested Loop (cost=0.00..502.31 rows=400 width=31)
(actual time=0.23..110.51 rows=380 loops=1)
ctr=4.55
Join Filter: ("inner".s_nationkey = "outer".n_nationkey)
-> Seq Scan on nation (cost=0.00..1.31 rows=1 width=10)
(actual time=0.08..0.14 rows=1 loops=1)
ctr=9.36
Filter: (n_name = 'ETHIOPIA'::bpchar)
-> Seq Scan on supplier (cost=0.00..376.00 rows=10000 width=21)
(actual time=0.10..70.72 rows=10000 loops=1)
ctr=5.32

I have added the ctr (cost:time ratio) for each plan node. These
values are mostly between 5 and 10 with two notable exceptions:

1) -> Sort (cost=519.60..520.60 rows=400 width=31)
(actual time=106.88..143.49 rows=30321 loops=1)
ctr=3.63

It has already been noticed by Matt Clark that this is the only plan
node where the row count estimation looks wrong. However, I don't
believe that this has great influence on the total cost of the plan,
because the ctr is not far from the usual range and if it were a bit
higher, it would only add a few hundred cost units to a branch costing
almost 100000 units. BTW I vaguely remember that there is something
strange with the way actual rows are counted inside a merge join.
Look at the branch below this plan node: It shows an actual row count
of 380.

2) -> Index Scan using i_ps_suppkey on partsupp
(cost=0.00..323.16 rows=80 width=34)
(actual time=0.16..2.98 rows=80 loops=380)
ctr=108.44

Here we have the only plan node where loops > 1, and it is the only
one where the ctr is far off. The planner computes the cost for one
loop and multiplies it by the number of loops (which it estimates
quite accurately to be 400), thus getting a total cost of ca. 130000.
We have no reason to believe that the single loop cost is very far
from reality (for a *single* index scan), but the planner does not
account for additional index scans hitting pages in the cache that
have been brought in by preceding scans. This is a known problem, Tom
has mentioned it several times, IIRC.

Now I'm very interested in getting a better understanding of this
problem, so could you please report the results of

. \d i_ps_suppkey

. VACUUM VERBOSE ANALYSE partsupp;
VACUUM VERBOSE ANALYSE supplier;

. SELECT attname, null_frac, avg_witdh, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'partsupp' AND attname IN ('ps_suppkey', ...);

Please insert other interesting column names for ..., especially
those contained in i_ps_suppkey, if any.

. SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname IN ('partsupp', 'supplier', ...);
^^^
Add relevant index names here.

. EXPLAIN ANALYSE
SELECT ps_partkey, ps_supplycost, ps_availqty
FROM partsupp, supplier
WHERE ps_suppkey = s_suppkey AND s_nationkey = '<youknowit>';

The idea is to eliminate parts of the plan that are always the same.
Omitting nation is possibly to much a simplification. In this case
please re-add it.
Do this test for small and large effective_cache_size.
Force the use of other join methods by setting enable_<joinmethod>
to off. Post all results.

Jenny, I understand that this long message contains more questions
than answers and is not of much help for you. OTOH your tests might
be very helpful for Postgres development ...

Servus
Manfred

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-09-19 15:25:16 Re: NuSphere and PostgreSQL for windows
Previous Message Oleg Bartunov 2003-09-19 13:58:20 Re: NuSphere and PostgreSQL for windows

Browse pgsql-performance by date

  From Date Subject
Next Message Jenny Zhang 2003-09-19 18:35:35 Re: osdl-dbt3 run results - puzzled by the execution
Previous Message Greg Stark 2003-09-19 13:12:27 Re: osdl-dbt3 run results - puzzled by the execution plans