Re: planner not choosing fastest estimated plan

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: planner not choosing fastest estimated plan
Date: 2013-07-06 17:52:51
Message-ID: CAFj8pRDRyE3r3jxz7trAYs96P3h-AzhGsSoBi5W7c5FpQKrPCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

pls, can you send EXPLAIN ANALYZE result?

Regards

Pavel Stehule

2013/7/6 Jeff Janes <jeff(dot)janes(at)gmail(dot)com>:
> I have a weird case where the planner doesn't choose the plan that it
> itself believes to be the fastest plan. If I disable seqscan, it then
> chooses a faster plan (faster in estimate and faster in reality) than
> the one chosen when all options were open to it. I can't figure out
> how this can be anything other than a bug. The *collapse_limit
> parameters are not restraining things.
>
> I've created a dummy self-contained test case that is a simple
> self-join of a partitioned table, with a function-based index.
>
> If I analyze the tables after the function-based indexes are in place,
> then the problems goes away. And that is the production solution.
> But still, a bug is a bug, even if there is a work around.
>
> This is using the default configuration with LANG=C in 9.2, 9.3, and 9.4.
>
> It was introduced in commit 5b7b5518d0ea56c422a19787, "Revise
> parameterized-path mechanism to fix assorted issues"
>
> I've tried compiling under OPTIMIZER_DEBUG, but the output did not
> mean anything to me. It looks like the only RELOPTINFO corresponding
> to the join, "RELOPTINFO (1 2)" only lists the HashJoin when
> enable_seqscan=on, and only contains NestLoop when enable_seqscan=off.
> I don't know why it doesn't list both in both cases and then choose
> the faster.
>
>
>
> create table foo1 as select lpad(g::text,7,'0') as val1, g as num1
> from generate_series(1,100000) g;
> create table foo2 as select lpad(g::text,7,'0') as val1, g as num1
> from generate_series(100001,1000000) g;
> alter table foo2 inherit foo1;
> create index on foo1(num1 );
> create index on foo2(num1 );
> analyze foo1; analyze foo2;
> create language plperl;
> CREATE OR REPLACE FUNCTION perlupper(text)
> RETURNS text
> LANGUAGE plperl
> IMMUTABLE COST 1000
> AS $function$
> return uc($_[0]);
> $function$;
> create index on foo1 (perlupper(val1));
> create index on foo2 (perlupper(val1));
>
>
> jjanes=# explain select a.num1 from foo1 a, foo1 b where
> perlupper(a.val1)=perlupper(b.val1) and b.num1 = 987845;
> QUERY PLAN
> ----------------------------------------------------------------------------------------
> Hash Join (cost=32789.00..538040.65 rows=10000 width=4)
> Hash Cond: (perlupper(b.val1) = perlupper(a.val1))
> -> Append (cost=0.00..16.64 rows=2 width=8)
> -> Index Scan using foo1_num1_idx on foo1 b
> (cost=0.00..8.28 rows=1 width=8)
> Index Cond: (num1 = 987845)
> -> Index Scan using foo2_num1_idx on foo2 b
> (cost=0.00..8.37 rows=1 width=8)
> Index Cond: (num1 = 987845)
> -> Hash (cost=15406.00..15406.00 rows=1000000 width=12)
> -> Append (cost=0.00..15406.00 rows=1000000 width=12)
> -> Seq Scan on foo1 a (cost=0.00..1541.00 rows=100000 width=12)
> -> Seq Scan on foo2 a (cost=0.00..13865.00
> rows=900000 width=12)
>
> jjanes=# set enable_seqscan TO off;
> jjanes=# explain select a.num1 from foo1 a, foo1 b where
> perlupper(a.val1)=perlupper(b.val1) and b.num1 = 987845;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------
> Nested Loop (cost=17.14..60438.19 rows=10000 width=4)
> -> Append (cost=0.00..16.64 rows=2 width=8)
> -> Index Scan using foo1_num1_idx on foo1 b
> (cost=0.00..8.28 rows=1 width=8)
> Index Cond: (num1 = 987845)
> -> Index Scan using foo2_num1_idx on foo2 b
> (cost=0.00..8.37 rows=1 width=8)
> Index Cond: (num1 = 987845)
> -> Append (cost=17.14..30160.77 rows=5000 width=12)
> -> Bitmap Heap Scan on foo1 a (cost=17.14..3022.65 rows=500 width=12)
> Recheck Cond: (perlupper(val1) = perlupper(b.val1))
> -> Bitmap Index Scan on foo1_perlupper_idx
> (cost=0.00..17.01 rows=500 width=0)
> Index Cond: (perlupper(val1) = perlupper(b.val1))
> -> Bitmap Heap Scan on foo2 a (cost=92.22..27138.12
> rows=4500 width=12)
> Recheck Cond: (perlupper(val1) = perlupper(b.val1))
> -> Bitmap Index Scan on foo2_perlupper_idx
> (cost=0.00..91.10 rows=4500 width=0)
> Index Cond: (perlupper(val1) = perlupper(b.val1))
>
> Cheers,
>
> Jeff
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2013-07-06 18:09:23 Re: Run-time posix_fallocate failures
Previous Message Jeff Janes 2013-07-06 17:49:28 planner not choosing fastest estimated plan