Re: Different plan for very similar queries

From: "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Different plan for very similar queries
Date: 2015-07-19 20:41:44
Message-ID: 20150719204144.GA14234@hjp.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 2015-05-29 10:55:44 +0200, Peter J. Holzer wrote:
> wdsah=> explain analyze select facttablename, columnname, term, concept_id, t.hidden, language, register
> from term t where facttablename='facttable_stat_fta4' and columnname='einheit' and exists (select 1 from facttable_stat_fta4 f where f.einheit=t.term );
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop Semi Join (cost=0.00..384860.48 rows=1 width=81) (actual time=0.061..0.119 rows=2 loops=1)
> -> Index Scan using term_facttablename_columnname_idx on term t (cost=0.00..391.46 rows=636 width=81) (actual time=0.028..0.030 rows=3 loops=1)
> Index Cond: (((facttablename)::text = 'facttable_stat_fta4'::text) AND ((columnname)::text = 'einheit'::text))
> -> Index Scan using facttable_stat_fta4_einheit_idx on facttable_stat_fta4 f (cost=0.00..384457.80 rows=21788970 width=3) (actual time=0.027..0.027 rows=1 loops=3)
> Index Cond: ((einheit)::text = (t.term)::text)
> Total runtime: 0.173 ms
> (6 rows)
>
> 0.17 ms. Much faster than a plain select distinct over a table with 43
> million rows could ever hope to be.
>
> warenstrom is very similar and the columns with more distinct values
> aren't that bad either.
>
> But for column berechnungsart the result is bad:
>
> wdsah=> explain analyze select facttablename, columnname, term, concept_id, t.hidden, language, register
> from term t where facttablename='facttable_stat_fta4' and columnname='berechnungsart' and exists (select 1 from facttable_stat_fta4 f where f.berechnungsart=t.term );
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Merge Semi Join (cost=316864.57..319975.79 rows=1 width=81) (actual time=7703.917..30948.271 rows=2 loops=1)
> Merge Cond: ((t.term)::text = (f.berechnungsart)::text)
> -> Index Scan using term_term_idx on term t (cost=0.00..319880.73 rows=636 width=81) (actual time=7703.809..7703.938 rows=3 loops=1)
> Filter: (((facttablename)::text = 'facttable_stat_fta4'::text) AND ((columnname)::text = 'berechnungsart'::text))
> -> Index Scan using facttable_stat_fta4_berechnungsart_idx on facttable_stat_fta4 f (cost=0.00..2545748.85 rows=43577940 width=2) (actual time=0.089..16263.582 rows=21336180 loops=1)
> Total runtime: 30948.648 ms
> (6 rows)
>
> Over 30 seconds! That's almost 200'000 times slower.

First I'd like to apologize for dropping out of the thread without
providing a test data set. I actually had one prepared (without
confidential data), but I wanted to make sure that I could reproduce the
problem with the test data, and I didn't get around to it for a week or
two and then I went on vacation ...

Anyway, in the meantime you released 9.5alpha (thanks for that, I
probably would have compiled a snapshot sooner or later, but installing
debian packages is just a lot more convenient - I hope you get a lot of
useful feedback) and I installed that this weekend.

I am happy to report that the problem appears to be solved. All the
queries of this type I threw at the database finish in a few
milliseconds now.

hp

--
_ | Peter J. Holzer | I want to forget all about both belts and
|_|_) | | suspenders; instead, I want to buy pants
| | | hjp(at)hjp(dot)at | that actually fit.
__/ | http://www.hjp.at/ | -- http://noncombatant.org/

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeison Bedoya Delgado 2015-07-21 02:59:18 hyperthreadin low performance
Previous Message Julien Rouhaud 2015-07-18 10:17:39 Re: [PERFORM] intel s3500 -- hot stuff