Re: Avoid huge perfomance loss on string concatenation

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: "Andrus" <kobruleht2(at)hot(dot)ee>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Avoid huge perfomance loss on string concatenation
Date: 2007-12-05 01:02:19
Message-ID: 20071204200219.147b4d43.wmoran@potentialtech.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Andrus" <kobruleht2(at)hot(dot)ee> wrote:
>
> Using string concatenation in where clause causes huge perfomance loss:
>
> explain analyze select
> rid.toode
> FROM dok JOIN rid USING (dokumnr)
> JOIN toode USING (toode)
> LEFT JOIN artliik using(grupp,liik)
> WHERE rid.toode='NAH S'
> AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
> and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

You provide zero information on the table layout, and the explain output
has been horribly mangled by your MUA.

I would suspect the problem is that there's no index that can be used
for that final comparison. Do you have an index along the lines of
CREATE INDEX dokindex ON dok (kuupaeve||kellaaeg) ?

Overall, the fact that you're concatenating two text fields to generate a
date field tends to suggest that your database schema has some fairly
major design problems, but I can only speculate at this point.

If neither of those help, I expect you'll need to provide more information.

> "Nested Loop Left Join (cost=68.75..5064.86 rows=1 width=24) (actual
> time=8.081..26995.552 rows=567 loops=1)"
> " Join Filter: ((toode.grupp = artliik.grupp) AND (toode.liik =
> artliik.liik))"
> " -> Nested Loop (cost=68.75..5062.19 rows=1 width=43) (actual
> time=8.045..26965.731 rows=567 loops=1)"
> " -> Index Scan using toode_pkey on toode (cost=0.00..8.27 rows=1
> width=43) (actual time=0.023..0.026 rows=1 loops=1)"
> " Index Cond: ('NAH S'::bpchar = toode)"
> " -> Nested Loop (cost=68.75..5053.91 rows=1 width=24) (actual
> time=8.016..26964.698 rows=567 loops=1)"
> " -> Index Scan using dok_kuupaev_idx on dok
> (cost=0.00..4326.16 rows=10 width=4) (actual time=0.059..67.985 rows=3543
> loops=1)"
> " Index Cond: ((kuupaev >= '2007-11-01'::date) AND
> (kuupaev <= '2007-12-04'::date))"
> " Filter: ((((kuupaev)::text || (kellaaeg)::text) >=
> '2007-11-01'::text) AND (((kuupaev)::text || (kellaaeg)::text) <=
> '2007-12-0423 59'::text))"
> " -> Bitmap Heap Scan on rid (cost=68.75..72.76 rows=1
> width=28) (actual time=7.577..7.577 rows=0 loops=3543)"
> " Recheck Cond: ((dok.dokumnr = rid.dokumnr) AND
> (rid.toode = 'NAH S'::bpchar))"
> " -> BitmapAnd (cost=68.75..68.75 rows=1 width=0)
> (actual time=7.574..7.574 rows=0 loops=3543)"
> " -> Bitmap Index Scan on rid_dokumnr_idx
> (cost=0.00..5.13 rows=83 width=0) (actual time=0.037..0.037 rows=14
> loops=3543)"
> " Index Cond: (dok.dokumnr = rid.dokumnr)"
> " -> Bitmap Index Scan on rid_toode_idx
> (cost=0.00..63.03 rows=1354 width=0) (actual time=7.528..7.528 rows=21144
> loops=3543)"
> " Index Cond: (toode = 'NAH S'::bpchar)"
> " -> Seq Scan on artliik (cost=0.00..2.27 rows=27 width=19) (actual
> time=0.007..0.020 rows=27 loops=567)"
> "Total runtime: 26996.399 ms"
>
> takes 26 seconds !
>
> If I remove last line it takes only 0 seconds:
>
> SET SEARCH_PATH TO FIRMA1,public;
> explain analyze select
> rid.toode
> FROM dok JOIN rid USING (dokumnr)
> JOIN toode USING (toode)
> LEFT JOIN artliik using(grupp,liik)
> WHERE rid.toode='NAH S'
> AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
>
> "Hash Left Join (cost=4313.85..7702.10 rows=24 width=24) (actual
> time=10.138..48.884 rows=567 loops=1)"
> " Hash Cond: ((toode.grupp = artliik.grupp) AND (toode.liik =
> artliik.liik))"
> " -> Nested Loop (cost=4311.17..7699.14 rows=24 width=43) (actual
> time=10.049..47.877 rows=567 loops=1)"
> " -> Index Scan using toode_pkey on toode (cost=0.00..8.27 rows=1
> width=43) (actual time=0.043..0.046 rows=1 loops=1)"
> " Index Cond: ('NAH S'::bpchar = toode)"
> " -> Hash Join (cost=4311.17..7690.63 rows=24 width=24) (actual
> time=9.998..47.341 rows=567 loops=1)"
> " Hash Cond: (rid.dokumnr = dok.dokumnr)"
> " -> Index Scan using rid_toode_idx on rid
> (cost=0.00..3372.45 rows=1354 width=28) (actual time=0.089..24.265
> rows=21144 loops=1)"
> " Index Cond: (toode = 'NAH S'::bpchar)"
> " -> Hash (cost=4286.20..4286.20 rows=1998 width=4) (actual
> time=9.871..9.871 rows=3543 loops=1)"
> " -> Index Scan using dok_kuupaev_idx on dok
> (cost=0.00..4286.20 rows=1998 width=4) (actual time=0.057..6.779 rows=3543
> loops=1)"
> " Index Cond: ((kuupaev >= '2007-11-01'::date) AND
> (kuupaev <= '2007-12-04'::date))"
> " -> Hash (cost=2.27..2.27 rows=27 width=19) (actual time=0.060..0.060
> rows=27 loops=1)"
> " -> Seq Scan on artliik (cost=0.00..2.27 rows=27 width=19) (actual
> time=0.009..0.027 rows=27 loops=1)"
> "Total runtime: 49.409 ms"
>
>
> How to rewrite the query
>
> select
> rid.toode,
> artliik.*
> FROM dok JOIN rid USING (dokumnr)
> JOIN toode USING (toode)
> LEFT JOIN artliik using(grupp,liik)
> WHERE rid.toode='NAH S'
> AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
> and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'
>
> so it runs fast ?
>
> Andrus.
>
>
> "PostgreSQL 8.2.4 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
> (mingw-special)"
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
Bill Moran
http://www.potentialtech.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Merlin Moncure 2007-12-05 01:25:10 Re: Avoid huge perfomance loss on string concatenation
Previous Message Martin Gainty 2007-12-05 00:52:32 Re: Vacuum output redirect