From: | "Federico Simonetti (Liveye)" <federico(at)liveye(dot)net> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Query performance on ILIKE with AND operator... |
Date: | 2005-10-10 15:18:04 |
Message-ID: | 434A862C.9010701@liveye.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Sorry but this does not seem to improve performance, it takes even more
time, have a look at these data:
explain analyze SELECT * FROM ViewHttp
WHERE (vchost || txcontenttype) ilike '%www.%html%'
ORDER BY iDStart DESC, iSensorID DESC, iForensicID DESC, iSubID DESC
OFFSET 0 LIMIT 201
"Limit (cost=22740.77..22741.28 rows=201 width=1250) (actual
time=14234.000..14234.000 rows=201 loops=1)"
" -> Sort (cost=22740.77..22741.89 rows=447 width=1250) (actual
time=14234.000..14234.000 rows=201 loops=1)"
" Sort Key: detail0009.idstart, detail0009.isensorid,
detail0009.iforensicid, detail0009.isubid"
" -> Hash Join (cost=13.13..22721.10 rows=447 width=1250)
(actual time=469.000..12140.000 rows=54035 loops=1)"
" Hash Cond: ("outer".isensorid = "inner".isensorid)"
" -> Nested Loop (cost=0.00..22701.27 rows=447
width=1165) (actual time=469.000..11428.000 rows=54035 loops=1)"
" -> Seq Scan on detail0009 (cost=0.00..20763.77
rows=26 width=1005) (actual time=453.000..6345.000 rows=54064 loops=1)"
" Filter: (((vchost)::text || txcontenttype)
~~* '%www.%html%'::text)"
" -> Index Scan using connections_pkey on
connections (cost=0.00..74.25 rows=18 width=168) (actual
time=0.073..0.077 rows=1 loops=54064)"
" Index Cond: (("outer".isensorid =
connections.isensorid) AND ("outer".iforensicid = connections.iforensicid))"
" -> Hash (cost=12.50..12.50 rows=250 width=101) (actual
time=0.000..0.000 rows=0 loops=1)"
" -> Seq Scan on sensors (cost=0.00..12.50 rows=250
width=101) (actual time=0.000..0.000 rows=1 loops=1)"
"Total runtime: 14234.000 ms"
Thanks for your help anyway...
Federico
Tom Lane ha scritto:
>"Federico Simonetti (Liveye)" <federico(at)liveye(dot)net> writes:
>
>
>>I'm encountering a quite strange performance problem.
>>
>>
>
>The problem stems from the horrid misestimation of the number of rows
>fetched from detail0009:
>
>
>
>>" -> Seq Scan on detail0009 (cost=0.00..20500.11
>>rows=26 width=1005) (actual time=453.000..5983.000 rows=53588 loops=1)"
>>" Filter: ((txcontenttype ~~* '%html%'::text)
>>AND ((vchost)::text ~~* '%www.%'::text))"
>>
>>
>
>When the planner is off by a factor of two thousand about the number of
>rows involved, it's not very likely to produce a good plan :-(
>
>In the OR case the rowcount estimate is 6334, which is somewhat closer
>to reality (only about a factor of 10 off, looks like), and that changes
>the plan to something that works acceptably well.
>
>Assuming that this is web-log data, the prevalence of www and html
>together is hardly surprising, but PG's statistical mechanisms will
>never realize it. Not sure about a good workaround. Does it make
>sense to combine the two conditions into one?
> (vchost || txcontenttype) ilike '%www.%html%'
>
> regards, tom lane
>
>---------------------------(end of broadcast)---------------------------
>TIP 6: explain analyze is your friend
>
>
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Jon Brisbin | 2005-10-10 21:29:42 | Performance on SUSE w/ reiserfs |
Previous Message | Tom Lane | 2005-10-10 14:58:15 | Re: Query performance on ILIKE with AND operator... |