Re: Query-Planer from 6seconds TO DAYS

From: Böckler Andreas <andy(at)boeckler(dot)org>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query-Planer from 6seconds TO DAYS
Date: 2012-10-24 18:51:33
Message-ID: 4FCA6A79-9D9B-4576-A661-4C290C1B6EF7@boeckler.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Jeff,

thanks for your answer!

Am 24.10.2012 um 19:00 schrieb Jeff Janes:

> On Wed, Oct 24, 2012 at 8:41 AM, Böckler Andreas <andy(at)boeckler(dot)org> wrote:
>
>> SELECT m.machine_id, s.timestamp, s.errorcode
>> FROM events m INNER JOIN spsdata as s ON (m.machine_id= s.machine_id
>
> m.machine_id is equal to itself? you must be retyping the query by hand…
Yes I did … i changed the vars from german to english ..
That should be m.machine_id=s.machine_id
>
> You should report the results of "EXPLAIN ANALYZE" rather than merely
> EXPLAIN, as that would make it much easier to verify where the
> selectivity estimates are off.
>
OK ..
i can do that for the FAST query.
But the other one would take days. (see below )

>
>> FAST:
>> QUERY PLAN
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Nested Loop (cost=0.00..144979241.24 rows=42662 width=14)
>> Join Filter: ((s."timestamp" <= m."timestamp") AND (m.machine_id = s.machine_id) AND (s."timestamp" >= (m."timestamp" - '00:00:30'::interval)))
>> -> Index Scan using events_code on events m (cost=0.00..4911.18 rows=25 width=12)
>> Index Cond: (code = 2024)
>> Filter: (("timestamp" >= '2012-08-14 00:00:00'::timestamp without time zone) AND ("timestamp" <= '2012-08-26 00:00:00'::timestamp without time zone))
>> -> Append (cost=0.00..5770958.44 rows=1400738 width=14)
>> -> Index Scan using spsdata_machine_id on spsdata s (cost=0.00..4.11 rows=1 width=14)
>> Index Cond: (s.machine_id = m.machine_id)
>
> Was there more to the plan that you snipped? If not, why isn't it
> checking all the other partitions?

Your right. It's checking all partitions!. So the constraint exclusion doesn't kick in.
This can be fixed with
SELECT
m.machine_id, s.timestamp, s.errorcode
FROM
events m
INNER JOIN spsdata as s ON (m.machine_id=s.machine_id AND s.timestamp BETWEEN m.timestamp - interval '30 seconds' AND m.timestamp)
WHERE
m.code IN (2024)
AND m.timestamp BETWEEN '2012-08-01' AND '2012-08-29'
AND s.timestamp BETWEEN '2012-08-01' AND '2012-08-29'
AND s.errorcode in ('2024');

It doesn't take hours to end, but it's not the performance gain you would expect.

I'v changed the query to one partition spsdata_2012m08 and attached the slow and fast cases with EXPLAIN ANALYZE.

The difference is one day in the WHERE-Clause
290.581 ms VS 687887.674 ms !
Thats 2372 times slower.

How can i force the fast query plan in a select?

At least I know that spsdata_2012m08 has way more records than events
spsdata_2012m08: reltuples -> 5.74082 * 10^7
events: count(1) for that time range -> 51383

>
> If you can't fix the selectivity estimates, one thing you could do to
> drive it to the faster query is to decrease random_page_cost to be the
> same seq_page_cost. That should push the cross-over point to the
> sequential scan out to a region you might not care about. However, it
> could also drive other queries in your system to use worse plans than
> they currently are.
> Or, you could "set enable_seqscan = off" before running this
> particular query, then reset it afterwards.
>
> Cheers,
>
> Jeff
I've played with seq_page_cost and enable_seqscan already, but you have to know the right values before SELECT to get good results ;)

Cheers,

Andy

--
Andreas Böckler
andy(at)boeckler(dot)org

Attachment Content-Type Size
fast_explain_analyze.log application/octet-stream 23.3 KB
partition_selectfast.log application/octet-stream 1.4 KB
partition_selectfast.sql application/octet-stream 393 bytes
partition_selectslow.log application/octet-stream 1.9 KB
partition_selectslow.sql application/octet-stream 425 bytes

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2012-10-24 19:11:48 Re: Setting Statistics on Functional Indexes
Previous Message Shaun Thomas 2012-10-24 16:55:15 Setting Statistics on Functional Indexes