different plans for the same query - different filter values

From: Misa Simic <misa(dot)simic(at)gmail(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: different plans for the same query - different filter values
Date: 2013-04-25 12:18:27
Message-ID: CAH3i69kRk=cQqpB3Q8oLGxmJhNbpSPhtFGX7DKbPptnDgh5z2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

We again have problems with query planer... (Ubuntu, pg 9.1)

Up to now - solution was "rephrase the question", but for next thing we are
not sure what would be best solution...

the whole story is too complex... but simplified:

We have tables:

things (thing_id int pk... other columns...)
activities (activity_id int pk, date, thing_id.... other columns...)

So, for each day we track main activities about things...

Now... each activity... could have 0 or more additional info about
activity... if that happened at all that day...

So we have:
additional_activities (id serial pk, activity_id int fk,... other
columns...)

Now, what creates problems...

We need a view
what shows all info about things and activities...

but just 1 row per activity...

so:

date, thing columns, activity columns... and now last 7 columns are from
additional_activities table... what can have 0 or more rows related to the
activity - but we need just one...
if it has more than 1 row - we should show:
-actual values from the first row (related to the activity) + last two
columns: sum value and total number of additinal info relateed to the
activity...

So we have make a view:

WITH main_id AS (
SELECT min(id) AS id, sum(value) AS total_value, count(1) AS
total_additional_info
FROM additional_activities
GROUP BY activity_id
)
SELECT *
FROM main_id
JOIN additional_activities USING (id);

What actually returns first row values about thing + summarized values...

then left join to that view - and we get result what we want...

with my_view:

SELECT *
FROM things
JOIN activities USING (thing_id)
LEFT JOIN additional_activities_view USING (thing_id)

Usual query on that view is:

SELECT * FROM my_view WHERE thing_id = $1 AND date BETWEEN $2 AND $3

And now comes problems:
Query1:
SELECT * FROM my_view WHERE thing_id = 321 AND date BETWEEN '20130301' AND
'20130331'

takes more then 20s and uses very bad plan:
http://explain.depesz.com/s/CLh

but Query2:
SELECT * FROM my_view WHERE thing_id = 321 AND date BETWEEN '20130201' AND
'20130331

What returns even more rows then query1, (Changed just from date 1st Feb
instead of 1st March)

takes less then 2 seconds!?
http://explain.depesz.com/s/9QP

Any suggestions?

Many Thanks,

Misa

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Kirkwood 2013-04-26 02:33:31 In progress INSERT wrecks plans on table
Previous Message Heikki Linnakangas 2013-04-25 11:26:00 Re: [PERFORM] Question about network bandwidth usage between PostgreSQL’s client and server