Help on explain analyze

From: Leif Jensen <leif(at)crysberg(dot)dk>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Help on explain analyze
Date: 2010-11-26 05:04:13
Message-ID: 17009599.98418.1290747853408.JavaMail.root@quick
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi guys,

I have a rather complex view that sometimes takes an awful long time to execute. I have tried to do an 'explain analyze' on it. My intention was to try to optimize the tables involved by creating some indexes to help the lookup. I looked for the "Seq Scan's and created appropriate indexes, I thought. However, in most cases the search got even slower. I have "expanded" the view as follows:

cims=# explain analyze select * from (SELECT t.id AS oid, d.id AS devid, d.description AS devname, cd.value AS period, upper(dt.typename::text) AS devtype, (date_part('epoch'::text, timezone('GMT'::text, t.firstrun))::bigint - (z.timezone::integer -
CASE
WHEN z.daylightsaving <> 'Y'::bpchar THEN 0
ELSE
CASE
WHEN cy.dl_start < now() AND now() < cy.dl_finish THEN 1
ELSE 0
END
END) * 3600) % 86400::bigint AS firstrun, t."interval", t.id AS tid, ti.id AS tiid, t.runstatus, t.last, tt.handler, td.value AS ctrlid, td.string AS alarm, z.timezone AS real_timezone, cy.dl_start < now() AND now() < cy.dl_finish AS daylight, z.timezone::integer -
CASE
WHEN z.daylightsaving <> 'Y'::bpchar THEN 0
ELSE
CASE
WHEN cy.dl_start < now() AND now() < cy.dl_finish THEN 1
ELSE 0
END
END AS timezone
FROM device d
LEFT JOIN task_info ti ON ti.ctrlid = d.id
LEFT JOIN task t ON t.id = ti.taskid
LEFT JOIN ctrl_definitions cd ON d.id = cd.ctrlid AND cd.name::text = 'IrrPeriodStart'::text, task_type tt, task_definitions td, devtype dt, ctrl_definitions cd2, zip z, county cy
WHERE td.name = 'UseWSData'::text AND ti.id = td.taskinfoid AND d.devtypeid = dt.id AND tt.id = t.tasktypeid AND (tt.handler = 'modthcswi.so'::text OR tt.handler = 'modthcswb.so'::text) AND btrim(cd2.string) = z.zip::text AND cd2.ctrlid = td.value AND cd2.name::text = 'ZIP'::text AND z.countyfips = cy.countyfips AND z.state = cy.state AND date_part('year'::text, now()) = date_part('year'::text, cy.dl_start)) AS wstaskdist
WHERE wstaskdist.ctrlid = 401 AND CAST( alarm AS boolean ) = 't';

The view is actually the sub-SELECT which I have name 'wstaskdist', and my search criteria is the bottom WHERE. The result of the ANALYZE is:

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=284.88..9767.82 rows=1 width=109) (actual time=2515.318..40073.432 rows=10 loops=1)
-> Nested Loop (cost=284.88..9745.05 rows=70 width=102) (actual time=2515.184..40071.697 rows=10 loops=1)
-> Nested Loop (cost=229.56..5692.38 rows=1 width=88) (actual time=2512.044..39401.729 rows=10 loops=1)
-> Nested Loop (cost=229.56..5692.07 rows=1 width=80) (actual time=2511.999..39401.291 rows=10 loops=1)
-> Nested Loop (cost=229.56..5691.76 rows=1 width=77) (actual time=2511.943..39400.680 rows=10 loops=1)
Join Filter: (ti.id = td.taskinfoid)
-> Seq Scan on task_definitions td (cost=0.00..13.68 rows=1 width=22) (actual time=0.204..0.322 rows=10 loops=1)
Filter: ((name = 'UseWSData'::text) AND (value = 401) AND (string)::boolean)
-> Hash Left Join (cost=229.56..5672.72 rows=429 width=59) (actual time=7.159..3939.536 rows=429 loops=10)
Hash Cond: (d.id = cd.ctrlid)
-> Nested Loop (cost=24.66..5442.80 rows=429 width=55) (actual time=6.797..3937.349 rows=429 loops=10)
-> Hash Join (cost=16.65..282.84 rows=429 width=38) (actual time=0.078..6.587 rows=429 loops=10)
Hash Cond: (t.id = ti.taskid)
-> Seq Scan on task t (cost=0.00..260.29 rows=429 width=30) (actual time=0.022..5.089 rows=429 loops=10)
-> Hash (cost=11.29..11.29 rows=429 width=12) (actual time=0.514..0.514 rows=429 loops=1)
-> Seq Scan on task_info ti (cost=0.00..11.29 rows=429 width=12) (actual time=0.020..0.302 rows=429 loops=1)
-> Bitmap Heap Scan on device d (cost=8.01..12.02 rows=1 width=21) (actual time=9.145..9.146 rows=1 loops=4290)
Recheck Cond: (d.id = ti.ctrlid)
-> Bitmap Index Scan on pk_device (cost=0.00..8.01 rows=1 width=0) (actual time=0.463..0.463 rows=1569 loops=4290)
Index Cond: (d.id = ti.ctrlid)
-> Hash (cost=202.61..202.61 rows=183 width=8) (actual time=3.534..3.534 rows=343 loops=1)
-> Seq Scan on ctrl_definitions cd (cost=0.00..202.61 rows=183 width=8) (actual time=0.034..3.298 rows=343 loops=1)
Filter: ((name)::text = 'IrrPeriodStart'::text)
-> Index Scan using devtype_pkey on devtype dt (cost=0.00..0.30 rows=1 width=11) (actual time=0.053..0.055 rows=1 loops=10)
Index Cond: (dt.id = d.devtypeid)
-> Index Scan using pk_task_type on task_type tt (cost=0.00..0.30 rows=1 width=16) (actual time=0.036..0.039 rows=1 loops=10)
Index Cond: (tt.id = t.tasktypeid)
Filter: ((tt.handler = 'modthcswi.so'::text) OR (tt.handler = 'modthcswb.so'::text))
-> Hash Join (cost=55.33..4050.56 rows=211 width=18) (actual time=3.000..66.988 rows=1 loops=10)
Hash Cond: ((z.zip)::text = btrim(cd2.string))
-> Seq Scan on zip z (cost=0.00..3729.23 rows=42223 width=20) (actual time=0.017..43.637 rows=42108 loops=10)
-> Hash (cost=55.31..55.31 rows=1 width=10) (actual time=0.138..0.138 rows=1 loops=1)
-> Bitmap Heap Scan on ctrl_definitions cd2 (cost=4.39..55.31 rows=1 width=10) (actual time=0.089..0.128 rows=1 loops=1)
Recheck Cond: (ctrlid = 401)
Filter: ((name)::text = 'ZIP'::text)
-> Bitmap Index Scan on ctrl_def_ctrlid (cost=0.00..4.39 rows=19 width=0) (actual time=0.055..0.055 rows=25 loops=1)
Index Cond: (ctrlid = 401)
-> Index Scan using county_state_fips_inx on county cy (cost=0.00..0.31 rows=1 width=25) (actual time=0.083..0.146 rows=1 loops=10)
Index Cond: ((cy.state = z.state) AND (cy.countyfips = z.countyfips))
Filter: (date_part('year'::text, now()) = date_part('year'::text, cy.dl_start))
Total runtime: 40073.738 ms
(41 rows)

I have concentrate my effort on the (double) 'Seq Scan':

-> Hash Join (cost=16.65..282.84 rows=429 width=38) (actual time=0.078..6.587 rows=429 loops=10)
Hash Cond: (t.id = ti.taskid)
-> Seq Scan on task t (cost=0.00..260.29 rows=429 width=30) (actual time=0.022..5.089 rows=429 loops=10)
-> Hash (cost=11.29..11.29 rows=429 width=12) (actual time=0.514..0.514 rows=429 loops=1)
-> Seq Scan on task_info ti (cost=0.00..11.29 rows=429 width=12) (actual time=0.020..0.302 rows=429 loops=1)

and tried various indexes on the two table 'task' and 'task_info' to help avoid the sequential scans, but at best it gives the same result, at worst it was 25% slower :-(. The tables aren't that big. I think the largest are the 'zip' and the 'county' tables with ~42000 and ~22000 rows, respectively.

What is it that I'm missing,

Leif

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Murat Kabilov 2010-11-26 09:33:02 number of not null arguments
Previous Message James B. Byrne 2010-11-26 03:13:40 Re: PGError: ERROR: missing FROM-clause entry for table