Re: horrendous query challenge :-)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fran Fabrizio <ffabrizio(at)mmrd(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: horrendous query challenge :-)
Date: 2002-05-29 22:35:22
Message-ID: 26423.1022711722@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Fran Fabrizio <ffabrizio(at)mmrd(dot)com> writes:
> I'm posting this sort of as a learning exercise to see how others
> approach a slow query and what steps they take to optimize it.

You haven't really given us enough information. For starters, do the
row count estimates shown in EXPLAIN have any relationship to reality?
(Posting EXPLAIN ANALYZE results, if you are on 7.2, would help answer
that question.) Possibly even more relevant is what the functions used
in the view definition do --- it's not unlikely that the function
evaluations are where much of the time goes.

> monitor=# explain select
> cs.entity_id,e.type,e.name,w.interface_label,cs.remote_ts,s.name,r.name
> from current_status_test cs, entity e, site s, region r,watch w where
> cs.entity_id in (select entity_id from entity where active=true) and
> cs.current = false and cs.msgtype=w.msg_type and e.entity_id =
> cs.entity_id and s.region_id = r.region_id and cs.site_id = s.site_id
> order by cs.remote_ts desc, r.name, s.name;

IN is almost always bad news. Is entity.entity_id a unique key? If so
I'd think you could rewrite this into a join ... but wait, you're
*already* joining to entity. Isn't the
cs.entity_id in (select entity_id from entity where active=true)
clause replaceable by just
e.active = true
given that you have e.entity_id = cs.entity_id in there already?

> View definition: SELECT findsite(e.entity_id) AS site_id, e.entity_id,
> get_status(e.entity_id, e.watch_id) AS status, e.watch_id, e.msg_type AS
> msgtype, cat.name AS msgcat, 'Ok' AS message, now() AS local_ts,
> e.last_contact AS remote_ts, e.current FROM entity_watch_map e,
> classifications class, categories cat WHERE (((e.last_contact >
> "timestamp"((date(now()) - 180))) AND (e.msg_type = class.msg_type)) AND
> (class.category_id = cat.category_id));

The e.last_contact > "timestamp"((date(now()) - 180)) clause will not be
indexable without some rework (at least not in PG 7.2 and earlier ---
this will be a nonissue once 7.3 comes out). I am not sure how much
that matters; the clause may not be selective enough to justify trying
to indexscan on last_contact anyhow. Again it's tough to say much in
advance of seeing EXPLAIN ANALYZE results.

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Josh Berkus 2002-05-29 23:03:28 Re: Database Ownership
Previous Message Jillian Carroll 2002-05-29 22:23:59 Database Ownership