Re: Query performance question

From: Lukas Ertl <l(dot)ertl(at)univie(dot)ac(dot)at>
To: <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Query performance question
Date: 2001-03-29 21:33:29
Message-ID: 20010329232607.O547-100000@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Thanks for all of your suggestions, I found a way to speed up the queries
using VIEWs:

--
CREATE VIEW hits_per_referrer AS SELECT count(hits.id) AS anzahl,
hits.referrer_id FROM hits GROUP BY hits.referrer_id;

SELECT anzahl, url FROM hits_per_referrer, referrer WHERE referrer_id =
referrer.id ORDER BY anzahl DESC LIMIT 10;
--

This is really _a lot_ faster, as this EXPLAIN statement proves:

--
httplog=# EXPLAIN SELECT anzahl, url FROM hits_per_referrer, referrer
WHERE referrer_id = referrer.id ORDER BY anzahl DESC LIMIT 10;
NOTICE: QUERY PLAN:

Sort (cost=254.66..254.66 rows=1000 width=24)
-> Aggregate (cost=69.83..204.83 rows=1000 width=24)
-> Group (cost=69.83..179.83 rows=10000 width=24)
-> Merge Join (cost=69.83..154.83 rows=10000 width=24)
-> Index Scan using referrer_pkey on referrer
(cost=0.00..60.00 rows=1000 width=16)
-> Sort (cost=69.83..69.83 rows=1000 width=8)
-> Seq Scan on hits (cost=0.00..20.00
rows=1000 width=8)

EXPLAIN
--

Thanks again,

regards,
le

On Thu, 29 Mar 2001, D. Duccini wrote:

>
> i'd suggest either using an MD5 hash or some sort of CRC
>
> definitely a hash though
>
>
> On Thu, 29 Mar 2001, Vijay Deval wrote:
>
> > URL is a very large field. If an extra field is created which gives a
> > neumeric id to the url, run the query on the number, and then get the desired
> > output
> >
> > Vijay
> >
> > Lukas Ertl wrote:
> >
> > > Hi,
> > >
> > > I'm having trouble with an obviously simple query that just doesn't
> > > perform quite good IMO.
> > >
> > > I have two tables:
> > >
> > > httplog=# \d hits
> > > Table "hits"
> > > Attribute | Type | Modifier
> > > -------------+-----------+-----------------------------------------------
> > > id | integer | not null default nextval('hits_id_seq'::text)
> > > page_id | integer | not null
> > > referrer_id | integer | not null
> > > ip_addr | inet | not null
> > > time | timestamp | not null
> > > domain_id | integer | not null
> > > Index: hits_pkey
> > >
> > > httplog=# \d referrer
> > > Table "referrer"
> > > Attribute | Type | Modifier
> > > -----------+--------------+----------
> > > id | integer |
> > > url | varchar(300) |
> > > Index: referrer_pkey
> > >
> > > These are part of an HTTP-log database. Table 'hits' has about 7000
> > > rows, table 'referrer' has about 350 rows. Now I want to know what the top
> > > ten referrers are, and I issue this query:
> > >
> > > SELECT count(*), url FROM hits, referrer WHERE referrer.id = referrer_id
> > > GROUP BY url ORDER BY count DESC LIMIT 10;

--
Lukas Ertl eMail: l(dot)ertl(at)univie(dot)ac(dot)at
WWW-Redaktion Tel.: (+43 1) 4277-14073
Zentraler Informatikdienst (ZID) Fax.: (+43 1) 4277-9140
der Universität Wien

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message wsmuir 2001-03-30 12:25:25 About Triggers
Previous Message Lars Forseth 2001-03-29 19:44:57 Help on Postgres and locale - norwegian characters are not possible in my SuSE 7.0 stock install