Re: same query in high number of times

From: Peter Alban <peter(dot)alban2(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: same query in high number of times
Date: 2009-06-22 21:11:06
Message-ID: 477dfcc10906221411t34dfac4fn9af31a246b69e85b@mail.gmail.com
Lists: pgsql-performance

Hey folks!

Removing the GROUP BY did eventually improve things, but my concern remains:
why can't we take the result from memory, given that it's the same result set?
Meanwhile I keep pushing the developers to move to memcached so we can work
around this limitation.
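
For reference, this is roughly what the query looks like with the GROUP BY
dropped (same tables and columns as in the thread; a sketch, untested):

    SELECT news.url_text, news.title, comments.name, comments.createdate,
           comments.user_id, comments.comment
    FROM news
    JOIN comments ON comments.cid = news.id
    WHERE comments.published = '1'
    ORDER BY comments.createdate DESC
    LIMIT 3;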

cheers,
Peter

On Mon, Jun 22, 2009 at 5:23 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Sun, Jun 21, 2009 at 4:59 PM, Peter Alban <peter(dot)alban2(at)gmail(dot)com>
> wrote:
> >
> >
> > On Sun, Jun 21, 2009 at 10:01 PM, Justin Graf <justin(at)emproshunts(dot)com>
> > wrote:
> >>
> >> Peter Alban wrote:
> >>
> >> duration: 2533.734 ms statement:
> >>
> >> SELECT news.url_text, news.title, comments.name, comments.createdate,
> >> comments.user_id, comments.comment FROM news, comments WHERE
> >> comments.cid=news.id AND comments.published='1' GROUP BY
> >> news.url_text, news.title, comments.name, comments.createdate,
> >> comments.user_id, comments.comment ORDER BY comments.createdate DESC
> >> LIMIT 3
> >>
> >>
> >> And here is the query plan:
> >>
> >>                                   QUERY PLAN
> >> ----------------------------------------------------------------------------------------------------------------------------------------
> >> Limit (cost=4313.54..4313.55 rows=3 width=595) (actual time=288.525..288.528 rows=3 loops=1)
> >>   -> Sort (cost=4313.54..4347.26 rows=13486 width=595) (actual time=288.523..288.523 rows=3 loops=1)
> >>         Sort Key: comments.createdate
> >>         -> HashAggregate (cost=3253.60..3388.46 rows=13486 width=595) (actual time=137.521..148.132 rows=13415 loops=1)
> >>               -> Hash Join (cost=1400.73..3051.31 rows=13486 width=595) (actual time=14.298..51.049 rows=13578 loops=1)
> >>                     Hash Cond: ("outer".cid = "inner".id)
> >>                     -> Seq Scan on comments (cost=0.00..1178.72 rows=13480 width=522) (actual time=0.012..17.434 rows=13418 loops=1)
> >>                           Filter: (published = 1)
> >>                     -> Hash (cost=1391.18..1391.18 rows=3818 width=81) (actual time=14.268..14.268 rows=3818 loops=1)
> >>                           -> Seq Scan on news (cost=0.00..1391.18 rows=3818 width=81) (actual time=0.021..10.072 rows=3818 loops=1)
> >>
> >> The same query is being requested from different sessions. So why is it not
> >> being cached?
> >>
> >>
> >> Because the query results are not cached; only the raw tables are. The
> >> query is rerun every time it is requested.
> >>
> >> What is the GROUP BY clause accomplishing???
> >> The sorting and hash aggregate are eating up all the time.
> >
> > So this should mean that running the above query on a, say, 5 MB table
> > that is already in memory takes 2 seconds?
>
> Nope. But as others have pointed out, you need to figure out why it's
> taking 2.5 s but EXPLAIN ANALYZE is only saying 300 ms.
>
> There are other things you can do to optimize this query; for example:
>
> 1. Try creating an index on comments (createdate), and don't forget to
> ANALYZE the table afterward, or
>
> 2. Modify the query to remove the probably-unnecessary GROUP BY.
>
> But figuring out the times may be the first thing. My guess is that
> the 2.5 s time is a time from your logs, maybe at a time when the
> system was busy, and the 300 ms time was what you got when you ran
> it some other time. But maybe there's some other explanation. You
> should try to figure it out.
>
> ...Robert
>
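
For reference, Robert's first suggestion would look roughly like this (the
index name is illustrative, not from the thread):

    CREATE INDEX comments_createdate_idx ON comments (createdate);
    ANALYZE comments;

After that, re-running EXPLAIN ANALYZE on the query should show whether the
sort on comments.createdate can use the new index.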
