Re: Query tuning

From: Kevin Kempter <kevink(at)consistentstate(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query tuning
Date: 2009-08-19 17:36:55
Message-ID: 200908191136.55428.kevink@consistentstate.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wednesday 19 August 2009 11:17:26 Scott Carey wrote:
> On 8/19/09 9:28 AM, "Kevin Kempter" <kevink(at)consistentstate(dot)com> wrote:
> > Hi all;
> >
> > we've been fighting this query for a few days now. we bumped up the
> > statistict target for the a.id , c.url_hits_id and the b.id columns below
> > to 250 and ran an analyze on the relevant tables. we killed it after
> > 8hrs.
> >
> > Note the url_hits table has > 1.4billion rows
> >
> > Any suggestions?
>
> Have you tried setting work_mem higher for just this query?

Yes, we upped it to 500Meg

>
> The big estimated cost is the sequential scan on url_hits. But in reality,
> if the estimates are off the sort and index scan at the end might be your
> bottleneck. Larger work_mem might make it choose another plan there.
>
> But if the true cost is the sequential scan on url_hits, then only an index
> there will help.
>
> > $ psql -ef expl.sql pwreport
> > explain
> > select
> > a.id,
> > ident_id,
> > time,
> > customer_name,
> > extract('day' from timezone(e.name, to_timestamp(a.time))) as day,
> > category_id
> > from
> > pwreport.url_hits a left outer join
> > pwreport.url_hits_category_jt c on (a.id = c.url_hits_id),
> > pwreport.ident b,
> > pwreport.timezone e
> > where
> > a.ident_id = b.id
> > and b.timezone_id = e.id
> > and time >= extract ('epoch' from timestamp '2009-08-12')
> > and time < extract ('epoch' from timestamp '2009-08-13' )
> > and direction = 'REQUEST'
> > ;
> >
> > QUERY
> > PLAN
> > -------------------------------------------------------------------------
> >-----
> > -------------------------------------------------------------------------
> >----- --------------------------------------------------------
> > Merge Right Join (cost=47528508.61..180424544.59 rows=10409251
> > width=53) Merge Cond: (c.url_hits_id = a.id)
> > -> Index Scan using mt_url_hits_category_jt_url_hits_id_index on
> > url_hits_category_jt c (cost=0.00..122162596.63 rows=4189283233 width=8)
> > -> Sort (cost=47528508.61..47536931.63 rows=3369210 width=49)
> > Sort Key: a.id
> > -> Hash Join (cost=2565.00..47163219.21 rows=3369210 width=49)
> > Hash Cond: (b.timezone_id = e.id)
> > -> Hash Join (cost=2553.49..47116881.07 rows=3369210
> > width=37)
> > Hash Cond: (a.ident_id = b.id)
> > -> Seq Scan on url_hits a (cost=0.00..47051154.89
> > rows=3369210 width=12)
> > Filter: ((direction =
> > 'REQUEST'::proxy_direction_enum) AND (("time")::double precision >=
> > 1250035200::double precision) AND (("time")::double precision <
> > 1250121600::double precision))
> > -> Hash (cost=2020.44..2020.44 rows=42644
> > width=29) -> Seq Scan on ident b (cost=0.00..2020.44 rows=42644
> > width=29)
> > -> Hash (cost=6.78..6.78 rows=378 width=20)
> > -> Seq Scan on timezone e (cost=0.00..6.78
> > rows=378 width=20)
> > (15 rows)
> >
> >
> > --
> > Sent via pgsql-performance mailing list
> > (pgsql-performance(at)postgresql(dot)org) To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-performance

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Kempter 2009-08-19 17:37:58 Re: Query tuning
Previous Message Nikolas Everett 2009-08-19 17:31:30 Re: Query tuning