From: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> |
---|---|
To: | "Sergio Charpinel Jr(dot)" <sergiocharpinel(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Low perfomance SUM and Group by large databse |
Date: | 2010-06-21 22:06:47 |
Message-ID: | 4C1FE277.5030603@postnewspapers.com.au |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 21/06/10 22:42, Sergio Charpinel Jr. wrote:
> Hi,
>
> I'm getting low performance on SUM and GROUP BY queries.
> How can I improve my database to perform such queries.
> -> Sort (cost=3499863.27..3523695.89 rows=9533049
> width=50) (actual time=165468.022..168908.828 rows=9494165 loops=1)
> Sort Key: ip_src, port_src, ip_dst, port_dst,
> tcp_flags, ip_proto
> -> Seq Scan on acct_2010_25 (cost=0.00..352648.10
> rows=9533049 width=50) (actual time=0.038..50860.391 rows=9494165 loops=1)
> Filter: ((stamp_inserted >= '2010-06-20
> 10:10:00'::timestamp without time zone) AND (stamp_inserted <
> '2010-06-21 10:10:00'::timestamp without time zone))
Provide an index on at least (ip_src,port_src,ip_dst,port_dst). If you
frequently do other queries that only want some of that information you
could create several individual indexes for those columns instead, as Pg
will combine them for a query, but that is much less efficient than an
index across all four columns.
CREATE INDEX ip_peers_idx ON acct_2010_25(ip_src,port_src,ip_dst_port_dst);
Every index added costs you insert/update/delete speed, so try to find
the smallest/simplest index that gives you acceptable performance.
> Another one just summing bytes (still low):
>
> => explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags,
> ip_proto,SUM("bytes") FROM "acct_2010_25" WHERE
> "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21
> 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto
> LIMIT 50 OFFSET 0;
Same deal. You have no suitable index, so Pg has to do a sequential scan
of the table. Since you appear to query on stamp_inserted a lot, you
should index it.
--
Craig Ringer
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2010-06-21 22:14:39 | Re: PostgreSQL as a local in-memory cache |
Previous Message | Scott Marlowe | 2010-06-21 18:08:05 | Re: mysql to postgresql, performance questions |