Re: Low perfomance SUM and Group by large databse

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Sergio Charpinel Jr(dot)" <sergiocharpinel(at)gmail(dot)com>
Cc: Russell Smith <mr-russ(at)pws(dot)com(dot)au>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Low perfomance SUM and Group by large databse
Date: 2010-07-02 17:21:47
Message-ID: AANLkTimmthOhlu2nflXje4noQouyiRwgZvkq60PRAJOB@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, Jun 29, 2010 at 7:59 AM, Sergio Charpinel Jr.
<sergiocharpinel(at)gmail(dot)com> wrote:
> One more question about two specifics query behavior: If I add "AND (ip_dst
> = x.x.x.x)", it uses another plan and take a much more time. In both of
> them, I'm using WHERE clause. Why this behavior?

With either query, the planner is choosing to scan backward through
the acct_2010_26_pk index to get the rows in descending order by the
"bytes" column. It keeps scanning until it finds 50 rows that match
the WHERE clause. With just the critieria on stamp_inserted, matches
are pretty common, so it doesn't have to scan very far before finding
50 suitable rows. But when you add the ip_dst = 'x.x.x.x' criterion,
suddenly a much smaller percentage of the rows match and so it has to
read much further into the index before it finds 50 that do.

A second index on just the ip_dst column might help a lot - then it
could consider index-scanning for the matching rows and sorting them
afterwards.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Deborah Fuentes 2010-07-02 19:07:58 Re: Extremely high CPU usage when building tables
Previous Message Benjamin Krajmalnik 2010-07-02 16:40:03 Re: Question about partitioned query behavior