Skip site navigation (1) Skip section navigation (2)

Re: Performance question 83 GB Table 150 million rows, distinct select

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Tory M Blue <tmblue(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance question 83 GB Table 150 million rows, distinct select
Date: 2011-11-17 03:02:56
Message-ID: CAOR=d=3dN+7gXYKCqZkFAJnEPAACoF-att42cE4rUeokWL6Wmg@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue <tmblue(at)gmail(dot)com> wrote:
> On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> On 17 Listopad 2011, 2:57, Scott Marlowe wrote:
>>> On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>>>
>>>> But you're right - you're not bound by I/O (although I don't know what
>>>> are
>>>> those 15% - iowait, util or what?). The COUNT(DISTINCT) has to actually
>>>> keep all the distinct values to determine which are actually distinct.
>>>
>>> Actually I meant to comment on this, he is IO bound.  Look at % Util,
>>> it's at 99 or 100.
>>>
>>> Also, if you have 16 cores and look at something like vmstat you'll
>>> see 6% wait state.  That 6% represents one CPU core waiting for IO,
>>> the other cores will add up the rest to 100%.
>>
>> Aaaah, I keep forgetting about this and I somehow ignored the iostat
>> results too. Yes, he's obviously IO bound.
>
> I'm not so sure on the io-bound. Been battling/reading about it all
> day. 1 CPU is pegged at 100%, but the disk is not. If I do something

Look here in iostat:

> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s
> avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.00     3.50 3060.00    2.00 49224.00    20.00
> 16.08     2.21    0.76   0.33  99.95

See that last column, it's % utilization.   Once it hits 100% you are
anywhere from pretty close to IO bound to right on past it.

I agree with the previous poster, you should roll these up ahead of
time into a materialized view for fast reporting.

In response to

Responses

pgsql-performance by date

Next:From: Scott MarloweDate: 2011-11-17 03:04:38
Subject: Re: Performance question 83 GB Table 150 million rows, distinct select
Previous:From: Tory M BlueDate: 2011-11-17 02:42:55
Subject: Re: Performance question 83 GB Table 150 million rows, distinct select

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group