Re: Performance question 83 GB Table 150 million rows, distinct select

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Tory M Blue <tmblue(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance question 83 GB Table 150 million rows, distinct select
Date: 2011-11-17 03:02:56
Message-ID: CAOR=d=3dN+7gXYKCqZkFAJnEPAACoF-att42cE4rUeokWL6Wmg@mail.gmail.com
Lists: pgsql-performance

On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue <tmblue(at)gmail(dot)com> wrote:
> On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> On 17 November 2011, 2:57, Scott Marlowe wrote:
>>> On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>>>
>>>> But you're right - you're not bound by I/O (although I don't know what
>>>> those 15% are - iowait, util or what?). The COUNT(DISTINCT) has to actually
>>>> keep all the distinct values to determine which are actually distinct.
>>>
>>> Actually I meant to comment on this, he is IO bound.  Look at % Util,
>>> it's at 99 or 100.
>>>
>>> Also, if you have 16 cores and look at something like vmstat you'll
>>> see 6% wait state.  That 6% represents one CPU core waiting for IO,
>>> the other cores will add up the rest to 100%.
>>
>> Aaaah, I keep forgetting about this and I somehow ignored the iostat
>> results too. Yes, he's obviously IO bound.
>
> I'm not so sure on the io-bound. Been battling/reading about it all
> day. 1 CPU is pegged at 100%, but the disk is not. If I do something

Look here in iostat:

> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.00     3.50 3060.00    2.00 49224.00    20.00    16.08     2.21    0.76   0.33  99.95

See that last column? It's %util, the percentage of time the device
was busy. Once it hits 100%, you are anywhere from pretty close to IO
bound to well past it.
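
For anyone who wants to check these numbers in a script, here is a
rough Python sketch. It pairs the iostat column names with the values
quoted above and works out what share of total CPU time one core
represents on a 16-core box. The function names are made up for
illustration, and the parsing assumes the column layout shown in this
thread, which may differ in other sysstat versions.

```python
# Sketch only: the header and sample row are copied from the iostat output
# quoted in this thread; the function names are illustrative, not a real API.

HEADER = ("Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s "
          "avgrq-sz avgqu-sz await svctm %util")
SAMPLE = "sda 0.00 3.50 3060.00 2.00 49224.00 20.00 16.08 2.21 0.76 0.33 99.95"


def parse_iostat_row(header, row):
    """Pair each iostat column name with its value for one device row."""
    names = header.split()[1:]          # drop the "Device:" label
    fields = row.split()
    device, values = fields[0], fields[1:]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return device, dict(zip(names, map(float, values)))


def one_core_share(cores):
    """Percent of total CPU time that a single core accounts for."""
    return 100.0 / cores


device, stats = parse_iostat_row(HEADER, SAMPLE)
print(device, "%util =", stats["%util"])
print("one core out of 16 is", one_core_share(16), "percent of the total")
```

On 16 cores a single core is 6.25% of the total, which is why one
process stuck waiting on the disk shows up as only about 6% wait in
vmstat while the device itself sits at nearly 100% busy.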

I agree with the previous poster, you should roll these up ahead of
time into a materialized view for fast reporting.
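
A minimal sketch of the roll-up idea, using Python's sqlite3 module
only so that it runs anywhere. The thread does not show the real
schema, so the table and column names (events, day, user_id,
daily_distinct) are hypothetical, and the summary is a plain table
filled ahead of time rather than a built-in materialized view.

```python
import sqlite3

# Hypothetical schema: "events", "day" and "user_id" are made-up names,
# since the real 83 GB table is not shown in this thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2011-11-15", 1), ("2011-11-15", 1), ("2011-11-15", 2),
     ("2011-11-16", 2), ("2011-11-16", 3), ("2011-11-16", 3)],
)

# Roll up once, ahead of time: one row per day holding the distinct count.
conn.execute(
    "CREATE TABLE daily_distinct AS "
    "SELECT day, COUNT(DISTINCT user_id) AS distinct_users "
    "FROM events GROUP BY day"
)

# Reports then read the small summary table instead of the big one.
rollup = conn.execute(
    "SELECT day, distinct_users FROM daily_distinct ORDER BY day"
).fetchall()
print(rollup)
```

One caveat with this layout: per-day distinct counts cannot simply be
summed to get a distinct count over a longer range, because the same
value can appear on several days. If reports need arbitrary ranges,
the roll-up would have to keep the distinct (day, value) pairs instead.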
