Re: Performance question 83 GB Table 150 million rows, distinct select

From:	"Tomas Vondra" <tv(at)fuzzy(dot)cz>
To:	"Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
Cc:	"Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Tory M Blue" <tmblue(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Performance question 83 GB Table 150 million rows, distinct select
Date:	2011-11-17 02:27:35
Message-ID:	57c79e7822e93c2325245ab4f5606acc.squirrel@sq.gransy.com
Lists:	pgsql-performance

On 17 November 2011, 2:57, Scott Marlowe wrote:
> On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>
>> But you're right - you're not bound by I/O (although I don't know what
>> those 15% are - iowait, util or what?). The COUNT(DISTINCT) has to actually
>> keep all the distinct values to determine which are actually distinct.
>
> Actually I meant to comment on this, he is IO bound. Look at % Util,
> it's at 99 or 100.
>
> Also, if you have 16 cores and look at something like vmstat you'll
> see 6% wait state. That 6% represents one CPU core waiting for IO,
> the other cores will add up the rest to 100%.

Aaaah, I keep forgetting about this and I somehow ignored the iostat
results too. Yes, he's obviously IO bound.
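
The arithmetic behind the 6% figure can be sketched as below. This is only an
illustration, assuming one core is fully blocked on I/O while the other 15 are
not waiting at all; the function name is made up for the example.

```python
def aggregate_iowait_pct(total_cores: int, cores_waiting: int) -> float:
    """System-wide iowait percentage when `cores_waiting` cores are
    fully blocked on I/O and the remaining cores are not waiting."""
    if total_cores <= 0 or not 0 <= cores_waiting <= total_cores:
        raise ValueError("invalid core counts")
    return 100.0 * cores_waiting / total_cores


# One core out of 16 stuck on disk shows up as only ~6% system-wide.
print(aggregate_iowait_pct(16, 1))  # 6.25
```

So a low wait percentage on a many-core box can still mean that the single
backend running the query spends nearly all of its time waiting for the disk.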

But this actually means that pre-aggregating the data (as I described in my
previous post) would probably help him even more (less data to read, less CPU).
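
A rough sketch of the pre-aggregation idea, in Python rather than SQL. The
(day, value) row layout and the function names are assumptions made for the
example, not the actual schema or the exact scheme from the previous post:
the raw rows are collapsed once into per-day sets of distinct values, and the
distinct count over a date range is then computed from that smaller summary.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def preaggregate(rows: Iterable[Tuple[str, int]]) -> Dict[str, Set[int]]:
    """Collapse raw (day, value) rows into one set of distinct values per day."""
    per_day: Dict[str, Set[int]] = defaultdict(set)
    for day, value in rows:
        per_day[day].add(value)
    return dict(per_day)


def count_distinct(per_day: Dict[str, Set[int]], days: List[str]) -> int:
    """Distinct values over the given days, using only the summary."""
    seen: Set[int] = set()
    for day in days:
        seen |= per_day.get(day, set())
    return len(seen)


# Hypothetical sample data: duplicates within a day disappear in the summary.
raw = [
    ("2011-11-15", 1), ("2011-11-15", 1), ("2011-11-15", 2),
    ("2011-11-16", 2), ("2011-11-16", 3),
]
summary = preaggregate(raw)
print(count_distinct(summary, ["2011-11-15", "2011-11-16"]))  # 3
```

In the database the summary would be a table refreshed periodically, so the
expensive query reads far fewer rows than the full 83 GB table.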

Tomas

In response to

Re: Performance question 83 GB Table 150 million rows, distinct select at 2011-11-17 01:57:54 from Scott Marlowe

Responses

Re: Performance question 83 GB Table 150 million rows, distinct select at 2011-11-17 02:42:55 from Tory M Blue