Re: Performance question 83 GB Table 150 million rows, distinct select

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Tory M Blue <tmblue(at)gmail(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance question 83 GB Table 150 million rows, distinct select
Date: 2011-11-17 16:55:40
Message-ID: CAGTBQpaAEApuVB6Z7BcCVs1O-t2L3ZSLbCPyqG5LL__sjUoukA@mail.gmail.com
Lists: pgsql-performance

On Thu, Nov 17, 2011 at 11:17 AM, Aidan Van Dyk <aidan(at)highrise(dot)ca> wrote:
> But remember, you're doing all that in a single query.  So your disk
> subsystem might be able to deliver even more *throughput* if it
> were given many more concurrent requests.  A big raid10 is really good
> at handling multiple concurrent requests.  But it's pretty much
> impossible to saturate a big raid array with only a single read
> stream.

The query uses a bitmap heap scan, which means it would benefit from a
high effective_io_concurrency.

What's your effective_io_concurrency setting?
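Just to illustrate (the table and column names below are placeholders,
not from your schema):

  SHOW effective_io_concurrency;   -- the default is 1
  EXPLAIN SELECT DISTINCT some_col FROM your_big_table;
  -- look for a "Bitmap Heap Scan" node in the plan; that's the node
  -- whose prefetching this setting controls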

A good starting point is the number of spindles in your array, though
I usually use 1.5x that number since it gives me a little more
throughput.
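
As a rough sketch, for a hypothetical 10-spindle array:

  # postgresql.conf (numbers are made up for illustration)
  # 10 spindles * 1.5 = 15
  effective_io_concurrency = 15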

You can set it on a query-by-query basis too, so you don't need to
change the configuration. If you do change it in postgresql.conf
instead, a reload is enough to make PG pick it up, so either way it's
an easy thing to try.
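
Something along these lines (again, object names are placeholders):

  -- per-session, right before the problem query:
  SET effective_io_concurrency = 15;
  SELECT DISTINCT some_col FROM your_big_table;

  -- or, after editing postgresql.conf, reload without a restart:
  SELECT pg_reload_conf();   -- or: pg_ctl reload -D /path/to/data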
