Re: Slow count(*) again...

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Joshua Tolley <eggyknap(at)gmail(dot)com>
Cc: mladen(dot)gogala(at)vmsinfo(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow count(*) again...
Date: 2010-10-11 00:51:43
Message-ID: 4CB25F9F.4050204@postnewspapers.com.au
Lists: pgsql-hackers pgsql-performance

On 10/11/2010 08:27 AM, Joshua Tolley wrote:

> One thing a test program would have to take into account is multiple
> concurrent users. What speeds up the single user case may well hurt the
> multi user case, and the behaviors that hurt single user cases may have been
> put in place on purpose to allow decent multi-user performance. Of course, all
> of that is "might" and "maybe", and I can't prove any assertions about block
> size either. But the fact of multiple users needs to be kept in mind.

Agreed. I've put together a simple test program to test I/O chunk sizes.
It only tests single-user performance, but it'd be pretty trivial to
adapt it to spawn a couple of worker children or run several threads,
each starting with a suitable delay, since it's rather uncommon to have
a bunch of seqscans all fire off at once.
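
The attached testio.c is what I actually ran; purely as a sketch (not a
copy of the attachment), the core of a chunk-size read benchmark like
this boils down to something like the following, taking the file name
and chunk size on the command line:

/* Sketch (not the attached testio.c): time a sequential read of a
 * file in a fixed chunk size. Build with: gcc -O2 -o readtest
 * readtest.c (add -lrt on older glibc for clock_gettime). Drop
 * caches between runs (echo 3 > /proc/sys/vm/drop_caches, as root)
 * to get cold-cache numbers. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s filename chunk_bytes\n", argv[0]);
        return 1;
    }

    size_t chunk = (size_t) atol(argv[2]);
    char *buf = malloc(chunk);
    int fd = open(argv[1], O_RDONLY);
    if (buf == NULL || fd < 0) {
        perror("setup");
        return 1;
    }

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, chunk)) > 0)
        total += n;

    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (end.tv_sec - start.tv_sec)
        + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("read %lld bytes in %.2fs with %zu-byte chunks\n",
           total, elapsed, chunk);

    close(fd);
    free(buf);
    return 0;
}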

From this test it's pretty clear that with buffered I/O of an uncached
700MB file under Linux, the I/O chunk size makes very little difference:
every chunk size took about 9.8s to read the test file, with
near-identical CPU utilization. Caches were dropped between each test run.

For direct I/O (by ORing O_DIRECT into the open() flags), chunk size is
*hugely* significant: 4k chunk reads of the test file took 38s, 8k took
22s, 16k 14s, 32k 10.8s, and 64k through 1024k took 9.8s, rising a
little again above 1024k.
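
The direct I/O variant only changes how the file is opened and how the
buffer is allocated, since O_DIRECT wants the buffer (and, in practice,
the chunk size) aligned to the device's block size. Roughly, again just
a sketch, with 4096-byte alignment assumed:

/* Direct-I/O open: O_DIRECT needs _GNU_SOURCE on Linux and a buffer
 * aligned to the filesystem's logical block size (4096 assumed here;
 * 512 is the traditional minimum). Chunk size should be a multiple
 * of that block size too. The read loop stays the same as in the
 * buffered case. */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>

int open_direct(const char *path, size_t chunk, char **bufp)
{
    void *buf;

    if (posix_memalign(&buf, 4096, chunk) != 0) {
        perror("posix_memalign");
        return -1;
    }

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open(O_DIRECT)");
        free(buf);
        return -1;
    }

    *bufp = buf;
    return fd;
}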

Apparently Oracle is almost always configured to use direct I/O, so it
would benefit massively from large chunk sizes. PostgreSQL is almost
never used with direct I/O, and at least in terms of the low-level costs
of syscalls and file system activity, shouldn't care at all about read
chunk sizes.

Bumping readahead from 256 to 8192 made no significant difference for
either case. Of course, I'm on a crappy laptop disk...
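
For reference, those readahead numbers are in 512-byte sectors, set with
blockdev --setra on the block device. The equivalent ioctls, if you'd
rather do it from C, look roughly like this (sketch from memory, so
check linux/fs.h; it needs root, and the device path is an example):

/* Sketch: read and set block-device readahead in 512-byte sectors,
 * roughly what "blockdev --getra/--setra" does. Needs root; the
 * device path is just an example. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(void)
{
    int fd = open("/dev/sda", O_RDONLY);    /* example device */
    if (fd < 0) { perror("open"); return 1; }

    long ra = 0;
    if (ioctl(fd, BLKRAGET, &ra) == 0)
        printf("current readahead: %ld sectors\n", ra);

    if (ioctl(fd, BLKRASET, 8192L) != 0)    /* 8192 sectors = 4MB */
        perror("BLKRASET");

    close(fd);
    return 0;
}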

I'm guessing this is the origin of the OP's focus on I/O chunk sizes.

Anyway, for the single-seqscan case, I see little evidence here that
using a bigger read chunk size would help PostgreSQL reduce overheads or
improve performance.

OP: Is your Oracle instance using direct I/O?

--
Craig Ringer

Attachment Content-Type Size
testio.c text/plain 918 bytes
results_buffered_ra256 text/plain 1.1 KB
results_buffered_ra4096 text/plain 1.1 KB
results_odirect_ra256 text/plain 1016 bytes
results_odirect_ra4096 text/plain 1019 bytes
