Re: Slow count(*) again...

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Joshua Tolley <eggyknap(at)gmail(dot)com>
Cc: mladen(dot)gogala(at)vmsinfo(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow count(*) again...
Date: 2010-10-11 00:51:43
Message-ID: 4CB25F9F.4050204@postnewspapers.com.au
Lists: pgsql-hackers pgsql-performance

On 10/11/2010 08:27 AM, Joshua Tolley wrote:

> One thing a test program would have to take into account is multiple
> concurrent users. What speeds up the single user case may well hurt the
> multi user case, and the behaviors that hurt single user cases may have been
> put in place on purpose to allow decent multi-user performance. Of course, all
> of that is "might" and "maybe", and I can't prove any assertions about block
> size either. But the fact of multiple users needs to be kept in mind.

Agreed. I've put together a simple test program to test I/O chunk sizes.
It only tests single-user performance, but it'd be pretty trivial to
adapt it to spawn a couple of worker children or run several threads,
each starting with a suitable delay, since it's rather uncommon to have
a bunch of seqscans all fire off at once.
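
The attached testio.c is what I actually ran; purely as a sketch (not a
copy of the attachment), the core of a chunk-size read benchmark like
this boils down to something like the following, taking the file name
and chunk size on the command line:

/* Sketch (not the attached testio.c): time a sequential read of a
 * file in a fixed chunk size. Build with: gcc -O2 -o readtest
 * readtest.c (add -lrt on older glibc for clock_gettime). Drop
 * caches between runs (echo 3 > /proc/sys/vm/drop_caches, as root)
 * to get cold-cache numbers. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s filename chunk_bytes\n", argv[0]);
        return 1;
    }

    size_t chunk = (size_t) atol(argv[2]);
    char *buf = malloc(chunk);
    int fd = open(argv[1], O_RDONLY);
    if (buf == NULL || fd < 0) {
        perror("setup");
        return 1;
    }

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, chunk)) > 0)
        total += n;

    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (end.tv_sec - start.tv_sec)
        + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("read %lld bytes in %.2fs with %zu-byte chunks\n",
           total, elapsed, chunk);

    close(fd);
    free(buf);
    return 0;
}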

From this test it's pretty clear that with buffered I/O of an uncached
700MB file under Linux, the I/O chunk size makes very little difference:
every chunk size took about 9.8s to read the test file, with
near-identical CPU utilization. Caches were dropped between each test run.

For direct I/O (by ORing O_DIRECT into the open() flags), chunk size is
*hugely* significant: 4k chunk reads of the test file took 38s, 8k took
22s, 16k 14s, 32k 10.8s, and 64k through 1024k took 9.8s, rising a
little again above 1024k.
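
The direct I/O variant only changes how the file is opened and how the
buffer is allocated, since O_DIRECT wants the buffer (and, in practice,
the chunk size) aligned to the device's block size. Roughly, again just
a sketch, with 4096-byte alignment assumed:

/* Direct-I/O open: O_DIRECT needs _GNU_SOURCE on Linux and a buffer
 * aligned to the filesystem's logical block size (4096 assumed here;
 * 512 is the traditional minimum). Chunk size should be a multiple
 * of that block size too. The read loop stays the same as in the
 * buffered case. */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>

int open_direct(const char *path, size_t chunk, char **bufp)
{
    void *buf;

    if (posix_memalign(&buf, 4096, chunk) != 0) {
        perror("posix_memalign");
        return -1;
    }

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open(O_DIRECT)");
        free(buf);
        return -1;
    }

    *bufp = buf;
    return fd;
}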

Apparently Oracle is almost always configured to use direct I/O, so it
would benefit massively from large chunk sizes. PostgreSQL is almost
never used with direct I/O, and at least in terms of the low-level costs
of syscalls and file system activity, shouldn't care at all about read
chunk sizes.

Bumping readahead from 256 to 8192 made no significant difference for
either case. Of course, I'm on a crappy laptop disk...
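
For reference, those readahead numbers are in 512-byte sectors, set with
blockdev --setra on the block device. The equivalent ioctls, if you'd
rather do it from C, look roughly like this (sketch from memory, so
check linux/fs.h; it needs root, and the device path is an example):

/* Sketch: read and set block-device readahead in 512-byte sectors,
 * roughly what "blockdev --getra/--setra" does. Needs root; the
 * device path is just an example. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(void)
{
    int fd = open("/dev/sda", O_RDONLY);    /* example device */
    if (fd < 0) { perror("open"); return 1; }

    long ra = 0;
    if (ioctl(fd, BLKRAGET, &ra) == 0)
        printf("current readahead: %ld sectors\n", ra);

    if (ioctl(fd, BLKRASET, 8192L) != 0)    /* 8192 sectors = 4MB */
        perror("BLKRASET");

    close(fd);
    return 0;
}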

I'm guessing this is the origin of the OP's focus on I/O chunk sizes.

Anyway, for the single-seqscan case, I see little evidence here that
using a bigger read chunk size would help PostgreSQL reduce overheads or
improve performance.

OP: Is your Oracle instance using direct I/O?

--
Craig Ringer

Attachment Content-Type Size
testio.c text/plain 918 bytes
results_buffered_ra256 text/plain 1.1 KB
results_buffered_ra4096 text/plain 1.1 KB
results_odirect_ra256 text/plain 1016 bytes
results_odirect_ra4096 text/plain 1019 bytes
