Re: Slow count(*) again...

From: Neil Whelchel <neil(dot)whelchel(at)gmail(dot)com>
To: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
Subject: Re: Slow count(*) again...
Date: 2010-10-11 04:15:56
Message-ID: 201010102115.58729.neil.whelchel@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On Sunday 10 October 2010 15:41:16 you wrote:
> On 10/11/2010 01:14 AM, Mladen Gogala wrote:
> > I can provide measurements, but from Oracle RDBMS. Postgres doesn't
> > allow tuning of that aspect, so no measurement can be done. Would the
> > numbers from Oracle RDBMS be acceptable?
>
> Well, they'd tell me a lot about Oracle's performance as I/O chunk size
> scales, but almost nothing about the cost of small I/O operations vs
> larger ones in general.
>
> Typically dedicated test programs that simulate the database read
> patterns would be used for this sort of thing. I'd be surprised if
> nobody on -hackers has already done suitable testing; I was mostly
> asking because I was interested in how you were backing your assertions.
>
> PostgreSQL isn't Oracle; their design is in many ways very different.
> Most importantly, Oracle uses a redo log, where PostgreSQL stores old
> rows with visibility information directly in the tables. It is possible
> that a larger proportion of Oracle's I/O costs are fixed per-block
> overheads rather than per-byte costs, so it seeks to batch requests into
> larger chunks. Of course, it's also possible that 8k chunk I/O is just
> universally expensive and is something Pg should avoid, too, but we
> can't know that without
> dedicated testing, which I at least haven't done. I don't follow
> -hackers closely, and wouldn't have seen discussion about testing done
> there. The archives are likely to contain useful discussions.
>
> Then again, IIRC Pg's page size is also it's I/O size, so you could
> actually get larger I/O chunking by rebuilding Pg with larger pages.
> Having never had the need, I haven't examined the performance of page
> size changes on I/O performance.

This is a really good idea! I will look into doing this and I will post the
results as soon as I can get it done.

Right now, I am building a test machine with two dual core Intel processors
and two 15KRPM mirrored hard drives, 1 GB ram. I am using a small amount of
ram because I will be using small test tables. I may do testing in the future
with more ram and bigger tables, but I think I can accomplish what we are all
after with what I have. The machine will be limited to running the database
server in test, init, bash, and ssh, no other processes will be running except
for what is directly involved with testing. I will post exact specs when I
post test results. I will create some test tables, and the same tables will be
used in all tests. Suggestions for optimal Postgres and system configuration
are welcome. I will try any suggested settings that I have time to test.
-Neil-

>
> >> The Linux kernel, at least, does request merging (and splitting, and
> >> merging, and more splitting) along the request path, and I'd
> >> personally expect that most of the cost of 8k requests would be in the
> >> increased number of system calls, buffer copies, etc required.
> >> Measurements demonstrating or contradicting this would be good to see.
> >
> > Even the cost of hundreds of thousands of context switches is far from
> > negligible. What kind of measurements do you expect me to do with the
> > database which doesn't support tweaking of that aspect of its operation?
>
> Test programs, or references to testing done by others that demonstrates
> these costs in isolation. Of course, they still wouldn't show what gain
> Pg might obtain (nothing except hacking on Pg's sources really will) but
> they'd help measure the costs of doing I/O that way.
>
> I suspect you're right that large I/O chunks would be desirable and a
> potential performance improvement. What I'd like to know is *how*
> *much*, or at least how much the current approach costs in pure
> overheads like context switches and scheduler delays.
>
> > Does that provide enough of an evidence and, if not, why not?
>
> It shows that it helps Oracle a lot ;-)
>
> Without isolating how much of that is raw costs of the block I/O and how
> much is costs internal to Oracle, it's still hard to have much idea how
> much it'd benefit Pg to take a similar approach.
>
> I'm sure the folks on -hackers have been over this and know a whole lot
> more about it than I do, though.
>
> > Oracle counts 400 million records in 2 minutes and Postgres 9.01 takes
> > 12.8 seconds to count 1.2 million records? Do you see the disparity?
>
> Sure. What I don't know is how much of that is due to block sizes. There
> are all sorts of areas where Oracle could be gaining.
>
> > It maybe so, but slow sequential scan is still the largest single
> > performance problem of PostgreSQL. The frequency with which that topic
> > appears on the mailing lists should serve as a good evidence for that.
>
> I'm certainly not arguing that it could use improvement; it's clearly
> hurting some users. I just don't know if I/O chunking is the answer - I
> suspect that if it were, then it would've become a priority for one or
> more people working on Pg much sooner.
>
> It's quite likely that it's one of those things where it makes a huge
> difference for Oracle because Oracle has managed to optimize out most of
> the other bigger costs. If Pg still has other areas that make I/O more
> expensive per-byte (say, visibility checks) and low fixed per-block
> costs, then there'd be little point in chunking I/O. My understanding is
> that that's pretty much how things stand at the moment, but I'd love
> verification from someone who's done the testing.
>
> >If you still claim that it wouldn't make the difference,
> >
> > the onus to prove it is on you.
>
> I didn't mean to claim that it would make no difference. If I sounded
> like it, sorry.
>
> I just want to know how _much_ , or more accurately how great the
> overheads of the current approach in Pg are vs doing much larger reads.
>
> --
> Craig Ringer

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2010-10-11 06:09:54 Re: patch: psql variables tabcomplete
Previous Message Craig Ringer 2010-10-11 04:11:58 Re: Slow count(*) again...

Browse pgsql-performance by date

  From Date Subject
Next Message AI Rumman 2010-10-11 06:38:45 join order
Previous Message Craig Ringer 2010-10-11 04:11:58 Re: Slow count(*) again...