Re: How to improve db performance with $7K?

From: "Mohan, Ross" <RMohan(at)arbinet(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: How to improve db performance with $7K?
Date: 2005-04-14 15:02:27
Message-ID: CC74E7E10A8A054798B6611BD1FEF4D30625DA58@vamail01.thexchange.com
Lists: pgsql-performance

Imagine a system in "furious activity" with two (2) processes regularly occurring:

Process One: Looooong read (or write). Takes 20ms to do seek, latency, and
stream off. Runs over and over.
Process Two: Single-block read (or write). Typical database row access.
Optimally, could be sub-millisecond. Happens more or less randomly.

Let's say process one starts, and then process two. Assume, for the sake of this discussion,
that P2's block lies within P1's swath. (But it doesn't have to...)

Now, every time, process two has to wait at LEAST 20ms to complete. In a queue-reordering
system, it could be a lot faster. And I, looking at disk service times on P2, keep
wondering "why does a single disk-block read keep taking >20ms?"

Soooo....it doesn't need to be "a read" or "a write". It doesn't need to be "furious activity"
(two processes is hardly furious, even for a single-user desktop). This is not a "corner case",
and while it doesn't take into account kernel/drive-cache/UBC buffering issues, I think it
shines a light on why command re-ordering might be useful. <shrug>
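
To put rough numbers on it, here's a toy simulation (Python; the 20ms/0.5ms figures and the "slipped in right away" assumption are made up for illustration, not any real drive's behavior) of what P2 sees under FIFO versus a reordered queue:

    # Toy model: P1 issues 20ms streaming requests back-to-back; P2 wants one
    # small block (~0.5ms of work) that arrives at a random point in time.
    # FIFO: P2 must wait for the in-flight P1 request to finish first.
    # Reordering: assume (optimistically) P2 costs only its own service time.
    import random

    P1_MS, P2_MS, TRIALS = 20.0, 0.5, 100_000

    fifo, reordered = [], []
    for _ in range(TRIALS):
        arrival = random.uniform(0, P1_MS)       # where in P1's request P2 lands
        fifo.append((P1_MS - arrival) + P2_MS)   # finish current P1 request, then P2
        reordered.append(P2_MS)                  # best case: slipped in right away

    print("FIFO      avg %.2f ms" % (sum(fifo) / TRIALS))       # ~10.5ms, worst ~20.5ms
    print("Reordered avg %.2f ms" % (sum(reordered) / TRIALS))  # ~0.5ms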

YMMV.

-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Kevin Brown
Sent: Thursday, April 14, 2005 4:36 AM
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Greg Stark wrote:

> I think you're being misled by analyzing the write case.
>
> Consider the read case. When a user process requests a block and that
> read makes its way down to the driver level, the driver can't just put
> it aside and wait until it's convenient. It has to go ahead and issue
> the read right away.

Well, strictly speaking it doesn't *have* to. It could delay for a couple of milliseconds to see if other requests come in, and then issue the read if none do. If there are already other requests being fulfilled, then it'll schedule the request in question just like the rest.
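
Something like this rough sketch, say (Python; the 2ms window and the names are invented for illustration, not any actual kernel's I/O scheduler):

    import queue
    import time

    BATCH_WINDOW_S = 0.002   # wait up to ~2ms to see if more requests come in

    def drain_requests(q):
        # Block for the first request, then give later arrivals a short
        # window before handing the whole batch to the scheduler at once.
        batch = [q.get()]
        deadline = time.monotonic() + BATCH_WINDOW_S
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                batch.append(q.get(timeout=remaining))
            except queue.Empty:
                break
        return batch   # caller can sort these in disk-layout order and issue them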

> In the 10ms or so that it takes to seek to perform that read
> *nothing* gets done. If the driver receives more read or write
> requests it just has to sit on them and wait. 10ms is a lifetime for a
> computer. In that time dozens of other processes could have been
> scheduled and issued reads of their own.

This is true, but now you're talking about a situation where the system goes from an essentially idle state to one of furious activity. In other words, it's a corner case that I strongly suspect isn't typical in situations where SCSI has historically made a big difference.

Once the first request has been fulfilled, the driver can now schedule the rest of the queued-up requests in disk-layout order.

I really don't see how this is any different between a system that has tagged queueing to the disks and one that doesn't. The only difference is where the queueing happens. In the case of SCSI, the queueing happens on the disks (or at least on the controller). In the case of SATA, the queueing happens in the kernel.

I suppose the tagged queueing setup could begin the head movement and, if another request comes in that requests a block on a cylinder between where the head currently is and where it's going, go ahead and read the block in question. But is that *really* what happens in a tagged queueing system? It's the only major advantage I can see it having.
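
For what it's worth, the "disk-layout order" part and the "grab it on the way" idea are easy to picture in code. A rough sketch (Python; block numbers stand in for cylinders, and the en-route test is only my guess at what tagged-queueing firmware might do, not taken from any real drive):

    def layout_order(requests, head_pos):
        # One elevator sweep: service everything at or beyond the head in
        # ascending block order, then sweep back down for the rest.
        ahead  = sorted(r for r in requests if r >= head_pos)
        behind = sorted((r for r in requests if r < head_pos), reverse=True)
        return ahead + behind

    def serviceable_en_route(new_block, head_pos, target_block):
        # The speculative TCQ win: a request that lands between where the
        # head is now and where it is already going costs almost nothing extra.
        lo, hi = sorted((head_pos, target_block))
        return lo <= new_block <= hi

    print(layout_order([900, 400, 120, 50], head_pos=100))  # [120, 400, 900, 50]
    print(serviceable_en_route(400, 100, 900))              # True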

> The same thing would happen if you had lots of processes issuing lots
> of small fsynced writes all over the place. Postgres doesn't really do
> that though. It sort of does with the WAL logs, but that shouldn't
> cause a lot of seeking. Perhaps it would mean that having your WAL
> share a spindle with other parts of the OS would have a bigger penalty
> on IDE drives than on SCSI drives though?

Perhaps.

But I rather doubt that has to be a huge penalty, if any. When a process issues an fsync (or even a sync), the kernel doesn't *have* to drop everything it's doing and get to work on it immediately. It could easily gather a few more requests, bundle them up, and then issue them. If there's a lot of disk activity, it's probably smart to do just that. All fsync and sync require is that the caller block until the data hits the disk (from the point of view of the kernel). The specification doesn't require that the kernel act on the calls immediately or write only the blocks referred to by the call in question.
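
In sketch form it might look like this (Python; the 5ms window and the class are invented for illustration, not how any particular kernel actually implements it):

    import threading
    import time

    class FsyncBatcher:
        # Each caller blocks until its blocks have hit "disk", but the
        # flusher is free to gather a few milliseconds' worth of requests
        # and write them out in one pass, in whatever order it likes.
        def __init__(self, write_out, window_s=0.005):
            self.write_out = write_out            # does the real I/O
            self.window_s = window_s
            self.done = threading.Condition()
            self.pending = []
            self.current_gen, self.flushed_gen = 1, 0
            threading.Thread(target=self._flusher, daemon=True).start()

        def fsync(self, blocks):
            with self.done:
                self.pending.extend(blocks)
                my_gen = self.current_gen
                while self.flushed_gen < my_gen:   # block until our batch is flushed
                    self.done.wait()

        def _flusher(self):
            while True:
                time.sleep(self.window_s)          # gather a window of requests
                with self.done:
                    batch, self.pending = self.pending, []
                    gen, self.current_gen = self.current_gen, self.current_gen + 1
                if batch:
                    self.write_out(batch)          # one bundled pass to the disk
                with self.done:
                    self.flushed_gen = gen
                    self.done.notify_all()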

--
Kevin Brown kevin(at)sysexperts(dot)com

