Re: Need help with 8.4 Performance Testing

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "jd(at)commandprompt(dot)com" <jd(at)commandprompt(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Jean-David Beyer <jeandavid8(at)verizon(dot)net>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Need help with 8.4 Performance Testing
Date: 2008-12-09 23:34:49
Message-ID: C5644099.F9A%scott@richrelevance.com
Lists: pgsql-performance

Prefetch CPU cost should be rather low, and prefetching does help performance even for very fast I/O. I would not expect a very large CPU use increase from that sort of patch - there is a lot that is more expensive to do on a per-block basis.

There are two ways to look at non-I/O bound performance:
* Aggregate performance across many concurrent activities - here you want the least CPU used per action, and the fewest collisions on locks or shared data structures. Using resources for as short an interval as possible also helps a lot here.
* Single query performance, where you want to shorten the query time, perhaps at the cost of more average CPU. Here, something like the fadvise stuff helps - as would any thread parallelism. Perhaps less efficient in aggregate, but more efficient for a single query.
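
To make the fadvise idea concrete, here is a minimal sketch of hinting the kernel a few blocks ahead of the reads. It uses Python's os.posix_fadvise with POSIX_FADV_WILLNEED. The helper name read_blocks_with_prefetch, the prefetch_depth parameter, and the 8 KB block size are assumptions made for this example; this is not the code from the patch under discussion.

```python
import os

BLOCK_SIZE = 8192  # assumed page size for this sketch


def read_blocks_with_prefetch(path, block_numbers, prefetch_depth=4):
    """Read the given blocks in order, hinting the kernel a few blocks ahead.

    Hypothetical helper for illustration only.
    """
    block_numbers = list(block_numbers)
    can_advise = hasattr(os, "posix_fadvise")  # not available on every platform
    pages = []
    fd = os.open(path, os.O_RDONLY)
    try:
        if can_advise:
            # Prime the window: hint the first prefetch_depth blocks.
            for blk in block_numbers[:prefetch_depth]:
                os.posix_fadvise(fd, blk * BLOCK_SIZE, BLOCK_SIZE,
                                 os.POSIX_FADV_WILLNEED)
        for i, blk in enumerate(block_numbers):
            ahead = i + prefetch_depth
            if can_advise and ahead < len(block_numbers):
                # Keep the window full: hint one more block per block read.
                os.posix_fadvise(fd, block_numbers[ahead] * BLOCK_SIZE,
                                 BLOCK_SIZE, os.POSIX_FADV_WILLNEED)
            os.lseek(fd, blk * BLOCK_SIZE, os.SEEK_SET)
            pages.append(os.read(fd, BLOCK_SIZE))
    finally:
        os.close(fd)
    return pages
```

The hint is only advisory, so the CPU cost is one extra system call per block; whether that pays off depends on how much I/O latency there is to hide.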

Then there is the overall CPU cost of accessing and reading data. If the data comes from disk, the big gains will be along the whole chain: driver to file system cache, file system cache to process, process-specific tasks (cache eviction, placement, tracking), examining page tuples, locating tuples within pages, etc. Anything that currently occurs on a per-block basis that could be done in a larger batch or set of blocks may be a big gain. Another place that commonly consumes CPU in larger software projects is memory allocation, if more advanced allocation techniques are not used. I have no idea what Postgres uses here, however. I do know that commercial databases have put extensive work into this area for performance, as well as reliability (harder to cause a leak, or easier to detect one) and ease of use (no need to even bother to free in certain contexts).
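
As a toy illustration of per-block versus batched work, the sketch below reads the same file either one 8 KB block per read call or many blocks per call, and does the same stand-in "per-page work" in both cases. The function names, the blocks_per_read parameter, and the 8 KB block size are all assumptions for this example; it says nothing about how Postgres actually reads pages.

```python
BLOCK_SIZE = 8192  # assumed page size for this sketch


def scan_per_block(path):
    """One unbuffered read call per block. Returns (result, read_calls)."""
    total, calls = 0, 0
    with open(path, "rb", buffering=0) as f:
        while True:
            page = f.read(BLOCK_SIZE)
            calls += 1
            if not page:
                break
            total += page[0]  # stand-in for real per-page work
    return total, calls


def scan_batched(path, blocks_per_read=128):
    """Many blocks per read call, then walk the pages in memory."""
    total, calls = 0, 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(BLOCK_SIZE * blocks_per_read)
            calls += 1
            if not chunk:
                break
            view = memoryview(chunk)  # slice pages without copying
            for off in range(0, len(view), BLOCK_SIZE):
                total += view[off]
    return total, calls
```

Both functions produce the same result; the batched one simply crosses the process/kernel boundary far fewer times.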

On 12/9/08 2:58 PM, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:

> I don't believe the thesis. The gap between disk speeds and memory
> speeds may narrow over time, but I doubt it's likely to disappear
> altogether any time soon, and certainly not for all users.

Well, when select count(1) reads pages slower than my disk can deliver them, it's 16x+ slower than my RAM. Until one can demonstrate that the system can even read pages in RAM faster than what disks will do next year, it doesn't matter much that RAM is faster. It does matter that RAM is faster for sorts, hashes, and other operations, but at the current time it does not for the raw pages themselves, from what I can measure.

This is, in fact, central to my point. Things will be CPU bound, not I/O bound. It is mentioned that we still have to access things over the bus, and memory is faster, etc. But Postgres is too CPU bound on page access to take advantage of the fact that memory is faster (for reading data pages).

The biggest change is not just that disks are getting closer to RAM, but that the random I/O penalty is diminishing significantly. Low latencies make seek-driven queries that used to consume mostly disk time consume CPU time instead. High CPU costs for accessing pages make a fast disk surprisingly close to RAM speed.
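
A back-of-the-envelope model shows why a fixed per-page CPU cost squeezes the gap. The sketch assumes each page costs its transfer time plus a constant CPU cost, with no overlap between the two. The function name and every number below are hypothetical inputs chosen for illustration, not measurements from this thread.

```python
PAGE_BYTES = 8192  # assumed page size


def effective_mb_per_s(raw_mb_per_s, cpu_us_per_page):
    """Throughput when each page costs transfer time plus a fixed CPU cost.

    Assumes transfer and CPU work do not overlap (a deliberately simple model).
    """
    page_mb = PAGE_BYTES / (1024 * 1024)
    transfer_us = page_mb / raw_mb_per_s * 1_000_000
    total_us = transfer_us + cpu_us_per_page
    return page_mb / (total_us / 1_000_000)


# Hypothetical inputs: a 16x raw bandwidth gap and 40 us of CPU per page.
disk = effective_mb_per_s(raw_mb_per_s=200, cpu_us_per_page=40)
ram = effective_mb_per_s(raw_mb_per_s=3200, cpu_us_per_page=40)
print(f"disk {disk:.0f} MB/s, RAM {ram:.0f} MB/s, ratio {ram / disk:.2f}x")
```

With these made-up inputs the 16x raw gap shrinks to under 2x, because the per-page CPU cost dominates both paths.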

> Besides which, I believe the CPU overhead of that patch is pretty darn
> small when the feature is not enabled.

> ...Robert

I doubt it is much CPU, on or off. It will help with SSDs when optimizing a single query; it may not help much if a system has enough 'natural' parallelism from other concurrent queries. However, there is a clear CPU benefit to getting individual queries out of the way faster, and to occupying precious work_mem or other resources for a shorter time. Occupying resources for a shorter period always translates to some CPU savings on a machine running at its limit with high concurrency.
