Re: Shared buffers, db transactions commited, and write IO on Solaris

From: Erik Jones <erik(at)myemma(dot)com>
To: Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: Dimitri <dimitrik(dot)fr(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Shared buffers, db transactions commited, and write IO on Solaris
Date: 2007-03-30 16:19:09
Message-ID: 449E5B52-0D77-454C-B548-BE9B77641B27@myemma.com
Lists: pgsql-performance


On Mar 30, 2007, at 10:05 AM, Kenneth Marshall wrote:

> On Fri, Mar 30, 2007 at 04:25:16PM +0200, Dimitri wrote:
>> The problem is that while your goal is to commit as fast as
>> possible, it's a pity to waste I/O speed just to keep a common
>> block size... Let's say your transaction's modifications fit into
>> 512K - you'll be able to write many more 512K blocks per second
>> than 8K blocks per second (for the same amount of data)... Even if
>> we probably rewrite the same block several times as transactions
>> come in, it still costs in traffic, and we will process more slowly
>> even though the H/W can do better. I don't think that's good, no? ;)
>>
>> Rgds,
>> -Dimitri
>>
> With block sizes you are always trading off overhead against space
> efficiency. Most OSes write only in 4k/8k blocks to the underlying
> hardware regardless of the size of the write you issue, and issuing
> 16 512-byte writes has much more overhead than one 8k write. At the
> light-transaction end there is no real benefit to a small write, and
> it will hurt performance in high-throughput environments. It would
> be better to batch I/O, and I think someone is looking into that.
>
> Ken
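Ken's overhead point is easy to see at the shell. A quick sketch (dd here just stands in for any application issuing writes; the file names are placeholders):

```shell
# The same 8 KB of data written two ways: sixteen 512-byte writes
# versus a single 8 KB write.  The resulting files are identical;
# the difference is sixteen write() calls, each with its own
# per-syscall overhead, instead of one.
dd if=/dev/zero of=small.out bs=512 count=16    # 16 writes of 512 bytes
dd if=/dev/zero of=large.out bs=8192 count=1    # 1 write of 8192 bytes
ls -l small.out large.out                       # both are 8192 bytes
```

Running either dd under truss (or strace on Linux) shows the syscall counts directly.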

True, and really, considering that data is only written to disk by
the bgwriter and at checkpoints, writes are already somewhat
batched. Also, Dimitri, I feel I should backtrack a little and point
out that it is possible to have Postgres write in 512-byte blocks (at
least on UFS, which is what's in my head right now) if you set the
filesystem's logical block size to 4K and its fragment size to 512
bytes, and then set Postgres's BLCKSZ to 512 bytes. However, as Ken
has just pointed out, what you gain in space efficiency you lose in
performance, so if you're working with a high-traffic database this
wouldn't be a good idea.
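For the record, the knobs involved would look something like the following. This is an untested sketch: the device path is a placeholder, and whether a 512-byte BLCKSZ actually builds and runs depends on your Postgres version.

```shell
# Sketch only -- the device path is a placeholder, this must run as
# root, and newfs destroys whatever is on that slice.
# Solaris UFS with 4K logical blocks and 512-byte fragments:
newfs -b 4096 -f 512 /dev/rdsk/c0t0d0s6

# BLCKSZ is a compile-time constant, not a GUC; change it in the
# source before building, then rebuild and re-initdb, since the
# on-disk page format changes with BLCKSZ:
#   src/include/pg_config_manual.h:  #define BLCKSZ 512
```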

erik jones <erik(at)myemma(dot)com>
software developer
615-296-0838
emma(r)
