Re: Load distributed checkpoint

From: "Takayuki Tsunakawa" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: "ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Load distributed checkpoint
Date: 2006-12-08 02:05:18
Message-ID: 014901c71a6d$4f4c33e0$19527c0a@OPERAO
Lists: pgsql-hackers pgsql-patches

Hello,

As Mr. Mayer points out, whether step 2 or step 3 actually causes the
I/O depends on the VM settings and on the amount of RAM available for
the file system cache.

"Ron Mayer" <rm_pg(at)cheapcomplexdevices(dot)com> wrote in message
news:45786549(dot)2000602(at)cheapcomplexdevices(dot)com(dot)(dot)(dot)
> It seems to me that the most portable way postgresql could force
> the I/O to be balanced would be to insert otherwise unnecessary
> fsync()s into step 2; but that it might (not sure why) be better
> to handle this through OS-specific tuning outside of postgres.

I'm afraid it is difficult for system designers to expect steady
throughput/response time, as long as PostgreSQL depends on the
flushing of file system cache. How does Oracle provide stable
performance?
Though I'm not sure, isn't the key to use O_SYNC so that write()s
transfer data to disk? That way, PostgreSQL completely controls the
timing of data transfer. Moreover, if possible, it's better to bypass
the file system cache, using flags such as O_DIRECT for open() on UNIX
and FILE_FLAG_NO_BUFFERING for CreateFile() on Windows. As far as I
know, SQL Server and Oracle do this. I think commercial DBMSs do the
same thing so that they can control and anticipate I/O activity
without being influenced by VM policy.
If PostgreSQL is to use these flags, the writing of dirty buffers has
to be improved. To reduce the number of I/O operations, pages that are
adjacent on disk and also adjacent in memory must be written with a
single write().
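
To illustrate the last point, here is a minimal sketch in C of merging
adjacent dirty pages into a single write on a file opened with O_SYNC.
It is not PostgreSQL code: the DirtyPage struct, PAGE_BYTES,
open_sync() and flush_coalesced() are hypothetical names made up for
this example, and it assumes the caller has already sorted the dirty
pages by block number. A short write is simply treated as an error.

```c
#define _XOPEN_SOURCE 700   /* for pwrite() and O_SYNC */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_BYTES 8192     /* hypothetical page size for this sketch */

/* Hypothetical descriptor of one dirty page. */
typedef struct {
    long  blockno;          /* position of the page within the file */
    char *data;             /* address of the page image in memory */
} DirtyPage;

/* Open a data file so that each write() returns only after the data
 * has been handed to stable storage (O_SYNC semantics). */
static int open_sync(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);
}

/*
 * Write pages[0..n-1], which must be sorted by blockno. Each run of
 * pages that is contiguous both on disk and in memory is written with
 * one pwrite(). Returns the number of pwrite() calls, or -1 on error.
 */
static int flush_coalesced(int fd, const DirtyPage *pages, int n)
{
    int calls = 0;
    int i = 0;

    assert(n >= 0);
    while (i < n) {
        int j = i + 1;

        /* Extend the run while the next page follows the previous one
         * both in the file and in memory. */
        while (j < n &&
               pages[j].blockno == pages[j - 1].blockno + 1 &&
               pages[j].data == pages[j - 1].data + PAGE_BYTES)
            j++;

        size_t len = (size_t) (j - i) * PAGE_BYTES;
        off_t  off = (off_t) pages[i].blockno * PAGE_BYTES;

        /* Sketch only: a short write is reported as a failure. */
        if (pwrite(fd, pages[i].data, len, off) != (ssize_t) len)
            return -1;
        calls++;
        i = j;
    }
    return calls;
}
```

With O_DIRECT instead of O_SYNC, the buffers and offsets would also
have to meet the platform's alignment requirements, which this sketch
does not attempt to handle.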

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
> Would the background writer be disabled during this extended
> checkpoint? How is it better to concentrate step 2 in an extended
> checkpoint periodically rather than consistently in the background
> writer?
> Will there be any effect on PITR techniques, in terms of how current
> the copied WAL files would be?

Put in extreme terms, extending the checkpoint can also extend
downtime. I understand that checkpoints occur during crash recovery
and PITR, so those operations would take longer. A checkpoint also
occurs at server shutdown. However, these cases could be distinguished
from ordinary checkpoints, so that the undesirable extension is
avoided.
