Re: Seq scans roadmap

From: Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>
To: "CK Tan" <cktan(at)greenplum(dot)com>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Luke Lonergan" <LLonergan(at)greenplum(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, "Simon Riggs" <simon(at)enterprisedb(dot)com>
Subject: Re: Seq scans roadmap
Date: 2007-05-11 10:47:24
Message-ID: E1539E0ED7043848906A8FF995BDA5790211EA6A@m0143.s-mxs.net
Lists: pgsql-hackers


> Sorry, 16x8K page ring is too small indeed. The reason we
> selected 16 is because greenplum db runs on 32K page size, so
> we are indeed reading 128K at a time. The #pages in the ring
> should be made relative to the page size, so you achieve 128K
> per read.

Ah, ok. New disks here also show a throughput peak at 128k when there is
no other concurrent IO.
Writes benefit from larger block sizes though, 512k and more.
Reads with other concurrent IO might also benefit from larger
block sizes.
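
As a rough illustration of making the page count relative to the page size, here is a minimal sketch. The 128K target is the figure quoted above. The names TARGET_READ_BYTES and pages_per_read are made up for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed target size of one physical read (the 128K figure from the thread). */
#define TARGET_READ_BYTES ((size_t) 128 * 1024)

/*
 * Hypothetical helper: how many pages of the given size add up to one
 * read of TARGET_READ_BYTES. Assumes the page size is a power of two
 * no larger than the target.
 */
static size_t pages_per_read(size_t page_size)
{
    assert(page_size > 0 && page_size <= TARGET_READ_BYTES);
    return TARGET_READ_BYTES / page_size;
}
```

With 8K pages this gives 16 pages per read, with 32K pages it gives 4. If the optimal read size turns out to be larger under concurrent IO, only the constant would need to change.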

Comment to all: when testing for the optimal block size, make sure there
is other concurrent IO on the disk.

> Also agree that KillAndReadBuffer could be split into a
> KillPinDontRead(), and ReadThesePinnedPages() functions.
> However, we are thinking of AIO and would rather see a
> ReadNPagesAsync() function.

Yes, you could start the aio and return an already-read buffer to allow
concurrent CPU work.
However, you would still want to issue blocked aio_readv calls, to make
sure the physical read uses the large block size.
So I'd say aio would benefit from the same split.
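
To make the "blocked read" idea concrete, here is a minimal synchronous sketch that fills several separate page buffers with a single vectored readv() call, so the kernel sees one large request instead of one request per page. It is only an illustration under assumptions: PAGE_SZ, MAX_PAGES and read_next_pages are hypothetical names, the 8K page size is assumed, and the file descriptor is expected to be positioned at the first page to read already. As far as I know aio_readv is not part of POSIX AIO, so an asynchronous version would have to be built on whatever AIO interface the platform actually offers.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define PAGE_SZ   8192   /* assumed page size */
#define MAX_PAGES 64     /* assumed upper bound on pages per call */

/*
 * Hypothetical helper: read the next npages pages from the current
 * offset of fd into the (not necessarily contiguous) buffers in pages[],
 * using one readv() call. Returns the number of bytes read, or -1 on
 * error. A short read is possible and must be handled by the caller.
 */
static ssize_t read_next_pages(int fd, char *pages[], int npages)
{
    struct iovec iov[MAX_PAGES];

    assert(npages > 0 && npages <= MAX_PAGES);
    for (int i = 0; i < npages; i++) {
        iov[i].iov_base = pages[i];
        iov[i].iov_len = PAGE_SZ;
    }
    return readv(fd, iov, npages);
}
```

Separating "pick and pin the buffers" from "read into these pinned buffers" is what lets the second step be issued as one large request, which is the same split discussed above.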

In another posting you wrote:
> The patch has no effect on scans that do updates.
> The KillAndReadBuffer routine does not force out a buffer if
> the dirty bit is set. So updated pages revert to the current
> performance characteristics.

Yes, I see: the ring slot is replaced by a standard ReadBuffer in that
case. Looks good.
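
The behaviour described here can be sketched as a small decision function. This is a toy model only: RingSlot, SlotAction and choose_slot_action are hypothetical names for illustration and do not reflect the actual data structures in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one slot in the scan's buffer ring. */
typedef struct {
    int  page_no;  /* page currently held by the slot */
    bool dirty;    /* set if the scan modified the page */
} RingSlot;

typedef enum {
    REUSE_SLOT,             /* clean page: recycle the slot for the next read */
    FALLBACK_TO_READBUFFER  /* dirty page: leave it alone, use the normal path */
} SlotAction;

/* A dirty page is never forced out; the scan falls back to a standard read. */
static SlotAction choose_slot_action(const RingSlot *slot)
{
    assert(slot != NULL);
    return slot->dirty ? FALLBACK_TO_READBUFFER : REUSE_SLOT;
}
```

Under this model a scan that dirties every page never reuses a slot, which matches the statement that updated pages revert to the current performance characteristics.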

I still think it would be better to write out the buffers and keep them
in the ring when possible, but that seems to need locks and some sort of
synchronization with the new walwriter, so it looks like a nice project
for after 8.3.

Andreas
