Re: Use pread and pwrite instead of lseek + write and read

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Oskari Saarenmaa <os(at)ohmu(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use pread and pwrite instead of lseek + write and read
Date: 2016-08-18 18:39:59
Message-ID: CA+TgmobQbcyHiMLA-7aBz9hcEy3PeGXLie+GzH4gyvEtEe6-Ag@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 17, 2016 at 3:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I don't understand why you think this would create non-trivial
>> portability issues.
>
> The patch as submitted breaks entirely on platforms without pread/pwrite.
> Yes, we can add a configure test and some shim functions to fix that,
> but the argument that it makes the code shorter will get a lot weaker
> once we do.
>
> I agree that adding such functions is pretty trivial, but there are
> reasons to think there are other hazards that are less trivial:
>
> First, a self-contained shim function will necessarily do an lseek every
> time, which means performance will get *worse* not better on non-pread
> platforms. And yes, the existing logic to avoid lseeks fires often enough
> to be worthwhile, particularly in seqscans.
>
> Second, I wonder whether this will break any kernel's readahead detection.
> I wouldn't be too surprised if successive reads (not preads) without
> intervening lseeks are needed to trigger readahead on at least some
> platforms. So there's a potential, both on platforms with pread and those
> without, for this to completely destroy seqscan performance, with
> penalties very far exceeding what we might save by avoiding some kernel
> calls.
>
> I'd be more excited about this if the claimed improvement were more than
> 1.5%, but you know as well as I do that that's barely above the noise
> floor for most performance measurements. I'm left wondering why bother,
> and why take any risk of de-optimizing on some platforms.

Well, I think you're pointing out some things that need to be figured
out, but I hardly think that's a good enough reason to pour cold water
on the whole approach. The number of lseeks we issue on many
workloads is absolutely appalling, and I don't think there's any
reason at all to assume that a 1.5% gain is as good as it gets. Even
if it is, a 1% speedup on a benchmark where the noise is 5-10% is just
as much of a speedup as a 1% speedup on a benchmark
where the noise is 0.1%. Faster is faster, and 1% improvements are
not so numerous that we can afford to ignore them when they pop up.
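
For concreteness, the "self-contained shim function" from the quoted
text might look roughly like the sketch below. The name shim_pread is
hypothetical and nothing here is taken from the submitted patch; it
only illustrates why such a fallback has to issue an lseek on every
call, which is the cost being debated for platforms without pread.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical fallback for a platform that lacks pread().
 * Illustration only; not code from the patch under discussion.
 */
ssize_t
shim_pread(int fd, void *buf, size_t nbytes, off_t offset)
{
    /*
     * A self-contained shim has no record of the current file
     * position, so it cannot skip the seek the way a caller that
     * tracks the position itself could.
     */
    if (lseek(fd, offset, SEEK_SET) == (off_t) -1)
        return -1;

    /*
     * Unlike a real pread(), this leaves the file position advanced
     * past the bytes that were read.
     */
    return read(fd, buf, nbytes);
}
```

Note that this is not a drop-in equivalent of pread: it moves the file
position as a side effect, so any caller that still relies on a cached
seek position would need to account for that.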

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
