Re: lseek/read/write overhead becomes visible at scale ..

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tobias Oberstein <tobias(dot)oberstein(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: lseek/read/write overhead becomes visible at scale ..
Date: 2017-01-25 19:52:38
Message-ID: 20170125195238.2sjh477x2zvmjm5a@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2017-01-25 10:16:32 +0100, Tobias Oberstein wrote:
> > > Using pread instead of lseek+read halves the syscalls.
> > >
> > > I really don't understand what you are fighting here ..
> >
> > Sure, there's some overhead. And as I said upthread, I'm much less
> > against this change than Tom. What I'm saying is that your benchmarks
> > haven't shown a benefit in a meaningful way, so I don't think I can
> > agree with
> >
> > > "Well, my point remains that I see little value in messing with
> > > long-established code if you can't demonstrate a benefit that's clearly
> > > above the noise level."
> > >
> > > I have done lots of benchmarking over the last days on a massive box, and I
> > > can provide numbers that I think show that the impact can be significant.
> >
> > since you've not actually shown that the impact is above the noise level
> > when measured with an actual postgres workload.
>
> I can follow that.
>
> So real proof cannot be done with FIO, only with an "actual PG workload".

Right.

> Synthetic PG workload or real world production workload?

Both might work. Production-like has bigger pull, but I'd guess
synthetic is good enough.

> Also: regarding the perf profiles from production that show lseek as the #1 syscall.

You'll, depending on your workload, still have a lot of lseeks even if
we were to use pread/pwrite because we do lseek(SEEK_END) to get file
sizes.
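
For illustration, a minimal C sketch of the three call patterns under
discussion: lseek+read, pread, and lseek(SEEK_END) for the file size.
The helper names (read_at_seek, read_at_pread, file_size,
check_read_paths) and the scratch-file path are made up for this
example; this is not PostgreSQL code, only the plain POSIX calls.

```c
#define _XOPEN_SOURCE 700   /* expose pread() and mkstemp() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Two syscalls: move the shared file offset, then read from it. */
ssize_t read_at_seek(int fd, void *buf, size_t len, off_t off)
{
    assert(buf != NULL);
    if (lseek(fd, off, SEEK_SET) == (off_t) -1)
        return -1;
    return read(fd, buf, len);
}

/* One syscall: read at an explicit offset; the file offset is not moved. */
ssize_t read_at_pread(int fd, void *buf, size_t len, off_t off)
{
    assert(buf != NULL);
    return pread(fd, buf, len, off);
}

/* File size via lseek(SEEK_END); this call stays even with pread/pwrite. */
off_t file_size(int fd)
{
    return lseek(fd, 0, SEEK_END);
}

/* Self-check on a scratch file: both read paths must return the same bytes. */
int check_read_paths(void)
{
    char path[] = "/tmp/pread_demo_XXXXXX";   /* hypothetical scratch file */
    char a[5], b[5];
    int fd = mkstemp(path);
    int ok;

    if (fd < 0)
        return -1;
    unlink(path);                /* file disappears once fd is closed */
    ok = write(fd, "0123456789", 10) == 10
        && read_at_seek(fd, a, sizeof a, 3) == 5
        && read_at_pread(fd, b, sizeof b, 3) == 5
        && memcmp(a, b, sizeof a) == 0
        && memcmp(a, "34567", sizeof a) == 0
        && file_size(fd) == 10;
    close(fd);
    return ok ? 0 : -1;
}
```

The sketch only shows where the syscall count differs. It says nothing
about whether that difference is measurable in an actual postgres
workload.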

> You said it wouldn't be proof either, because it only shows number of
> syscalls, and though it is clear that millions of syscalls/sec do come with
> overhead, it is still not showing "above noise" level relevance (because PG
> is such a CPU hog in itself anyways;)

Yep.

> So how would I do a perf profile that would be acceptable as proof?

You'd have to look at cpu time, not number of syscalls. IIRC I
suggested doing a cycles profile with -g and then using "perf report
--children" to see how many cycles are spent somewhere below lseek.

I'd also suggest sharing a cycles profile; it's quite likely
that the overhead is completely elsewhere.

- Andres
