Re: fstat vs. lseek

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Kohei Kaigai <Kohei(dot)Kaigai(at)emea(dot)nec(dot)com>
Subject: Re: fstat vs. lseek
Date: 2011-08-08 17:10:05
Message-ID: 2366521.2k2cV9r50e@alap2
Lists: pgsql-hackers

On Monday, August 08, 2011 11:33:29 Robert Haas wrote:

> On Mon, Aug 8, 2011 at 10:49 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I don't think it's a good idea to replace lseek with fstat in the long
> > run. The likelihood that the lockless generic_file_llseek will get
> > included seems rather high to me. In contrast to that, fstat will always
> > be more expensive, as it's going through a security check and
> > then the fs' getattr implementation (which actually takes a lock on
> > some fs).
> *scratches head* I understand that stat() would need a security
> check, but why would fstat()?
I am not totally sure about that either. I guess Kaigai might know more about
that.
It might be that a forked process is possibly no longer allowed to access the
information from an inherited file handle. Also, I think a process can change
its permissions at runtime.
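
For reference, the two calls being compared can both report a file's size
given only an open descriptor. Below is a minimal sketch, not taken from the
PostgreSQL source; the helper names size_via_lseek and size_via_fstat are
made up for illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: file size by seeking to the end.
 * Side effect: the descriptor's file offset is left at end-of-file.
 * Returns (off_t) -1 on error.
 */
off_t
size_via_lseek(int fd)
{
    return lseek(fd, 0, SEEK_END);
}

/*
 * Hypothetical helper: file size from the inode attributes.
 * Does not touch the file offset.
 * Returns (off_t) -1 on error.
 */
off_t
size_via_fstat(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return (off_t) -1;
    return st.st_size;
}
```

Both return the same number for a regular file; the difference under
discussion is only which kernel code path (and which locks or checks) each
call goes through.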

> I think both of you raise good points. I wasn't too enthusiastic
> about this approach either. It's not very appealing to adopt an
> approach where the right performance decision is going to depend on
> operating system, file system, kernel version, core count, and
> workload. We could add a GUC, but it would be pretty annoying to have
> a setting that won't matter for most people at all, except
> occasionally when it makes a huge difference.
>
> I wasn't aware that there was any current activity around this on the Linux
> side. But Andres' comments made me Google it again, and now I see
> this:
>
> https://lkml.org/lkml/2011/6/16/800
>
> Andres, any idea what the status of that patch is? I'm not clear on
> how Linux works in terms of things getting upstreamed.
There doesn't seem to have been any activity to include it in 3.1. The merge
window for 3.1 just ended. The next one will open for about a week after the
release.
It's also not yet included in linux-next, which is a "preview" for the
currently worked on release + 1. A release takes roughly 3 months.

For upstreaming, somebody needs to be persistent enough to convince one of the
maintainers of the particular area to include the code so that Linus can then
pull it.
I guess citing your numbers would go a long way in that direction. Naturally
it would be even better to include results with the patch applied.
The largest machine I can reboot often enough to test such a thing has only two
sockets (4 cores, E5520). I guess you cannot easily reboot your loaned machine
with a new kernel?

Greetings,
Andres
