Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction
Date: 2020-03-29 20:12:15
Message-ID: 20200329201215.GM20103@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 29, 2020 at 01:22:04PM -0400, Tom Lane wrote:
> Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> > On Sun, Mar 29, 2020 at 12:37:05PM -0400, Tom Lane wrote:
> >> After looking at the callers of pg_ls_dir_files, and noticing that
> >> it's already defined to ignore anything that's not a regular file,
> >> I think switching to lstat makes sense.
>
> > Yea, only pg_ls_dir() shows special file types (and currently the others even
> > hide dirs).
>
> > The essence of your patch is to ignore ENOENT, but you also changed to use
> > lstat(), which seems unrelated. That means we'll now hide (non-broken)
> > symlinks. Is that intentional/needed ?
>
> Well, the following comment says "ignore anything but regular files",
> so I'm supposing that that is the behavior that we actually want here
> and failed to implement correctly. There might be scope for
> additional directory-reading functions, but I'd think you'd want
> more information (such as the file type) returned from anything
> that doesn't act this way.

Maybe pg_stat_file() deserves similar attention ? Right now, it'll fail on a
broken link. If we changed it to lstat(), then it'd work, but it'd also show
metadata for the *link* rather than its target.

Patch proposed as v14-0001 patch here may be relevant:
https://www.postgresql.org/message-id/20200317190401.GY26184%40telsasoft.com
- indicating if it is a directory. Typical usages include:
+ indicating if it is a directory (or a symbolic link to a directory).
...

> In practice, since these directories shouldn't contain symlinks,
> it's likely moot. The only place in PG data directories where
> we actually expect symlinks is pg_tablespace ... and that contains
> symlinks to directories, so that this function would ignore them
> anyway.

I wouldn't hesitate to make symlinks, at least in log. It's surprising when
files are hidden, but I won't argue about the best behavior here.

I'm thinking of distributions or local configurations that use
/var/log/postgresql. I didn't remember or didn't realize, but it looks like
debian's packages use logging_collector=off and then launch postmaster with
2> /var/log/postgres/... It seems reasonable to do something like:
log/huge-querylog.csv => /zfs/compressed/...

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shay Rojansky 2020-03-29 21:31:42 Creating a database with a non-predefined collation
Previous Message Kartik Ohri 2020-03-29 19:18:00 GSoC Proposal