Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Date: 2022-03-15 02:37:25
Message-ID: 20220315023725.GW28503@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 14, 2022 at 01:53:54PM +0900, Michael Paquier wrote:
> On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> > I also changed pg_ls_dir_recurse() to handle concurrent removal of a dir, which
> > I noticed caused an infrequent failure on CI. However I'm not including that
> > here, since the 2nd half of the patch set seems isn't ready due to lstat() on
> > windows.
>
> lstat() has been a subject of many issues over the years with our
> internal emulation and issues related to its concurrency, but we use
> it in various areas of the in-core code, so that does not sound like
> an issue to me. It depends on what you want to do with it in
> genfile.c and which data you'd expect, in addition to the detection of
> junction points for WIN32, I guess.

To avoid cycles, a recursion function would need to know whether to recurse
into a directory or to output that something is isdir=false or type=link, and
avoid recursing into it.

> pg_ls_dir_recurse(), but that's a WITH RECURSIVE, so we would not
> really need it, do we?

Tom disliked it when I had written it as a recursive CTE, so I rewrote it in C.
129225(dot)1606166058(at)sss(dot)pgh(dot)pa(dot)us

> Hmm. I am not convinced that we need a new set of SQL functions as
> presented in 0003 (addition of meta-data in pg_ls_dir()), and
> extensions of 0004 (do the same but for pg_ls_tmpdir) and 0005 (same
> for the other pg_ls* functions). The changes of pg_ls_dir_files()
> make in my opinion the code harder to follow as the resulting tuple
> size depends on the type of flag used, and you can already retrieve
> the rest of the information with a join, probably LATERAL, on
> pg_stat_file(), no? The same can be said about 0007, actually.

Yes, one can get the same information with a lateral join (as I said 2 years
ago). But it's more helpful to provide a function, rather than leave that to
people to figure out - possibly incorrectly, or badly, like by parsing the
output of COPY FROM PROGRAM 'ls -l'. The query to handle tablespaces is
particularly obscure:
20200310183037(dot)GA29065(at)telsasoft(dot)com
20201223191710(dot)GR30237(at)telsasoft(dot)com

One could argue that most of the pg_ls_* functions aren't needed (including
1922d7c6e), since the same things are possible with pg_ls_dir() and
pg_stat_file().
|1922d7c6e Add SQL functions to monitor the directory contents of replication slots

The original, minimal goal of this patch was to show shared tempdirs in
pg_ls_tmpfile() - rather than hiding them misleadingly as currently happens.
20200310183037(dot)GA29065(at)telsasoft(dot)com
20200313131232(dot)GO29065(at)telsasoft(dot)com

I added the metadata function 2 years ago since it's silly to show metadata for
tmpdir but not other, arbitrary directories.
20200310183037(dot)GA29065(at)telsasoft(dot)com
20200313131232(dot)GO29065(at)telsasoft(dot)com
20201223191710(dot)GR30237(at)telsasoft(dot)com

> The addition of isdir for any of the paths related to pg_logical/ and
> the replication slots has also a limited interest, because we know
> already those contents, and these are not directories as far as I
> recall.

Except when we don't, since extensions can do things that core doesn't, as
Fabien pointed out.
alpine(dot)DEB(dot)2(dot)21(dot)2001160927390(dot)30419(at)pseudo

> In the whole set, improving the docs as of 0001 makes sense, but the
> change is incomplete. Most of 0002 also makes sense and should be
> stable enough. I am less enthusiastic about any of the other changes
> proposed and what we can gain from these parts.

It is frustrating to hear this feedback now, after the patch has gone through
multiple rewrites over 2 years - based on other positive feedback and review.
I went to the effort to ask, numerous times, whether to write the patch and how
its interfaces should look. Now, I'm hearing that not only the implementation
but its goals are wrong. What should I have done to avoid that ?

20200503024215(dot)GJ28974(at)telsasoft(dot)com
20191227195918(dot)GF12890(at)telsasoft(dot)com
20200116003924(dot)GJ26045(at)telsasoft(dot)com
20200908195126(dot)GB18552(at)telsasoft(dot)com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-03-15 02:41:29 Re: Estimating HugePages Requirements?
Previous Message shiy.fnst@fujitsu.com 2022-03-15 02:08:50 RE: Column Filtering in Logical Replication